Note: this post is translated from stackoverflow.com. Original question: Binning column with python pandas

dataframe numpy pandas python

Binning a column with Python pandas

Posted on 2020-03-28 23:36:43

I have a data frame column with numeric values:

df['percentage'].head()
46.5
44.2
100.0
42.12

I want to see the column as bin counts:

bins = [0, 1, 5, 10, 25, 50, 100]

How can I get the result as bins with their value counts?

[0, 1] bin amount
[1, 5] etc 
[5, 10] etc 
......

Asked by: Night Walker

Views: 62

jezrael 2017-07-24 14:31

You can use pandas.cut:

bins = [0, 1, 5, 10, 25, 50, 100]
df['binned'] = pd.cut(df['percentage'], bins)
print (df)
   percentage     binned
0       46.50   (25, 50]
1       44.20   (25, 50]
2      100.00  (50, 100]
3       42.12   (25, 50]

bins = [0, 1, 5, 10, 25, 50, 100]
labels = [1,2,3,4,5,6]
df['binned'] = pd.cut(df['percentage'], bins=bins, labels=labels)
print (df)
   percentage binned
0       46.50      5
1       44.20      5
2      100.00      6
3       42.12      5
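
One detail worth noting: by default, pd.cut builds right-closed intervals, so a value of exactly 0 falls outside (0, 1] and becomes NaN. A minimal sketch, assuming the four sample values from the question plus a hypothetical 0.0 row added purely for illustration, shows how include_lowest=True changes that:

```python
import pandas as pd

# Sample values from the question, plus a hypothetical 0.0 to show the edge case.
df = pd.DataFrame({'percentage': [46.5, 44.2, 100.0, 42.12, 0.0]})
bins = [0, 1, 5, 10, 25, 50, 100]

# Default: intervals are right-closed, so 0.0 is not inside (0, 1].
default_binned = pd.cut(df['percentage'], bins=bins)

# include_lowest=True makes the first interval include its left edge.
inclusive_binned = pd.cut(df['percentage'], bins=bins, include_lowest=True)

print(default_binned.isna().sum())    # values left unbinned by default
print(inclusive_binned.isna().sum())  # values left unbinned with include_lowest
```

If the bins should instead be closed on the left, pd.cut also accepts right=False.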

Or numpy.searchsorted:

bins = [0, 1, 5, 10, 25, 50, 100]
df['binned'] = np.searchsorted(bins, df['percentage'].values)
print (df)
   percentage  binned
0       46.50       5
1       44.20       5
2      100.00       6
3       42.12       5
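
The searchsorted positions agree with the labelled cut for values inside the bin range, but the two approaches treat out-of-range values differently. A small sketch, assuming a hypothetical value of 150.0 that lies beyond the last bin edge:

```python
import numpy as np
import pandas as pd

bins = [0, 1, 5, 10, 25, 50, 100]
labels = [1, 2, 3, 4, 5, 6]
# 150.0 is a hypothetical out-of-range value, not part of the question's data.
values = pd.Series([46.5, 44.2, 100.0, 42.12, 150.0])

# side='left' (the default) lines up with right-closed intervals such as (25, 50].
positions = np.searchsorted(bins, values.to_numpy())

# pd.cut marks out-of-range values as NaN instead of assigning a position.
cut_labels = pd.cut(values, bins=bins, labels=labels)

print(positions)
print(cut_labels)
```

With searchsorted, the out-of-range value simply gets the position len(bins), so it may need to be filtered out separately if it should not be counted.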

...and then value_counts, or groupby and aggregate with size:

s = pd.cut(df['percentage'], bins=bins).value_counts()
print (s)
(25, 50]     3
(50, 100]    1
(10, 25]     0
(5, 10]      0
(1, 5]       0
(0, 1]       0
Name: percentage, dtype: int64

s = df.groupby(pd.cut(df['percentage'], bins=bins)).size()
print (s)
percentage
(0, 1]       0
(1, 5]       0
(5, 10]      0
(10, 25]     0
(25, 50]     3
(50, 100]    1
dtype: int64

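By default, cut returns a categorical.

Series methods such as Series.value_counts() use all categories, even when some categories are not present in the data; see the pandas documentation on operations with categoricals.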
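
To illustrate the point about categories, here is a minimal sketch using the question's sample values. Empty bins still appear with a count of 0, and sort_index() is one way to list the counts in bin order rather than by frequency:

```python
import pandas as pd

df = pd.DataFrame({'percentage': [46.5, 44.2, 100.0, 42.12]})
bins = [0, 1, 5, 10, 25, 50, 100]

# The result of cut is categorical, with one category per interval.
binned = pd.cut(df['percentage'], bins=bins)

# value_counts keeps every category; sort_index orders them by interval.
counts = binned.value_counts().sort_index()
print(counts)
```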

qqqwww 2018-05-31 02:38:51

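If I don't have bins = [0, 1, 5, 10, 25, 50, 100], can I just ask for 5 bins and have the data cut evenly? For example, I have 110 records and I want to cut them into 5 bins with 22 records in each.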

jezrael 2018-05-31 02:41:45

@qqqwww - Not sure I understand; do you mean qcut?
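
As a rough sketch of what qcut does for the case described in the comment: the data below is synthetic (110 distinct values standing in for the commenter's 110 records), so the equal split of 22 per bin is specific to this assumption. With many tied values the bins may end up uneven.

```python
import numpy as np
import pandas as pd

# Synthetic data: 110 distinct values standing in for the 110 records.
records = pd.Series(np.arange(110, dtype=float))

# qcut picks bin edges from quantiles, so each bin gets a roughly equal count.
equal_count = pd.qcut(records, q=5)
sizes = equal_count.value_counts().sort_index()
print(sizes)

# For contrast, cut with an integer splits the value range into equal-width bins.
equal_width = pd.cut(records, bins=5)
print(equal_width.value_counts().sort_index())
```

The difference is that qcut equalizes the number of records per bin, while cut with an integer equalizes the width of each bin.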
