Note: this article is translated from stackoverflow.com. Original question: pandas - Python: preprocess data in a format to mine for Association rules and frequent itemsets (apriori/SPA

pandas python

pandas - Python: preprocess data in a format to mine for Association rules and frequent itemsets (apriori/SPA

Posted on 2020-03-29 21:30:22

I have a DataFrame with 245 rows and 2 columns, in the format below, where the "Unique" column consists of lists:

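import pandas as pd

# Small sample of the real data: one transaction ID and one list of items per row
df = pd.DataFrame({'TC': ['101', '102', '103'],
                   'Unique': [[189, 113, 213, 201, 125, 211],
                              [206, 268, 446, 149, 104, 166],
                              [163, 103, 113, 166, 800, 101]]})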

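I want to iterate over the DataFrame and split the lists in "Unique" into separate columns, so that I can run some frequent itemset mining algorithms on the data. Expected output: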

TC     0    1    2    3    4    5
101  189  113  213  201  125  211
102  206  268  446  149  104  166
103  163  103  113  166  800  101

Also, if possible, I would like to create a nested list of all the "Unique" values, in order:

i.e.

unique = [[189, 113, 213, 201, 125, 211], [206, 268, 446, 149, 104, 166], [163, 103, 113, 166, 800, 101]]

Asked by

Devarshi Goswami


Dames 2020-01-31 18:28

To create the nested list:

nested_list = list(df['Unique'])

print(nested_list)
# Output:
[[189, 113, 213, 201, 125, 211],
 [206, 268, 446, 149, 104, 166],
 [163, 103, 113, 166, 800, 101]]

To create the desired table, simply build a new DataFrame from this nested list and set the TC column as the index:

x = pd.DataFrame(nested_list)  # each nested list becomes one row; list positions become columns 0..5
x['TC'] = df['TC']             # add TC column (aligned on the default integer index)
x = x.set_index('TC')          # set TC column as index to make it show as first column

print(x)

# Output:
       0    1    2    3    4    5
TC                               
101  189  113  213  201  125  211
102  206  268  446  149  104  166
103  163  103  113  166  800  101
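
The three steps above can also be folded into a single constructor call. The following is only a sketch of an equivalent approach, not part of the original answer; the name wide is made up for illustration. Passing the TC values as the index up front avoids relying on df having a default integer index.

```python
import pandas as pd

df = pd.DataFrame({
    'TC': ['101', '102', '103'],
    'Unique': [[189, 113, 213, 201, 125, 211],
               [206, 268, 446, 149, 104, 166],
               [163, 103, 113, 166, 800, 101]],
})

# Each list in "Unique" becomes one row; the TC values are used
# directly as the index instead of being added and set afterwards.
wide = pd.DataFrame(df['Unique'].tolist(),
                    index=pd.Index(df['TC'], name='TC'))

print(wide)
```

Note that if the lists in the full 245-row data have different lengths, pandas pads the shorter rows with missing values, so the wide table may contain NaN.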

Devarshi Goswami 2020-01-31 19:09:40

Thanks! However, I run into a problem if I try to run Apriori on nested_list. With y = pd.DataFrame(apriori(nested_list)) I get TypeError: '<' not supported between instances of 'float' and 'str'

Dames 2020-02-01 19:35:36

Please open a new question for this; I am not familiar with apriori.
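
For readers who hit the same error: the thread does not say which apriori implementation was used or what the full data looks like, so the following is only a guess at a workaround. The message indicates that a float and a str were compared, which could happen if the real lists mix strings with numbers or NaN. A minimal sketch, assuming that is the cause, converts every item to a string, drops missing values, and builds a one-hot boolean table, the input format that some frequent-itemset implementations (for example mlxtend.frequent_patterns.apriori) expect. The names transactions, items and onehot are made up for illustration.

```python
import pandas as pd

nested_list = [[189, 113, 213, 201, 125, 211],
               [206, 268, 446, 149, 104, 166],
               [163, 103, 113, 166, 800, 101]]

# Normalise every item to a string and drop missing values, so that
# all items share one type and can be sorted or compared safely.
transactions = [[str(item) for item in row if pd.notna(item)]
                for row in nested_list]

# Sorted list of all distinct items across the transactions.
items = sorted({item for row in transactions for item in row})

# One row per transaction, one boolean column per distinct item.
onehot = pd.DataFrame(
    [[item in set(row) for item in items] for row in transactions],
    columns=items,
)

# Number of transactions that contain each item.
print(onehot.sum())
```

With the sample data, items 113 and 166 each appear in two of the three transactions; every other item appears in one.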
