Note: this article is reproduced from stackoverflow.com.
pandas python

Python: preprocess data in a format to mine for Association rules and frequent itemsets (apriori/SPA

Published on 2020-03-29 20:58:44

I have a dataframe with 245 rows and 2 columns, in which the column Unique holds lists:

import pandas as pd

df = pd.DataFrame({'TC': ['101', '102', '103'],
                   'Unique': [[189, 113, 213, 201, 125, 211],
                              [206, 268, 446, 149, 104, 166],
                              [163, 103, 113, 166, 800, 101]]})

I want to iterate through the dataframe and explode the lists in Unique into separate columns, so that I can run a frequent itemset mining algorithm on my data. Expected output:

TC     0    1    2    3    4    5
101  189  113  213  201  125  211
102  206  268  446  149  104  166
103  163  103  113  166  800  101

Also, if possible, I want to create a nested list of all the Unique fields in sequential order, i.e.:

unique = [[189,113,213,201,125,211],[206,268,446,149,104,166],[163,103,113,166,800,101]]
Questioner: Devarshi Goswami
Viewed: 25
Dames 2020-01-31 18:28

To create the nested list:

nested_list = list(df['Unique'])

print(nested_list)
# Output:
[[189, 113, 213, 201, 125, 211],
 [206, 268, 446, 149, 104, 166],
 [163, 103, 113, 166, 800, 101]]
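
Since the question's stated goal is frequent itemset mining, the nested list can also be read as a list of transactions. The sketch below is an addition, not part of the original answer: it assumes the downstream mining tool wants a boolean matrix with one row per transaction and one column per item, which is a common input shape but should be checked against whichever library is used. The names `transactions`, `items`, and `onehot` are illustrative.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'TC': ['101', '102', '103'],
                   'Unique': [[189, 113, 213, 201, 125, 211],
                              [206, 268, 446, 149, 104, 166],
                              [163, 103, 113, 166, 800, 101]]})

# Each inner list is treated as one transaction.
transactions = df['Unique'].tolist()

# Sorted list of every distinct item that appears in any transaction.
items = sorted({item for t in transactions for item in t})

# One row per transaction, one boolean column per item:
# True where the transaction contains that item.
onehot = pd.DataFrame(
    [[item in t_set for item in items] for t_set in map(set, transactions)],
    index=pd.Index(df['TC'], name='TC'),
    columns=items,
)

print(onehot)
```

With the sample data, item 113 appears in transactions 101 and 103, so only those two rows are True in column 113.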

To create your desired table, build a new DataFrame from this nested list and set the TC column as its index:

x = pd.DataFrame(nested_list)  # each nested list becomes one row; columns are numbered 0..5
x['TC'] = df['TC']             # add TC column
x = x.set_index('TC')          # set TC column as index to make it show as first column

print(x)

# Output:
       0    1    2    3    4    5
TC                               
101  189  113  213  201  125  211
102  206  268  446  149  104  166
103  163  103  113  166  800  101
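
As a possible shorthand for the three steps above, the same table can be built in a single constructor call by passing the TC values as the index. This is a minimal sketch rather than the answerer's own code; `wide` and `ragged` are illustrative names. It also shows what happens if the lists in the full 245-row dataframe are not all the same length, which the question does not specify.

```python
import pandas as pd

# Sample data copied from the question.
df = pd.DataFrame({'TC': ['101', '102', '103'],
                   'Unique': [[189, 113, 213, 201, 125, 211],
                              [206, 268, 446, 149, 104, 166],
                              [163, 103, 113, 166, 800, 101]]})

# Build the wide table in one step: each list becomes a row, TC becomes the index.
wide = pd.DataFrame(df['Unique'].tolist(), index=pd.Index(df['TC'], name='TC'))
print(wide)

# If the lists differ in length, shorter rows are padded with NaN,
# and the affected columns are stored as floats.
ragged = pd.DataFrame([[1, 2, 3], [4, 5]])
print(ragged)
```

If padding occurs, the NaN cells need to be dropped or ignored before the rows are handed to a mining algorithm.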