Python: preprocess data in a format to mine for Association rules and frequent itemsets (apriori/SPA

Dames 2020-01-31 18:28

To create a nested list:

nested_list = list(df['Unique'])

print(nested_list)
# Output:
[[189, 113, 213, 201, 125, 211],
 [206, 268, 446, 149, 104, 166],
 [163, 103, 113, 166, 800, 101]]

To create your desired table, simply create a new DataFrame from this nested list and set the TC column as the index:

x = pd.DataFrame(nested_list)  # each nested list becomes one row
x['TC'] = df['TC']             # add the TC column (aligned on the default index)
x = x.set_index('TC')          # use TC as the index so it shows as the first column

print(x)

# Output:
       0    1    2    3    4    5
TC                               
101  189  113  213  201  125  211
102  206  268  446  149  104  166
103  163  103  113  166  800  101

Devarshi Goswami 2020-01-31 19:09:40

Thanks! I have one issue though: if I try to run Apriori on the nested_list with y = pd.DataFrame(apriori(nested_list)), I get TypeError: '<' not supported between instances of 'float' and 'str'.

Dames 2020-02-01 19:35:36

Please open a new question for this; I am not familiar with apriori.
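
On the TypeError mentioned in the comment above: Python raises this message when it has to order a float against a str, which can happen when a transaction mixes string items with NaN (NaN is a float). The thread does not show the real data, so this cause is an assumption. Below is a minimal sketch that drops missing values and converts every item to a string before the lists are passed to an apriori implementation. The helper name clean_transactions and the sample data are hypothetical and not part of any library.

```python
import math


def clean_transactions(nested_list):
    """Hypothetical helper: drop None/NaN and convert every item to str."""
    cleaned = []
    for transaction in nested_list:
        items = []
        for item in transaction:
            # NaN is a float; skip it so it is never compared with a str.
            if item is None or (isinstance(item, float) and math.isnan(item)):
                continue
            items.append(str(item))
        cleaned.append(items)
    return cleaned


# Made-up sample with mixed types, similar to what a CSV import can produce.
raw = [[189, "113", float("nan")], ["206", 268, None]]
print(clean_transactions(raw))
```

After this step every item has the same type, so sorting or comparing items inside a transaction no longer mixes float and str. Whether this resolves the error for a particular apriori package depends on that package's expected input format.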
