Create a matrix from a dict of dicts for calculating similarities between docs

amdex 2019-07-03 22:21

You can convert a list of dictionaries into a dataframe by using the pandas DataFrame class directly.

import pandas as pd

a = [{"0": 0}, {"1": 1}]
df = pd.DataFrame(a)

To apply this to your problem, all you have to do is turn mydict into a list of dictionaries instead of a dictionary of dictionaries.
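As a rough sketch of that conversion: the `mydict` below is a made-up stand-in for your data (I'm assuming it maps a doc id to a `{term: weight}` dict), so adjust the names to whatever you actually have.

```python
import pandas as pd

# Hypothetical stand-in for `mydict`: {doc_id: {term: tfidf_weight}}
mydict = {
    "doc1": {"apple": 0.5, "banana": 0.2},
    "doc2": {"banana": 0.7, "cherry": 0.1},
}

# Split the outer dict into row labels and a list of inner dicts
doc_ids = list(mydict.keys())
rows = list(mydict.values())

# One row per doc, one column per term; terms a doc lacks become NaN,
# so fill them with 0.0 to mean "term not present"
df = pd.DataFrame(rows, index=doc_ids).fillna(0.0)
print(df)
```

Passing `index=doc_ids` keeps the doc ids as row labels, so you can still tell which row belongs to which doc.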

nipato 2019-07-03 22:30:06

Yes, but I need to convert it into a matrix, not a dataframe, because I want to calculate similarities between docs, and I believe you need a matrix of TF-IDF weights for each doc.

amdex 2019-07-03 22:32:38

You have multiple options: you could first convert it to a dataframe and then call df.to_numpy() (the older df.as_matrix() is deprecated and was removed in pandas 1.0). Alternatively, you could use the DictVectorizer from sklearn, which will also take care of the problem for you.
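A minimal sketch of the DictVectorizer route, again with a made-up `mydict` (assumed to map doc ids to `{term: weight}` dicts). Using `cosine_similarity` for the similarity step is my assumption about what you want; swap in another metric if needed.

```python
from sklearn.feature_extraction import DictVectorizer
from sklearn.metrics.pairwise import cosine_similarity

# Hypothetical stand-in for `mydict`: {doc_id: {term: tfidf_weight}}
mydict = {
    "doc1": {"apple": 0.5, "banana": 0.2},
    "doc2": {"banana": 0.7, "cherry": 0.1},
}

doc_ids = list(mydict.keys())

# sparse=False returns a dense NumPy array; missing terms become 0.0
vectorizer = DictVectorizer(sparse=False)
X = vectorizer.fit_transform([mydict[d] for d in doc_ids])  # shape: (n_docs, n_terms)

# Pairwise cosine similarity between docs, shape: (n_docs, n_docs)
sim = cosine_similarity(X)
print(sim)
```

Row and column `i` of `sim` correspond to `doc_ids[i]`. For a large vocabulary you would probably leave `sparse=True` (the default) and pass the sparse matrix to `cosine_similarity` directly.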

nipato 2019-07-03 22:44:16

Yes, I have heard of DictVectorizer. I will try that, thank you!
