I've got two dataframes which looks like:
df1
col1 col2 col3
20 John Positive
21 Kate Negative
22 Nick Another
df2
col1 col2 col3
21 message white
22 text black
20 nothing orange,yellow
19 excel blue
And I want to make them to look like:
df3
col1 col2 col3
20 John orange,yellow
21 Kate white
22 Nick black
I want to change the value from df1 in col3 to value from df2 from col3 if the number from col1 of both dataframes matches. (In df2 in col1 I've got more values than in df1 col1, but it includes all numbers from df1 col1)
I've made solution which looks like:
for i in range(len(df2)):
df1.loc[df1.col1 == df2.col1[i], ['col3']] = df2.col3[i]
And my solution is working, but it's really time consuming. I hope that using pandas I can improve my code. Do you have any idea how to do that?
Use DataFrame.merge
with left join DataFrame.fillna
:
#column fo join with all columns for replace, here col3
cols = ['col1','col3']
df = df1.merge(df2[cols], on='col1', how='left', suffixes=('_','')).fillna(df1)[df1.columns]
print (df)
col1 col2 col3
0 20 John orange,yellow
1 21 Kate white
2 22 Nick black
Or use Series.map
by Series with df2
if want replace
only one column:
df1['col3'] = df1['col1'].map(df2.set_index('col1')['col3']).fillna(df1['col3'])
Or:
df1['col3'] = df1['col1'].replace(df2.set_index('col1')['col3']).fillna(df1['col3'])