Warm tip: This article is reproduced from stackoverflow.com, please click
pandas python python-3.x

Check if any value ( multiple columns) of one data frame exists in any values (multiple columns) of

发布于 2020-03-27 10:30:15

I have two data frames with similar data in different formats

df1:

Nodo      X          Y          Z
CTB3901   CTBX3901   CTBY3901   CTBZ3901
MTR5331   MTRX5331   MTRY5331   MTRZ5331
ADC3451   ADCX3451   ADCY3451   ADCZ3451

df2:

Site_x     Site_y
CTBX3901E  CTBX3901
CTB3901    CTB3901E
CTBZ3901E  CTBZ3901
CTBY3901E  CTB3901
MADX6379E  MADX6379

I want to check if any entries from df2['Site_x', 'Site_y'] is in any of the columns df1['Nodo','X','Y','Z']. The data need not be in the same row in both frames.

The final output after the check shud be as below

Site_x     Site_y   Checked
CTBX3901E  CTBX3901  True
CTB3901    CTB3901E  True
CTBZ3901E  CTBZ3901  True
CTBY3901E  CTB3901   True
MADX6379E  MADX6379  False

Pardon me for the clumsy dataset. In desperation of getting this part right, I had to paste the same data I was working with.

I've tried isin method with the below syntax but the output has False in the entire 'Checked' column.

df2['Checked'] = df2[['Site_x','Site_y']].isin(df1[['Nodo','X','Y','Z']]).any(axis=1)
Questioner
vvk24
Viewed
14
Quang Hoang 2019-07-03 23:40

You are pretty close:

df2['checked'] = df2.apply(lambda x: x.isin(df1.stack())).any(axis=1)