I have a table with a 'State' column
and the IP CIDR range associated with each state.
TABLE A
--------------------------------------------------
| ID | State | IP_subnet |
--------------------------------------------------
| 1 | CA | 192.168.1.0/24 |
--------------------------------------------------
| 2 | TX | 172.68.7.0/24 |
--------------------------------------------------
| 3 | NY | 61.141.47.0/24 |
--------------------------------------------------
I would like to iterate through the table below and match the IP field against the IP_subnet field.
TABLE B
| ID | IP |
--------------------------------------
| 1 | 61.141.47.1 |
--------------------------------------
| 2 | 192.168.1.48 |
--------------------------------------
| 3 | 172.68.7.124 |
--------------------------------------
| 4 | 40.32.123.212 |
--------------------------------------
Below are the results I am going for (matching the associated State to the IP):
| ID | IP | State |
--------------------------------------------------
| 1 | 61.141.47.1 | NY |
--------------------------------------------------
| 2 | 192.168.1.48 | CA |
--------------------------------------------------
| 3 | 172.68.7.124 | TX |
--------------------------------------------------
| 4 | 40.32.123.212 | null |
--------------------------------------------------
I know the code below works for a single value. How do I check a whole column of IPs
against a column of subnets?
from ipaddress import IPv4Address, IPv4Network
IPv4Address('172.68.7.124') in IPv4Network('172.68.7.0/24')
FYI, here is the first table as a DataFrame:
import pandas as pd
data = [[1, 'CA', '192.168.1.0/24'], [2, 'TX', '172.68.7.0/24'], [3, 'NY', '61.141.47.0/24']]
df = pd.DataFrame(data, columns=['ID', 'State', 'IP_subnet'])
First, use the two data frames to find the state for each IP and store the result in a dictionary. Then build a new column from that dictionary and add it to the data frame of IPs.
I think it can be done in a more compact way, but this still does the job.
import pandas as pd
data = [[1, 'CA', '192.168.1.0/24'], [2, 'TX', '172.68.7.0/24'], [3, 'NY', '61.141.47.0/24']]
df = pd.DataFrame(data, columns=['ID', 'State', 'IP_subnet'])
# strip the trailing '.0/24' so only the network prefix remains (assumes every subnet is a /24)
df['IP_subnet'] = df['IP_subnet'].str.replace('.0/24', '', regex=False)
data2 = [[1, '61.141.47.1'], [2, '192.168.1.48'], [3, '172.68.7.124'], [4, '40.32.123.212']]
df2 = pd.DataFrame(data2, columns=['ID', 'IP'])
# match IP with state
data = {}
for index, row in df.iterrows():
    # append '.' so that prefix '192.168.1' does not also match '192.168.10.x'
    ww = df2[df2['IP'].str.startswith(row['IP_subnet'] + '.')]
    # record every matching IP, not just the first one
    for ip in ww['IP']:
        data[ip] = row['State']
# create State column
state_data = []
for index, row in df2.iterrows():
    if row['IP'] in data:
        state_data.append(data.get(row['IP']))
    else:
        state_data.append('NaN')
df2['State'] = state_data
Output:
ID IP State
0 1 61.141.47.1 NY
1 2 192.168.1.48 CA
2 3 172.68.7.124 TX
3 4 40.32.123.212 NaN
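The string-prefix approach above only works when every subnet is a /24. As a possible alternative, here is a minimal sketch that builds on the `ipaddress` check from the question, so any prefix length should work. The helper name `find_state` and the variables `subnet_rows`, `ip_values` and `networks` are made up for this illustration, and the sample data is kept as plain lists so the block runs without pandas.

```python
from ipaddress import ip_address, ip_network

# Same sample data as in the question, as plain lists (hypothetical names).
subnet_rows = [('CA', '192.168.1.0/24'), ('TX', '172.68.7.0/24'), ('NY', '61.141.47.0/24')]
ip_values = ['61.141.47.1', '192.168.1.48', '172.68.7.124', '40.32.123.212']

# Parse every CIDR string once, paired with its state.
networks = [(ip_network(cidr), state) for state, cidr in subnet_rows]


def find_state(ip):
    """Return the state of the first subnet containing ip, or None if none does."""
    address = ip_address(ip)
    for network, state in networks:
        if address in network:
            return state
    return None


states = [find_state(ip) for ip in ip_values]
print(states)  # ['NY', 'CA', 'TX', None]
```

With the data frames above, the same helper could presumably be applied as `df2['State'] = df2['IP'].map(find_state)`, provided `networks` is built from the original CIDR strings (before the `.0/24` suffix is stripped). This checks every IP against every subnet, which is fine for small tables but may be slow for very large ones.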
Thank you for your help!!