I have the following data frame and would like to replace the values in the third column, where each cell holds one or more values delimited by ";".
cat file.text
Name     ID      Values
John     81-502  1
Mike     81-501  2;2;2
Matthew  81-512  1;0
def fun(x):
    return x + 1
I would like to apply this function to replace the Values column in my dataframe such that:
cat out.txt
Name     ID      Values
John     81-502  2
Mike     81-501  3;3;3
Matthew  81-512  2;1
First you need to split the values on ;, convert them to integers, add 1, convert them back to strings, and join them with ; again:
df['Values'] = df['Values'].apply(lambda x: ';'.join(str(int(y) + 1) for y in x.split(';')))
Solution with list comprehension:
df['Values'] = [';'.join(str(int(y) + 1) for y in x.split(';')) for x in df['Values']]
print(df)
Name ID Values
0 John 81-502 2
1 Mike 81-501 3;3;3
2 Matthew 81-512 2;1
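For anyone who wants to run this end to end, here is a minimal sketch that rebuilds the sample frame from the question and routes every element through the asker's fun instead of hard-coding + 1. The helper name apply_to_each and its sep parameter are made up for illustration, and the sketch assumes every cell is a non-empty string of integers.

```python
import pandas as pd

def fun(x):
    # The function from the question.
    return x + 1

def apply_to_each(cell, func, sep=";"):
    # Hypothetical helper: split the cell, apply func to each integer,
    # then join the results back into one delimited string.
    return sep.join(str(func(int(part))) for part in cell.split(sep))

# Sample data from the question, with Values stored as strings.
df = pd.DataFrame({
    "Name": ["John", "Mike", "Matthew"],
    "ID": ["81-502", "81-501", "81-512"],
    "Values": ["1", "2;2;2", "1;0"],
})

df["Values"] = df["Values"].apply(lambda cell: apply_to_each(cell, fun))
print(df["Values"].tolist())  # ['2', '3;3;3', '2;1']
```

Passing the function in keeps the splitting and joining logic separate from the arithmetic, so fun can be swapped for something else without touching the rest.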
Quick question: what if df['Values'][0] is NaN?
@user171558 - then use
.apply(lambda x: ';'.join(str(int(y) + 1) for y in x.split(';')) if pd.notna(x) else np.nan)
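The same idea written as a named function may be easier to read. This is only a sketch: increment_cell is a hypothetical name, and the series below is made-up sample data with one missing entry.

```python
import numpy as np
import pandas as pd

def increment_cell(cell, sep=";"):
    # Pass missing values through unchanged instead of calling .split on them.
    if pd.isna(cell):
        return np.nan
    return sep.join(str(int(part) + 1) for part in cell.split(sep))

# Made-up sample data with a missing entry in the middle.
s = pd.Series(["1", np.nan, "2;2;2"])
result = s.apply(increment_cell)
print(result.tolist())  # ['2', nan, '3;3;3']
```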
In my actual df, the values column consists of strings that I feed into a package function. Would it be OK to use pd.NA instead of np.nan? After some reading, I assume they're very similar.
@user171558 - I think it works the same here :)
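A small check of that, as a sketch with made-up sample data and assuming pandas 1.0 or later (where pd.NA exists): both markers are detected by isna/notna, so the guard above works with either one, but they are not the same object and they compare differently.

```python
import numpy as np
import pandas as pd

# Made-up sample data: the same column with each kind of missing marker.
s_nan = pd.Series(["1;0", np.nan])
s_na = pd.Series(["1;0", pd.NA])

# Missing-value detection treats both markers the same way.
nan_mask = s_nan.isna().tolist()
na_mask = s_na.isna().tolist()
print(nan_mask, na_mask)  # [False, True] [False, True]

# They still differ: np.nan is a float, and comparing it gives a plain bool,
# while comparing pd.NA gives pd.NA back.
print(type(np.nan).__name__)  # float
print(np.nan == np.nan)       # False
print(pd.NA == pd.NA)         # <NA>
```

So for the pd.notna(x) guard either choice behaves the same; the difference only matters if later code compares the values directly or expects a float.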