Note: This article is reproduced from stackoverflow.com.
pandas python-3.6

Extracting numeric value from a string of a dataframe's column and replace the string with that nume

Posted on 2020-03-31 22:59:39

Say column 'A' contains these values in its first 3 rows: 4.5 mg, 5.8 mg, 6.3 mg. What I want is that, after extracting, it looks like: 4.5, 5.8, 6.3

Any help? Besides, I can't figure out how to show my dataframe on Stack Overflow, so I am really sorry for the formatting of the question body.

Questioner: Ayan Chowdhury
Viewed: 32
jezrael 2020-01-31 19:55

Use Series.str.extract with casting to floats:

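import pandas as pd

df = pd.DataFrame({'A': '4.5 mg, 5.8 mg, 6.3 mg'.split(', ')})

# Capture the first decimal number in each string and convert it to float
df['new'] = df['A'].str.extract(r'(\d+\.\d+)', expand=False).astype(float)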
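If values with more than one digit on either side of the decimal point can occur (for example 12.75 mg), the pattern needs `\d+` on both sides of the dot. A minimal sketch, using made-up sample values that are not from the question:

```python
import pandas as pd

# Hypothetical sample data with multi-digit values (not from the question)
df = pd.DataFrame({'A': ['4.5 mg', '12.75 mg', '100.0 mg']})

# \d+ on both sides of the dot accepts any number of digits;
# expand=False returns a Series rather than a one-column DataFrame
df['new'] = df['A'].str.extract(r'(\d+\.\d+)', expand=False).astype(float)

print(df)
```

Passing `expand=False` is optional here; with a single capture group it just makes the result a Series, which keeps the assignment to one column explicit.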

If some values may be integers:

df['new'] = df['A'].str.extract(r"(\d*\.?\d+|\d+)").astype(float)

print(df)
        A  new
0  4.5 mg  4.5
1  5.8 mg  5.8
2  6.3 mg  6.3
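To see how an integer-tolerant pattern behaves on mixed input, here is a small sketch. The sample values (including the row with no number) are assumptions for illustration, and the pattern is a simplified form of the one above:

```python
import pandas as pd

# Hypothetical mixed input: a decimal, an integer, a leading-dot value, and no number
df = pd.DataFrame({'A': ['4.5 mg', '7 mg', '.5 mg', 'unknown']})

# Optional integer part, optional dot, then at least one digit;
# rows where nothing matches become NaN
df['new'] = df['A'].str.extract(r'(\d*\.?\d+)', expand=False).astype(float)

print(df)
```

Rows without any number end up as NaN rather than raising an error, so they can be dropped or filled afterwards with `dropna` or `fillna`.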

If the number is always separated from the unit by whitespace, use Series.str.split and select the first element of each result with str[0]:

df['val'] = df['A'].str.split().str[0].astype(float)
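Since the question asks to replace the strings with the numbers, the result can also be written back to the same column. The sketch below assumes the number always comes first and is followed by whitespace; it swaps `astype(float)` for `pd.to_numeric` with `errors='coerce'`, which is a choice made here and not part of the original answer:

```python
import pandas as pd

df = pd.DataFrame({'A': ['4.5 mg', '5.8 mg', '6.3 mg']})

# Take the text before the first whitespace and overwrite column A;
# errors='coerce' turns anything non-numeric into NaN instead of raising
df['A'] = pd.to_numeric(df['A'].str.split().str[0], errors='coerce')

print(df['A'].tolist())
```

With `astype(float)`, a stray non-numeric first token would raise a `ValueError`; `errors='coerce'` trades that for NaN, which may or may not be what you want.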