Preferably, I am looking for a dplyr solution.
I have
> str(p)
'data.frame': 25 obs. of 1 variable:
$ intram_size: chr "5" "4,7 x 6,6 mm" "4x6x7 mm" "5" ...
and
> head(p)
intram_size
1 5
2 4,7 x 6,6 mm
3 4x6x7 mm
4 5
5 4x11
6 1x4
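p$intram_size represents two-dimensional measurements of a tumour. I need to extract the largest number, i.e. the largest diameter measured. One problem is that a comma (,) has been used as the decimal separator in some entries.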
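The decimal comma matters because `as.numeric()` only understands a point as the decimal mark. A minimal base R sketch of the issue, where `raw` is a hypothetical single value and not part of the data above:

```r
# Hypothetical single measurement written with a decimal comma
raw <- "4,7"

# as.numeric() cannot parse the comma, so this yields NA (with a warning)
bad <- suppressWarnings(as.numeric(raw))

# Swapping the comma for a point first gives a usable number
good <- as.numeric(sub(",", ".", raw, fixed = TRUE))
```

So any solution has to normalise the separator before converting to numeric.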
Expected output
> head(p)
intram_size new
1 5 5
2 4,7 x 6,6 mm 6.6
3 4x6x7 mm 7
4 5 5
5 4x11 11
6 1x4 4
Data sample
p <- structure(list(intram_size = c("5", "4,7 x 6,6 mm", "4x6x7 mm",
"5", "4x11", "1x4", "7x10", "8", "3", "7", "7x4x3", "10x5", "8",
"7", "11", "7", "10", "5", "13", "5", "3,5", "10", "2,5", "7",
"11 x 6 x 4")), row.names = c(NA, 25L), class = "data.frame")
library(tidyverse)

p %>%
  mutate(intram_size = str_replace_all(intram_size, ',', '.'),   # decimal comma -> point
         new = str_extract_all(intram_size, '\\d+(\\.\\d+)?'),   # every number in the string
         new = map_dbl(new, ~max(as.numeric(.x))))               # largest number per row
# intram_size new
#1 5 5.0
#2 4.7 x 6.6 mm 6.6
#3 4x6x7 mm 7.0
#4 5 5.0
#5 4x11 11.0
#6 1x4 4.0
#7 7x10 10.0
#8 8 8.0
#9 3 3.0
#10 7 7.0
#11 7x4x3 7.0
#12 10x5 10.0
#13 8 8.0
#14 7 7.0
#15 11 11.0
#16 7 7.0
#17 10 10.0
#18 5 5.0
#19 13 13.0
#20 5 5.0
#21 3.5 3.5
#22 10 10.0
#23 2.5 2.5
#24 7 7.0
#25 11 x 6 x 4 11.0
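To see what each step of the pipeline contributes, the same three steps can be run one at a time. The following is a base R sketch of the same idea, not the code used above: it swaps in `gsub()`, `regmatches()`/`gregexpr()` and `vapply()` for the stringr and purrr calls, and the vector `x` is a hypothetical stand-in for a few values of `p$intram_size`.

```r
# Hypothetical stand-in for a few values of p$intram_size
x <- c("4,7 x 6,6 mm", "4x6x7 mm", "5")

# Step 1: turn decimal commas into decimal points
x_dot <- gsub(",", ".", x, fixed = TRUE)

# Step 2: extract every number (integer or decimal) from each string;
# the result is a list with one character vector per element of x
nums <- regmatches(x_dot, gregexpr("[0-9]+(\\.[0-9]+)?", x_dot))

# Step 3: convert each set of matches to numeric and keep the largest
largest <- vapply(nums, function(v) max(as.numeric(v)), numeric(1))
largest
```

Note that an entry containing no digits at all would give `-Inf` with a warning from `max()`, so such rows may need separate handling if they can occur.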
Thanks @Ronak. You make it look so easy. Much appreciated.