Note: This article is reproduced from stackoverflow.com.
Tags: dataframe, r, vector, mean

Calculating the mean of a vector that is present in a data frame cell

Published on 2020-03-31 22:57:28

I have a column (named A) in a data frame that contains natural numbers as well as vectors of natural numbers. For the cells that contain a vector, I want to calculate the mean of that vector and store the result in a new column, named B.

So far, I have tried the following:

Val <- unlist(lapply(str_split(data$A, ","),
                     function(x) mean(as.numeric(x), na.rm=TRUE)))
Val[length(Val)] <- mean(Val[-length(Val)], na.rm=TRUE)
data$B <- Val

However, this doesn't work correctly. The code above does not give me the mean of each vector, and it returns NaN when the vector has only 2 elements. Below is an example of what it looks like:

[image: example of the data frame, not included]

Questioner: Jeroen
Viewed: 47

Answer by Ronak Shah, 2020-01-31 19:45

If column A is stored as text, another way is to remove the extra characters from the column using gsub, split on the comma, and then take the mean. Using @zx8754's data:

sapply(strsplit(gsub('[c()]', '', df1$A), ","), function(x) mean(as.numeric(x)))
#[1] 1.000 2.000 3.000 2.000 3.000 2.333 3.000 3.000 2.500
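
To illustrate why cleaning the text first matters, here is a minimal self-contained sketch. The data frame below is hypothetical (the data used above is not shown here), and it assumes the vector cells are stored as text such as "c(2, 3)". It uses base strsplit in place of str_split, which should behave the same way for a plain comma. Under that assumption, splitting without cleaning would explain the NaN for 2-element vectors reported in the question. Note also that the third line of the question's code overwrites the last value with the mean of all the others, which may not be intended.

```r
# Hypothetical data: column A is stored as text, and some cells hold the
# printed form of a vector, e.g. "c(2, 3)". This is an assumption.
df1 <- data.frame(
  A = c("1", "2", "c(2, 3)", "c(1, 2, 4)"),
  stringsAsFactors = FALSE
)

# Splitting on "," without cleaning leaves pieces such as "c(2" and " 3)",
# which as.numeric() coerces to NA (with a warning, silenced here).
raw_pieces <- strsplit(df1$A, ",")
naive <- suppressWarnings(
  sapply(raw_pieces, function(x) mean(as.numeric(x), na.rm = TRUE))
)
# "c(2, 3)": both pieces become NA, so the mean of nothing is NaN.
# "c(1, 2, 4)": only the middle piece survives, so the result is 2.

# Removing "c", "(" and ")" first leaves clean numeric pieces.
clean_pieces <- strsplit(gsub("[c()]", "", df1$A), ",")
df1$B <- sapply(clean_pieces, function(x) mean(as.numeric(x)))
# B is now 1, 2, 2.5 and 7/3 for the four rows.
```

If column A turns out to be a list column holding real numeric vectors rather than text, no string handling is needed and sapply(df1$A, mean) would likely be enough.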