Merge rows with the same ID but with overlapping variables

Em Laskey 2020-02-04 21:58

I'm not sure whether this is exactly what you want, but to combine rows of a data frame based on multiple conditions you can use the dplyr package: group the rows with group_by() and collapse each group with summarise(). I generated some example data directly in R; you will have to adapt the code to your own data.

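# load dplyr, which provides %>%, group_by() and summarise()
library(dplyr)

# generate example data: 20 IDs, each appearing twice
set.seed(1)  # only so the random example is reproducible
ID <- rep(1:20, 2)
visitors <- sample(1:50, 40, replace = TRUE)
impact <- sample(rep(c("a", "b", "c", "d", "e"), 8))
arrival <- sample(rep(8:15, 5))
departure <- sample(rep(16:23, 5))

df <- data.frame(ID, visitors, impact, arrival, departure)
# make sure impact is character (it is a factor by default in R < 4.0)
df$impact <- as.character(df$impact)

# summarise rows with identical ID: highest visitor count, earliest arrival,
# latest departure, and all impact values pasted into one string
df_summary <- df %>%
  group_by(ID) %>%
  summarise(visitors = max(visitors), arrival = min(arrival),
            departure = max(departure), impact = paste0(impact, collapse = ", "))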
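
To make it easier to check what the summarise step does, here is a small sketch on hand-written data where the result can be verified by eye. The data frame toy and its values are made up for illustration and are not from the original question. As a variation on the code above, this version wraps impact in unique() so that repeated values are listed only once; whether that is wanted depends on your data.

```r
library(dplyr)

# Hypothetical toy data (made up): two rows per ID with differing values
toy <- data.frame(
  ID        = c(1, 1, 2, 2),
  visitors  = c(10, 25, 7, 3),
  impact    = c("a", "b", "c", "c"),
  arrival   = c(9, 8, 12, 14),
  departure = c(17, 20, 16, 22),
  stringsAsFactors = FALSE
)

# One row per ID: max visitors, earliest arrival, latest departure,
# and the distinct impact values pasted into a single string
toy_summary <- toy %>%
  group_by(ID) %>%
  summarise(
    visitors  = max(visitors),
    arrival   = min(arrival),
    departure = max(departure),
    impact    = paste(unique(impact), collapse = ", ")
  )

print(toy_summary)
```

For ID 1 this keeps 25 visitors, arrival 8, departure 20 and impact "a, b"; for ID 2 it keeps 7 visitors, arrival 12, departure 22 and impact "c". Without unique(), ID 2 would show "c, c" instead.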

Hope this helps!

Andrew Torsney 2020-02-05 22:15:10

This is exactly what I wanted and worked perfectly for my data. I really really appreciate the help.

Em Laskey 2020-02-05 23:04:41

Glad I could help! Could you maybe accept the answer if you're happy with it? Thanks!

Andrew Torsney 2020-02-11 17:39:02

Sorry, this was the first question I have ever asked, so I didn't realise I had to accept the answer. It is accepted now.
