I would like know how many animals will show up on a specific day. This chart describes people register their animals in advance.
For instance, at 7
days ahead, someone registered for their 4
cats to show up on 5/3/2019
; at 6
days ahead, another 9
cats are registered for 5/3/2019
. So there will be 7+6=13
cats showing up on 5/3/2019
.
When days_ahead
= 0, it simply means someone registered on the event day. For instance, 4
wolves registered for 5/1/2019
on 5/1/2019
(0 days ahead), and there will be 4
wolves that day.
library(dplyr)
set.seed(0)
animal = c(rep('cat', 5), rep('dog', 6), rep('wolf', 3))
date = sample(seq(as.Date("2019/5/1"), as.Date('2019/5/10'), by='day'), 14, replace=TRUE)
days_ahead = sample(seq(0,14), 14, replace=FALSE)
number = sample.int(10, 14, replace=TRUE)
dt = data.frame(animal, date, days_ahead, number) %>% arrange(animal, date)
The expected outcome should have the same 1-3
columns as the example, but the fourth column should be an accumulated number by each date
, accumulating on days_ahead
.
I added an expected outcome here. The comments
are used to explain the accumulated_number
column.
I've considered loop
function but not entirely sure how to loop over three variables (cat, date, and days_ahead). Any advice is appreciated!!
The accumulated_number
is somewhat easy with cumsum()
. See this link for your comments
field:
Cumulatively paste (concatenate) values grouped by another variable
dt%>%
group_by(animal,date)%>%
mutate(accumulated_number = cumsum(number)
,comments = Reduce(function(x1, x2) paste(x1, x2, sep = '+'), as.character(number), accumulate = T)
)%>%
ungroup()
Also, my dataset is slightly different than yours with the same seed. Still, it seems to work.
# A tibble: 14 x 6
animal date days_ahead number accumulated_number comments
<fct> <date> <int> <int> <int> <chr>
1 cat 2019-05-03 10 9 9 9
2 cat 2019-05-04 6 4 4 4
3 cat 2019-05-06 8 5 5 5
4 cat 2019-05-09 5 4 4 4
5 cat 2019-05-10 13 6 6 6
6 dog 2019-05-01 0 2 2 2
7 dog 2019-05-03 3 5 5 5
8 dog 2019-05-07 1 7 7 7
9 dog 2019-05-07 9 8 15 7+8
10 dog 2019-05-09 12 2 2 2
11 dog 2019-05-10 7 9 9 9
12 wolf 2019-05-02 14 5 5 5
13 wolf 2019-05-03 11 8 8 8
14 wolf 2019-05-07 4 9 9 9
I don't think OP need
comments
column, it is only for explanation. It is justcumsum
by group.@Ronak I think you're right. The
comments
was kind of fun to research, though. I'm keeping it in.