I'm trying to loop over some continuous variables to create ggplots. This works fine with aes_string, but I have now tried countless variations to include cut in the call to generate bins of the variable. Either the call fails, or the loop does not work and every plot uses the same variable value within aes.
In my actual data, I tried to calculate the breaks for cut beforehand, similar to cut_interval(n = 6), as each variable has a different range, but this did not help either.
library(tidyverse)
data(diamonds)
diamonds <- head(diamonds, 200)
# select some numeric categories to loop over
categories <- names(diamonds)[c(1,5,6)]
# this works fine in a loop
plot_list <- list()
for (category in categories) {
  plot_list[[category]] <- ggplot(diamonds, aes(x = x, y = z)) +
    geom_point(data = diamonds[diamonds$color == "E", ], aes_string(fill = category), colour = "grey50", pch = 21) +
    geom_point(data = diamonds[diamonds$color != "E", ], aes_string(fill = category, colour = "price"), pch = 21)
}
plot_list
# together with cut(), it does not work anymore
cut_plot_list <- list()
for (category in categories) {
  cut_plot_list[[category]] <- ggplot(diamonds, aes(x = x, y = z)) +
    geom_point(data = diamonds[diamonds$color == "E", ], aes_string(fill = cut(category, breaks = c(-Inf, 1, 10, 20, Inf))), colour = "grey50", pch = 21) +
    geom_point(data = diamonds[diamonds$color != "E", ], aes_string(fill = cut(category, breaks = c(-Inf, 1, 10, 20, Inf)), colour = "price"), pch = 21)
}
# fails: 'x' must be numeric
# this gives identical plots without fill
cut_plot_list <- list()
for (category in categories) {
  cut_plot_list[[category]] <- ggplot(diamonds, aes(x = x, y = z)) +
    geom_point(data = diamonds[diamonds$color == "E", ], aes(fill = cut(get(category), breaks = c(-Inf, 1, 10, 20, Inf))), colour = "grey50", pch = 21) +
    geom_point(data = diamonds[diamonds$color != "E", ], aes(fill = cut(get(category), breaks = c(-Inf, 1, 10, 20, Inf)), colour = price), pch = 21)
}
cut_plot_list
How do I combine a for loop (or lapply) in ggplot2 with dynamic discrete values for the variable?
EDIT:
Without a for loop, for a single variable, I would call it like this:
ggplot(diamonds, aes(x = x, y = z)) +
  geom_point(data = diamonds[diamonds$color == "E", ], aes(fill = table), colour = "grey50", pch = 21) +
  geom_point(data = diamonds[diamonds$color != "E", ], aes(fill = table, colour = price), pch = 21)
# or with the binned values
ggplot(diamonds, aes(x = x, y = z)) +
  geom_point(data = diamonds[diamonds$color == "E", ], aes(fill = cut(table, breaks = c(-Inf, 1, 10, 20, Inf))), colour = "grey50", pch = 21) +
  geom_point(data = diamonds[diamonds$color != "E", ], aes(fill = cut(table, breaks = c(-Inf, 1, 10, 20, Inf)), colour = price), pch = 21)
We can use non-standard evaluation: sym() turns the column name stored in category into a symbol, and !! unquotes it inside aes.
library(ggplot2)
apply_fun <- function(category) {
  ggplot(diamonds, aes(x = x, y = z)) +
    geom_point(data = diamonds[diamonds$color == "E", ],
               aes(fill = cut(!!sym(category), breaks = c(-Inf, 1, 10, 20, Inf))),
               colour = "grey50", pch = 21) +
    geom_point(data = diamonds[diamonds$color != "E", ],
               aes(fill = cut(!!sym(category), breaks = c(-Inf, 1, 10, 20, Inf)),
                   colour = price), pch = 21)
}
and then call it for each category:
plot_list <- lapply(categories, apply_fun)
To cut the data into n intervals, we can do
apply_fun <- function(category, n) {
  # n intervals need n + 1 equally spaced break points
  breaks <- seq(min(diamonds[[category]]), max(diamonds[[category]]), length.out = n + 1)
  ggplot(diamonds, aes(x = x, y = z)) +
    geom_point(data = diamonds[diamonds$color == "E", ],
               # include.lowest = TRUE keeps the minimum value in the first bin
               aes(fill = cut(!!sym(category), breaks = breaks, include.lowest = TRUE)),
               colour = "grey50", pch = 21) +
    geom_point(data = diamonds[diamonds$color != "E", ],
               aes(fill = cut(!!sym(category), breaks = breaks, include.lowest = TRUE),
                   colour = price), pch = 21)
}
Apply the function with
plot_list <- lapply(categories, apply_fun, n = 6)
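As an alternative to !!sym(), the .data pronoun can look up a column by its name as a string. The following is only a minimal sketch under that assumption: equal_breaks, bin_plot and diamonds_small are hypothetical names that are not part of the answer above, and the two geom_point layers are collapsed into one for brevity.

```r
library(ggplot2)

# Same subset as in the question, under a separate (hypothetical) name
diamonds_small <- head(ggplot2::diamonds, 200)
categories <- names(diamonds_small)[c(1, 5, 6)]  # "carat", "depth", "table"

# Hypothetical helper: n equal-width intervals need n + 1 break points
equal_breaks <- function(x, n) {
  rng <- range(x, na.rm = TRUE)
  seq(rng[1], rng[2], length.out = n + 1)
}

# Hypothetical helper: one plot per column, binned into n intervals
bin_plot <- function(category, n = 6) {
  breaks <- equal_breaks(diamonds_small[[category]], n)
  ggplot(diamonds_small, aes(x = x, y = z)) +
    geom_point(
      # .data[[category]] refers to the column named by the string;
      # breaks is a plain numeric vector and needs no unquoting
      aes(fill = cut(.data[[category]], breaks = breaks, include.lowest = TRUE)),
      colour = "grey50", pch = 21
    ) +
    labs(fill = category)
}

plot_list <- lapply(categories, bin_plot)
names(plot_list) <- categories
```

Because each call to bin_plot has its own environment, every plot keeps its own category and breaks, which should avoid the "same variable in every plot" effect of the for loop with get().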
That works great, thank you! Is there a way to also provide an object as input to the cut(breaks=) argument? I tried instead fill = cut_interval(!!sym(category), n = 6)) inside the call to have fitting bins for each category, but this will give me a total of 12 bins, each 6 for color == "E" and color != "E". I would define the breaks before the ggplot call, but as they are no breaks, I can't use !!sym()
@crazysantaclaus I have updated the answer. Is that what you want?
That is perfect! Could you explain to me why breaks as a variable works without using !!sym, but category does not? I will also edit your code slightly, as I forgot to use a pch here which can actually display fill
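@crazysantaclaus sym is used only for columns and not for values like breaks. category holds a column name as a string, so it has to be turned into a symbol and unquoted before aes can look it up in the data; breaks is an ordinary numeric vector and is used as-is.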
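To make the difference concrete, here is a small sketch outside of ggplot2 that uses rlang directly. The data frame df and its values are made up for illustration; only sym(), !!, expr() and eval_tidy() come from rlang.

```r
library(rlang)

df <- data.frame(table = c(55, 61, 58))  # made-up example data
category <- "table"                      # a column name stored as a string
breaks <- c(50, 57, 60, 65)              # an ordinary numeric vector

# The string itself is not the column, so cut() receives a character value
res_string <- try(cut(category, breaks = breaks), silent = TRUE)
inherits(res_string, "try-error")  # TRUE ('x' must be numeric)

# sym() turns the string into the symbol `table`; !! inserts it into the call
ex <- expr(cut(!!sym(category), breaks = breaks))
print(ex)  # cut(table, breaks = breaks)

# With df as the data mask, `table` resolves to the column,
# while `breaks` is found as a normal variable in the calling environment
binned <- eval_tidy(ex, data = df)
as.character(binned)  # "(50,57]" "(60,65]" "(57,60]"
```

aes does something comparable when the plot is built: column names are looked up in the layer data, and everything else is looked up as a regular R object.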