ggplot2 – R studio ggplot linebreak does not appear when data is NA

Clarifying the question

Let’s first create a reasonably minimal reproducible example so that others can help you.

For the data frame:

df %>%
  filter(Datum > as.Date("2017-10-01")) %>%
  dput()

This results in the following, which others can copy and paste to recreate your data frame:

structure(list(Datum = structure(c(17444, 17444, 17444, 17444, 
17470, 17479, 17479, 17479, 17479, 17479, 17486, 17506, 17506, 
17506, 17506, 17570, 17904, 17935, 17945, 17953, 18012, 18016, 
18030, 18039, 18044, 18044, 18059, 18072, 18072, 18086, 18088, 
18100, 18114, 18128, 18128, 18134, 18142, 18156, 18163, 18165, 
18199, 18207, 18229, 18254), class = "Date"), Parameter = c("chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", NA, "chloride - nf - mg/l", 
NA, "chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l", "chloride - nf - mg/l", 
"chloride - nf - mg/l", "chloride - nf - mg/l"), Waarden = c(39.2, 
33.3, 37.5, 66.5, 81.3, 70.5, 82.6, 72, 66.3, 85.8, 85.9, 75.9, 
68.7, 58.5, 86.9, NA, 131, NA, 141, 142, 86.6, 115, 121, 115, 
117, 113, 96.7, 91.3, 88.2, 101, 89.3, 92.1, 85.6, 75.2, 76.4, 
91.6, 89.9, 84.9, 92.7, 68.9, 109, 94.5, 82.9, 100), Locatie = c("Wollebrand_Inlaat zwemplas", 
"veilingroute_bovenstroom stuw", "Strijp_inlaat FLORA", "Waterskivijver Wollebrand steiger-vlot", 
"hoofdwatergang_Lange Broekweg nr 78 (?)_tuin", "Wollebrand_Inlaat zwemplas", 
"hoofdwatergang_Lange Broekweg nr 78 (?)_tuin", "veilingroute_bovenstroom stuw", 
"Strijp_inlaat FLORA", "Waterskivijver Wollebrand steiger-vlot", 
"Waterskivijver Wollebrand steiger-vlot", "Wollebrand_Inlaat zwemplas", 
"veilingroute_bovenstroom stuw", "Strijp_inlaat FLORA", "Waterskivijver Wollebrand steiger-vlot", 
NA, "Waterskivijver Wollebrand", NA, "Waterskivijver Wollebrand", 
"Waterskivijver Wollebrand", "Wollebrand_Inlaat zwemplas", "Waterskivijver Wollebrand", 
"Waterskivijver Wollebrand", "Wollebrand_Inlaat zwemplas", "Waterskivijver Wollebrand", 
"Waterskivijver Wollebrand", "Waterskivijver Wollebrand", "Waterskivijver Wollebrand", 
"Wollebrand_Inlaat zwemplas", "Waterskivijver Wollebrand", "Wollebrand_Inlaat zwemplas", 
"Waterskivijver Wollebrand", "Waterskivijver Wollebrand", "Waterskivijver Wollebrand", 
"Waterskivijver Wollebrand", "Wollebrand_Inlaat zwemplas", "Waterskivijver Wollebrand", 
"Waterskivijver Wollebrand", "Waterskivijver Wollebrand", "Wollebrand_Inlaat zwemplas", 
"Wollebrand_Inlaat zwemplas", "Waterskivijver Wollebrand", "Wollebrand_Inlaat zwemplas", 
"Wollebrand_Inlaat zwemplas")), row.names = c(NA, -44L), class = "data.frame")

And we can reduce the problematic code to this:

chloride <- subset(df, Parameter == "chloride - nf - mg/l")
ggplot(data = chloride2, aes(x = Datum, y = as.numeric(as.factor(Waarden)))) +
  geom_line(aes(color = Locatie))

Problem

First of all, notice that you create a data frame chloride but then plot chloride2.

Second, if we look at chloride, we’ll see that there are no NAs in there. That is because we took a subset from df where Parameter == "chloride - nf - mg/l", whereas all your NA values in df also had NA for Parameter. Even if you didn’t filter them out, ggplot wouldn’t create breaks, since their Locatie (also NA) wouldn’t match any of the lines it’s drawing. We need to add an NA row for each Parameter and each Locatie where a break should appear.
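To see why the NA rows vanish, here is a small sketch on a made-up data frame (the name toy and its values are purely illustrative, not part of your data). subset() treats an NA in the condition as FALSE, so rows whose Parameter is NA are dropped:

```r
# Toy data frame: the middle row has NA in both columns, like the gap rows in df
toy <- data.frame(
  Parameter = c("chloride - nf - mg/l", NA, "chloride - nf - mg/l"),
  Waarden   = c(86.9, NA, 131),
  stringsAsFactors = FALSE
)

# The comparison itself yields NA for the middle row...
toy$Parameter == "chloride - nf - mg/l"

# ...and subset() keeps only rows where the condition is TRUE
toy_sub <- subset(toy, Parameter == "chloride - nf - mg/l")
nrow(toy_sub)           # 2 rows remain
anyNA(toy_sub$Waarden)  # FALSE: the gap row is gone
```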

Correcting just the first problem on our example dataset, we get this plot:

Warning / note: I noticed you plot as.numeric(as.factor(Waarden)) on the y-axis, while the legend you used in the example implies raw values. The as.numeric(as.factor(...)) trick returns the integer code of each factor level, which is effectively a ranking, and I highly doubt that’s what you want! Example:

> as.numeric(as.factor( c(12,12,13,14,13) ))
[1] 1 1 2 3 2
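If a column really were stored as a factor and the goal is to recover the original numbers, a common approach is to go through as.character() first. A minimal sketch, using the same made-up vector as above (the names f, codes and values are only for illustration):

```r
f <- as.factor(c(12, 12, 13, 14, 13))

codes  <- as.numeric(f)                # level codes: 1 1 2 3 2
values <- as.numeric(as.character(f))  # original values: 12 12 13 14 13

codes
values
```

In the dput() output above, Waarden is already numeric, so plain y = Waarden should be enough.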

Solution

We need to add, either to chloride or to df, a row for each date+parameter+location combo where we want a break to occur.

I’ll pick the skippable dates manually:

skip_dates <- as.Date(c("2017-02-08","2018-02-08","2019-02-08"))

We can generate the parameters and locations to skip on these dates:

skip_parameters <- unique(na.omit(df$Parameter))
skip_location <- unique(na.omit(df$Locatie))

Then we create a data frame of all combinations of these three criteria, and make sure the columns have the same names as those in df:

df_skip <- expand.grid(skip_dates, skip_parameters, skip_location)
colnames(df_skip) <- c("Datum","Parameter","Locatie")
df_skip$Waarden <- NA

Now we have in df_skip a single row for each combo we want to plot a break in:

[Image: example df_skip]
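As a rough illustration of what expand.grid() produces, here is a sketch with made-up inputs (two dates, one parameter and two locations; every name prefixed with toy_ is hypothetical). It returns one row per combination, so the row count is the product of the input lengths:

```r
toy_dates     <- as.Date(c("2018-02-08", "2019-02-08"))
toy_parameter <- "chloride - nf - mg/l"
toy_locations <- c("Wollebrand_Inlaat zwemplas", "Waterskivijver Wollebrand")

# Naming the arguments sets the column names directly;
# stringsAsFactors = FALSE keeps the text columns as character
toy_skip <- expand.grid(
  Datum     = toy_dates,
  Parameter = toy_parameter,
  Locatie   = toy_locations,
  stringsAsFactors = FALSE
)
toy_skip$Waarden <- NA_real_

nrow(toy_skip)  # 2 dates * 1 parameter * 2 locations = 4
toy_skip
```

Passing named arguments is an optional variation on the colnames() step above; both should give the same column names.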

Finally we add df_skip to the original df and run the plot again:

df <- rbind(df, df_skip)

chloride <- subset(df, Parameter == "chloride - nf - mg/l")
ggplot(data = chloride, aes(x = Datum, y = Waarden)) +
  geom_line(aes(color = Locatie))

[Image: solution plot]

You’ll get a warning about missing values, but since those are intentional we can ignore that.
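Note that rbind() appends the break rows at the end of df. That should not matter, because geom_line() connects observations in order of the x variable. A small sketch on made-up data (all toy_ names and values are hypothetical) shows the NA landing between the observations once the rows are ordered by Datum:

```r
toy_obs <- data.frame(
  Datum   = as.Date(c("2018-01-05", "2019-01-08")),
  Locatie = "Waterskivijver Wollebrand",
  Waarden = c(86.9, 131)
)
toy_break <- data.frame(
  Datum   = as.Date("2018-02-08"),
  Locatie = "Waterskivijver Wollebrand",
  Waarden = NA_real_
)

toy_all    <- rbind(toy_obs, toy_break)        # break row is appended last
toy_sorted <- toy_all[order(toy_all$Datum), ]  # ordered the way the line is drawn
toy_sorted$Waarden                             # 86.9, NA, 131
```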

Complete code

skip_dates <- as.Date(c("2017-02-08","2018-02-08","2019-02-08"))
skip_parameters <- unique(na.omit(df$Parameter))
skip_location <- unique(na.omit(df$Locatie))
df_skip <- expand.grid(skip_dates, skip_parameters, skip_location)
colnames(df_skip) <- c("Datum","Parameter","Locatie")
df_skip$Waarden <- NA
df <- rbind(df, df_skip)

chloride <- subset(df, Parameter == "chloride - nf - mg/l")
ggplot(data = chloride, aes(x = Datum, y = Waarden)) +
  geom_line(aes(color = Locatie))
