In this post, we will visualize the spread of worldwide COVID-19 cases over time. I obtained the data from Rami Krispin’s website, https://ramikrispin.github.io/coronavirus/, using the coronavirus package. I also decided to experiment with John Coene’s fantastic echarts4r package, which gives us access to the echarts API.
Load the libraries and get the data into the R session.
library(dplyr)
library(echarts4r)
library(coronavirus)
# Get the data
data("coronavirus")
Data Preparation
Print out the first 6 observations.
head(coronavirus)
date province country lat long type cases uid iso2 iso3
1 2020-01-22 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
2 2020-01-23 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
3 2020-01-24 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
4 2020-01-25 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
5 2020-01-26 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
6 2020-01-27 Alberta Canada 53.9333 -116.5765 confirmed 0 12401 CA CAN
code3 combined_key population continent_name continent_code
1 124 Alberta, Canada 4413146 North America NA
2 124 Alberta, Canada 4413146 North America NA
3 124 Alberta, Canada 4413146 North America NA
4 124 Alberta, Canada 4413146 North America NA
5 124 Alberta, Canada 4413146 North America NA
6 124 Alberta, Canada 4413146 North America NA
We are interested in date and type. Let’s take a look at the distinct values for type.
coronavirus %>% count(type)
type n
1 confirmed 330327
2 death 330327
3 recovery 313182
There are only 3 values: confirmed, death, and recovery. Next, we will sum the cases for each value by date and store the results in separate data sets.
# Daily worldwide total of confirmed cases
dt1 <- coronavirus %>%
  filter(type == "confirmed") %>%
  group_by(date) %>%
  summarize(Confirmed = sum(cases, na.rm = TRUE), .groups = "drop")
# Daily worldwide total of deaths
dt2 <- coronavirus %>%
  filter(type == "death") %>%
  group_by(date) %>%
  summarize(Death = sum(cases, na.rm = TRUE), .groups = "drop")
# Daily worldwide total of recoveries
dt3 <- coronavirus %>%
  filter(type == "recovery") %>%
  group_by(date) %>%
  summarize(Recovered = sum(cases, na.rm = TRUE), .groups = "drop")
There is an error in the recovery figures on 14 December 2020, so I plot only confirmed cases and deaths.
Next, we will merge the two remaining data sets so that the counts of each type are in separate columns.
dt <- dt1 %>%
  inner_join(dt2, by = "date")
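The filter, summarize, and join steps above can also be collapsed into a single pipeline. Below is a minimal sketch of that idea, not the approach used in this post. It assumes the tidyr package (which is not loaded elsewhere here) and uses a small made-up data frame, toy, with the same date, type, and cases columns as coronavirus.

```r
library(dplyr)
library(tidyr)  # assumption: tidyr is not used elsewhere in this post

# Hypothetical toy data with the same columns we use from `coronavirus`
toy <- data.frame(
  date  = as.Date(c("2020-01-22", "2020-01-22", "2020-01-22",
                    "2020-01-23", "2020-01-23", "2020-01-23")),
  type  = c("confirmed", "confirmed", "death",
            "confirmed", "death", "death"),
  cases = c(5, 3, 1, 10, 2, NA)
)

# Sum cases per date and type, then spread the types into columns
dt_wide <- toy %>%
  group_by(date, type) %>%
  summarize(cases = sum(cases, na.rm = TRUE), .groups = "drop") %>%
  pivot_wider(names_from = type, values_from = cases)

dt_wide
```

With this approach the new columns take their names from the values of type (confirmed, death), so they would need renaming to match the Confirmed and Death columns used in the rest of the post.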
Plot
Finally, time to make the plot! Note how we can build this plot in separate elements.
dt %>%
  e_charts(x = date) %>%
  e_line(serie = Confirmed) %>%
  e_line(serie = Death) %>%
  e_tooltip(trigger = "axis") %>%
  e_datazoom(type = "slider") %>%
  e_title("Worldwide COVID-19 cases") %>%
  e_theme("bee-inspired")
This plot is interactive, so you can hover over it to get the exact readings. You can also toggle each time series on or off by clicking its entry in the legend at the top.
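To illustrate what building the plot from separate elements means, here is a small sketch that stores the chart in a variable and adds one piece at a time. The data frame toy_dt and its numbers are made up for illustration and simply stand in for dt. Only echarts4r functions already used above appear here.

```r
library(dplyr)      # provides the %>% pipe
library(echarts4r)

# Hypothetical toy data standing in for `dt`
toy_dt <- data.frame(
  date      = as.Date("2020-01-22") + 0:4,
  Confirmed = c(10, 25, 40, 80, 120),
  Death     = c(0, 1, 1, 3, 5)
)

# Step 1: base chart with the two line series
p <- toy_dt %>%
  e_charts(x = date) %>%
  e_line(serie = Confirmed) %>%
  e_line(serie = Death)

# Step 2: add the tooltip to the stored chart
p <- p %>% e_tooltip(trigger = "axis")

# Step 3: add the zoom slider
p <- p %>% e_datazoom(type = "slider")

p  # printing the object renders the chart
```

Because each function takes a chart and returns a chart, you can stop after any step, print p, and see the effect of that element on its own.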