
Plot a lot with ggplot2 to find plots

R-Ladies Colombo Meetup

Priyanga Dilini Talagala

2021/01/27

1 / 92

Tidy Workflow

2 / 92

The Datasaurus Dozen

library(datasauRus)
library(ggplot2)
library(gganimate)

ggplot(datasaurus_dozen, aes(x = x, y = y)) +
  geom_point() +
  theme_minimal() +
  transition_states(dataset, 3, 1) +
  theme(aspect.ratio = 1)

5 / 92

The Datasaurus Dozen

Summary statistics
X Mean 54.26
Y Mean 47.83
X SD 16.77
Y SD 26.94
Corr. -0.06
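
The shared summary statistics can be checked directly. This is a minimal sketch, assuming the datasauRus and dplyr packages are installed; the object name dozen_stats is only illustrative.

```r
library(datasauRus)
library(dplyr)

# One row per dataset: means, standard deviations and correlation of x and y
dozen_stats <- datasaurus_dozen %>%
  group_by(dataset) %>%
  summarise(
    x_mean = mean(x),
    y_mean = mean(y),
    x_sd   = sd(x),
    y_sd   = sd(y),
    corr   = cor(x, y)
  )

print(dozen_stats)
```

Every dataset should report nearly the same five numbers even though the point clouds look completely different, which is the argument for plotting before summarising.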
6 / 92

The Grammar of Graphics

7 / 92

The Book

The Grammar of Graphics

8 / 92

R Base Graphics

9 / 92

The Grammar of Graphics

Pie Chart

Line Chart

Bar Chart

Scatterplot

10 / 92

The Grammar of Graphics

penguins dataset

Size measurements for adult foraging penguins near Palmer Station, Antarctica

Rows: 344
Columns: 8
$ species <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Ade…
$ island <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgers…
$ bill_length_mm <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1,…
$ bill_depth_mm <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1,…
$ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 18…
$ body_mass_g <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475,…
$ sex <fct> male, female, female, NA, female, male, female, mal…
$ year <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 200…

11 / 92

The Grammar of Graphics

Counts

# A tibble: 3 x 2
  species       n
  <fct>     <int>
1 Adelie      152
2 Chinstrap    68
3 Gentoo      124
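
A table like the one above can be produced with dplyr's count(). This is a sketch assuming the palmerpenguins and dplyr packages; species_counts is an illustrative name.

```r
library(palmerpenguins)
library(dplyr)

# Count the number of rows (penguins) for each species
species_counts <- penguins %>%
  count(species)

print(species_counts)
```

This summary is the same computation that geom_bar() performs behind the scenes through stat_count().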

13 / 92

The Grammar of Graphics

Bar Plot

15 / 92

The Grammar of Graphics

Excel Theme

19 / 92

The ggplot2 API

20 / 92

Which dataset to plot?

21 / 92

palmerpenguins data

The Palmer Archipelago penguins. Artwork by @allison_horst.

# A tibble: 6 x 8
  species island bill_length_mm bill_depth_mm flipper_length_… body_mass_g sex
  <fct>   <fct>           <dbl>         <dbl>            <int>       <int> <fct>
1 Adelie  Torge…           39.1          18.7              181        3750 male
2 Adelie  Torge…           39.5          17.4              186        3800 fema…
3 Adelie  Torge…           40.3          18                195        3250 fema…
4 Adelie  Torge…           NA            NA                 NA          NA <NA>
5 Adelie  Torge…           36.7          19.3              193        3450 fema…
6 Adelie  Torge…           39.3          20.6              190        3650 male
# … with 1 more variable: year <int>

Rows: 344
Columns: 8
$ species <fct> Adelie, Adelie, Adelie, Adelie, Adelie, Adelie, Ade…
$ island <fct> Torgersen, Torgersen, Torgersen, Torgersen, Torgers…
$ bill_length_mm <dbl> 39.1, 39.5, 40.3, NA, 36.7, 39.3, 38.9, 39.2, 34.1,…
$ bill_depth_mm <dbl> 18.7, 17.4, 18.0, NA, 19.3, 20.6, 17.8, 19.6, 18.1,…
$ flipper_length_mm <int> 181, 186, 195, NA, 193, 190, 181, 195, 193, 190, 18…
$ body_mass_g <int> 3750, 3800, 3250, NA, 3450, 3650, 3625, 4675, 3475,…
$ sex <fct> male, female, female, NA, female, male, female, mal…
$ year <int> 2007, 2007, 2007, 2007, 2007, 2007, 2007, 2007, 200…
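
The two printouts above look like the output of head() and glimpse(). The calls below are a sketch under that assumption, using the palmerpenguins and dplyr packages.

```r
library(palmerpenguins)
library(dplyr)

head(penguins)     # first six rows, printed as a tibble
glimpse(penguins)  # one line per column: name, type and first values
```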
22 / 92

Which dataset to plot?

ggplot()

23 / 92

Which dataset to plot?

ggplot(data = penguins)

24 / 92

Mapping

25 / 92

Which columns to use for x and y?

ggplot(data = penguins,
mapping = aes(x = flipper_length_mm,
y = body_mass_g))

26 / 92

Geometries

27 / 92

How to draw the plot?

ggplot(data = penguins,
mapping = aes(x = flipper_length_mm,
y = body_mass_g)) +
geom_point()

28 / 92

Data, Mapping and Geometries

29 / 92

How to draw the plot?

ggplot(data = penguins) +
geom_point(mapping = aes(x = flipper_length_mm,
y = body_mass_g))

30 / 92

How to draw the plot?

ggplot() +
geom_point(mapping = aes(x = flipper_length_mm,
y = body_mass_g),
data = penguins)

31 / 92

Mapping Colours

ggplot(penguins) +
  geom_point(aes(x = flipper_length_mm,
                 y = body_mass_g,
                 color = species,
                 shape = species))

32 / 92

Mapping Colours

ggplot(penguins) +
  geom_point(aes(x = flipper_length_mm,
                 y = body_mass_g,
                 colour = flipper_length_mm < 205))

33 / 92

Setting Colours

ggplot(penguins) +
  geom_point(aes(x = flipper_length_mm,
                 y = body_mass_g),
             colour = 'purple')
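
The difference between mapping and setting a colour is easy to miss. The sketch below, assuming ggplot2 and palmerpenguins are installed, builds both variants and inspects the colours ggplot2 actually assigns; the object names are illustrative.

```r
library(ggplot2)
library(palmerpenguins)

# Setting: colour is a fixed value given outside aes()
p_set <- ggplot(penguins) +
  geom_point(aes(x = flipper_length_mm, y = body_mass_g),
             colour = "purple")

# Mapping: inside aes(), "purple" is treated as a data value with one level,
# so ggplot2 picks a colour from its default palette and adds a legend
p_map <- ggplot(penguins) +
  geom_point(aes(x = flipper_length_mm, y = body_mass_g,
                 colour = "purple"))

set_colours <- unique(layer_data(p_set)$colour)
map_colours <- unique(layer_data(p_map)$colour)
```

Comparing set_colours with map_colours shows that only the first plot really draws purple points.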

34 / 92
ggplot(penguins,
       aes(x = flipper_length_mm,
           y = body_mass_g,
           color = species,
           shape = species)) +
  geom_point() +
  geom_density_2d()
  • Geometry functions are named geom_*().
  • e.g. geom_histogram(), geom_bar(), geom_boxplot().
  • Each geom has its own specific aesthetic arguments.

ggplot(penguins) +
geom_histogram(
aes(x = flipper_length_mm))

35 / 92

Each geom has its own specific aesthetic arguments; they are listed in its help page.

?geom_point

36 / 92

Statistics

37 / 92

geom_bar() uses stat_count() by default

ggplot(penguins) +
geom_bar(aes(x = species))

38 / 92

after_stat()

ggplot(penguins) +
  geom_bar(aes(x = species,
               y = after_stat(100 * count / sum(count))))

39 / 92

Older versions of ggplot2 (before 3.3.0)

ggplot(penguins) +
geom_bar(aes(x = species,
y = 100*(..count..)/sum(..count..) ))

40 / 92

after_stat() (ggplot2 3.3.0 and later)

ggplot(penguins) +
  geom_bar(aes(x = species,
               y = after_stat(100 * count / sum(count))))
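
To see what after_stat() computes, the values drawn by the layer can be pulled out with layer_data(). This is a minimal sketch, assuming ggplot2 3.3.0 or later and the palmerpenguins package; p_pct and pct are illustrative names.

```r
library(ggplot2)
library(palmerpenguins)

p_pct <- ggplot(penguins) +
  geom_bar(aes(x = species,
               y = after_stat(100 * count / sum(count))))

# Bar heights after the stat has run: one percentage per species
pct <- layer_data(p_pct)$y
print(pct)
```

Because count is the per-species tally and sum(count) the overall total, the bar heights are percentages that add up to 100.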

41 / 92
  • There are two ways to use statistical functions.

Call a stat_*() function and pass the geom as an argument:

ggplot(penguins,
       aes(x = flipper_length_mm,
           y = body_mass_g)) +
  stat_summary(
    geom = "point",
    fun = "mean",    # fun.y is deprecated since ggplot2 3.3.0
    colour = "red")

Call a geom_*() function and pass the stat as an argument:

ggplot(penguins,
       aes(x = flipper_length_mm,
           y = body_mass_g)) +
  geom_point(
    stat = "summary",
    fun = "mean",    # fun.y is deprecated since ggplot2 3.3.0
    colour = "red")

42 / 92
Statistics      Geometries
stat_count      geom_bar
stat_boxplot    geom_boxplot
stat_identity   geom_col
stat_bin        geom_histogram, geom_freqpoly
stat_density    geom_density
?geom_boxplot

?geom_bar
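
Besides reading the help pages, the default stat of a geom can be inspected programmatically. The sketch below assumes only ggplot2; it relies on each geom_*() call returning a layer whose stat element is a ggproto object, and the names geoms and default_stats are illustrative.

```r
library(ggplot2)

# Create one layer per geom without attaching it to a plot
geoms <- list(
  geom_bar       = geom_bar(),
  geom_col       = geom_col(),
  geom_boxplot   = geom_boxplot(),
  geom_histogram = geom_histogram(),
  geom_density   = geom_density()
)

# The first class of the layer's stat object names the default statistic
default_stats <- vapply(geoms, function(layer) class(layer$stat)[1], character(1))
print(default_stats)
```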

43 / 92

Scales

44 / 92

Scales

ggplot(penguins) +
geom_point( aes(x = flipper_length_mm,
y = body_mass_g,
color = species,
shape = species))

45 / 92

Scales

ggplot(penguins) +
geom_point( aes(x = flipper_length_mm,
y = body_mass_g,
color = species,
shape = island))

46 / 92

Scales

ggplot(penguins) +
geom_point( aes(x = flipper_length_mm,
y = body_mass_g,
color = bill_length_mm,
shape = island))

47 / 92

Scales

ggplot(penguins) +
geom_point(aes(x = flipper_length_mm,
y = body_mass_g,
color = species)) +
scale_color_brewer(type = 'qual',
palette = 'Dark2')

  • Scale functions follow the naming pattern scale_<aesthetic>_<type>(), e.g. scale_color_brewer().
48 / 92

RColorBrewer::display.brewer.all()

49 / 92
ggplot(penguins) +
geom_point(aes(x = flipper_length_mm,
y = body_mass_g,
color = species)) +
scale_color_viridis_d()

  • viridis and RColorBrewer provide color scales designed to stay readable under color-blindness (not every RColorBrewer palette qualifies).
  • For details and an interactive palette selection tool, see http://colorbrewer.org
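
To find out which RColorBrewer palettes are flagged as color-blind friendly, the package's own metadata table can be filtered. This is a sketch assuming the RColorBrewer package is installed; cb_safe is an illustrative name.

```r
library(RColorBrewer)

# brewer.pal.info has one row per palette and a logical 'colorblind' column
cb_safe <- rownames(brewer.pal.info)[brewer.pal.info$colorblind]
print(cb_safe)

# Show only those palettes
display.brewer.all(colorblindFriendly = TRUE)
```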
50 / 92
ggplot(penguins) +
  geom_point(aes(x = flipper_length_mm,
                 y = body_mass_g,
                 color = species,
                 shape = species,
                 alpha = species)) +
  scale_x_continuous(breaks = c(170, 200, 230)) +
  scale_y_log10() +
  scale_colour_viridis_d(direction = -1, option = 'plasma') +
  scale_shape_manual(values = c(17, 18, 19)) +
  scale_alpha_manual(values = c("Adelie" = 0.6, "Gentoo" = 0.5,
                                "Chinstrap" = 0.7))

51 / 92

Facets

52 / 92

facet_wrap()

ggplot(penguins) +
geom_point(aes(
x = flipper_length_mm,
y = body_mass_g)) +
facet_wrap(vars(species))

53 / 92

facet_wrap()

ggplot(penguins) +
geom_point(aes(
x = flipper_length_mm,
y = body_mass_g)) +
facet_wrap(vars(species),
scales = "free_x")

54 / 92

facet_grid()

ggplot(penguins) +
geom_point(aes(
x = flipper_length_mm,
y = body_mass_g)) +
facet_grid( vars(species), vars(sex))

55 / 92

Coordinates

56 / 92

Coordinates

ggplot(penguins) +
geom_bar(aes(x= species, fill = species))

57 / 92
ggplot(penguins) +
geom_bar(aes(x= species, fill = species)) +
coord_flip()

  • There are two types of coordinate systems:
    • Linear coordinate systems
    • Non-linear coordinate systems
  • Linear coordinate systems: coord_cartesian(), coord_flip(), coord_fixed()
  • Non-linear coordinate systems, e.g. coord_map(), coord_quickmap(), coord_sf(), coord_polar(), coord_trans()
58 / 92

Coordinates

ggplot(penguins) +
geom_bar(aes(x= species, fill = species)) +
coord_polar()

59 / 92

Coordinates

ggplot(penguins) +
geom_bar(aes(x= species, fill = species)) +
coord_polar(theta = "y")

60 / 92

Zooming into a plot

ggplot(penguins) +
geom_bar(aes(x= year, fill = species))

61 / 92

Zooming into a plot with scale

ggplot(penguins) +
geom_bar(aes(x= year, fill = species)) +
scale_y_continuous(limits = c(0,115))

When zooming with a scale, any data outside the limits is thrown away, so a bar segment that extends beyond the limit disappears instead of being clipped.

62 / 92

Proper zoom with coord_cartesian()

ggplot(penguins) +
geom_bar(aes(x= year, fill = species)) +
coord_cartesian(ylim = c(0,115))

Zooming with coord is like looking at the plot under a magnifying glass.
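
The two zooming approaches can be compared by looking at the data each plot keeps after it is built. This is a minimal sketch, assuming ggplot2 and palmerpenguins; it assumes that out-of-range values show up as NA in layer_data(), and the object names are illustrative.

```r
library(ggplot2)
library(palmerpenguins)

p_base <- ggplot(penguins) +
  geom_bar(aes(x = year, fill = species))

# Zoom by changing the scale limits: out-of-range values become NA
p_scale <- p_base + scale_y_continuous(limits = c(0, 115))

# Zoom by changing the coordinate system: the data stay untouched
p_coord <- p_base + coord_cartesian(ylim = c(0, 115))

dropped_by_scale <- sum(is.na(layer_data(p_scale)$ymax))
dropped_by_coord <- sum(is.na(layer_data(p_coord)$ymax))
```

A non-zero dropped_by_scale alongside a zero dropped_by_coord is the practical reason to prefer coord_cartesian() when the goal is only to zoom.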

63 / 92

Themes

64 / 92

Complete themes such as theme_minimal() and theme_dark() control all non-data display.

ggplot(data = penguins,
aes(x = flipper_length_mm,
y = body_mass_g)) +
geom_point(aes(
color = species,
shape = species),
size = 3,
alpha = 0.8) +
theme_minimal()

ggplot(data = penguins,
aes(x = flipper_length_mm,
y = body_mass_g)) +
geom_point(aes(
color = species,
shape = species),
size = 3,
alpha = 0.8) +
theme_dark()

65 / 92

Create custom themes in ggplot.

ggplot(penguins,
       aes(x = flipper_length_mm,
           y = body_mass_g)) +
  geom_point(aes(color = species,
                 shape = species),
             size = 3, alpha = 0.8) +
  scale_color_viridis_d() +
  theme_minimal() +
  labs(
    title = "Penguin size, Palmer Station LTER",
    subtitle = "Flipper length and body mass for Adelie, Chinstrap and Gentoo Penguins",
    x = "Flipper length (mm)",
    y = "Body mass (g)",
    color = "Penguin species",
    shape = "Penguin species") +
  theme(
    aspect.ratio = 1,
    legend.position = c(0.2, 0.7),
    legend.background =
      element_rect(
        fill = "white",
        color = NA),
    plot.title.position = "plot",
    plot.caption =
      element_text(
        hjust = 0,
        face = "italic"),
    plot.caption.position = "plot")
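
When the same theme() tweaks are needed in several plots, they can be wrapped in a small function. The sketch below assumes ggplot2 3.3.0 or later and palmerpenguins; theme_penguin() is a hypothetical helper, not part of ggplot2.

```r
library(ggplot2)
library(palmerpenguins)

# Hypothetical reusable theme: a complete theme plus a few custom settings
theme_penguin <- function(base_size = 11) {
  theme_minimal(base_size = base_size) +
    theme(
      aspect.ratio = 1,
      plot.title.position = "plot",
      plot.caption = element_text(hjust = 0, face = "italic"),
      plot.caption.position = "plot"
    )
}

p_themed <- ggplot(penguins,
                   aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point(aes(color = species, shape = species)) +
  theme_penguin()
```

Starting from a complete theme and overriding only a few elements keeps the custom theme short and lets it inherit sensible defaults.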

66 / 92
67 / 92

ggplot2 extensions

68 / 92

ggplot2 extensions: https://exts.ggplot2.tidyverse.org/

69 / 92

1. patchwork for plot composition

70 / 92
p1 <- ggplot(data = penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
geom_point(aes(color = species, shape = species), size = 2) +
scale_color_manual(values = c("darkorange","darkorchid","cyan4")) +
theme(aspect.ratio = 1)
p2 <- ggplot(data = penguins, aes(x = bill_length_mm, y = bill_depth_mm)) +
geom_point(aes(color = species, shape = species), size = 2) +
scale_color_manual(values = c("darkorange","darkorchid","cyan4")) +
theme(aspect.ratio = 1)
p3 <- ggplot(data = penguins, aes(x = flipper_length_mm)) +
geom_histogram(aes(fill = species), alpha = 0.5, position = "identity") +
scale_fill_manual(values = c("darkorange","darkorchid","cyan4"))
71 / 92
library(patchwork)
p1 + p3

72 / 92
library(patchwork)
(p1 | p2) / p3

73 / 92
library(patchwork)
p <- (p1 | p2) / p3
p + plot_layout(guides = 'collect')

74 / 92
library(patchwork)
p <- (p1 | p2) / p3
p +
  plot_layout(guides = 'collect') +
  plot_annotation(
    title = 'Size measurements for adult foraging penguins near Palmer Station, Antarctica',
    tag_levels = 'A')

75 / 92
library(patchwork)
p <- (p1 | p2) / p3
p &
theme(legend.position = 'none')

76 / 92

2. plotly

An R package for creating interactive web graphics via the open source JavaScript graphing library plotly.js.

77 / 92
p1 ## a ggplot object

78 / 92
plotly::ggplotly(p1)
[Interactive scatterplot: flipper_length_mm (x) against body_mass_g (y), colored by species (Adelie, Chinstrap, Gentoo)]
79 / 92

3. GGally

80 / 92
GGally::ggpairs(penguins[, 1:5], aes(color = species, fill = species))+
scale_color_viridis_d() +
scale_fill_viridis_d()

81 / 92

4. gganimate

82 / 92
library("ggplot2")
library("dlstats")
data <- cran_stats("ggplot2")
p <- ggplot(data, aes(x= end, y = downloads)) +
geom_line() +
labs(title = "Download stats of ggplot2 package", x = "Time", y = "Downloads")
p

83 / 92
library(gganimate)
p +
transition_reveal(along = end)

84 / 92
p <- ggplot(penguins, aes(flipper_length_mm, body_mass_g , color = species)) +
geom_point() + scale_color_viridis_d() +
labs(title = "Measurements of penguins {closest_state}")+
transition_states(states = year) + enter_grow() + exit_fade()
p

85 / 92

5. ggrepel

86 / 92

Text annotation

library(dplyr)    # provides %>% and filter()

# Label only the penguins with the longest flippers
df <- penguins %>%
  filter(flipper_length_mm > 225)

ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_point() +
  theme(aspect.ratio = 1) +
  geom_text(data = df,
            aes(x = flipper_length_mm, y = body_mass_g, label = island))

87 / 92

Text annotation

ggplot(penguins, aes(x=flipper_length_mm, y= body_mass_g))+
geom_point()+
theme(aspect.ratio = 1) +
ggrepel::geom_text_repel(data= df,
aes(x=flipper_length_mm, y= body_mass_g, label= island))

88 / 92

6. ggforce

89 / 92
library(ggforce)
library(dplyr)    # provides %>%
library(tidyr)    # provides drop_na()

# Remove rows with missing values before marking the group
penguins <- penguins %>% drop_na()

p <- ggplot(penguins, aes(x = flipper_length_mm, y = body_mass_g)) +
  geom_mark_ellipse(aes(
    filter = species == "Gentoo",
    label = 'Gentoo penguins'),
    description = 'Palmer Station Antarctica LTER and K. Gorman. 2020.') +
  geom_point()
p

90 / 92
library(ggforce)
ggplot(penguins, aes(x=flipper_length_mm, y= body_mass_g, color = species)) +
geom_point() +
scale_color_viridis_d() +
facet_zoom(x = species == "Gentoo")

91 / 92

Thank you

Slides available at: prital.netlify.app

pridiltal

Acknowledgements:

Hadley Wickham, Thomas Lin Pedersen and ggplot2 development team

Key References

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

92 / 92
