Applied Metaphors: Learning TRIZ, Complexity, Data/Stats/ML using Metaphors

Evolution and Flow

Line and Area Plots
Dumbbell Plots
Parallel Set Plots
Alluvial Plots
Sankey Diagrams
Chord Diagrams
Bump Charts
Author

Arvind V.

Published

November 22, 2022

Modified

June 19, 2025

Abstract
Changes in Information over Space and Time

Slides and Tutorials

R Tutorial

“My stories run up and bite me in the leg – I respond by writing them down – everything that goes on during the bite. When I finish, the idea lets go and runs off.”

— Ray Bradbury, science-fiction writer (22 Aug 1920-2012)

Setting up R Packages

library(tidyverse)
library(ggstream)
library(ggformula)
# remotes::install_github("corybrunson/ggalluvial@main", build_vignettes = TRUE)
library(ggalluvial)
library(ggsankeyfier)
# install.packages("devtools")
# devtools::install_github("davidsjoberg/ggsankey")
library(ggsankey)
library(networkD3)
library(echarts4r) # Interactive graphs

Plot Theme

# https://stackoverflow.com/questions/74491138/ggplot-custom-fonts-not-working-in-quarto

# Chunk options
knitr::opts_chunk$set(
  fig.width = 7,
  fig.asp = 0.618, # Golden Ratio
  # out.width = "80%",
  fig.align = "center"
)
### Ggplot Theme
### https://rpubs.com/mclaire19/ggplot2-custom-themes

theme_custom <- function() {
  font <- "Roboto Condensed" # assign font family up front

  theme_classic(base_size = 14) %+replace% # replace elements we want to change

    theme(
      panel.grid.minor = element_blank(), # strip minor gridlines

      # text elements
      plot.title = element_text( # title
        family = font, # set font family
        # size = 20,               #set font size
        face = "bold", # bold typeface
        hjust = 0, # left align
        # vjust = 2                #raise slightly
        margin = margin(0, 0, 10, 0)
      ),
      plot.subtitle = element_text( # subtitle
        family = font, # font family
        # size = 14,                #font size
        hjust = 0,
        margin = margin(2, 0, 5, 0)
      ), plot.title.position = "plot",
      plot.caption = element_text( # caption
        family = font, # font family
        size = 8, # font size
        hjust = 1
      ), # right align

      axis.title = element_text( # axis titles
        family = font, # font family
        size = 10 # font size
      ),
      axis.text = element_text( # axis text
        family = font, # axis family
        size = 8
      ) # font size
    )
}

# Set graph theme
theme_set(new = theme_custom())

What Time Evolution Charts can we plot?

In these cases, the x-axis is typically time, and we chart the variation of another Quant variable with respect to time, using a line geometry.

Let us take a healthcare budget dataset from Our World in Data. We will plot graphs for six countries (India, China, United States, United Kingdom, Russia, Sweden), matching the filter in the code below.

Important: And Introducing echarts4r

We will also build interactive versions of these charts using echarts4r!

Download this data by clicking on the button below:

health <-
  read_csv("data/public-health-expenditure-share-GDP-OWID.csv")

health_filtered <- health %>%
  filter(Entity %in% c(
    "India",
    "China",
    "United States",
    "United Kingdom",
    "Russia",
    "Sweden"
  ))
Using ggformula

# Set graph theme
theme_set(new = theme_custom())

gf_point(
  data = health_filtered,
  public_health_expenditure_pc_gdp ~ Year,
  colour = ~Entity,
  ylab = "Healthcare Budget\n as % of GDP",
  title = "Line Charts to show Evolution (over Time )"
) %>%
  gf_line()
###
gf_area(
  data = health_filtered,
  public_health_expenditure_pc_gdp ~ Year,
  fill = ~Entity, alpha = 0.3,
  ylab = "Healthcare Budget\n as % of GDP",
  title = "Area Charts to show Evolution (over Time )"
) %>%
  gf_line(colour = ~Entity)

Using echarts4r

health_filtered %>%
  group_by(Entity) %>%
  e_charts(Year) %>%
  e_scatter(public_health_expenditure_pc_gdp) %>%
  e_line(public_health_expenditure_pc_gdp) %>%
  e_x_axis(name = "Year", min = 1850, max = 2050) %>%
  e_y_axis(
    name = "Public Health Expenditure",
    nameLocation = "middle", nameGap = 25
  ) %>%
  e_tooltip()
###
health_filtered %>%
  group_by(Entity) %>%
  e_charts(Year) %>%
  e_scatter(public_health_expenditure_pc_gdp) %>%
  e_area(public_health_expenditure_pc_gdp) %>%
  e_x_axis(name = "Year", min = 1850, max = 2050) %>%
  e_y_axis(
    name = "Public Health Expenditure",
    nameLocation = "middle", nameGap = 25
  ) %>%
  e_tooltip()

What Space Evolution Charts can we plot?

Here, the space can be any Qual variable, and we can chart how another Quant or Qual variable moves across the levels of the first chosen Qual variable.

For instance, we can contemplate enrollment at a University and show how students move from course to course. Or how customers drift from one category of products or brands to another… or the movement of cricket players from one IPL team to another!

Here is what Thomas Lin Pedersen says:

A parallel sets diagram is a type of visualisation showing the interaction between multiple categorical variables. If the variables have an intrinsic order the representation can be thought of as a Sankey Diagram. If each variable is a point in time it will resemble an Alluvial diagram.

(a) ggsankey aesthetics
(b) ggsankeyfier aesthetics
Figure 1: Geometric Aesthetics from two Sankey Plot Packages
  • The Qualitative variables being connected are mapped to stages / axes.
  • Each level within a Qual variable is mapped to nodes / strata / lodes.
  • The connections between the strata of the axes are called flows / edges / links / alluvia.

Such diagrams are best used when you want to show a many-to-many mapping between two domains, or multiple paths through a set of stages. E.g. students pursuing different degrees go through multiple courses with multiple departments during a semester of study. Here students, degrees, courses, and departments would be some of the variables we would plot, and we would visualize the number of students moving across courses and departments based on their degree, etc.

A classic example of a Sankey Diagram shows how energy is converted or transmitted before being consumed or lost: supplies are on the left, and demands are on the right (Data: Department of Energy & Climate Change via Tom Counsell; see footnote 1).

Note: Switching to ggplot here

For the next few charts, there are (as yet) no equivalents in ggformula. Hence we will use ggplot.

Case Study-1: Titanic Dataset

# library(ggalluvial)
data("Titanic")
Titanic <- Titanic %>% as_tibble()
Titanic
Class   Sex      Age     Survived       n
<chr>   <chr>    <chr>   <chr>      <dbl>
1st     Male     Child   No             0
2nd     Male     Child   No             0
3rd     Male     Child   No            35
Crew    Male     Child   No             0
1st     Female   Child   No             0
2nd     Female   Child   No             0
3rd     Female   Child   No            17
Crew    Female   Child   No             0
1st     Male     Adult   No           118
2nd     Male     Adult   No           154
(rows 1-10 of 32)

Note: Table Form Data

Note that this data is in tidy wide / table form, with separate columns for each Qualitative variable and a separate count column, which we saw when we examined Categorical Data. This is, in my opinion, intuitively the best form of data to plot a Sankey plot with. Each variable gives us “one part in the flow”. But there are other forms, such as the tidy long form which we have been using practically all this while. You will find examples using tidy long form data on the ggalluvial website: https://corybrunson.github.io/ggalluvial/
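
To make the difference between the two data shapes concrete, here is a minimal base-R sketch (an illustration only, not how ggalluvial does it internally). It reads the built-in datasets::Titanic table so that it runs without extra packages; the names titanic_wide, titanic_long, alluvium, axis, and stratum are illustrative choices and do not come from the code on this page.

```r
# Wide / table form: one column per Qual variable plus a count column
titanic_wide <- as.data.frame(
  datasets::Titanic,
  responseName = "n",
  stringsAsFactors = FALSE
)
# Illustrative id: one number per row, i.e. per combination of levels
titanic_wide$alluvium <- seq_len(nrow(titanic_wide))

axes <- c("Class", "Sex", "Age", "Survived")

# Long form: stack the axis columns, one row per (alluvium, axis) pair
titanic_long <- do.call(rbind, lapply(axes, function(a) {
  data.frame(
    alluvium = titanic_wide$alluvium,
    axis = a,
    stratum = titanic_wide[[a]],
    n = titanic_wide$n,
    stringsAsFactors = FALSE
  )
}))

dim(titanic_wide) # 32 rows, 6 columns
dim(titanic_long) # 128 rows, 4 columns
```

In the long form every passenger count appears once per axis, so the counts on each axis still add up to the same total.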

Using ggplot

# Set graph theme
theme_set(new = theme_custom())

##
Titanic %>% ggplot(
  data = .,

  # Select the Categorical Variables for the vertical Axes / Stages
  aes(
    axis1 = Class,
    axis2 = Sex,
    axis3 = Age,
    axis4 = Survived,
    y = n
  )
) +

  # Alluvials between Categorical Axes
  geom_alluvium(aes(fill = Survived),
    colour = "black",
    linewidth = 0.25
  ) +

  # Vertical segments for each Categorical Variable
  geom_stratum(
    colour = "black",
    linewidth = 1,
    fill = "white"
  ) +

  # Labels for each "level" of the Categorical Axes
  geom_text(
    stat = "stratum", size = 3,
    aes(label = after_stat(stratum))
  ) +



  # Scales and Colours
  scale_x_discrete(
    limits = c("Class", "Sex", "Age", "Survived"),
    expand = c(0.1, 0.1)
  ) +
  scale_fill_manual(values = c("red3", "springgreen3")) +
  xlab("Demographic") +
  ggtitle(
    "Passengers on the maiden voyage of the Titanic",
    "Stratified by demographics and survival"
  )

Here is how the package ggalluvial defines the elements of a typical alluvial plot:

  • An axis is a dimension (variable) along which the data are vertically arranged at a fixed horizontal position. The plot above uses four categorical axes: Class, Sex, Age, and Survived.
  • The groups at each axis are depicted as opaque blocks called strata. For example, the Class axis contains four strata: 1st, 2nd, 3rd, and Crew.
  • Horizontal (x-) splines called alluvia span the entire width of the plot. In this plot, each alluvium corresponds to a fixed strata value of each axis variable, indicated by its vertical position at the axis, as well as of the Survived variable, indicated by its fill color.
  • The segments of the alluvia between pairs of adjacent axes are flows.
  • The alluvia intersect the strata at lodes. The lodes are not visualized in the above plot, but they can be inferred as filled rectangles extending the flows through the strata at each end of the plot or connecting the flows on either side of the interior strata.
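
As a rough check on this vocabulary, the sketch below counts the axes, strata, and candidate alluvia in the Titanic data with base R. It reads the built-in datasets::Titanic table directly; the variable names titanic, axes, strata_per_axis, n_rows, and n_visible are illustrative and are not part of ggalluvial.

```r
titanic <- as.data.frame(datasets::Titanic, responseName = "n")
axes <- c("Class", "Sex", "Age", "Survived")

# Strata: the distinct levels found on each axis
strata_per_axis <- vapply(
  titanic[axes],
  function(x) length(unique(x)),
  integer(1)
)
strata_per_axis
# Class 4, Sex 2, Age 2, Survived 2

# Each row is one combination of strata, i.e. one candidate alluvium
n_rows <- nrow(titanic) # 32
# Rows with a zero count have no width, so they would not be visible
n_visible <- sum(titanic$n > 0) # 24
```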

The ggsankeyfier package also plots alluvial and Sankey diagrams. It takes data in long form (see the package documentation), and it has built-in commands to convert data from wide to long:

Titanic %>%
  as_tibble() %>%
  ggsankeyfier::pivot_stages_longer(
    data = .,
    stages_from = c("Class", "Sex", "Age", "Survived"),
    values_from = "n",
    additional_aes_from = "Survived"
  ) -> Titanic_long
Titanic_long
Survived       n   edge_id   connector   node
<fct>      <dbl>     <int>   <chr>       <fct>
No           118         1   from        1st
No           118         1   to          Male
Yes           62         2   from        1st
Yes           62         2   to          Male
No             4         3   from        1st
No             4         3   to          Female
Yes          141         4   from        1st
Yes          141         4   to          Female
No           154         5   from        2nd
No           154         5   to          Male
(rows 1-10 of 56; columns 1-5 of 6)

This data is in long form, with stages defining the axes in the graph, and the node variable giving us levels within each (Qualitative) axis. The edge_id labels both ends (from and to) of each connector or edge/flow/alluvium.
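
The structure of this long form can be imitated by hand, which may help in reading the table. The base-R sketch below is only an approximation of what pivot_stages_longer() produces: it builds one edge per combination of adjacent-stage nodes and Survived, then writes each edge out as a from row and a to row. The names edges, make_end, and sankey_long are illustrative, and the edge_id numbering is not expected to match the package's.

```r
titanic <- as.data.frame(
  datasets::Titanic,
  responseName = "n",
  stringsAsFactors = FALSE
)
stages <- c("Class", "Sex", "Age", "Survived")

# One edge per (from node, to node, Survived) between adjacent stages
edges <- do.call(rbind, lapply(seq_len(length(stages) - 1), function(i) {
  agg <- aggregate(
    titanic["n"],
    by = list(
      from = titanic[[stages[i]]],
      to = titanic[[stages[i + 1]]],
      Survived = titanic$Survived
    ),
    FUN = sum
  )
  agg$from_stage <- stages[i]
  agg$to_stage <- stages[i + 1]
  agg
}))
edges$edge_id <- seq_len(nrow(edges))

# Two rows per edge: its "from" end and its "to" end
make_end <- function(connector, stage, node) {
  data.frame(
    Survived = edges$Survived, n = edges$n, edge_id = edges$edge_id,
    connector = connector, node = node, stage = stage,
    stringsAsFactors = FALSE
  )
}
sankey_long <- rbind(
  make_end("from", edges$from_stage, edges$from),
  make_end("to", edges$to_stage, edges$to)
)
sankey_long <- sankey_long[order(sankey_long$edge_id, sankey_long$connector), ]

nrow(edges)       # 28 edges
nrow(sankey_long) # 56 rows, two per edge
```

The row count agrees with the 56 rows reported for Titanic_long, which suggests the same from/to pairing is at work there.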

Let us plot this now:

Titanic_long %>%
  ggplot(aes(
    x = stage, y = n,
    group = node, connector = connector,
    edge_id = edge_id
  )) +
  geom_sankeynode(v_space = "auto") +
  geom_sankeyedge(aes(fill = Survived), v_space = "auto") +
  labs(x = "")

Let us make an interactive graph for this dataset using echarts4r.

ClassSex <-
  Titanic %>%
  group_by(Class, Sex) %>%
  summarise(cs = sum(n)) %>%
  ungroup() %>%
  rename("source" = Class, "target" = Sex, "value" = cs)

SexAge <-
  Titanic %>%
  group_by(Sex, Age) %>%
  summarise(sa = sum(n)) %>%
  ungroup() %>%
  rename("source" = Sex, "target" = Age, "value" = sa)

AgeSurvived <-
  Titanic %>%
  group_by(Age, Survived) %>%
  summarise(as = sum(n)) %>%
  ungroup() %>%
  rename("source" = Age, "target" = Survived, "value" = as)

Combo <- rbind(ClassSex, SexAge, AgeSurvived)
Combo
source   target   value
<chr>    <chr>    <dbl>
1st      Female     145
1st      Male       180
2nd      Female     106
2nd      Male       179
3rd      Female     196
3rd      Male       510
Crew     Female      23
Crew     Male       862
Female   Adult      425
Female   Child       45
(rows 1-10 of 16)

Combo %>%
  e_charts() %>%
  e_sankey(source, target, value) %>%
  e_title("Titanic: Who lived, and who didn't?") %>%
  e_tooltip()

The process with echarts4r is quite different, since the data structure used by this package is different:

  • The echarts4r package needs source and target columns for the axes, along with a value column that determines the width of each alluvium.
  • The names in source and target can repeat, and can appear in both columns, in order to create a multi-axis diagram. Hence the data needs to be inherently in long form.
  • However, for the values, we need to manually calculate the aggregate totals for the alluvia between each consecutive pair of axes (i.e. Qual variables). This is not done automatically in echarts4r, but it is with ggalluvial.
  • So we create grouped aggregate summaries for each pair of Qualitative variables that we wish to plot consecutively (i.e. as axis1, axis2, …).
  • We then stack these pair-wise alluvia totals into one combo data frame using rbind(), after renaming the variables to “source”, “target” and “value”.

Phew! That seems like a lot of work… I wonder if good, old-fashioned pivot_longer() will get us there…
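
One way to cut down the repetition, sketched here under the assumption that the stages are always taken in adjacent pairs, is to wrap the group-and-sum step in a small helper and loop over the pairs. This version uses base R and the built-in datasets::Titanic table so that it runs on its own; pair_flows and combo are hypothetical names, not functions or objects from echarts4r.

```r
titanic <- as.data.frame(
  datasets::Titanic,
  responseName = "n",
  stringsAsFactors = FALSE
)
stages <- c("Class", "Sex", "Age", "Survived")

# Hypothetical helper: flow totals between one pair of adjacent stages
pair_flows <- function(data, from, to, value = "n") {
  totals <- aggregate(
    data[[value]],
    by = list(source = data[[from]], target = data[[to]]),
    FUN = sum
  )
  names(totals)[3] <- "value"
  totals
}

# Apply the helper to (Class, Sex), (Sex, Age), (Age, Survived) and stack
combo <- do.call(rbind, Map(
  function(from, to) pair_flows(titanic, from, to),
  head(stages, -1),
  tail(stages, -1)
))
rownames(combo) <- NULL

nrow(combo) # 16 rows: 8 + 4 + 4
```

The result has the same source, target, and value columns as the hand-built Combo, so it should be usable with e_charts() and e_sankey() in the same way.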

Chord Diagram

We will explore this diagram when we explore network graphs with the tidygraph and ggraph packages.

Dumbbell Plots

A simple plot that can quickly indicate changes in multiple variables/aspects over either a time or a space variable is a dumbbell plot. This is a combination of a scatter plot and a segment plot. Let us take our previously loaded health dataset and plot just the change in expenditure for multiple countries, across a time span of 8 years (2010 to 2018).

Using ggformula

# Set graph theme
theme_set(new = theme_custom())
##
health_2010_2018 <- health %>%
  # select Years 2010 and 2018
  filter(Year %in% c(2010, 2018)) %>%
  # Make separate columns for each year, easier that way
  # Though not essential
  pivot_wider(
    id_cols = c(Entity, Code),
    names_from = Year,
    names_prefix = "Year",
    values_from = public_health_expenditure_pc_gdp
  )
health_2010_2018
Entity      Code   Year2010   Year2018
<chr>       <chr>     <dbl>      <dbl>
Albania     ALB       2.442      2.878
Argentina   ARG       5.572      5.965
Australia   AUS       5.781      7.009
Austria     AUT       7.630      7.724
Belgium     BEL       7.783      8.337
Brazil      BRA       3.577      3.897
Bulgaria    BGR       3.932      4.330
Canada      CAN       7.483      7.577
Chile       CHL       3.999      5.524
China       CHN       2.177      2.914
(rows 1-10 of 53)

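
A dumbbell plot encodes, for each country, the two end values and the gap between them. The sketch below recomputes that gap for five of the rows printed above, using base R only; the data frame name health_sample and the change column are illustrative.

```r
# Five rows copied from the health_2010_2018 table above
health_sample <- data.frame(
  Entity = c("Australia", "Brazil", "Canada", "Chile", "China"),
  Year2010 = c(5.781, 3.577, 7.483, 3.999, 2.177),
  Year2018 = c(7.009, 3.897, 7.577, 5.524, 2.914),
  stringsAsFactors = FALSE
)

# Length of each dumbbell: change in percentage points of GDP
health_sample$change <- health_sample$Year2018 - health_sample$Year2010

# Sort so that the largest change comes first
health_sorted <- health_sample[order(-health_sample$change), ]
health_sorted$Entity
# "Chile" "Australia" "China" "Brazil" "Canada"
```

Sorting by the change rather than by the 2018 value is a design choice: it puts the longest dumbbells together, whereas the plots on this page sort by the 2018 value.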
# Set graph theme
theme_set(new = theme_custom())
##
health_2010_2018 %>%
  # remove NA data across the data set
  drop_na() %>%
  # take the top 20 countries based on 2018 allocation
  slice_max(n = 20, order_by = Year2018) %>%
  gf_segment(Entity + Entity ~ Year2010 + Year2018,
    colour = "grey",
    linewidth = 2
  ) %>%
  gf_point(Entity ~ Year2018,
    colour = ~"2018"
  ) %>%
  gf_point(Entity ~ Year2010,
    colour = ~"2010"
  )

## Can we do better? Sort the bars, improve axis ticks, title..
# Set graph theme
theme_set(new = theme_custom())
##
health_2010_2018 %>%
  # remove NA data across the data set
  drop_na() %>%
  # take the top 20 countries based on 2018 allocation
  slice_max(n = 20, order_by = Year2018) %>%
  # plot segments first
  gf_segment(
    reorder(Entity, Year2018) + reorder(Entity, Year2018) ~
      Year2010 + Year2018,
    colour = "grey",
    linewidth = 2
  ) %>%
  # Then plot points
  gf_point(reorder(Entity, Year2018) ~ Year2018,
    colour = ~"2018",
    size = 3
  ) %>%
  gf_point(
    reorder(Entity, Year2018) ~ Year2010,
    colour = ~"2010", size = 3,
    xlab = "Health Expenditure as Percentage of GDP",
    ylab = "Country",
    title = "Healthcare Budget Changes between 2010 and 2018",
    subtitle = "Bars are Sorted",
    caption = "And the X-Axis is in percentage"
  ) %>%
  gf_refine(
    scale_x_continuous(
      breaks = scales::breaks_width(2),
      labels = scales::label_percent(suffix = "%", scale = 1)
    ),
    scale_colour_manual(name = "Year", values = c("red", "green"))
  )

Using ggplot

# Set graph theme
theme_set(new = theme_custom())


health_2010_2018 %>%
  # remove NA data across the data set
  drop_na() %>%
  # take the top 20 countries based on 2018 allocation
  slice_max(n = 20, order_by = Year2018) %>%
  ggplot() +
  geom_segment(
    aes(
      y = Entity, yend = Entity,
      x = Year2010, xend = Year2018
    ),
    colour = "grey",
    linewidth = 2
  ) +
  geom_point(aes(y = Entity, x = Year2018, colour = "2018")) +
  geom_point(aes(y = Entity, x = Year2010, colour = "2010"))
## Can we do better?

health_2010_2018 %>%
  # remove NA data across the data set
  drop_na() %>%
  # take the top 20 countries based on 2018 allocation
  slice_max(n = 20, order_by = Year2018) %>%
  ggplot() +
  # plot segments first
  geom_segment(
    aes(
      y = reorder(Entity, Year2018), yend = reorder(Entity, Year2018),
      x = Year2010, xend = Year2018
    ),
    colour = "grey",
    linewidth = 2
  ) +

  # Then plot points
  geom_point(aes(
    y = reorder(Entity, Year2018), x = Year2018,
    colour = "2018"
  ), size = 3) +
  geom_point(aes(
    y = reorder(Entity, Year2018), x = Year2010,
    colour = "2010"
  ), size = 3) +
  labs(
    x = "Health Expenditure as Percentage of GDP",
    y = "Country", title = "Healthcare Budgets",
    subtitle = "Changes between 2010 and 2018"
  ) +
  scale_x_continuous(
    breaks = scales::breaks_width(2),
    labels = scales::label_percent(suffix = "%", scale = 1)
  ) +
  scale_colour_manual(name = "Year", values = c("red", "green"))

Wait, But Why?

  • Changes can be over time, or over “space”
  • In the latter case, we can think of some Quantity changing over (multiple levels of) multiple Qualitative variables. E.g. Sales over Product Type over Showroom Location over Festival Season…
  • When a single Quant varies over a single multi-level Qual, the Chord Diagram may be simpler than the Sankey/Alluvial, e.g. bird migration across multiple locations. It can even show bidirectional changes (Sankeys with loops are also possible, however).
  • When you have a Quant that changes over only one two-level Qual variable, the Dumbbell plot becomes an option.

Conclusion

We see that we can visualize “evolutions” over time and space. The evolutions can represent changes in the quantities of things, or their categorical affiliations or groups.

What business/design data would you depict in this way? Revenue streams? Employment? Expenditures over time and market? Migration? App usage patterns? There are many possibilities!

Note also that the Bump Charts are a special case of Alluvial/Sankey charts where each node connects/flows to only one other node.
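
Since a bump chart plots ranks rather than raw values, a usual first step is to rank the items at each time point. Here is a minimal base-R sketch, using the ten 2010 and 2018 values from the health table shown earlier; the names health_ranks, rank2010, rank2018, and moved are illustrative.

```r
# Ten rows copied from the health_2010_2018 table
health_ranks <- data.frame(
  Entity = c(
    "Albania", "Argentina", "Australia", "Austria", "Belgium",
    "Brazil", "Bulgaria", "Canada", "Chile", "China"
  ),
  Year2010 = c(2.442, 5.572, 5.781, 7.630, 7.783, 3.577, 3.932, 7.483, 3.999, 2.177),
  Year2018 = c(2.878, 5.965, 7.009, 7.724, 8.337, 3.897, 4.330, 7.577, 5.524, 2.914),
  stringsAsFactors = FALSE
)

# Rank 1 = highest share of GDP in that year
health_ranks$rank2010 <- rank(-health_ranks$Year2010)
health_ranks$rank2018 <- rank(-health_ranks$Year2018)

# Countries whose rank changed between the two years
moved <- health_ranks$Entity[health_ranks$rank2010 != health_ranks$rank2018]
moved
# "Albania" "China"
```

Each country has exactly one rank in each year, which is the one-to-one flow between nodes that distinguishes a bump chart from a general Sankey.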

Your Turn

  1. Within the ggalluvial package are two datasets, majors and vaccinations. Plot alluvial charts for both of these.
  2. Go to the American Life Panel website, where you will find many public datasets. Take one and make the kinds of charts we have learned in this Module.

References

  1. Global Migration, https://download.gsb.bund.de/BIB/global_flow/ (a good example of the use of a Chord Diagram)
  2. ggalluvial cheatsheet, https://cheatography.com/seleven/cheat-sheets/ggalluvial/
  3. John Coene, Sankey plots with echarts4r, https://echarts4r.john-coene.com/articles/chart_types.html#sankey
  4. Other packages: “Sankey plot”, the R Graph Gallery (r-graph-gallery.com)
  5. Another package: “Sankey diagrams in ggplot2 with ggsankey”, R CHARTS (r-charts.com)
  6. Sankey Charts using networkD3, http://christophergandrud.github.io/networkD3

R Package Citations

Package        Version      Citation
echarts4r      0.4.5        Coene (2023)
ggalluvial     0.12.5       Brunson (2020); Brunson and Read (2023)
ggsankey       0.0.99999    Sjoberg (2025)
ggsankeyfier   0.1.8        de Vries (2024)
ggstream       0.1.0        Sjoberg (2021)
networkD3      0.4.1        Allaire et al. (2025)

Allaire, J. J., Christopher Gandrud, Kenton Russell, and CJ Yetman. 2025. networkD3: D3 JavaScript Network Graphs from r. https://doi.org/10.32614/CRAN.package.networkD3.
Brunson, Jason Cory. 2020. “ggalluvial: Layered Grammar for Alluvial Plots.” Journal of Open Source Software 5 (49): 2017. https://doi.org/10.21105/joss.02017.
Brunson, Jason Cory, and Quentin D. Read. 2023. “ggalluvial: Alluvial Plots in ‘ggplot2’.” http://corybrunson.github.io/ggalluvial/.
Coene, John. 2023. Echarts4r: Create Interactive Graphs with “Echarts JavaScript” Version 5. https://doi.org/10.32614/CRAN.package.echarts4r.
de Vries, Pepijn. 2024. ggsankeyfier: Create Sankey and Alluvial Diagrams Using “ggplot2”. https://doi.org/10.32614/CRAN.package.ggsankeyfier.
Sjoberg, David. 2021. ggstream: Create Streamplots in “ggplot2”. https://doi.org/10.32614/CRAN.package.ggstream.
———. 2025. ggsankey: Sankey, Alluvial and Sankey Bump Plots. https://github.com/davidsjoberg/ggsankey.

Footnotes

  1. D3 JavaScript Network Graphs from R: christophergandrud.github.io/networkD3/

Citation

BibTeX citation:
@online{v.2022,
  author = {V., Arvind},
  title = {{Evolution} and {Flow}},
  date = {2022-11-22},
  url = {https://av-quarto.netlify.app/content/courses/Analytics/Descriptive/Modules/70-EvolutionFlow/},
  langid = {en},
  abstract = {Changes in Information over Space and Time}
}
For attribution, please cite this work as:
V., Arvind. 2022. “Evolution and Flow.” November 22, 2022. https://av-quarto.netlify.app/content/courses/Analytics/Descriptive/Modules/70-EvolutionFlow/.

License: CC BY-SA 2.0
