Applied Metaphors: Learning TRIZ, Complexity, Data/Stats/ML using Metaphors


Ranking

Gryffindor beats Slytherin to the House Cup

Published

July 19, 2024

What graphs will we see today?

| Variable #1 | Variable #2 | Chart Names | Chart Shape |
|---|---|---|---|
| Quant | None | Dumbbell and Radar Charts | |

What kind of Data Variables will we choose?

| No | Pronoun | Answer | Variable/Scale | Example | What Operations? |
|---|---|---|---|---|---|
| 2 | How Many / Much / Heavy? Few? Seldom? Often? When? | Quantities with Scale. Differences are meaningful, but not products or ratios | Quantitative/Interval | pH, SAT score (200-800), Credit score (300-850), Year of Starting College | Mean, Standard Deviation |
| 3 | How, What Kind, What Sort | A Manner / Method, Type or Attribute from a list, with list items in some "order" (e.g. good, better, improved, best...) | Qualitative/Ordinal | Socioeconomic status (Low income, Middle income, High income), Education level (High School, BS, MS, PhD), Satisfaction rating (Very Much Dislike, Dislike, Neutral, Like, Very Much Like) | Median, Percentile |

Inspiration

(a) Energy Sources in the USA in 2024
(b) 5 tools Players in Baseball
Figure 1: Dumbbell and Radar Charts for Ranking

What do we see here? From https://www.visualcapitalist.com/sp/americas-cheapest-sources-of-electricity-in-2024/ :

From Figure 1 (a):

  • Onshore wind power effectively costs $0 per megawatt-hour (MWh) when subsidies are included!
  • Demand for storage solutions is rising quickly. If storage is included, the minimum cost for onshore wind increases to $8 per MWh.
  • Solar photovoltaics (PV) have similarly attractive economics. With subsidies, the minimum cost is $6 per MWh. When including storage, $38 per MWh. Notably, the maximum cost of solar PV with storage has significantly increased from $102 in 2023 to $210 in 2024.
  • For gas-combined cycle plants, which combine natural gas and steam turbines for efficient electricity generation, the maximum price has climbed $7 year-over-year to $108 per MWh.

And from Figure 1 (b):

  • There is a clear difference in the capabilities of the three players compared, though all of them are classified as “5 tools” players.
  • Each player is better than the others at one unique skill: Betts at Throwing, Judge at Hit_power, and Trout at Hit_avg.

How do these Chart(s) Work?

Dumbbell charts show changes in rank/attainment/performance of several entities over two “instants in time” or two “points of interest”. (Note these two prepositions!!) The chart is usually sorted so that the entity with the largest change sits at the very top or the very bottom. The Y-axis is the “entity” variable (Qual!) and the X-axis is a SINGLE rank or measure of attainment/performance (Quant!). In the above chart, the energy sources are the “entities” and their cost is the performance measure, with the energy sources (roughly) ranked in order of the change in cost. The shape is, of course, a bar/dumbbell with two endpoints, and the length of the bar is proportional to the change.
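For readers who would like to see the mechanics spelled out, here is a minimal Python sketch of what a dumbbell chart computes: one change per entity, a sort by the size of that change, and a bar drawn between the two endpoints. The entity names and values are made up for illustration, and `Dumbbell`, `sort_by_change` and `render` are hypothetical helper names, not part of Orange, RAWgraphs or DataWrapper.

```python
from typing import NamedTuple


class Dumbbell(NamedTuple):
    entity: str   # the Qual variable on the Y-axis
    start: float  # measure at the first "instant" or "point of interest"
    end: float    # measure at the second one

    @property
    def length(self) -> float:
        """Length of the bar: the absolute change between the two endpoints."""
        return abs(self.end - self.start)


def sort_by_change(rows):
    """Largest change first, as dumbbell charts are usually sorted."""
    return sorted(rows, key=lambda r: r.length, reverse=True)


def render(rows, width=40):
    """Draw each dumbbell as text on a shared X-axis of `width` characters."""
    lo = min(min(r.start, r.end) for r in rows)
    hi = max(max(r.start, r.end) for r in rows)
    span = (hi - lo) or 1.0  # avoid dividing by zero if all values are equal
    lines = []
    for r in rows:
        a = round((min(r.start, r.end) - lo) / span * (width - 1))
        b = round((max(r.start, r.end) - lo) / span * (width - 1))
        bar = [" "] * width
        for i in range(a, b + 1):
            bar[i] = "-"
        bar[a] = bar[b] = "o"  # the two endpoints of the dumbbell
        lines.append(f"{r.entity:<10}|{''.join(bar)}|")
    return lines


# Made-up data, purely for illustration.
rows = [Dumbbell("Alpha", 10, 30), Dumbbell("Beta", 25, 20), Dumbbell("Gamma", 5, 45)]
ranked = sort_by_change(rows)
for line in render(ranked):
    print(line)
```

Because every entity is drawn on the same X-axis, the bar lengths can be compared directly, which is what makes the sorted chart readable at a glance.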

A Radar chart does not show change; it simply plots a set of static performance measures or ranks. However, there is not a single performance measure here but MULTIPLE ones. So how do we have multiple X-axes then? We use angle: we create as many axes as there are measures to show, all diverging from a single point. Each performance measure is marked off along its own angled axis, usually with the same scale on every axis (though that may require external pre-processing). The final shape is, of course, a polygon, and we can plot many “entities” as overlapping, semi-transparent polygons. In the plot above, the entities are the players, and the performance measures are the so-called 5 tools of baseball.
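The "use angle" idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the axes are spaced evenly around a full circle, the first axis points along the positive X direction, and all measures are already on a common scale. `radar_vertices` is a hypothetical helper name, and the sample values are made up.

```python
import math


def radar_vertices(values):
    """Place measure i at distance values[i] along an axis at angle 2*pi*i/k.

    Returns the (x, y) corners of the radar polygon, in axis order.
    """
    k = len(values)
    if k < 3:
        raise ValueError("a radar polygon needs at least 3 measures")
    step = 2 * math.pi / k  # equal angle between neighbouring axes
    return [(r * math.cos(i * step), r * math.sin(i * step))
            for i, r in enumerate(values)]


# Made-up scores for one entity on four measures, all scaled to [0, 1].
corners = radar_vertices([1.0, 0.5, 1.0, 0.5])
for x, y in corners:
    print(f"({x:+.2f}, {y:+.2f})")
```

Joining the corners in order (and closing the loop back to the first one) gives the polygon; plotting one such polygon per entity gives the overlaid radar chart.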

Plotting a Dumbbell Chart

  • Using Orange: There does not appear to be a way of plotting dumbbell charts in Orange. 😢
  • Using RAWgraphs: There does not appear to be a way of plotting dumbbell charts in RAWgraphs. 😢
  • Using DataWrapper: In DataWrapper, a dumbbell plot is referred to as a range plot, which is also quite an appropriate name:
https://academy.datawrapper.de/article/111-how-to-create-a-range-plot

Another rather similar and evocative plot on DataWrapper is the arrow plot: https://academy.datawrapper.de/article/123-how-to-create-an-arrow-plot

Here is an example of a dumbbell chart/range chart created in DataWrapper. This chart ranks different countries on how much better off the nursing profession is in those countries. (The comparison is with the UK).

Hit the “Get the data” button on the chart, upload the downloaded file into https://app.datawrapper.de/, and see if you can recreate this chart:

The direct link to this dataviz is https://www.datawrapper.de/_/9q1tJ/

Plotting a Radar Chart

  • Using Orange: Nopes.
  • Using RAWgraphs: Download this RAWgraphs project file to your machine and then upload it to https://app.rawgraphs.io/.
  • Using DataWrapper: Nopes.

Dataset: Brood Parasites - Cuckoo Eggs and Host Eggs

Cuckoo birds drop their eggs into other birds’ nests, where they hatch and are looked after by the unwitting host-parent bird, often at the cost of the host’s own babies, a phenomenon known as brood parasitism.

The data is available at Vincent Arel-Bundock’s website: https://vincentarelbundock.github.io/Rdatasets/csv/DAAG/cuckoohosts.csv. Use this URL to directly import into Orange.

The dataset contains the dimensions of the eggs of the host birds and compares them with those of the cuckoo. Import this dataset into Orange and look at the variables, their nature, and their summaries.

Examine the Data

A data frame with 10 observations on the following 12 variables. Each row corresponds to a host species bird.

(a) Egg Data Table
(b) Egg Data Table
Figure 2: Egg Dimensions data
Warning

Don’t be confused by Figure 2 (b) showing means and sds of variables whose very names already contain means and sds! The table shows summary measures computed on these variables!

Data Dictionary

Note: Qualitative Data
  • rownames: Not aptly named, but contains the names of the host bird species.
Note: Quantitative Data
  • clength: mean length of cuckoo eggs in given host’s nest
  • cl.sd: standard deviation of cuckoo egg lengths
  • cbreadth: mean breadth of cuckoo eggs in given host’s nest
  • cb.sd: standard deviation of cuckoo egg breadths
  • cnum: number of cuckoo eggs
  • hlength: mean length of host eggs
  • hl.sd: standard deviation of host egg lengths
  • hbreadth: mean breadth of host eggs
  • hb.sd: standard deviation of host egg breadths
  • hnum: number of host eggs
  • match: number of eggs where color matched
  • nomatch: number where color did not match

Research Questions

Note: Question #1

Q1. How different are the mean length and breadth of host eggs from those of the cuckoo’s eggs?

Figure 3: Bird Eggs Radar Chart
Note: Question #2

Q2. Are the statistical measures (standard deviations) of the length/breadth different between cuckoo and host eggs?

Figure 4: Bird Eggs Stats Radar Chart
Figure 5: Bird Eggs Stats Radar Chart by Host Species

What is the Story Here?

  • Figure 3 shows that both the mean lengths and the mean breadths of the eggs are nearly the same for the host and the cuckoo! 😮 The poor host bird has little chance of detecting the parasite egg purely by dimensions…
  • From Figure 4, the statistical variations are also nearly the same, except for a few host species where the variation (sd) in the host-egg-length is much larger.
  • This aspect is seen better in Figure 5, where for the Wren, the Robin, and the Hedge Sparrow, s-o-m-e-times the parasite cuckoo egg may be much smaller and perhaps detectable… but then again, its small size may render it inconspicuous!
  • But..is this over time? Are all the eggs the same age?…Ummm…

Who was it who said:

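काकः कृष्णः पिकः कृष्णः को भेदः पिककाकयोः ।
वसन्तकाले संप्राप्ते काकः काकः पिकः पिकः ॥
(The crow is black, the cuckoo is black; what is the difference between the cuckoo and the crow? When spring arrives, the crow is a crow and the cuckoo is a cuckoo.)
- कुवलयानन्द (Kuvalayananda)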

Dataset: Employment vs Population vs Gender

This is a dataset from Our World in Data. Download this data and import it into Orange to take a look at it. We can then decide what we wish to see by way of a chart, pre-process the data accordingly, and save it with Orange. Then we will send this data to RAWGraphs/DataWrapper to plot our charts.

We will as usual examine the data in Orange, filter and process as needed, and then use the other tools to plot charts to answer our Questions. The workflow for Orange is downloadable with the button below:

Examine the Data

Employment Data Reading and Conversion

  • We have converted the Entity and Code variables to Qual
  • We have used the Select Rows widget to select just 7 rows from the 53K rows

Data Dictionary

A dataframe with 7 rows and 5 columns.

Note: Quantitative Data
  • employment-to-population-ratio, men(%): Percentage of the male population that is employed
  • employment-to-population-ratio, women(%): Percentage of the female population that is employed
  • Year: year (= 2010)
Note: Qualitative Data
  • entity: country
  • code: code for the country
  • continent: continent

Use the Orange Save Data widget to save the filtered file as a new CSV and then upload into DataWrapper! Here is the dumbbell chart from DataWrapper. You can head off to DataWrapper here and edit a copy of this chart.

What is the Story Here?

With a simple but effective chart like this, we can tell the story pretty quickly:

  • India and Pakistan have huge differences between the employment percentages of women and men.
  • All countries shown in the chart have a higher percentage of men employed than women.

Bump Charts

DataWrapper does offer a way of creating bump charts for ranking, that look like this:

Figure 6: Bump Chart

The chart shows the ranking of different chart types over the years. The procedure on DataWrapper is here: https://academy.datawrapper.de/article/347-how-to-create-a-bump-chart
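Behind a bump chart sits a simple computation: within each period, sort the entities by their value and record each entity's position. A minimal Python sketch follows, assuming every entity has a value in every period and that rank 1 goes to the largest value; the data are made up and `ranks_per_period` is a hypothetical helper name, not a DataWrapper function.

```python
def ranks_per_period(scores):
    """scores: {period: {entity: value}}.

    Returns (periods, {entity: [rank in each period]}), with rank 1 for the
    largest value. Ties keep the order in which entities were listed.
    """
    periods = sorted(scores)
    ranks = {}
    for period in periods:
        values = scores[period]
        ordered = sorted(values, key=lambda entity: values[entity], reverse=True)
        for rank, entity in enumerate(ordered, start=1):
            ranks.setdefault(entity, []).append(rank)
    return periods, ranks


# Made-up values for three entities over two years.
scores = {
    2022: {"A": 50, "B": 30, "C": 40},
    2023: {"A": 35, "B": 45, "C": 40},
}
periods, ranks = ranks_per_period(scores)
for entity, series in ranks.items():
    print(entity, dict(zip(periods, series)))
```

Each entity's list of ranks is one line in the bump chart: the X-axis is the period and the Y-axis is the rank.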

However, I think this procedure is not worth it and creating the plot with R code is far easier and more intuitive.

Your Turn

  1. Try the Bird Eggs dataset with normalization and see if the story changes!
  2. Japanese Sake Wines: Find this dataset about the grading of Japanese Sake wines: https://vincentarelbundock.github.io/Rdatasets/csv/heplots/Sake.csv. You should be able to use this URL directly in RAWGraphs/DataWrapper.
  3. Sea Weed Nutrition: Choose the right sheet in the xls! You may need to use Orange to pre-process this data using the Orange Widgets Select Columns, Select Rows, and Preprocess. With the Preprocess widget, you may wish to normalize each column into the range [0,1] for your Radar Charts.
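Normalizing a column into the range [0,1], as suggested for the radar charts, is usually done with min-max scaling: subtract the column minimum and divide by the column range. The Python sketch below shows that arithmetic on made-up numbers. It illustrates the idea only and is not a description of how Orange's Preprocess widget is implemented; `min_max` is a hypothetical helper name.

```python
def min_max(column):
    """Rescale a list of numbers so the smallest becomes 0 and the largest 1."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant column has no spread; map everything to 0 by convention.
        return [0.0 for _ in column]
    return [(x - lo) / (hi - lo) for x in column]


# Made-up measurements in two very different units.
lengths = [2, 4, 6]
weights = [100, 300, 200]
print(min_max(lengths))  # [0.0, 0.5, 1.0]
print(min_max(weights))  # [0.0, 1.0, 0.5]
```

After scaling, every axis of the radar chart runs from 0 to 1, so a measure recorded in large units no longer dominates the polygon.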

Wait, But Why?

  • We can measure some Performance metric about entities such as Products, Brands, Shops, Companies, Stock Prices/Earnings and see how it changes over two instances of measurement, with a dumbbell chart.
  • The length of the dumbbells tells a very clear story.
  • Dumbbell plots are clearly more intuitive than the corresponding bar chart:
Figure 7: Employment Gender Bar Chart
  • Differences within the same set of data, seen at two different aspects, are very quickly apparent.
  • Differences in differences (DID) are also quite easily apparent. Experiments do use these metrics and these plots would be very useful there.
  • If entities have their performance or quality measured over several different “aspects”, a radar chart would serve you well. Do you think Dumbledore could have used a Radar Chart to decide which House should win the House Cup at Hogwarts?
  • The area(s) and non-overlapping parts of the (overlaid) radar chart are very evocative of superior performance.
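The area of a radar polygon, mentioned in the last point above, can be computed directly from the measures. With k equally spaced axes, neighbouring measures r_i and r_(i+1) span a triangle of area 0.5 * r_i * r_(i+1) * sin(2π/k), and the polygon is the sum of those triangles. The Python sketch below assumes non-negative measures on a common scale and at least three axes; `radar_area` is a hypothetical helper name and the values are made up.

```python
import math


def radar_area(values):
    """Area of the radar polygon for measures placed on equally spaced axes."""
    k = len(values)
    if k < 3:
        raise ValueError("a radar polygon needs at least 3 measures")
    wedge = math.sin(2 * math.pi / k)  # sine of the angle between neighbours
    pairs = sum(values[i] * values[(i + 1) % k] for i in range(k))
    return 0.5 * wedge * pairs


# The same four made-up scores, placed on the axes in two different orders.
print(radar_area([1.0, 0.2, 1.0, 0.2]))  # strong and weak measures alternate
print(radar_area([1.0, 1.0, 0.2, 0.2]))  # strong measures sit side by side
```

The two calls return different areas for the same scores, so the area depends on the order of the axes as well as on the performance itself. It is worth keeping the axis order fixed when comparing entities by eye.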

Readings

  1. Highcharts Blog. Why you need to start using dumbbell charts
    https://github.com/hrbrmstr/ggalt#lollipop-charts

  2. See this use of Radar Charts in Education. Choose the country/countries of choice and plot their ranks on various educational parameters in a radar chart. https://gpseducation.oecd.org/Home


License: CC BY-SA 2.0

Website made with ❤️ and Quarto, by Arvind V.

Hosted by Netlify .