πŸƒ Rhythm

Ups and Downs, Rhymes and Reasons, Seasons and Rhythms

Correlations
Line Plots
Author

Arvind V

Published

May 12, 2024

Modified

May 27, 2024

What graphs will we see today?

Variable #1 | Variable #2 | Chart Names | Chart Shape
Quant | Quant | Line Plot |

What kind of Data Variables will we choose?

No | Pronoun | Answer | Variable/Scale | Example | What Operations?
1 | How Many / Much / Heavy? Few? Seldom? Often? When? | Quantities, with Scale and a Zero Value. Differences and Ratios/Products are meaningful. | Quantitative/Ratio | Length, Height, Temperature in Kelvin, Activity, Dose Amount, Reaction Rate, Flow Rate, Concentration, Pulse, Survival Rate | Correlation

Inspiration

A Ledecky, all drenched and dripping, is it?

Is this Ledecky, or a mermaid?

In Figure 1, the black line is the average of the 50 best times at each distance since 2000. The top 200 times for each distance since 2000 are also plotted, with light orange lines each representing one swimmer.

Her races and her career essentially follow the same pattern: the more she swims, the more she separates from the field.

Her 1500 metres record timing is better than the best time for 800m!!😱

How do these Chart(s) Work?

Line Plots take two separate Quant variables as inputs. Each variable is mapped to a position, or coordinate: one to the X-axis and the other to the Y-axis. Each pair of observations from the two Quant variables (which would be in one row!) gives us a point. All of this is identical to the Scatter Plot.

Here, however, the points are connected together, and the point markers are sometimes thrown away altogether, leaving just the line.

Looking at the lines, we get a very function-al sense of the variation: is it upward or downward? Is it linear or nonlinear? Is it periodic or seasonal…all these questions can be answered with Line Charts.
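The mechanics described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the observations are made up, and the names `line_plot_points`, `segment_slopes`, and `draw` are hypothetical helpers, not part of any tool used in this module. Each row gives one point, the points are ordered along the X-axis, and the slope of each connecting segment tells us whether the line goes up or down and whether it is straight.

```python
# Hypothetical observations: each row holds one (x, y) pair of Quant values.
rows = [
    {"x": 3, "y": 9.0},
    {"x": 1, "y": 1.0},
    {"x": 2, "y": 4.0},
    {"x": 4, "y": 16.0},
]

def line_plot_points(rows, x="x", y="y"):
    """Return the (x, y) points in the order a line plot connects them."""
    return sorted((r[x], r[y]) for r in rows)

def segment_slopes(points):
    """Slope of each straight segment joining consecutive points.

    Assumes no two points share the same x value.
    """
    return [(y2 - y1) / (x2 - x1)
            for (x1, y1), (x2, y2) in zip(points, points[1:])]

def draw(points, path="line_plot.png"):
    """Optional: draw the line with matplotlib, if it is installed."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file, no window needed
    import matplotlib.pyplot as plt

    xs, ys = zip(*points)
    fig, ax = plt.subplots()
    ax.plot(xs, ys, marker="o")  # drop marker="o" to leave just the line
    ax.set_xlabel("x")
    ax.set_ylabel("y")
    fig.savefig(path)

points = line_plot_points(rows)
slopes = segment_slopes(points)

is_upward = all(s > 0 for s in slopes)   # every segment rises
is_linear = len(set(slopes)) == 1        # exact comparison is fine for toy data

print(points, slopes, is_upward, is_linear)
```

Calling `draw(points)` would save the picture itself. The slope checks are a crude numerical stand-in for what the eye does when it reads the line.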

When one variable is Time?

Line charts often have time as one of the variables. In such a case, the data are said to be a time series. We might deal with Time Series later.
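As a small sketch of what "one variable is time" means in practice, the snippet below turns date strings into real dates and orders the observations in time before they would be joined into a line. The readings are invented for illustration, and the variable names are assumptions, not taken from any dataset in this module.

```python
from datetime import date

# Hypothetical readings as (ISO date string, value), deliberately out of order.
raw = [
    ("2024-03-01", 12.5),
    ("2024-01-01", 10.0),
    ("2024-02-01", 11.2),
]

# Parse the strings into dates and sort by time: this ordered list is the time series.
series = sorted((date.fromisoformat(d), v) for d, v in raw)

times = [t for t, _ in series]
values = [v for _, v in series]

# Spacing between consecutive observations, in days.
gaps = [(later - earlier).days for earlier, later in zip(times, times[1:])]

print(times, values, gaps)
```

Sorting by time matters because a line plot joins points in order, so an unsorted time column would make the line double back on itself.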

Plotting a Line Plot

What is the Story here?

  • Over the years, different music formats have each had their place in the sun.
  • All physical forms are on the wane; streaming music is the current mode of music consumption.

The Shape of You Data

Never mind that silly song now.

As mentioned above, data can be in wide or long form. How does one imagine this shape-shifting that seems needed now and then? Let's see.

Long Form and Wide Form Data

Several tools such as DataWrapper (and others, yes, I admit, even with code, as we will see) need data transformed to a specific shape. We should now look at this idea of shape in data. Consider the data tables below:

Product Power Cost Harmony Style Size Manufacturability Durability Universality
G1 0.5858003 0.2773750 0.7244059 0.0731445 0.1000535 0.4551024 0.9622046 0.9966129
G2 0.0089458 0.8135742 0.9060922 0.7546750 0.9540688 0.9710557 0.7617024 0.5062709
G3 0.2937396 0.2604278 0.9490402 0.2860006 0.4156071 0.5839880 0.7145085 0.4899432

Product Parameter Rating
G1 Power 0.5858003
G1 Cost 0.2773750
G1 Harmony 0.7244059
G1 Style 0.0731445
G1 Size 0.1000535
G1 Manufacturability 0.4551024
G1 Durability 0.9622046
G1 Universality 0.9966129
G2 Power 0.0089458
G2 Cost 0.8135742
G2 Harmony 0.9060922
G2 Style 0.7546750
G2 Size 0.9540688
G2 Manufacturability 0.9710557
G2 Durability 0.7617024
G2 Universality 0.5062709
G3 Power 0.2937396
G3 Cost 0.2604278
G3 Harmony 0.9490402
G3 Style 0.2860006
G3 Size 0.4156071
G3 Manufacturability 0.5839880
G3 Durability 0.7145085
G3 Universality 0.4899432

What we have done is:
- Converted all the variable names (every column except Product) into a stacked column, Parameter.
- Put all the numbers into another column, Rating.
- Repeated the Product column values as many times as needed to cover all Parameters (8 times).
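These three steps can be sketched in plain Python. This is a minimal illustration, not how Radiant or DataWrapper do it internally. The function name `to_long` and its arguments are hypothetical, and only three of the eight Parameter columns from the wide table are used to keep the example short.

```python
# Wide form: one row per Product, one column per Parameter.
# Values come from the wide table above (three of its eight columns).
wide = [
    {"Product": "G1", "Power": 0.5858003, "Cost": 0.2773750, "Harmony": 0.7244059},
    {"Product": "G2", "Power": 0.0089458, "Cost": 0.8135742, "Harmony": 0.9060922},
    {"Product": "G3", "Power": 0.2937396, "Cost": 0.2604278, "Harmony": 0.9490402},
]

def to_long(rows, id_col, names_to, values_to):
    """Stack every non-id column into one (id, name, value) row each."""
    long_rows = []
    for row in rows:
        for column, value in row.items():
            if column == id_col:
                continue  # the id column is repeated on each row, not stacked
            long_rows.append({
                id_col: row[id_col],   # Product value, repeated per Parameter
                names_to: column,      # former column name goes into Parameter
                values_to: value,      # the number goes into Rating
            })
    return long_rows

long_form = to_long(wide, "Product", "Parameter", "Rating")

for r in long_form:
    print(r["Product"], r["Parameter"], r["Rating"])
```

With three products and three parameters, the long form has nine rows. With all eight parameters it would have the 24 rows shown in the long table.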

See the gif below to get an idea of how this transformation can be worked reversibly. (Yeah, never mind the code also.)

So how can we actually do this? It turns out there are some nice people at UC San Diego who have built an R-oriented app called Radiant for Business Analytics that can do this pretty much point-and-click style, though it is nowhere near as much fun as Orange. Head off there:

https://vnijs.shinyapps.io/radiant

We upload our original data, pivot it, and download the pivoted data. Peasants. Now the pivoted wide-form data should work in DataWrapper.

Whatever.
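For those who would rather not point and click, the pivot from long form back to wide form can also be sketched in code. This is an assumption-laden illustration and not what Radiant runs: `to_wide` and its arguments are hypothetical names, and the rows are a small subset of the long table shown earlier.

```python
# Long form: one row per (Product, Parameter) pair.
# A subset of the long table above: two products, two parameters.
long_form = [
    {"Product": "G1", "Parameter": "Power", "Rating": 0.5858003},
    {"Product": "G1", "Parameter": "Cost", "Rating": 0.2773750},
    {"Product": "G2", "Parameter": "Power", "Rating": 0.0089458},
    {"Product": "G2", "Parameter": "Cost", "Rating": 0.8135742},
]

def to_wide(rows, id_col, names_from, values_from):
    """Spread (name, value) pairs back into one column per name."""
    by_id = {}  # dicts keep insertion order, so products stay in first-seen order
    for row in rows:
        # Create the wide record the first time this id is seen.
        record = by_id.setdefault(row[id_col], {id_col: row[id_col]})
        # The Parameter value becomes a column name; the Rating fills it.
        record[row[names_from]] = row[values_from]
    return list(by_id.values())

wide_form = to_wide(long_form, "Product", "Parameter", "Rating")

for r in wide_form:
    print(r)
```

Each Product collapses back to a single row, which is why the transformation is reversible as long as every (Product, Parameter) pair appears only once.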

Dataset: Cancer

Examine the Data

Data Dictionary

Quantitative Data
Qualitative Data
  • Diagnosis: (text) (B)enign, or (M)alignant

Research Questions

Question

Q1.

Question

Q2.

What is the Story Here?

Your Turn

Wait, But Why?

References

  1. Daniel F. Chambliss (1989). The Mundanity of Excellence: An Ethnographic Report on Stratification and Olympic Swimmers.

  2. Nijs V (2023). radiant: Business Analytics using R and Shiny. R package version 1.6.0, https://github.com/radiant-rstats/radiant.
