No | Pronoun | Answer | Variable/Scale | Example | What Operations? |
---|---|---|---|---|---|
3 | How, What Kind, What Sort | A Manner / Method, Type or Attribute from a list, with list items in some "order" (e.g. good, better, improved, best...) | Qualitative/Ordinal | Socioeconomic status (Low income, Middle income, High income), Education level (High School, BS, MS, PhD), Satisfaction rating (Very much Dislike, Dislike, Neutral, Like, Very Much Like) | Median, Percentile |
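The Median and Percentile operations listed for ordinal variables depend only on the order of the levels, not on any numeric distance between them. Here is a minimal Python sketch of that idea, using the satisfaction levels from the table; the `responses` list is made up purely for illustration.

```python
# Ordered levels of an ordinal variable, lowest to highest.
levels = ["Very much Dislike", "Dislike", "Neutral", "Like", "Very Much Like"]
rank = {level: position for position, level in enumerate(levels)}

# Made-up survey responses (illustration only, not real data).
responses = ["Like", "Neutral", "Dislike", "Like", "Very Much Like"]

# Sort by rank and pick the middle response. With an even number of
# responses this takes the lower of the two middle values, because
# averaging two ordinal levels is not meaningful.
ordered = sorted(responses, key=lambda level: rank[level])
median_level = ordered[(len(ordered) - 1) // 2]
print(median_level)
```

Note that no mean is computed anywhere: the levels are only ranked and then looked up by position.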
Counts
Happy Families are All Alike
What graphs will we see today?
Variable #1 | Variable #2 | Chart Names | Chart Shape |
---|---|---|---|
Qual | None | Bar Chart |
What kind of Data Variables will we choose?
Inspiration
How much does the (financial) capital of a country contribute to its GDP? Which would be India’s city? What would be the reduction in percentage?
And these Germans are crazy. (Toc, toc, toc, toc!)
How do these Chart(s) Work?
Bar charts are used to show “counts” and “tallies” with respect to Qual variables. For instance, in a survey, how many people are there of each Gender? In a Target Audience survey on Weekly Consumption, how many people fall into the low, medium, or high expenditure groups?
Each Qual variable potentially has many levels, as we saw in the Nature of Data. For instance, in the above example on Weekly Expenditure, `low`, `medium` and `high` were levels for the Qual variable `Expenditure`. Bar charts perform internal counts for each level of the Qual variable under consideration. The Bar Plot is then a set of disjoint bars representing these counts; see the icon above, and compare it with the one for histograms!! The X-axis is the set of levels in the Qual variable, and the Y-axis represents the counts for each level.
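The internal counting step can be sketched in a few lines of Python. This is only an illustration of the idea, not what Orange does under the hood; the `expenditure` values below are made up.

```python
from collections import Counter

# Made-up responses for the Qual variable "Expenditure" (illustration only).
expenditure = ["low", "high", "medium", "low", "low", "medium"]

# The internal counting step: one tally per level.
counts = Counter(expenditure)

# A crude text "bar chart": one row per level, bar length = count.
for level in ["low", "medium", "high"]:
    print(f"{level:<7}{'#' * counts[level]} ({counts[level]})")
```

Each row corresponds to one disjoint bar: the level is the X-axis label and the tally is the bar height.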
Column charts, on the other hand, just plot numbers over categories, with no internal counting, as you can see in Figure 1 above.
Though in many places, these two names are used interchangeably! But be aware of what the tool may be doing!
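One way to keep the two apart is to look at what each chart expects as input. The sketch below uses two hypothetical helper functions, `bar_chart_heights` and `column_chart_heights`, which are invented for this illustration and are not part of any tool; the data is made up as well.

```python
from collections import Counter

def bar_chart_heights(observations):
    """Bar chart: start from raw observations and tally each level."""
    return dict(Counter(observations))

def column_chart_heights(table):
    """Column chart: take (category, number) pairs as given, no counting."""
    return dict(table)

# Made-up inputs: one raw column, one already-summarised table.
raw = ["F", "M", "F", "F"]
summary = [("F", 3), ("M", 1)]

# Both routes end in the same heights, but only the bar chart counted.
same = bar_chart_heights(raw) == column_chart_heights(summary)
print(same)
```

If a tool is handed an already-summarised table and counts it again, every bar comes out with height 1, which is a quick way to spot which behaviour you are getting.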
Plotting a Bar Chart
The Bar Plot widget in Orange is described here. https://orangedatamining.com/widget-catalog/visualize/barplot/
And download the Bar Chart workflow file for this data:
Dataset: Banned Books in the USA
Here is a dataset from Jeremy Singer-Vine’s blog, Data Is Plural. This is a list of all books banned in schools across the US.
Download this data to your machine and use it in Orange.
Examine the Data
Figure 2 states that we have 1586 rows, 7 columns. So 1586 banned books are on this list! 🙀 🙀 🙀
Figure 3 already has a thumbnail-like bar chart. We will still make a “proper” one with the appropriate widget.
In the workflow below, note how it is still the Distributions widget that gives the Bar Chart. This is unfortunate, since we have been at pains to state how a Bar Chart and the Histogram deal with different types of variables (Qual and Quant respectively). Just one of those things we need to get used to!!
Data Dictionary
- `Date of Challenge`: Date the book was (selected to be?) banned
- `Author`: (text) Meta Data. Can be treated as Qual
- `Title`: (text) Meta Data. Can be treated as Qual
- `State`: (text) Qual factor
- `District`: (text) Qual factor
- `Type of Ban`: (text) Qual factor
- `Origin of Challenge`: (text) Who requested the Ban?
How many levels in each?? Find out in Orange!!
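If you would like to cross-check what Orange reports, counting levels amounts to counting the distinct values in each column. A minimal sketch, assuming the rows have been read into a list of dictionaries (for example with `csv.DictReader`); the helper name `levels_per_column` and the placeholder rows are invented for illustration and are not taken from the dataset.

```python
def levels_per_column(rows):
    """Return the number of distinct values (levels) in each column."""
    levels = {}
    for row in rows:
        for column, value in row.items():
            levels.setdefault(column, set()).add(value)
    return {column: len(values) for column, values in levels.items()}

# Placeholder rows (NOT real records from the banned-books data).
rows = [
    {"State": "State A", "Type of Ban": "Ban type 1"},
    {"State": "State A", "Type of Ban": "Ban type 2"},
    {"State": "State B", "Type of Ban": "Ban type 1"},
]

level_counts = levels_per_column(rows)
print(level_counts)
```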
Research Questions
Q1. Which is the US state that bans the most? Which state is least involved in banning books? What can you say of the “geography of book banning” based on your understanding of the US of A? 🤣
Q2. Create Bar charts of the count of banned books by Reason for Banning!!
Try!!
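Questions about the “most” and “least” come down to sorting the per-level counts. A small sketch of that step in Python, with placeholder state names standing in for the real `State` column (the values are made up and say nothing about the actual data):

```python
from collections import Counter

# Placeholder values standing in for the State column (illustration only).
state_column = ["State A", "State B", "State A", "State C", "State A", "State B"]

# Tally each level, then sort from the largest count to the smallest.
ranked = Counter(state_column).most_common()

most_state, most_count = ranked[0]      # tallest bar
least_state, least_count = ranked[-1]   # shortest bar
print(most_state, most_count)
print(least_state, least_count)
```

Keep in mind that a level with zero rows never appears in the tally at all, so “least” here means the least among the levels present in the data.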
What is the Story Here?
- Figure 4 says that Texas is the worst at book banning!
- Texas, Florida, Oklahoma, Kansas, Indiana, ... are next in line
- Is there a “Bible Belt” story here?
And what, Californians are too busy making money to care about book-banning!!! The state does not even show up in the chart! 😆
What does the second bar chart say?
Your Turn
- Airbnb Price Data on the French Riviera:
- Apartment price vs ground living area:
- Fertility: This rather large and interesting Fertility related dataset from https://vincentarelbundock.github.io/Rdatasets/csv/AER/Fertility.csv