Applied Metaphors: Learning TRIZ, Complexity, Data/Stats/ML using Metaphors
  1. Teaching
  2. Data Analytics for Managers and Creators
  3. Statistical Inference
  4. Basics of Randomization Tests
  • Teaching
    • Data Analytics for Managers and Creators
      • Tools
        • Introduction to R and RStudio
        • Introduction to Radiant
        • Introduction to Orange
      • Descriptive Analytics
        • Data
        • Summaries
        • Counts
        • Quantities
        • Groups
        • Densities
        • Groups and Densities
        • Change
        • Proportions
        • Parts of a Whole
        • Evolution and Flow
        • Ratings and Rankings
        • Surveys
        • Time
        • Space
        • Networks
        • Experiments
        • Miscellaneous Graphing Tools, and References
      • Statistical Inference
        • 🧭 Basics of Statistical Inference
        • 🎲 Samples, Populations, Statistics and Inference
        • Basics of Randomization Tests
        • 🃏 Inference for a Single Mean
        • 🃏 Inference for Two Independent Means
        • 🃏 Inference for Comparing Two Paired Means
        • Comparing Multiple Means with ANOVA
        • Inference for Correlation
        • 🃏 Testing a Single Proportion
        • 🃏 Inference Test for Two Proportions
      • Inferential Modelling
        • Modelling with Linear Regression
        • Modelling with Logistic Regression
        • 🕔 Modelling and Predicting Time Series
      • Predictive Modelling
        • 🐉 Intro to Orange
        • ML - Regression
        • ML - Classification
        • ML - Clustering
      • Prescriptive Modelling
        • 📐 Intro to Linear Programming
        • 💭 The Simplex Method - Intuitively
        • 📅 The Simplex Method - In Excel
      • Workflow
        • Facing the Abyss
        • I Publish, therefore I Am
      • Case Studies
        • Demo:Product Packaging and Elderly People
        • Ikea Furniture
        • Movie Profits
        • Gender at the Work Place
        • Heptathlon
        • School Scores
        • Children's Games
        • Valentine’s Day Spending
        • Women Live Longer?
        • Hearing Loss in Children
        • California Transit Payments
        • Seaweed Nutrients
        • Coffee Flavours
        • Legionnaire’s Disease in the USA
        • Antarctic Sea ice
        • William Farr's Observations on Cholera in London
    • R for Artists and Managers
      • 🕶 Lab-1: Science, Human Experience, Experiments, and Data
      • Lab-2: Down the R-abbit Hole…
      • Lab-3: Drink Me!
      • Lab-4: I say what I mean and I mean what I say
      • Lab-5: Twas brillig, and the slithy toves…
      • Lab-6: These Roses have been Painted !!
      • Lab-7: The Lobster Quadrille
      • Lab-8: Did you ever see such a thing as a drawing of a muchness?
      • Lab-9: If you please sir…which way to the Secret Garden?
      • Lab-10: An Invitation from the Queen…to play Croquet
      • Lab-11: The Queen of Hearts, She Made some Tarts
      • Lab-12: Time is a Him!!
      • Iteration: Learning to purrr
      • Lab-13: Old Tortoise Taught Us
      • Lab-14: You’re are Nothing but a Pack of Cards!!
    • ML for Artists and Managers
      • 🐉 Intro to Orange
      • ML - Regression
      • ML - Classification
      • ML - Clustering
      • 🕔 Modelling Time Series
    • TRIZ for Problem Solvers
      • I am Water
      • I am What I yam
      • Birds of Different Feathers
      • I Connect therefore I am
      • I Think, Therefore I am
      • The Art of Parallel Thinking
      • A Year of Metaphoric Thinking
      • TRIZ - Problems and Contradictions
      • TRIZ - The Unreasonable Effectiveness of Available Resources
      • TRIZ - The Ideal Final Result
      • TRIZ - A Contradictory Language
      • TRIZ - The Contradiction Matrix Workflow
      • TRIZ - The Laws of Evolution
      • TRIZ - Substance Field Analysis, and ARIZ
    • Math Models for Creative Coders
      • Maths Basics
        • Vectors
        • Matrix Algebra Whirlwind Tour
        • content/courses/MathModelsDesign/Modules/05-Maths/70-MultiDimensionGeometry/index.qmd
      • Tech
        • Tools and Installation
        • Adding Libraries to p5.js
        • Using Constructor Objects in p5.js
      • Geometry
        • Circles
        • Complex Numbers
        • Fractals
        • Affine Transformation Fractals
        • L-Systems
        • Kolams and Lusona
      • Media
        • Fourier Series
        • Additive Sound Synthesis
        • Making Noise Predictably
        • The Karplus-Strong Guitar Algorithm
      • AI
        • Working with Neural Nets
        • The Perceptron
        • The Multilayer Perceptron
        • MLPs and Backpropagation
        • Gradient Descent
      • Projects
        • Projects
    • Data Science with No Code
      • Data
      • Orange
      • Summaries
      • Counts
      • Quantity
      • 🕶 Happy Data are all Alike
      • Groups
      • Change
      • Rhythm
      • Proportions
      • Flow
      • Structure
      • Ranking
      • Space
      • Time
      • Networks
      • Surveys
      • Experiments
    • Tech for Creative Education
      • 🧭 Using Idyll
      • 🧭 Using Apparatus
      • 🧭 Using g9.js
    • Literary Jukebox: In Short, the World
      • Italy - Dino Buzzati
      • France - Guy de Maupassant
      • Japan - Hisaye Yamamoto
      • Peru - Ventura Garcia Calderon
      • Russia - Maxim Gorky
      • Egypt - Alifa Rifaat
      • Brazil - Clarice Lispector
      • England - V S Pritchett
      • Russia - Ivan Bunin
      • Czechia - Milan Kundera
      • Sweden - Lars Gustaffsson
      • Canada - John Cheever
      • Ireland - William Trevor
      • USA - Raymond Carver
      • Italy - Primo Levi
      • India - Ruth Prawer Jhabvala
      • USA - Carson McCullers
      • Zimbabwe - Petina Gappah
      • India - Bharati Mukherjee
      • USA - Lucia Berlin
      • USA - Grace Paley
      • England - Angela Carter
      • USA - Kurt Vonnegut
      • Spain-Merce Rodoreda
      • Israel - Ruth Calderon
      • Israel - Etgar Keret
  • Posts
  • Blogs and Talks

On this page

  • Introduction: The Lady who Drank Tea
    • Game
    • Analysis
  • Commentary
  • References
  1. Teaching
  2. Data Analytics for Managers and Creators
  3. Statistical Inference
  4. Basics of Randomization Tests

Basics of Randomization Tests

Randomization
Permutation
Explanatory Variable
Random Number Generation
Null Distributions
Published

November 27, 2022

Modified

June 27, 2024

Abstract
What is meant by a Randomization Test?

Pick a Card

Introduction: The Lady who Drank Tea

There is a famous story about a lady who claimed that tea with milk tasted different depending on whether the milk was added to the tea or the tea added to the milk. The story is famous because of the setting in which she made this claim. She was attending a party in Cambridge, England, in the 1920s. Also in attendance were a number of university dons and their wives. The scientists in attendance scoffed at the woman and her claim. What, after all, could be the difference?

All the scientists but one, that is. Rather than simply dismiss the woman’s claim, he proposed that they decide how one should test the claim. The tenor of the conversation changed at this suggestion, and the scientists began to discuss how the claim should be tested. Within a few minutes cups of tea with milk had been prepared and presented to the woman for tasting.

At this point, you may be wondering who the innovative scientist was and what the results of the experiment were. The scientist was R. A. Fisher, who first described this situation as a pedagogical example in his 1925 book on statistical methodology[^1]. Fisher developed statistical methods that are among the most important and widely used methods to this day, and most of his applications were biological.

Game

Let’s try an experiment. I’ll flip 10 coins. You guess which are heads and which are tails, and we’ll see how you do. Please write down a sequence of “H” or “T”. Comparing with your classmates, we will undoubtedly see that some of you did better and others worse.

What would be your impression of one of you got 9 guesses correct? Is that SKILL or is that something else? What would be your immediate reaction and next move?

Analysis

Back to the Lady who drank Tea !!

Let’s suppose we decide to test the lady with ten cups of tea. We’ll flip a coin to decide which way to prepare the cups. If we flip a head, we will pour the milk in first; if tails, we put the tea in first. Then we present the ten cups to the lady and have her state which ones she thinks were prepared each way.

It is easy to give her a score (9 out of 10, or 7 out of 10, or whatever it happens to be). It is trickier to figure out what to do with her score. Even if she is just guessing and has no idea, she could get lucky and get quite a few correct – maybe even all 10. But how likely is that?

Now let’s suppose the lady gets 9 out of 10 correct. That’s not perfect, but it is better than we would expect for someone who was just guessing. On the other hand, it is not impossible to get 9 out of 10 just by guessing.

So here is Fisher’s great idea: Let’s figure out how hard it is to get 9 out of 10 by guessing. If it’s not so hard to do, then perhaps that’s just what happened ( that she was guessing ), so we won’t be too impressed with the lady’s tea tasting ability. On the other hand, if it is really unusual to get 9 out of 10 correct by guessing, then we will have some evidence that she must be able to tell something ( and has an unusual Skill).

But how do we figure out how unusual it is to get 9 out of 10 just by guessing? Let’s just flip a bunch of coins and keep track. If the lady is just guessing, she might as well be flipping a coin.

So here’s the plan. We’ll flip 10 coins, and repeat that experiment 10000 times. We’ll call the heads correct guesses and the tails incorrect guesses.

heads
   0    1    2    3    4    5    6    7    8    9   10 
   9  110  413 1145 2096 2446 2066 1141  469   97    8 

So what do we conclude? It is possible that the lady could get 9 or 10 correct just by guessing, but it is not very likely (it only happened in about 97+810000=1.05% of our simulations). So one of two things must be true:
• The lady got unusually “lucky”, by chance; OR
• The lady is not just guessing and really has some ability in this regard.

Commentary

First we realize something is surprising, and that we have a question or doubt. This is based on something we see, or measure, a test statistic. In our story, it is the score of 10/10 that the Lady was able to achieve about how the Tea was made.

We then assume the Lady is guessing and somehow by chance able to guess correctly. This would be our….NULL Hypothesis. This is our (conservative) belief about the Real World.

We then randomly generate many Parallel Counterfactual Worlds, where we repeat the experiment many many times, each time calculating the test statistic, under the assumption of the NULL Hypothesis is TRUE.

We see how often our Parallel Worlds can mimic or exceed Real World measurement of the the test statistic by comparison. If this is common (i.e. probability is high) we say we cannot reject the NULL Hypothesis (and the Lady is lucky). If the occurrence is rare, as in our case, we say we have reason to reject the NULL Hypothesis and reason to believe an underlying pattern (and Lady’s ability is beyond Question !)

This is the essence of the Simulation Method in statistical modelling. Take one more look at the picture from Allen Downey’s blog, below:

From Reference #1:

Hypothesis testing can be thought of as a 4-step process:

  1. State the null and alternative hypotheses.

  2. Compute a test statistic.

  3. Determine the p-value.

  4. Draw a conclusion.

    In a traditional introductory statistics course, once this general framework has been mastered, the main work is in applying the correct formula to compute the standard test statistics in step 2 and using a table or computer to determine the p-value based on the known (usually approximate) theoretical distribution of the test statistic under the null hypothesis.

    In a simulation-based approach, steps 2 and 3 change. In Step 2, it is no longer required that the test statistic be normalized to conform with a known, named distribution. Instead, natural test statistics, like the difference between two sample means y1−y2 can be used.

    In Step 3, we use randomization to approximate the sampling distribution of the test statistic. Our lady tasting tea example demonstrates how this can be done from first principles. More typically, we will use randomization to create new simulated data sets ( “Parallel Worlds”) that are like our original data in some ways, but make the null hypothesis true. For each simulated data set, we calculate our test statistic, just as we did for the original sample. Together, this collection of test statistics computed from the simulated samples constitute our randomization distribution.

    When creating a randomization distribution, we will attempt to satisfy 3 guiding principles.

  5. Be consistent with the null hypothesis. We need to simulate a world in which the null hypothesis is true. If we don’t do this, we won’t be testing our null hypothesis.

  6. Use the data in the original sample. The original data should shed light on some aspects of the distribution that are not determined by null hypothesis. For example, a null hypothesis about a mean doesn’t tell us about the shape of the population distribution, but the data give us some indication.

  7. Reflect the way the original data were collected.

From Chihara and Hesterberg:

This is the core idea of statistical significance or classical hypothesis testing – to calculate how often pure random chance would give an effect as large as that observed in the data, in the absence of any real effect. If that probability is small enough, we conclude that the data provide convincing evidence of a real effect.

References

  1. Chapter 11, Hypothesis Testing with Randomization in Introduction to Modern Statistics (1st Ed) by Mine Çetinkaya-Rundel and Johanna Hardin.

  2. Allen Downey, There is still only one test

  3. R.A. Fisher. Statistical Methods for Research Workers. Oliver & Boyd, 1925

  4. https://timesofindia.indiatimes.com/sports/cricket/icc-mens-t20-world-cup/in-numbers-virat-kohli-and-his-strange-luck-with-the-coin-toss/articleshow/87538443.cms

  5. Laura Chihara, Tim Hesterberg, Mathematical Statistics with Resampling and R, Wiley, 2019.

  6. D. Salsburg. The Lady Tasting Tea: How statistics revolutionized science in the twentieth century. W.H. Freeman, New York, 2001

  7. Daniel Kaplan, Nicholas J. Horton, and Randall Pruim, Simulation-based inference with mosaic https://www.mosaic-web.org/mosaic/articles/Resampling.html

Back to top
🎲 Samples, Populations, Statistics and Inference
🃏 Inference for a Single Mean

License: CC BY-SA 2.0

Website made with ❤️ and Quarto, by Arvind V.

Hosted by Netlify .