Basics of Statistical Inference
Introduction
In this set of modules we will explore data, understand what types of data variables there are, and see what kinds of statistical tests and visualizations we can create with them.
The Big Ideas in Stats
Stephen Stigler (Stigler 2016) is the author of the book "The Seven Pillars of Statistical Wisdom". The Big Ideas in Statistics from that book are:
1. Aggregation
The first pillar I will call Aggregation, although it could just as well be given the nineteenth-century name, "The Combination of Observations," or even reduced to the simplest example, taking a mean. Those simple names are misleading, in that I refer to an idea that is now old but was truly revolutionary in an earlier day, and it still is so today, whenever it reaches into a new area of application. How is it revolutionary? By stipulating that, given a number of observations, you can actually gain information by throwing information away! In taking a simple arithmetic mean, we discard the individuality of the measures, subsuming them to one summary.
2. Information
In the early eighteenth century it was discovered that in many situations the amount of information in a set of data was only proportional to the square root of the number n of observations, not the number n itself.
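A small simulation makes this concrete. The sketch below (base R, with a normal population assumed purely for simplicity) quadruples the sample size and shows that the standard error of the mean only halves, in line with the square-root law:

```r
set.seed(42)

# Precision of the sample mean grows only with sqrt(n):
# quadrupling n halves the spread of the sample mean.
for (n in c(100, 400, 1600)) {
  means <- replicate(5000, mean(rnorm(n, mean = 0, sd = 1)))
  cat("n =", n,
      " SE of mean =", round(sd(means), 4),
      " (theory: 1/sqrt(n) =", round(1 / sqrt(n), 4), ")\n")
}
```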
3. Likelihood
By the name I give to the third pillar, Likelihood, I mean the calibration of inferences with the use of probability. The simplest form for this is in significance testing and the common P-value.
4. Intercomparison
It represents what was also once a radical idea and is now commonplace: that statistical comparisons do not need to be made with respect to an exterior standard but can often be made in terms interior to the data themselves. The most commonly encountered examples of intercomparisons are Student's t-tests and the tests of the analysis of variance.
5. Regression
I call the fifth pillar Regression, after Galton's revelation of 1885, explained in terms of the bivariate normal distribution. Galton arrived at this by attempting to devise a mathematical framework for Charles Darwin's theory of natural selection, overcoming what appeared to Galton to be an intrinsic contradiction in the theory: selection required increasing diversity, in contradiction to the appearance of the population stability needed for the definition of species.
6. Design of Experiments and Observations
The sixth pillar is Design, as in "Design of Experiments," but conceived of more broadly, as an ideal that can discipline our thinking in even observational settings. Starting in the late nineteenth century, a new understanding of the topic appeared, as Charles S. Peirce and then Fisher discovered the extraordinary role randomization could play in inference.
7. Residuals
The most common appearance of residuals in Statistics is in model diagnostics (plotting residuals), but more important is the way we explore high-dimensional spaces by fitting and comparing nested models.
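Both uses fit in a few lines of R. A minimal sketch using the built-in mtcars dataset (our choice here, purely for illustration): plot residuals as a diagnostic, then let a nested-model comparison judge whether an extra explanatory variable earns its keep.

```r
# Fit a simple model and inspect its residuals (diagnostics).
fit_small <- lm(mpg ~ wt, data = mtcars)
plot(fitted(fit_small), resid(fit_small),
     xlab = "Fitted values", ylab = "Residuals")
abline(h = 0, lty = 2)

# Compare nested models: does adding hp reduce the residual
# variation by more than chance alone would suggest?
fit_large <- lm(mpg ~ wt + hp, data = mtcars)
anova(fit_small, fit_large)
```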
In our work with Statistical Models, we will encounter all of these ideas except Idea 6.
What is a Statistical Model?
From Daniel Kaplan's book:
"Modeling" is a process of asking questions. "Statistical" refers in part to data: the statistical models you will construct will be rooted in data. But it refers also to a distinctively modern idea: that you can measure what you don't know and that doing so contributes to your understanding.
The conclusions you reach from data depend on the specific questions you ask. The word "modeling" highlights that your goals, your beliefs, and your current state of knowledge all influence your analysis of data. You examine your data to see whether they are consistent with the hypotheses that frame your understanding of the system under study.
Types of Statistical Models Based on Purpose
There are three main uses for statistical models. They are closely related, but distinct enough to be worth enumerating.
Description. Sometimes you want to describe the range or typical values of a quantity. For example, what's a "normal" white blood cell count? Sometimes you want to describe the relationship between things. Example: What's the relationship between the price of gasoline and consumption by automobiles?
Classification or Prediction. You often have information about some observable traits, qualities, or attributes of a system you observe and want to draw conclusions about other things that you can't directly observe. For instance, you know a patient's white blood-cell count and other laboratory measurements and want to diagnose the patient's illness.
Anticipating the consequences of interventions. Here, you intend to do something: you are not merely an observer but an active participant in the system. For example, people involved in setting or debating public policy have to deal with questions like these: To what extent will increasing the tax on gasoline reduce consumption? To what extent will paying teachers more increase student performance?
The appropriate form of a model depends on the purpose. For example, a model that diagnoses a patient as ill based on an observation of a high number of white blood cells can be sensible and useful. But that same model could give absurd predictions about intervention: Do you really think that lowering the white blood cell count by bleeding a patient will make the patient better?
To anticipate the effects of an intervention correctly, you need to get both the direction (polarity) and the magnitude of cause and effect right in your models.
An effect size tells how the output of a model changes when a simple change is made to the input. Effect sizes always involve two variables: a response variable and a single explanatory variable. An effect size is always about a model. The model might have one explanatory variable or many; each explanatory variable has its own effect size, so a model with multiple explanatory variables has multiple effect sizes.
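In a linear model, the fitted coefficients play exactly this role: each one says how much the response changes per unit change in one explanatory variable, the others held fixed. A minimal sketch, again using the built-in mtcars data as a stand-in example:

```r
# A model with two explanatory variables has two effect sizes.
model <- lm(mpg ~ wt + hp, data = mtcars)
coef(model)
# The coefficient on wt is the effect size of weight on mileage;
# the coefficient on hp is the effect size of horsepower.
```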
But for a model used for classification or prediction, it may be unnecessary to represent causation correctly. Instead, other issues, e.g., the reliability of data, can be the most important. One of the thorniest issues in statistical modeling, with tremendous consequences for science, medicine, government, and commerce, is how you can legitimately draw conclusions about interventions from models based on data collected without performing these interventions.
Types of Models Based on Data Variables
Let us look at the famous dataset pertaining to Francis Galton's work on the heights of children and the heights of their parents. We can create four kinds of models based on the types of variables in that dataset.
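A hedged sketch of getting these data into R, assuming the mosaicData package (one of several packages that ship Galton's heights data; HistData is an alternative source):

```r
# install.packages("mosaicData")  # if not already installed
library(mosaicData)
data(Galton)
str(Galton)   # quantitative variables (heights) and a qualitative
              # one (sex) allow several kinds of models
```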
Linear Models Everywhere
One approach in this set of modules is to take the modern view that all these models can be viewed from the standpoint of the Linear Model, also called Linear Regression: \(y = \beta_1 x + \beta_0\). For example, it is relatively straightforward to imagine Plot B (Quant vs Quant) as an example of a Linear Model, with the dependent variable modelled as \(y\) and the independent one as \(x\). We will try to work up to the intuition that this model can be used to understand all the models in the Figure.
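The Lindeløv article in the references works this intuition out test by test. As a first taste, here is a minimal sketch with simulated data showing that a classical two-group t-test (equal variances assumed, so the two match exactly) and a linear model with one qualitative explanatory variable give the same t statistic and p-value:

```r
set.seed(1)
g <- rep(c("A", "B"), each = 30)
y <- c(rnorm(30, mean = 10), rnorm(30, mean = 12))

# The classical test...
t.test(y ~ g, var.equal = TRUE)

# ...is the same inference as a linear model whose single
# explanatory variable is the group label.
summary(lm(y ~ g))
```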
A Flowchart of Statistical Inference Tests
In the flowchart, p marks parametric tests and np marks non-parametric ones.

```mermaid
flowchart TD
    A[Inference for Means] -->|Check Assumptions| B[Normality: Shapiro-Wilk Test shapiro.test\n Variances: Fisher F-test var.test\n Outliers: Box Plots]
    B --> M[Means]
    subgraph Means
        direction TB
        subgraph Single-Mean
            direction LR
            OM[Single Mean] <-->|p| TT[t.test]
            TWT[t.test \n Welch] <-->|p diff var| OM[Single Mean]
            WT[wilcox.test] <-->|np| OM[Single Mean]
        end
        subgraph Paired-Means
            direction LR
            TM[Paired Means] <-->|p| TTP[t.test with pairs]
            WTP[wilcox.test with pairs \n Wilcoxon Signed-Rank Test] -->|np| TM
        end
        subgraph Multiple-Means
            direction LR
            MM[Multiple Means] -->|p| ANO[ANOVA]
            KW[kruskal.test] <-->|np indep| MM
            FT[friedman.test] <-->|np dep| MM
        end
        M --> Single-Mean
        Single-Mean --> Paired-Means
        Paired-Means --> Multiple-Means
    end
    %% subgraph LM
    %%     direction BT
    %%     LM[Linear Model] --> Means
    %% end
```
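As a hedged illustration of walking this flowchart in R, the sketch below uses simulated data: check the assumptions first, then choose between the parametric and non-parametric branches.

```r
set.seed(7)
x <- rnorm(40, mean = 5.2)
y <- rnorm(40, mean = 5.9)

# Check assumptions, as the flowchart suggests.
shapiro.test(x)        # normality of each sample
shapiro.test(y)
var.test(x, y)         # Fisher F-test for equal variances
boxplot(x, y, names = c("x", "y"))   # eyeball outliers

# If the assumptions hold: parametric t-test (Welch by default in R).
t.test(x, y)
# If normality fails: the non-parametric alternative.
wilcox.test(x, y)
```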
References
Chester Ismay and Albert Y. Kim, Statistical Inference via Data Science: A ModernDive into R and the Tidyverse. Available online: https://moderndive.com/index.html
Tihamér von Ghyczy, The Fruitful Flaws of Strategy Metaphors. Harvard Business Review, 2003. https://hbr.org/2003/09/the-fruitful-flaws-of-strategy-metaphors
Daniel T. Kaplan, Statistical Models (second edition). Available online: https://dtkaplan.github.io/SM2-bookdown/
Daniel T. Kaplan, Compact Introduction to Classical Inference, 2020. Available online: https://dtkaplan.github.io/CompactInference/
Daniel T. Kaplan and Frank Shaw, Statistical Modeling: Computational Technique. Available online: https://www.mosaic-web.org/go/SM2-technique/
Jonas Kristoffer Lindeløv, Common statistical tests are linear models (or: how to teach stats). https://lindeloev.github.io/tests-as-linear/
Stephen M. Stigler, The Seven Pillars of Statistical Wisdom. Harvard University Press, 2016.