🃏 Inference for Comparing Two Paired Means
Setting up R Packages
Plot Theme
Show the Code
# https://stackoverflow.com/questions/74491138/ggplot-custom-fonts-not-working-in-quarto
# Chunk options
knitr::opts_chunk$set(
fig.width = 7,
fig.asp = 0.618, # Golden Ratio
# out.width = "80%",
fig.align = "center"
)
### Ggplot Theme
### https://rpubs.com/mclaire19/ggplot2-custom-themes
theme_custom <- function() {
font <- "Roboto Condensed" # assign font family up front
theme_classic(base_size = 14) %+replace% # replace elements we want to change
theme(
panel.grid.minor = element_blank(), # strip minor gridlines
text = element_text(family = font),
# text elements
plot.title = element_text( # title
family = font, # set font family
# size = 20, #set font size
face = "bold", # bold typeface
hjust = 0, # left align
# vjust = 2 #raise slightly
margin = margin(0, 0, 10, 0)
),
plot.subtitle = element_text( # subtitle
family = font, # font family
# size = 14, #font size
hjust = 0,
margin = margin(2, 0, 5, 0)
),
plot.caption = element_text( # caption
family = font, # font family
size = 8, # font size
hjust = 1
), # right align
axis.title = element_text( # axis titles
family = font, # font family
size = 10 # font size
),
axis.text = element_text( # axis text
family = font, # axis family
size = 8
) # font size
)
}
# Set graph theme
theme_set(new = theme_custom())
#
Introduction to Inference for Paired Data
What is Paired Data?
Sometimes the data is collected on the same set of individual categories, e.g. scores by sport persons in two separate tournaments, or sales of identical items in two separate locations of a chain store. Or say the number of customers in the morning and in the evening, at a set of store locations. In this case we treat the two sets of observations as paired, since they correspond to the same set of observable entities. This is how some experiments give us paired data.
We would naturally be interested in the differences in means across these two sets, which exploits this paired nature. In this module, we will examine tests for this purpose.
Workflow for Inference for Paired Means
flowchart TD A[Inference for Paired Means] -->|Check Assumptions| B[Normality: Shapiro-Wilk Test shapiro.test\n Variances: Fisher F-test var.test] B --> C{OK?} C -->|Yes, both\n Parametric| D[t.test with paired=TRUE] C -->|Yes, but not variance\n Parametric| W[t.test with\n Welch Correction, paired =TRUE] C -->|No\n Non-Parametric| E[wilcox.test with paired = TRUE] C -->|No\n Non-Parametric| P[Bootstrap\n or\n Permutation]
We will now use a couple to case studies to traverse all the possible pathways in the Workflow above.
Case Study #1: Results from a Diving Championship
Here we have swimming records across a Semi-Final and a Final:
Inspecting and Charting Data
data("Diving2017", package = "resampledata3")
Diving2017
Diving2017_inspect <- inspect(Diving2017)
Diving2017_inspect$categorical
Diving2017_inspect$quantitative
The data is made up of paired observations per swimmer, one for the semi-final and one for the final race. There are 12 swimmers and therefore 12 paired records. How can we quickly visualize this data?
Let us first make this data into long form:
# Set graph theme
theme_set(new = theme_custom())
#
Diving2017_long <- Diving2017 %>%
pivot_longer(
cols = c(Final, Semifinal),
names_to = "race",
values_to = "scores"
)
Diving2017_long
Next, histograms and densities of the two variables at hand:
Diving2017_long %>%
gf_col(
fct_reorder(Name, scores) ~ scores,
fill = ~race,
alpha = 0.5,
position = "dodge",
xlab = "Scores",
ylab = "Name",
title = "Diving Scores"
)
Diving2017_long %>%
gf_boxplot(
scores ~ race,
fill = ~race,
alpha = 0.5,
xlab = "Race",
ylab = "Scores",
title = "Diving Scores"
)
We see that:
- The data are not normally distributed. With just such few readings (n < 30) it was just possible…more readings would have helped. We will verify this aspect formally very shortly.
- There is no immediately identifiable trend in score changes from one race to the other.
- Although the two medians appear to be different, the box plots overlap considerably. So one cannot visually conclude that the two sets of race timings have different means.
A. Check for Normality
Let us also complete a check for normality: the shapiro.wilk
test checks whether a Quant variable is from a normal distribution; the NULL hypothesis is that the data are from a normal distribution.
shapiro.test(Diving2017$Final)
shapiro.test(Diving2017$Semifinal)
Shapiro-Wilk normality test
data: Diving2017$Final
W = 0.9184, p-value = 0.273
Shapiro-Wilk normality test
data: Diving2017$Semifinal
W = 0.86554, p-value = 0.05738
Hmmm….the Shapiro-Wilk test suggests that both scores are normally distributed (!!!), though Semifinal
is probably marginally so.
Can we check this comparison also with plots? We can plot Q-Q plots for both variables, and also compare both data with normally-distributed data generated with the same means and standard deviations:
# Set graph theme
theme_set(new = theme_custom())
#
set.rseed(1234)
Diving2017 %>%
mutate(
Final_norm = rnorm(
n = 12,
mean = mean(Final),
sd = sd(Final)
),
Semifinal_norm = rnorm(
n = 12,
mean = mean(Semifinal),
sd = sd(Semifinal)
)
) %>%
pivot_longer(
cols =
c(Semifinal, Final, Semifinal_norm, Final_norm),
names_to = "score_type", values_to = "value"
) %>%
gf_boxplot(value ~ score_type,
fill = ~score_type,
show.legend = FALSE
) %>%
gf_labs(title = "Comparing Data and Normal Boxplots")
###
Diving2017_long %>%
gf_qq(~ scores | race, size = 2) %>%
gf_qqline(ylab = "scores", xlab = "theoretical normal")
While the boxplots are not very evocative, we see in the QQ-plots that the Final
scores are closer to the straight line than the Semifinal
scores. But it is perhaps still hard to accept the data as normally distributed…hmm.
B. Check for Variances
Let us check if the two variables have similar variances: the var.test
does this for us, with a NULL hypothesis that the variances are not significantly different:
var.test(scores ~ race,
data = Diving2017_long,
ratio = 1, # What we believe
conf.int = TRUE,
conf.level = 0.95
) %>%
broom::tidy()
The variances are not significantly different, as seen by the \(p.value = 0.08\).
So to summarise our data checks:
- data are normally distributed
- variances are not significantly different
Hypothesis
Based on the graph, how would we formulate our Hypothesis? We wish to infer whether there is any change in performance between the population of swimmers who might have participated in these two races. So accordingly:
\[ H_0: \mu_{semifinal} = \mu_{final}\\ \]
\[ H_a: \mu_{semifinal} \ne \mu_{final}\ \]
Observed and Test Statistic
What would be the test statistic we would use? The difference in means. Is the observed difference in the means between the two groups of scores non-zero? We use the diffmean
function, with the argument only.2 = FALSE
to allow for paired data:
obs_diff_swim <- diffmean(scores ~ race,
data = Diving2017_long,
only.2 = FALSE
) # paired data
# Can use this also
# formula method is better for permutation test!
# obs_diff_swim <- mean(~ (Final - Semifinal), data = Diving2017)
obs_diff_swim
diffmean
-11.975
Inference
Since the data variables satisfy the assumption of being normally distributed, and the variances are not significantly different, we may attempt the classical t.test
with paired data. (we will use the mosaic
variant). Type help(t.test)
in your Console. Our model would be:
\[ mean(Final(i) - Semi\_final(i)) = \beta_0 \\ \]
And that: \[ H_0: \mu_{final} - \mu_{semifinal} = 0; \] \[ H_a: \mu_{final} - \mu_{semifinal} \ne 0;\\ \]
mosaic::t.test(
x = Diving2017$Semifinal,
y = Diving2017$Final,
paired = TRUE, var.equal = FALSE
) %>% broom::tidy()
The confidence interval
spans the zero value, and the p.value
is a high \(0.259\), so there is no reason to accept alternative hypothesis that the means are different. Hence we say that there is no evidence of a difference between SemiFinal
and Final
scores.
Well, we might consider ( based on knowledge of the sport ) that at least one of the variables does not meet the normality criteria, and though their variances are not significantly different. So we would attempt a non-parametric Wilcoxon test, that uses the signed-rank of the paired data differences, instead of the data variables. Our model would be:
\[ mean(\ sign.rank[\ Final(i) - Semifinal(i)\ ]\ ) = \beta_0 \\ \] \[ H_0: \mu_{final} - \mu_{semifinal} = 0; \] \[ H_a: \mu_{final} - \mu_{semifinal} \ne 0;\\ \]
wilcox.test(
x = Diving2017$Semifinal,
y = Diving2017$Final,
mu = 0, # belief
alternative = "two.sided", # difference either way
paired = TRUE,
conf.int = TRUE,
conf.level = 0.95
) %>%
broom::tidy()
Here also with the p.value
being \(0.3804\), we have no reason to accept the Alternative Hypothesis. The parametric t.test
and the non-parametric wilcox.test
agree in their inferences.
We can apply the linear-model-as-inference interpretation both to the original data and to the sign.rank data:
\[ lm(y_i - x_i \sim 1) = \beta_0\\ \\ and\\ lm(\ sign.rank[\ Final(i) - Semifinal(i)\ ] \sim 1) = \beta_0\\ \]
And the Hypothesis for both interpretations would be:
\[
H_0: \beta_0 = 0\\
\\\
H_a: \beta_0 \ne 0\\
\]
lm(Semifinal - Final ~ 1, data = Diving2017) %>%
broom::tidy(conf.int = TRUE, conf.level = 0.95)
# Create a sign-rank function
signed_rank <- function(x) {
sign(x) * rank(abs(x))
}
lm(signed_rank(Semifinal - Final) ~ 1,
data = Diving2017
) %>%
broom::tidy(conf.int = TRUE, conf.level = 0.95)
We observe that using the linear model method for the original scores and the sign-rank scores both sdo not permit us to reject the \(H_0\) Null Hypothesis, since p.values
are high, and the confidence.intervals
straddle \(0\).
For the specific data at hand, we need to shuffle the records between Semifinal
and Final
on a per Swimmer basis (paired data!!) and take the test statistic
(difference between the two swim records for each swimmer). Another way to look at this is to take the differences between Semifinal
and Final
scores and shuffle the differences to either polarity. We will follow this method in the code below:
polarity <- c(rep(1, 6), rep(-1, 6))
# 12 +/- 1s,
# 6 each to make sure there is equal probability
polarity
##
null_dist_swim <- do(4999) *
mean(
data = Diving2017,
~ (Final - Semifinal) * # take (pairwise) differences
mosaic::resample(polarity, # Swap polarity randomly
replace = TRUE
)
)
##
null_dist_swim
[1] 1 1 1 1 1 1 -1 -1 -1 -1 -1 -1
Let us plot the NULL distribution and compare it with the actual observed differences
in the race times:
# Set graph theme
theme_set(new = theme_custom())
#
gf_histogram(data = null_dist_swim, ~mean) %>%
gf_vline(
xintercept = obs_diff_swim,
colour = "red",
linewidth = 1
)
###
gf_ecdf(data = null_dist_swim, ~mean, linewidth = 1) %>%
gf_vline(
xintercept = obs_diff_swim,
colour = "red",
linewidth = 1
)
###
prop1(~ mean <= obs_diff_swim, data = null_dist_swim)
Hmm…so by generating 4999 shuffles of score-difference polarities, it does appear that we can not only obtain the current observed difference but even surpass it frequently. So it does seem that there is no difference in means between Semi-Final and Final swimming scores.
All Tests Together
We can put all the test results together to get a few more insights about the tests:
mosaic::t.test(
x = Diving2017$Semifinal,
y = Diving2017$Final,
paired = TRUE
) %>%
broom::tidy() %>%
gt() %>%
tab_style(
style = list(cell_fill(color = "cyan"), cell_text(weight = "bold")),
locations = cells_body(columns = p.value)
) %>%
tab_header(title = "t.test")
lm(Semifinal - Final ~ 1, data = Diving2017) %>%
broom::tidy(conf.int = TRUE, conf.level = 0.95) %>%
gt() %>%
tab_style(
style = list(cell_fill(color = "cyan"), cell_text(weight = "bold")),
locations = cells_body(columns = p.value)
) %>%
tab_header(title = "Linear Model")
wilcox.test(
x = Diving2017$Semifinal,
y = Diving2017$Final,
paired = TRUE
) %>%
broom::tidy() %>%
gt() %>%
tab_style(
style = list(cell_fill(color = "palegreen"), cell_text(weight = "bold")),
locations = cells_body(columns = p.value)
) %>%
tab_header(title = "Wilcoxon test")
lm(signed_rank(Semifinal - Final) ~ 1,
data = Diving2017
) %>%
broom::tidy(conf.int = TRUE, conf.level = 0.95) %>%
gt() %>%
tab_style(
style = list(cell_fill(color = "palegreen"), cell_text(weight = "bold")),
locations = cells_body(columns = p.value)
) %>%
tab_header(title = "Linear Model with sign.rank")
t.test | |||||||
---|---|---|---|---|---|---|---|
estimate | statistic | p.value | parameter | conf.low | conf.high | method | alternative |
-11.975 | -1.190339 | 0.2589684 | 11 | -34.11726 | 10.16726 | Paired t-test | two.sided |
Linear Model | ||||||
---|---|---|---|---|---|---|
term | estimate | std.error | statistic | p.value | conf.low | conf.high |
(Intercept) | -11.975 | 10.06016 | -1.190339 | 0.2589684 | -34.11726 | 10.16726 |
Wilcoxon test | |||
---|---|---|---|
statistic | p.value | method | alternative |
27 | 0.3803711 | Wilcoxon signed rank exact test | two.sided |
Linear Model with sign.rank | ||||||
---|---|---|---|---|---|---|
term | estimate | std.error | statistic | p.value | conf.low | conf.high |
(Intercept) | -2 | 2.135558 | -0.9365236 | 0.3691097 | -6.70033 | 2.70033 |
The linear model and the t.test
are nearly identical in performance; the p.values are the same. The same is also true of the wilcox.test
and the linear model with sign-rank data differences. This is of course not surprising!
Case Study #2: Walmart vs Target
Is there a difference in the price of Groceries sold by the two retailers Target and Walmart? The data set Groceries
contains a sample of grocery items and their prices advertised on their respective web sites on one specific day. We will:
- Inspect the data set, then explain why this is an example of matched pairs data.
- Compute summary statistics of the prices for each store.
- Conduct a permutation test to determine whether or not there is a difference in the mean prices.
- Create a
histogrambar-chart of the difference in prices. What is unusual about Quaker Oats Life cereal? - Redo the hypothesis test without this observation. Would we reach the same conclusion?
Inspecting and Charting Data
data("Groceries")
Groceries <- Groceries %>%
mutate(Product = stringr::str_squish(Product)) # Knock off extra spaces
Groceries
Groceries_inspect <- inspect(Groceries)
Groceries_inspect$categorical
Groceries_inspect$quantitative
There are just 30 prices for each vendor….just barely enough to get an idea of what the distribution might be. Let us plot the prices for the products, as box plots after pivoting the data to long form, 1 and as bar charts:
## Set graph theme
theme_set(new = theme_custom())
##
Groceries_long <- Groceries %>%
pivot_longer(
cols = c(Walmart, Target),
names_to = "store",
values_to = "prices"
) %>%
mutate(store = as_factor(store))
Let us plot histograms/densities of the two variables that we wish to compare. We will also overlay a Gaussian distribution for comparison:
Not close to the Gaussian…there is clearly some skew to the right, with some items being very costly compared to the rest. More when we check the assumptions on data for the tests.
How about price differences, what we are interested in?
## Set graph theme
theme_set(new = theme_custom())
##
Groceries_long %>%
gf_col(fct_reorder(Product, prices) ~ prices,
fill = ~store,
alpha = 0.5,
position = "dodge",
xlab = "Prices",
ylab = "",
title = "Groceries Costs"
) %>%
gf_col(
data =
Groceries_long %>%
filter(
Product == "Quaker Oats Life Cereal Original"
),
fct_reorder(Product, prices) ~ prices,
fill = ~store,
position = "dodge"
)
We see that the price difference between Walmart and Target prices is highest for the Product
named Quaker Oats Life Cereal Original
. Apart from this Product, the rest have no discernible trend either way. Let us check observed statistic
(the mean difference in prices)
obs_diff_price <- diffmean(prices ~ store,
data = Groceries_long,
only.2 = FALSE
)
# Can also use
# obs_diff_price <- mean( ~ Walmart - Target, data = Groceries)
obs_diff_price
diffmean
0.05666667
Hypothesis
Based on the graph, how would we formulate our Hypothesis? We wish to infer whether there is any change in prices, per product between the two Store chains. So accordingly:
\[ H_0: \mu_{Walmart} = \mu_{Target}\\ \]
\[ H_a: \mu_{Walmart} \ne \mu_{Target}\ \]
Testing for Assumptions on the Data
There are a few checks we need to make of our data, to decide what test procedure to use.
A. Check for Normality
shapiro.test(Groceries$Walmart)
shapiro.test(Groceries$Target)
Shapiro-Wilk normality test
data: Groceries$Walmart
W = 0.78662, p-value = 3.774e-05
Shapiro-Wilk normality test
data: Groceries$Target
W = 0.79722, p-value = 5.836e-05
For both tests, we see that the p.value
is very small, indicating that the data are unlikely to be normally distributed. This means we cannot apply a standard paired t.test
and need to use the non-parametric wilcox.test
, that does not rely on the assumption of normality.
B. Check for Variances
Let us check if the two variables have similar variances:
var.test(Groceries$Walmart, Groceries$Target)
F test to compare two variances
data: Groceries$Walmart and Groceries$Target
F = 0.97249, num df = 29, denom df = 29, p-value = 0.9406
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.4628695 2.0431908
sample estimates:
ratio of variances
0.9724868
It appears from the \(p.value = 0.9\) and the \(Confidence Interval = [0.4629, 2.0432]\), which includes \(1\), that we cannot reject the NULL Hypothesis that the variances are not significantly different.
Inference
Well, the variables are not normally distributed, so a standard t.test
is not advised, even if the variances are similar. We can still try:
The p.value
is \(0.64\) ! And the Confidence Interval
straddles \(0\). So the t.test
gives us no reason to reject the Null Hypothesis that the means are similar. But can we really believe this, given the non-normality of data?
However, we have seen that the data variables are not normally distributed. So a Wilcoxon Test, using signed-ranks, is indicated: (recall the model!)
# For stability reasons, it may be advisable to use rounded data or to set digits.rank = 7, say,
# such that determination of ties does not depend on very small numeric differences (see the example).
wilcox.test(Groceries$Walmart, Groceries$Target,
data = Groceries_long,
digits.rank = 7, paired = TRUE,
conf.int = TRUE, conf.level = 0.95
) %>%
broom::tidy()
The Wilcoxon test result is very interesting: the p.value
says there is a significant difference between the two store prices, and the confidence.interval
also is unipolar…
As before we can do the linear model for both the original data and the sign.rank data. The test statistic
is again the difference between thetwo variables:
lm(Target - Walmart ~ 1, data = Groceries) %>%
broom::tidy(conf.int = TRUE, conf.level = 0.95)
# Create a sign-rank function
signed_rank <- function(x) {
sign(x) * rank(abs(x))
}
lm(signed_rank(Target - Walmart) ~ 1,
data = Groceries
) %>%
broom::tidy(conf.int = TRUE, conf.level = 0.95)
Very interesting results, but confirming what we saw earlier: The Linear Model with the original data reports no significant difference, but the linear model with sign-ranks, suggests there is a significant difference in means prices between stores!
Let us perform the pair-wise permutation test on prices, by shuffling the two store names:
# | layout: [[15, 85, 15]]
# Set graph theme
theme_set(new = theme_custom())
#
polarity <- c(rep(1, 15), rep(-1, 15))
##
null_dist_price <- do(9999) *
mean(
data = Groceries,
~ (Target - Walmart) *
resample(polarity, replace = TRUE)
)
null_dist_price
##
gf_histogram(data = null_dist_price, ~mean) %>%
gf_vline(xintercept = obs_diff_price, colour = "red")
prop1(~mean, data = null_dist_price)
prop_-0.292
2e-04
Does not seem to be any significant difference in prices…
All Tests Together
We can put all the test results together to get a few more insights about the tests:
mosaic::t_test(Groceries$Walmart, Groceries$Target, paired = TRUE) %>%
broom::tidy() %>%
gt() %>%
tab_style(
style = list(
cell_fill(color = "cyan"),
cell_text(weight = "bold")
),
locations = cells_body(columns = p.value)
) %>%
tab_header(title = "t.test")
###
lm(Target - Walmart ~ 1, data = Groceries) %>%
broom::tidy(conf.int = TRUE, conf.level = 0.95) %>%
gt() %>%
tab_style(
style = list(
cell_fill(color = "cyan"),
cell_text(weight = "bold")
),
locations = cells_body(columns = p.value)
) %>%
tab_header(title = "Linear Model")
###
wilcox.test(Groceries$Walmart, Groceries$Target,
digits.rank = 7,
paired = TRUE,
conf.int = TRUE,
conf.level = 0.95
) %>%
broom::tidy() %>%
gt() %>%
tab_style(
style = list(
cell_fill(color = "palegreen"),
cell_text(weight = "bold")
),
locations = cells_body(columns = p.value)
) %>%
tab_header(title = "Wilcoxon Test")
###
lm(signed_rank(Target - Walmart) ~ 1,
data = Groceries
) %>%
broom::tidy(conf.int = TRUE, conf.level = 0.95) %>%
gt() %>%
tab_style(
style = list(
cell_fill(color = "palegreen"),
cell_text(weight = "bold")
),
locations = cells_body(columns = p.value)
) %>%
tab_header(title = "Linear Model with Sign.Ranks")
t.test | |||||||
---|---|---|---|---|---|---|---|
estimate | statistic | p.value | parameter | conf.low | conf.high | method | alternative |
-0.05666667 | -0.4704556 | 0.6415488 | 29 | -0.3030159 | 0.1896825 | Paired t-test | two.sided |
Linear Model | ||||||
---|---|---|---|---|---|---|
term | estimate | std.error | statistic | p.value | conf.low | conf.high |
(Intercept) | 0.05666667 | 0.1204506 | 0.4704556 | 0.6415488 | -0.1896825 | 0.3030159 |
Wilcoxon Test | ||||||
---|---|---|---|---|---|---|
estimate | statistic | p.value | conf.low | conf.high | method | alternative |
-0.104966 | 95 | 0.01431746 | -0.1750051 | -0.03005987 | Wilcoxon signed rank test with continuity correction | two.sided |
Linear Model with Sign.Ranks | ||||||
---|---|---|---|---|---|---|
term | estimate | std.error | statistic | p.value | conf.low | conf.high |
(Intercept) | 8.533333 | 2.888834 | 2.953902 | 0.006167464 | 2.625004 | 14.44166 |
Clearly, the parametric tests do not detect a significant difference in prices, whereas the non-parametric tests do.
Suppose we knock off the Quaker Cereal
data item…(note the spaces in the product name)
## Set graph theme
theme_set(new = theme_custom())
##
set.seed(12345)
Groceries_less <- Groceries %>%
filter(Product != "Quaker Oats Life Cereal Original")
##
Groceries_less_long <- Groceries_less %>%
pivot_longer(
cols = c(Target, Walmart),
names_to = "store",
values_to = "prices"
)
##
wilcox.test(Groceries_less$Walmart,
Groceries_less$Target,
paired = TRUE, digits.rank = 7,
conf.int = TRUE,
conf.level = 0.95
) %>%
broom::tidy()
##
obs_diff_price_less <-
mean(~ (Target - Walmart), data = Groceries_less)
obs_diff_price_less
polarity_less <- c(rep(1, 15), rep(-1, 14))
# Due to resampling this small bias makes no difference
null_dist_price_less <-
do(9999) * mean(
data = Groceries_less,
~ (Target - Walmart) * resample(polarity_less, replace = TRUE)
)
##
gf_histogram(data = null_dist_price_less, ~mean) %>%
gf_vline(
xintercept = obs_diff_price_less,
colour = "red"
)
##
mean(null_dist_price_less >= obs_diff_price_less)
We see that removing the Quaker Oats product item from the data does give a significant difference in mean prices !!! That one price difference was in the opposite direction compared to the general trend in differences, so when it was removed, we obtained a truer picture of price differences.
Try to do a regular parametric t.test with this reduced data!
Wait, But Why?
Conclusion
We have learnt how to perform inference for paired-means. We have looked at the conditions that make the regular t.test
possible, and learnt what to do if the conditions of normality and equal variance are not met. We have also looked at how these tests can be understood as manifestations of the linear model, with data and sign-ranked data. It should also be fairly clear now that we can test for the equivalence of two paired means, using a very simple permutation tests. Given computing power, we can always mechanize this test very quickly to get our results. And that performing this test yields reliable results without having to rely on any assumption relating to underlying distributions and so on.
Your Turn
Try the datasets in the
openintro
package. Usedata(package = "openintro")
in your Console to list out the data packages. Then simply type the name of the dataset in a Quarto chunk ( e.g.babynames
) to read it.Same with the
resampledata
andresampledata3
packages.Try the datasets in the
PairedData
package. Yes, install this too, peasants.
References
- Paired Independence test with the package
infer
: https://infer.netlify.app/articles/paired - Randall Pruim, Nicholas J. Horton, Daniel T. Kaplan, StartTeaching with R
- https://bcs.wiley.com/he-bcs/Books?action=index&itemId=111941654X&bcsId=11307
- https://statsandr.com/blog/wilcoxon-test-in-r-how-to-compare-2-groups-under-the-non-normality-assumption/
R Package Citations
Package | Version | Citation |
---|---|---|
gt | 0.11.1 | @gt |
infer | 1.0.7 | @infer |
MKinfer | 1.2 | @MKinfer |
openintro | 2.5.0 | @openintro |