Heptathlon
Setting up R Packages
Plot Theme
# https://stackoverflow.com/questions/74491138/ggplot-custom-fonts-not-working-in-quarto
library(ggplot2) # provides theme_classic(), theme(), element_text(), theme_set()

# Chunk options
knitr::opts_chunk$set(
  fig.width = 7,
  fig.asp = 0.618, # height / width, the inverse of the Golden Ratio
  # out.width = "80%",
  fig.align = "center"
)

### Ggplot Theme
### https://rpubs.com/mclaire19/ggplot2-custom-themes
theme_custom <- function() {
  font <- "Roboto Condensed" # assign font family up front

  theme_classic(base_size = 14) %+replace% # replace elements we want to change
    theme(
      panel.grid.minor = element_blank(), # strip minor gridlines
      text = element_text(family = font),
      # text elements
      plot.title = element_text( # title
        family = font, # font family
        size = 20, # font size
        face = "bold", # bold typeface
        hjust = 0, # left align
        # vjust = 2 # raise slightly
        margin = margin(0, 0, 10, 0)
      ),
      plot.subtitle = element_text( # subtitle
        family = font, # font family
        size = 14, # font size
        hjust = 0, # left align
        margin = margin(2, 0, 5, 0)
      ),
      plot.caption = element_text( # caption
        family = font, # font family
        size = 8, # font size
        hjust = 1 # right align
      ),
      axis.title = element_text( # axis titles
        family = font, # font family
        size = 10 # font size
      ),
      axis.text = element_text( # axis text
        family = font, # font family
        size = 8 # font size
      )
    )
}

# Set graph theme
theme_set(new = theme_custom())
Introduction
This dataset records the results of multiple athletes in the seven events that make up the heptathlon, together with each athlete's overall score. It has been modified for ease of analysis and plotting.
Data
library(HSAUR)
heptathlon
| | hurdles <dbl> | highjump <dbl> | shot <dbl> | run200m <dbl> | longjump <dbl> | javelin <dbl> | run800m <dbl> | score <int> |
|---|---|---|---|---|---|---|---|---|
| Joyner-Kersee (USA) | 12.69 | 1.86 | 15.80 | 22.56 | 7.27 | 45.66 | 128.51 | 7291 |
| John (GDR) | 12.85 | 1.80 | 16.23 | 23.65 | 6.71 | 42.56 | 126.12 | 6897 |
| Behmer (GDR) | 13.20 | 1.83 | 14.20 | 23.10 | 6.68 | 44.54 | 124.20 | 6858 |
| Sablovskaite (URS) | 13.61 | 1.80 | 15.23 | 23.92 | 6.25 | 42.78 | 132.24 | 6540 |
| Choubenkova (URS) | 13.51 | 1.74 | 14.76 | 23.93 | 6.32 | 47.46 | 127.90 | 6540 |
| Schulz (GDR) | 13.75 | 1.83 | 13.50 | 24.65 | 6.33 | 42.82 | 125.79 | 6411 |
| Fleming (AUS) | 13.38 | 1.80 | 12.88 | 23.59 | 6.37 | 40.28 | 132.54 | 6351 |
| Greiner (USA) | 13.55 | 1.80 | 14.13 | 24.48 | 6.47 | 38.00 | 133.65 | 6297 |
| Lajbnerova (CZE) | 13.63 | 1.83 | 14.28 | 24.86 | 6.11 | 42.20 | 136.05 | 6252 |
| Bouraga (URS) | 13.25 | 1.77 | 12.62 | 23.59 | 6.28 | 39.06 | 134.74 | 6252 |
Download the Modified data
Not Applicable!
Data Dictionary
Note: Quantitative Variables
Write in.
Note: Qualitative Variables
Write in.
Note: Observations
Write in.
Analyse the Data
```{r}
#| label: data-preprocessing
#
# Write in your code here
# to prepare this data as shown below
# to generate the plot that follows
```
| predictor <chr> | estimate <dbl> | statistic <dbl> | p.value <dbl> | parameter <int> | conf.low <dbl> | conf.high <dbl> |
|---|---|---|---|---|---|---|
| highjump | -0.811402536 | -6.657710 | 8.596744e-07 | 23 | -0.9136181 | -0.6127140 |
| shot | -0.651334688 | -4.116671 | 4.209051e-04 | 23 | -0.8322704 | -0.3449917 |
| run200m | 0.773720543 | 5.857103 | 5.719615e-06 | 23 | 0.5452989 | 0.8951771 |
| longjump | -0.912133617 | -10.672149 | 2.209400e-10 | 23 | -0.9609327 | -0.8083373 |
| javelin | -0.007762549 | -0.037229 | 9.706237e-01 | 23 | -0.4016615 | 0.3885602 |
| run800m | 0.779257110 | 5.963236 | 4.430802e-06 | 23 | 0.5550223 | 0.8979129 |
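The columns of the table above match what `cor.test()` reports when each event is paired with `hurdles`. The following is a minimal base-R sketch of how a table of this shape might be assembled; it is one possible approach and not necessarily the one used to produce the table. The helper name `correlate_with` and the data frame `hept10` are hypothetical. `hept10` holds only the ten rows displayed in the Data section, so its estimates will differ from the table above, whose 23 degrees of freedom imply 25 athletes.

```r
# Hypothetical helper: correlate one target column with every other column.
# Returns one row per predictor with the main fields of stats::cor.test().
correlate_with <- function(data, target, exclude = character()) {
  predictors <- setdiff(names(data), c(target, exclude))
  rows <- lapply(predictors, function(p) {
    ct <- cor.test(data[[p]], data[[target]]) # Pearson correlation by default
    data.frame(
      predictor = p,
      estimate  = unname(ct$estimate),   # correlation coefficient
      statistic = unname(ct$statistic),  # t statistic
      p.value   = ct$p.value,
      parameter = as.integer(ct$parameter), # degrees of freedom, n - 2
      conf.low  = ct$conf.int[1],
      conf.high = ct$conf.int[2],
      stringsAsFactors = FALSE
    )
  })
  do.call(rbind, rows)
}

# Assumption: only the ten rows shown in the Data section, typed in by hand.
hept10 <- data.frame(
  hurdles  = c(12.69, 12.85, 13.20, 13.61, 13.51, 13.75, 13.38, 13.55, 13.63, 13.25),
  highjump = c(1.86, 1.80, 1.83, 1.80, 1.74, 1.83, 1.80, 1.80, 1.83, 1.77),
  shot     = c(15.80, 16.23, 14.20, 15.23, 14.76, 13.50, 12.88, 14.13, 14.28, 12.62),
  run200m  = c(22.56, 23.65, 23.10, 23.92, 23.93, 24.65, 23.59, 24.48, 24.86, 23.59),
  longjump = c(7.27, 6.71, 6.68, 6.25, 6.32, 6.33, 6.37, 6.47, 6.11, 6.28),
  javelin  = c(45.66, 42.56, 44.54, 42.78, 47.46, 42.82, 40.28, 38.00, 42.20, 39.06),
  run800m  = c(128.51, 126.12, 124.20, 132.24, 127.90, 125.79, 132.54, 133.65, 136.05, 134.74),
  score    = c(7291, 6897, 6858, 6540, 6540, 6411, 6351, 6297, 6252, 6252)
)

# Correlate every event with hurdles, leaving out the overall score
cor_table <- correlate_with(hept10, target = "hurdles", exclude = "score")
print(cor_table)
```

With `HSAUR` loaded, calling `correlate_with(heptathlon, target = "hurdles", exclude = "score")` would run the same calculation on the full dataset.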
Plot the Data
Task and Discussion
Complete the Data Dictionary. Create the graph shown and discuss the following questions:
- Identify the type of chart.
- Identify the variables used for the various geometrical aspects (x, y, fill…). Name the variables appropriately.
- Which events in the 7-event heptathlon are most highly correlated with scores in hurdles?
- If an athlete were a record holder in both high jump and hurdles, what would you conclude about them? Justify your answer using the graph.