Lab 01 - Installation

Installing and Getting Started with R!

Author

Arvind Venkatadri

Published

July 6, 2021

Modified

July 29, 2025

Abstract

Part of my R Workshop course using the idea of metaphors in written language.

Goals

At the end of this Lab, we will:

have installed R and RStudio on our machines
understood how to add additional R-packages for specific features and graphic capability
run code within RStudio and interpret the results
have learnt to look for help within R and RStudio

Overview

This guide will lead you through the steps to install and use R, a free and open-source software environment for statistical computing and graphics.

What is R?

R is the name of the programming language itself, based off S from Bell Labs, which users access through a command-line interpreter (>)

What is RStudio?

RStudio is a powerful and convenient user interface that allows you to access the R programming language along with a lot of other bells and whistles that enhance functionality (and sanity).

Our end goal is to get you looking at a screen like this:

Install R

Install R from CRAN, the Comprehensive R Archive Network. Please choose a precompiled binary distribution for your operating system.

Check in

Launch R by clicking this logo. You should see one console with a command line interpreter. Try typing 2 + 2 and check !

Close R.

Install RStudio

Install the free, open-source edition of RStudio: http://www.rstudio.com/products/rstudio/download/

RStudio provides a powerful user interface for R, called an integrated development environment. RStudio includes:

a console (the standard command line interface: >),
a syntax-highlighting editor that supports direct code execution, and
tools for plotting, history, debugging and workspace management.

Check in

Launch RStudio. You should get a window similar to the screenshot you see here, but yours will be empty. Look at the bottom left pane: this is the same console window you saw when you opened R in step @Check-In-R

Place your cursor where you see > and type x <- 2 + 2 again hit enter or return, then type x, and hit enter/return again.
If [1] 4 prints to the screen, you have successfully installed R and RStudio, and you can move onto installing packages.

Install packages

The version of R that you just downloaded is considered base R, which provides you with good but basic statistical computing and graphics powers. For analytical and graphical super-powers, you’ll need to install add-on packages, which are user-written, to extend/expand your R capabilities. They may be carefully curated by CRAN (which involves a thorough submission and review process), and thus are easy install using install.packages("name_of_package", dependencies = TRUE) in your CONSOLE.

Place your cursor in the CONSOLE again (where you last typed x and [4] printed on the screen). You can use the first method to install the following packages directly from CRAN, all of which we will use:

knitr,
dplyr,
ggplot2,
babynames
To install a package, you put the name of the package in quotes as in install.packages("name_of_package"). Mind your use of quotes carefully with packages.
To use an already installed package, you must load it first, as in library(name_of_package), leaving the name of the package bare (unquoted). You only need to do this once per RStudio session.

You can download all of these at once, too:

install.packages(c("knitr", "dplyr", "ggplot2", "babynames"), dependencies = TRUE)

A brief aside: c() is a command in R that allows us to combine things into a vector ( one of the ways data is represented in R)

c("hello", "my", "name", "is", "arvind")

[1] "hello"  "my"     "name"   "is"     "arvind"

c(1:3, 20, 50)

[1]  1  2  3 20 50

If you want help, no quotes are needed: help(name_of_package) or ?name_of_package.
If you want the citation for a package (and you should give credit where credit is due), ask R as in citation("name_of_package").

Make a name plot

The webpage you are looking at is derived from a R Quarto doc that you can download, edit and compute with. We will meet R Quarto docs in the next class.

Create a new Quarto Document using the `File -> New File -> Quarto Document` menu.

Change the author name to your own!

Delete all the content below the YAML header ( stuff enclosed in three dashes ---)

Paste this code chunk inside your Quarto document:

library(knitr) # to use this....document! More later!!
library(dplyr) # for manipulating data
library(ggplot2) # for plotting data
library(babynames) # A package containing, yes, Baby Names

Let us greet our data first !!

glimpse(babynames) # dplyr

Rows: 1,924,665
Columns: 5
$ year <dbl> 1880, 1880, 1880, 1880, 1880, 1880, 1880, 1880, 1880, 1880, 1880,…
$ sex  <chr> "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", "F", …
$ name <chr> "Mary", "Anna", "Emma", "Elizabeth", "Minnie", "Margaret", "Ida",…
$ n    <int> 7065, 2604, 2003, 1939, 1746, 1578, 1472, 1414, 1320, 1288, 1258,…
$ prop <dbl> 0.07238359, 0.02667896, 0.02052149, 0.01986579, 0.01788843, 0.016…

head(babynames) # base R

ABCDEFGHIJ0123456789

year <dbl>	sex <chr>	name <chr>	n <int>	prop <dbl>
1880	F	Mary	7065	0.07238359
1880	F	Anna	2604	0.02667896
1880	F	Emma	2003	0.02052149
1880	F	Elizabeth	1939	0.01986579
1880	F	Minnie	1746	0.01788843
1880	F	Margaret	1578	0.01616720

tail(babynames) # same

ABCDEFGHIJ0123456789

year <dbl>	sex <chr>	name <chr>	n <int>	prop <dbl>
2017	M	Zyhier	5	2.55e-06
2017	M	Zykai	5	2.55e-06
2017	M	Zykeem	5	2.55e-06
2017	M	Zylin	5	2.55e-06
2017	M	Zylis	5	2.55e-06
2017	M	Zyrie	5	2.55e-06

names(babynames) # same

[1] "year" "sex"  "name" "n"    "prop"

If you have done the above and produced sane-looking output, you are ready for the next bit. Use the code below to create a new data frame called arvind.

my_name_data <-
  babynames %>%
  filter(name == "Arvind" | name == "Aravind") %>%
  filter(sex == "M")

The first bit makes a new dataset called my_name_data that is a copy of the babynames dataset- the %>% tells you we are doing some other stuff to it later.
The second bit filters our babynames to only keep rows where the name is either Arvind or Aravind (read | as “or”.)
The third bit applies another filter to keep only those where sex is male.

Let’s check out the data.

my_name_data

ABCDEFGHIJ0123456789

year <dbl>	sex <chr>	name <chr>	n <int>	prop <dbl>
1970	M	Arvind	5	2.620e-06
1972	M	Arvind	8	4.780e-06
1975	M	Arvind	7	4.310e-06
1976	M	Arvind	5	3.060e-06
1977	M	Arvind	9	5.260e-06
1978	M	Arvind	6	3.510e-06
1979	M	Arvind	7	3.910e-06
1980	M	Arvind	6	3.230e-06
1981	M	Arvind	8	4.300e-06
1982	M	Arvind	6	3.180e-06

glimpse(my_name_data)

Rows: 61
Columns: 5
$ year <dbl> 1970, 1972, 1975, 1976, 1977, 1978, 1979, 1980, 1981, 1982, 1983,…
$ sex  <chr> "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", "M", …
$ name <chr> "Arvind", "Arvind", "Arvind", "Arvind", "Arvind", "Arvind", "Arvi…
$ n    <int> 5, 8, 7, 5, 9, 6, 7, 6, 8, 6, 7, 7, 7, 13, 8, 11, 6, 8, 12, 10, 1…
$ prop <dbl> 2.620e-06, 4.780e-06, 4.310e-06, 3.060e-06, 5.260e-06, 3.510e-06,…

Again, if you have sane-looking output here, move along to plotting the data!

plot <-
  ggplot(my_name_data, aes(
    x = year,
    y = prop,
    group = name,
    color = name
  )) +
  geom_line()

Now if you did this right, you will not see your plot!

Because we saved the ggplot with a name (plot), R just saved the object for you. But check out the top right pane in RStudio again: under the Environment pane you should see plot, so it is there, you just have to ask for it. Here’s how:

plot

Make a new name plot

Edit my code above to create a new dataset. Pick 2 names to compare how popular they each are (these could be different spellings of your own name, like I did, but you can choose any 2 names that are present in the dataset). Make the new plot, changing the name of the first argument arvind in ggplot() to the name of your new dataset.

Save and share

Save your work so you can share your favorite plot with us. You will not like the looks of your plot if you mouse over to Export and save it. Instead, use ggplot2’s command for saving a plot with sensible defaults:

help(ggsave)

ggsave("file_name_here.pdf", plot) # please make the filename unique!

That’s it! You have created and saved your first chart in R!!!