By the end of this session you will be able to:
R is a free, open-source programming language for statistical computing, data analysis, and visualization.
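On its own, R can be run from a command-line interface. However, most users prefer RStudio, an Integrated Development Environment (IDE) made by Posit (the company formerly named RStudio). RStudio provides a graphical interface with multiple panes, by default a source editor, a console, an environment/history pane, and a files/plots/packages/help pane.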
Key idea: R is the engine that does the computation; RStudio is the dashboard that makes it easier to drive.
Exercise: Open RStudio on your computer and identify what each pane currently displays.
R Markdown (RMD) is a format that lets you combine text, code, and results in one reproducible document. It is powerful for writing assignments, reports, and research papers because the results are always linked to the underlying code.
Steps to create your first RMD:

1. Go to File → New File → R Markdown….
2. Give the file a title and author name, then click OK.
3. Try knitting the default file to see how R Markdown works.
Some useful syntax for writing:

- Headings: `#` (level 1), `##` (level 2), `###` (level 3)
- Bold and italic: `**bold**`, `*italic*`
- Lists: `- item` or `1. item`
- Inline code: `` `code` ``
- Hyperlinks: `[RStudio website](https://posit.co)`
Chunks are sections of R code in R Markdown. They start with three backticks and `{r}`, and end with three backticks.
# A simple example
print("Hello, world! This is R.")
## [1] "Hello, world! This is R."
You can run chunks by clicking the green triangle next to them, or run lines directly in the Console.
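Use `<-` to assign values to objects. It is the conventional assignment operator in R and is preferred over `=` in most situations, because it keeps assignment visually distinct from function arguments, which use `=`.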
x <- 10
y <- 4
z <- x + y
z
## [1] 14
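To illustrate why the two operators are not fully interchangeable, here is a minimal sketch. The object names `a`, `b`, and `v` are made up for this example, and assigning inside a function call is shown only to highlight the difference, not as recommended style.

```r
# At the top level, both forms create an object
a <- 10
b = 4

# Inside a function call they behave differently:
# `=` matches the argument named `x`; no new object is created
median(x = c(5, 1, 9))

# `<-` first creates `v` in the workspace, then passes its value on
median(v <- c(5, 1, 9))
v
```

Sticking to `<-` for assignment and `=` for arguments avoids having to think about this distinction.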
Vectors hold multiple values of the same type.
numbers <- c(1, 2, 3, 4, 5)
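# Named student_names so the object does not share a name with the built-in names() function
student_names <- c("Alice", "Bob", "Charlie")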
mean(numbers)
## [1] 3
numbers * 2
## [1] 2 4 6 8 10
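The "same type" rule can be seen by mixing types on purpose. The sketch below uses made-up objects (`mixed`, `flags`) to show that R silently converts the values to a common type, and then shows how square brackets pick out elements of a vector.

```r
# Mixing types converts everything to the most flexible type (here: character)
mixed <- c(1, "two", TRUE)
class(mixed)

# Logical values become numbers when combined with numbers (TRUE is 1, FALSE is 0)
flags <- c(1, TRUE, FALSE)
class(flags)

# Square brackets select elements by position or by condition
numbers <- c(1, 2, 3, 4, 5)
numbers[2]
numbers[numbers > 3]
```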
Data frames are like spreadsheets: rows are observations, columns are variables.
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(20, 22, 21),
  passed = c(TRUE, TRUE, FALSE)
)
summary(students)
##      name                age         passed
##  Length:3           Min.   :20.0   Mode :logical
##  Class :character   1st Qu.:20.5   FALSE:1
##  Mode  :character   Median :21.0   TRUE :2
##                     Mean   :21.0
##                     3rd Qu.:21.5
##                     Max.   :22.0
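Because each column of a data frame is a vector, individual variables and observations can be pulled out directly. The following sketch rebuilds the same `students` data frame so it runs on its own; `passed_students` is a name introduced only for this example.

```r
students <- data.frame(
  name = c("Alice", "Bob", "Charlie"),
  age = c(20, 22, 21),
  passed = c(TRUE, TRUE, FALSE)
)

# `$` extracts one column (variable) as a vector
students$age
mean(students$age)

# [rows, columns] selects observations; here, only the students who passed
passed_students <- students[students$passed, ]
passed_students

# Number of observations and variables
nrow(students)
ncol(students)
```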
Packages extend the functionality of R. To use them, you must first install them once, and then load them each time you start a new R session.
Installation requires internet access. For example, to install the package tidyverse:
# Run this once, ideally in the Console rather than inside a script or chunk
install.packages("tidyverse")
After installation, load the package into your current session:
library(tidyverse)
If you forget to load the package, R will give an error like `could not find function`. Always load packages at the top of your script or R Markdown document.
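One way to combine the "install once, load every session" steps is to check whether a package is available before installing it. The sketch below is only an illustration: `ensure_installed()` is a hypothetical helper written for this example, not a function from base R or the tidyverse. It relies on the base function `requireNamespace()`, which returns `TRUE` when a package can be found.

```r
# Hypothetical helper: install a package only if it is not already available
ensure_installed <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)  # requires internet access
  }
  # Report (invisibly) whether the package is available now
  invisible(requireNamespace(pkg, quietly = TRUE))
}

# "stats" ships with R, so this call installs nothing
ok <- ensure_installed("stats")
ok

# For the tidyverse you would call:
# ensure_installed("tidyverse")
# library(tidyverse)
```

Note that the helper only makes sure the package is installed; you still need `library()` to load it in each session.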
# Load tidyverse and try a simple task
library(tidyverse)
my_data <- tibble(
  group = rep(c("A", "B"), each = 3),
  value = c(2, 4, 6, 1, 3, 5)
)
my_data |>
  group_by(group) |>
  summarize(avg_value = mean(value))
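To check what the grouped summary computes, the same group averages can be reproduced with base R alone. This is a sketch for comparison, not a replacement for the tidyverse code; `group_means` is a name chosen for this example.

```r
# Same values and groups as in my_data, stored as plain vectors
group <- rep(c("A", "B"), each = 3)
value <- c(2, 4, 6, 1, 3, 5)

# tapply() applies mean() to `value` separately within each level of `group`
group_means <- tapply(value, group, mean)
group_means
```

Group A averages 2, 4, and 6 to give 4; group B averages 1, 3, and 5 to give 3.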
# Simple plot with ggplot2
mtcars |>
  ggplot(aes(x = wt, y = mpg)) +
  geom_point() +
  labs(
    title = "Car weight vs fuel efficiency",
    x = "Weight (1000 lbs)",
    y = "Miles per gallon"
  ) +
  theme_bw()
Tip: Use `installed.packages()` to see what is already installed, and `update.packages()` to keep your installed packages up to date.