library(knitr)
opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE)
After an analysis is done, the next problem is communicating it. The default workflow (run code in R, copy numbers into Word, screenshot the charts, paste them into slides) falls apart the moment the data is updated. Every change forces a manual rebuild, and there is no good way to tell what is current versus stale.
R Markdown solves this by combining the writing and the code in a single document. Render once and you get a polished HTML page, PDF, or Word document with all the charts, tables, and numbers regenerated from the live code. Update the data, render again, and the entire document rebuilds.
R Markdown (.Rmd) is the original. Quarto (.qmd) is the next-generation framework Posit released in 2022; this book is written in Quarto. The differences for a beginner are minor — Quarto uses YAML chunk options (#| label:, #| echo: false) instead of the older inline form, supports more output formats out of the box, and has a more consistent CLI. Everything in this chapter applies to both. New projects should use Quarto; existing R Markdown projects work fine and do not need to be migrated.
The YAML header at the top of an R Markdown or Quarto file controls everything from output format to figure dimensions to citation handling. AI assistants sometimes generate headers that work for the example they were shown but break for your context.
Common failures:
- Mixing R Markdown and Quarto YAML. AI may give you output: html_document (R Markdown style) when you are working in Quarto, where the equivalent is format: html. The render fails or produces unexpected results.
- Dotted versus hyphenated chunk options. R Markdown chunk headers use dotted names such as fig.cap and fig.height; in Quarto’s pipe-comment style the preferred spellings are fig-cap and fig-height (hyphen instead of dot). AI sometimes mixes the two.
- Bibliography paths that do not exist. AI confidently generates bibliography: book.bib even when no book.bib file exists in your project, and the render fails because the file cannot be found.
Always render the document immediately after creating or modifying YAML to catch these. Errors at render time are easy to diagnose; errors that produce wrong-looking output without erroring are not.
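An R Markdown file (.Rmd) is a document where prose and R code live in the same file. When you render (or knit) it, three things happen:
1. R runs all your code.
2. It grabs the output: tables, charts, printed results.
3. It stitches everything together into a finished document.
The result can be an HTML page, a PDF, a Word doc, slides, or even a full website (this book is built the same way, using Quarto). Think of it as a self-updating report.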
You need two packages to make this work:
install.packages("rmarkdown")
install.packages("knitr")
That’s it. RStudio already has Pandoc (the engine that converts everything) built in, so you’re good to go.
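Every .Rmd file has exactly three types of content:
1. YAML header – The settings block at the top. Think of it as the “cover page setup.”
2. Markdown text – Your actual writing: paragraphs, headers, bullet points.
3. Code chunks – The R code that produces your numbers and charts.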
Here is the simplest R Markdown document you can write:
---
title: "My First Report"
author: "Your Name"
date: "2024-01-15"
output: html_document
---
## Introduction
This is my first R Markdown document.
```{r}
summary(iris)
```
Let’s break each piece down.
The YAML header sits at the very top of your file, sandwiched between two lines of ---. It tells R Markdown the basics: what’s this document called, who wrote it, and what format should the output be?
Here is a perfectly good YAML header:
---
title: "Quarterly Sales Report"
author: "Vivek H. Patil"
date: "2024-09-01"
output: html_document
---
That’s really all you need. Four lines, and you’re in business.
Pro tip: Want the date to update automatically every time you knit? Use this:
date: "`r Sys.Date()`"
Now your report always shows today’s date. No more “wait, is this the March version or the April version?”
The three formats you’ll actually use:
| Format | YAML value | When to use it |
|---|---|---|
| HTML | html_document | Day-to-day reports, sharing via email |
| Word | word_document | When your boss needs it in Word |
| PDF | pdf_document | Formal reports (requires LaTeX) |
HTML is the default, and honestly, it’s the best for most business use cases. It’s interactive, looks great, and you can email it as a single file.
You can add a floating table of contents, pick a theme, and let readers show/hide code—all from the YAML header:
---
title: "Sales Analysis Q3 2024"
author: "Vivek H. Patil"
date: "`r format(Sys.Date(), '%B %d, %Y')`"
output:
  html_document:
    toc: true
    toc_float: true
    theme: flatly
    code_folding: hide
---
Here’s what each of those options does:
- toc: true – Adds a table of contents (generated from your headers).
- toc_float: true – Makes the table of contents float on the side as you scroll. Very handy for long reports.
- theme – Changes the visual style. Try cerulean, cosmo, flatly, journal, readable, or united.
- code_folding: hide – Hides all code by default, but readers can click to reveal it. Perfect for reports where your manager doesn’t want to see code but your analyst colleague does.
That’s really all the YAML you need to know. If you want to go deeper, the discussion of output formats later in this chapter has more detail.
The writing part of your document uses Markdown—a simple way to format text without messing around in a toolbar. If you’ve ever used Slack formatting or Reddit, you’ve basically used Markdown.
Use # symbols to create headings. More # signs = smaller heading:
# Big Chapter Title
## Section
### Subsection
*italic* gives you italic
**bold** gives you bold
***bold italic*** gives you bold italic
~~strikethrough~~ gives you strikethrough
Bullet points:
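- Revenue grew 12%
- Costs decreased 3%
  - Labor costs down 5%
  - Materials costs up 2%
- Net profit up 15%
Which renders as:
- Revenue grew 12%
- Costs decreased 3%
  - Labor costs down 5%
  - Materials costs up 2%
- Net profit up 15%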
Numbered lists:
1. Pull the data
2. Clean the data
3. Analyze the data
4. Pretend the data was clean all along
[The R Project](https://www.r-project.org/)
Renders as: The R Project
You can type tables by hand using pipes and dashes:
| Region | Revenue | Growth |
|:--------|--------:|:------:|
| West | $45M | 12% |
| East | $38M | 8% |
| Central | $29M | -2% |
Renders as:
| Region | Revenue | Growth |
|---|---|---|
| West | $45M | 12% |
| East | $38M | 8% |
| Central | $29M | -2% |
The colons control alignment: :--- is left, ---: is right, :---: is centered.
For real data-driven tables, you’ll want to generate them from R—see Section @ref(tables-in-rmarkdown).
Prefix a line with >:
> "Revenue is vanity, profit is sanity, cash is reality."
Renders as:
“Revenue is vanity, profit is sanity, cash is reality.”
Yes, you can write fancy equations if you need to. Most of you won’t need this often, but if you’re in a finance or econ class and your professor wants formulas, here’s the gist:
Inline math uses single dollar signs: $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$ produces \(\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i\).
Display math uses double dollar signs:
$$\hat{\beta} = (X^T X)^{-1} X^T y$$
If you need more than that, Google “LaTeX math symbols” and you’ll find everything.
Code chunks are where the magic happens. This is the R code that actually does your analysis.
A code chunk starts with ```{r} and ends with ```. The keyboard shortcut in RStudio is Ctrl+Alt+I (Windows) or Cmd+Option+I (Mac). Use it. Your fingers will thank you.
head(mtcars)
mpg cyl disp hp drat wt qsec vs am gear carb
Mazda RX4 21.0 6 160 110 3.90 2.620 16.46 0 1 4 4
Mazda RX4 Wag 21.0 6 160 110 3.90 2.875 17.02 0 1 4 4
Datsun 710 22.8 4 108 93 3.85 2.320 18.61 1 1 4 1
Hornet 4 Drive 21.4 6 258 110 3.08 3.215 19.44 1 0 3 1
Hornet Sportabout 18.7 8 360 175 3.15 3.440 17.02 0 0 3 2
Valiant 18.1 6 225 105 2.76 3.460 20.22 1 0 3 1
Give your chunks a name right after the r. It makes life easier when debugging (“Error in chunk revenue-plot” is way more helpful than “Error in unnamed-chunk-47”):
```{r mpg-summary}
summary(mtcars$mpg)
```
summary(mtcars$mpg)
Min. 1st Qu. Median Mean 3rd Qu. Max.
10.40 15.43 19.20 20.09 22.80 33.90
There are dozens of chunk options, but here are the ones that matter for 95% of business reports:
| Option | Default | What it does |
|---|---|---|
| echo | TRUE | Show the code in the report? |
| eval | TRUE | Actually run the code? |
| include | TRUE | Show anything (code + output) in the document? |
echo=FALSE is your best friend for client-facing reports. It hides the code but shows the result:
[1] 20.09062
The number above came from mean(mtcars$mpg), but the code is hidden. Your VP doesn’t need to see R code. They need the answer.
eval=FALSE shows code without running it. Good for “here’s how you would install this package” examples:
install.packages("tidyverse")
include=FALSE runs the code silently: no code, no output in the report. Perfect for setup chunks where you load packages:
```{r setup, include=FALSE}
library(tidyverse)
library(knitr)
```
| Option | Default | What it does |
|---|---|---|
| message | TRUE | Show messages (like package loading info)? |
| warning | TRUE | Show warnings? |
When you load packages, R loves to tell you about every attached namespace and masked function. Nobody reading your report cares:
Set message=FALSE and warning=FALSE and enjoy the silence.
| Option | Default | What it does |
|---|---|---|
| fig.width | 7 | Width in inches |
| fig.height | 5 | Height in inches |
| fig.cap | NULL | Caption for the figure |
| fig.align | "default" | Alignment: "left", "center", "right" |
| out.width | NULL | Width in the output (e.g., "80%") |
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
geom_point(size = 2) +
labs(title = "Iris: Sepal Dimensions",
x = "Sepal Length (cm)",
y = "Sepal Width (cm)") +
theme_minimal()
Tired of typing message=FALSE, warning=FALSE on every single chunk? Set defaults for the entire document in a setup chunk at the top:
```{r setup, include=FALSE}
knitr::opts_chunk$set(
echo = TRUE,
warning = FALSE,
message = FALSE,
fig.align = "center"
)
```
Any individual chunk can still override these. Think of it like setting company-wide defaults, with each department free to customize.
This is one of the most underrated features of R Markdown. You can drop R results right into your sentences.
Instead of writing “the average MPG is 20.1” (and hoping you copied the right number), you write:
The average MPG is `r round(mean(mtcars$mpg), 1)`.
And it renders as: The average MPG is 20.1.
Here’s a more realistic example. Let’s say you computed some key numbers:
avg_mpg <- mean(mtcars$mpg)
max_hp <- max(mtcars$hp)
n_cyl6 <- sum(mtcars$cyl == 6)
Now you can write sentences that update themselves:
- The average fuel efficiency across all cars is 20.1 miles per gallon.
- The most powerful car has 335 horsepower.
- There are 7 cars with 6 cylinders in the dataset.
If the data changes, these numbers update automatically the next time you knit. No more copy-paste errors. No more “wait, did I update that number on slide 14?” This is the kind of thing that separates a good analyst from one who stays late fixing reports.
HTML is the Swiss Army knife of output formats. Here’s a full-featured setup:
---
output:
  html_document:
    toc: true
    toc_float: true
    number_sections: true
    theme: flatly
    code_folding: hide
    df_print: paged
---
- df_print: paged – Makes data frames display as nice, paginated tables readers can click through. Way better than a wall of text.
- self_contained: true (the default) – Embeds everything into one HTML file. You can email it to anyone and it just works. No broken image links.
PDF output looks polished and professional, but it needs a LaTeX installation. Easiest path:
install.packages("tinytex")
tinytex::install_tinytex()
Then in your YAML:
---
output:
  pdf_document:
    toc: true
    number_sections: true
---
When your stakeholder absolutely, positively needs a .docx:
---
output:
  word_document:
    toc: true
    reference_docx: my-styles.docx
---
The reference_docx option is clever—you hand it a Word file with your company’s fonts and styles, and R Markdown applies them to the generated document. Brand guidelines? Handled.
Hand-typing tables in Markdown is fine for small stuff, but for real data, let R build your tables.
The simplest way to turn a data frame into a clean table:
kable(head(iris), caption = "First six rows of the iris dataset.")
| Sepal.Length | Sepal.Width | Petal.Length | Petal.Width | Species |
|---|---|---|---|---|
| 5.1 | 3.5 | 1.4 | 0.2 | setosa |
| 4.9 | 3.0 | 1.4 | 0.2 | setosa |
| 4.7 | 3.2 | 1.3 | 0.2 | setosa |
| 4.6 | 3.1 | 1.5 | 0.2 | setosa |
| 5.0 | 3.6 | 1.4 | 0.2 | setosa |
| 5.4 | 3.9 | 1.7 | 0.4 | setosa |
You can control column names, alignment, and decimal places:
mtcars_summary <- mtcars %>%
group_by(cyl) %>%
summarize(
Count = n(),
Avg_MPG = mean(mpg),
Avg_HP = mean(hp),
Avg_WT = mean(wt)
)
kable(mtcars_summary,
digits = 2,
col.names = c("Cylinders", "Count", "Avg MPG", "Avg HP", "Avg Weight"),
align = c("c", "c", "r", "r", "r"),
caption = "Summary statistics of mtcars by number of cylinders.")
| Cylinders | Count | Avg MPG | Avg HP | Avg Weight |
|---|---|---|---|---|
| 4 | 11 | 26.66 | 82.64 | 2.29 |
| 6 | 7 | 19.74 | 122.29 | 3.12 |
| 8 | 14 | 15.10 | 209.21 | 4.00 |
Want boardroom-ready tables? The kableExtra package has you covered:
install.packages("kableExtra")
library(kableExtra)
kable(mtcars_summary,
digits = 2,
col.names = c("Cylinders", "Count", "Avg MPG", "Avg HP", "Avg Weight"),
caption = "Styled summary of mtcars by cylinder count.") %>%
kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
full_width = FALSE,
position = "center") %>%
column_spec(1, bold = TRUE) %>%
row_spec(0, bold = TRUE, color = "white", background = "#3366cc")
| Cylinders | Count | Avg MPG | Avg HP | Avg Weight |
|---|---|---|---|---|
| 4 | 11 | 26.66 | 82.64 | 2.29 |
| 6 | 7 | 19.74 | 122.29 | 3.12 |
| 8 | 14 | 15.10 | 209.21 | 4.00 |
Striped rows, hover effects, bold headers with a branded color—this is the kind of table that makes people think you spent way more time than you actually did.
Any code chunk that produces a plot automatically includes it in your document. Control the size and add a caption:
ggplot(mtcars, aes(x = factor(cyl), y = mpg, fill = factor(cyl))) +
geom_boxplot(show.legend = FALSE) +
labs(x = "Number of Cylinders", y = "Miles per Gallon") +
theme_minimal() +
scale_fill_brewer(palette = "Set2")
Use out.width for percentage-based sizing, which usually works better than inches:
ggplot(iris, aes(x = Petal.Length, fill = Species)) +
geom_density(alpha = 0.5) +
labs(x = "Petal Length (cm)", y = "Density") +
theme_minimal()
Want two charts next to each other? Use fig.show="hold" with out.width="50%":
ggplot(mtcars, aes(x = wt, y = mpg)) +
geom_point() +
labs(title = "MPG vs Weight") +
theme_minimal()
ggplot(mtcars, aes(x = hp, y = mpg)) +
geom_point() +
labs(title = "MPG vs Horsepower") +
theme_minimal()
To include an image file (like a company logo or a screenshot), use knitr::include_graphics() inside a code chunk:
knitr::include_graphics("https://www.r-project.org/Rlogo.png")
When your report gets longer than a few pages, you’ll want to say things like “as shown in Figure 3” or “see Table 2.” R Markdown (via bookdown) handles the numbering for you automatically.
To cross-reference a figure, your chunk needs two things: a name and a fig.cap. Then use \@ref(fig:chunk-name) in your text. For example, \@ref(fig:iris-scatter) refers to Figure @ref(fig:iris-scatter).
Same idea: \@ref(tab:chunk-name) where the chunk has a kable() with a caption. Example: \@ref(tab:kable-formatted) refers to Table @ref(tab:kable-formatted).
Give a section a label with {#label}, then reference it with \@ref(label). The YAML header, for instance, is discussed in Section @ref(yaml-header).
| Element | How to label it | How to reference it |
|---|---|---|
| Figure | Chunk name + fig.cap | \@ref(fig:chunk-name) |
| Table | Chunk name + kable(caption=...) | \@ref(tab:chunk-name) |
| Section | ## Title {#label} | \@ref(label) |
If you’re writing a research paper or a thesis-style report, R Markdown handles citations. Add a .bib file to your YAML:
---
bibliography: references.bib
link-citations: true
---
Then cite with @key (in-text) or [@key] (parenthetical). The bibliography appears at the end automatically. If your professor or journal requires citations, this will save you hours compared to doing it by hand.
Click Knit at the top of the RStudio editor. That’s it. Seriously.
The dropdown arrow next to it lets you pick the output format if you have more than one in your YAML.
You can also render from the console, which is handy for automation:
rmarkdown::render("my-report.Rmd")
Override the output format:
rmarkdown::render("my-report.Rmd", output_format = "pdf_document")
When you click Knit:
1. knitr runs your R code and writes the results to a Markdown (.md) file.
2. Pandoc converts the .md file into your chosen output format.
That’s the whole pipeline. If you get an error, it’s almost always in step 1 (your R code has a bug). Fix the code, knit again.
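To make the two stages of knitting concrete, here is a rough sketch that runs them by hand instead of through the Knit button. The helper name `knit_in_two_steps` is hypothetical, and `my-report.Rmd` is simply the example file name used above. It assumes the knitr and rmarkdown packages are installed. A real `rmarkdown::render()` call also applies templates and output-format options, so treat this as an illustration, not a replacement.

```r
# A sketch of the two steps behind the Knit button (helper name is hypothetical).
knit_in_two_steps <- function(rmd_file) {
  md_file <- sub("\\.Rmd$", ".md", rmd_file)
  html_file <- sub("\\.Rmd$", ".html", rmd_file)

  # Step 1: knitr runs the R code and writes a Markdown file.
  knitr::knit(rmd_file, output = md_file)

  # Step 2: Pandoc converts the Markdown file into the final format.
  rmarkdown::pandoc_convert(
    md_file,
    to = "html",
    output = html_file,
    options = "--standalone"
  )

  html_file
}

# Only run when the example file is actually present.
if (file.exists("my-report.Rmd")) {
  knit_in_two_steps("my-report.Rmd")
}
```

If step 1 fails, the problem is in your R code; if step 2 fails, look at the YAML header or the Pandoc/LaTeX installation.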
Every document should begin with a setup chunk that loads packages and sets defaults. Think of it as “opening the store before customers arrive”:
```{r setup, include=FALSE}
knitr::opts_chunk$set(
echo = TRUE,
message = FALSE,
warning = FALSE,
fig.align = "center"
)
library(tidyverse)
library(knitr)
```
“Error in unnamed-chunk-47” is about as helpful as a meeting that could have been an email. Name your chunks so you can actually find problems.
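One way to enforce this habit is a small check that scans a document for unlabeled chunks. The sketch below is an assumption-laden illustration: `find_unnamed_chunks` is a hypothetical helper, it only understands the classic `{r label, option=value}` header form, and it does not look at Quarto-style `#| label:` lines.

```r
# Hypothetical helper: return the line numbers of R chunk headers with no label.
find_unnamed_chunks <- function(lines) {
  headers <- grep("^```\\{r", lines)
  # Keep only what sits between "{r" and "}".
  inner <- sub("^```\\{r[ ,]*", "", lines[headers])
  inner <- sub("\\}.*$", "", inner)
  # The first comma-separated item is a label unless it is empty or an option.
  first_item <- trimws(sub(",.*$", "", inner))
  unnamed <- first_item == "" | grepl("=", first_item)
  headers[unnamed]
}

doc <- c(
  "```{r setup, include=FALSE}",
  "library(knitr)",
  "```",
  "```{r}",
  "summary(mtcars$mpg)",
  "```",
  "```{r, echo=FALSE}",
  "mean(mtcars$mpg)",
  "```"
)

unnamed_at <- find_unnamed_chunks(doc)
print(unnamed_at)
```

On a real file you would pass `readLines("my-report.Rmd")` instead of the toy `doc` vector.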
### Use echo=FALSE for Stakeholder Reports
Your CFO does not need to see library(dplyr). Use echo=FALSE (or code_folding: hide in the YAML) to keep reports clean while preserving your code.
If you generate the same report for different regions, time periods, or product lines, use parameters:
---
title: "Regional Sales Report"
params:
  region: "West"
  year: 2024
output: html_document
---
Then in your code, use params$region and params$year. Render different versions like this:
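A minimal sketch of such a render loop is below. The template name `regional-sales.Rmd`, the region list, and the output file names are all assumptions made for illustration; `params` and `output_file` are real arguments of `rmarkdown::render()`.

```r
# Hypothetical regions and output names for a parameterized report.
regions <- c("West", "East", "Central")
output_files <- paste0("sales-", tolower(regions), "-2024.html")

# Render one report per region, but only if the (hypothetical) template exists.
if (file.exists("regional-sales.Rmd")) {
  for (i in seq_along(regions)) {
    rmarkdown::render(
      "regional-sales.Rmd",
      params = list(region = regions[i], year = 2024),
      output_file = output_files[i]
    )
  }
}
```

Values passed through `params` override the defaults declared in the YAML header, so the template itself never needs editing.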
One template, infinite reports. Your future self will be grateful.
| Windows/Linux | Mac | What it does |
|---|---|---|
| Ctrl+Alt+I | Cmd+Option+I | Insert a new code chunk |
| Ctrl+Shift+K | Cmd+Shift+K | Knit the document |
| Ctrl+Shift+Enter | Cmd+Shift+Enter | Run the current chunk |
| Ctrl+Enter | Cmd+Enter | Run the current line |
Here’s what you now know how to do:
- Control what appears in the report with chunk options such as echo, eval, message, and warning.
- Build tables with knitr::kable() and kableExtra that look presentation-ready.
- Size figures with fig.width, fig.height, and out.width.
The bottom line: R Markdown turns your analysis into a self-updating, professional report. The more you use it, the more time you save, and the fewer “can you update the numbers?” emails you’ll get.
And remember, you can always access the R Markdown cheat sheet from RStudio via Help > Cheat Sheets > R Markdown Cheat Sheet.
One last thought. Reproducibility is not just a technical convenience — it is a form of intellectual honesty. When your analysis is transparent and re-runnable, anyone can check your work, challenge your assumptions, and build on what you did. In a world where data-driven decisions affect jobs, budgets, and opportunities, that kind of accountability is not optional. It is what professional integrity looks like in practice.
---
title: "R Markdown"
---
# R Markdown {#rmarkdown}
```{r}
library(knitr)
opts_chunk$set(echo=TRUE, warning=FALSE, message=FALSE)
```
After an analysis is done, the next problem is communicating it. The default workflow — run code in R, copy numbers into Word, screenshot the charts, paste them into slides — falls apart the moment the data is updated. Every change forces a manual rebuild, and there is no good way to tell what is current versus stale.
R Markdown solves this by combining the writing and the code in a single document. Render once and you get a polished HTML page, PDF, or Word document with all the charts, tables, and numbers regenerated from the live code. Update the data, render again, and the entire document rebuilds.
::: {.callout-note}
## Quarto vs R Markdown — what to use in 2026
R Markdown (`.Rmd`) is the original. **Quarto** (`.qmd`) is the next-generation framework Posit released in 2022; this book is written in Quarto. The differences for a beginner are minor — Quarto uses YAML chunk options (`#| label:`, `#| echo: false`) instead of the older inline form, supports more output formats out of the box, and has a more consistent CLI. Everything in this chapter applies to both. New projects should use Quarto; existing R Markdown projects work fine and do not need to be migrated.
:::
::: {.callout-warning}
## AI Pitfall: AI generates YAML headers that produce wrong output
The YAML header at the top of an R Markdown or Quarto file controls everything from output format to figure dimensions to citation handling. AI assistants sometimes generate headers that work for the example they were shown but break for your context.
Common failures:
- **Mixing R Markdown and Quarto YAML.** AI may give you `output: html_document` (R Markdown style) when you are working in Quarto, where the equivalent is `format: html`. The render fails or produces unexpected results.
- **Dotted versus hyphenated chunk options.** R Markdown chunk headers use dotted names such as `fig.cap` and `fig.height`; in Quarto's pipe-comment style the preferred spellings are `fig-cap` and `fig-height` (hyphen instead of dot). AI sometimes mixes the two.
- **Bibliography paths that do not exist.** AI confidently generates `bibliography: book.bib` even when no `book.bib` file exists in your project, and the render errors with a citation lookup failure.
Always render the document immediately after creating or modifying YAML to catch these. Errors at render time are easy to diagnose; errors that produce wrong-looking output without erroring are not.
:::
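Two of these failures can be caught before rendering with a few lines of base R. The sketch below is only an illustration: `check_front_matter` is a hypothetical helper, it handles only simple top-level `key: value` lines, and it is no substitute for actually rendering the document.

```r
# Hypothetical helper: flag common front-matter mistakes before rendering.
check_front_matter <- function(path) {
  lines <- readLines(path, warn = FALSE)
  fences <- which(trimws(lines) == "---")
  if (length(fences) < 2) {
    return("No YAML header found.")
  }
  # Lines strictly between the first two "---" markers.
  yaml <- lines[fences[1] + seq_len(fences[2] - fences[1] - 1)]
  problems <- character(0)

  # Quarto files should use `format:`, not the R Markdown `output:` key.
  is_quarto <- grepl("\\.qmd$", path, ignore.case = TRUE)
  if (is_quarto && any(grepl("^output:", yaml))) {
    problems <- c(problems, "Found `output:` in a .qmd file; Quarto expects `format:`.")
  }

  # A bibliography entry must point at a file that actually exists.
  bib_line <- grep("^bibliography:", yaml, value = TRUE)
  if (length(bib_line) > 0) {
    bib_file <- trimws(sub("^bibliography:", "", bib_line[1]))
    bib_file <- gsub("^[\"']|[\"']$", "", bib_file)
    if (!file.exists(file.path(dirname(path), bib_file))) {
      problems <- c(problems, paste0("Bibliography file not found: ", bib_file))
    }
  }
  problems
}

# Try it on a throwaway file that contains both mistakes.
demo <- file.path(tempdir(), "front-matter-demo.qmd")
writeLines(
  c("---", "title: \"Demo\"", "output: html_document",
    "bibliography: book.bib", "---", "Body text."),
  demo
)
found <- check_front_matter(demo)
print(found)
```

An empty result means neither problem was detected; it does not guarantee the render will succeed.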
## What is R Markdown? {#what-is-rmarkdown}
An R Markdown file (`.Rmd`) is a document where prose and R code live in the same file. When you *render* (or *knit*) it, three things happen:
1. R runs all your code.
2. It grabs the output---tables, charts, printed results.
3. It stitches everything together into a finished document.
The result can be an HTML page, a PDF, a Word doc, slides, or even a full website (this book is built the same way, using Quarto). Think of it as a self-updating report.
You need two packages to make this work:
```{r}
#| eval: false
install.packages("rmarkdown")
install.packages("knitr")
```
That's it. RStudio already has Pandoc (the engine that converts everything) built in, so you're good to go.
## The Three Components of an R Markdown Document {#three-components}
Every `.Rmd` file has exactly three types of content:
1. **YAML header** -- The settings block at the top. Think of it as the "cover page setup."
2. **Markdown text** -- Your actual writing: paragraphs, headers, bullet points.
3. **Code chunks** -- The R code that produces your numbers and charts.
Here is the simplest R Markdown document you can write:
````
---
title: "My First Report"
author: "Your Name"
date: "2024-01-15"
output: html_document
---
## Introduction
This is my first R Markdown document.
```{r}`r ''`
summary(iris)
```
````
Let's break each piece down.
## YAML Header {#yaml-header}
The YAML header sits at the very top of your file, sandwiched between two lines of `---`. It tells R Markdown the basics: what's this document called, who wrote it, and what format should the output be?
### The Essentials
Here is a perfectly good YAML header:
````
---
title: "Quarterly Sales Report"
author: "Vivek H. Patil"
date: "2024-09-01"
output: html_document
---
````
That's really all you need. Four lines, and you're in business.
**Pro tip:** Want the date to update automatically every time you knit? Use this:
````
date: "`r knitr::inline_expr('Sys.Date()')`"
````
Now your report always shows today's date. No more "wait, is this the March version or the April version?"
### Picking Your Output Format
The three formats you'll actually use:
| Format | YAML value | When to use it |
|:-------|:-----------------|:----------------------------------------|
| HTML | `html_document` | Day-to-day reports, sharing via email |
| Word | `word_document` | When your boss *needs* it in Word |
| PDF | `pdf_document` | Formal reports (requires LaTeX) |
HTML is the default, and honestly, it's the best for most business use cases. It's interactive, looks great, and you can email it as a single file.
### Making It Look Good
You can add a floating table of contents, pick a theme, and let readers show/hide code---all from the YAML header:
````
---
title: "Sales Analysis Q3 2024"
author: "Vivek H. Patil"
date: "`r knitr::inline_expr("format(Sys.Date(), '%B %d, %Y')")`"
output:
  html_document:
    toc: true
    toc_float: true
    theme: flatly
    code_folding: hide
---
````
Here's what each of those options does:
- **`toc: true`** -- Adds a table of contents (generated from your headers).
- **`toc_float: true`** -- Makes the table of contents float on the side as you scroll. Very handy for long reports.
- **`theme`** -- Changes the visual style. Try `cerulean`, `cosmo`, `flatly`, `journal`, `readable`, or `united`.
- **`code_folding: hide`** -- Hides all code by default, but readers can click to reveal it. Perfect for reports where your manager doesn't want to see code but your analyst colleague does.
That's really all the YAML you need to know. If you want to go deeper, Section \@ref(output-formats) has more detail.
## Markdown Syntax {#markdown-syntax}
The writing part of your document uses Markdown---a simple way to format text without messing around in a toolbar. If you've ever used Slack formatting or Reddit, you've basically used Markdown.
### Headers
Use `#` symbols to create headings. More `#` signs = smaller heading:
````
# Big Chapter Title
## Section
### Subsection
````
### Bold, Italic, and Friends
- `*italic*` gives you *italic*
- `**bold**` gives you **bold**
- `***bold italic***` gives you ***bold italic***
- `~~strikethrough~~` gives you ~~strikethrough~~
### Lists
Bullet points:
````
- Revenue grew 12%
- Costs decreased 3%
  - Labor costs down 5%
  - Materials costs up 2%
- Net profit up 15%
````
Which renders as:
- Revenue grew 12%
- Costs decreased 3%
  - Labor costs down 5%
  - Materials costs up 2%
- Net profit up 15%
Numbered lists:
````
1. Pull the data
2. Clean the data
3. Analyze the data
4. Pretend the data was clean all along
````
### Links
```
[The R Project](https://www.r-project.org/)
```
Renders as: [The R Project](https://www.r-project.org/)
### Tables
You can type tables by hand using pipes and dashes:
````
| Region | Revenue | Growth |
|:--------|--------:|:------:|
| West | $45M | 12% |
| East | $38M | 8% |
| Central | $29M | -2% |
````
Renders as:
| Region | Revenue | Growth |
|:--------|--------:|:------:|
| West | $45M | 12% |
| East | $38M | 8% |
| Central | $29M | -2% |
The colons control alignment: `:---` is left, `---:` is right, `:---:` is centered.
For real data-driven tables, you'll want to generate them from R---see Section \@ref(tables-in-rmarkdown).
### Blockquotes
Prefix a line with `>`:
````
> "Revenue is vanity, profit is sanity, cash is reality."
````
Renders as:
> "Revenue is vanity, profit is sanity, cash is reality."
### Math Equations
Yes, you can write fancy equations if you need to. Most of you won't need this often, but if you're in a finance or econ class and your professor wants formulas, here's the gist:
Inline math uses single dollar signs: `$\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$` produces $\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i$.
Display math uses double dollar signs:
$$
\hat{\beta} = (X^T X)^{-1} X^T y
$$
If you need more than that, Google "LaTeX math symbols" and you'll find everything.
## Code Chunks {#code-chunks}
Code chunks are where the magic happens. This is the R code that actually does your analysis.
### Inserting a Code Chunk
A code chunk starts with ```` ```{r} ```` and ends with ```` ``` ````. The keyboard shortcut in RStudio is **Ctrl+Alt+I** (Windows) or **Cmd+Option+I** (Mac). Use it. Your fingers will thank you.
```{r}
head(mtcars)
```
### Naming Your Chunks
Give your chunks a name right after the `r`. It makes life easier when debugging ("Error in chunk `revenue-plot`" is way more helpful than "Error in unnamed-chunk-47"):
````
```{r mpg-summary}`r ''`
summary(mtcars$mpg)
```
````
```{r}
summary(mtcars$mpg)
```
### The Chunk Options You Actually Need
There are dozens of chunk options, but here are the ones that matter for 95% of business reports:
#### Show/Hide Code and Output
| Option | Default | What it does |
|:----------|:--------|:---------------------------------------------------|
| `echo` | `TRUE` | Show the code in the report? |
| `eval` | `TRUE` | Actually run the code? |
| `include` | `TRUE` | Show *anything* (code + output) in the document? |
**`echo=FALSE`** is your best friend for client-facing reports. It hides the code but shows the result:
```{r}
#| echo: false
mean(mtcars$mpg)
```
The number above came from `mean(mtcars$mpg)`, but the code is hidden. Your VP doesn't need to see R code. They need the answer.
**`eval=FALSE`** shows code without running it. Good for "here's how you would install this package" examples:
```{r}
#| eval: false
install.packages("tidyverse")
```
**`include=FALSE`** runs the code silently---no code, no output in the report. Perfect for setup chunks where you load packages:
````
```{r setup, include=FALSE}`r ''`
library(tidyverse)
library(knitr)
```
````
#### Silencing Messages and Warnings
| Option | Default | What it does |
|:----------|:--------|:------------------------------------------------|
| `message` | `TRUE` | Show messages (like package loading info)? |
| `warning` | `TRUE` | Show warnings? |
When you load packages, R loves to tell you about every attached namespace and masked function. Nobody reading your report cares:
```{r}
library(ggplot2)
library(dplyr)
```
Set `message=FALSE` and `warning=FALSE` and enjoy the silence.
#### Controlling Figure Size
| Option | Default | What it does |
|:-------------|:--------|:------------------------------------------|
| `fig.width` | `7` | Width in inches |
| `fig.height` | `5` | Height in inches |
| `fig.cap` | `NULL` | Caption for the figure |
| `fig.align` | `"default"` | Alignment: `"left"`, `"center"`, `"right"` |
| `out.width` | `NULL` | Width in the output (e.g., `"80%"`) |
```{r}
ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
geom_point(size = 2) +
labs(title = "Iris: Sepal Dimensions",
x = "Sepal Length (cm)",
y = "Sepal Width (cm)") +
theme_minimal()
```
### Global Chunk Options {#global-chunk-options}
Tired of typing `message=FALSE, warning=FALSE` on every single chunk? Set defaults for the entire document in a setup chunk at the top:
````
```{r setup, include=FALSE}`r ''`
knitr::opts_chunk$set(
echo = TRUE,
warning = FALSE,
message = FALSE,
fig.align = "center"
)
```
````
Any individual chunk can still override these. Think of it like setting company-wide defaults, with each department free to customize.
## Inline R Code {#inline-r-code}
This is one of the most underrated features of R Markdown. You can drop R results right into your sentences.
Instead of writing "the average MPG is 20.1" (and hoping you copied the right number), you write:
````
The average MPG is `r knitr::inline_expr('round(mean(mtcars$mpg), 1)')`.
````
And it renders as: The average MPG is `r round(mean(mtcars$mpg), 1)`.
Here's a more realistic example. Let's say you computed some key numbers:
```{r}
avg_mpg <- mean(mtcars$mpg)
max_hp <- max(mtcars$hp)
n_cyl6 <- sum(mtcars$cyl == 6)
```
Now you can write sentences that update themselves:
- The average fuel efficiency across all cars is `r round(avg_mpg, 1)` miles per gallon.
- The most powerful car has `r max_hp` horsepower.
- There are `r n_cyl6` cars with 6 cylinders in the dataset.
If the data changes, these numbers update automatically the next time you knit. No more copy-paste errors. No more "wait, did I update that number on slide 14?" This is the kind of thing that separates a good analyst from one who stays late fixing reports.
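One wrinkle with inline numbers is formatting: `round(20, 1)` prints as 20, not 20.0, and large values come out without thousands separators. A small helper can keep inline values consistent. The name `fmt_num` is hypothetical; it is only a thin wrapper around base R's `format()`.

```r
# Hypothetical helper for inline use: fixed decimals and thousands separators.
fmt_num <- function(x, digits = 1) {
  format(round(x, digits), nsmall = digits, big.mark = ",")
}

avg_mpg <- mean(mtcars$mpg)

fmt_num(avg_mpg)          # one decimal place
fmt_num(20)               # keeps the trailing zero
fmt_num(1234567.891, 0)   # adds thousands separators
```

In the document you would then write the inline expression as `fmt_num(avg_mpg)` instead of `round(avg_mpg, 1)`.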
## Output Formats in Detail {#output-formats}
### HTML Documents
HTML is the Swiss Army knife of output formats. Here's a full-featured setup:
````
---
output:
  html_document:
    toc: true
    toc_float: true
    number_sections: true
    theme: flatly
    code_folding: hide
    df_print: paged
---
````
- **`df_print: paged`** -- Makes data frames display as nice, paginated tables readers can click through. Way better than a wall of text.
- **`self_contained: true`** (the default) -- Embeds everything into one HTML file. You can email it to anyone and it just works. No broken image links.
### PDF Documents
PDF output looks polished and professional, but it needs a LaTeX installation. Easiest path:
```{r}
#| eval: false
install.packages("tinytex")
tinytex::install_tinytex()
```
Then in your YAML:
````
---
output:
  pdf_document:
    toc: true
    number_sections: true
---
````
### Word Documents
When your stakeholder absolutely, positively needs a `.docx`:
````
---
output:
  word_document:
    toc: true
    reference_docx: my-styles.docx
---
````
The `reference_docx` option is clever---you hand it a Word file with your company's fonts and styles, and R Markdown applies them to the generated document. Brand guidelines? Handled.
## Tables in R Markdown {#tables-in-rmarkdown}
Hand-typing tables in Markdown is fine for small stuff, but for real data, let R build your tables.
### knitr::kable()
The simplest way to turn a data frame into a clean table:
```{r}
kable(head(iris), caption = "First six rows of the iris dataset.")
```
You can control column names, alignment, and decimal places:
```{r}
mtcars_summary <- mtcars %>%
  group_by(cyl) %>%
  summarize(
    Count = n(),
    Avg_MPG = mean(mpg),
    Avg_HP = mean(hp),
    Avg_WT = mean(wt)
  )

kable(mtcars_summary,
      digits = 2,
      col.names = c("Cylinders", "Count", "Avg MPG", "Avg HP", "Avg Weight"),
      align = c("c", "c", "r", "r", "r"),
      caption = "Summary statistics of mtcars by number of cylinders.")
```
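Real data often has missing values and large numbers, and `kable()` has options for both. The sketch below uses a small made-up data frame (`sales_demo` and its values are illustrative only) to show `format.args` for thousands separators and the `knitr.kable.NA` option, which controls how `NA` is displayed.

```{r}
library(knitr)

# Illustrative data only; not from a real dataset
sales_demo <- data.frame(
  region = c("East", "West", "North"),
  revenue = c(1234567.8, 987654.3, NA)
)

# Show missing values as a dash instead of "NA"
options(knitr.kable.NA = "-")

sales_table <- kable(
  sales_demo,
  digits = 0,
  col.names = c("Region", "Revenue"),
  align = c("l", "r"),
  format.args = list(big.mark = ",")
)
sales_table
```

Note that the `options()` call stays in effect for every later table in the document, so set it once in the setup chunk if you want it everywhere.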
### kableExtra for Fancy Styling
Want boardroom-ready tables? The `kableExtra` package has you covered:
```{r}
#| eval: false
install.packages("kableExtra")
```
```{r}
library(kableExtra)
kable(mtcars_summary,
      digits = 2,
      col.names = c("Cylinders", "Count", "Avg MPG", "Avg HP", "Avg Weight"),
      caption = "Styled summary of mtcars by cylinder count.") %>%
  kable_styling(bootstrap_options = c("striped", "hover", "condensed"),
                full_width = FALSE,
                position = "center") %>%
  column_spec(1, bold = TRUE) %>%
  row_spec(0, bold = TRUE, color = "white", background = "#3366cc")
```
Striped rows, hover effects, bold headers with a branded color---this is the kind of table that makes people think you spent way more time than you actually did.
## Figures and Images {#figures-and-images}
### R-Generated Plots
Any code chunk that produces a plot automatically includes it in your document. Control the size with `fig.width` and `fig.height` (in inches) and add a caption with `fig.cap`:
```{r mpg-boxplot, fig.width=6, fig.height=4, fig.cap="Fuel efficiency by number of cylinders."}
ggplot(mtcars, aes(x = factor(cyl), y = mpg, fill = factor(cyl))) +
  geom_boxplot(show.legend = FALSE) +
  labs(x = "Number of Cylinders", y = "Miles per Gallon") +
  theme_minimal() +
  scale_fill_brewer(palette = "Set2")
```
Use `out.width` for percentage-based sizing, which usually works better than inches because the figure scales with the page width:
```{r petal-density, out.width="70%"}
ggplot(iris, aes(x = Petal.Length, fill = Species)) +
  geom_density(alpha = 0.5) +
  labs(x = "Petal Length (cm)", y = "Density") +
  theme_minimal()
```
### Side-by-Side Plots
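Want two charts next to each other? Use `fig.show="hold"` with `out.width="50%"`. If your setup chunk sets `fig.align = "center"`, also reset it to `fig.align="default"` for this chunk, because centered figures are rendered as block elements in HTML output and stack vertically:
```{r side-by-side, fig.show="hold", out.width="50%", fig.align="default"}
ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  labs(title = "MPG vs Weight") +
  theme_minimal()

ggplot(mtcars, aes(x = hp, y = mpg)) +
  geom_point() +
  labs(title = "MPG vs Horsepower") +
  theme_minimal()
```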
### External Images
To include an image file (like a company logo or a screenshot), use `knitr::include_graphics()` inside a code chunk:
```{r}
knitr::include_graphics("https://www.r-project.org/Rlogo.png")
```
## Cross-Referencing {#cross-referencing}
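When your report gets longer than a few pages, you'll want to say things like "as shown in Figure 3" or "see Table 2." The bookdown package handles the numbering for you automatically. The `\@ref()` syntax described here only works with a bookdown output format such as `bookdown::html_document2` (or `pdf_document2`, `word_document2`); with plain `html_document` the references are left unresolved.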
### Figures
To cross-reference a figure, your chunk needs two things: a **name** and a **`fig.cap`**. Then use `\@ref(fig:chunk-name)` in your text. For example, `\@ref(fig:iris-scatter)` refers to Figure \@ref(fig:iris-scatter).
### Tables
Same idea: `\@ref(tab:chunk-name)` where the chunk has a `kable()` with a `caption`. Example: `\@ref(tab:kable-formatted)` refers to Table \@ref(tab:kable-formatted).
### Sections
Give a section a label with `{#label}`, then reference it with `\@ref(label)`. The YAML header, for instance, is discussed in Section \@ref(yaml-header).
### Quick Reference
| Element | How to label it | How to reference it |
|:--------|:---------------------------------|:--------------------------------|
| Figure | Chunk name + `fig.cap` | `\@ref(fig:chunk-name)` |
| Table | Chunk name + `kable(caption=...)` | `\@ref(tab:chunk-name)` |
| Section | `## Title {#label}` | `\@ref(label)` |
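Putting the pieces together, a minimal document with one cross-referenced figure might look like the sketch below. It assumes the bookdown package is installed and uses its `html_document2` output format; the title, the chunk label `wt-vs-mpg`, and the caption text are placeholders.

````
---
title: "Cross-Reference Demo"
output: bookdown::html_document2
---

Heavier cars use more fuel, as Figure \@ref(fig:wt-vs-mpg) shows.

```{r wt-vs-mpg, fig.cap="Fuel efficiency versus weight."}`r ''`
plot(mtcars$wt, mtcars$mpg)
```
````

The reference can appear before or after the chunk it points to; the numbering is resolved when the document is rendered.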
## Citations and Bibliography {#citations-bibliography}
If you're writing a research paper or a thesis-style report, R Markdown handles citations. Add a `.bib` file to your YAML:
````
---
bibliography: references.bib
link-citations: true
---
````
Then cite with `@key` (in-text) or `[@key]` (parenthetical). The bibliography appears at the end automatically. If your professor or journal requires citations, this will save you hours compared to doing it by hand.
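If you do not have a `.bib` file yet, one easy way to start is to cite the R packages you used. knitr's `write_bib()` generates BibTeX entries for installed packages, with keys of the form `R-packagename`. A minimal sketch, assuming knitr is installed and that `references.bib` is the file named in your YAML:

```{r}
#| eval: false
library(knitr)

# Write BibTeX entries for base R and knitr to references.bib
# (this overwrites an existing file with the same name)
write_bib(c("base", "knitr"), file = "references.bib")

# The generated keys can then be cited in the text as @R-base or [@R-knitr]
```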
## Rendering Documents {#rendering-documents}
### The Knit Button
Click **Knit** at the top of the RStudio editor. That's it. Seriously.
The dropdown arrow next to it lets you pick the output format if you have more than one in your YAML.
### rmarkdown::render()
You can also render from the console, which is handy for automation:
```{r}
#| eval: false
rmarkdown::render("my-report.Rmd")
```
Override the output format:
```{r}
#| eval: false
rmarkdown::render("my-report.Rmd", output_format = "pdf_document")
```
### What Happens Under the Hood
When you click Knit:
1. **knitr** runs all your code chunks and produces a plain Markdown (`.md`) file.
2. **Pandoc** converts that `.md` file into your chosen output format.
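That's the whole pipeline. Most errors come from step 1 (a bug in your R code): fix the code and knit again. If the message mentions Pandoc or LaTeX instead, the code ran fine and the problem is in step 2, typically a YAML option the output format does not accept or, for PDF, a missing LaTeX installation or package.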
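To see the two steps separately, you can run them by hand. The sketch below writes a tiny throwaway document to a temporary file, runs step 1 with `knitr::knit()`, and shows step 2 as a commented-out call to `rmarkdown::pandoc_convert()`, which needs Pandoc to be available. The file contents are made up for the demonstration.

```{r}
#| eval: false
library(knitr)

# Build a three-backtick fence without typing it literally inside this block
fence <- strrep("`", 3)

# A throwaway .Rmd file with one heading and one code chunk
rmd_file <- tempfile(fileext = ".Rmd")
writeLines(c("# Demo", "", paste0(fence, "{r}"), "1 + 1", fence), rmd_file)

# Step 1: knitr runs the chunk and writes plain Markdown
md_file <- knit(rmd_file, output = sub("\\.Rmd$", ".md", rmd_file), quiet = TRUE)
readLines(md_file)  # the Markdown now contains the code and its result

# Step 2: Pandoc converts the Markdown to the final format
# rmarkdown::pandoc_convert(md_file, to = "html",
#                           output = sub("\\.md$", ".html", md_file))
```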
## Tips and Best Practices {#tips-and-best-practices}
### Start with a Setup Chunk
Every document should begin with a setup chunk that loads packages and sets defaults. Think of it as "opening the store before customers arrive":
````
```{r}`r ''`
knitr::opts_chunk$set(
  echo = TRUE,
  message = FALSE,
  warning = FALSE,
  fig.align = "center"
)
library(tidyverse)
library(knitr)
```
````
### Name Your Chunks
"Error in unnamed-chunk-47" is about as helpful as a meeting that could have been an email. Name your chunks so you can actually find problems.
### Use `echo=FALSE` for Stakeholder Reports
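Your CFO does not need to see `library(dplyr)`. Use `echo=FALSE` to leave the code out of the rendered report entirely, or `code_folding: hide` in the YAML (HTML output only) to collapse it behind a button that curious readers can click. Either way, the code stays in your source file.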
### Use Parameterized Reports for Repetitive Work
If you generate the same report for different regions, time periods, or product lines, use parameters:
````
---
title: "Regional Sales Report"
params:
  region: "West"
  year: 2024
output: html_document
---
````
Then in your code, use `params$region` and `params$year`. Render different versions like this:
```{r}
#| eval: false
rmarkdown::render("report.Rmd", params = list(region = "East", year = 2023))
```
One template, infinite reports. Your future self will be grateful.
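To produce every version in one go, loop over the parameter values and give each output its own file name. A minimal sketch, assuming `report.Rmd` is the parameterized template above; the region list and the helpers `build_output_name()` and `render_region()` are hypothetical names chosen for this example.

```{r}
#| eval: false
regions <- c("East", "West", "North", "South")

# Hypothetical helper: one output file name per region and year
build_output_name <- function(region, year) {
  sprintf("sales-%s-%d.html", tolower(region), as.integer(year))
}

# Hypothetical wrapper around rmarkdown::render() for a single region
render_region <- function(region, year = 2024, input = "report.Rmd") {
  rmarkdown::render(
    input,
    params = list(region = region, year = year),
    output_file = build_output_name(region, year),
    envir = new.env()  # fresh environment so runs do not share variables
  )
}

# Render all regions, but only if the template actually exists
if (file.exists("report.Rmd")) {
  for (region in regions) render_region(region)
}

build_output_name("East", 2023)  # "sales-east-2023.html"
```

Without a distinct `output_file` per run, each render would overwrite the previous one, which is why the file name is built from the parameters.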
### Keyboard Shortcuts Worth Memorizing
| Windows/Linux | Mac | What it does |
|:---------------------|:---------------------|:--------------------------|
| Ctrl+Alt+I | Cmd+Option+I | Insert a new code chunk |
| Ctrl+Shift+K | Cmd+Shift+K | Knit the document |
| Ctrl+Shift+Enter | Cmd+Shift+Enter | Run the current chunk |
| Ctrl+Enter | Cmd+Enter | Run the current line |
## Summary {#rmarkdown-summary}
Here's what you now know how to do:
- **Write a YAML header** that sets up your document's title, author, output format, and appearance.
- **Use Markdown** to format text with headers, bold, italic, lists, links, and tables.
- **Write code chunks** that run R code and control what shows up in the report with options like `echo`, `eval`, `message`, and `warning`.
- **Use inline R code** to drop computed numbers directly into your sentences (no more copy-paste errors).
- **Generate tables** with `knitr::kable()` and `kableExtra` that look presentation-ready.
- **Include and size figures** with chunk options like `fig.width`, `fig.height`, and `out.width`.
- **Cross-reference** figures, tables, and sections automatically.
- **Render** to HTML, PDF, or Word with one click.
The bottom line: R Markdown turns your analysis into a self-updating, professional report. The more you use it, the more time you save---and the fewer "can you update the numbers?" emails you'll get.
And remember, you can always access the R Markdown cheat sheet from RStudio via **Help > Cheat Sheets > R Markdown Cheat Sheet**.
One last thought. Reproducibility is not just a technical convenience --- it is a form of intellectual honesty. When your analysis is transparent and re-runnable, anyone can check your work, challenge your assumptions, and build on what you did. In a world where data-driven decisions affect jobs, budgets, and opportunities, that kind of accountability is not optional. It is what professional integrity looks like in practice.