21 Advanced R Markdown

After the earlier R Markdown chapter you can knit HTML documents, add code chunks, and format text. This chapter covers the features that turn R Markdown from “useful for one-off reports” into a serious automation tool: parameterized reports that run for every value of a variable, custom output formats, dashboards, and integration with other languages. Most analysts use only a small subset of these, but the right one at the right time can turn a week of manual work into a single render.

Warning (AI Pitfall): Parameterized reports silently use the default parameter

When you set up parameterized reports, the YAML defines defaults that are used when you knit interactively. If you forget to pass parameters when calling rmarkdown::render() programmatically, R uses those defaults — without warning. Every “regional” report ends up being the same default region.

A concrete failure: an analyst sets up a quarterly sales report parameterized by region, with region: "Northeast" as the YAML default. They write a loop:

for (r in regions) {
  rmarkdown::render("sales-report.Rmd", output_file = paste0(r, ".html"))
}

The script produces 12 files named Northeast.html, Southeast.html, West.html, etc. — each containing identical content for the Northeast. The fix is one line: pass params = list(region = r) inside the render call. AI does not always include that argument when writing parameterized-render loops.

The verification: when running parameterized reports in a loop, open at least three of the output files and confirm the parameter actually changed the content. Don’t just trust the file count.
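That check can also be scripted. The sketch below assumes each report prints its region name somewhere in the rendered HTML (for example through inline `params$region` text); `check_report_params()` is a hypothetical helper written for this illustration, not part of rmarkdown, and the three regions are only a subset for brevity.

```r
# Hypothetical helper: confirm each rendered file mentions its own parameter value.
# Assumes the report prints the region name somewhere in its title or text.
check_report_params <- function(files, values) {
  stopifnot(length(files) == length(values))
  found <- vapply(seq_along(files), function(i) {
    html <- paste(readLines(files[i], warn = FALSE), collapse = "\n")
    grepl(values[i], html, fixed = TRUE)
  }, logical(1))
  names(found) <- values
  found
}

regions <- c("Northeast", "Southeast", "West")  # illustrative subset

# Corrected loop: params is passed on every call.
if (file.exists("sales-report.Rmd")) {
  for (r in regions) {
    rmarkdown::render(
      "sales-report.Rmd",
      params      = list(region = r),
      output_file = paste0(r, ".html"),
      quiet       = TRUE
    )
  }
  print(check_report_params(paste0(regions, ".html"), regions))
}
```

Any `FALSE` in the result points at a file whose content never mentions its own region, which is the symptom of the missing `params` argument.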

21.1 Parameterized Reports

Here is a scenario every business student will eventually face: you write a beautiful report for the Northeast region, and your manager says, “Great! Now make one for every other region too.” If you copy-paste the .Rmd file twelve times and change the region name in each one, you will hate your life by region number four. Parameterized reports solve this problem.

The idea is simple: you write your report once, but leave a placeholder (a “parameter”) for anything that changes — like the region name, the product category, or the time period. Then you tell R to generate the report over and over, swapping in different values each time. It is basically a mail merge, but for data analysis.

21.1.1 Setting It Up

You add a params section to your YAML header with default values. Here is a complete, self-contained example you could save as species-report.Rmd and actually run:

---
title: "Species Report"
author: "Your Name"
date: "`r Sys.Date()`"
params:
  species_name: "setosa"
output:
  html_document:
    toc: true
    theme: flatly
---

```{r}
library(tidyverse)
knitr::opts_chunk$set(echo = FALSE, message = FALSE, warning = FALSE)
```

## Overview

This report covers the **`r params$species_name`** species
from the `iris` dataset.

```{r}
species_data <- iris %>%
  filter(Species == params$species_name)
```

The dataset contains `r nrow(species_data)` observations.

## Summary Statistics

```{r}
summary(species_data %>% select(-Species))
```

## Visualization

```{r}
ggplot(species_data, aes(x = Sepal.Length, y = Sepal.Width)) +
  geom_point(color = "steelblue", size = 2) +
  labs(
    title = paste("Sepal Dimensions:", params$species_name),
    x = "Sepal Length (cm)",
    y = "Sepal Width (cm)"
  ) +
  theme_minimal()
```

Notice how params$species_name shows up everywhere — in the filtering code, in the plot title, and even in the narrative text via inline R code. When you knit this document, R fills in the default value (“setosa”). But the real power comes from overriding that default.

To render with a different species, use rmarkdown::render():

Code
rmarkdown::render(
  input = "species-report.Rmd",
  params = list(species_name = "versicolor"),
  output_file = "report-versicolor.html"
)

In a real business setting, swap “species_name” for “region”, “product_line”, “client_name”, or whatever dimension your reports are organized around. Same concept, much higher stakes.
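One thing to watch for: a mistyped parameter value makes `filter()` return zero rows, and the report still knits without complaint. A guard near the top of the report can stop the render early instead. This is a minimal sketch; `validate_species()` is a hypothetical helper, and inside the report you would call it as `validate_species(params$species_name)`.

```r
# Hypothetical guard for the top of species-report.Rmd:
# stop the render if the parameter does not match a known species.
validate_species <- function(species_name, data = iris) {
  valid <- levels(data$Species)
  if (!species_name %in% valid) {
    stop(
      "Unknown species '", species_name, "'. Expected one of: ",
      paste(valid, collapse = ", "),
      call. = FALSE
    )
  }
  invisible(species_name)
}

validate_species("versicolor")      # passes silently
# validate_species("versicolour")   # would stop the render with an error
```

The same idea carries over to regions or product lines: compare the parameter against the values that actually occur in the data before doing anything else.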

21.1.2 Generating Reports in Bulk

Here is where it gets fun. Want to generate a report for every region in one shot? Just loop through them:

Code
regions <- c("Northeast", "Southeast", "Midwest", "West")

for (r in regions) {
  rmarkdown::render(
    input = "regional-report.Rmd",
    params = list(region = r),
    output_file = paste0("report-", r, ".html")
  )
}

Run that, go get coffee, and come back to four finished reports. Your manager thinks you are a wizard. You are not — you just know about parameterized reports.
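A slightly more defensive version of the loop might look like the sketch below. It makes three assumptions that are not in the original example: `report_file()` is a hypothetical helper that builds lowercase file names safe for regions containing spaces, the reports go into a folder named `reports`, and each render gets a fresh environment so objects from one report cannot leak into the next.

```r
# Hypothetical helper: turn a region label into a safe, lowercase file name.
report_file <- function(region) {
  slug <- gsub("[^a-z0-9]+", "-", tolower(region))
  slug <- gsub("^-|-$", "", slug)
  paste0("report-", slug, ".html")
}

regions <- c("Northeast", "Southeast", "Midwest", "West")

# Only attempt the renders when the template is present.
if (file.exists("regional-report.Rmd")) {
  dir.create("reports", showWarnings = FALSE)  # assumed output folder
  for (r in regions) {
    rmarkdown::render(
      input       = "regional-report.Rmd",
      params      = list(region = r),
      output_file = report_file(r),
      output_dir  = "reports",
      envir       = new.env(),  # fresh environment for each report
      quiet       = TRUE
    )
  }
}
```

Keeping the file-name logic in its own function means you can check the names before spending time on the renders themselves.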

21.2 Presentations

Yes, you can make slide decks with R Markdown. They will not replace PowerPoint for every group project, but for data-driven presentations where your charts need to match your analysis exactly, this is a game-changer.

The easiest option is ioslides, which gives you HTML slides that open in any browser:

---
title: "Q4 Sales Review"
author: "Your Name"
date: "`r Sys.Date()`"
output:
  ioslides_presentation:
    widescreen: true
---

New slides start with ## (level 2 headers) or ---. Your R code chunks produce output right on the slides. You can set widescreen: true for a 16:9 aspect ratio (which is what most projectors use these days), and everything else works exactly like a normal R Markdown document. Same syntax, different output.

If you want fancier slides with more customization, check out the xaringan package. It produces gorgeous, highly customizable HTML presentations and is popular for conference talks. It supports two-column layouts, incremental reveals, presenter notes, and custom CSS themes. But honestly, ioslides will get you through any class presentation just fine — save xaringan for when you are trying to impress someone at a conference.

21.3 Dashboards with flexdashboard

Dashboards are the business world’s favorite way to consume data. Executives love them. Clients love them. Your professor will love them. The flexdashboard package lets you build dashboards using the same R Markdown skills you already have.

Code
install.packages("flexdashboard")

21.3.1 The Basic Structure

A flexdashboard is just an R Markdown document with a special output format. The layout uses headers to define columns and boxes:

---
title: "Sales Dashboard"
output:
  flexdashboard::flex_dashboard:
    orientation: columns
    vertical_layout: fill
---

```{r}
library(flexdashboard)
library(tidyverse)
```

Column {data-width=650}
-----------------------------------------------------------------------

### Revenue by Region

```{r}
ggplot(sales, aes(x = Region, y = Revenue, fill = Region)) +
  geom_col() +
  theme_minimal()
```

Column {data-width=350}
-----------------------------------------------------------------------

### Top Products

```{r}
sales %>%
  group_by(Product) %>%
  summarize(Total = sum(Revenue)) %>%
  arrange(desc(Total)) %>%
  head(10) %>%
  knitr::kable()
```

### Total Revenue

```{r}
sum(sales$Revenue)
```

That is it. Level 2 headers (## or the line of dashes) create columns. Level 3 headers (###) create boxes within those columns. The {data-width=650} bit controls how wide each column is relative to the others.
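The chunks in this layout refer to a data frame named `sales` that the example never creates. To try the dashboard yourself, something like the following could go in the first chunk. The column names match the example; the values are invented purely for illustration.

```r
# Made-up data so the dashboard chunks have something to plot.
# Column names match the example; the numbers are invented.
sales <- data.frame(
  Region  = rep(c("Northeast", "Southeast", "Midwest", "West"), each = 3),
  Product = rep(c("Widget", "Gadget", "Gizmo"), times = 4),
  Revenue = c(120, 95, 60, 80, 110, 45, 70, 65, 90, 150, 130, 85)
)

# Quick checks that mirror what the dashboard boxes compute.
aggregate(Revenue ~ Region, data = sales, FUN = sum)
sum(sales$Revenue)
```

In a real dashboard you would replace this with a `read_csv()` call or a database query, keeping the same column names.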

You can also add multiple pages with level 1 headers, and tabbed sections by adding {.tabset} to a column header. Flexdashboard even has special components for highlighting key metrics:

Code
# A big number with an icon, perfect for KPIs
valueBox(value = 150, caption = "Total Sales", icon = "fa-dollar-sign", color = "primary")

# A gauge/dial that shows progress toward a goal
gauge(
  value = 72, min = 0, max = 100, symbol = "%",
  sectors = gaugeSectors(success = c(80, 100), warning = c(50, 79), danger = c(0, 49))
)

These are the kind of flashy elements that executives cannot resist. “What’s our conversion rate?” — boom, there it is in a big green number on the dashboard.

If you want to go even further, adding runtime: shiny to your YAML header turns a static dashboard into a fully interactive app with dropdown menus, sliders, and reactive charts. That requires a running R session (so you cannot just email the HTML file), but it is incredibly powerful for internal tools.

21.4 Interactive Documents

Here is where you really impress people. With just a couple of extra packages, you can make your HTML reports interactive — sortable tables, zoomable plots, the works. No Shiny server required. These are self-contained HTML files you can email to anyone.

21.4.1 Interactive Tables with DT

The DT package turns boring static tables into searchable, sortable, paginated tables that your audience can actually explore:

Code
library(DT)
datatable(
  iris,
  options = list(pageLength = 10, autoWidth = TRUE),
  caption = "Go ahead: search, sort, and filter this table.",
  filter = "top"
)

Your audience can search across all columns, sort by clicking headers, filter individual columns, and page through the data. It is like giving them a mini-spreadsheet inside your report. Finance people go absolutely nuts for this.
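A couple of small refinements are often worth adding. The sketch below hides the row-number column and rounds the four measurement columns to one decimal place; the choice of columns and digits is only an illustration.

```r
library(DT)

# Hide row names and keep the per-column filters from the example above.
tbl <- datatable(
  iris,
  rownames = FALSE,
  filter = "top",
  options = list(pageLength = 10)
)

# Round the four measurement columns to one decimal place for display.
tbl <- formatRound(
  tbl,
  columns = c("Sepal.Length", "Sepal.Width", "Petal.Length", "Petal.Width"),
  digits = 1
)

tbl
```

`formatRound()` only changes how the numbers are displayed, so sorting and filtering still work on the underlying values.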

21.4.2 Interactive Plots with plotly

The plotly package is pure magic. You take a regular ggplot, wrap it in ggplotly(), and suddenly it has hover tooltips, zooming, and panning:

Code
library(plotly)

p <- ggplot(iris, aes(x = Sepal.Length, y = Sepal.Width, color = Species)) +
  geom_point(size = 2) +
  labs(
    title = "Iris Sepal Dimensions",
    x = "Sepal Length (cm)",
    y = "Sepal Width (cm)"
  ) +
  theme_minimal()

ggplotly(p)

Hover over any point to see its exact values. Click and drag to zoom in. Use the toolbar to pan, reset the view, or download the plot as a PNG. One line of code (ggplotly(p)) turns a static chart into something that feels like a real data product. This is the single easiest way to make your reports look impressive.
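The default tooltip shows raw variable names. If you want friendlier hover text, one common approach is to build a `text` aesthetic and tell `ggplotly()` to show only that. The wording of the tooltip below is just an example.

```r
library(ggplot2)
library(plotly)

# ggplot2 does not draw the `text` aesthetic; ggplotly() uses it for the tooltip.
p_hover <- ggplot(iris, aes(
  x = Sepal.Length,
  y = Sepal.Width,
  color = Species,
  text = paste0(
    "Species: ", Species,
    "<br>Sepal: ", Sepal.Length, " x ", Sepal.Width, " cm"
  )
)) +
  geom_point(size = 2) +
  theme_minimal()

# Show only the custom text when hovering.
hover_plot <- ggplotly(p_hover, tooltip = "text")
hover_plot
```

The `<br>` inside the string is an HTML line break, which plotly renders as a new line in the tooltip.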

21.4.3 Other Interactive Goodies

The leaflet package creates interactive maps (great for any location-based business data), and there are htmlwidgets for network diagrams, time series, and more. Once you are comfortable with DT and plotly, exploring these is easy because they all follow the same basic pattern.
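As a taste of that shared pattern, here is a minimal leaflet sketch. The `stores` data frame is hypothetical, and its names and coordinates are illustrative only.

```r
library(leaflet)

# Hypothetical store locations; names and coordinates are illustrative.
stores <- data.frame(
  name = c("Downtown", "Airport"),
  lng  = c(-122.3321, -122.3088),
  lat  = c(47.6062, 47.4502)
)

# Same pattern as DT and plotly: pass in data, get back an HTML widget.
store_map <- leaflet(stores) %>%
  addTiles() %>%
  addMarkers(lng = ~lng, lat = ~lat, popup = ~name)

store_map
```

Printing the object in a code chunk embeds the map in the knitted HTML, just as with `datatable()` and `ggplotly()`.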

21.5 A Few Other Things Worth Knowing

Bookdown is the package that built the book you are reading right now. It extends R Markdown to handle multi-chapter documents with cross-referencing across chapters. You are already using it, so you know how it works.

Blogdown lets you build a website or blog using R Markdown. If you ever want a personal data science portfolio site, it is worth a look.

Both are great tools, but you do not need to learn them in depth right now.

21.6 Tips for Reproducibility

Reproducibility means someone else (or future you, six months from now, panicking before a deadline) can run your analysis and get the same results. In a business context, this is the difference between “I can back up these numbers” and “I have no idea how I got these numbers.”

Two simple habits will save you:

21.6.1 Set Your Seeds

Any time your analysis involves randomness — sampling, simulations, bootstrapping — use set.seed() so the results are identical every time:

Code
set.seed(42)
sample(1:100, 5)
[1] 49 65 25 74 18
Code
# Same seed, same results. Every time.
set.seed(42)
sample(1:100, 5)
[1] 49 65 25 74 18

Put a set.seed() call with a fixed number in your setup chunk, and every knit will reproduce the same random draws.
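The same idea applies to anything built on random draws, such as a train/test split. A small sketch, where the split size of 100 rows is arbitrary:

```r
# A seeded random split: rerunning the chunk selects the same rows every time.
set.seed(42)
train_rows_first <- sample(nrow(iris), size = 100)

set.seed(42)
train_rows_second <- sample(nrow(iris), size = 100)

# Same seed, same draw.
identical(train_rows_first, train_rows_second)
```

If the two vectors ever differ, some other random call ran between the seed and the sample, which is a hint to move `set.seed()` closer to the code that depends on it.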

21.6.2 Record Your Session Info

At the end of important documents, include sessionInfo(). This records which version of R and which packages you used, so if something breaks later, you can figure out what changed:

Code
sessionInfo()
R version 4.5.1 (2025-06-13 ucrt)
Platform: x86_64-w64-mingw32/x64
Running under: Windows 11 x64 (build 26200)

Matrix products: default
  LAPACK version 3.12.1

locale:
[1] LC_COLLATE=English_United States.utf8 
[2] LC_CTYPE=English_United States.utf8   
[3] LC_MONETARY=English_United States.utf8
[4] LC_NUMERIC=C                          
[5] LC_TIME=English_United States.utf8    

time zone: America/Los_Angeles
tzcode source: internal

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] plotly_4.12.0   DT_0.34.0       lubridate_1.9.5 forcats_1.0.1  
 [5] stringr_1.6.0   dplyr_1.2.0     purrr_1.2.1     readr_2.2.0    
 [9] tidyr_1.3.2     tibble_3.3.1    ggplot2_4.0.2   tidyverse_2.0.0

loaded via a namespace (and not attached):
 [1] sass_0.4.10         generics_0.1.4      stringi_1.8.7      
 [4] hms_1.1.4           digest_0.6.39       magrittr_2.0.4     
 [7] evaluate_1.0.5      grid_4.5.1          timechange_0.4.0   
[10] RColorBrewer_1.1-3  fastmap_1.2.0       jsonlite_2.0.0     
[13] httr_1.4.8          crosstalk_1.2.2     viridisLite_0.4.3  
[16] scales_1.4.0        lazyeval_0.2.2      jquerylib_0.1.4    
[19] cli_3.6.5           rlang_1.1.7         withr_3.0.2        
[22] cachem_1.1.0        yaml_2.3.12         otel_0.2.0         
[25] tools_4.5.1         tzdb_0.5.0          vctrs_0.7.1        
[28] R6_2.6.1            lifecycle_1.0.5     htmlwidgets_1.6.4  
[31] pkgconfig_2.0.3     pillar_1.11.1       bslib_0.10.0       
[34] gtable_0.3.6        glue_1.8.0          data.table_1.18.2.1
[37] xfun_0.56           tidyselect_1.2.1    knitr_1.51         
[40] farver_2.1.2        htmltools_0.5.9     labeling_0.4.3     
[43] rmarkdown_2.30      compiler_4.5.1      S7_0.2.1           

It is not glamorous, but the day you need it, you will be very glad it is there. Think of it as an insurance policy for your analysis.
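If you would rather keep the record out of the report itself, you can write it to a text file next to your output. In this sketch the file name `session-info.txt` is an assumption; use whatever fits your project.

```r
# Save the session details to a file so they travel with the report.
# "session-info.txt" is an assumed file name.
writeLines(capture.output(sessionInfo()), "session-info.txt")
```

Committing that file alongside the report gives you a dated record of the package versions behind each set of results.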

21.6.3 Keep Your Project Organized

A quick checklist for reproducible work:

  • Use RStudio Projects (the .Rproj file) so your working directory is always the project root.
  • Never modify raw data. Keep originals in a data/raw/ folder and write code that creates cleaned versions.
  • Use relative file paths. Write read_csv("data/sales.csv"), not read_csv("C:/Users/jsmith/Desktop/stuff/sales.csv"). Your coworkers will thank you.
  • Knit from a clean session. Before you share a report, restart R and knit the whole thing from scratch. If it does not work in a clean session, it does not work.

If you followed the advice in earlier chapters, you are already doing most of this. The point is: future you should be able to open this project in six months and have everything just run.
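The raw-data and relative-path habits from the checklist can be sketched together. Everything specific here is an assumption for illustration: `clean_sales()` is a hypothetical helper, the `Revenue` column and the "drop missing revenue" rule are made up, and base `read.csv()` stands in for `read_csv()` to keep the example dependency-free.

```r
# Hypothetical cleaning step: read from data/raw/, write to data/clean/,
# and never write back to the raw file.
clean_sales <- function(raw_path, clean_path) {
  raw <- read.csv(raw_path, stringsAsFactors = FALSE)
  # Example rule only: drop rows with missing revenue.
  cleaned <- raw[!is.na(raw$Revenue), , drop = FALSE]
  dir.create(dirname(clean_path), recursive = TRUE, showWarnings = FALSE)
  write.csv(cleaned, clean_path, row.names = FALSE)
  invisible(cleaned)
}

# Relative paths resolve against the project root inside an RStudio Project.
raw_file <- file.path("data", "raw", "sales.csv")
if (file.exists(raw_file)) {
  clean_sales(raw_file, file.path("data", "clean", "sales.csv"))
}
```

Because the cleaned file is always regenerated from the untouched original, you can change the cleaning rules later and rerun without losing anything.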

21.7 Summary

R Markdown is not just for homework reports. Here is what we covered:

  • Parameterized reports let you generate the same report for different regions, products, or time periods automatically. Write once, render many.
  • Presentations with ioslides (and xaringan for the adventurous) let you build slide decks straight from your analysis.
  • Flexdashboard turns R Markdown into dashboards that look like actual business intelligence tools.
  • Interactive documents with DT::datatable() and plotly::ggplotly() make your reports feel like apps — searchable tables, zoomable charts, no server required.
  • Reproducibility basics like set.seed() and sessionInfo() keep your work trustworthy and your future self sane.

None of these are things you need to master today. But now you know they exist, and when the right situation comes up — and it will — you will know exactly where to look.