flowchart LR
A[".qmd file"] --> B["Quarto Engine"]
B --> C["HTML"]
B --> D["PDF"]
B --> E["Word"]
style A fill:#e1f5fe
style B fill:#fff3e0
Part 3: Quarto Basics
Create your first reproducible document
Learning Goals
By the end of this section, you will:
- Create a Quarto document (.qmd)
- Write YAML headers, Markdown text, and R code chunks
- Render documents to HTML
- Understand code chunk options
What is Quarto?
Quarto lets you write code and text in one document:
- Write your analysis and explanation together
- Render to HTML, PDF, Word, and more
- Plain text format = version control friendly, AI friendly
Think of it as a Word document that can run R code and automatically update results.
Anatomy of a .qmd File
A Quarto document has three main parts:
- YAML Header: Settings at the top between --- marks. Controls title, format, options.
- Markdown Text: Your narrative writing using simple formatting syntax.
- Code Chunks: R code blocks that execute when you render.
Example structure:
---
title: "My Analysis"
format: html
---
This is my analysis. Here are the results:
```{r}
mean(c(1, 2, 3, 4, 5))
```
Hands-on Exercise
Step 1: Open data-cleaning.qmd
- Open data-cleaning.qmd from the Files pane
- Look at the file structure: it already has a YAML header, text sections, and code chunks
This is a scaffold file we’ve prepared. Your job is to fill in the code.
Step 2: Update the YAML header
Find the YAML header at the top and change YOUR NAME to your actual name:
---
title: "Data Cleaning"
author: "Your Name"
date: today
format: html
---
Step 3: Add code to load the data
Goal: Load the NHANES dataset and preview its structure.
Find the load-data chunk and add this code:
data-cleaning.qmd
#| label: load-data
# Load the NHANES package (contains the dataset we'll use)
library(NHANES)
# Load the NHANES data into our R environment
data("NHANES")
# Preview the first few rows to understand the data structure
head(NHANES)
What this does: The NHANES package includes built-in survey data, so we don’t need to download a separate file. The head() function shows the first 6 rows, helping us understand what variables are available.
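To see what `head()` does before the NHANES package is loaded, here is a small sketch using `mtcars`, a dataset that ships with base R. The stand-in dataset and the names `preview_default` and `preview_three` are illustrative assumptions, not part of the workshop files.

```r
# mtcars is a built-in dataset, used here only as a stand-in for NHANES
preview_default <- head(mtcars)         # default: first 6 rows
preview_three   <- head(mtcars, n = 3)  # ask for a different number of rows

# Number of rows in each preview
nrow(preview_default)
nrow(preview_three)

# dim() reports rows and columns of the full data frame
dim(mtcars)
```

The same `n` argument works on the NHANES data if six rows are too few or too many.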
Step 4: Clean the data
Goal: Prepare a clean, analysis-ready dataset by selecting variables, filtering rows, and handling missing data.
Find the clean-data chunk and add this code:
data-cleaning.qmd
#| label: clean-data
# Clean the NHANES data for our analysis
data_clean <- NHANES |>
  # Keep only the variables we need for this analysis
  select(ID, Age, Gender, Race1, Education, BPSysAve, BMI, Weight) |>
  # Restrict to adults (children have different BP norms)
  filter(Age >= 20) |>
  # Remove rows with any missing values (complete case analysis)
  drop_na() |>
  # Set education levels in logical order (for proper ordering in plots/tables)
  mutate(
    Education = factor(Education, levels = c(
      "8th Grade", "9 - 11th Grade", "High School",
      "Some College", "College Grad"
    ))
  )
# Check the result: how many rows and columns?
dim(data_clean)
# Preview the cleaned data
head(data_clean)
What this does:
- select(): Keep only the 8 variables we need (reduces noise)
- filter(): Restrict to adults age 20+ (our target population)
- drop_na(): Remove incomplete rows (simple but transparent approach)
- mutate(): Order education levels logically (8th Grade → College Grad)
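The effect of each cleaning step is easier to see on a tiny example. The sketch below uses base R on a made-up five-row data frame (`toy` and its values are hypothetical, not NHANES data) to mirror the filter, drop-missing, and factor-ordering steps. The workshop code itself uses the dplyr and tidyr verbs listed above.

```r
# Hypothetical toy data, invented for illustration only
toy <- data.frame(
  Age       = c(15, 34, 52, 41, 67),
  BMI       = c(21.0, NA, 27.5, 30.1, 24.8),
  Education = c("High School", "College Grad", "8th Grade",
                "Some College", "High School"),
  stringsAsFactors = FALSE
)

# Base R counterpart of filter(Age >= 20): keep adults only
adults <- toy[toy$Age >= 20, ]

# Base R counterpart of drop_na(): keep rows with no missing values
complete <- adults[complete.cases(adults), ]

# Without explicit levels, factor() sorts the labels alphabetically
levels(factor(complete$Education))

# With explicit levels, the order follows the vector you supply
edu_levels <- c("8th Grade", "9 - 11th Grade", "High School",
                "Some College", "College Grad")
complete$Education <- factor(complete$Education, levels = edu_levels)
levels(complete$Education)

# Rows remaining after each step
c(start = nrow(toy), adults = nrow(adults), complete = nrow(complete))
```

A label that does not match any entry in `levels` becomes `NA`, so it is worth comparing the spelling against `unique()` of the column before converting.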
Step 5: Save the cleaned data
Goal: Export the cleaned dataset so other scripts can use it without re-running all the cleaning code.
Find the save-data chunk and add this code:
data-cleaning.qmd
#| label: save-data
# Create results folder if it doesn't exist yet
dir.create(here("results"), showWarnings = FALSE)
# Save the cleaned data to the results folder
# Other scripts will load this file instead of re-running data cleaning
saveRDS(data_clean, here("results", "data_clean.rds"))
What this does:
- dir.create(): Creates the results/ folder if it doesn’t exist (safe to run multiple times)
- saveRDS(): Saves our R object to a file. Later scripts can load it with readRDS()
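A quick way to convince yourself that `dir.create()` is safe to re-run is to call it twice on a scratch folder. This sketch uses `tempdir()` instead of the project's `results/` folder so that nothing in your project is touched; `scratch_dir` is a hypothetical name.

```r
scratch_dir <- file.path(tempdir(), "results-demo")
unlink(scratch_dir, recursive = TRUE)  # start from a clean slate

# dir.create() returns TRUE or FALSE invisibly, so capture the result
first_call  <- dir.create(scratch_dir, showWarnings = FALSE)
second_call <- dir.create(scratch_dir, showWarnings = FALSE)

first_call   # TRUE: the folder was created
second_call  # FALSE: it already existed, and no error or warning was raised
dir.exists(scratch_dir)
```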
Why here()?
The here() function builds file paths relative to your project root (automatically detected via .git folder, .Rproj, or .here file). This matters for reproducibility:
- Your script works regardless of working directory
- Collaborators can run your code without changing paths
- No more setwd() or broken absolute paths like /Users/yourname/...
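As a rough sketch of what `here()` returns, the lines below print the detected project root and build the same path used in the save-data chunk. This assumes the here package is installed and that the code runs from somewhere inside your project; the printed root will differ on every machine.

```r
library(here)

# The project root that here() detected
project_root <- here()
print(project_root)

# Path components are joined onto that root
rds_path <- here("results", "data_clean.rds")
print(rds_path)

# The result is a full path that begins at the project root
startsWith(rds_path, project_root)
```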
Why .rds format?
- Preserves R data types (factors, dates, attributes) perfectly
- Other scripts don’t need to repeat the cleaning steps
- Creates a clear handoff point between data preparation and analysis
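The difference from a plain-text format can be checked with a small round trip. The sketch below writes a made-up data frame (hypothetical values) to temporary `.rds` and `.csv` files and compares what comes back. It illustrates the first point above and does not reproduce any workshop file.

```r
# Hypothetical data with an explicitly ordered factor and a Date column
demo <- data.frame(
  Education = factor(c("High School", "8th Grade"),
                     levels = c("8th Grade", "High School")),
  Visit     = as.Date(c("2024-01-15", "2024-02-01"))
)

rds_file <- tempfile(fileext = ".rds")
csv_file <- tempfile(fileext = ".csv")

saveRDS(demo, rds_file)
write.csv(demo, csv_file, row.names = FALSE)

from_rds <- readRDS(rds_file)
from_csv <- read.csv(csv_file, stringsAsFactors = FALSE)

# The .rds copy keeps the factor levels and the Date class
levels(from_rds$Education)
class(from_rds$Visit)

# The .csv copy comes back as plain character columns
class(from_csv$Education)
class(from_csv$Visit)
```

With a CSV handoff, every downstream script would have to rebuild the factor and re-parse the dates before using them.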
Step 6: Run chunks interactively
Click the green play button (▶) on the right side of each chunk, or:
- Ctrl+Enter / Cmd+Enter — Run current line
- Ctrl+Shift+Enter / Cmd+Shift+Enter — Run entire chunk
Run all chunks in order: setup → load-data → clean-data → save-data
Tip: Create new chunks with Ctrl+Alt+I (Windows) / Cmd+Option+I (Mac)
Your template has a _quarto.yml file that configures the project as a “Quarto Manuscripts” project. This is intentional!
What this means:
- When you click “Render” on a single .qmd file, the entire project builds
- This is normal behavior for Manuscripts projects
- The setup enables the embed feature you’ll use in Part 6
Recommended workflow during development:
- Run chunks interactively (green play button) to develop and test code
- Render when you want to see the formatted HTML output
Interactive execution is faster for development; Render is for final output.
Step 7: Render the document
- Save your file (Ctrl/Cmd + S)
- Click the Render button (or press Ctrl/Cmd + Shift + K)
- See the HTML output in the Viewer panel
Did it render? You should see your cleaned data summary. Check that data_clean.rds was created in the results/ folder.
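The same check can be done from the R console. This sketch assumes your working directory is the project root (so it uses a plain relative path rather than `here()`), and it only reads the file if it is actually there.

```r
# Relative path; assumes the working directory is the project root
rds_path <- file.path("results", "data_clean.rds")

if (file.exists(rds_path)) {
  # Reload the saved object and confirm its shape
  data_check <- readRDS(rds_path)
  print(dim(data_check))
} else {
  message("Not found: ", rds_path, ". Run the save-data chunk, then try again.")
}
```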
Quick Reference
Markdown Formatting
| What you type | What you get |
|---|---|
| `**bold**` | bold |
| `*italic*` | italic |
| `# Heading 1` | Large heading |
| `## Heading 2` | Medium heading |
| `- item` | Bullet point |
| `1. item` | Numbered list |
Code Chunk Options
data-cleaning.qmd
#| echo: true
#| eval: true
#| message: false
#| warning: false
library(tidyverse)
| Option | What it does |
|---|---|
| echo: true/false | Show/hide the code |
| eval: true/false | Run/skip the code |
| message: false | Hide package messages |
| warning: false | Hide warnings |
Try it yourself: Experiment with chunk options
- In your data-cleaning.qmd, add #| echo: false to the setup chunk
- Re-render the document
- Question: What changed in the output? Is the code visible or hidden?
Try different combinations:
| Try this | What do you expect? |
|---|---|
| echo: false | Code hidden, results shown |
| eval: false | Code shown, but not run |
| echo: false + eval: false | ??? (try it!) |
The best way to understand chunk options is to experiment! Change one option, render, see what happens.
Labels for Cross-Reference
Name your chunks for later reference:
data-cleaning.qmd
#| label: tbl-demographics
#| tbl-cap: "Demographics of study participants"
# Your table code here
- Use the tbl- prefix for tables and fig- for figures
- Reference with @tbl-demographics in your text
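As one possible way to fill in the placeholder, the chunk below builds a small table with `knitr::kable()`. It assumes the knitr package is installed, and `demo_counts` is a made-up data frame standing in for real demographics.

```r
#| label: tbl-demographics
#| tbl-cap: "Demographics of study participants"

# Hypothetical counts, for illustration only
demo_counts <- data.frame(
  Group = c("A", "B"),
  N     = c(120, 95)
)

# kable() turns the data frame into a formatted table for the rendered page
demographics_table <- knitr::kable(demo_counts)
demographics_table
```

Because the label starts with `tbl-`, Quarto numbers the table and resolves `@tbl-demographics` in the text to a linked reference to it.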
Don’t Forget: Commit & Push!
Save your progress in Git
1. Stage your changes:
In the Source Control panel (left sidebar), click the + button next to data-cleaning.qmd to stage it.
2. Write your commit message:
Think about what you accomplished. A good commit message:
- Starts with a verb (Add, Update, Fix, Create…)
- Describes WHAT changed and optionally WHY
- Is specific enough to understand later
You can use the built-in AI feature to suggest a commit message based on your changes. Alternatively, you can integrate external AI models such as Claude or Gemini into your workflow through their CLI tools.
3. Commit and Push:
- Click Commit to save locally
- Click Push (↑) to upload to GitHub
3.5. View your changes:
In the Source Control panel, expand the commit you just made (in the History view) to see the diff—a line-by-line comparison showing exactly what changed. Green lines are additions, red lines are deletions. This is how Git tracks your work!
4. Verify on GitHub:
- Go to your repository on GitHub
- Check that your changes appear in the commit history
- Confirm data-cleaning.qmd shows your new code
Commit after completing each logical unit of work.