1 Introduction

1.1 The Standard for Next-Generation Geographic Data Science: What is Quarto?

The Evolution of R Markdown: R Markdown has expanded into a broader ecosystem. Developed by Posit, this system goes beyond R to encompass all aspects of scientific publishing.
Integration of Tools: It allows you to freely use essential geographic research languages like R (sf, terra) and Python (GeoPandas) within a single document, or simply stick to the language you are most comfortable with.
An All-in-One Solution for Geographers: You can complete the entire process—from data collection and spatial analysis to map visualization and final manuscript submission—within a single .qmd file.

1.2 Implementing Reproducible Geography

Unifying Analysis and Writing: The era of exporting maps from GIS software and pasting them into Word is over. If the underlying data changes, all maps and tables in the report are synchronized and updated in real time.
Reproducible & Transparent Research: Because all code (from preprocessing to final modeling)is included in the document, fellow geographers can accurately verify and reproduce your research process.

1.3 An Investment for the Future: Multi-Language Support and Collaboration

Breaking Language Barriers: You might be using R right now, but even if you transition to Python-based spatial models later, the Quarto syntax remains exactly the same.
Flawless Equation Rendering: Equations essential for spatial statistics are rendered at a journal-quality level. Below is the formula for Moran’s I, which measures spatial autocorrelation.

\[I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n}w_{ij}} \frac{\sum_{i=1}^{n}\sum_{j=1}^{n}w_{ij}(z_i-\bar{z})(z_j-\bar{z})}{\sum_{i=1}^{n}(z_i-\bar{z})^2}\]

The Geographer’s Portfolio: By linking with GitHub Pages, you can publish your own research blog or project page to the world with just a few clicks.

1.4 `One-Source, Multi-Format`: Diverse Outputs from a Single Source

From Reports to Papers: After freely conducting your analysis and drafting in HTML, changing just one line in the settings converts the document into a PDF manuscript formatted for journal submission or an MS Word document for collaboration.
Automated Presentations: There is no need to create a separate PowerPoint file. You can use your existing code to generate web-based slides (Revealjs). At a geography conference, simply open your browser and you are ready to present.

Tip

Try changing format: html at the top of this document to format: docx or format: pdf, and then click Render again. The exact same code will instantly be converted into a Word document or a PDF paper.

PDF Rendering Requirement: TinyTeX

To generate PDF documents, Quarto requires a LaTeX distribution. I strongly recommend installing TinyTeX, a lightweight and portable LaTeX distribution that works seamlessly with Quarto.

You can install it by running the following command in your terminal:


quarto install tinytex

1.5 Why Start with HTML? (A Geographical Perspective)

The Magic of Interactive Maps: Provide an experience that paper maps cannot offer. You can embed maps directly into your report, allowing readers to toggle layers on and off, or click on features to view attribute information.
Zero Entry Barrier: There is no need to struggle with complex LaTeX installations. Simply click the Render button in RStudio, and a beautifully formatted result instantly appears in your web browser.
Data Storytelling: By organically arranging analysis results (maps, charts) and the analyst’s interpretations, you guide the reader to actively explore the data and reach a conclusion.

1.6 Interactive Mapping Demonstration

The map below can be zoomed in and out. Click on a country to see detailed attribute data.

Code

# Load necessary spatial libraries
library(sf)
library(tmap)

# Set tmap mode to interactive 'view' for HTML output
tmap_mode("view")

# Load built-in World dataset for demonstration
data(World)

# Create an interactive map showing life expectancy
tm_shape(World) +
  tm_polygons("life_exp", 
              palette = "YlGnBu", 
              title = "Life Expectancy",
              id = "name",
              popup.vars = c("pop_est", "gdp_cap_est"))

Global Life Expectancy Map

2 Installing Quarto

We’ll assume that you have already installed R as well as R studio.

2.1 Windows

On Windows, the most efficient way to do this is using the Windows Package Manager (winget).

2.1.1 How to Run These Commands

You can execute the installation script in one of two ways:

Inside RStudio: Click the Terminal tab (located right next to the Console tab) and paste the commands there.
Windows Terminal: Press Win + R, type wt (or powershell), and hit Enter to open a standalone terminal window.

2.1.2 Installation Script (PowerShell)

Copy and paste the following block into your terminal. Note that lines starting with # are comments; remove the # if you need to install those specific components.


# If your machine doesn't have R and its components...
# Un-comment the lines below (remove the '#') to install them
# winget install -e --id RProject.R
# winget install -e --id RProject.Rtools
# winget install -e --id Posit.RStudio

# install Quarto
winget install -e --id Posit.Quarto

2.1.3 Why use `winget`?

Automation: It handles the download and installation process automatically.
Consistency: It ensures everyone in the research group is using the same official versions.
Maintenance: You can update all your tools later by simply running winget upgrade --all.

Note

If you are working on a university-managed machine, you might need administrative privileges to run these commands. If winget is not recognized, ensure your Windows 10/11 is up to date.

2.2 Mac

You can do the same on a Mac.

# Install Quarto
brew install --cask quarto

# Optional: Install R & RStudio
# brew install --cask r
# brew install --cask rstudio

Note

On macOS, if you don’t have Homebrew installed yet, visit brew.sh. first. On managed university machines, you may need administrative privileges to run these commands.

2.3 Linux Variants

If Linux is your primary OS, I’m sure you already know how to install it without my help.

3 Basic Usage

3.1 Anatomy of a Quarto Document (.qmd): 3 Core Elements

This document is composed of the following three elements. Click the </> Code button at the top right to see the underlying source.

YAML Header: The ‘brain’ of the document. It acts as a blueprint, defining the title, author, and the desired output format (HTML, PDF, etc.).
Main
- Markdown Text: The logic of your research. This is the space where you write your geographical hypotheses and interpret the spatial analysis results.
- Code Chunks: The ‘engine’ of the document. This is where the actual computation happens—loading spatial data, running statistical models, and rendering maps.

3.1.1 YAML Header

---
title: "A Glimpse into Quarto"
subtitle: "Reproducible & Unified Geospatial Research with Quarto"
author: "Yonghun Suh"
date: "Mar. 9, 2026"
categories: [Code]
image: https://rstudio.github.io/cheatsheets/html/images/quarto-illustration.png
format: 
  html:
    number_sections: true
    code-copy: true
    code-fold: show
    code-tools: true
    code-overflow: scroll
    code-link: true
    number-sections: true
    toc: true
    toc_depth: 3
    lightbox: true
# Excute options
execute:
  warning: false
  message: false
---

Shared Components

---
title: "A Glimpse into Quarto"
subtitle: "Reproducible & Unified Geospatial Research with Quarto"
author: "Yonghun Suh"
date: "Mar. 9, 2026"
.
.
.
---

Output Format Options

---
.
.
.
format: 
  html:
    number_sections: true
    code-copy: true
    code-fold: show
    code-tools: true
    code-overflow: scroll
    code-link: true
    number-sections: true
    toc: true
    toc_depth: 3
    lightbox: true
---

For more information, please refer to HTML Options.

---
.
.
.
format:
  pdf:
    toc: true
    number-sections: true
    colorlinks: true
---

For more information, please refer to PDF Options.

---
.
.
.
format:
  docx:
    toc: true
    number-sections: true
    highlight-style: github
---

For more information, please refer to MS Words Options.

Please refer to the official website.

3.1.2 Main

Markdown Text

Markdown is the foundational layer where you write the narrative and logic of your research. It is designed to be highly readable as raw text while compiling into beautifully formatted outputs.

1). Basic Markdown Syntax

You can quickly format your text without using a mouse or complex menus.

* **Bold text** for emphasis.
* *Italic text* for variables or terms.
* `Inline code` for package names like `sf` or `terra`.

1. Ordered list for methodologies.
2. Second step in the workflow.

2). LaTeX Math Mode

Modeling in the geographic data science requires rigorous spatial statistics. Quarto natively supports LaTeX math rendering, meaning you don’t need external equation editors.

* **Inline math:** Use single dollar signs. For example, the distance threshold is $d_{ij} \le 50$.

Inline math: Use single dollar signs. For example, the distance threshold is $d_{ij} \le 50$.
Display math: Use double dollar signs for standalone equations. Here is the formula for a basic spatial lag model:

$$
y = \rho W y + X \beta + \varepsilon
$$

\[ y = \rho W y + X \beta + \varepsilon \]

3). Inserting Media (Images and Links)

Adding external resources and local images is straightforward.

* Inserting a web link
    * [Official Quarto Documentation](https://quarto.org/)

* Inserting a local image with a caption
![\nTemporal Changes of Our Study Area: Aral Sea](data/Aral_Landsat.mp4)

Inserting a web link
- Official Quarto Documentation
Inserting a local image with a caption

4). Article Layout (Breaking the Boundaries)

Standard document margins are often too narrow for detailed geographic maps or large correlation matrices. Quarto’s Article Layout allows you to break out of the central body text area.

You can use Divs (:::) to expand static content, such as images or text blocks:

::: {.column-page}
![This wide map will span across the normal margins to show more detail.](https://rstudio.github.io/cheatsheets/html/images/quarto-illustration.png)
:::

This wide figure will span across the normal margins to show more detail.

::: {.column-margin}
This text will appear in the right margin, perfect for side notes or minor citations without interrupting the main text flow.
:::

This text will appear in the right margin, perfect for side notes or minor citations without interrupting the main text flow.

For executing code chunks, you can simply use #| column: page or #| column: screen inside the chunk options to make your interactive maps span the entire browser width. Read more about Article Layout.

Code

# Load necessary spatial libraries
library(sf)
library(tmap)

# Set tmap mode to interactive 'view' for HTML output
tmap_mode("view")

# Load built-in World dataset for demonstration
data(World)

# Create an interactive map showing life expectancy
tm_shape(World) +
  tm_polygons("life_exp", 
              palette = "YlGnBu", 
              title = "Life Expectancy",
              id = "name",
              popup.vars = c("pop_est", "gdp_cap_est"))

Global Life Expectancy Map

5). Etc: Callouts and Footnotes

Academic writing often requires footnotes and highlighted notes for readers.

Here is a statement that needs a reference.^[This will automatically generate a numbered footnote at the bottom of the page.]

Here is a statement that needs a reference.¹

::: {.callout-important}
## Attention
Always check your Coordinate Reference System (CRS) before performing spatial joins!
:::

Attention

Always check your Coordinate Reference System (CRS) before performing spatial joins!

Code Chunks

Code chunks are the computational engine of your Quarto document. Instead of keeping your analysis script and your manuscript separate, you can weave them together seamlessly.

1. Chunk Options (The Hash-Pipe Syntax)

In Quarto, the standard and most robust way to pass options to a code chunk is using the hash-pipe (#|) syntax inside the chunk, rather than inside the curly braces. This keeps the code clean and easy to read.

Option	Value	Description
`#\| echo:`	`true` / `false`	Shows or hides the source code in the output.
`#\| eval:`	`true` / `false`	Determines whether to run the code. Useful for showing syntax only.
`#\| warning:`	`true` / `false`	Suppresses warnings (e.g., from package updates or `NA` values).
`#\| cache:`	`true` / `false`	Saves the output. Critical for heavy parallel processing or large spatial data models.

2. Language Support Options

Quarto is fundamentally language-agnostic. While it natively uses Knitr for R and Jupyter for Python, it supports a wide array of languages.

Data Science Core: R (Native), Python (Native, reticulate), Julia (Native, or JuliaCall)
High-Performance Computing: C++ (via Rcpp or Python bindings), Fortran (dyn.load())
System & Pipeline: Bash, PowerShell

You can seamlessly pass data between Python and R within the same document using the reticulate package, allowing you to use the best tool for each specific algorithmic step.

Example: Analyzing Complexity Factors

What is Time Complexity?

In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. In the plot below, $n$ represents the size of the input, and $N$ represents the number of operations the algorithm performs. This relationship is a critical factor in defining the algorithm’s performance, as the efficiency of the algorithm can be assessed by how $N$ changes as the input size $n$ increases.

Learn more

Code

library(data.table)
library(ggplot2)

# Seq `n` gen
n <- seq(0, 100, by = 0.01)  # need to make the increment small in order to avoid `Inf`s

# data.table
df <- data.table(
                  n = n,
                  `O(1)` = 1,
                  `O(log n)` = log2(n),
                  `O(n)` = n,
                  `O(n log n)` = n * log2(n),
                  `O(n^2)` = n^2,
                  `O(2^n)` = 2^n,
                  `O(n!)` = factorial(n)
                )

df_long <- data.table::melt(df, id.vars = "n", variable.name = "Complexity", value.name = "Time")

ggplot(df_long, aes(x = n, y = Time, color = Complexity)) +
      geom_line(size = 1.2) +
            ylim(1, 100) +
            xlim(1, 100)+
      labs(
            #title = "Compirison of Computational Complexity",
            x = "Input Size (n)",
            y = "Number of Operations (N)",
            color = "Complexity"
          ) +
      theme_minimal() +
      theme(
            plot.title = element_text(size = 16, face = "bold"),
            legend.title = element_text(size = 12),
            legend.text = element_text(size = 10)
      )

A line plot of Computational Complexity — It is good to know if the algorithm can handle my problem well!

Compirison of Computational Complexity

4 Appendix

4.1 Using Other Languages in Quarto

Quarto is a language-agnostic publishing system designed to integrate multiple computational engines. By using {python} blocks, researchers can execute Python code natively alongside R and Markdown.

Technical Mechanisms:

Execution Engines: Depending on your setup, Quarto utilizes the Jupyter engine for Python-centric projects or the Knitr engine (via the reticulate package) for hybrid R-Python workflows.
Interoperability: In a hybrid environment, data frames and variables can be shared between R and Python sessions, allowing you to use the best tools from both ecosystems (e.g., Python for machine learning and R for spatial statistics).
Reproducibility: Consolidating multiple languages into a single .qmd file ensures that the entire analytical pipeline is documented and reproducible, regardless of the underlying programming language.

```{r}
print_a <- function(a) {
  a <- 1 
  print(paste("Inside function:", a))
}
```

Python

```{python}

def print_a(a):

    a = 1
    print(f"Inside function: {a}")

```

Julia

```{julia}

function print_a(a)
    println("Inside function: $a")
end

```

```{fortran}

subroutine printA(A)
    implicit none
    
    integer, intent(in) :: A
    
    print*, "The value is:", A    
    
end subroutine printA


```

4.1.1 Reticulate (Python)

Code

library(reticulate)

#' Configure Python environment
#' Windows: Check for existing environment, create if missing, then bind.

env_name <- "quarto_demo"
py_ver <- "3.11"
py_lib <- c("scikit-learn", "pandas", "numpy", "scipy")


if (Sys.info()[["sysname"]] == "Windows") {

  # 1. Check if the environment exists in the conda list
  # We use name matching for robustness
  existing_envs <- conda_list()$name

  if (!(env_name %in% existing_envs)) {

    message("Environment '", env_name, "' not found. Creating environment...")
    # Create environment with specified python version
    conda_create(env_name, python_version = py_ver)

    # Install required packages after creation
    conda_install(env_name, packages = py_lib)
  } else {
    message("Environment '", env_name, "' already exists. Skipping creation.")
  }

  # 2. Bind to the environment
# use_condaenv requires the environment name (string), which is robust for Windows
  use_condaenv(env_name, required = TRUE)

} else {

  #' Linux (CI/CD): Setup baseline using micromamba (due to its ephemeral traits).
  # GitHub Actions / Linux Environment: Always fresh setup
  message("Linux/CI environment detected. Setting up baseline via micromamba...")
  py_lib_str <- paste(py_lib, collapse = " ")

  # Use 'micromamba create' instead of 'install' for a clean setup
  # -n: environment name
  # -c: channel specification
  # -y: automatic yes to prompts

  mamba_cmd <- sprintf("micromamba create -n %s -c conda-forge %s python=%s -y",
                       env_name, py_lib_str, py_ver)

  system(mamba_cmd)

  linux_env_path <- sprintf("/home/runner/micromamba/envs/%s/bin/python", env_name)
  use_condaenv(linux_env_path, required = TRUE)
}

# Verification of the loaded Python configuration
# print(py_config())


sys <- import("sys")

sys$executable

[1] "/home/runner/micromamba/envs/quarto_demo/bin/python"

Compairing Mathematical Foundations: OLS & MLE

In linear regression, we model the relationship between a dependent variable and independent variables as:

\[y = X\beta + \epsilon\]

Where the dimensions (shapes) are:

$y \in \mathbb{R}^{N \times 1}$: Dependent variable vector.
$X \in \mathbb{R}^{N \times K}$: Design matrix (including a column of 1s for the intercept).
$\beta \in \mathbb{R}^{K \times 1}$: Parameter (coefficient) vector.
$\epsilon \in \mathbb{R}^{N \times 1}$: Error (residual) vector.

The Ordinary Least Squares (OLS) method seeks to find $\beta$ that minimizes the Sum of Squared Residuals ($SSR$):

\[SSR(\beta) = \epsilon^T \epsilon = (y - X\beta)^T (y - X\beta)\]

Matrix Expansion Steps

To derive the optimal $\beta$, we expand the expression using matrix algebra properties using following steps.

Distribution of Transpose

Using the rules $(A - B)^T = A^T - B^T$ and $(AB)^T = B^T A^T$:

\[(y - X\beta)^T = y^T - (X\beta)^T = y^T - \beta^T X^T\]

FOIL Expansion

Now we multiply $(y^T - \beta^T X^T)$ by $(y - X\beta)$:

\[SSR(\beta) = y^T y - y^T X \beta - \beta^T X^T y + \beta^T X^T X \beta\]

The Scalar Property (Merging Terms)

Note that $y^T X \beta$ is a scalar ($1 \times 1$). Any scalar is equal to its own transpose ($a = a^T$):

\[(y^T X \beta)^T = \beta^T X^T (y^T)^T = \beta^T X^T y\]

Since they are identical values, we can combine the middle terms:

\[SSR(\beta) = y^T y - 2\beta^T X^T y + \beta^T X^T X \beta\]

The Normal Equation

Taking the partial derivative with respect to $\beta$:

\[\frac{\partial SSR}{\partial \beta} = -2X^T y + 2X^T X \beta = 0\]

Solving for $\beta$ gives the Normal Equation:

\[\hat{\beta}_{OLS} = (X^T X)^{-1} X^T y\]

Maximum Likelihood Estimation (MLE) assumes the errors follow a normal distribution: $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$. The Log-Likelihood function is:

\[\mathcal{L}(\beta, \sigma^2) = -\frac{N}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - X_i\beta)^2\]

In MLE, maximizing $\mathcal{L}$ with respect to $\beta$ is identical to minimizing the summation term $\sum (y_i - X_i\beta)^2$, which is exactly the $SSR$ used in OLS. Thus, under Gaussian assumptions, OLS and MLE are mathematically equivalent.

4.1.1.1 Implementation: R and Python Interoperability

We use the ‘meuse’ soil dataset from the sp package to verify this equivalence.

4.1.1.1.1 Data Preparation (R)

We use the ‘meuse’ soil dataset. We will predict Zinc concentration ($y$) using Copper concentration ($x$).

Code

library(sp)
library(reticulate)

# Load Meuse dataset
data(meuse)

# Prepare Design Matrix X (adding an intercept column of 1s) and Target Vector y
y_vec <- as.matrix(meuse$zinc)
x_mat <- as.matrix(cbind(1, meuse$copper))

cat("Number of observations (N):", nrow(x_mat), "\n")

Number of observations (N): 155

4.1.1.1.2 Matrix Algebra & Optimization (Python)

We will perform both OLS (via linear algebra) and MLE (via numerical optimization) explicitly in Python to demonstrate their equivalence.

We use NumPy for the direct linear algebra solution (OLS) and SciPy for numerical optimization (MLE).

Code

import numpy as np
from scipy.optimize import minimize

# 1. Retrieve variables from R environment
X = r.x_mat
y = r.y_vec
N = X.shape[0]

# =========================================================
# Step 1: Ordinary Least Squares (OLS) via Linear Algebra
# =========================================================
def compute_ols(X_matrix: np.ndarray, y_vector: np.ndarray) -> np.ndarray:
    """
    Compute OLS coefficients using the Normal Equation.

    Args:
        X_matrix (np.ndarray): Design matrix.
        y_vector (np.ndarray): Target vector.

    Returns:
        np.ndarray: Beta coefficients.
    """
    xtx = X_matrix.T @ X_matrix
    xty = X_matrix.T @ y_vector
    return np.linalg.solve(xtx, xty)

beta_ols = compute_ols(X, y)

print("--- 1. OLS Results (NumPy Linear Algebra) ---")

--- 1. OLS Results (NumPy Linear Algebra) ---

Code

print(f"Intercept: {beta_ols[0][0]:.4f}")

Intercept: -97.9033

Code

print(f"Slope (Copper): {beta_ols[1][0]:.4f}\n")

Slope (Copper): 14.0792

Code

# =========================================================
# Step 2: Maximum Likelihood Estimation (MLE) via Optimization
# =========================================================
def negative_log_likelihood(params: np.ndarray) -> float:
    """
    Calculate the Negative Log-Likelihood for a Gaussian regression model.

    Args:
        params (np.ndarray): Array containing [intercept, slope, variance].

    Returns:
        float: The NLL value to be minimized.
    """
    beta = params[0:2].reshape(-1, 1)
    variance = params[2]

    # Mathematical constraint: Variance must be strictly positive
    if variance <= 0:
        return np.inf

    # Calculate Sum of Squared Residuals (SSR)
    residuals = y - (X @ beta)
    ssr = np.sum(residuals ** 2)

    # Calculate NLL formula
    nll = (N / 2) * np.log(2 * np.pi * variance) + (ssr / (2 * variance))
    return float(nll)

# Initial guess for the optimizer: [intercept=0, slope=0, variance=10]
initial_guess = np.array([0.0, 0.0, 10.0])

# Run L-BFGS-B optimization algorithm to find the MLE parameters
mle_result = minimize(negative_log_likelihood, initial_guess, method='L-BFGS-B')

beta_mle = mle_result.x[0:2]
variance_mle = mle_result.x[2]

print("--- 2. MLE Results (SciPy Optimization) ---")

--- 2. MLE Results (SciPy Optimization) ---

Code

print(f"Intercept: {beta_mle[0]:.4f}")

Intercept: -97.9134

Code

print(f"Slope (Copper): {beta_mle[1]:.4f}")

Slope (Copper): 14.0794

Code

print(f"Estimated Variance: {variance_mle:.4f}\n")

Estimated Variance: 23424.8462

Code

# =========================================================
# Step 3: Verify Mathematical Equivalence
# =========================================================
diff = np.abs(beta_ols.flatten() - beta_mle)
print("--- 3. Equivalence Check (Difference |OLS - MLE|) ---")

--- 3. Equivalence Check (Difference |OLS - MLE|) ---

Code

print(f"Difference in Intercept: {diff[0]:.2e}")

Difference in Intercept: 1.01e-02

Code

print(f"Difference in Slope: {diff[1]:.2e}")

Difference in Slope: 1.70e-04

4.1.1.1.3 Verification with R’s Native Function (R)

Finally, we verify that our explicit matrix operations match R’s default statistical engine.

Code

# Fit OLS using R's built-in linear model function
r_lm <- lm(zinc ~ copper, data = meuse)

cat("--- R's lm() Coefficients ---\n")

--- R's lm() Coefficients ---

Code

print(coef(r_lm))

(Intercept)      copper 
  -97.90330    14.07921

Why OLS and MLE results differ: Numerical Approximation Error

While OLS and MLE are theoretically identical for linear regression with normally distributed errors, you may observe slight discrepancies in your output (e.g., ~0.006 in slope). This is called Approximation Error.

Exact vs. Iterative Solutions - OLS (Analytical): Uses a “Closed-form Solution” ($\hat{\beta} = (X^TX)^{-1}X^Ty$). It calculates the exact mathematical global minimum in a single step using matrix algebra. - MLE (Numerical): Uses “Numerical Optimization” (e.g., BFGS algorithm in scipy.optimize). It “hunts” for the maximum likelihood by taking iterative steps until it reaches a pre-defined threshold.

DataFrame Operations: Spatial Point Clustering (R $\rightarrow$ Python $\rightarrow$ R)

This is a pipeline that transfers seismic event location data (quakes) from R to Python to perform Density-Based Spatial Clustering of Applications with Noise (DBSCAN), then brings the results back to R for final visualization.

Objective Evaluation: In spatial pattern analysis, scikit-learn’s DBSCAN implementation is highly optimized at the C/C-Python level, offering significantly faster processing speeds for point clouds with hundreds of thousands of points compared to standard R packages. This architecture establishes an ideal division of labor: Python handles the heavy computational load, while R’s ggplot2 produces high-quality visualizations suitable for publication.

Code

# Load built-in Fiji earthquakes dataset
data(quakes)

# Select spatial coordinates and magnitude
spatial_df <- quakes[, c("lat", "long", "mag")]
head(spatial_df, 3)

     lat   long mag
1 -20.42 181.62 4.8
2 -20.62 181.03 4.2
3 -26.00 184.10 5.4

Code

import pandas as pd
from sklearn.cluster import DBSCAN

def perform_spatial_clustering(df: pd.DataFrame, eps: float = 2.0, min_samples: int = 15) -> np.ndarray:
    """
    Apply DBSCAN clustering algorithm to spatial coordinates.

    Args:
        df (pd.DataFrame): DataFrame containing 'lat' and 'long' columns.
        eps (float): The maximum distance between two samples for one to be considered
                     as in the neighborhood of the other.
        min_samples (int): The number of samples in a neighborhood for a point
                           to be considered as a core point.

    Returns:
        np.ndarray: Array of cluster labels (-1 indicates noise).
    """
    # Extract coordinates as a NumPy array
    coords = df[['lat', 'long']].values

    # Initialize and fit DBSCAN
    clustering_model = DBSCAN(eps=eps, min_samples=min_samples)
    labels = clustering_model.fit_predict(coords)

    return labels

# r.spatial_df is automatically converted to a Pandas DataFrame
cluster_labels = perform_spatial_clustering(r.spatial_df)

Code

library(ggplot2)

# Bind the Python-generated labels back to the R dataframe
spatial_df$cluster <- as.factor(py$cluster_labels)

# Generate publication-ready plot
ggplot(spatial_df, aes(x = long, y = lat, color = cluster, size = mag)) +
  geom_point(alpha = 0.7) +
  scale_color_viridis_d(option = "plasma", name = "Cluster\n(-1 = Noise)") +
  theme_minimal() +
  labs(
    title = "Spatial Clustering of Fiji Earthquakes",
    subtitle = "DBSCAN computed in Python, Plotted in R",
    x = "Longitude",
    y = "Latitude"
  )

4.1.2 Observable JS (Client-side Interactivity)

While R and Python are excellent for heavy data processing, Observable JavaScript (OJS) is optimized for building real-time, interactive visualizations directly in the user’s web browser.

Quarto bridges these two environments using the ojs_define() function. The standard workflow is: 1. Process large datasets in R during the build phase. 2. Export the aggregated data to OJS. 3. Render interactive plots in the browser without needing an active R server.

Step A: Data Processing in R

Code

library(data.table)

# 1. Generate sample environmental data (e.g., temperature across stations)
set.seed(42)
dt_env <- data.table(
  station = rep(c("Station_A", "Station_B", "Station_C"), each = 50),
  day = rep(1:50, 3),
  temperature = c(rnorm(50, 15, 2), rnorm(50, 18, 3), rnorm(50, 12, 1.5))
)

# 2. Export the data to the Observable JS environment
ojs_define(env_data = dt_env)

Step B: Interactive Plotting in Observable JS

In the following ojs chunk, we take the data processed by R and create an interactive plot. Try changing the dropdown menu to filter the stations in real-time.

data = transpose(env_data)

// 2. Create a reactive dropdown input
viewof selected_station = Inputs.select(
  ["Station_A", "Station_B", "Station_C"], 
  {label: "Select Station:"}
)

// 3. Filter data based on user selection
filtered_data = data.filter(d => d.station === selected_station)

// 4. Render the plot using Observable Plot
Plot.plot({
  y: {
    grid: true,
    label: "Temperature (°C)"
  },
  x: {
    label: "Day of Observation"
  },
  marks: [
    Plot.line(filtered_data, {x: "day", y: "temperature", stroke: "steelblue", strokeWidth: 2}),
    Plot.dot(filtered_data, {x: "day", y: "temperature", fill: "black"})
  ]
})

Code

if (Sys.info()[[1]]=="Windows") {

    # For my Windows Environment
    system("systeminfo",intern = T)

    } else{

    system("lscpu; free -h",intern = T)



}

 [1] "Architecture:                            x86_64"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                     
 [2] "CPU op-mode(s):                          32-bit, 64-bit"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             
 [3] "Address sizes:                           48 bits physical, 48 bits virtual"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
 [4] "Byte Order:                              Little Endian"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                              
 [5] "CPU(s):                                  4"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
 [6] "On-line CPU(s) list:                     0-3"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
 [7] "Vendor ID:                               AuthenticAMD"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
 [8] "Model name:                              AMD EPYC 7763 64-Core Processor"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
 [9] "CPU family:                              25"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
[10] "Model:                                   1"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[11] "Thread(s) per core:                      2"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[12] "Core(s) per socket:                      2"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[13] "Socket(s):                               1"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[14] "Stepping:                                1"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[15] "BogoMIPS:                                4890.85"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
[16] "Flags:                                   fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm constant_tsc rep_good nopl tsc_reliable nonstop_tsc cpuid extd_apicid aperfmperf tsc_known_freq pni pclmulqdq ssse3 fma cx16 pcid sse4_1 sse4_2 movbe popcnt aes xsave avx f16c rdrand hypervisor lahf_lm cmp_legacy svm cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw topoext vmmcall fsgsbase bmi1 avx2 smep bmi2 erms invpcid rdseed adx smap clflushopt clwb sha_ni xsaveopt xsavec xgetbv1 xsaves user_shstk clzero xsaveerptr rdpru arat npt nrip_save tsc_scale vmcb_clean flushbyasid decodeassists pausefilter pfthreshold v_vmsave_vmload umip vaes vpclmulqdq rdpid fsrm"
[17] "Virtualization:                          AMD-V"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
[18] "Hypervisor vendor:                       Microsoft"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  
[19] "Virtualization type:                     full"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
[20] "L1d cache:                               64 KiB (2 instances)"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
[21] "L1i cache:                               64 KiB (2 instances)"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
[22] "L2 cache:                                1 MiB (2 instances)"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
[23] "L3 cache:                                32 MiB (1 instance)"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
[24] "NUMA node(s):                            1"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                          
[25] "NUMA node0 CPU(s):                       0-3"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                        
[26] "Vulnerability Gather data sampling:      Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[27] "Vulnerability Ghostwrite:                Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[28] "Vulnerability Indirect target selection: Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[29] "Vulnerability Itlb multihit:             Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[30] "Vulnerability L1tf:                      Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[31] "Vulnerability Mds:                       Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[32] "Vulnerability Meltdown:                  Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[33] "Vulnerability Mmio stale data:           Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[34] "Vulnerability Reg file data sampling:    Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[35] "Vulnerability Retbleed:                  Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[36] "Vulnerability Spec rstack overflow:      Vulnerable: Safe RET, no microcode"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                         
[37] "Vulnerability Spec store bypass:         Vulnerable"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                 
[38] "Vulnerability Spectre v1:                Mitigation; usercopy/swapgs barriers and __user pointer sanitization"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                       
[39] "Vulnerability Spectre v2:                Mitigation; Retpolines; STIBP disabled; RSB filling; PBRSB-eIBRS Not affected; BHI Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                            
[40] "Vulnerability Srbds:                     Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[41] "Vulnerability Tsa:                       Vulnerable: Clear CPU buffers attempted, no microcode"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                      
[42] "Vulnerability Tsx async abort:           Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[43] "Vulnerability Vmscape:                   Not affected"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                               
[44] "               total        used        free      shared  buff/cache   available"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
[45] "Mem:            15Gi       2.3Gi       2.8Gi        50Mi        10Gi        13Gi"                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                    
[46] "Swap:          3.0Gi          0B       3.0Gi"

Footnotes

This will automatically generate a numbered footnote at the bottom of the page.↩︎

--- title: "A Glimpse into Quarto" subtitle: "Reproducible & Unified Geospatial Research with Quarto" author: "Yonghun Suh" date: "Mar. 9, 2026" categories: [Code] image: https://rstudio.github.io/cheatsheets/html/images/quarto-illustration.png format: html: toc: true number_sections: true code-copy: true code-fold: show code-tools: true code-overflow: scroll code-link: true number-sections: true toc_depth: 3 lightbox: true execute: warning: false message: false # freeze: true #comments: false --- # Introduction ## The Standard for Next-Generation Geographic Data Science: What is Quarto? - **The Evolution of R Markdown:** R Markdown has expanded into a broader ecosystem. Developed by Posit, this system goes beyond R to encompass all aspects of scientific publishing. - **Integration of Tools:** It allows you to freely use essential geographic research languages like R (`sf`, `terra`) and Python (`GeoPandas`) within a single document, or simply stick to the language you are most comfortable with. - **An All-in-One Solution for Geographers:** You can complete the entire process—from data collection and spatial analysis to map visualization and final manuscript submission—within a single `.qmd` file. ## Implementing Reproducible Geography - **Unifying Analysis and Writing:** The era of exporting maps from GIS software and pasting them into Word is over. If the underlying data changes, all maps and tables in the report are synchronized and updated in real time. - **Reproducible & Transparent Research:** Because all code (from preprocessing to final modeling)is included in the document, fellow geographers can accurately verify and reproduce your research process. ## An Investment for the Future: Multi-Language Support and Collaboration - **Breaking Language Barriers:** You might be using R right now, but even if you transition to Python-based spatial models later, the Quarto syntax remains exactly the same. - [**Flawless Equation Rendering:**](#latex-math-mode) Equations essential for spatial statistics are rendered at a journal-quality level. Below is the formula for Moran's I, which measures spatial autocorrelation. $$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n}w_{ij}} \frac{\sum_{i=1}^{n}\sum_{j=1}^{n}w_{ij}(z_i-\bar{z})(z_j-\bar{z})}{\sum_{i=1}^{n}(z_i-\bar{z})^2}$$ - **The Geographer's Portfolio:** By linking with GitHub Pages, you can publish your own research blog or project page to the world with just a few clicks. ## `One-Source, Multi-Format`: Diverse Outputs from a Single Source - **From Reports to Papers:** After freely conducting your analysis and drafting in HTML, changing just one line in the settings converts the document into a **PDF manuscript** formatted for journal submission or an **MS Word** document for collaboration. - **Automated Presentations:** There is no need to create a separate PowerPoint file. You can use your existing code to generate web-based slides (Revealjs). At a geography conference, simply open your browser and you are ready to present. ::: callout-tip [Try changing `format: html` at the top of this document to `format: docx` or `format: pdf`](#sec-output-options), and then click Render again. The exact same code will instantly be converted into a Word document or a PDF paper. ::: ::: {.callout-important} ## PDF Rendering Requirement: TinyTeX To generate **PDF** documents, Quarto requires a LaTeX distribution. I strongly recommend installing **TinyTeX**, a lightweight and portable LaTeX distribution that works seamlessly with Quarto. You can install it by running the following command in your terminal: ```bash quarto install tinytex ``` ::: ## Why Start with HTML? (A Geographical Perspective) - **The Magic of Interactive Maps:** Provide an experience that paper maps cannot offer. You can embed maps directly into your report, allowing readers to toggle layers on and off, or click on features to view attribute information. - **Zero Entry Barrier:** There is no need to struggle with complex LaTeX installations. Simply click the `Render` button in RStudio, and a beautifully formatted result instantly appears in your web browser. - **Data Storytelling:** By organically arranging analysis results (maps, charts) and the analyst's interpretations, you guide the reader to actively explore the data and reach a conclusion. ## Interactive Mapping Demonstration The map below can be zoomed in and out. Click on a country to see detailed attribute data. ```{r} #| label: interactive-map #| fig-cap: "Global Life Expectancy Map" # Load necessary spatial libraries library(sf) library(tmap) # Set tmap mode to interactive 'view' for HTML output tmap_mode("view") # Load built-in World dataset for demonstration data(World) # Create an interactive map showing life expectancy tm_shape(World) + tm_polygons("life_exp", palette = "YlGnBu", title = "Life Expectancy", id = "name", popup.vars = c("pop_est", "gdp_cap_est")) ``` # Installing Quarto We'll assume that you have already installed [R](https://www.r-project.org/){target="_blank"} as well as [R studio](https://posit.co/download/rstudio-desktop/){target="_blank"}. ## Windows On Windows, the most efficient way to do this is using the **Windows Package Manager (`winget`)**. ### How to Run These Commands You can execute the installation script in one of two ways: 1. **Inside RStudio:** Click the **Terminal** tab (located right next to the **Console** tab) and paste the commands there. 2. **Windows Terminal:** Press **`Win + R`**, type `wt` (or `powershell`), and hit Enter to open a standalone terminal window. ### Installation Script (PowerShell) Copy and paste the following block into your terminal. Note that lines starting with `#` are comments; remove the `#` if you need to install those specific components. ```powershell # If your machine doesn't have R and its components... # Un-comment the lines below (remove the '#') to install them # winget install -e --id RProject.R # winget install -e --id RProject.Rtools # winget install -e --id Posit.RStudio # install Quarto winget install -e --id Posit.Quarto ``` --- ### Why use `winget`? * **Automation:** It handles the download and installation process automatically. * **Consistency:** It ensures everyone in the research group is using the same official versions. * **Maintenance:** You can update all your tools later by simply running `winget upgrade --all`. --- ::: {.callout-note} If you are working on a university-managed machine, you might need administrative privileges to run these commands. If `winget` is not recognized, ensure your Windows 10/11 is up to date. ::: ## Mac You can do the same on a Mac. ```zsh # Install Quarto brew install --cask quarto # Optional: Install R & RStudio # brew install --cask r # brew install --cask rstudio ``` ::: {.callout-note} On macOS, if you don't have Homebrew installed yet, visit [brew.sh](https://brew.sh/){target="_blank"}. first. On managed university machines, you may need administrative privileges to run these commands. ::: ## Linux Variants If Linux is your primary OS, I’m sure you already know how to install it without my help. # Basic Usage ## Anatomy of a Quarto Document (.qmd): 3 Core Elements This document is composed of the following three elements. Click the `</> Code` button at the top right to see the underlying source. 1. **YAML Header:** The 'brain' of the document. It acts as a blueprint, defining the title, author, and the desired output format (HTML, PDF, etc.). 2. **Main** * **Markdown Text:** The logic of your research. This is the space where you write your geographical hypotheses and interpret the spatial analysis results. * **Code Chunks:** The 'engine' of the document. This is where the actual computation happens—loading spatial data, running statistical models, and rendering maps. ### YAML Header ```yaml --- title: "A Glimpse into Quarto" subtitle: "Reproducible & Unified Geospatial Research with Quarto" author: "Yonghun Suh" date: "Mar. 9, 2026" categories: [Code] image: https://rstudio.github.io/cheatsheets/html/images/quarto-illustration.png format: html: number_sections: true code-copy: true code-fold: show code-tools: true code-overflow: scroll code-link: true number-sections: true toc: true toc_depth: 3 lightbox: true # Excute options execute: warning: false message: false --- ``` #### Shared Components {-} ```yaml --- title: "A Glimpse into Quarto" subtitle: "Reproducible & Unified Geospatial Research with Quarto" author: "Yonghun Suh" date: "Mar. 9, 2026" . . . --- ``` #### Output Format Options {- #sec-output-options} ::: {.panel-tabset} ##### HTML {-} ```yaml --- . . . format: html: number_sections: true code-copy: true code-fold: show code-tools: true code-overflow: scroll code-link: true number-sections: true toc: true toc_depth: 3 lightbox: true --- ``` For more information, please refer to [HTML Options](https://quarto.org/docs/reference/formats/html.html){target="_blank"}. ##### PDF {-} ```yaml --- . . . format: pdf: toc: true number-sections: true colorlinks: true --- ``` For more information, please refer to [PDF Options](https://quarto.org/docs/reference/formats/pdf.html){target="_blank"}. ##### MS Words {-} ```yaml --- . . . format: docx: toc: true number-sections: true highlight-style: github --- ``` For more information, please refer to [MS Words Options](https://quarto.org/docs/reference/formats/docx.html){target="_blank"}. ##### Other Formats {-} Please refer to [the official website](https://quarto.org/docs/output-formats/all-formats.html){target="_blank"}. ::: ### Main #### Markdown Text {- #sec-markdown-text} Markdown is the foundational layer where you write the narrative and logic of your research. It is designed to be highly readable as raw text while compiling into beautifully formatted outputs. ##### **1). Basic Markdown Syntax** {-} You can quickly format your text without using a mouse or complex menus. ```md * **Bold text** for emphasis. * *Italic text* for variables or terms. * `Inline code` for package names like `sf` or `terra`. 1. Ordered list for methodologies. 2. Second step in the workflow. ``` ##### **2). LaTeX Math Mode** {- #latex-math-mode} Modeling in the geographic data science requires rigorous spatial statistics. Quarto natively supports [LaTeX math](https://www.overleaf.com/learn/latex/Mathematical_expressions){target="_blank"} rendering, meaning you don't need external equation editors. ```md * **Inline math:** Use single dollar signs. For example, the distance threshold is $d_{ij} \le 50$. ``` * **Inline math:** Use single dollar signs. For example, the distance threshold is $d_{ij} \le 50$. * **Display math:** Use double dollar signs for standalone equations. Here is the formula for a basic spatial lag model: ```md $$ y = \rho W y + X \beta + \varepsilon $$ ``` $$ y = \rho W y + X \beta + \varepsilon $$ ##### **3). Inserting Media (Images and Links)** {-} Adding external resources and local images is straightforward. ```md * Inserting a web link * [Official Quarto Documentation](https://quarto.org/) * Inserting a local image with a caption ![\nTemporal Changes of Our Study Area: Aral Sea](data/Aral_Landsat.mp4) ``` * Inserting a web link * [Official Quarto Documentation](https://quarto.org/) * Inserting a local image with a caption ![\nTemporal Changes of Our Study Area: Aral Sea](data/Aral_Landsat.mp4) ##### **4). Article Layout (Breaking the Boundaries)** {-} Standard document margins are often too narrow for detailed geographic maps or large correlation matrices. Quarto's **Article Layout** allows you to break out of the central body text area. You can use `Divs` (`:::`) to expand static content, such as images or text blocks: ```md ::: {.column-page} ![This wide map will span across the normal margins to show more detail.](https://rstudio.github.io/cheatsheets/html/images/quarto-illustration.png) ::: ``` ::: {.column-page} ![This wide figure will span across the normal margins to show more detail.](https://rstudio.github.io/cheatsheets/html/images/quarto-illustration.png) ::: ```md ::: {.column-margin} This text will appear in the right margin, perfect for side notes or minor citations without interrupting the main text flow. ::: ``` ::: {.column-margin} This text will appear in the right margin, perfect for side notes or minor citations without interrupting the main text flow. ::: > For executing code chunks, you can simply use `#| column: page` or `#| column: screen` inside the chunk options to make your interactive maps span the entire browser width. Read more about [Article Layout](https://quarto.org/docs/authoring/article-layout.html){target="_blank"}. ```{r} #| label: interactive-map-layout #| fig-cap: "Global Life Expectancy Map" #| column: page # Load necessary spatial libraries library(sf) library(tmap) # Set tmap mode to interactive 'view' for HTML output tmap_mode("view") # Load built-in World dataset for demonstration data(World) # Create an interactive map showing life expectancy tm_shape(World) + tm_polygons("life_exp", palette = "YlGnBu", title = "Life Expectancy", id = "name", popup.vars = c("pop_est", "gdp_cap_est")) ``` ##### **5). Etc: Callouts and Footnotes** {-} Academic writing often requires footnotes and highlighted notes for readers. ```md Here is a statement that needs a reference.^[This will automatically generate a numbered footnote at the bottom of the page.] ``` Here is a statement that needs a reference.^[This will automatically generate a numbered footnote at the bottom of the page.] ```md ::: {.callout-important} ## Attention Always check your Coordinate Reference System (CRS) before performing spatial joins! ::: ``` ::: {.callout-important} ## Attention Always check your Coordinate Reference System (CRS) before performing spatial joins! ::: #### Code Chunks {- #sec-code-chunks} Code chunks are the computational engine of your Quarto document. Instead of keeping your analysis script and your manuscript separate, you can weave them together seamlessly. ##### **1. Chunk Options (The Hash-Pipe Syntax)** {-} In Quarto, the standard and most robust way to pass options to a code chunk is using the **hash-pipe (`#|`)** syntax inside the chunk, rather than inside the curly braces. This keeps the code clean and easy to read. | Option | Value | Description | | :--- | :--- | :--- | | `#| echo:` | `true` / `false` | Shows or hides the source code in the output. | | `#| eval:` | `true` / `false` | Determines whether to run the code. Useful for showing syntax only. | | `#| warning:` | `true` / `false` | Suppresses warnings (e.g., from package updates or `NA` values). | | `#| cache:` | `true` / `false` | Saves the output. **Critical for heavy parallel processing or large spatial data models.** | ##### **2. Language Support Options** {-} Quarto is fundamentally language-agnostic. While it natively uses Knitr for R and Jupyter for Python, it supports a wide array of languages. * **Data Science Core:** `R` (Native), `Python` (Native, `reticulate`), `Julia` (Native, or `JuliaCall`) * **High-Performance Computing:** `C++` (via Rcpp or Python bindings), Fortran (`dyn.load()`) * **System & Pipeline:** `Bash`, `PowerShell` You can seamlessly pass data between Python and R within the same document using the `reticulate` package, allowing you to use the best tool for each specific algorithmic step. --- ##### **Example: Analyzing Complexity Factors** {-} ::: callout-tip ## What is Time Complexity? {-} In theoretical computer science, the time complexity is the computational complexity that describes the amount of computer time it takes to run an algorithm. In the plot below, $n$ represents the size of the input, and $N$ represents the number of operations the algorithm performs. This relationship is a critical factor in defining the algorithm's performance, as the efficiency of the algorithm can be assessed by how $N$ changes as the input size $n$ increases. [Learn more](https://en.wikipedia.org/wiki/Time_complexity){target="_blank"} ::: ```{r, warning=FALSE} #| fig-cap: "Compirison of Computational Complexity" #| fig-subcap: "It is good to know if the algorithm can handle my problem well!" #| fig-alt: "A line plot of Computational Complexity" library(data.table) library(ggplot2) # Seq `n` gen n <- seq(0, 100, by = 0.01) # need to make the increment small in order to avoid `Inf`s # data.table df <- data.table( n = n, `O(1)` = 1, `O(log n)` = log2(n), `O(n)` = n, `O(n log n)` = n * log2(n), `O(n^2)` = n^2, `O(2^n)` = 2^n, `O(n!)` = factorial(n) ) df_long <- data.table::melt(df, id.vars = "n", variable.name = "Complexity", value.name = "Time") ggplot(df_long, aes(x = n, y = Time, color = Complexity)) + geom_line(size = 1.2) + ylim(1, 100) + xlim(1, 100)+ labs( #title = "Compirison of Computational Complexity", x = "Input Size (n)", y = "Number of Operations (N)", color = "Complexity" ) + theme_minimal() + theme( plot.title = element_text(size = 16, face = "bold"), legend.title = element_text(size = 12), legend.text = element_text(size = 10) ) ``` # Appendix ## Using Other Languages in Quarto `Quarto` is a *language-agnostic publishing system* designed to integrate multiple computational engines. By using `{python}` blocks, researchers can execute `Python` code natively alongside R and Markdown. Technical Mechanisms: * Execution Engines: Depending on your setup, `Quarto` utilizes the Jupyter engine for `Python`-centric projects or the `Knitr` engine (via the `reticulate` package) for hybrid `R`-`Python` workflows. * **Interoperability**: In a hybrid environment, data frames and variables can be shared between `R` and `Python` sessions, allowing you to use the best tools from both ecosystems (e.g., `Python` for machine learning and `R` for spatial statistics). * Reproducibility: Consolidating multiple languages into **a single** `.qmd` file ensures that the entire analytical pipeline is documented and reproducible, regardless of the underlying programming language. 1. `R` ```{r} print_a <- function(a) { a <- 1 print(paste("Inside function:", a)) } ``` 2. `Python` ```{python} def print_a(a): a = 1 print(f"Inside function: {a}") ``` 3. `Julia` ```{julia} function print_a(a) println("Inside function: $a") end ``` ```{fortran} subroutine printA(A) implicit none integer, intent(in) :: A print*, "The value is:", A end subroutine printA ``` ### Reticulate (Python) ```{r} library(reticulate) #' Configure Python environment #' Windows: Check for existing environment, create if missing, then bind. env_name <- "quarto_demo" py_ver <- "3.11" py_lib <- c("scikit-learn", "pandas", "numpy", "scipy") if (Sys.info()[["sysname"]] == "Windows") { # 1. Check if the environment exists in the conda list # We use name matching for robustness existing_envs <- conda_list()$name if (!(env_name %in% existing_envs)) { message("Environment '", env_name, "' not found. Creating environment...") # Create environment with specified python version conda_create(env_name, python_version = py_ver) # Install required packages after creation conda_install(env_name, packages = py_lib) } else { message("Environment '", env_name, "' already exists. Skipping creation.") } # 2. Bind to the environment # use_condaenv requires the environment name (string), which is robust for Windows use_condaenv(env_name, required = TRUE) } else { #' Linux (CI/CD): Setup baseline using micromamba (due to its ephemeral traits). # GitHub Actions / Linux Environment: Always fresh setup message("Linux/CI environment detected. Setting up baseline via micromamba...") py_lib_str <- paste(py_lib, collapse = " ") # Use 'micromamba create' instead of 'install' for a clean setup # -n: environment name # -c: channel specification # -y: automatic yes to prompts mamba_cmd <- sprintf("micromamba create -n %s -c conda-forge %s python=%s -y", env_name, py_lib_str, py_ver) system(mamba_cmd) linux_env_path <- sprintf("/home/runner/micromamba/envs/%s/bin/python", env_name) use_condaenv(linux_env_path, required = TRUE) } # Verification of the loaded Python configuration # print(py_config()) sys <- import("sys") sys$executable ``` #### **Compairing Mathematical Foundations: OLS & MLE** {-} In linear regression, we model the relationship between a dependent variable and independent variables as: $$y = X\beta + \epsilon$$ Where the dimensions (shapes) are: * $y \in \mathbb{R}^{N \times 1}$: Dependent variable vector. * $X \in \mathbb{R}^{N \times K}$: Design matrix (including a column of 1s for the intercept). * $\beta \in \mathbb{R}^{K \times 1}$: Parameter (coefficient) vector. * $\epsilon \in \mathbb{R}^{N \times 1}$: Error (residual) vector. ::: {.panel-tabset} #### OLS Perspective: Minimizing the SSR The Ordinary Least Squares (OLS) method seeks to find $\beta$ that minimizes the Sum of Squared Residuals ($SSR$): $$SSR(\beta) = \epsilon^T \epsilon = (y - X\beta)^T (y - X\beta)$$ ##### **Matrix Expansion Steps** {-} To derive the optimal $\beta$, we expand the expression using matrix algebra properties using following steps. 1. Distribution of Transpose Using the rules $(A - B)^T = A^T - B^T$ and $(AB)^T = B^T A^T$: $$(y - X\beta)^T = y^T - (X\beta)^T = y^T - \beta^T X^T$$ 2. FOIL Expansion Now we multiply $(y^T - \beta^T X^T)$ by $(y - X\beta)$: $$SSR(\beta) = y^T y - y^T X \beta - \beta^T X^T y + \beta^T X^T X \beta$$ 3. The Scalar Property (Merging Terms) Note that $y^T X \beta$ is a scalar ($1 \times 1$). Any scalar is equal to its own transpose ($a = a^T$): $$(y^T X \beta)^T = \beta^T X^T (y^T)^T = \beta^T X^T y$$ Since they are identical values, we can combine the middle terms: $$SSR(\beta) = y^T y - 2\beta^T X^T y + \beta^T X^T X \beta$$ ##### **The Normal Equation** {-} Taking the partial derivative with respect to $\beta$: $$\frac{\partial SSR}{\partial \beta} = -2X^T y + 2X^T X \beta = 0$$ Solving for $\beta$ gives the Normal Equation: $$\hat{\beta}_{OLS} = (X^T X)^{-1} X^T y$$ #### MLE Perspective: Maximum Likelihood Maximum Likelihood Estimation (MLE) assumes the errors follow a normal distribution: $\epsilon \sim \mathcal{N}(0, \sigma^2 I)$. The Log-Likelihood function is: $$\mathcal{L}(\beta, \sigma^2) = -\frac{N}{2}\ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{i=1}^{N} (y_i - X_i\beta)^2$$ In MLE, maximizing $\mathcal{L}$ with respect to $\beta$ is identical to minimizing the summation term $\sum (y_i - X_i\beta)^2$, which is exactly the $SSR$ used in OLS. Thus, under *Gaussian* assumptions, **OLS and MLE are mathematically equivalent**. ::: #### Implementation: R and Python Interoperability We use the 'meuse' soil dataset from the sp package to verify this equivalence. ##### Data Preparation (R) We use the 'meuse' soil dataset. We will predict Zinc concentration ($y$) using Copper concentration ($x$). ```{r} #| label: data-preparation library(sp) library(reticulate) # Load Meuse dataset data(meuse) # Prepare Design Matrix X (adding an intercept column of 1s) and Target Vector y y_vec <- as.matrix(meuse$zinc) x_mat <- as.matrix(cbind(1, meuse$copper)) cat("Number of observations (N):", nrow(x_mat), "\n") ``` ##### Matrix Algebra & Optimization (Python) We will perform both OLS (via linear algebra) and MLE (via numerical optimization) explicitly in Python to demonstrate their equivalence. We use NumPy for the direct linear algebra solution (OLS) and SciPy for numerical optimization (MLE). ```{python} #| label: python-ols-mle import numpy as np from scipy.optimize import minimize # 1. Retrieve variables from R environment X = r.x_mat y = r.y_vec N = X.shape[0] # ========================================================= # Step 1: Ordinary Least Squares (OLS) via Linear Algebra # ========================================================= def compute_ols(X_matrix: np.ndarray, y_vector: np.ndarray) -> np.ndarray: """ Compute OLS coefficients using the Normal Equation. Args: X_matrix (np.ndarray): Design matrix. y_vector (np.ndarray): Target vector. Returns: np.ndarray: Beta coefficients. """ xtx = X_matrix.T @ X_matrix xty = X_matrix.T @ y_vector return np.linalg.solve(xtx, xty) beta_ols = compute_ols(X, y) print("--- 1. OLS Results (NumPy Linear Algebra) ---") print(f"Intercept: {beta_ols[0][0]:.4f}") print(f"Slope (Copper): {beta_ols[1][0]:.4f}\n") # ========================================================= # Step 2: Maximum Likelihood Estimation (MLE) via Optimization # ========================================================= def negative_log_likelihood(params: np.ndarray) -> float: """ Calculate the Negative Log-Likelihood for a Gaussian regression model. Args: params (np.ndarray): Array containing [intercept, slope, variance]. Returns: float: The NLL value to be minimized. """ beta = params[0:2].reshape(-1, 1) variance = params[2] # Mathematical constraint: Variance must be strictly positive if variance <= 0: return np.inf # Calculate Sum of Squared Residuals (SSR) residuals = y - (X @ beta) ssr = np.sum(residuals ** 2) # Calculate NLL formula nll = (N / 2) * np.log(2 * np.pi * variance) + (ssr / (2 * variance)) return float(nll) # Initial guess for the optimizer: [intercept=0, slope=0, variance=10] initial_guess = np.array([0.0, 0.0, 10.0]) # Run L-BFGS-B optimization algorithm to find the MLE parameters mle_result = minimize(negative_log_likelihood, initial_guess, method='L-BFGS-B') beta_mle = mle_result.x[0:2] variance_mle = mle_result.x[2] print("--- 2. MLE Results (SciPy Optimization) ---") print(f"Intercept: {beta_mle[0]:.4f}") print(f"Slope (Copper): {beta_mle[1]:.4f}") print(f"Estimated Variance: {variance_mle:.4f}\n") # ========================================================= # Step 3: Verify Mathematical Equivalence # ========================================================= diff = np.abs(beta_ols.flatten() - beta_mle) print("--- 3. Equivalence Check (Difference |OLS - MLE|) ---") print(f"Difference in Intercept: {diff[0]:.2e}") print(f"Difference in Slope: {diff[1]:.2e}") ``` ##### Verification with R's Native Function (R) Finally, we verify that our explicit matrix operations match R's default statistical engine. ```{r} #| label: r-verification # Fit OLS using R's built-in linear model function r_lm <- lm(zinc ~ copper, data = meuse) cat("--- R's lm() Coefficients ---\n") print(coef(r_lm)) ``` ::: {.callout-note} ## Why OLS and MLE results differ: Numerical Approximation Error While OLS and MLE are theoretically identical for linear regression with normally distributed errors, you may observe slight discrepancies in your output (e.g., ~0.006 in slope). This is called **Approximation Error**. **Exact vs. Iterative Solutions** - **OLS (Analytical):** Uses a "Closed-form Solution" ($\hat{\beta} = (X^TX)^{-1}X^Ty$). It calculates the exact mathematical global minimum in a single step using matrix algebra. - **MLE (Numerical):** Uses "Numerical Optimization" (e.g., BFGS algorithm in `scipy.optimize`). It "hunts" for the maximum likelihood by taking iterative steps until it reaches a pre-defined threshold. ::: ------------------------------------------------------------------------ #### **DataFrame Operations: Spatial Point Clustering (R $\rightarrow$ Python $\rightarrow$ R)** {-} This is a pipeline that transfers seismic event location data (`quakes`) from `R` to `Python` to perform Density-Based Spatial Clustering of Applications with Noise (DBSCAN), then brings the results back to `R` for final visualization. **Objective Evaluation:** In spatial pattern analysis, `scikit-learn`'s DBSCAN implementation is highly optimized at the C/C-Python level, offering significantly faster processing speeds for point clouds with hundreds of thousands of points compared to standard R packages. This architecture establishes an ideal division of labor: Python handles the heavy computational load, while R’s `ggplot2` produces high-quality visualizations suitable for publication. ```{r} #| label: load-spatial-points # Load built-in Fiji earthquakes dataset data(quakes) # Select spatial coordinates and magnitude spatial_df <- quakes[, c("lat", "long", "mag")] head(spatial_df, 3) ``` ```{python} #| label: python-clustering import pandas as pd from sklearn.cluster import DBSCAN def perform_spatial_clustering(df: pd.DataFrame, eps: float = 2.0, min_samples: int = 15) -> np.ndarray: """ Apply DBSCAN clustering algorithm to spatial coordinates. Args: df (pd.DataFrame): DataFrame containing 'lat' and 'long' columns. eps (float): The maximum distance between two samples for one to be considered as in the neighborhood of the other. min_samples (int): The number of samples in a neighborhood for a point to be considered as a core point. Returns: np.ndarray: Array of cluster labels (-1 indicates noise). """ # Extract coordinates as a NumPy array coords = df[['lat', 'long']].values # Initialize and fit DBSCAN clustering_model = DBSCAN(eps=eps, min_samples=min_samples) labels = clustering_model.fit_predict(coords) return labels # r.spatial_df is automatically converted to a Pandas DataFrame cluster_labels = perform_spatial_clustering(r.spatial_df) ``` ```{r} #| label: plot-clusters library(ggplot2) # Bind the Python-generated labels back to the R dataframe spatial_df$cluster <- as.factor(py$cluster_labels) # Generate publication-ready plot ggplot(spatial_df, aes(x = long, y = lat, color = cluster, size = mag)) + geom_point(alpha = 0.7) + scale_color_viridis_d(option = "plasma", name = "Cluster\n(-1 = Noise)") + theme_minimal() + labs( title = "Spatial Clustering of Fiji Earthquakes", subtitle = "DBSCAN computed in Python, Plotted in R", x = "Longitude", y = "Latitude" ) ``` ### Observable JS (Client-side Interactivity) While R and Python are excellent for heavy data processing, **Observable JavaScript (OJS)** is optimized for building real-time, interactive visualizations directly in the user's web browser. Quarto bridges these two environments using the `ojs_define()` function. The standard workflow is: 1. Process large datasets in R during the build phase. 2. Export the aggregated data to OJS. 3. Render interactive plots in the browser without needing an active R server. #### Step A: Data Processing in R {-} ```{r} #| message: false #| warning: false library(data.table) # 1. Generate sample environmental data (e.g., temperature across stations) set.seed(42) dt_env <- data.table( station = rep(c("Station_A", "Station_B", "Station_C"), each = 50), day = rep(1:50, 3), temperature = c(rnorm(50, 15, 2), rnorm(50, 18, 3), rnorm(50, 12, 1.5)) ) # 2. Export the data to the Observable JS environment ojs_define(env_data = dt_env) ``` #### Step B: Interactive Plotting in Observable JS {-} In the following `ojs` chunk, we take the data processed by R and create an interactive plot. Try changing the dropdown menu to filter the stations in real-time. ```{ojs} //| echo: true //| code-fold: false // 1. Transpose the data (R data frames become column-major JSON arrays, OJS needs row-major) data = transpose(env_data) // 2. Create a reactive dropdown input viewof selected_station = Inputs.select( ["Station_A", "Station_B", "Station_C"], {label: "Select Station:"} ) // 3. Filter data based on user selection filtered_data = data.filter(d => d.station === selected_station) // 4. Render the plot using Observable Plot Plot.plot({ y: { grid: true, label: "Temperature (°C)" }, x: { label: "Day of Observation" }, marks: [ Plot.line(filtered_data, {x: "day", y: "temperature", stroke: "steelblue", strokeWidth: 2}), Plot.dot(filtered_data, {x: "day", y: "temperature", fill: "black"}) ] }) ``` ```{r} if (Sys.info()[[1]]=="Windows") { # For my Windows Environment system("systeminfo",intern = T) } else{ system("lscpu; free -h",intern = T) } ```

1 Introduction

1.1 The Standard for Next-Generation Geographic Data Science: What is Quarto?

1.2 Implementing Reproducible Geography

1.3 An Investment for the Future: Multi-Language Support and Collaboration

1.4 One-Source, Multi-Format: Diverse Outputs from a Single Source

1.5 Why Start with HTML? (A Geographical Perspective)

1.6 Interactive Mapping Demonstration

2 Installing Quarto

2.1 Windows

2.1.1 How to Run These Commands

2.1.2 Installation Script (PowerShell)

2.1.3 Why use winget?

2.2 Mac

2.3 Linux Variants

3 Basic Usage

3.1 Anatomy of a Quarto Document (.qmd): 3 Core Elements

3.1.1 YAML Header

Shared Components

Output Format Options

3.1.2 Main

Markdown Text

1). Basic Markdown Syntax

2). LaTeX Math Mode

3). Inserting Media (Images and Links)

4). Article Layout (Breaking the Boundaries)

5). Etc: Callouts and Footnotes

Code Chunks

1. Chunk Options (The Hash-Pipe Syntax)

2. Language Support Options

Example: Analyzing Complexity Factors

4 Appendix

4.1 Using Other Languages in Quarto

4.1.1 Reticulate (Python)

Compairing Mathematical Foundations: OLS & MLE

Matrix Expansion Steps

The Normal Equation

4.1.1.1 Implementation: R and Python Interoperability

4.1.1.1.1 Data Preparation (R)

4.1.1.1.2 Matrix Algebra & Optimization (Python)

4.1.1.1.3 Verification with R’s Native Function (R)

DataFrame Operations: Spatial Point Clustering (R \(\rightarrow\) Python \(\rightarrow\) R)

4.1.2 Observable JS (Client-side Interactivity)

Step A: Data Processing in R

Step B: Interactive Plotting in Observable JS

Footnotes

1.4 `One-Source, Multi-Format`: Diverse Outputs from a Single Source

2.1.3 Why use `winget`?