14  Working with Agents

Published

June 18, 2026

Keywords

LLM, Agents, Agent Frameworks, Coding Assistants, Claude Code, Software Developmenet Kit, SDK, Ethics, Responsible Use of Agents

14.1 Introduction to Working with Agents

14.1.1 Learning Outcomes

By the end of this chapter, you should be able to:

  • Explain the difference between an AI assistant and an agent framework.
  • Use an LLM to explore and analyze a larger real-world dataset.
  • Describe how project context influences assistant and agent behavior.
  • Explain the roles of project instructions, skills, subagents, and retrieval.
  • Recognize the benefits and risks of vibe coding.
  • Apply responsible software engineering practices when working with AI agents.

14.1.2 References

Additional References

NoteA Note on the Development of This Chapter

This chapter was developed with the assistance of both ChatGPT and Claude.

  • These tools were used throughout to explore ideas, generate initial code, and draft prose.
  • Through extended iterative conversations the material was revised, restructured, removed, and adjusted as I tuned it to align with my pedagogical goals for this course and serve as a reference for others.
  • All material was verified against primary references. The framing, sequencing, and editorial judgments are my own.

The contents of this chapter, their clarity, accuracy, and relevance, are my responsibility. Any errors are mine alone.

14.2 From Workflows to Agentic Assistants and Frameworks

In the previous chapter, we moved from single prompts to functions, workflows, agents, and agent frameworks. Along the way, we explored how tools, memory, context engineering, and orchestration allow large language models (LLMs) to solve increasingly complex tasks.

  • The examples in that chapter intentionally used small datasets such as mtcars and penguins so that we could focus on the underlying concepts without being distracted by data collection and preparation.

This chapter shifts from building agents to working with existing AI assistants and agent frameworks.

  • Rather than creating agent infrastructure from scratch, we will examine how modern tools help data scientists analyze larger projects, interact with real datasets, and automate portions of their workflow.

The central question for this chapter is:

How can modern AI assistants help us work with data science projects that have become too large or complex to manage manually?

Having explored the first four levels in Table 14.1, we now focus on the Agent Framework layer.

Table 14.1: Comparison of prompts, workflows, agents, and agent frameworks.
Level Description
Prompt A single interaction with an LLM.
Function A reusable prompt wrapped in code.
Workflow A sequence of steps controlled by software.
Agent An LLM that selects among available tools.
Agent Framework Infrastructure that manages tools, memory, evaluation, and orchestration.

The past year has seen a tremendous shift in how data scientists (and many other professionals) work. Rather than building agent infrastructure from scratch, many practitioners now use agentic assistants and frameworks that already provide tools, memory, evaluation, and orchestration capabilities.

These systems do not all operate in the same way. They differ primarily in:

  • How much context they can access.
  • How much autonomy they are given.
  • Whether they can use tools.
  • Whether they can modify project files.
  • Whether they are designed to help a user or to help build new agentic systems.

One useful way to think about these tools is as a spectrum of increasing capability.

Chat Assistant -> Project-Aware Assistant -> Repository Agent -> Agent Framework

  • At one end of the spectrum are chat assistants. These systems primarily interact through a conversational interface. They can answer questions, explain concepts, analyze uploaded files, and generate ideas, but they have little or no direct access to a project.
  • The next step is the project-aware assistant. These assistants are integrated into development environments and can access the current project, repository, and files being edited. They can help generate code, explain functions, suggest improvements, and assist with debugging. However, they generally require user approval before making substantial changes.
  • A further step is the repository agent. These systems can inspect an entire repository, reason across multiple files, execute commands, modify code, run tests, and carry out multi-step development tasks. Rather than simply responding to questions, they can perform actions on behalf of the user.
  • Finally, agent frameworks provide the infrastructure used to build or customize agentic systems. Rather than serving primarily as assistants, they allow developers to define tools, memory, workflows, evaluation procedures, and orchestration logic for new applications.

As we move from left to right along this spectrum, systems generally gain greater access to project context and a greater ability to take action on behalf of the user.

  • At the same time, the need for oversight, testing, version control, and responsible use also increases.

Table 14.2 shows just some of the many existing assistants and frameworks available today - with more on the way.

Table 14.2: Examples of assistants and frameworks across the spectrum of agentic capabilities.
Category Example Project Access Can Modify Files Can Execute Tools Typical Use
Chat Assistant Claude.ai Uploaded files and conversations No Limited Exploration, explanation, document review, and analysis
Repository Agent Claude Code Full repository Yes Yes Project-wide development and automation
Repository Agent Codex Full repository Yes Yes Multi-step coding and development tasks
IDE Project Assistant/Agent Posit Assistant Current project and repository With approval Limited Interactive coding and analysis
IDE Project-Aware Assistant/Agent GitHub Copilot Current project and repository With approval Limited Code generation, completion, and debugging
Agent Framework OpenAI Agents SDK Developer-defined Developer-defined Developer-defined Building custom agents and workflows

These tools differ in their capabilities, but all seek to reduce the amount of manual effort required to complete complex tasks.

Because the tools in Table 14.2 provide different levels of context and autonomy, they are best suited to different types of tasks.

We will explore some of these capabilities in the following sequence:

  • Claude.ai, a chat-based assistant that requires files and information to be shared explicitly.
  • Claude Code, a repository agent that operates in the terminal next outside your IDE
  • Posit Assistant a project-aware assistants that operates directly within development environments,
  • OpenAI Agents Software Development Kit, a framework used to build custom agentic systems.
Important 14.1: Increasing Autonomy Changes the Role of the Developer

As assistants gain access to more project context and more ability to take action, the role of the data scientist/developer also changes.

  • Historically, developers spent most of their time writing code.
    • Today, many developers spend a growing portion of their time reviewing AI-generated changes, evaluating proposed solutions, testing modifications, and managing requirements.
  • Interacting with an agent increasingly resembles supervising a junior team member.
    • The agent may generate code, propose architectural changes, or complete multi-step tasks, but a human remains responsible for evaluating the results.

This is one reason why version control systems such as Git and GitHub remain important.

  • As agents become more capable, developers increasingly review commits and pull requests, inspect diffs, and test agent generated code prior to approving changes to a project branch.

To meet the challenges of the rapidly evolving roles of data scientists and developers, it is important not only to understand what good code looks like and how to write it, but also how to review, evaluate, and guide code and analyses produced by increasingly capable AI systems.

That requires understanding:

  • What these tools do well.
  • Where their limitations appear and how to detect problems.
  • How they fit into a modern data science workflow.
  • How to use them responsibly.

14.2.1 The Scenario of a Large Analytical Problem

The World Bank exercise introduced a workflow for collecting and analyzing demographic and economic indicators. The resulting dataset contained a few hundred observations and was small enough to explore manually.

For this chapter, we will work with a substantially larger collection of World Bank data.

The dataset contains:

  • 40 European countries.
  • Multiple decades of observations.
  • More than 100 indicators from the World Bank World Development Indicator API.
  • Hundreds of thousands of observations stored as a Parquet dataset, partitioned by country.

This creates a realistic analytical challenge.

As datasets become larger, it becomes more difficult to:

  • Discover relevant variables.
  • Identify meaningful relationships.
  • Compare countries and regions.
  • Generate and evaluate hypotheses.
  • Summarize findings efficiently.

AI assistants can help address many of these challenges.

Note 14.2: AI Assistants as Analytical Partners

Modern assistants are often most valuable when used as analytical partners rather than code generators.

  • They can help identify patterns, suggest analyses, critique conclusions, summarize findings, and explain results.
  • However, responsibility for evaluating those suggestions remains with the analyst.

14.2.2 The World Bank World Development Indicators Dataset

14.2.2.1 Creating the Dataset

Unlike the earlier World Bank exercise, the data have already been downloaded, cleaned, and stored as Parquet files.

  • The goal of this chapter is not data acquisition; we want to focus on how assistants and agent systems can help us work with larger analytical projects.

The dataset was created using two scripts (available in the data_raw folder) that use code similar to Section 12.9.4

  1. A data acquisition script (01_get_worldbank_data_flex.R)
    • Reads country and indicator specifications from an Excel workbook.
    • Downloads data from the World Bank API.
    • Batches countries and indicators to remain within API limits.
    • Retrieves multiple pages of results.
    • Stores the raw data as a partitioned Parquet dataset.
  2. A data preparation script (02_clean_worldbank_data_flex.R)
    • Reads country and indicator metadata.
    • Cleans and standardizes the data.
    • Adds curated metadata.
    • Generates long and wide versions of the dataset.
    • Stores the cleaned datasets as partitioned Parquet datasets.

The resulting datasets contain observations for multiple European countries across multiple decades and indicators stored in two forms:

  • A long format dataset containing one observation per country, year, and indicator.
  • A wide format dataset containing one row per country-year combination and one column per indicator.
Note

Interestingly, the wide Parquet files are approximately 25% larger than the corresponding long-format files.

  • While wide datasets contain fewer rows, they often contain many more columns and missing values.
  • In columnar storage formats such as Parquet, long datasets frequently compress more efficiently because repeated values (such as indicator codes and country identifiers) can be stored using highly compact encoding schemes.

This observation reinforces a broader lesson about data organization, shown in Table 14.3.

Table 14.3: Common strengths of long and wide data formats.
Format Strengths
Long Storage, aggregation, visualization, efficient compression
Wide Modeling, correlation analysis, machine learning

We often create multiple representations of the same data because different tasks require different structures.

That idea applies not only to data, but also to AI systems:

  • Long vs. wide datasets
  • Raw vs. cleaned data
  • Metadata vs. observations
  • General-purpose agents vs. specialized subagents

Each representation is optimized for a particular task.

14.2.2.2 Data Engineering Practices Enable Metadata as Context

The Excel workbook used to create the datasets contains more than just country and indicator codes.

It’s the output of a data engineering effort to create an authoritative human and machine readable source of information for the data outside the code:

  • Curated, canonical human-readable variable names for indicators.
  • Measurement units.
  • Recommended display precision.
  • Curated indicator themes or categories.
  • Reasons for including each variable.
  • A canonical name for each country
  • An ASCII name field for each country

This information serves as a small data dictionary for the project that is under version control and managed by the analyst and/or the data engineering team.

From a data engineering perspective,

  • Separating the specifications from the code allows much more flexibility in maintaining the specifications.
NoteMetadata Changes Over Time

As an example, Türkiye requested that international organizations adopt “Türkiye” rather than “Turkey” as the preferred English name.

  • Because country names are stored in a centralized metadata file, the change can be made in one location without modifying analytical code throughout the project.

Having multiple names and identifiers is also a best practice for data engineering as long as one field is a standard suitable for joins.

  • The country codes are an international standard and fairy stable so one can join on country_code_3 such as TUR, ALB, GRC.
  • Keep a curated country_name for display to humans.
  • Keep a country_name_ascii field for cases where software, URLs, or joins from older sources strip diacritical marks or you get a dataset that only has names and no codes.

14.2.2.3 From an Agent Perspective, Metadata Is Valuable Context

Modern AI assistants often perform better when they have access to metadata in addition to raw data.

For example, the indicator code SP.DYN.TFRT.IN is much more meaningful when accompanied by metadata indicating that it represents:

  • Fertility rate.
  • Births per woman.
  • A demographic indicator.
  • An important driver of long-run population change.

Providing this additional information is an example of context engineering.

  • Humans rely on metadata, documentation, and data dictionaries to understand a project.
  • Modern AI assistants and agents benefit from the same information.

In many systems, metadata becomes part of the context supplied to the model, allowing it to reason more effectively about the data.

In later sections we will see that many agent systems rely heavily on project documentation, metadata, and retrieval systems to provide this type of context.

14.2.2.4 Loading the Data

Now that we understand how the dataset was created and how metadata was incorporated into the cleaning process, we could load the entire data set.

However, for purposes of the first example with claude.ai, we will only need a subset of the data with just a few indicators for all of the countries.

Let’s collect a subset of data for all of the countries and just a few selectors from the wide data set.

  • We’ll use the wide data set.
  • Although long datasets often compress more efficiently and are frequently preferred for storage and visualization workflows, wide datasets are often more convenient for exploratory analysis.
  • This applies not only to humans but also to LLMs. When variables are stored as columns, relationships among variables are easier to identify and discuss.
library(tidyverse)
library(arrow)
library(here)

# Make the connection to the dataset
wb_europe_wide <-
  open_dataset(
    here("data", "world_bank_indicators_europe_wide")
  )
# use the connection to identify the data, then collect() and select variables
wb_europe_subset <-
  wb_europe_wide |>
  collect() |> 
  select(country, country_code_3, year, 
    population_total, population_growth, fertility_rate, 
    gdp_per_capita_current_usd
  ) 

glimpse(wb_europe_subset)
Rows: 2,640
Columns: 7
$ country                    <chr> "Albania", "Albania", "Albania", "Albania",…
$ country_code_3             <chr> "ALB", "ALB", "ALB", "ALB", "ALB", "ALB", "…
$ year                       <int> 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2…
$ population_total           <dbl> 2414095, 2451636, 2489762, 2528480, 2567801…
$ population_growth          <dbl> -1.5431081, -1.5431567, -1.5431208, -1.5431…
$ fertility_rate             <dbl> 1.348, 1.355, 1.365, 1.371, 1.395, 1.415, 1…
$ gdp_per_capita_current_usd <dbl> 9730.8692, 7756.9619, 7242.4551, 6027.9135,…
  • The resulting data frame now contains all the countries and just enough indicators to support testing an analysis workflow.
NoteContext Engineering Applies to Data

Context engineering is not limited to prompts.

  • When working with assistants such as Claude.ai, we often decide which files, variables, summaries, and visualizations should be shared with the model.
  • Providing too little context can lead to weak recommendations, while providing excessive information can make it harder for the model to focus on the problem of interest.
  • Selecting an appropriate subset of information is therefore an important part of working effectively with AI assistants.

When working with a chat assistant like claude.ai we can upload data files to act as context for our requests.

  • Let’s save our subset of the data as a csv file.
wb_europe_subset |>
  write_csv(
    here(
      "data",
      "wb_indicators_europe_subset.csv"
    )
  )
Note

Although the project stores data internally as Parquet files, CSV is often the most convenient format for sharing tabular data with external tools such as Claude.ai.

  • The wide format is particularly useful because each row represents an observation and each column represents a variable, making relationships among variables easier for both humans and LLMs to identify.
  • If the data were a list with hierarchy, then we could consider a JSON structure.

Creating a combination of data and metadata provides a useful foundation for working with AI assistants.

  • In the next section, we will explore how assistants can use this data and context to help identify patterns, suggest analyses, and generate explanations.

14.3 Working with a Chat Assistant: Claude.ai

Section 13.4 discussed building reusable functions that interact with the Groq API such as:

call_groq(
  user_prompt = prompt,
  model = model,
  role_prompt = role
)

In that chapter, we primarily used LLMs to support workflows and agents that we built ourselves.

  • In practice, however, many data scientists use LLMs directly as analytical assistants.

Rather than generating code alone, we can use an LLM to assist with tasks such as:

  • Suggesting potentially important indicators.
  • Suggesting comparisons among countries.
  • Suggesting visualizations.
  • Critiquing assumptions.
  • Critiquing statistical conclusions.
  • Suggesting additional analyses.
  • Explaining unexpected results.
  • Identifying possible data quality issues.

For example, we could use an LLM assistant to suggest/critique at each step in an analytical workflow:

  1. Compute summary statistics.
  2. Create visualizations.
  3. Fit statistical models.
  4. Evaluate model performance.
  5. Interpret the results.

Here, the LLM does not perform or replace the statistical analysis; it acts as a collaborator that can help generate hypotheses, identify alternative explanations, and suggest additional methods or questions worth investigating.

Note 14.3: LLMs Are Better at Generating Ideas Than Verifying Them

Modern LLMs are often effective at generating explanations, identifying patterns, and suggesting next steps.

  • However, they do not automatically verify or critique their own conclusions.
  • The analyst remains responsible for determining whether a proposed interpretation is supported by the data and problem context.

14.3.1 Claude.ai as an Analytical Assistant

Claude.ai provides a conversational interface that allows users to interact with an LLM through a web browser.

  • Because Claude.ai operates outside the development environment, information must still be shared manually.

Unlike a simple chat interface, Claude can work with uploaded files.

  • It can work with CSV files, Excel files, PDF documents, code files, Quarto documents and even images and figures.
  • This allows Claude to reason about project artifacts as context rather than relying solely on information typed into the prompt.
TipContext Engineering of Uploaded Files Improves the Signal and Saves Tokens

Conceptually, Claude (and other LLMs) work with a limited context window. Anything it needs to reason about must ultimately be represented within that context window in some form.

However, when you upload a file:

  • Claude does not simply paste the entire file verbatim into the conversation.
  • The file is processed and made available to the model through Claude’s document handling system.
  • Claude can retrieve relevant portions of the file as needed while answering questions.

By uploading files that have been engineered to focus on the right context, you improve Claude’s ability to find the signal in the file while saving tokens and compute time.

The workflow typically looks like:

  1. Perform analysis locally or in Posit Cloud.
  2. Upload files or outputs to Claude.
  3. Ask questions about the results.
  4. Evaluate and refine the responses.
  5. Return to Posit Cloud to continue the analysis.

This workflow may appear somewhat manual, but it has an important advantage. the analyst remains directly involved in every step of the process and can evaluate each recommendation before taking action.

  • This provides a natural opportunity to apply many of the responsible AI practices discussed throughout this course.

14.3.2 Example: Working with Claude.ai to Create Maps

Many real-world projects require information that is not contained in the original dataset.

Suppose we would like to create choropleth maps showing demographic patterns across Europe.

  • Our World Bank dataset contains country-level indicators, but it does not contain geographic boundaries.
  • This creates a new problem:

Where can we obtain reliable geographic data that can be joined to our World Bank dataset?

A traditional workflow might involve:

  • Searching for shapefiles.
  • Reading documentation.
  • Comparing multiple data sources.
  • Determining which identifier can be used for joins.
  • Downloading and testing candidate datasets.

Claude.ai can help accelerate this process.

Note 14.4: Geographic Data Requires More Than Coordinates

Packages such as {ggplot2} can create maps, but for choropleth maps they require data for the geographic boundaries describing the shapes of countries, states, provinces, or other regions.

  • These boundaries are commonly stored as simple features in shapefiles or related geographic formats and are typically analyzed using packages such as {sf}.
  • Because our World Bank dataset contains indicators but not geographic boundaries, we must obtain a second dataset and determine how the two datasets can be joined together.

14.3.2.1 Working with Claude.ai to Acquire European Boundary Data

If you do not have a claude.ai account already, follow the instructions in Getting Started with Claude.ai for Data Science on the web or in you repository.

Then go to Claude.ai at https://claude.ai/new.

  1. Upload the data file.
  • To upload a file from Posit Cloud first use the Files Pane/More Export command to download the file to your local computer. Then use the upload “+” button in Claude.ai chat interface as in Figure 14.1.
  • You should see the file as an icon in the chat interface.
Figure 14.1: Claude.ai’s Browser-based Chat Interface
  1. Enter a prompt.

After uploading the CSV file, we can construct a prompt using the same components discussed in Section 13.2.1.

  • Role to establishes type of expertise the model should use.
  • Task to define the objective.
  • Context to explains the uploaded data and how it should be used.
  • Constraints to narrows the solution space.
  • Examples to illustrate the types of solutions being sought.
  • Output: to provide an explicit set of standards against which to evaluate the output

Copy and paste the following into the chat interface at the cursor. Feel free to edit if you want to get more specific on a point.

Role:
You are an experienced geospatial data scientist working with R.

Task:
Recommend geographic boundary datasets suitable for creating
country-level choropleth maps of Europe using the uploaded World Bank data.

Context:
I have uploaded a World Bank dataset containing approximately 
40 European countries. 
- Use the uploaded dataset as the source of the countries that must be 
supported by any recommended geographic boundary dataset.

Constraints:
- The dataset uses ISO three-character country codes.
- The boundary dataset must provide a reliable join field that can be matched
to the ISO three-character country codes in the uploaded dataset.
- The solution should work with R.
- The data should be easily downloadable at no cost.
- The geographic boundaries should support all countries in the dataset.

Examples:
Examples of useful sources include Natural Earth,
Eurostat GISCO, GADM, or other comparable datasets.

Output:
For each candidate data source provide:
- Source name
- Download location as a hyperlink
- Geographic coverage
- Available join fields
- Advantages
- Disadvantaes
- Suitability for this project

Conclude with a discussion of the tradeoffs among the alternatives.

You may be thinking, “This is an awfully long prompt. Why not simply ask where to download shapefiles for European countries?”

  • That approach might work, but it often requires multiple rounds of clarification and refinement before the model produces a response that fully addresses the problem.
  • As prompts become more complex, explicit structure often becomes as important as the individual instructions themselves.
  • There is a tradeoff between the effort invested in constructing a well-organized prompt and the effort required to repeatedly refine an incomplete response.
  • In many cases, spending a little more time defining the role, task, context, constraints, and desired output can substantially improve the quality of the first response while also helping you frame and understand the problem more clearly yourself.
TipExplicit Prompt Structure Often Improves Results

Although LLMs can often infer the purpose of different parts of a prompt, it is usually helpful to label major components explicitly.

  • Role
  • Task
  • Context
  • Constraints
  • Examples
  • Output

These labels act much like section headers in a document. They help both humans and LLMs distinguish between different types of information and can make prompts easier to understand, modify, and debug.

  • This approach is particularly useful for longer prompts where instructions, data, and examples are combined into a single request.

By now, you should have seen evidence of the agent inspecting the uploaded data, extracting relevant information, evaluating alternatives, and planning its next actions as shown in Figure 14.2.

  • These intermediate messages provide a glimpse into some of the same concepts discussed in the previous chapter, including tool use, evaluation, planning, and action selection.
Screenshot of Claude.ai processing an uploaded World Bank dataset. The interface shows the structured prompt and status messages indicating that Claude inspected the file, extracted relevant information about the countries in the dataset, identified Kosovo's XKX code as a special case, and is evaluating alternatives before generating recommendations. The messages illustrate agent behaviors such as acting, evaluation, planning, and tool use.
Figure 14.2: Snapshot of Claude.ai at work: acting, evaulating, and planning the next action.

AS the response begins you should see multiple summaries of options followed by th tradeoff analysis and the start of code options as in Figure 14.3.

Screenshot of Claude.ai presenting a structured evaluation of several geographic boundary datasets. The response compares alternative sources across multiple criteria, including licensing, country-code compatibility, data quality, and reproducibility. After discussing tradeoffs, Claude recommends a preferred dataset and begins generating R code to implement the solution.
Figure 14.3: : Example of AI-assisted decision support. After evaluating multiple geographic data sources, Claude.ai summarizes tradeoffs, recommends a preferred solution, and generates code to implement the recommendation. {#fig-claude-ai3-tradeoff-analysis}.

Claude’s response also surfaced a practical issue that we may not have anticipated: Kosovo uses the code XKX in our dataset.

  • This is a useful example of why AI assistants can be helpful in exploratory project work.
  • The challenge was not simply to “make a map.”
  • The real challenge was to identify whether the geographic boundary data could be joined correctly to the World Bank data.

In this case, Claude identified that the join field matters:

  • Our World Bank data use country_code_3.
  • Some boundary datasets use different country identifiers. e.g., th ISO2 standard.
  • Kosovo is often coded inconsistently across geographic data sources.
  • The choice of boundary dataset affects whether the join will work cleanly.

This is the kind of issue that can easily be missed if we only ask for a web site or just code.

  1. Make a decision about which source

Claude recommended using {rnaturalearth} and to “join on adm0_a3”.

Now you decide whether to use the option or a different one.

  • Let’s choose that option
  1. Acquire the Shapefiles

Let’s run the code suggested by Claude.

library(rnaturalearth)
library(sf)
library(dplyr)

europe_sf <- ne_countries(scale = 50, returnclass = "sf") |>
  filter(continent == "Europe" | adm0_a3 %in% your_codes)
Error in `stopifnot()`:
ℹ In argument: `continent == "Europe" | adm0_a3 %in% your_codes`.
Caused by error:
! object 'your_codes' not found
# Kosovo fix: ne_countries() sets iso_a3 = "-99" for Kosovo;
# adm0_a3 correctly gives "XKX"
joined <- europe_sf |>
  left_join(wb_data, by = c("adm0_a3" = "country_code_3"))
Error:
! object 'europe_sf' not found
  • Oops! There is an error.

The code generated by Claude was close to correct but not immediately executable.

  • The response referenced a variable named your_codes that was never defined.
  • This was claude practicing “defensive programming” as it tried to anticipate that there might be codes in your data set (your_codes) that you considered as Europe but that might not be coded as Europe in {rnaturalearth}.
  • A human analyst must still review generated code, identify placeholders or assumptions, and verify that the code is consistent with the actual project.
  • In this example, replacing your_codes with the actual country codes from the World Bank dataset resolves the issue.

This illustrates a common pattern when working with AI assistants:

  • The generated code is often directionally correct.
  • The overall approach is frequently useful.
  • Small implementation details may still require human review and modification.

This is one reason why experienced practitioners often spend more time reviewing AI-generated code than simply copying and executing it.

Let’s fix the code and proceed.

  • We are also going to add a line to just select the two columns we need as the data frame has 169 columns.
wb_country_codes <-
  wb_europe_subset|>
  distinct(country_code_3) |>
  pull(country_code_3)

europe_sf <-
  rnaturalearth::ne_countries(
    scale = 50,
    returnclass = "sf"
  ) |>
  filter(
    continent == "Europe" |
      adm0_a3 %in% wb_country_codes
  ) |>
  select(adm0_a3, geometry)
TipData Engineering in Mid Analysis

Whenever possible, keep only the variables needed for the task at hand.

  • The ne_countries() dataset contains many attributes describing each country. However, for this analysis we only require:
    • A join key (adm0_a3)
    • The country geometry
  • Selecting only the required variables reduces memory usage, simplifies subsequent joins, and makes the code easier to understand.
  1. Now we can join the data
  • We will also adjust the join so that we add the desired geometry column to the original data and not the other way around.
# Kosovo fix: ne_countries() sets iso_a3 = "-99" for Kosovo;
# adm0_a3 correctly gives "XKX"
 wb_europe_sf <- wb_europe_subset |>
  left_join(europe_sf, join_by("country_code_3" ==  "adm0_a3"))

glimpse(wb_europe_sf)
Rows: 2,640
Columns: 8
$ country                    <chr> "Albania", "Albania", "Albania", "Albania",…
$ country_code_3             <chr> "ALB", "ALB", "ALB", "ALB", "ALB", "ALB", "…
$ year                       <int> 2023, 2022, 2021, 2020, 2019, 2018, 2017, 2…
$ population_total           <dbl> 2414095, 2451636, 2489762, 2528480, 2567801…
$ population_growth          <dbl> -1.5431081, -1.5431567, -1.5431208, -1.5431…
$ fertility_rate             <dbl> 1.348, 1.355, 1.365, 1.371, 1.395, 1.415, 1…
$ gdp_per_capita_current_usd <dbl> 9730.8692, 7756.9619, 7242.4551, 6027.9135,…
$ geometry                   <MULTIPOLYGON [°]> MULTIPOLYGON (((19.34238 41...…

14.3.3 Asking for Code After Evaluating Options

Once Claude has helped identify candidate data sources and possible join issues, we can move to a more specific coding request.

A weaker prompt would be:

Give me R code to make a map of Europe.

That may produce code, but it gives Claude very little information about the data, the join field, or the known Kosovo issue.

A stronger follow-up prompt would be:

Role:
You are an experienced geospatial data scientist working with R.

Task:
Write R code to create a country-level choropleth map of Europe using my uploaded World Bank dataset.

Context:
The uploaded dataset contains country-level World Bank indicators for approximately 40 European countries. The dataset uses `country_code_3` as the join field. Claude previously identified that Kosovo uses the code `XKX`, which may require special handling depending on the geographic boundary source.

Constraints:
- Use R.
- Use {sf}, {dplyr}, and {ggplot2}.
- Use {rnaturalearth} for country boundaries.
- Join the World Bank data to the geographic boundaries using a reliable ISO-3 style country code.
- Use `adm0_a3` from Natural Earth rather than `iso_a3` if that better handles Kosovo.
- Filter the World Bank data to one recent year.
- Map one demographic indicator such as `population_growth` or `fertility_rate`.
- Include clear comments in the code.

Output:
Provide complete R code and briefly explain the main steps.

This prompt is better because it aligns the coding request with the earlier evaluation.

  • The assistant is not only being asked to write code. It is being asked to write code that reflects the known data integration problem.

Example Code

Claude may produce code similar to the following.

Show code
# ============================================================
#  Europe Choropleth Map — World Bank Indicators
#  Indicator: fertility_rate (2023)
#  Boundaries: Natural Earth via {rnaturalearth}
#  Join key:   adm0_a3 (handles Kosovo / XKX correctly)
# ============================================================

# ── 0. Packages ──────────────────────────────────────────────
library(sf)               # spatial data handling
library(dplyr)            # data wrangling
library(ggplot2)          # plotting
library(rnaturalearth)    # country boundary polygons
library(rnaturalearthdata)# required for scale = 50 data
library(scales)           # for label formatting in legend


# ── 1. Load World Bank data ───────────────────────────────────
wb_raw <- read.csv(here::here("data", "wb_indicators_europe_subset.csv"),
                   stringsAsFactors = FALSE)

# Filter to a single recent year with good data coverage
TARGET_YEAR      <- 2023
TARGET_INDICATOR <- "fertility_rate"

wb_year <- wb_raw |>
  filter(year == TARGET_YEAR) |>
  select(country, country_code_3, all_of(TARGET_INDICATOR)) |>
  # Drop rows where the indicator is missing
  filter(!is.na(.data[[TARGET_INDICATOR]]))

cat("Countries with data in", TARGET_YEAR, ":", nrow(wb_year), "\n")
Countries with data in 2023 : 40 
Show code
# ── 2. Fetch Natural Earth boundaries ────────────────────────
# scale = 50  →  1:50 million; good balance of detail vs file size
# returnclass = "sf"  →  returns an sf data frame directly
world_sf <- ne_countries(scale = 50, returnclass = "sf")

# Identify which field best carries Kosovo (XKX):
#   iso_a3     → coded "-99" for Kosovo in Natural Earth (avoid)
#   adm0_a3    → coded "XKX"  ✓  use this one
# Quick check:
world_sf |>
  filter(name == "Kosovo") |>
  select(name, iso_a3, adm0_a3) |>
  st_drop_geometry() |>
  print()
    name iso_a3 adm0_a3
1 Kosovo    -99     KOS
Show code
# ── 3. Subset Natural Earth to Europe + neighbors ───────────
# Keep all countries whose adm0_a3 appears in the WB dataset,
# plus a broad Europe bounding box so the base map looks complete
# even for countries without WB data (they'll appear as grey).

# All adm0_a3 codes that appear in the WB dataset
wb_codes <- wb_year$country_code_3

# Pull the full European extent (continent label + Türkiye, Cyprus, etc.)
europe_sf <- world_sf |>
  filter(
    continent == "Europe" |
    adm0_a3 %in% wb_codes        # catch non-"Europe" coded countries in WB data
  )


# ── 4. Join World Bank data to spatial boundaries ────────────
# Left join keeps every European polygon; WB countries without
# geometry simply won't appear (none expected here).
europe_joined <- europe_sf |>
  left_join(wb_year,
            by = c("adm0_a3" = "country_code_3"))

# Sanity check: how many WB countries matched a polygon?
matched <- sum(!is.na(europe_joined[[TARGET_INDICATOR]]))
cat("WB countries matched to geometry:", matched, "/", nrow(wb_year), "\n")
WB countries matched to geometry: 36 / 40 
Show code
# List any WB countries that did NOT match (for debugging)
unmatched_wb <- wb_codes[!wb_codes %in% europe_joined$adm0_a3]
if (length(unmatched_wb) > 0) {
  cat("Unmatched WB codes:", paste(unmatched_wb, collapse = ", "), "\n")
} else {
  cat("All WB country codes matched to Natural Earth polygons.\n")
}
Unmatched WB codes: XKX, NA, NA, NA 
Show code
# ── 5. Define map projection ──────────────────────────────────
# ETRS89-LAEA (EPSG:3035) — standard equal-area projection for Europe;
# reduces the Greenland distortion typical of WGS84/Web Mercator.
CRS_EUROPE <- "+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000
               +ellps=GRS80 +units=m +no_defs"

europe_projected <- st_transform(europe_joined, crs = CRS_EUROPE)


# ── 6. Build the choropleth ───────────────────────────────────
# Color palette: sequential, perceptually uniform (viridis magma strip)
# Diverging palettes would suit population_growth better (neg vs pos);
# for fertility_rate a sequential low→high scheme is appropriate.

ggplot(data = europe_projected) +

  # --- base polygons colored by indicator ---
  geom_sf(aes(fill = .data[[TARGET_INDICATOR]]),
          colour = "white",      # thin white border between countries
          linewidth = 0.25) +

  # --- country outlines for unmatched / NA countries (grayed out) ---
  # (already handled by na.value in scale_fill_viridis_c)

  # --- color scale ---
  scale_fill_viridis_c(
    option    = "magma",        # perceptually uniform, print-friendly
    direction = -1,             # dark = high fertility
    name      = "Fertility rate\n(births per woman)",
    na.value  = "grey85",       # countries with no WB data
    guide     = guide_colorbar(
      barwidth  = 0.6,
      barheight = 10,
      title.position = "top",
      title.hjust    = 0.5
    )
  ) +

  # --- map extent: clip to Europe ----------------------------------------
  coord_sf(
    xlim = c(2.5e6, 6.5e6),    # LAEA x in meters
    ylim = c(1.5e6, 5.5e6),    # LAEA y in meters
    expand = FALSE
  ) +

  # --- titles & caption ---
  labs(
    title    = "Fertility Rate Across Europe",
    subtitle = paste0("Average births per woman, ", TARGET_YEAR,
                      "  |  Source: World Bank"),
    caption  = "Boundaries: Natural Earth (1:50m)  |  Projection: ETRS89-LAEA (EPSG:3035)"
  ) +

  # --- clean map theme ---
  theme_void(base_size = 12) +
  theme(
    plot.title       = element_text(face = "bold", size = 16,
                                    margin = margin(b = 4)),
    plot.subtitle    = element_text(size = 11, colour = "grey40",
                                    margin = margin(b = 10)),
    plot.caption     = element_text(size = 8,  colour = "grey60",
                                    hjust = 0,
                                    margin = margin(t = 8)),
    legend.position  = c(0.92, 0.55),   # inside map, right-center
    legend.title     = element_text(size = 9,  face = "bold"),
    legend.text      = element_text(size = 8),
    plot.margin      = margin(10, 10, 10, 10),
    plot.background  = element_rect(fill = "white", colour = NA)
  )

Show code
# ── 7. Save output ────────────────────────────────────────────
ggsave(
  filename = "europe_fertility_rate_2023.png",
  width    = 10,
  height   = 8,
  dpi      = 200,
  bg       = "white"
)

cat("Map saved to europe_fertility_rate_2023.png\n")
Map saved to europe_fertility_rate_2023.png
Show code
# ── Optional: switch to population_growth ─────────────────────
# population_growth has negative values (shrinking populations),
# so a *diverging* palette centered on 0 is more informative.
#
# Replace TARGET_INDICATOR <- "population_growth" above, then swap
# scale_fill_viridis_c() for:
#
# scale_fill_gradient2(
#   low      = "#2166ac",   # blue  = decline
#   mid      = "white",
#   high     = "#d6604d",   # red   = growth
#   midpoint = 0,
#   name     = "Population\ngrowth (%)",
#   na.value = "grey85",
#   guide    = guide_colorbar(barwidth = 0.6, barheight = 10,
#                             title.position = "top", title.hjust = 0.5)
# )

Claude’s generated code covered many of the major steps that an experienced analyst would want in a choropleth workflow:

  • Load the World Bank data.
  • Select a target year and indicator.
  • Retrieve country boundaries.
  • Inspect the Kosovo country-code issue.
  • Filter the spatial data to relevant countries.
  • Join the statistical data to the spatial data.
  • Check whether all World Bank countries matched.
  • Project the map to a Europe-appropriate coordinate reference system.
  • Create the choropleth.
  • Save the result.
  • Suggest a different variable and how it would suggest a different palette strategy for indicators with negative and positive values.

This is a useful example of how an AI assistant can help a newer programmer see the larger structure of a workflow, not just isolated lines of code.

However, the code was not perfect.

  • First, the path to the CSV file needed to be adjusted. It needed a /data/ to find the data. It did not include the {here} package. Ask it why not?
    • This is a common issue when AI-generated code assumes a project structure that differs from the actual project.
  • Second, Claude selected 2023 as the target year without explaining why. That may be reasonable if 2024 has incomplete data, but the analyst should verify data availability before choosing a year.
  • Third, the color palette choice is defensible but subjective. Claude used the magma palette with direction = -1, making darker colors represent higher fertility rates. That may be acceptable, but color direction should be chosen intentionally and explained, especially when the visual interpretation matters.
  • It saved a file of the graphic without asking.

Each of these is easily fixable and refined.

14.3.4 Summary of Claude.ai as a Chat Assistant

Claude.ai illustrates many of the strengths and limitations of modern chat-based AI assistants.

Unlike an assistant integrated directly into a development environment, Claude.ai operates primarily through a web browser. As a result, the user must manually provide context by uploading files, copying and pasting model output or error messages and providing detailed descriptions of the project and analytical goals.

  • This process may appear cumbersome compared to more integrated tools, but it provides an important benefit: the analyst remains directly involved in deciding what information should be shared with the model.

The shapefile example demonstrated a typical workflow: of uploading data, using structured prompts, and asking Claude to evaluate possible solutions.

  • The analyst then had to select a preferred approach and review, modify, and test the resulting code.
  • Throughout this process, Claude was able to analyze the dataset, identify potential join issues, compare alternatives, explain tradeoffs, recommend a reasonable solution, and generate a substantial amount of (almost) working code.
  • At the same time, the generated code was not perfect; it still required testing and refinement.
  • This review process required exactly the skills developed throughout this course:

Perhaps the most important lesson is that Claude provided value long before it generated any code.

  • Its greatest contribution was helping identify requirements, evaluate alternatives, surface potential problems, and structure the overall solution.
  • In practice, many successful uses of AI assistants involve this type of collaboration.
  • The assistant contributes ideas, analysis, and first drafts, while the human remains responsible for evaluating, refining, and ultimately shaping and approving the final result.

14.4 Working with a Repository Agent: Claude Code

The Claude.ai example demonstrated the strengths and limitations of a browser-based chat assistant.

To analyze the World Bank Europe dataset, Claude.ai required us to manually provide context through uploaded files and detailed prompts.

  • We uploaded datasets and metadata.
  • We described the project structure.
  • We explained the analytical objective.
  • We copied code and error messages between applications.

This process worked well, but it required continuous effort to keep the model informed.

Claude Code takes a different approach.

  • Instead of operating in a browser, Claude Code operates directly inside a project repository. It can inspect files, understand project structure, execute commands, create new files, and modify existing ones.

  • The key difference is the entire repository itself becomes context.

NoteTrying Claude Code Yourself

Claude Code is designed primarily for local development environments where it can interact directly with your files, Git repository, and terminal; it is not designed for Posit Cloud type environments. There is a web version in development.

  • Unlike browser-based chat assistants, Claude Code uses a modern terminal interface that continuously updates the screen as it reads files, executes commands, and displays results.
  • While Claude Code can run inside RStudio’s Terminal pane, the display may quickly become difficult to read because RStudio’s terminal does not fully support the dynamic terminal interface used by Claude Code.

For the best experience, use a dedicated terminal application:

  • macOS: iTerm2 (recommended) or Terminal.app
  • Windows: Windows Terminal

Both are free and widely used by software developers.

14.4.0.0.1 Installing Claude Code

Install iTerm2 (recommended):

brew install --cask iterm2

Or download directly from:

https://iterm2.com/downloads.html

Install Claude Code:

curl -fsSL https://claude.ai/install.sh | sh

Verify the installation:

claude --version
Important
  1. The minimum Windows requirement is Windows 10 build 1809+ (October 2018) or Windows Server 2019+.
  2. 1 GB free should be enough to get Claude Code installed and running, but it’s a tight margin
  3. Anthropic’s official minimum is 4 GB RAM.
  • 8GB will be comfortable for small tasks: single-file edits, small scripts, simple debugging, working in modest-sized repos.
  • 16GB is recommended for “comfortable multitasking” since you probably have RStudio or another IDE running, data in memory, tabs on your browser, and other applications open and potentially, parallel agent sessions.

Install Windows Terminal from the Microsoft Store or via:

winget install Microsoft.WindowsTerminal

Install Claude Code from PowerShell:

irm https://claude.ai/install.ps1 | iex

Verify the installation:

claude --version
ImportantClaude Code is Not Free

Claude Code uses the Claude API which charges for the number of API tokens that are input and provided as output.

However, Anthropic allows one to pay for Claude Code in two ways

  1. By signing in with at least a Claude Pro or Max subscription account (tokens count against your usage limits).
  2. By using an Anthropic Console API account with usage-based billing.

A standard Claude subscription is the simpler as it allows both chat and claude code but the API account might be cheaper for a short experiment (min cost $5.00)

NoteFor This Course

Claude Code is demonstrated to illustrate how repository agents operate; you are not required to install Claude Code or create an Anthropic account.

  • The images in this section are from using a Mac with local RStudio on the left and iTerm on the right.

14.4.1 Starting Claude Code

Claude Code runs from a terminal window.

  • Open a terminal window and navigate so the working directory (prompt/cursor) is at the root of the project directory.
  • Then enter the following:
claude

When Claude Code starts, if it’s a new repo, it asks if you trust the repo before it does anything.

  • Given permission, it then scans the repository and builds an understanding of the project structure.
  • It can identify files, directories, function definitions, package dependencies, and common workflows directly from the code.
  • When complete your screen may look something like Figure 14.4 where RStudio is on the left and the terminal on the right.
Claude Code startup screen showing project information and terminal interface.
Figure 14.4: Claude Code startup screen showing the current project directory and repository context.

However, Claude many not capture all the important aspects of a project from the files alone such as

  • the purpose of the project,
  • preferred coding conventions,
  • expected outputs,
  • data sources,
  • repository-specific workflows,
  • and other project constraints.

Claude Code uses a special file called CLAUDE.md as a means to capture the non-code aspects so it can serve as persistent project context.

14.4.2 CLAUDE.md as Persistent Instructions

As seen in Figure 14.4, Claude suggests you enter one of its “slash” commands, /init, to have it create a CLAUDE.md file.

  • It will then explore the repository and create a file. It will ask permission to write to your repository.
  • The results may look similar to Figure 14.5 where Claude provides a summary of its actions.
Claude Code  screen showing the results of creating a CLAUDE.md file to include the tokens and time consumed.
Figure 14.5: Claude Code created a CLAUDE.md file and provided feedback on the process in terms of tokens, a summmary of the actions, and the time.

In this example, the major sections included

  • Project Purpose
  • Common Commands.
  • Architecture
    • Primary Student File
    • R Helper Scripts (R/)
    • Data Layout
    • Other Key Files
  • Conventions

Since it is a standard markdown file, the analysis can edit it to reflect exactly what it wants it to say as a version-controlled document.

Claude Code automatically loads this file whenever a new session begins.

  • Rather than repeating these instructions in every prompt, the conventions become part of the repository itself.
  • Unlike Claude.ai, no files need to be uploaded manually.
  • Claude Code can already see the files in the repository.

14.4.3 Sample Scripts

For the World Bank Europe project several scripts were created so Claude Code has repository structure and content it can inspect, explain, modify, and extend.

  • The data/ directory contains both long and wide versions of the World Bank Europe dataset.
  • The R/ directory contains small helper scripts organized by task: loading data, mapping, plotting, correlations, and saving outputs.

The structure contains the following:

data/
├── world_bank_indicators_europe/
├── world_bank_indicators_europe_wide/
└── world_bank_metadata.xlsx

R/
├── worldbank_data.R
├── worldbank_maps.R
├── worldbank_plots.R
├── worldbank_correlations.R
└── plot_utils.R

outputs/
├── plots/
└── reports/

The output of the scripts looks like the following:

Show code
source("R/worldbank_data.R")
source("R/worldbank_maps.R")
source("R/worldbank_plots.R")
source("R/worldbank_correlations.R")
source("R/plot_utils.R")
wb_wide <- load_worldbank_wide()

indicators <- c(
  "population_total",
  "population_growth",
  "fertility_rate",
  "gdp_per_capita_current_usd"
)

fertility_map <- make_worldbank_map(
  wb_wide,
  indicator = "fertility_rate",
  years = 2023,
  legend_title = "Births per woman",
  title = "Fertility Rate Across Europe"
)
save_plot(
  fertility_map,
  "fertility_rate_map.png"
)

scatter_plot <- make_indicator_scatter(
  wb_wide,
  x_indicator = "fertility_rate",
  y_indicator = "population_growth",
  years = 2023
)
save_plot(
  scatter_plot,
  "fertility_vs_growth.png"
)

hist_plot <- make_indicator_histogram(
  wb_wide,
  indicator = "gdp_per_capita_current_usd",
  years = 2023
)
save_plot(
  hist_plot,
  "gdp_per_capita_usd.png"
)

corr_plot <- make_correlation_heatmap(
  wb_wide,
  indicators = indicators,
  years = 2023
)

save_plot(
  corr_plot,
  "correlation_heatmap.png"
)
hist_plot
corr_plot
scatter_plot
fertility_map 

Because Claude Code can inspect these files directly, it begins with substantially more context than a browser-based chat assistant.

14.4.4 Repository-Aware Development Tasks

Rather than asking Claude Code to perform a complete analysis, it is often more useful to ask it to extend or modify an existing project.

  • Because Claude Code can see the entire repository, it can understand how functions are organized, identify coding conventions, and make changes that are consistent with the rest of the codebase.

The following three tasks illustrate progressively more sophisticated forms of repository-aware development.

14.4.4.1 Task 1: Modify an Existing Function

To illustrate how repository-aware agents differ from chat assistants, we can begin with a relatively simple software development task. Rather than providing extensive instructions about the project structure, data sources, and coding conventions, we simply ask Claude Code to extend an existing function.

The prompt below asks Claude Code to modify make_indicator_scatter() so that users can optionally add either a linear regression line or a nonlinear LOESS smoother. Because Claude Code has already scanned the repository and loaded the project context from CLAUDE.md, it can identify the correct file, inspect the existing function, and propose an implementation without requiring additional explanation.

Update make_indicator_scatter() so users can optionally add either a linear regression line or a nonlinear LOESS smoother.

The proposed changes are shown as a diff before they are applied to the repository as in Figure 14.6.

  • This approval step provides an opportunity to review the modifications and confirm that the agent’s interpretation matches the intended task.
  • Unlike a chat assistant, which would simply return code, Claude Code can directly edit project files while keeping the user in the review loop.
Claude Code displaying a proposed edit to worldbank_plots.R. The diff view highlights the addition of a smoother argument supporting linear and LOESS smoothers, while RStudio shows the project files and functions available in the repository.
Figure 14.6: Claude Code displaying a proposed edit a function in worldbank_plots.R as a diff and asking permission to proceed.

After the changes are accepted, Claude Code updates the file and summarizes what was modified as in Figure 14.7.

  • The resulting implementation adds support for both linear and nonlinear smoothers while preserving the original behavior when no smoother is requested.
RStudio displaying the updated worldbank_plots.R file alongside Claude Code’s summary of the completed modifications. The new smoother parameter supports NULL, linear regression, and LOESS smoothing options.
Figure 14.7: Claude Code displaying a completed edit as shown in the RStudio files.

The final response includes both the code changes and a concise explanation of the new functionality.

  • Notice that the original prompt was intentionally brief.
  • Claude Code determined which file to edit, how to modify the function signature, where to insert the new plotting logic, and how to document the change by combining information from the repository structure, the existing code, and the project context provided through CLAUDE.md.

As shown in Figure 14.6 and Figure 14.7, repository-aware agents shift some of the burden of context engineering from the prompt into the repository itself.

  • The prompt focuses on the desired outcome, while the agent discovers much of the implementation context automatically.
14.4.4.1.1 Task 2: Creating a New Visualization Function

The second task requires more than modifying an existing function; Claude Code must create a completely new plotting function that follows the conventions established elsewhere in the repository.

The prompt is still relatively brief:

Create a new function named make_indicator_boxplot() that follows the style and structure of the existing plotting functions. Include a notch argument that allows notched boxplots.

To complete this task, Claude Code first examines the existing plotting functions to identify the project’s conventions before proposing a new implementation as seen in Figure 14.8.

Claude Code displaying a proposed addition of a make_indicator_boxplot function to worldbank_plots.R. The proposed function includes a notch argument and follows the structure of the existing plotting functions.
Figure 14.8: Claude Code displaying a proposed edit to add a new function to worldbank_plots.R as a diff and asking permission to proceed.

Unlike the previous task, Claude is not simply extending an existing function. It must infer how new functionality should be organized based on the repository structure, naming conventions, documentation style, and existing code patterns.

After approval, Claude Code adds the new function and explains how it fits within the project’s existing design as in Figure 14.9.

RStudio displaying the completed make_indicator_boxplot function while Claude Code summarizes the new functionality and explains how it follows existing project conventions.
Figure 14.9: Claude Code displaying the results of adding a new function to worldbank_plots.R wiht RStudio showing the new function.

This task took slightly longer than the scatterplot modification shown in Figure 14.6 and Figure 14.7 because Claude Code needed to understand and reproduce the project’s design patterns rather than simply extending existing functionality.

  • As shown in Figure 14.8 and Figure 14.9, repository-aware agents can use information from surrounding code to create new functions that are consistent with the rest of the project.

Task 3: Creating a New Modeling Module

The third task is substantially more complex than the previous two. Rather than modifying an existing function or adding a new plotting function, Claude Code is asked to design and create an entirely new script containing multiple related functions.

The prompt remains relatively concise:

Create a new script named worldbank_models.R containing functions for fitting linear models and ANOVA models and generating summary tables using either the default base R summary or the broom package.

To complete this task, Claude Code first reads several existing scripts in the repository to understand naming conventions, function structure, documentation style, and expected inputs and outputs before proposing a new file as in Figure 14.10.

Claude Code displaying a proposed new file named worldbank_models.R. The agent explains that it reviewed existing helper scripts to understand project conventions before generating a new modeling module containing linear model and ANOVA helper functions.
Figure 14.10: Claude Code displaying the proposed new script as a diff (although it is hard to see).

Unlike the previous tasks, Claude cannot rely on a nearby function as a template. Instead, it must infer the overall design philosophy of the repository and then apply those conventions consistently across an entirely new set of functions.

After the proposed changes are approved, Claude Code generates the new script and summarizes the functions it created as in Figure 14.11.

RStudio displaying the completed worldbank_models.R script while Claude Code summarizes the newly created functions for fitting linear models, performing ANOVA analyses, generating summary tables, and reporting model fit statistics.
Figure 14.11: Claude Code displaying the results of the new script which is shown in RStudio).

The resulting script contains four functions:

  • fit_worldbank_lm() for fitting linear regression models,
  • fit_worldbank_anova() for fitting one-way ANOVA models,
  • model_summary_table() for generating coefficient and ANOVA summary tables, and
  • model_fit_stats() for reporting model-level statistics.

This task required noticeably more time than the previous two examples. As shown in Figure 14.10 and Figure 14.11, Claude Code spent much of its effort gathering context from the repository before writing any code. Because the task involved creating a new module rather than modifying an existing one, the agent needed to inspect multiple scripts, identify common design patterns, and determine how the new functionality should fit within the broader project structure.

The progression across the three tasks is instructive:

  1. Modify an existing function (scatterplot smoother).
  2. Create a new function within an existing script (boxplot).
  3. Create an entirely new module (modeling helpers).

As the scope of the task increases, the amount of repository context Claude Code must gather also increases. This illustrates one of the defining characteristics of repository-aware agents: they do not simply generate code from a prompt, they actively explore and learn from the surrounding codebase before proposing a solution.

TipPrompts Become Shorter, Context Does Not

When first working with a repository agent, many people over-prompt because they are still thinking like chat-assistant users.

Repository-aware agents often require less detailed prompts than chat assistants.

  • This does not mean they need less context. Rather, much of the context is already available through the repository structure, existing code, project documentation, and files such as CLAUDE.md.
  • As a result, prompts can focus more on the desired outcome and less on explaining the environment in which the work takes place.

Start with relatively brief prompts and only add additional constraints when the agent’s initial approach differs from what you intended.

14.4.5 Test the Code

Let’s try quick examples of the updated and new code.

#|message: false
#|warning: false
#|layout-ncol: 2
#| code_fold: true
#| 
source("R/modified/worldbank_plots.R")


make_indicator_scatter(
  wb_df = wb_wide,
  x_indicator = "fertility_rate",
  y_indicator = "population_growth",
  years = 2023,
  label_countries = TRUE,
  smoother = "lm"
)

make_indicator_scatter(
  wb_df = wb_wide,
  x_indicator = "gdp_per_capita_current_usd",
  y_indicator = "life_expectancy_total",
  years = 2023,
  smoother = "loess"
)

 make_indicator_boxplot(
  wb_df = wb_wide,
  indicator = "fertility_rate",
  years = 2000:2023,
  notch = TRUE
)

 make_indicator_boxplot(
  wb_df = wb_wide,
  indicator = "population_growth",
  years = c(2000, 2005, 2010, 2015, 2020, 2023),
  notch = TRUE
)

  • Each of the plotting functions runs fine. There some changes that one might want to make by editing the functions to have different defaults.

Let’s check the model functions.

  • As the default, functions use the {broom} package to return a tidy data frame.
    • This format is easier to filter, sort, and incorporate into reports.
  • Setting use_broom = FALSE returns the standard base R model summary instead.
#|message: false
#|warning: false
#|layout-ncol: 2
#| code_fold: true
#| 
source("R/modified/worldbank_models.R")
lm_fit <- fit_worldbank_lm(
  wb_wide,
  outcome = "population_growth",
  predictors = c(
    "fertility_rate",
    "gdp_per_capita_current_usd"
  ),
  years = 2023
)

model_summary_table(lm_fit)
# A tibble: 3 × 5
  term                         estimate  std.error statistic  p.value
  <chr>                           <dbl>      <dbl>     <dbl>    <dbl>
1 (Intercept)                -2.32      2.32          -1.00  0.322   
2 fertility_rate              0.820     1.56           0.526 0.602   
3 gdp_per_capita_current_usd  0.0000337 0.00000934     3.61  0.000896
model_fit_stats(lm_fit)
# A tibble: 1 × 12
  r.squared adj.r.squared sigma statistic p.value    df logLik   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl>
1     0.261         0.221  1.74      6.52 0.00374     2  -77.3  163.  169.
# ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
model_summary_table(
  lm_fit,
  use_broom = FALSE
)

Call:
lm(formula = as.formula(formula_str), data = model_df)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.0750 -0.4806  0.1722  0.6056  4.0655 

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
(Intercept)                -2.323e+00  2.316e+00  -1.003 0.322314    
fertility_rate              8.205e-01  1.558e+00   0.526 0.601704    
gdp_per_capita_current_usd  3.375e-05  9.342e-06   3.612 0.000896 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.738 on 37 degrees of freedom
Multiple R-squared:  0.2607,    Adjusted R-squared:  0.2208 
F-statistic: 6.525 on 2 and 37 DF,  p-value: 0.00374
#|message: false
#|warning: false
#|layout-ncol: 2
#| code_fold: true
lm_fit <- fit_worldbank_lm(
  wb_wide,
  outcome = "life_expectancy_total",
  predictors = c(
    "gdp_per_capita_current_usd",
    "fertility_rate"
  ),
  years = 2023
)

model_summary_table(lm_fit)
# A tibble: 3 × 5
  term                         estimate std.error statistic  p.value
  <chr>                           <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)                81.7       3.01          27.2  4.78e-26
2 gdp_per_capita_current_usd  0.0000726 0.0000121      5.99 6.50e- 7
3 fertility_rate             -3.31      2.02          -1.63 1.11e- 1
model_fit_stats(lm_fit)
# A tibble: 1 × 12
  r.squared adj.r.squared sigma statistic     p.value    df logLik   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>       <dbl> <dbl>  <dbl> <dbl> <dbl>
1     0.534         0.509  2.26      21.2 0.000000734     2  -87.7  183.  190.
# ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
#|message: false
#|warning: false
#|layout-ncol: 2
#| code_fold: true
aov_fit <- fit_worldbank_anova(
  wb_wide,
  outcome = "fertility_rate",
  group = "year",
  years = 2000:2023
)

model_summary_table(aov_fit)
# A tibble: 2 × 6
  term         df sumsq meansq statistic p.value
  <chr>     <dbl> <dbl>  <dbl>     <dbl>   <dbl>
1 year         23  2.27 0.0986      1.50  0.0631
2 Residuals   936 61.7  0.0660     NA    NA     
model_fit_stats(aov_fit)
# A tibble: 1 × 6
  logLik   AIC   BIC deviance  nobs r.squared
   <dbl> <dbl> <dbl>    <dbl> <int>     <dbl>
1  -45.0  140.  262.     61.7   960    0.0354
model_summary_table(
  aov_fit,
  use_broom = FALSE
)
             Df Sum Sq Mean Sq F value Pr(>F)  
year         23   2.27 0.09860   1.495 0.0631 .
Residuals   936  61.73 0.06595                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
#|message: false
#|warning: false
#|layout-ncol: 2
#| code_fold: true
aov_fit <- fit_worldbank_anova(
  wb_wide,
  outcome = "life_expectancy_total",
  group = "country_name_ascii",
  years = 2023
)

model_summary_table(aov_fit)
# A tibble: 1 × 4
  term                  df sumsq meansq
  <chr>              <dbl> <dbl>  <dbl>
1 country_name_ascii    36  370.   10.3
model_fit_stats(aov_fit)
# A tibble: 1 × 6
  logLik   AIC   BIC deviance  nobs r.squared
   <dbl> <dbl> <dbl>    <dbl> <int>     <dbl>
1    Inf  -Inf  -Inf        0    37         1
  • These all work as well.

14.4.6 Summary of Claude Code’s Performance

Claude Code performed well on all three development tasks. Most importantly, the resulting functions worked correctly without requiring manual debugging or substantial revision.

  • This success was likely due in part to the rich context available within the repository.
  • Rather than relying solely on information provided in a prompt, Claude Code was able to inspect existing scripts, identify coding conventions, study function patterns, and infer how new functionality should fit into the project.

The progression across the three tasks also demonstrated increasing levels of repository awareness.

  • The scatterplot modification required understanding and extending an existing function.
  • The boxplot task required creating a new function that matched the structure and style of the existing plotting helpers.
  • The modeling task required creating an entirely new script while maintaining consistency with the broader project architecture.

At the same time, Claude Code made a number of implementation decisions without consulting the user.

  • For example, when adding regression and LOESS smoothers, it chose to include confidence intervals by setting se = TRUE.
  • This was a reasonable default choice, but another analyst might have preferred to suppress confidence bands.
  • Repository-aware agents frequently make these kinds of decisions when details are left unspecified.

One interesting omission was the lack of any suggestion to use the ... argument when designing the plotting functions.

  • Using … would have made the functions more flexible by allowing additional graphical parameters to be passed directly to the underlying ggplot2 geoms. - Whether this reflects a limitation of the model, the examples available in the repository, or simply a conservative design choice is difficult to determine. - It serves as a reminder that even when the generated code is correct, experienced developers may still identify alternative designs or improvements that the agent does not propose.

Overall, the examples illustrate both the strengths and limitations of repository-aware agents.

  • They can rapidly generate useful, working code that conforms to project conventions, but they do not eliminate the need for human review, design judgment, and software engineering expertise.
NoteAdvanced Topic: The ... Argument

Many R functions include a special argument called ... (“dot-dot-dot”).

  • The ... argument collects additional arguments supplied by the user and passes them to another function.
  • This is a common design pattern in the tidyverse and allows a function to remain flexible without exposing every possible option as a formal argument.
  • For example, a plotting function could pass ... directly to geom_point(), allowing users to control point size, color, shape, transparency, and other graphical settings without modifying the function itself.

As you develop more sophisticated reusable functions, ... becomes an important tool for balancing simplicity and flexibility.

NoteRepository Agents Can Do Much More Than Generate Code

The examples in this section focused on modifying and extending R functions, but repository agents such as Claude Code can perform far more sophisticated tasks.

  • Beyond understanding and editing code, they can inspect entire repositories, coordinate work across multiple files, execute terminal commands, run tests, generate documentation, and assist with Git workflows.

Repository agents can also be customized through project context files such as CLAUDE.md, reusable workflows known as skills, and specialized subagents that handle specific responsibilities such as visualization, code review, or documentation.

  • For example, a skill might automate a multi-step data validation workflow, while a visualization subagent could generate publication-quality figures without requiring the primary agent to manage the plotting details directly.
  • These capabilities allow repository agents to support complex analytical and software development workflows that extend well beyond the capabilities of most chat assistants or IDE assistants.

The tradeoff is that greater autonomy requires greater oversight. As repository agents gain the ability to modify files, execute commands, and coordinate multi-step workflows, careful review, testing, and human judgment become increasingly important.

14.5 Working with Posit Assistant

Posit Assistant is like having Claude (or other LLMs) inside your IDE.

  • It works in RStudio or Positron depending upon the subscription model you are using.
  • It does not work in Post Cloud yet.
Important

To install Positron on your computer, see Install Positron

  • The Positron Windows installer is about 378 MB for release (2026.03.0).
  • Use a download manager (like Free Download Manager) that supports resuming
  • 8 GB will work for light use, but 16 GB is a much more comfortable target if you can get there. If you’re stuck with 8 GB, close your browser and any other apps while working to keep things smooth.
  • Positron is built on top of open source VS Code (Code OSS), which uses components written in C++.
    • If those runtime files are missing or outdated on your machine, Positron can crash or fail to start, sometimes with a cryptic error message.
    • The official Microsoft download page is: https://learn.microsoft.com/en-us/cpp/windows/latest-supported-vc-redist.
    • Once you’re there, scroll down to the “Latest Microsoft Visual C++ Redistributable Version” section and click the X64 download link (the one for most Windows PCs).

The Assistant panel lives in Positron’s sidebar alongside your Variables pane, Plots pane, and console; the same environment where the rest of your analytical work happens.

Posit Assistant is repository-aware.

  • It can inspect open files in the editor as well as other files in the repository.
  • It sees your loaded data frames, your plots, your console history, and your R session state.
  • When you ask a question about a data quality issue, it already knows the shape of the data you are working with.
  • When you ask for a plot fix, it can see the plot that is currently in the Plots pane.
  • That session-level context is what makes it specifically useful for exploratory data science work rather than general coding assistance.

14.5.1 Posit Assistant Workflow

Modern coding assistants are increasingly converging on a common workflow that separates problem solving into three stages: Ask, Plan, and Code.

  • These workflows determine the assistant’s objective rather than the tools available to it.
  • In any workflow, including Ask, the assistant is more than a chat interface; it may use Skills, project memory, web access, MCP servers, and other contextual resources to develop a response.
  • Ask focuses on understanding a problem, answering questions, and making recommendations.
  • Plan focuses on designing and reviewing an implementation strategy before changes are made.
  • Code focuses on implementing solutions and completing tasks within a project.

This progression mirrors how many developers naturally work: first understand the problem, then decide on an approach, and finally implement the solution.

Posit Assistant extends this emerging pattern with a fourth workflow, Clean, that provides specialized support for data preparation, validation, and quality assessment. This addition reflects the reality that data scientists often spend significant time understanding and improving data before analysis or modeling can begin.

Table 14.4 summarizes the four primary workflows available in Posit Assistant.

Table 14.4: Primary workflows available in Posit Assistant.
Workflow Purpose
Ask Explain concepts, answer questions, analyze results, and provide recommendations
Plan Develop an implementation strategy before making changes
Code Create, modify, and test project files
Clean Inspect and improve data quality

A common pattern for larger tasks is:

  • Begin in Ask mode to understand the problem and explore alternatives.
  • Once a direction has been chosen, Plan mode can be used to generate a proposed implementation strategy.
  • After reviewing and approving the approach, Code mode can execute the required changes, often across multiple files and project resources.

For data preparation tasks, Clean mode provides specialized tools for identifying and correcting data quality issues.

14.5.1.1 Ask

Ask is the default conversational workflow. It is designed for explanations, code reviews, debugging assistance, and exploring alternatives before making changes to a project.

Typical uses include:

  • understanding warnings and error messages,
  • reviewing statistical output,
  • comparing alternative implementations,
  • learning unfamiliar packages or functions,
  • and discussing possible approaches to a problem.

The assistant responds in the chat panel but does not automatically modify project files.

14.5.1.2 Plan

Plan focuses on analysis and design. Instead of immediately changing code, the assistant first produces a proposed sequence of steps describing how the task should be completed. (“Think before you code!”)

Plan mode is particularly useful for:

  • large refactoring efforts,
  • multi-file projects,
  • unfamiliar codebases,
  • and situations where design decisions should be reviewed before implementation.

The generated plan can be approved, modified, or rejected before any changes are made.

14.5.1.3 Code

Code is the implementation workflow. The assistant can inspect files, search the repository, write code, modify existing files, and perform multi-step tasks.

  • Because Code mode can access considerably more project context than Ask mode, it is often capable of solving problems that require understanding relationships among multiple files or project components.
  • For well-defined tasks, experienced users often move directly from Ask to Code or bypass Ask entirely and begin in Code mode.
Tip/context Command

Code mode typically consumes more context than Ask mode because it may read project files, session state, execution results, and other resources while working.

  • If you are monitoring usage, use the /context command to see how much context is currently being sent to the model and which files contribute most heavily to the session.

14.5.1.4 Clean

Clean extends the Ask–Plan–Code workflow with data-science-specific capabilities. It focuses on inspecting datasets, identifying quality issues, recommending cleaning strategies, and generating code to implement those corrections.

Typical tasks include:

  • identifying missing values,
  • detecting inconsistent categories,
  • validating joins and merges,
  • correcting data types,
  • locating potential outliers,
  • and evaluating overall data quality.

For many data science projects, Clean mode serves as a natural first step before analysis, visualization, or modeling begins.

14.5.2 Advanced Posit Assistant Features

Beyond basic chat and code generation, Posit Assistant includes several features that support more sophisticated and repeatable workflows.

14.5.2.1 Commands

Commands provide shortcuts for common assistant actions. Rather than writing a detailed prompt, users can invoke predefined workflows directly from the chat interface.

Examples include:

  • /plan — Analyze a task and create an implementation strategy before making changes.
  • /compact — Summarize the conversation and reduce context usage, helping preserve tokens in long sessions.
  • /clear — Clear the conversation history and start fresh without closing the session.
  • /savememory — Create or update project memory based on the current conversation.
  • /report — Generate a Quarto report or document from the current analysis.

Commands help standardize common interactions and can make the assistant more efficient when working on larger projects.

TipThe /compact Command for Context Engineering

As conversations become longer, a growing portion of the model’s context window is consumed by previous messages, code snippets, plans, file contents, and intermediate results.

  • Eventually, the assistant may spend substantial resources processing old information that is no longer relevant to the current task.

Use the /compact command to create a concise summary of the conversation and replace much of the detailed history with that summary.

  • This reduces the number of tokens required to represent prior context while preserving the most important information.
  • For example, after spending an hour discussing data cleaning, package selection, and project setup, a user might invoke /compact before beginning model development.
  • The assistant would retain the major decisions and project context while discarding much of the detailed back-and-forth conversation.

Use /compact when you want to continue working while reducing context usage; use /clear when you want to start a fresh conversation.

  • Project-level instructions stored in AGENTS.md remain available even after a conversation is cleared.

14.5.2.2 Skills

Skills are specialized knowledge modules that provide the assistant with additional expertise in a particular domain or workflow.

  • When the assistant recognizes a relevant task, an appropriate skill may be loaded automatically.

For example:

  • A Shiny application request may activate a Shiny skill.
  • A Quarto publishing task may activate a documentation skill.
  • A data-cleaning request may activate a data-preparation skill.

Skills can contain instructions, workflows, examples, coding conventions, and best practices that guide the assistant’s behavior.

  • Organizations and individual users can also create custom skills to encode project-specific knowledge and preferred workflows.

Examples include:

  • creating Quarto chapters that follow a course template,
  • generating ggplot visualizations using preferred styling conventions,
  • cleaning and validating data before analysis, or,
  • implementing organization-specific coding standards.

You can add your own skills by writing the markdown document and saving in a /skills directory.

  1. Personal skills are available in any project you work on.
  2. Project-specific skills are only active within a particular repository.
  • Project skills take priority over user skills, and both override built-in skills with the same name.

Skills help make assistant behavior more consistent and reduce the need to repeatedly provide the same instructions in prompts.

14.5.2.3 Memory and AGENTS.md

Posit Assistant can maintain persistent project-level context through an AGENTS.md file located in the root of a project.

  • Like CLAUDE.md, this file serves as a central location for documenting information the assistant should consistently consider when working within the repository.
  • The AGENTS.md is intentionally more detailed than CLAUDE.md because Posit Assistant has session-level context (loaded data frames, plots, console history) and benefits from knowing precise function signatures and data shapes upfront, rather than having to inspect files to discover them.

Typical contents include:

  • project goals,
  • coding standards,
  • preferred package choices,
  • testing requirements,
  • directory organization,
  • documentation conventions,
  • and workflow instructions.

For example, an AGENTS.md file might instruct the assistant to:

  • use tidyverse conventions,
  • avoid absolute file paths by using the {here} package,
  • always write functions that are reusable and well-documented,
  • write Quarto examples using labeled chunks,
  • use American English for comments and documentation, and,
  • always include unit tests for new functions.

The /savememory command can help generate or update this file from an ongoing conversation.

  • By moving important instructions into AGENTS.md, teams can create a shared and version-controlled source of project guidance rather than repeatedly supplying the same context in prompts.

14.5.2.4 MCP Servers

MCP (Model Context Protocol) servers allow Posit Assistant to connect to external tools and data sources.

  • MCP provides a standardized mechanism for exposing capabilities to AI assistants without requiring custom integrations for every service.
  • Instead of relying solely on local project files, the assistant can retrieve information, query databases, execute actions, or interact with external systems through MCP-enabled tools.

Examples include:

  • database connections,
  • cloud storage systems,
  • APIs,
  • issue trackers,
  • documentation repositories, and,
  • internal organizational tools.

As the ecosystem develops, MCP is increasingly becoming a common standard for connecting AI assistants to software, services, and organizational knowledge sources.

14.5.3 Getting Started

Posit Assistant is not an extension to install; it is built into Positron.

  • Posit Assistant supports multiple LLM providers, including Anthropic’s Claude models and others.
  • Depending on the model you choose, you may need to supply an API key from the relevant provider; usage is billed by that provider according to their own pricing.
  • API keys are registered inside Positron under Settings → Posit Assistant.
Note

Posit does not track, collect, or store your prompts, code, or conversations when using Posit Assistant.

  • Your session data goes directly to the model provider under your own API key and is subject to that provider’s data handling policies, not Posit’s.
  • This is the same arrangement as using the provider’s API directly. Posit is providing the client, not the model.

14.5.4 The Posit Assistant Panel

Figure 14.12 shows Positron with the Assistant panel open.

The Assistant Panel has three parts worth looking at before using it.

Positron showing the Posit Assistant chat panel on the left, and an R console on the right. Annotations indicate the icons on the activity bar and at the bottom of the chat pane.
Figure 14.12: The Posit Assistant panel open alongside the R console. The activity bar icon toggles the panel; the Chat and model selector and mode dropdown sit at the bottom of the chat pane.

The activity bar icon (left edge, annotated) toggles the Assistant panel open and closed.

  • It sits alongside the file explorer, search, and Git icons.
  • The panel is part of the IDE layout, not a floating window.

The Chat/Ask input near the bottom of the panel is where prompts go.

  • Three features are available inline:
    • drag and drop a file to add to the context,
    • @ selects a chat participant such as a specific file or workspace context;
    • / accesses predefined commands including /help.
  • The Add Context Paperclip button to the left of the chat input opens the local file manager to add files.
  • The icons below the Chat/Command allow you to adjust how you want the conversation to proceed.
    • Allow/Disallow web searches
    • Enable plan mode where the assistant will not execute code changes without permission
    • Enable Cleaning mode where the Assistant will focus on actions and suggestions for cleaning data sets
    • Adjust Thinking controls how much reasoning effort the model is allowed to expend before responding. In practice, this is often a tradeoff between:
      • response speed,
      • token consumption,
      • depth of analysis,
      • and reliability on complex tasks.
    • The Sandbox is for testing code snippets in a safe environment without affecting your actual project files. It allows you to experiment with code generated by the assistant before applying it to your repository.

The Run time and model selector dropdowns sit at the bottom on the right.

Posit Assistant can see this session state and will incorporate it into responses without you having to describe it.

14.5.5 Example: Working with Posit Assistant

To illustrate this workflow, consider the World Bank Europe project and suppose we want to study migration patterns in Europe and understand whether emigration may help explain some of the long-run population changes observed in countries such as Albania.

  • Because Posit Assistant already has access to the repository and session state, it can begin from the current project rather than a blank chat window.
  • The repository already contains data and functions for downloading and processing World Bank indicators, creating visualizations, and fitting simple models.

Before searching for additional data ourselves, we can switch to Code mode and pose the following question:

Scan the repo for the types of data that exist from the world bank and suggest other means to get data about immigration and emigration from European countries. Are there reports or PDF doucments or oter data sources that would work and for which you could write R or python code to convert tables into data

This example illustrates a complete Code-mode workflow.

Posit Assistant begins by examining the project and gathering context before proposing a solution. In Figure Figure 14.13 the user asks the assistant to identify immigration and emigration data sources that would complement the existing World Bank dataset. The assistant first explores the repository and metadata workbook to understand what data already exist.

Figure 14.13: Posit Assistant begins by exploring the repository and examining the World Bank metadata workbook before proposing additional immigration and emigration data sources.
Figure 14.14: The assistant requests permission to read the World Bank metadata workbook so it can understand the existing indicators and data structure.

After reviewing the project files, the assistant performs web searches to identify additional data sources and evaluate their suitability for the project.

Once sufficient context has been collected, the assistant switches into Plan mode. Rather than immediately generating code, it creates a detailed implementation plan describing the proposed workflow, files to be created, and data sources to be used.

Figure 14.15: Posit Assistant enters Plan mode and begins collecting information needed to develop an implementation strategy.
Figure 14.16: The assistant writes a detailed plan to a temporary plan file and prepares a structured implementation proposal.
Figure 14.17: The completed plan is presented to the user for review. The user can continue planning, exit planning mode, or approve the plan and begin implementation.

As shown in Figure Figure 14.17, Plan mode introduces an explicit review step before changes are made to the project.

After the plan is approved, Posit Assistant begins implementation. Proposed file changes are displayed for review before being written to disk.

Figure 14.18: The assistant proposes creation of a new script for downloading and processing Eurostat migration data. The user can review the code before approving the change.
Figure 14.19: A second script is proposed for downloading and processing United Nations DESA migration stock data. Each file modification requires explicit approval.

This approval process allows the user to inspect generated code before it becomes part of the repository.

After implementation is complete, Posit Assistant provides a summary of the work performed and explains how the newly created files relate to the existing project.

Figure 14.20: Posit Assistant summarizes the completed work, describes the new data sources, and explains how they complement the existing World Bank migration indicators.

The final summary serves as a project-level report that documents the assistant’s decisions, newly created files, and recommended next steps.

Figure 14.13 through Figure 14.20 exemplify a typical Code mode interaction.

  • Subsequent refinement could continue in Ask or Plan mode, where the context is typically narrower and focused on the specific files being modified (and generally fewer tokens are consumed).

14.5.6 Checking the Code

Although you nay get different code if you ran it, both scripts required several rounds of refinement an debugging to work with the data sources.

Four distinct issues came up getting the two scripts working, each worth knowing for future data work.

Script 1 — 03_get_eurostat_migration.R (three bugs)

  1. Column renamed in current API version The eurostat package (v4.0.0+, which switched to the SDMX 2.1 API in late 2023) returns the time column as TIME_PERIOD, not time. The rename call rename(year = time) failed silently in some places and hard-error-ed in others.
    • Fix: change to the correct name.
  2. label_eurostat() destroys the geo codes. The script called label_eurostat() before add_iso3(), but labeling converts Eurostat 2-letter codes (“DE”, “FR”) into full country names (“Germany”, “France”). Calling countrycode(..., origin = "eurostat") on a country name returns NA, so the subsequent filter(!is.na(country_code_3)) silently dropped every row, producing a zero-row Parquet file with no error.
    • Fix: run add_iso3() on the raw coded data first, then label if needed.
  3. Duplicate rows from the agedef dimension tps00176 and tps00177 include two age-definition variants per country-year: COMPLET (age on last birthday) and REACH (age reached during the year). Without filtering on this, the join between immigration and emigration tables triggered a many-to-many warning and doubled counts for some countries.
    • Fix: add agedef == "COMPLET" to the filter, which is the standard international definition.

Script 2 — 04_get_undesa_migrant_stock.R (two bugs)

  1. UN server blocks download.file() with 403 The UN DESA server rejects programmatic requests from download.file() but accepts them from a browser.
    • Fix: use httr::GET() with a browser user-agent string and a Referer header pointing to the UN DESA landing page.
  2. Multi-row merged header in the Excel file The Excel workbook has a table-of-contents sheet (not the data), plus 10 rows of title/metadata before the actual column headers, and the year columns span three merged header rows (both sexes / male / female) that readxl cannot parse automatically.
    • Fix: use skip = 10 and supply column names manually, un_code, country_name, then both_1990 …both_2024,male_1990 … female_2024`, then filter rows with un_code < 900 to drop regional aggregates.

What remained a data issue (not a script issue)

Albania, Bosnia, Serbia, Turkey, and Ukraine are absent from Eurostat entirely; they don’t submit data to Eurostat’s POPSTAT collection.

  • Kosovo is absent from both sources (not a UN member state).
  • UN DESA covers all five Eurostat gaps; Kosovo has no international migration data available from any standard source.
Important

Even when an assistant identifies a promising data source, the analyst remains responsible for evaluating the quality of that source.

  • Questions such as data provenance, collection methodology, geographic coverage, missing values, and comparability across countries still require human judgment.
  • AI can accelerate discovery and implementation, but it cannot determine whether a source is appropriate for a particular research question.

14.5.7 Final Results

14.5.7.1 Graph Descriptions

Figure 14.21 shows migration flows can vary substantially from year to year and often respond quickly to economic and political events.

Figure 14.21: Faceted line charts showing annual immigration and emigration flows for selected European countries between 2013 and 2024. Blue lines represent immigration and red lines represent emigration. Most countries exhibit substantial year-to-year variation, with several showing sharp increases after 2021. Germany, Spain, and France have the largest migration flows among the countries shown.
  • The figure illustrates the distinction between migration flows (people moving during a given year) and migrant stocks (people residing in a country at a particular point in time).
  • Countries differ substantially in both the magnitude and direction of migration flows.

Figure 14.22 illustrates the difference between migration stocks and migration flows. Stocks tend to change gradually over time because they reflect the cumulative result of many years of migration.

Figure 14.22: Line chart showing international migrant stock by destination country from 1990 to 2024 on a logarithmic scale. Gray lines represent countries included in both Eurostat and UN DESA data sources, while highlighted red lines identify countries available only through UN DESA. Turkey and Ukraine have the largest migrant stocks among the highlighted countries, while Albania, Bosnia and Herzegovina, and Serbia have substantially smaller immigrant populations.
  • Migrant stock measures the number of foreign-born residents living within a country rather than the number of people moving during a particular year.
  • The highlighted countries demonstrate how UN DESA data can expand geographic coverage beyond what is available through Eurostat alone.

Figure 14.23 highlights the demographic significance of emigration in several Balkan countries, where large shares of the population reside abroad.

Figure 14.23: Line chart showing emigrants living abroad as a percentage of the home-country population from 1990 to 2024 for selected European countries. Bosnia and Herzegovina, Albania, and Moldova exhibit the highest levels, with approximately one-third to one-half of their populations residing abroad by 2024. Several Balkan countries show substantial increases over time, highlighting the demographic importance of emigration.
  • The figure combines UN DESA migrant stock estimates with World Bank population data to illustrate the demographic importance of emigration.
  • Several Balkan countries have experienced substantial long-term population outflows, with emigrant populations approaching or exceeding one-third of their current resident populations.

14.5.7.2 Assistant Interpretation

Figure 14.21

  • The 2022 spike is striking across most countries (Germany, France, Spain, Netherlands, Italy, Poland, Austria, Sweden, Belgium). This is the Ukraine displacement effect, with 2022 seeing an unusually large immigration surge.
  • Poland is historically an emigration country (red ≥ blue pre-2020), but flipped sharply in 2022 as it became a major Ukrainian refugee host.
  • Greece shows the opposite pattern around 2015. Emigration nearly matched immigration during the debt crisis, then immigration rebounded.
  • Romania has persistently high emigration relative to its immigration, reflecting long-running labor emigration to Western Europe.
  • Germany shows the largest absolute volumes, dwarfing all others.

Figure 14.22

  • Turkey had ~6–7M foreign-born residents in 1990 (many ethnic Turks returning from Germany), declined to ~3M around 2000 as return migration ended, then spiked sharply to ~7M by 2024, largely driven by the Syrian refugee crisis post-2015.
  • Ukraine jumped from ~5M to ~6M between 2020 and 2024, reflecting post-invasion displacement being captured in destination-country censuses.
  • Albania, Bosnia, Serbia have tiny migrant stocks; they’re primarily emigration countries, not destination countries.
  • By adding a log scale, Albania and Bosnia are now clearly visible at the ~50–100K range rather than being flat against zero.
  • The log scale also makes Serbia’s declining trend (large diaspora communities from the 1990s wars gradually naturalizing abroad) visible alongside the Turkey/Ukraine story at the top.
  • One important caveat: UN DESA is migrant stock (a point-in-time count at 5-year intervals), not annual flows like the Eurostat data. The two sources measure different things and shouldn’t be combined on the same axis without clearly labeling that distinction.

Figure 14.23

  • Bosnia and Albania are at ~50%, roughly one person living abroad for every person still at home.
  • The 1995 Bosnia spike reflects mass displacement during the Bosnian War; it then declined as some refugees returned, before rising again steadily through EU free movement.
  • Albania’s line rises almost monotonically from 1990, reflecting a sustained out-migration that started the moment borders opened after communism collapsed.
  • Moldova (~35%) tells a similar story to the Balkans; EU-oriented labor emigration accelerating after 2000.
  • Croatia’s sharp drop from ~27% to ~10% between 2010 and 2020 looks strange. This likely reflects Croatia joining the EU in 2013, which reclassified some long-settled Croatians in EU countries as “citizens” rather than “emigrants” in national registration data, causing a statistical break rather than a genuine return. A reminder to check context for data quality.
  • Ukraine’s 2024 jump reflects post-invasion displacement being captured for the first time at scale.
  • One methodological note: because the denominator (home population) is also shrinking due to emigration, the % can rise faster than the absolute emigrant count would suggest — which makes the Bosnia/Albania numbers even more sobering than they first appear.

14.6 Agent SDKs

Agent Software Development Kits (SDKs) provide a software framework for building custom AI agents.

  • Rather than interacting with a pre-built assistant, developers define the agent’s instructions, tools, workflows, and constraints in code.
  • Modern SDKs handle many of the difficult infrastructure tasks automatically, including tool registration, conversation management, model communication, and execution loops.
  • This allows developers to focus on the domain-specific logic of the agent rather than the underlying plumbing.

The OpenAI Agents SDK is a widely used example of this approach.

  • A data scientist might create an agent that analyzes World Bank data, retrieves country-level indicators, generates visualizations, and produces written summaries.
  • Each capability is implemented as a tool that the agent can call when needed.
  • Because the tools are (usually) written in Python, they can be tailored to the specific needs of a project while still benefiting from the SDK’s built-in orchestration and tracing capabilities.

The key advantage of an SDK is flexibility.

  • Claude Code, Codex, Posit Assistant, and similar systems are designed primarily to help users work with code and projects through a predefined interface.
  • An SDK allows you to move beyond those built-in workflows and create agents specialized for a particular task, organization, dataset, or business process.
  • This additional flexibility comes at the cost of greater development effort since you are responsible for designing, testing, and maintaining the agent.

In practice, many data scientists begin with tools such as Claude Code or Posit Assistant because they provide immediate productivity gains with little setup.

  • SDKs become more attractive when the workflow is repeatedly performed, requires specialized tools, must integrate with existing systems, or needs capabilities that are not available in a general-purpose coding assistant.
  • At that point, building a custom agent can provide a level of automation and control that exceeds what a pre-built assistant can offer.

14.7 Leading Edge Concepts and Practices

14.7.1 Reusable Agent Definitions

An emerging practice is reusable agent definitions.

  • Rather than writing a new prompt for every task, practitioners create role descriptions that define responsibilities, preferred workflows, and expected outputs.
  • These definitions can often be reused across projects and, with minor modifications, across different agent frameworks.

Community repositories now provide agent definitions for roles such as data scientist, software engineer, code reviewer, researcher, and project manager.

In many cases, the same role definition can be used with Claude Code, Codex, SDK-based agents, or other agent frameworks, making the agent specification itself a reusable asset.

14.7.2 Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) combines information retrieval with LLM generation.

  • Rather than relying exclusively on information learned during training, a RAG system first searches a collection of documents, databases, project files, or other resources for information relevant to the user’s question.
    • This is a way to get your latest, most accurate , or proprietary data or content into the model.
  • The retrieved material is then inserted into the model’s context window, allowing the model to generate a response grounded in the retrieved content rather than relying entirely on its internal knowledge.

Many of the AI tools discussed in this chapter use some form of retrieval.

  • Claude.ai can search uploaded files, Claude Code can retrieve relevant source files from a repository, and Posit Assistant can inspect project files, data frames, plots, and session state.
  • In each case, the underlying model remains largely the same, but the quality of the response improves because the model is given access to information specific to the current task.
  • As projects grow beyond what can fit into a single prompt, retrieval becomes one of the primary mechanisms for providing relevant context to an AI system.

Conceptually, the workflow looks like:

Question -> Retrieve Information -> Add Context -> Generate Response

RAG systems are widely used when assistants must work with large collections of documents, reports, databases, or project files.

14.7.3 Vibe Coding

Vibe coding refers to a style of software development in which a programmer describes goals in natural language and relies heavily on an AI assistant to generate, modify, and explain code.

  • This approach can dramatically accelerate prototyping, reduce repetitive coding tasks, and make it easier to explore alternative solutions.
  • Tools such as Claude Code, Positron Assistant, GitHub Copilot, Cursor, and other AI-enabled development environments have made this workflow increasingly common.
  • However, the effectiveness of vibe coding depends heavily on the developer’s ability to evaluate the resulting code. - AI-generated solutions can contain subtle bugs, incorrect assumptions, inefficient algorithms, or architectural problems that may not become apparent until much later.
  • The most successful practitioners use AI to accelerate implementation while still applying the same testing, debugging, and code review practices they would use for human-written code.
Warning 14.1

A Common Risk

If you do not know what a correct solution should look like, it becomes much harder to recognize incorrect code generated by an AI assistant.

AI-generated code should be treated as a starting point for review rather than unquestioned truth.

14.7.4 Human-in-the-Loop AI

Human-in-the-loop systems combine AI automation with human review and decision making.

Rather than allowing an AI system to operate completely independently, important actions require human approval. Examples include reviewing generated code before merging it, approving an agent’s proposed plan before execution, or validating analytical results before publication.

Many modern AI development tools use this approach. Claude Code asks for approval before potentially destructive actions, Positron Assistant requires confirmation before applying edits, and production agent systems often pause at predefined checkpoints.

The goal is not to eliminate human involvement but to combine the speed of automation with human judgment and accountability.

14.7.5 Persistent and Multi-Agent Workflows

When people talk about agents running “all day” or “24/7,” they usually do not mean that a model is continuously reasoning in the background.

  • More often, they mean that an agent has been connected to a scheduler, event trigger, queue, or cloud environment.
  • The agent wakes up when something happens, checks relevant information, takes a limited action, records the result, and then stops until the next trigger.
  • For example, a price-monitoring agent might check a product page every hour, a market-monitoring agent might look for large stock-price changes, and a coding-summary agent might review Git commits at the end of each day and suggest ways to improve the developer’s workflow.

In coding workflows, this can look several different ways.

  • With terminal-based tools such as Claude Code, a developer may open one or more terminal sessions, give each session a different scoped task, and let the agents work independently on separate branches or files.
    • One agent might refactor plotting functions, another might write tests, and another might update documentation.
    • With cloud-based coding agents such as Codex, the work may run in separate cloud environments in the background, including in parallel, rather than in visible local terminal windows.
  • In either case, the practical pattern is the same:
    • split a large project into smaller tasks,
    • give each agent clear instructions,
    • isolate its workspace, and
    • review its changes before merging them.
  • Codex Cloud is explicitly designed for background and parallel coding tasks, while Claude Code is designed to work from a terminal, IDE, web interface, or other connected development surface.

For monitoring personal or online activity, the workflow is similar but less code-centered.

  • A practitioner defines the target to monitor, the trigger condition, the tools the agent can use, and the notification or storage mechanism.
    • A simple version might be: every morning, read yesterday’s Git commits, summarize what changed, identify unfinished work, and write a short planning note.
    • A more complex version might monitor product prices, public web pages, RSS feeds, GitHub issues, API endpoints, or analytics dashboards.
    • The important design choice is that the agent should not have unlimited authority.
    • It might be allowed to read data and draft recommendations, but sending emails, placing orders, trading stocks, deleting files, or changing production systems should require explicit human approval.

Multi-agent systems extend this pattern by assigning different roles to different agents.

  • One agent might gather data, another might validate sources, another might write a summary, and another might review the output for errors.
  • This can be more reliable than asking a single large agent to do everything, because each agent has a narrower task and clearer success criteria.
  • The tradeoff is operational complexity: multiple agents require coordination, logging, cost controls, permissions, and a way to resolve conflicts when agents disagree or modify overlapping files.

The safest way to interpret claims about “always-on agents” is to translate them into familiar infrastructure terms:

  • scheduled jobs,
  • event-driven workflows,
  • background tasks,
  • logs,
  • permissions, and
  • review steps.

The novelty is that an LLM can now decide which tools to call and how to synthesize the results.

  • The responsibility remains the same: define scope carefully, monitor what the agent does, and keep humans in control of consequential actions.

14.8 Responsible Use of AI, Assistants, and Agents

The tools discussed throughout this chapter can accelerate many aspects of data science and software development. Whether you are working with a chat assistant, a repository-aware coding tool, or a fully autonomous agent, the same fundamental responsibilities remain.

14.8.1 Five Principles for Data Science and AI

  1. Verify results before using them.
    • AI-generated outputs can be convincing while still being incorrect. Always validate important code, analyses, and conclusions.
  2. Understand the data.
    • No AI system can replace knowledge of how data were collected, what variables mean, or what limitations exist in the dataset.
  3. Maintain reproducible workflows.
    • Analyses should remain transparent and reproducible through code, documentation, version control, and clear records of how results were produced.
  4. Protect sensitive information.
    • Before sharing data with cloud-based AI systems, understand privacy requirements, ownership restrictions, and applicable regulations.
  5. Remain accountable for decisions.
    • AI tools can assist with analysis and implementation, but responsibility for the final work remains with the human using the tool.

14.8.2 Five Additional Principles for Agents

  1. Limit an agent’s scope.
    • Give agents only the permissions and tools needed to complete a task. More capability also means greater risk.
  2. Keep humans in the loop for important actions.
    • Actions such as modifying files, executing code, sending messages, or updating databases should include opportunities for human review.
  3. Evaluate outputs, not appearances.
    • Agents often produce polished and confident results. Correctness must be verified independently.
  4. Monitor costs and resource usage.
    • Agentic workflows can consume large numbers of tokens, API calls, and computational resources if left unchecked.
  5. Avoid deskilling yourself.
    • The ability to evaluate an agent’s work depends on understanding the underlying methods. Continue developing your analytical and programming skills rather than relying entirely on automation.
Important

A useful rule of thumb is simple:

Never delegate responsibility, only tasks.

AI assistants and agents can generate code, analyze data, search repositories, and automate workflows, but they cannot assume responsibility for the quality, correctness, or consequences of the work they produce.

14.9 Summary

  • Most practitioners use existing assistants and frameworks rather than building agents from scratch.
  • AI assistants can help analyze larger and more complex projects.
  • Project context plays a critical role in agent performance.
  • Skills, subagents, and retrieval help manage complexity and reduce context size.
  • Vibe coding can accelerate development but requires careful oversight.
  • Version control becomes even more important when working with AI agents.
  • Responsible data science practices remain essential regardless of how much assistance is provided by AI systems.