14  Working with Agents

Published

June 9, 2026

Keywords

LLM, Agents, Agent Frameworks, Coding Assistants, Claude Code, Software Development Kit, SDK, Ethics, Responsible Use of Agents

14.1 Introduction to Working with Agents

14.1.1 Learning Outcomes

By the end of this chapter, you should be able to:

  • Explain the difference between an AI assistant and an agent framework.
  • Use an LLM to explore and analyze a larger real-world dataset.
  • Describe how project context influences assistant and agent behavior.
  • Explain the roles of project instructions, skills, subagents, and retrieval.
  • Recognize the benefits and risks of vibe coding.
  • Apply responsible software engineering practices when working with AI agents.

14.1.2 References

Additional References

Note: A Note on the Development of This Chapter

This chapter was developed with the assistance of both ChatGPT and Claude.

  • These tools were used throughout to explore ideas, generate initial code, and draft prose.
  • Through extended iterative conversations the material was revised, restructured, removed, and adjusted as I tuned it to align with my pedagogical goals for this course and serve as a reference for others.
  • All material was verified against primary references. The framing, sequencing, and editorial judgments are my own.

The contents of this chapter, their clarity, accuracy, and relevance, are my responsibility. Any errors are mine alone.

14.2 From Workflows to Agentic Assistants and Frameworks

In the previous chapter, we moved from single prompts to functions, workflows, agents, and agent frameworks. Along the way, we explored how tools, memory, context engineering, and orchestration allow large language models (LLMs) to solve increasingly complex tasks.

  • The examples in that chapter intentionally used small datasets such as mtcars and penguins so that we could focus on the underlying concepts without being distracted by data collection and preparation.

This chapter shifts from building agents to working with existing AI assistants and agent frameworks.

  • Rather than creating agent infrastructure from scratch, we will examine how modern tools help data scientists analyze larger projects, interact with real datasets, and automate portions of their workflow.

The central question for this chapter is:

How can modern AI assistants help us work with data science projects that have become too large or complex to manage manually?

Having explored the first four levels in Table 14.1, we now focus on the Agent Framework layer.

Table 14.1: Comparison of prompts, workflows, agents, and agent frameworks.

| Level | Description |
|---|---|
| Prompt | A single interaction with an LLM. |
| Function | A reusable prompt wrapped in code. |
| Workflow | A sequence of steps controlled by software. |
| Agent | An LLM that selects among available tools. |
| Agent Framework | Infrastructure that manages tools, memory, evaluation, and orchestration. |

The past year has seen a tremendous shift in how data scientists (and many other professionals) work. Rather than building agent infrastructure from scratch, many practitioners now use agentic assistants and frameworks that already provide tools, memory, evaluation, and orchestration capabilities.

These systems do not all operate in the same way. They differ primarily in:

  • How much context they can access.
  • How much autonomy they are given.
  • Whether they can use tools.
  • Whether they can modify project files.
  • Whether they are designed to help a user or to help build new agentic systems.

One useful way to think about these tools is as a spectrum of increasing capability.

Chat Assistant -> Project-Aware Assistant -> Repository Agent -> Agent Framework

  • At one end of the spectrum are chat assistants. These systems primarily interact through a conversational interface. They can answer questions, explain concepts, analyze uploaded files, and generate ideas, but they have little or no direct access to a project.
  • The next step is the project-aware assistant. These assistants are integrated into development environments and can access the current project, repository, and files being edited. They can help generate code, explain functions, suggest improvements, and assist with debugging. However, they generally require user approval before making substantial changes.
  • A further step is the repository agent. These systems can inspect an entire repository, reason across multiple files, execute commands, modify code, run tests, and carry out multi-step development tasks. Rather than simply responding to questions, they can perform actions on behalf of the user.
  • Finally, agent frameworks provide the infrastructure used to build or customize agentic systems. Rather than serving primarily as assistants, they allow developers to define tools, memory, workflows, evaluation procedures, and orchestration logic for new applications.

As we move from left to right along this spectrum, systems generally gain greater access to project context and a greater ability to take action on behalf of the user.

  • At the same time, the need for oversight, testing, version control, and responsible use also increases.

Table 14.2 lists a few of the many assistants and frameworks available today, with more on the way.

Table 14.2: Examples of assistants and frameworks across the spectrum of agentic capabilities.

| Category | Example | Project Access | Can Modify Files | Can Execute Tools | Typical Use |
|---|---|---|---|---|---|
| Chat Assistant | Claude.ai | Uploaded files and conversations | No | Limited | Exploration, explanation, document review, and analysis |
| Repository Agent | Claude Code | Full repository | Yes | Yes | Project-wide development and automation |
| Repository Agent | Codex | Full repository | Yes | Yes | Multi-step coding and development tasks |
| IDE Project-Aware Assistant/Agent | Positron Assistant | Current project and repository | With approval | Limited | Interactive coding and analysis |
| IDE Project-Aware Assistant/Agent | GitHub Copilot | Current project and repository | With approval | Limited | Code generation, completion, and debugging |
| Agent Framework | OpenAI Agents SDK | Developer-defined | Developer-defined | Developer-defined | Building custom agents and workflows |

These tools differ in their capabilities, but all seek to reduce the amount of manual effort required to complete complex tasks.

Because the tools in Table 14.2 provide different levels of context and autonomy, they are best suited to different types of tasks.

We will explore some of these capabilities in the following sequence:

Important 14.1: Increasing Autonomy Changes the Role of the Developer

As assistants gain access to more project context and more ability to take action, the role of the data scientist/developer also changes.

  • Historically, developers spent most of their time writing code.
    • Today, many developers spend a growing portion of their time reviewing AI-generated changes, evaluating proposed solutions, testing modifications, and managing requirements.
  • Interacting with an agent increasingly resembles supervising a junior team member.
    • The agent may generate code, propose architectural changes, or complete multi-step tasks, but a human remains responsible for evaluating the results.

This is one reason why version control systems such as Git and GitHub remain important.

  • As agents become more capable, developers increasingly review commits and pull requests, inspect diffs, and test agent-generated code before approving changes to a project branch.

To meet the challenges of the rapidly evolving roles of data scientists and developers, it is important not only to understand what good code looks like and how to write it, but also how to review, evaluate, and guide code and analyses produced by increasingly capable AI systems.

That requires understanding:

  • What these tools do well.
  • Where their limitations appear and how to detect problems.
  • How they fit into a modern data science workflow.
  • How to use them responsibly.

14.2.1 The Scenario of a Large Analytical Problem

The World Bank exercise introduced a workflow for collecting and analyzing demographic and economic indicators. The resulting dataset contained a few hundred observations and was small enough to explore manually.

For this chapter, we will work with a substantially larger collection of World Bank data.

The dataset contains:

  • 40 European countries.
  • Multiple decades of observations.
  • More than 100 indicators from the World Bank World Development Indicators API.
  • Hundreds of thousands of observations stored as a Parquet dataset, partitioned by country.

This creates a realistic analytical challenge.

As datasets become larger, it becomes more difficult to:

  • Discover relevant variables.
  • Identify meaningful relationships.
  • Compare countries and regions.
  • Generate and evaluate hypotheses.
  • Summarize findings efficiently.

AI assistants can help address many of these challenges.

Note 14.2: AI Assistants as Analytical Partners

Modern assistants are often most valuable when used as analytical partners rather than code generators.

  • They can help identify patterns, suggest analyses, critique conclusions, summarize findings, and explain results.
  • However, responsibility for evaluating those suggestions remains with the analyst.

14.2.2 The World Bank World Development Indicators Dataset

14.2.2.1 Creating the Dataset

Unlike the earlier World Bank exercise, the data have already been downloaded, cleaned, and stored as Parquet files.

  • The goal of this chapter is not data acquisition; we want to focus on how assistants and agent systems can help us work with larger analytical projects.

The dataset was created using two scripts (available in the data_raw folder) that use code similar to that in Section 12.9.4:

  1. A data acquisition script (01_get_world_bank_data_europe.R)
    • Reads country and indicator specifications from an Excel workbook.
    • Downloads data from the World Bank API.
    • Batches countries and indicators to remain within API limits.
    • Retrieves multiple pages of results.
    • Stores the raw data as a partitioned Parquet dataset.
  2. A data preparation script (02_clean_world_bank_data_europe.R)
    • Reads country and indicator metadata.
    • Cleans and standardizes the data.
    • Adds curated metadata.
    • Generates long and wide versions of the dataset.
    • Stores the cleaned datasets as partitioned Parquet datasets.

The resulting datasets contain observations for multiple European countries across multiple decades and indicators stored in two forms:

  • A long format dataset containing one observation per country, year, and indicator.
  • A wide format dataset containing one row per country-year combination and one column per indicator.
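
The relationship between the two forms can be sketched with a few rows of made-up data. This is a minimal illustration using {tidyr}, not the code from the cleaning script; the column names `indicator` and `value` and all of the numbers are assumptions chosen for the example.

```r
library(tidyr)

# Hypothetical toy data in long form: one row per country, year, and indicator.
# The values are made up for illustration.
wb_long <- data.frame(
  country_code_3 = c("ALB", "ALB", "GRC", "GRC"),
  year           = c(2023L, 2023L, 2023L, 2023L),
  indicator      = c("fertility_rate", "population_growth",
                     "fertility_rate", "population_growth"),
  value          = c(1.35, -1.54, 1.30, -0.40)
)

# Long -> wide: one row per country-year, one column per indicator.
wb_wide <- pivot_wider(wb_long, names_from = indicator, values_from = value)

# Wide -> long: back to one row per country, year, and indicator.
wb_long_again <- pivot_longer(
  wb_wide,
  cols      = c(fertility_rate, population_growth),
  names_to  = "indicator",
  values_to = "value"
)
```

Both objects hold the same information; only the shape changes, which is why a project can keep both and pick whichever suits the task.
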
Note

Interestingly, the wide Parquet files are approximately 25% larger than the corresponding long-format files.

  • While wide datasets contain fewer rows, they often contain many more columns and missing values.
  • In columnar storage formats such as Parquet, long datasets frequently compress more efficiently because repeated values (such as indicator codes and country identifiers) can be stored using highly compact encoding schemes.

This observation reinforces a broader lesson about data organization, shown in Table 14.3.

Table 14.3: Common strengths of long and wide data formats.
Format Strengths
Long Storage, aggregation, visualization, efficient compression
Wide Modeling, correlation analysis, machine learning

We often create multiple representations of the same data because different tasks require different structures.

That idea applies not only to data, but also to AI systems:

  • Long vs. wide datasets
  • Raw vs. cleaned data
  • Metadata vs. observations
  • General-purpose agents vs. specialized subagents

Each representation is optimized for a particular task.

14.2.2.2 Data Engineering Practices Enable Metadata as Context

The Excel workbook used to create the datasets contains more than just country and indicator codes.

It is the output of a data engineering effort to create an authoritative, human- and machine-readable source of information about the data that lives outside the code:

  • Curated, canonical human-readable variable names for indicators.
  • Measurement units.
  • Recommended display precision.
  • Curated indicator themes or categories.
  • Reasons for including each variable.
  • A canonical name for each country.
  • An ASCII name field for each country.

This information serves as a small data dictionary for the project that is under version control and managed by the analyst and/or the data engineering team.

From a data engineering perspective, separating the specifications from the code makes the specifications much easier to maintain.

Note: Metadata Changes Over Time

As an example, Türkiye requested that international organizations adopt “Türkiye” rather than “Turkey” as the preferred English name.

  • Because country names are stored in a centralized metadata file, the change can be made in one location without modifying analytical code throughout the project.

Having multiple names and identifiers is also a best practice for data engineering as long as one field is a standard suitable for joins.

  • The country codes are an international standard and fairly stable, so one can join on country_code_3 values such as TUR, ALB, and GRC.
  • Keep a curated country_name for display to humans.
  • Keep a country_name_ascii field for cases where software, URLs, or joins from older sources strip diacritical marks or you get a dataset that only has names and no codes.
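
A minimal sketch of how the three fields work together is shown below. The tables are hypothetical toy data built for this example (they are not the project's workbook), and the numeric values are made up.

```r
library(dplyr)

# Hypothetical metadata table: a stable code for joins, a curated name for
# display, and an ASCII name as a fallback for sources that strip diacritics.
country_meta <- tibble(
  country_code_3     = c("TUR", "ALB", "GRC"),
  country_name       = c("T\u00fcrkiye", "Albania", "Greece"),
  country_name_ascii = c("Turkiye", "Albania", "Greece")
)

# Hypothetical observations keyed only by the standard code (made-up values).
observations <- tibble(
  country_code_3 = c("TUR", "GRC"),
  year           = c(2023L, 2023L),
  value          = c(1.5, 1.3)
)

# Join on the stable code; the curated name comes along for display.
labeled <- observations |>
  left_join(country_meta, by = "country_code_3")

# An older source that has only names without diacritics can still be matched.
legacy <- tibble(country_name_ascii = "Turkiye", legacy_value = 10)
legacy_coded <- legacy |>
  left_join(country_meta, by = "country_name_ascii")
```

If the display name changes again, only `country_meta` needs to be edited; both joins keep working because neither depends on the display name.
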

14.2.2.3 From an Agent Perspective, Metadata Is Valuable Context

Modern AI assistants often perform better when they have access to metadata in addition to raw data.

For example, the indicator code SP.DYN.TFRT.IN is much more meaningful when accompanied by metadata indicating that it represents:

  • Fertility rate.
  • Births per woman.
  • A demographic indicator.
  • An important driver of long-run population change.

Providing this additional information is an example of context engineering.

  • Humans rely on metadata, documentation, and data dictionaries to understand a project.
  • Modern AI assistants and agents benefit from the same information.

In many systems, metadata becomes part of the context supplied to the model, allowing it to reason more effectively about the data.

In later sections we will see that many agent systems rely heavily on project documentation, metadata, and retrieval systems to provide this type of context.
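
One way to picture metadata becoming context is to turn a metadata record into text that is placed ahead of the question. The sketch below uses base R only; the field names, the helper `meta_to_context()`, and the example task are hypothetical, and the values simply restate the description of SP.DYN.TFRT.IN given above.

```r
# Hypothetical metadata record; field names are illustrative.
indicator_meta <- list(
  code  = "SP.DYN.TFRT.IN",
  name  = "Fertility rate",
  unit  = "Births per woman",
  theme = "Demographic indicator",
  note  = "An important driver of long-run population change"
)

# Turn the record into plain "field: value" lines that can precede a prompt.
meta_to_context <- function(meta) {
  paste0(names(meta), ": ", unlist(meta), collapse = "\n")
}

context <- meta_to_context(indicator_meta)

# Combine the context with a (hypothetical) task.
prompt <- paste(
  "Context:", context,
  "Task:", "Describe how this indicator changed in Albania since 2000.",
  sep = "\n"
)
cat(prompt)
```

With the metadata included, the model no longer has to guess what the code stands for or what units the values are in.
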

14.2.2.4 Loading the Data

Now that we understand how the dataset was created and how metadata was incorporated into the cleaning process, we could load the entire dataset.

However, for the first example with Claude.ai, we only need a subset of the data: a few indicators for all of the countries.

Let’s collect a subset of data for all of the countries and just a few indicators.

  • We’ll use the wide dataset.
  • Although long datasets often compress more efficiently and are frequently preferred for storage and visualization workflows, wide datasets are often more convenient for exploratory analysis.
  • This applies not only to humans but also to LLMs. When variables are stored as columns, relationships among variables are easier to identify and discuss.
library(tidyverse)
library(arrow)
library(here)

# Make the connection to the dataset
wb_europe_wide <-
  open_dataset(
    here("data", "world_bank_indicators_europe_wide")
  )
# Use the connection to select the variables first, then collect() so that
# only the needed columns are read into memory
wb_europe_subset <-
  wb_europe_wide |>
  select(country, country_code_3, year,
    population_total, population_growth, fertility_rate,
    gdp_per_capita_current_usd
  ) |>
  collect()

glimpse(wb_europe_subset)
Rows: 2,640
Columns: 7
$ country                    <chr> "Albania", "Albania", "Albania", "Albania",…
$ country_code_3             <chr> "ALB", "ALB", "ALB", "ALB", "ALB", "ALB", "…
$ year                       <int> 2025, 2024, 2023, 2022, 2021, 2020, 2019, 2…
$ population_total           <dbl> NA, 2377128, 2414095, 2451636, 2489762, 252…
$ population_growth          <dbl> NA, -1.5431439, -1.5431081, -1.5431567, -1.…
$ fertility_rate             <dbl> NA, 1.341, 1.348, 1.355, 1.365, 1.371, 1.39…
$ gdp_per_capita_current_usd <dbl> NA, 11377.776, 9730.869, 7756.962, 7242.455…
  • The resulting data frame now contains all the countries and just enough indicators to support testing an analysis workflow.
Note: Context Engineering Applies to Data

Context engineering is not limited to prompts.

  • When working with assistants such as Claude.ai, we often decide which files, variables, summaries, and visualizations should be shared with the model.
  • Providing too little context can lead to weak recommendations, while providing excessive information can make it harder for the model to focus on the problem of interest.
  • Selecting an appropriate subset of information is therefore an important part of working effectively with AI assistants.
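
As a small illustration of choosing what to share, the sketch below checks how much of each indicator is missing and drops rows that carry no indicator values at all, such as a most recent year that has not yet been reported. The data frame `toy` is hypothetical and its values are made up; the same idea could be applied to a real subset before export.

```r
# Hypothetical toy subset; values are made up for illustration.
toy <- data.frame(
  country_code_3    = c("ALB", "ALB", "GRC", "GRC"),
  year              = c(2025L, 2024L, 2025L, 2024L),
  fertility_rate    = c(NA, 1.34, NA, 1.30),
  population_growth = c(NA, -1.54, NA, -0.40)
)

indicator_cols <- c("fertility_rate", "population_growth")

# Share of missing values per indicator: a quick view of what the file holds.
missing_share <- colMeans(is.na(toy[indicator_cols]))

# Drop rows where every indicator is missing; they add tokens but no signal.
has_data   <- rowSums(!is.na(toy[indicator_cols])) > 0
toy_shared <- toy[has_data, ]
```

Whether to drop such rows is a judgment call: if the pattern of missing data is itself the question, the empty rows are part of the context and should stay.
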

When working with a chat assistant like Claude.ai, we can upload data files to act as context for our requests.

  • Let’s save our subset of the data as a CSV file.
wb_europe_subset |>
  write_csv(
    here(
      "data",
      "wb_indicators_europe_subset.csv"
    )
  )
Note

Although the project stores data internally as Parquet files, CSV is often the most convenient format for sharing tabular data with external tools such as Claude.ai.

  • The wide format is particularly useful because each row represents an observation and each column represents a variable, making relationships among variables easier for both humans and LLMs to identify.
  • If the data were a list with hierarchy, then we could consider a JSON structure.

Creating a combination of data and metadata provides a useful foundation for working with AI assistants.

  • In the next section, we will explore how assistants can use this data and context to help identify patterns, suggest analyses, and generate explanations.

14.3 Working with a Chat Assistant: Claude.ai

Section 13.4 discussed building reusable functions that interact with the Groq API such as:

call_groq(
  user_prompt = prompt,
  model = model,
  role_prompt = role
)

In that chapter, we primarily used LLMs to support workflows and agents that we built ourselves.

  • In practice, however, many data scientists use LLMs directly as analytical assistants.

Rather than generating code alone, we can use an LLM to assist with tasks such as:

  • Suggesting potentially important indicators.
  • Suggesting comparisons among countries.
  • Suggesting visualizations.
  • Critiquing assumptions.
  • Critiquing statistical conclusions.
  • Suggesting additional analyses.
  • Explaining unexpected results.
  • Identifying possible data quality issues.

For example, we could use an LLM assistant to suggest/critique at each step in an analytical workflow:

  1. Compute summary statistics.
  2. Create visualizations.
  3. Fit statistical models.
  4. Evaluate model performance.
  5. Interpret the results.

Here, the LLM does not perform or replace the statistical analysis; it acts as a collaborator that can help generate hypotheses, identify alternative explanations, and suggest additional methods or questions worth investigating.
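
The five steps can be sketched on the small built-in mtcars dataset used in the previous chapter. This is only an illustration of where the analyst does the computing; at each step, the printed output is what one might paste into the assistant for suggestions or critique. The choice of variables and model is an assumption made for the example.

```r
# 1. Compute summary statistics.
summary_stats <- summary(mtcars[, c("mpg", "wt")])

# 2. Create a visualization (drawn to a null device so the script runs anywhere).
pdf(NULL)
plot(mtcars$wt, mtcars$mpg, xlab = "Weight (1000 lbs)", ylab = "Miles per gallon")
invisible(dev.off())

# 3. Fit a statistical model.
fit <- lm(mpg ~ wt, data = mtcars)

# 4. Evaluate model performance.
r_squared <- summary(fit)$r.squared

# 5. Interpret: the slope is the expected change in mpg per 1000 lbs of weight.
slope <- unname(coef(fit)["wt"])
```

The numbers come from R, not from the LLM. The assistant's role is to question them, for example by asking whether a straight line is an adequate description of the relationship.
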

Note 14.3: LLMs Are Better at Generating Ideas Than Verifying Them

Modern LLMs are often effective at generating explanations, identifying patterns, and suggesting next steps.

  • However, they do not automatically verify or critique their own conclusions.
  • The analyst remains responsible for determining whether a proposed interpretation is supported by the data and problem context.

14.3.1 Claude.ai as an Analytical Assistant

Claude.ai provides a conversational interface that allows users to interact with an LLM through a web browser.

  • Because Claude.ai operates outside the development environment, information must still be shared manually.

Unlike a simple chat interface, Claude can work with uploaded files.

  • It can work with CSV files, Excel files, PDF documents, code files, Quarto documents and even images and figures.
  • This allows Claude to reason about project artifacts as context rather than relying solely on information typed into the prompt.
Tip: Context Engineering of Uploaded Files Improves the Signal and Saves Tokens

Conceptually, Claude (like other LLMs) works with a limited context window. Anything it needs to reason about must ultimately be represented within that context window in some form.

However, when you upload a file:

  • Claude does not simply paste the entire file verbatim into the conversation.
  • The file is processed and made available to the model through Claude’s document handling system.
  • Claude can retrieve relevant portions of the file as needed while answering questions.

By uploading files that have been engineered to focus on the right context, you improve Claude’s ability to find the signal in the file while saving tokens and compute time.

The workflow typically looks like:

  1. Perform analysis locally or in Posit Cloud.
  2. Upload files or outputs to Claude.
  3. Ask questions about the results.
  4. Evaluate and refine the responses.
  5. Return to Posit Cloud to continue the analysis.

This workflow may appear somewhat manual, but it has an important advantage: the analyst remains directly involved in every step of the process and can evaluate each recommendation before taking action.

  • This provides a natural opportunity to apply many of the responsible AI practices discussed throughout this course.

14.3.2 Example: Working with Claude.ai to Create Maps

Many real-world projects require information that is not contained in the original dataset.

Suppose we would like to create choropleth maps showing demographic patterns across Europe.

  • Our World Bank dataset contains country-level indicators, but it does not contain geographic boundaries.
  • This creates a new problem:

Where can we obtain reliable geographic data that can be joined to our World Bank dataset?

A traditional workflow might involve:

  • Searching for shapefiles.
  • Reading documentation.
  • Comparing multiple data sources.
  • Determining which identifier can be used for joins.
  • Downloading and testing candidate datasets.

Claude.ai can help accelerate this process.

Note 14.4: Geographic Data Requires More Than Coordinates

Packages such as {ggplot2} can create maps, but for choropleth maps they require data for the geographic boundaries describing the shapes of countries, states, provinces, or other regions.

  • These boundaries are commonly stored as simple features in shapefiles or related geographic formats and are typically analyzed using packages such as {sf}.
  • Because our World Bank dataset contains indicators but not geographic boundaries, we must obtain a second dataset and determine how the two datasets can be joined together.

14.3.2.1 Working with Claude.ai to Acquire European Boundary Data

If you do not already have a Claude.ai account, follow the instructions in Getting Started with Claude.ai for Data Science on the web or in your repository.

Then go to Claude.ai at https://claude.ai/new.

  1. Upload the data file.
    • To upload a file from Posit Cloud, first use the Files pane's More > Export command to download the file to your local computer. Then use the upload “+” button in the Claude.ai chat interface, as in Figure 14.1.
    • You should see the file as an icon in the chat interface.

Figure 14.1: Claude.ai’s Browser-based Chat Interface

  2. Enter a prompt.

After uploading the CSV file, we can construct a prompt using the same components discussed in Section 13.2.1.

  • Role to establish the type of expertise the model should use.
  • Task to define the objective.
  • Context to explain the uploaded data and how it should be used.
  • Constraints to narrow the solution space.
  • Examples to illustrate the types of solutions being sought.
  • Output to provide an explicit set of standards against which to evaluate the response.
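
For prompts that are reused or sent through an API, the same six components can be assembled in code. The helper `build_prompt()` below is a hypothetical base R sketch, not a function from any package, and the component text is abbreviated for the example.

```r
# Assemble labeled prompt sections in a fixed order.
# build_prompt() is a hypothetical helper, not part of any package.
build_prompt <- function(role, task, context, constraints, examples, output) {
  sections <- list(
    Role = role, Task = task, Context = context,
    Constraints = constraints, Examples = examples, Output = output
  )
  # Each section becomes "Label:" followed by its lines.
  blocks <- vapply(
    names(sections),
    function(label) {
      paste0(label, ":\n", paste(sections[[label]], collapse = "\n"))
    },
    character(1)
  )
  # Separate sections with a blank line.
  paste(blocks, collapse = "\n\n")
}

prompt <- build_prompt(
  role        = "You are an experienced geospatial data scientist working with R.",
  task        = "Recommend boundary datasets for country-level maps of Europe.",
  context     = "The uploaded file lists the countries that must be supported.",
  constraints = c("- The dataset uses ISO three-character country codes.",
                  "- The solution should work with R."),
  examples    = "Natural Earth, Eurostat GISCO, GADM.",
  output      = "For each source: name, download link, join fields, tradeoffs."
)
cat(prompt)
```

Keeping the components as separate arguments makes it easy to change one part, such as the constraints, without retyping the rest.
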

Copy and paste the following into the chat interface at the cursor. Feel free to edit if you want to get more specific on a point.

Role:
You are an experienced geospatial data scientist working with R.

Task:
Recommend geographic boundary datasets suitable for creating
country-level choropleth maps of Europe using the uploaded World Bank data.

Context:
I have uploaded a World Bank dataset containing approximately 
40 European countries. 
- Use the uploaded dataset as the source of the countries that must be 
supported by any recommended geographic boundary dataset.

Constraints:
- The dataset uses ISO three-character country codes.
- The boundary dataset must provide a reliable join field that can be matched
to the ISO three-character country codes in the uploaded dataset.
- The solution should work with R.
- The data should be easily downloadable at no cost.
- The geographic boundaries should support all countries in the dataset.

Examples:
Examples of useful sources include Natural Earth,
Eurostat GISCO, GADM, or other comparable datasets.

Output:
For each candidate data source provide:
- Source name
- Download location as a hyperlink
- Geographic coverage
- Available join fields
- Advantages
- Disadvantages
- Suitability for this project

Conclude with a discussion of the tradeoffs among the alternatives.

You may be thinking, “This is an awfully long prompt. Why not simply ask where to download shapefiles for European countries?”

  • That approach might work, but it often requires multiple rounds of clarification and refinement before the model produces a response that fully addresses the problem.
  • As prompts become more complex, explicit structure often becomes as important as the individual instructions themselves.
  • There is a tradeoff between the effort invested in constructing a well-organized prompt and the effort required to repeatedly refine an incomplete response.
  • In many cases, spending a little more time defining the role, task, context, constraints, and desired output can substantially improve the quality of the first response while also helping you frame and understand the problem more clearly yourself.
Tip: Explicit Prompt Structure Often Improves Results

Although LLMs can often infer the purpose of different parts of a prompt, it is usually helpful to label major components explicitly.

  • Role
  • Task
  • Context
  • Constraints
  • Examples
  • Output

These labels act much like section headers in a document. They help both humans and LLMs distinguish between different types of information and can make prompts easier to understand, modify, and debug.

  • This approach is particularly useful for longer prompts where instructions, data, and examples are combined into a single request.

By now, you should have seen evidence of the agent inspecting the uploaded data, extracting relevant information, evaluating alternatives, and planning its next actions as shown in Figure 14.2.

  • These intermediate messages provide a glimpse into some of the same concepts discussed in the previous chapter, including tool use, evaluation, planning, and action selection.
Screenshot of Claude.ai processing an uploaded World Bank dataset. The interface shows the structured prompt and status messages indicating that Claude inspected the file, extracted relevant information about the countries in the dataset, identified Kosovo's XKX code as a special case, and is evaluating alternatives before generating recommendations. The messages illustrate agent behaviors such as acting, evaluation, planning, and tool use.
Figure 14.2: Snapshot of Claude.ai at work: acting, evaluating, and planning the next action.

As the response begins, you should see multiple summaries of options, followed by the tradeoff analysis and the start of code options, as in Figure 14.3.

Screenshot of Claude.ai presenting a structured evaluation of several geographic boundary datasets. The response compares alternative sources across multiple criteria, including licensing, country-code compatibility, data quality, and reproducibility. After discussing tradeoffs, Claude recommends a preferred dataset and begins generating R code to implement the solution.
Figure 14.3: Example of AI-assisted decision support. After evaluating multiple geographic data sources, Claude.ai summarizes tradeoffs, recommends a preferred solution, and generates code to implement the recommendation.

Claude’s response also surfaced a practical issue that we may not have anticipated: Kosovo uses the code XKX in our dataset.

  • This is a useful example of why AI assistants can be helpful in exploratory project work.
  • The challenge was not simply to “make a map.”
  • The real challenge was to identify whether the geographic boundary data could be joined correctly to the World Bank data.

In this case, Claude identified that the join field matters:

  • Our World Bank data use country_code_3.
  • Some boundary datasets use different country identifiers, e.g., the two-letter ISO2 standard.
  • Kosovo is often coded inconsistently across geographic data sources.
  • The choice of boundary dataset affects whether the join will work cleanly.

This is the kind of issue that can easily be missed if we only ask for a web site or just code.

  3. Decide which source to use.

Claude recommended using {rnaturalearth} and suggested that we “join on adm0_a3”.

Now you decide whether to use that option or a different one.

  • Let’s choose that option.

  4. Acquire the shapefiles.

Let’s run the code suggested by Claude.

library(rnaturalearth)
library(sf)
library(dplyr)

europe_sf <- ne_countries(scale = 50, returnclass = "sf") |>
  filter(continent == "Europe" | adm0_a3 %in% your_codes)
Error in `stopifnot()`:
ℹ In argument: `continent == "Europe" | adm0_a3 %in% your_codes`.
Caused by error:
! object 'your_codes' not found
# Kosovo fix: ne_countries() sets iso_a3 = "-99" for Kosovo;
# adm0_a3 correctly gives "XKX"
joined <- europe_sf |>
  left_join(wb_data, by = c("adm0_a3" = "country_code_3"))
Error:
! object 'europe_sf' not found
  • Oops! There is an error.

The code generated by Claude was close to correct but not immediately executable.

  • The response referenced a variable named your_codes that was never defined.
  • This was Claude practicing “defensive programming”: it anticipated that your dataset might contain codes (your_codes) that you consider European but that {rnaturalearth} does not code as Europe.
  • A human analyst must still review generated code, identify placeholders or assumptions, and verify that the code is consistent with the actual project.
  • In this example, replacing your_codes with the actual country codes from the World Bank dataset resolves the issue.

This illustrates a common pattern when working with AI assistants:

  • The generated code is often directionally correct.
  • The overall approach is frequently useful.
  • Small implementation details may still require human review and modification.

This is one reason why experienced practitioners often spend more time reviewing AI-generated code than simply copying and executing it.
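
Part of that review can be made mechanical. The sketch below is one possible base R check, offered as an assumption rather than a standard practice: it lists the names used in a piece of generated code and reports those that are neither expected column names nor objects in the current session. It assumes `your_codes` has not been defined, as in the failed run above.

```r
# Code as it might arrive from an assistant, kept as text so it is not executed.
generated <- 'filter(continent == "Europe" | adm0_a3 %in% your_codes)'

# Column names we expect to exist inside the data frame being filtered.
known_columns <- c("continent", "adm0_a3")

# all.vars() lists the variable names used in an expression (not function names).
used_names <- all.vars(parse(text = generated))

# Names that are neither columns nor objects in the session need review.
candidates <- setdiff(used_names, known_columns)
undefined  <- candidates[!vapply(candidates, exists, logical(1))]
print(undefined)
```

A check like this only finds missing names. It cannot tell whether the logic is right, so it supplements reading the code rather than replacing it.
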

Let’s fix the code and proceed.

  • We are also going to add a line to just select the two columns we need as the data frame has 169 columns.
wb_country_codes <-
  wb_europe_subset |>
  distinct(country_code_3) |>
  pull(country_code_3)

europe_sf <-
  rnaturalearth::ne_countries(
    scale = 50,
    returnclass = "sf"
  ) |>
  filter(
    continent == "Europe" |
      adm0_a3 %in% wb_country_codes
  ) |>
  select(adm0_a3, geometry)
Tip: Data Engineering Mid-Analysis

Whenever possible, keep only the variables needed for the task at hand.

  • The ne_countries() dataset contains many attributes describing each country. However, for this analysis we only require:
    • A join key (adm0_a3)
    • The country geometry
  • Selecting only the required variables reduces memory usage, simplifies subsequent joins, and makes the code easier to understand.

Now we can join the data.

  • We will also adjust the join so that the geometry column is added to the original data, not the other way around.
# Kosovo note: ne_countries() sets iso_a3 = "-99" for Kosovo, so we join
# on adm0_a3; confirm that its Kosovo code matches the World Bank's "XKX"
wb_europe_sf <- wb_europe_subset |>
  left_join(europe_sf, join_by("country_code_3" == "adm0_a3"))

glimpse(wb_europe_sf)
Rows: 2,640
Columns: 8
$ country                    <chr> "Albania", "Albania", "Albania", "Albania",…
$ country_code_3             <chr> "ALB", "ALB", "ALB", "ALB", "ALB", "ALB", "…
$ year                       <int> 2025, 2024, 2023, 2022, 2021, 2020, 2019, 2…
$ population_total           <dbl> NA, 2377128, 2414095, 2451636, 2489762, 252…
$ population_growth          <dbl> NA, -1.5431439, -1.5431081, -1.5431567, -1.…
$ fertility_rate             <dbl> NA, 1.341, 1.348, 1.355, 1.365, 1.371, 1.39…
$ gdp_per_capita_current_usd <dbl> NA, 11377.776, 9730.869, 7756.962, 7242.455…
$ geometry                   <MULTIPOLYGON [°]> MULTIPOLYGON (((19.34238 41...…

14.3.3 Asking for Code After Evaluating Options

Once Claude has helped identify candidate data sources and possible join issues, we can move to a more specific coding request.

A weaker prompt would be:

Give me R code to make a map of Europe.

That may produce code, but it gives Claude very little information about the data, the join field, or the known Kosovo issue.

A stronger follow-up prompt would be:

Role:
You are an experienced geospatial data scientist working with R.

Task:
Write R code to create a country-level choropleth map of Europe using my uploaded World Bank dataset.

Context:
The uploaded dataset contains country-level World Bank indicators for approximately 40 European countries. The dataset uses `country_code_3` as the join field. Claude previously identified that Kosovo uses the code `XKX`, which may require special handling depending on the geographic boundary source.

Constraints:
- Use R.
- Use {sf}, {dplyr}, and {ggplot2}.
- Use {rnaturalearth} for country boundaries.
- Join the World Bank data to the geographic boundaries using a reliable ISO-3 style country code.
- Use `adm0_a3` from Natural Earth rather than `iso_a3` if that better handles Kosovo.
- Filter the World Bank data to one recent year.
- Map one demographic indicator such as `population_growth` or `fertility_rate`.
- Include clear comments in the code.

Output:
Provide complete R code and briefly explain the main steps.

This prompt is better because it aligns the coding request with the earlier evaluation.

  • The assistant is not only being asked to write code. It is being asked to write code that reflects the known data integration problem.
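If you reuse this prompt structure often, the parts can be assembled programmatically so that no section is forgotten. The following is a minimal sketch in base R; `build_prompt()` is a hypothetical helper, not part of any package, and the section text is abbreviated from the prompt above.

```r
# Assemble a structured prompt from named parts.
# build_prompt() is a hypothetical helper written for this illustration.
build_prompt <- function(role, task, context, constraints, output) {
  sections <- c(
    Role        = role,
    Task        = task,
    Context     = context,
    Constraints = paste0("- ", constraints, collapse = "\n"),
    Output      = output
  )
  # One "Label:" header per section, separated by blank lines
  paste0(names(sections), ":\n", sections, collapse = "\n\n")
}

prompt <- build_prompt(
  role        = "You are an experienced geospatial data scientist working with R.",
  task        = "Write R code to create a country-level choropleth map of Europe.",
  context     = "The dataset uses `country_code_3` as the join field.",
  constraints = c("Use R.", "Use {sf}, {dplyr}, and {ggplot2}."),
  output      = "Provide complete R code and briefly explain the main steps."
)

cat(prompt)
```

The resulting string can be pasted into the chat window or passed to an API client.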

Example Code

Claude may produce code similar to the following.

# ============================================================
#  Europe Choropleth Map — World Bank Indicators
#  Indicator: fertility_rate (2023)
#  Boundaries: Natural Earth via {rnaturalearth}
#  Join key:   adm0_a3 (handles Kosovo / XKX correctly)
# ============================================================

# ── 0. Packages ──────────────────────────────────────────────
library(sf)               # spatial data handling
library(dplyr)            # data wrangling
library(ggplot2)          # plotting
library(rnaturalearth)    # country boundary polygons
library(rnaturalearthdata) # required for scale = 50 data
library(scales)           # for label formatting in legend


# ── 1. Load World Bank data ───────────────────────────────────
wb_raw <- read.csv("./data/wb_indicators_europe_subset.csv", #fix path
                   stringsAsFactors = FALSE)

# Filter to a single recent year with good data coverage
TARGET_YEAR      <- 2023
TARGET_INDICATOR <- "fertility_rate"

wb_year <- wb_raw |>
  filter(year == TARGET_YEAR) |>
  select(country, country_code_3, all_of(TARGET_INDICATOR)) |>
  # Drop rows where the indicator is missing
  filter(!is.na(.data[[TARGET_INDICATOR]]))

cat("Countries with data in", TARGET_YEAR, ":", nrow(wb_year), "\n")
Countries with data in 2023 : 40 
# ── 2. Fetch Natural Earth boundaries ────────────────────────
# scale = 50  →  1:50 million; good balance of detail vs file size
# returnclass = "sf"  →  returns an sf data frame directly
world_sf <- ne_countries(scale = 50, returnclass = "sf")

# Identify which field best carries Kosovo (XKX):
#   iso_a3     → coded "-99" for Kosovo in Natural Earth (avoid)
#   adm0_a3    → coded "XKX"  ✓  use this one
# Quick check:
world_sf |>
  filter(name == "Kosovo") |>
  select(name, iso_a3, adm0_a3) |>
  st_drop_geometry() |>
  print()
    name iso_a3 adm0_a3
1 Kosovo    -99     KOS
# ── 3. Subset Natural Earth to Europe + neighbors ───────────
# Keep all countries whose adm0_a3 appears in the WB dataset,
# plus a broad Europe bounding box so the base map looks complete
# even for countries without WB data (they'll appear as grey).

# All adm0_a3 codes that appear in the WB dataset
wb_codes <- wb_year$country_code_3

# Pull the full European extent (continent label + Türkiye, Cyprus, etc.)
europe_sf <- world_sf |>
  filter(
    continent == "Europe" |
    adm0_a3 %in% wb_codes        # catch non-"Europe" coded countries in WB data
  )


# ── 4. Join World Bank data to spatial boundaries ────────────
# Left join keeps every European polygon; WB countries without
# geometry simply won't appear (none expected here).
europe_joined <- europe_sf |>
  left_join(wb_year,
            by = c("adm0_a3" = "country_code_3"))

# Sanity check: how many WB countries matched a polygon?
matched <- sum(!is.na(europe_joined[[TARGET_INDICATOR]]))
cat("WB countries matched to geometry:", matched, "/", nrow(wb_year), "\n")
WB countries matched to geometry: 39 / 40 
# List any WB countries that did NOT match (for debugging)
unmatched_wb <- wb_codes[!wb_codes %in% europe_joined$adm0_a3]
if (length(unmatched_wb) > 0) {
  cat("Unmatched WB codes:", paste(unmatched_wb, collapse = ", "), "\n")
} else {
  cat("All WB country codes matched to Natural Earth polygons.\n")
}
Unmatched WB codes: XKX 
# ── 5. Define map projection ──────────────────────────────────
# ETRS89-LAEA (EPSG:3035) — standard equal-area projection for Europe;
# reduces the Greenland distortion typical of WGS84/Web Mercator.
CRS_EUROPE <- "+proj=laea +lat_0=52 +lon_0=10 +x_0=4321000 +y_0=3210000
               +ellps=GRS80 +units=m +no_defs"

europe_projected <- st_transform(europe_joined, crs = CRS_EUROPE)


# ── 6. Build the choropleth ───────────────────────────────────
# Color palette: sequential, perceptually uniform (viridis magma strip)
# Diverging palettes would suit population_growth better (neg vs pos);
# for fertility_rate a sequential low→high scheme is appropriate.

ggplot(data = europe_projected) +

  # --- base polygons colored by indicator ---
  geom_sf(aes(fill = .data[[TARGET_INDICATOR]]),
          colour = "white",      # thin white border between countries
          linewidth = 0.25) +

  # --- country outlines for unmatched / NA countries (grayed out) ---
  # (already handled by na.value in scale_fill_viridis_c)

  # --- color scale ---
  scale_fill_viridis_c(
    option    = "magma",        # perceptually uniform, print-friendly
    direction = -1,             # dark = high fertility
    name      = "Fertility rate\n(births per woman)",
    na.value  = "grey85",       # countries with no WB data
    guide     = guide_colorbar(
      barwidth  = 0.6,
      barheight = 10,
      title.position = "top",
      title.hjust    = 0.5
    )
  ) +

  # --- map extent: clip to Europe ----------------------------------------
  coord_sf(
    xlim = c(2.5e6, 6.5e6),    # LAEA x in meters
    ylim = c(1.5e6, 5.5e6),    # LAEA y in meters
    expand = FALSE
  ) +

  # --- titles & caption ---
  labs(
    title    = "Fertility Rate Across Europe",
    subtitle = paste0("Average births per woman, ", TARGET_YEAR,
                      "  |  Source: World Bank"),
    caption  = "Boundaries: Natural Earth (1:50m)  |  Projection: ETRS89-LAEA (EPSG:3035)"
  ) +

  # --- clean map theme ---
  theme_void(base_size = 12) +
  theme(
    plot.title       = element_text(face = "bold", size = 16,
                                    margin = margin(b = 4)),
    plot.subtitle    = element_text(size = 11, colour = "grey40",
                                    margin = margin(b = 10)),
    plot.caption     = element_text(size = 8,  colour = "grey60",
                                    hjust = 0,
                                    margin = margin(t = 8)),
    legend.position  = c(0.92, 0.55),   # inside map, right-center
    legend.title     = element_text(size = 9,  face = "bold"),
    legend.text      = element_text(size = 8),
    plot.margin      = margin(10, 10, 10, 10),
    plot.background  = element_rect(fill = "white", colour = NA)
  )

# ── 7. Save output ────────────────────────────────────────────
ggsave(
  filename = "europe_fertility_rate_2023.png",
  width    = 10,
  height   = 8,
  dpi      = 200,
  bg       = "white"
)

cat("Map saved to europe_fertility_rate_2023.png\n")
Map saved to europe_fertility_rate_2023.png
# ── Optional: switch to population_growth ─────────────────────
# population_growth has negative values (shrinking populations),
# so a *diverging* palette centered on 0 is more informative.
#
# Replace TARGET_INDICATOR <- "population_growth" above, then swap
# scale_fill_viridis_c() for:
#
# scale_fill_gradient2(
#   low      = "#2166ac",   # blue  = decline
#   mid      = "white",
#   high     = "#d6604d",   # red   = growth
#   midpoint = 0,
#   name     = "Population\ngrowth (%)",
#   na.value = "grey85",
#   guide    = guide_colorbar(barwidth = 0.6, barheight = 10,
#                             title.position = "top", title.hjust = 0.5)
# )

Claude’s generated code covered many of the major steps that an experienced analyst would want in a choropleth workflow:

  • Load the World Bank data.
  • Select a target year and indicator.
  • Retrieve country boundaries.
  • Inspect the Kosovo country-code issue.
  • Filter the spatial data to relevant countries.
  • Join the statistical data to the spatial data.
  • Check whether all World Bank countries matched.
  • Project the map to a Europe-appropriate coordinate reference system.
  • Create the choropleth.
  • Save the result.
  • Suggest an alternative indicator and explain why indicators with both negative and positive values call for a diverging palette.

This is a useful example of how an AI assistant can help a newer programmer see the larger structure of a workflow, not just isolated lines of code.

However, the code was not perfect.

  • First, the path to the CSV file needed to be adjusted: it was missing the data/ directory. The code also did not use the {here} package to build the path; a useful follow-up is to ask Claude why not.
    • This is a common issue when AI-generated code assumes a project structure that differs from the actual project.
  • Second, Claude selected 2023 as the target year without explaining why. That may be reasonable if 2024 has incomplete data, but the analyst should verify data availability before choosing a year.
  • Third, the color palette choice is defensible but subjective. Claude used the magma palette with direction = -1, making darker colors represent higher fertility rates. That may be acceptable, but color direction should be chosen intentionally and explained, especially when the visual interpretation matters.
  • Fourth, it saved the graphic to a file without asking.
  • Fifth, the Kosovo handling did not work as the code comments claimed. The quick check printed KOS, not XKX, for adm0_a3, and the join check reported only 39 of 40 matches with XKX unmatched, so Kosovo would be drawn without data.
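The question of which year to map can be checked directly rather than assumed. The sketch below counts non-missing values per year and picks the most recent year with complete coverage. The `wb_toy` data frame is a made-up stand-in for the World Bank data; only the column names follow the real dataset.

```r
# Toy stand-in for the World Bank data; the values are made up.
wb_toy <- data.frame(
  country_code_3 = rep(c("ALB", "AUT", "BEL"), each = 3),
  year           = rep(c(2025, 2024, 2023), times = 3),
  fertility_rate = c(NA, 1.34, 1.35,
                     NA, NA,   1.32,
                     NA, 1.47, 1.47)
)

# Count the countries with a non-missing value in each year.
# na.action = na.pass keeps the NA rows so they can be counted as missing.
coverage <- aggregate(
  fertility_rate ~ year,
  data      = wb_toy,
  FUN       = function(x) sum(!is.na(x)),
  na.action = na.pass
)
names(coverage)[2] <- "n_countries"
print(coverage)

# Most recent year in which every country reports a value.
n_total     <- length(unique(wb_toy$country_code_3))
target_year <- max(coverage$year[coverage$n_countries == n_total])
print(target_year)
```

With real data you might accept a year with a few gaps; the point is to make the choice deliberately.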

Each of these issues is easy to fix or refine.
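One fix worth sketching concerns Kosovo. In the output above, the quick check printed KOS for adm0_a3 and the join check reported XKX as unmatched. A possible remedy is to recode the boundary key before joining. The data frames below are toy stand-ins, and `join_code` is a hypothetical column name introduced for this illustration.

```r
# Toy stand-ins; only the country codes matter here.
wb_codes <- c("ALB", "SRB", "XKX")        # World Bank codes
boundaries <- data.frame(
  name    = c("Albania", "Serbia", "Kosovo"),
  adm0_a3 = c("ALB", "SRB", "KOS")        # Kosovo as printed by the quick check
)

# Before the fix: which World Bank codes have no boundary match?
print(setdiff(wb_codes, boundaries$adm0_a3))

# Recode the boundary key to follow the World Bank convention.
boundaries$join_code <- ifelse(
  boundaries$adm0_a3 == "KOS", "XKX", boundaries$adm0_a3
)

# After the fix: every World Bank code should find a polygon.
unmatched <- setdiff(wb_codes, boundaries$join_code)
stopifnot(length(unmatched) == 0)
```

Joining on the recoded column instead of adm0_a3 would then keep Kosovo in the map. Whether the recode is needed depends on the version of the boundary data, so it is worth keeping the check in the script.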

14.3.4 Summary of Claude.ai as a Chat Assistant

Claude.ai illustrates many of the strengths and limitations of modern chat-based AI assistants.

Unlike an assistant integrated directly into a development environment, Claude.ai operates primarily through a web browser. As a result, the user must manually provide context by uploading files, copying and pasting model output or error messages, and providing detailed descriptions of the project and analytical goals.

  • This process may appear cumbersome compared to more integrated tools, but it provides an important benefit: the analyst remains directly involved in deciding what information should be shared with the model.

The shapefile example demonstrated a typical workflow of uploading data, using structured prompts, and asking Claude to evaluate possible solutions.

  • The analyst then had to select a preferred approach and review, modify, and test the resulting code.
  • Throughout this process, Claude was able to analyze the dataset, identify potential join issues, compare alternatives, explain tradeoffs, recommend a reasonable solution, and generate a substantial amount of (almost) working code.
  • At the same time, the generated code was not perfect; it still required testing and refinement.
  • This review process required exactly the skills developed throughout this course.

Perhaps the most important lesson is that Claude provided value long before it generated any code.

  • Its greatest contribution was helping identify requirements, evaluate alternatives, surface potential problems, and structure the overall solution.
  • In practice, many successful uses of AI assistants involve this type of collaboration.
  • The assistant contributes ideas, analysis, and first drafts, while the human remains responsible for evaluating, refining, and ultimately shaping and approving the final result.

14.4 Working with a Repository Agent: Claude Code

The Claude.ai example demonstrated the strengths and limitations of a browser-based chat assistant.

To analyze the World Bank Europe dataset, Claude.ai required us to manually provide context through uploaded files and detailed prompts.

  • We uploaded datasets and metadata.
  • We described the project structure.
  • We explained the analytical objective.
  • We copied code and error messages between applications.

This process worked well, but it required continuous effort to keep the model informed.

Claude Code takes a different approach.

  • Instead of operating in a browser, Claude Code operates directly inside a project repository. It can inspect files, understand project structure, execute commands, create new files, and modify existing ones.

  • The key difference is that the entire repository itself becomes context.

Note: Trying Claude Code Yourself

Claude Code is designed primarily for local development environments where it can interact directly with your files, Git repository, and terminal; it is not designed for hosted environments such as Posit Cloud. A web version is in development.

  • Unlike browser-based chat assistants, Claude Code uses a modern terminal interface that continuously updates the screen as it reads files, executes commands, and displays results.
  • While Claude Code can run inside RStudio’s Terminal pane, the display may quickly become difficult to read because RStudio’s terminal does not fully support the dynamic terminal interface used by Claude Code.

For the best experience, use a dedicated terminal application:

  • macOS: iTerm2 (recommended) or Terminal.app
  • Windows: Windows Terminal

These are free and widely used by software developers.

14.4.0.0.1 Installing Claude Code

On macOS, install iTerm2 (recommended):

brew install --cask iterm2

Or download directly from:

https://iterm2.com/downloads.html

Install Claude Code:

curl -fsSL https://claude.ai/install.sh | bash

Verify the installation:

claude --version

On Windows, install Windows Terminal from the Microsoft Store or via:

winget install Microsoft.WindowsTerminal

Install Claude Code from PowerShell:

irm https://claude.ai/install.ps1 | iex

Verify the installation:

claude --version
Important: Claude Code Is Not Free

Claude Code uses the Claude API, which charges by the number of tokens sent as input and returned as output.

Anthropic offers two ways to pay for Claude Code:

  1. By signing in with at least a Claude Pro or Max subscription account (tokens count against your usage limits).
  2. By using an Anthropic Console API account with usage-based billing.

A standard Claude subscription is the simpler option because it covers both chat and Claude Code, but the API account might be cheaper for a short experiment (minimum cost $5.00).

Note: For This Course

Claude Code is demonstrated to illustrate how repository agents operate; you are not required to install Claude Code or create an Anthropic account.

  • The images in this section were captured on a Mac with local RStudio on the left and iTerm2 on the right.

14.4.1 Starting Claude Code

Claude Code runs from a terminal window.

  • Open a terminal window and navigate so the working directory (prompt/cursor) is at the root of the project directory.
  • Then enter the following:
claude

When Claude Code starts, if it’s a new repo, it asks if you trust the repo before it does anything.

  • Given permission, it then scans the repository and builds an understanding of the project structure.
  • It can identify files, directories, function definitions, package dependencies, and common workflows directly from the code.
  • When complete, your screen may look something like Figure 14.4, where RStudio is on the left and the terminal is on the right.
Claude Code startup screen showing project information and terminal interface.
Figure 14.4: Claude Code startup screen showing the current project directory and repository context.

However, Claude may not capture all the important aspects of a project from the files alone, such as:

  • the purpose of the project,
  • preferred coding conventions,
  • expected outputs,
  • data sources,
  • repository-specific workflows,
  • and other project constraints.

Claude Code uses a special file called CLAUDE.md to capture these non-code aspects so that they can serve as persistent project context.

14.4.2 CLAUDE.md as Persistent Instructions

As seen in Figure 14.4, Claude suggests you enter one of its “slash” commands, /init, to have it create a CLAUDE.md file.

  • It will then explore the repository and create a file. It will ask permission to write to your repository.
  • The results may look similar to Figure 14.5 where Claude provides a summary of its actions.
Claude Code screen showing the results of creating a CLAUDE.md file, including the tokens and time consumed.
Figure 14.5: Claude Code created a CLAUDE.md file and provided feedback on the process in terms of tokens, a summary of the actions, and the time.

In this example, the major sections included:

  • Project Purpose
  • Common Commands
  • Architecture
    • Primary Student File
    • R Helper Scripts (R/)
    • Data Layout
    • Other Key Files
  • Conventions

Since it is a standard Markdown file, the analyst can edit it to say exactly what they want and keep it under version control with the rest of the repository.

Claude Code automatically loads this file whenever a new session begins.

  • Rather than repeating these instructions in every prompt, the conventions become part of the repository itself.
  • Unlike Claude.ai, no files need to be uploaded manually.
  • Claude Code can already see the files in the repository.
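If you prefer to start the file by hand rather than with /init, a skeleton can be scaffolded from R. This is a minimal sketch: the headings mirror the sections listed above and are only one possible layout, and the body text is placeholder wording to replace with your own.

```r
# Section headings follow the /init example above; the text under each
# heading is placeholder wording, not output from Claude Code.
claude_md <- c(
  "# CLAUDE.md",
  "",
  "## Project Purpose",
  "Describe what the project is for and who uses it.",
  "",
  "## Common Commands",
  "List the commands used to render, test, or run the project.",
  "",
  "## Architecture",
  "Summarize the key files, the helper scripts in R/, and the data layout.",
  "",
  "## Conventions",
  "State the naming, style, and output conventions the agent should follow."
)

# Written to a temporary directory so the sketch has no side effects;
# in a real project the file would sit at the repository root.
path <- file.path(tempdir(), "CLAUDE.md")
writeLines(claude_md, path)
print(readLines(path)[3])
```

Because the file is plain text, changes to it show up in Git diffs like any other edit.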

14.4.3 Sample Scripts

For the World Bank Europe project, several scripts were created so that Claude Code has repository structure and content it can inspect, explain, modify, and extend.

  • The data/ directory contains both long and wide versions of the World Bank Europe dataset.
  • The R/ directory contains small helper scripts organized by task: loading data, mapping, plotting, correlations, and saving outputs.

The structure contains the following:

data/
├── world_bank_indicators_europe/
├── world_bank_indicators_europe_wide/
└── world_bank_metadata.xlsx

R/
├── worldbank_data.R
├── worldbank_maps.R
├── worldbank_plots.R
├── worldbank_correlations.R
└── plot_utils.R

outputs/
├── plots/
└── reports/

The output of the scripts looks like the following:

source("R/worldbank_data.R")
source("R/worldbank_maps.R")
source("R/worldbank_plots.R")
source("R/worldbank_correlations.R")
source("R/plot_utils.R")
wb_wide <- load_worldbank_wide()

indicators <- c(
  "population_total",
  "population_growth",
  "fertility_rate",
  "gdp_per_capita_current_usd"
)

fertility_map <- make_worldbank_map(
  wb_wide,
  indicator = "fertility_rate",
  years = 2023,
  legend_title = "Births per woman",
  title = "Fertility Rate Across Europe"
)
save_plot(
  fertility_map,
  "fertility_rate_map.png"
)

scatter_plot <- make_indicator_scatter(
  wb_wide,
  x_indicator = "fertility_rate",
  y_indicator = "population_growth",
  years = 2023
)
save_plot(
  scatter_plot,
  "fertility_vs_growth.png"
)

hist_plot <- make_indicator_histogram(
  wb_wide,
  indicator = "gdp_per_capita_current_usd",
  years = 2023
)
save_plot(
  hist_plot,
  "gdp_per_capita_usd.png"
)

corr_plot <- make_correlation_heatmap(
  wb_wide,
  indicators = indicators,
  years = 2023
)

save_plot(
  corr_plot,
  "correlation_heatmap.png"
)
hist_plot
corr_plot
scatter_plot
fertility_map 

Because Claude Code can inspect these files directly, it begins with substantially more context than a browser-based chat assistant.
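As the number of helper scripts grows, the individual source() calls can be replaced by a loop over the directory. The sketch below assumes each helper file only defines functions, so the order of loading does not matter. `source_helpers()` is a hypothetical convenience function, not part of the repository, and the demonstration uses a throwaway directory.

```r
# Source every .R file in a directory, in alphabetical order.
# source_helpers() is a hypothetical helper written for this illustration.
source_helpers <- function(dir = "R") {
  files <- sort(list.files(dir, pattern = "\\.R$", full.names = TRUE))
  for (f in files) source(f, local = FALSE)  # define helpers in the global env
  invisible(basename(files))
}

# Demonstration with a temporary directory and one toy helper script.
demo_dir <- file.path(tempdir(), "R_demo")
dir.create(demo_dir, showWarnings = FALSE)
writeLines("add_one <- function(x) x + 1",
           file.path(demo_dir, "toy_helper.R"))

loaded <- source_helpers(demo_dir)
print(loaded)
print(add_one(1))
```

In the project itself the call would be `source_helpers("R")`, run from the repository root.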

14.4.4 Repository-Aware Development Tasks

Rather than asking Claude Code to perform a complete analysis, it is often more useful to ask it to extend or modify an existing project.

  • Because Claude Code can see the entire repository, it can understand how functions are organized, identify coding conventions, and make changes that are consistent with the rest of the codebase.

The following three tasks illustrate progressively more sophisticated forms of repository-aware development.

14.4.4.1 Task 1: Modify an Existing Function

To illustrate how repository-aware agents differ from chat assistants, we can begin with a relatively simple software development task. Rather than providing extensive instructions about the project structure, data sources, and coding conventions, we simply ask Claude Code to extend an existing function.

The prompt below asks Claude Code to modify make_indicator_scatter() so that users can optionally add either a linear regression line or a nonlinear LOESS smoother. Because Claude Code has already scanned the repository and loaded the project context from CLAUDE.md, it can identify the correct file, inspect the existing function, and propose an implementation without requiring additional explanation.

Update make_indicator_scatter() so users can optionally add either a linear regression line or a nonlinear LOESS smoother.

The proposed changes are shown as a diff before they are applied to the repository as in Figure 14.6.

  • This approval step provides an opportunity to review the modifications and confirm that the agent’s interpretation matches the intended task.
  • Unlike a chat assistant, which would simply return code, Claude Code can directly edit project files while keeping the user in the review loop.
Claude Code displaying a proposed edit to worldbank_plots.R. The diff view highlights the addition of a smoother argument supporting linear and LOESS smoothers, while RStudio shows the project files and functions available in the repository.
Figure 14.6: Claude Code displaying a proposed edit to a function in worldbank_plots.R as a diff and asking permission to proceed.

After the changes are accepted, Claude Code updates the file and summarizes what was modified as in Figure 14.7.

  • The resulting implementation adds support for both linear and nonlinear smoothers while preserving the original behavior when no smoother is requested.
RStudio displaying the updated worldbank_plots.R file alongside Claude Code’s summary of the completed modifications. The new smoother parameter supports NULL, linear regression, and LOESS smoothing options.
Figure 14.7: Claude Code displaying a completed edit as shown in the RStudio files.

The final response includes both the code changes and a concise explanation of the new functionality.

  • Notice that the original prompt was intentionally brief.
  • Claude Code determined which file to edit, how to modify the function signature, where to insert the new plotting logic, and how to document the change by combining information from the repository structure, the existing code, and the project context provided through CLAUDE.md.

As shown in Figure 14.6 and Figure 14.7, repository-aware agents shift some of the burden of context engineering from the prompt into the repository itself.

  • The prompt focuses on the desired outcome, while the agent discovers much of the implementation context automatically.
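
To make the change concrete, an optional smoother argument of the kind described might be structured as follows. This is a hypothetical sketch, not the code Claude Code wrote: `scatter_sketch()` is an illustrative stand-in for make_indicator_scatter(), it uses the built-in mtcars data, and it assumes {ggplot2} is installed.

```r
library(ggplot2)

# Hypothetical stand-in for the repository's make_indicator_scatter().
# smoother = NULL keeps the original behavior (points only).
scatter_sketch <- function(df, x, y, smoother = NULL) {
  p <- ggplot(df, aes(x = .data[[x]], y = .data[[y]])) +
    geom_point()

  if (!is.null(smoother)) {
    # Accept only the two supported options; anything else is an error
    smoother <- match.arg(smoother, c("lm", "loess"))
    p <- p + geom_smooth(method = smoother, formula = y ~ x)
  }
  p
}

p_none  <- scatter_sketch(mtcars, "wt", "mpg")
p_lm    <- scatter_sketch(mtcars, "wt", "mpg", smoother = "lm")
p_loess <- scatter_sketch(mtcars, "wt", "mpg", smoother = "loess")
```

Defaulting the new argument to NULL is what preserves existing calls: code written before the change produces the same plot afterward.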
14.4.4.2 Task 2: Creating a New Visualization Function

The second task requires more than modifying an existing function; Claude Code must create a completely new plotting function that follows the conventions established elsewhere in the repository.

The prompt is still relatively brief:

Create a new function named make_indicator_boxplot() that follows the style and structure of the existing plotting functions. Include a notch argument that allows notched boxplots.

To complete this task, Claude Code first examines the existing plotting functions to identify the project’s conventions before proposing a new implementation as seen in Figure 14.8.

Claude Code displaying a proposed addition of a make_indicator_boxplot function to worldbank_plots.R. The proposed function includes a notch argument and follows the structure of the existing plotting functions.
Figure 14.8: Claude Code displaying a proposed edit to add a new function to worldbank_plots.R as a diff and asking permission to proceed.

Unlike the previous task, Claude is not simply extending an existing function. It must infer how new functionality should be organized based on the repository structure, naming conventions, documentation style, and existing code patterns.

After approval, Claude Code adds the new function and explains how it fits within the project’s existing design as in Figure 14.9.

RStudio displaying the completed make_indicator_boxplot function while Claude Code summarizes the new functionality and explains how it follows existing project conventions.
Figure 14.9: Claude Code displaying the results of adding a new function to worldbank_plots.R with RStudio showing the new function.

This task took slightly longer than the scatterplot modification shown in Figure 14.6 and Figure 14.7 because Claude Code needed to understand and reproduce the project’s design patterns rather than simply extending existing functionality.

  • As shown in Figure 14.8 and Figure 14.9, repository-aware agents can use information from surrounding code to create new functions that are consistent with the rest of the project.

14.4.4.3 Task 3: Creating a New Modeling Module

The third task is substantially more complex than the previous two. Rather than modifying an existing function or adding a new plotting function, Claude Code is asked to design and create an entirely new script containing multiple related functions.

The prompt remains relatively concise:

Create a new script named worldbank_models.R containing functions for fitting linear models and ANOVA models and generating summary tables using either the default base R summary or the broom package.

To complete this task, Claude Code first reads several existing scripts in the repository to understand naming conventions, function structure, documentation style, and expected inputs and outputs before proposing a new file as in Figure 14.10.

Claude Code displaying a proposed new file named worldbank_models.R. The agent explains that it reviewed existing helper scripts to understand project conventions before generating a new modeling module containing linear model and ANOVA helper functions.
Figure 14.10: Claude Code displaying the proposed new script as a diff (although it is hard to see).

Unlike the previous tasks, Claude cannot rely on a nearby function as a template. Instead, it must infer the overall design philosophy of the repository and then apply those conventions consistently across an entirely new set of functions.

After the proposed changes are approved, Claude Code generates the new script and summarizes the functions it created as in Figure 14.11.

RStudio displaying the completed worldbank_models.R script while Claude Code summarizes the newly created functions for fitting linear models, performing ANOVA analyses, generating summary tables, and reporting model fit statistics.
Figure 14.11: Claude Code displaying the results of the new script (which is shown in RStudio).

The resulting script contains four functions:

  • fit_worldbank_lm() for fitting linear regression models,
  • fit_worldbank_anova() for fitting one-way ANOVA models,
  • model_summary_table() for generating coefficient and ANOVA summary tables, and
  • model_fit_stats() for reporting model-level statistics.

This task required noticeably more time than the previous two examples. As shown in Figure 14.10 and Figure 14.11, Claude Code spent much of its effort gathering context from the repository before writing any code. Because the task involved creating a new module rather than modifying an existing one, the agent needed to inspect multiple scripts, identify common design patterns, and determine how the new functionality should fit within the broader project structure.
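
The general shape of such helpers can be sketched in a few lines. The functions below are hypothetical illustrations, not the contents of worldbank_models.R: the names, arguments, and the fallback when {broom} is not installed are assumptions, and the example uses the built-in mtcars data.

```r
# Hypothetical sketches; names and arguments are illustrative only.
fit_lm_sketch <- function(df, outcome, predictors) {
  # Build "outcome ~ x1 + x2" from character inputs
  formula_str <- paste(outcome, "~", paste(predictors, collapse = " + "))
  lm(as.formula(formula_str), data = df)
}

summary_table_sketch <- function(fit, use_broom = TRUE) {
  # Return a tidy data frame when requested and {broom} is available
  if (use_broom && requireNamespace("broom", quietly = TRUE)) {
    return(broom::tidy(fit))
  }
  # Otherwise fall back to the standard base R summary
  summary(fit)
}

fit <- fit_lm_sketch(mtcars, outcome = "mpg", predictors = c("wt", "hp"))
base_summary <- summary_table_sketch(fit, use_broom = FALSE)
print(base_summary)
```

Taking the outcome and predictors as strings lets the same function be called with any indicator names, which is convenient when the columns are chosen programmatically.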

The progression across the three tasks is instructive:

  1. Modify an existing function (scatterplot smoother).
  2. Create a new function within an existing script (boxplot).
  3. Create an entirely new module (modeling helpers).

As the scope of the task increases, the amount of repository context Claude Code must gather also increases. This illustrates one of the defining characteristics of repository-aware agents: they do not simply generate code from a prompt, they actively explore and learn from the surrounding codebase before proposing a solution.

TipPrompts Become Shorter, Context Does Not

When first working with a repository agent, many people over-prompt because they are still thinking like chat-assistant users.

Repository-aware agents often require less detailed prompts than chat assistants.

  • This does not mean they need less context. Rather, much of the context is already available through the repository structure, existing code, project documentation, and files such as CLAUDE.md.
  • As a result, prompts can focus more on the desired outcome and less on explaining the environment in which the work takes place.

Start with relatively brief prompts and only add additional constraints when the agent’s initial approach differs from what you intended.

14.4.5 Test the Code

Let’s try quick examples of the updated and new code.

source("R/modified/worldbank_plots.R")


make_indicator_scatter(
  wb_df = wb_wide,
  x_indicator = "fertility_rate",
  y_indicator = "population_growth",
  years = 2023,
  label_countries = TRUE,
  smoother = "lm"
)

make_indicator_scatter(
  wb_df = wb_wide,
  x_indicator = "gdp_per_capita_current_usd",
  y_indicator = "life_expectancy_total",
  years = 2023,
  smoother = "loess"
)

make_indicator_boxplot(
  wb_df = wb_wide,
  indicator = "fertility_rate",
  years = 2000:2023,
  notch = TRUE
)

make_indicator_boxplot(
  wb_df = wb_wide,
  indicator = "population_growth",
  years = c(2000, 2005, 2010, 2015, 2020, 2023),
  notch = TRUE
)

  • Each of the plotting functions runs without error. There are some changes one might still want to make, such as editing the functions to use different defaults.

Let’s check the model functions.

  • By default, the functions use the {broom} package to return a tidy data frame.
    • This format is easier to filter, sort, and incorporate into reports.
  • Setting use_broom = FALSE returns the standard base R model summary instead.
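
#| message: false
#| warning: false
#| layout-ncol: 2
#| code-fold: true
# Load the modeling helpers created by Claude Code
source("R/modified/worldbank_models.R")

# Model 2023 population growth from fertility rate and GDP per capita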
lm_fit <- fit_worldbank_lm(
  wb_wide,
  outcome = "population_growth",
  predictors = c(
    "fertility_rate",
    "gdp_per_capita_current_usd"
  ),
  years = 2023
)

model_summary_table(lm_fit)
# A tibble: 3 × 5
  term                         estimate  std.error statistic  p.value
  <chr>                           <dbl>      <dbl>     <dbl>    <dbl>
1 (Intercept)                -2.32      2.32          -1.00  0.322   
2 fertility_rate              0.820     1.56           0.526 0.602   
3 gdp_per_capita_current_usd  0.0000337 0.00000934     3.61  0.000896
model_fit_stats(lm_fit)
# A tibble: 1 × 12
  r.squared adj.r.squared sigma statistic p.value    df logLik   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>   <dbl> <dbl>  <dbl> <dbl> <dbl>
1     0.261         0.221  1.74      6.52 0.00374     2  -77.3  163.  169.
# ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>
model_summary_table(
  lm_fit,
  use_broom = FALSE
)

Call:
lm(formula = as.formula(formula_str), data = model_df)

Residuals:
    Min      1Q  Median      3Q     Max 
-7.0750 -0.4806  0.1722  0.6056  4.0655 

Coefficients:
                             Estimate Std. Error t value Pr(>|t|)    
(Intercept)                -2.323e+00  2.316e+00  -1.003 0.322314    
fertility_rate              8.205e-01  1.558e+00   0.526 0.601704    
gdp_per_capita_current_usd  3.375e-05  9.342e-06   3.612 0.000896 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Residual standard error: 1.738 on 37 degrees of freedom
Multiple R-squared:  0.2607,    Adjusted R-squared:  0.2208 
F-statistic: 6.525 on 2 and 37 DF,  p-value: 0.00374

#| message: false
#| warning: false
#| layout-ncol: 2
#| code-fold: true
# Model 2023 life expectancy from GDP per capita and fertility rate
lm_fit <- fit_worldbank_lm(
  wb_wide,
  outcome = "life_expectancy_total",
  predictors = c(
    "gdp_per_capita_current_usd",
    "fertility_rate"
  ),
  years = 2023
)

model_summary_table(lm_fit)
# A tibble: 3 × 5
  term                         estimate std.error statistic  p.value
  <chr>                           <dbl>     <dbl>     <dbl>    <dbl>
1 (Intercept)                81.7       3.01          27.2  4.78e-26
2 gdp_per_capita_current_usd  0.0000726 0.0000121      5.99 6.50e- 7
3 fertility_rate             -3.31      2.02          -1.63 1.11e- 1
model_fit_stats(lm_fit)
# A tibble: 1 × 12
  r.squared adj.r.squared sigma statistic     p.value    df logLik   AIC   BIC
      <dbl>         <dbl> <dbl>     <dbl>       <dbl> <dbl>  <dbl> <dbl> <dbl>
1     0.534         0.509  2.26      21.2 0.000000734     2  -87.7  183.  190.
# ℹ 3 more variables: deviance <dbl>, df.residual <int>, nobs <int>

#| message: false
#| warning: false
#| layout-ncol: 2
#| code-fold: true
# One-way ANOVA of fertility rate grouped by year, 2000 to 2023
aov_fit <- fit_worldbank_anova(
  wb_wide,
  outcome = "fertility_rate",
  group = "year",
  years = 2000:2023
)

model_summary_table(aov_fit)
# A tibble: 2 × 6
  term         df sumsq meansq statistic p.value
  <chr>     <dbl> <dbl>  <dbl>     <dbl>   <dbl>
1 year         23  2.27 0.0986      1.50  0.0631
2 Residuals   936 61.7  0.0660     NA    NA     
model_fit_stats(aov_fit)
# A tibble: 1 × 6
  logLik   AIC   BIC deviance  nobs r.squared
   <dbl> <dbl> <dbl>    <dbl> <int>     <dbl>
1  -45.0  140.  262.     61.7   960    0.0354
model_summary_table(
  aov_fit,
  use_broom = FALSE
)
             Df Sum Sq Mean Sq F value Pr(>F)  
year         23   2.27 0.09860   1.495 0.0631 .
Residuals   936  61.73 0.06595                 
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

#| message: false
#| warning: false
#| layout-ncol: 2
#| code-fold: true
# One-way ANOVA of 2023 life expectancy grouped by country
aov_fit <- fit_worldbank_anova(
  wb_wide,
  outcome = "life_expectancy_total",
  group = "country_name_ascii",
  years = 2023
)

model_summary_table(aov_fit)
# A tibble: 1 × 4
  term                  df sumsq meansq
  <chr>              <dbl> <dbl>  <dbl>
1 country_name_ascii    39  404.   10.4
model_fit_stats(aov_fit)
# A tibble: 1 × 6
  logLik   AIC   BIC deviance  nobs r.squared
   <dbl> <dbl> <dbl>    <dbl> <int>     <dbl>
1    Inf  -Inf  -Inf        0    40         1
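  • These all run without error. Note, however, that the last ANOVA has only one 2023 observation per country, so it has no residual degrees of freedom: the output shows no F statistic, a deviance of 0, and an R-squared of 1.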
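
One way to catch a grouping variable with a single observation per group, such as country within a single year, before fitting an ANOVA is to compare the number of observations with the number of groups. The following is a minimal sketch in base R. The helper `check_group_replication()` and the toy data frames are hypothetical illustrations and are not part of the project repository.

```r
# Hypothetical helper (not part of the project repository): compare the
# number of observations with the number of groups before fitting an ANOVA.
check_group_replication <- function(df, group) {
  n_obs <- nrow(df)
  n_groups <- length(unique(df[[group]]))
  list(
    n_obs = n_obs,
    n_groups = n_groups,
    residual_df = n_obs - n_groups,
    testable = n_obs > n_groups
  )
}

# Toy data: one observation per country, as in a single-year slice
one_year <- data.frame(
  country = c("A", "B", "C", "D"),
  life_expectancy = c(78.1, 80.4, 75.9, 82.3)
)

# Toy data: two observations per country, so residual variation exists
two_years <- data.frame(
  country = rep(c("A", "B", "C", "D"), times = 2),
  life_expectancy = c(78.1, 80.4, 75.9, 82.3, 78.6, 80.9, 76.2, 82.5)
)

check_one <- check_group_replication(one_year, "country")
check_two <- check_group_replication(two_years, "country")

# The saturated fit still runs, but it leaves no residual degrees of freedom
fit_one <- aov(life_expectancy ~ country, data = one_year)
df.residual(fit_one)
```

A check like this does not replace judgment about the model, but it flags cases where the fit statistics cannot be interpreted.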

14.4.6 Summary of Claude Code’s Performance

Claude Code performed well on all three development tasks. Most importantly, the resulting functions worked correctly without requiring manual debugging or substantial revision.

  • This success was likely due in part to the rich context available within the repository.
  • Rather than relying solely on information provided in a prompt, Claude Code was able to inspect existing scripts, identify coding conventions, study function patterns, and infer how new functionality should fit into the project.

The progression across the three tasks also demonstrated increasing levels of repository awareness.

  • The scatterplot modification required understanding and extending an existing function.
  • The boxplot task required creating a new function that matched the structure and style of the existing plotting helpers.
  • The modeling task required creating an entirely new script while maintaining consistency with the broader project architecture.

At the same time, Claude Code made a number of implementation decisions without consulting the user.

  • For example, when adding regression and LOESS smoothers, it chose to include confidence intervals by setting se = TRUE.
  • This was a reasonable default choice, but another analyst might have preferred to suppress confidence bands.
  • Repository-aware agents frequently make these kinds of decisions when details are left unspecified.

One interesting omission was the lack of any suggestion to use the ... argument when designing the plotting functions.

  • Using ... would have made the functions more flexible by allowing additional graphical parameters to be passed directly to the underlying ggplot2 geoms.
    • Whether this reflects a limitation of the model, the examples available in the repository, or simply a conservative design choice is difficult to determine.
    • It serves as a reminder that even when the generated code is correct, experienced developers may still identify alternative designs or improvements that the agent does not propose.

Overall, the examples illustrate both the strengths and limitations of repository-aware agents.

  • They can rapidly generate useful, working code that conforms to project conventions, but they do not eliminate the need for human review, design judgment, and software engineering expertise.

Note

Advanced Topic: The ... Argument

Many R functions include a special argument called ... (“dot-dot-dot”).

  • The ... argument collects additional arguments supplied by the user so that the function can pass them on to another function.
  • This is a common design pattern in the tidyverse and allows a function to remain flexible without exposing every possible option as a formal argument.
  • For example, a plotting function could pass ... directly to geom_point(), allowing users to control point size, color, shape, transparency, and other graphical settings without modifying the function itself.

As you develop more sophisticated reusable functions, ... becomes an important tool for balancing simplicity and flexibility.
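
As a minimal sketch of this pattern, the hypothetical helper below forwards ... to geom_point(). It is not one of the functions Claude Code generated; the name `make_simple_scatter()` is an assumption made for illustration, and the example uses the built-in mtcars data and assumes {ggplot2} is installed.

```r
library(ggplot2)

# Hypothetical helper (not from the project repository): anything supplied
# through ... is forwarded unchanged to geom_point().
make_simple_scatter <- function(df, x_var, y_var, ...) {
  ggplot(df, aes(x = .data[[x_var]], y = .data[[y_var]])) +
    geom_point(...) +
    labs(x = x_var, y = y_var)
}

# No extra arguments: geom_point() uses its defaults
p_default <- make_simple_scatter(mtcars, "wt", "mpg")

# Extra arguments travel through ... to geom_point()
p_custom <- make_simple_scatter(
  mtcars, "wt", "mpg",
  size = 3, alpha = 0.5, colour = "steelblue"
)
```

The function signature stays short, yet callers can still adjust any constant aesthetic that geom_point() accepts.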

Note

Repository Agents Can Do Much More Than Generate Code

The examples in this section focused on modifying and extending R functions, but repository agents such as Claude Code can perform far more sophisticated tasks.

  • Beyond understanding and editing code, they can inspect entire repositories, coordinate work across multiple files, execute terminal commands, run tests, generate documentation, and assist with Git workflows.

Repository agents can also be customized through project context files such as CLAUDE.md, reusable workflows known as skills, and specialized subagents that handle specific responsibilities such as visualization, code review, or documentation.

  • For example, a skill might automate a multi-step data validation workflow, while a visualization subagent could generate publication-quality figures without requiring the primary agent to manage the plotting details directly.
  • These capabilities allow repository agents to support complex analytical and software development workflows that extend well beyond the capabilities of most chat assistants or IDE assistants.

The tradeoff is that greater autonomy requires greater oversight. As repository agents gain the ability to modify files, execute commands, and coordinate multi-step workflows, careful review, testing, and human judgment become increasingly important.

14.5 Working with Positron Assistant

Positron Assistant is like having Claude inside your IDE.

  • Rather than switching to a terminal or a browser tab, the Assistant panel lives in Positron’s sidebar alongside your Variables pane, Plots pane, and console, the same environment where the rest of your analytical work happens.

Positron Assistant is repository-aware.

  • It can inspect open files in the editor as well as other files in the repository.
  • It sees your loaded data frames, your plots, your console history, and your R session state.
  • When you ask a question about a data quality issue, it already knows the shape of the data you are working with.
  • When you ask for a plot fix, it can see the plot that is currently in the Plots pane.
  • That session-level context is what makes it especially useful for exploratory data science work rather than general coding assistance.

14.5.1 Getting Started

Positron Assistant is not an extension to install; it is built into Positron and available from version 2025.07.0-204 onward.

  • Positron Assistant is enabled by default for Anthropic’s Claude models through their Console API.
  • Positron Assistant requires an Anthropic API key and usage is billed according to Anthropic’s pricing.
  • To begin, get an API key from Anthropic and register it inside Positron.

14.5.2 The Positron Assistant Panel

Figure 14.12 shows Positron with the Assistant panel open.

The Assistant Panel has three parts worth looking at before using it.

Positron IDE showing the Assistant chat panel on the left, R 4.5.2 console in the center, and Variables and Plots panes on the right. Blue annotations indicate the toggle icon on the activity bar and the model selector at the bottom of the chat pane.
Figure 14.12: The Positron Assistant panel open alongside the R console. The activity bar icon toggles the panel; the model selector and mode dropdown sit at the bottom of the chat pane.

The activity bar icon (left edge, annotated) toggles the Assistant panel open and closed.

  • It sits alongside the file explorer, search, and Git icons.
  • The panel is part of the IDE layout, not a floating window.

The chat input near the bottom of the panel is where prompts go.

  • Three features are available inline:
    • @ selects a chat participant such as a specific file or workspace context;
    • # adds explicit context such as a file or variable; and
    • / accesses predefined commands including /help.
  • The Add Context button to the left of the input does the same as # with a point-and-click interface.

The mode selector and model selector dropdowns sit at the bottom on the left and right.

  • The mode selector drop-down shows Ask by default; click it to switch to Edit or Agent.
  • The model selector drop-down shows the current model; click it to switch between available Claude versions or other models you have enabled and authorized.
  • The screenshot shows Claude Sonnet 4.6, which is the default and appropriate for most analytical tasks.

Positron Assistant can see the current session state, such as the loaded variables and the current plot, and will incorporate it into responses without you having to describe it.

14.5.3 Positron Assistant’s Ask, Edit, and Agent Modes

Positron Assistant has three modes, accessible from the drop-down at the bottom left of the Assistant panel.

Ask is the default chat mode. You ask questions, request code, or describe what you want.

  • The response appears in the panel.
  • This is the right mode for targeted questions such as “why is this join producing duplicates”, “what does this warning mean”, “rewrite this function to use purrr::map”, where you want to review the answer before applying it.

Edit applies changes directly to a file you have open.

  • Rather than copying code out of the chat panel, Claude edits the file in place and shows a Git diff.
  • You review the changes and then either accept them with the Keep button or discard them.
  • This is the right mode when you know what you want changed and do not need to reason through it first.

Agent is the autonomous mode.

  • Claude can inspect files, execute searches, write code, and perform multi-step tasks before returning a result.
  • It is useful for larger tasks such as building scripts, investigating repositories, refactoring code, or generating analyses.

For exploratory analysis where you are still forming the question, Ask is usually right. For execution tasks where the question is already clear, Agent is faster; Edit is then useful for refining the resulting code.

Note

Agent mode sends substantially more context to the API than Ask mode because it reads project files, session state, and intermediate results as it works.

  • Positron uses prompt caching with Anthropic’s API, which reduces the cost of repeated context on follow-up exchanges in the same session.
  • If you are monitoring costs, the Output panel under Assistant shows exactly what is being sent and approximately how many tokens each exchange consumed.
Note

Posit does not track, collect, or store your prompts, code, or conversations when using Positron Assistant.

  • Your session data goes directly to Anthropic under your own API key and is subject to Anthropic’s data handling policies, not Posit’s.
  • This is the same arrangement as using the API directly. Posit is providing the client, not the model.

14.5.4 Example: Working with Positron Assistant

To illustrate this workflow, consider the World Bank Europe project and suppose we want to study migration patterns in Europe and understand whether emigration may help explain some of the long-run population changes observed in countries such as Albania.

  • Because Positron Assistant already has access to the repository and session state, it can begin from the current project rather than a blank chat window.
  • The repository already contains data and functions for downloading and processing World Bank indicators, creating visualizations, and fitting simple models.

Before searching for additional data ourselves, we can switch to Agent mode and pose the following question:

Scan the repo for the types of data that exist from the world bank and suggest other means to get data about immigration and emigration from European countries. Are there reports or PDF doucments or oter data sources that would work and for which you could write R or python code to convert tables into data

This example illustrates a complete Agent-mode workflow.

The assistant:

  1. inspected the repository,
  2. identified missing migration indicators,
  3. evaluated alternative data sources,
  4. generated code to acquire new data, and
  5. remained available to continue the analysis.

The screenshots below show each stage of that process.

Figure 14.13 shows Positron Assistant operating in Agent mode. Rather than answering immediately, the assistant first examined the repository to understand the available data sources, scripts, and metadata files.

The assistant inspected multiple files, including the World Bank API specifications spreadsheet, plotting functions, correlation scripts, and data acquisition code. This initial investigation allowed it to determine which indicators were already available and identify gaps in the existing workflow.

Positron Assistant in Agent mode reviewing multiple repository files and displaying an activity log of automated file inspection and search operations.
Figure 14.13: Positron Assistant operating in Agent mode while scanning the repository. The assistant inspects project files, metadata spreadsheets, and analysis scripts before generating a response. The activity log shows multiple file retrieval and search operations performed automatically by the agent.

After examining the repository, the assistant concluded that the project did not contain direct immigration or emigration indicators. It then suggested several external sources that could be integrated into the workflow.

The response included migration-related indicators available through the World Bank, as well as alternative sources such as Eurostat, UN DESA, OECD migration databases, UNHCR refugee statistics, and IOM reports. Importantly, the assistant discussed both machine-readable APIs and PDF-based reports, along with packages that could be used to extract data from each source.

Assistant response summarizing migration data sources and showing example R code for downloading or processing migration datasets.
Figure 14.14: Positron Assistant presenting a structured summary of migration data sources, including World Bank migration indicators and alternative sources such as Eurostat, UN DESA, OECD, UNHCR, and IOM. Example R code is provided for accessing several of the recommended resources.

Once a suitable source had been identified, the assistant was asked to generate code that followed the conventions already established in the repository.

Because the agent had already examined the existing data acquisition scripts, it was able to create a new script that mirrored the structure, naming conventions, and processing pipeline used elsewhere in the project. The generated code included migration indicators, data acquisition logic, and documentation describing the workflow.

Editor window displaying a newly generated R script for downloading and processing World Bank migration indicators.
Figure 14.15: A new migration data acquisition script generated by Positron Assistant. The code follows the structure and conventions of existing repository scripts while incorporating migration indicators obtained from the World Bank API.

After completing the initial task, the assistant remained available to continue development. Additional prompts could request visualizations, joins with existing datasets, exploratory analyses, or entirely new workflows built on the migration data.

This illustrates an important distinction between Agent mode and traditional chat interfaces. The interaction becomes an iterative development process rather than a single question-and-answer exchange.

Completed Agent mode conversation showing repository analysis results and a prompt area ready for additional development tasks.
Figure 14.16: Positron Assistant after completing the repository scan and migration data investigation. The chat interface displays the completed analysis and waits for the next development task.
Note

The agent was not limited to answering a question about migration data. It first investigated the repository, identified missing information, researched potential sources, generated code to obtain new data, and then remained available to continue the analysis. This type of iterative workflow is where Agent mode begins to resemble a lightweight software-development assistant rather than a traditional chatbot.

Figure 14.13 through Figure 14.16 exemplify a typical Agent mode interaction.

  • Subsequent refinement could continue in Ask or Edit mode, where the context is typically narrower and focused on the specific files being modified (and generally fewer tokens are consumed).
Important

Even when an assistant identifies a promising data source, the analyst remains responsible for evaluating the quality of that source.

Questions such as data provenance, collection methodology, geographic coverage, missing values, and comparability across countries still require human judgment.

  • AI can accelerate discovery and implementation, but it cannot determine whether a source is appropriate for a particular research question.

14.6 Agent SDKs

Agent Software Development Kits (SDKs) provide a software framework for building custom AI agents.

  • Rather than interacting with a pre-built assistant, developers define the agent’s instructions, tools, workflows, and constraints in code.
  • Modern SDKs handle many of the difficult infrastructure tasks automatically, including tool registration, conversation management, model communication, and execution loops.
  • This allows developers to focus on the domain-specific logic of the agent rather than the underlying plumbing.
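
To make these responsibilities more concrete, the following toy sketch in base R shows a tool registry and an execution loop. It is not the API of any real SDK: the "model" is a stub function that simply requests each tool once, and all names (`tools`, `stub_model()`, `run_agent()`) are hypothetical.

```r
# Tool registry: a named list of ordinary functions (hypothetical tools)
tools <- list(
  mean_indicator = function(values) mean(values, na.rm = TRUE),
  count_missing  = function(values) sum(is.na(values))
)

# Stub standing in for an LLM: it requests each tool once, then finishes.
# A real SDK would send the conversation history to a model at this point.
stub_model <- function(history, tool_names) {
  used <- vapply(history, function(step) step$tool, character(1))
  remaining <- setdiff(tool_names, used)
  if (length(remaining) > 0) {
    list(action = "call_tool", tool = remaining[1])
  } else {
    list(action = "finish")
  }
}

# Execution loop: ask the model, run the requested tool, record the result
run_agent <- function(values, max_steps = 5) {
  history <- list()
  for (i in seq_len(max_steps)) {
    decision <- stub_model(history, names(tools))
    if (decision$action == "finish") break
    result <- tools[[decision$tool]](values)
    history[[length(history) + 1]] <- list(tool = decision$tool, result = result)
  }
  history
}

trace <- run_agent(c(1.5, NA, 2.5, 3.5))
str(trace)
```

The `max_steps` argument caps the loop, which is one simple way to keep an agent from running indefinitely. An SDK supplies this kind of loop, along with model communication and tracing, so that only the tools themselves need to be written.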

The OpenAI Agents SDK is a widely used example of this approach.

  • A data scientist might create an agent that analyzes World Bank data, retrieves country-level indicators, generates visualizations, and produces written summaries.
  • Each capability is implemented as a tool that the agent can call when needed.
  • Because the tools are (usually) written in Python, they can be tailored to the specific needs of a project while still benefiting from the SDK’s built-in orchestration and tracing capabilities.

The key advantage of an SDK is flexibility.

  • Claude Code, Codex, Positron Assistant, and similar systems are designed primarily to help users work with code and projects through a predefined interface.
  • An SDK allows you to move beyond those built-in workflows and create agents specialized for a particular task, organization, dataset, or business process.
  • This additional flexibility comes at the cost of greater development effort since you are responsible for designing, testing, and maintaining the agent.

In practice, many data scientists begin with tools such as Claude Code or Positron Assistant because they provide immediate productivity gains with little setup.

  • SDKs become more attractive when the workflow is performed repeatedly, requires specialized tools, must integrate with existing systems, or needs capabilities that are not available in a general-purpose coding assistant.
  • At that point, building a custom agent can provide a level of automation and control that exceeds what a pre-built assistant can offer.

14.7 Leading Edge Concepts and Practices

14.7.1 Reusable Agent Definitions

An emerging practice is reusable agent definitions.

  • Rather than writing a new prompt for every task, practitioners create role descriptions that define responsibilities, preferred workflows, and expected outputs.
  • These definitions can often be reused across projects and, with minor modifications, across different agent frameworks.

Community repositories now provide agent definitions for roles such as data scientist, software engineer, code reviewer, researcher, and project manager.

In many cases, the same role definition can be used with Claude Code, Codex, SDK-based agents, or other agent frameworks, making the agent specification itself a reusable asset.

14.7.2 Retrieval-Augmented Generation (RAG)

Retrieval-Augmented Generation (RAG) combines information retrieval with LLM generation.

  • Rather than relying exclusively on information learned during training, a RAG system first searches a collection of documents, databases, project files, or other resources for information relevant to the user’s question.
    • This is a way to get your latest, most accurate, or proprietary data or content into the model’s context.
  • The retrieved material is then inserted into the model’s context window, allowing the model to generate a response grounded in the retrieved content rather than relying entirely on its internal knowledge.

Many of the AI tools discussed in this chapter use some form of retrieval.

  • Claude.ai can search uploaded files, Claude Code can retrieve relevant source files from a repository, and Positron Assistant can inspect project files, data frames, plots, and session state.
  • In each case, the underlying model remains largely the same, but the quality of the response improves because the model is given access to information specific to the current task.
  • As projects grow beyond what can fit into a single prompt, retrieval becomes one of the primary mechanisms for providing relevant context to an AI system.

Conceptually, the workflow looks like:

Question -> Retrieve Information -> Add Context -> Generate Response

RAG systems are widely used when assistants must work with large collections of documents, reports, databases, or project files.
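
The four steps of this workflow can be sketched in base R with a toy retriever that scores documents by word overlap. This is a simplified illustration: production RAG systems typically use more capable search methods, the document descriptions below are hypothetical text written for this example, and the final call to a model is omitted.

```r
# Toy document store (hypothetical descriptions written for this illustration)
docs <- c(
  plots  = "make_indicator_scatter draws a scatterplot of two indicators",
  models = "fit_worldbank_lm fits a linear model for one outcome and several predictors",
  data   = "wb_wide holds one row per country and year with one column per indicator"
)

# Split text into unique lower-case words
tokenize <- function(text) {
  words <- tolower(unlist(strsplit(text, "[^A-Za-z0-9_]+")))
  unique(words[nzchar(words)])
}

# Retrieve: rank documents by the number of words shared with the question
retrieve <- function(question, docs, top_k = 1) {
  q_words <- tokenize(question)
  scores <- vapply(
    docs,
    function(d) length(intersect(q_words, tokenize(d))),
    integer(1)
  )
  names(sort(scores, decreasing = TRUE))[seq_len(top_k)]
}

# Add context: place the retrieved text ahead of the question
build_prompt <- function(question, docs, top_k = 1) {
  hits <- retrieve(question, docs, top_k)
  paste0(
    "Use only the context below to answer.\n\n",
    "Context:\n", paste0("- ", docs[hits], collapse = "\n"), "\n\n",
    "Question: ", question
  )
}

question <- "Which function fits a linear model?"
top_doc <- retrieve(question, docs)
prompt <- build_prompt(question, docs)

# Generate: a real system would now send `prompt` to an LLM (omitted here)
cat(prompt)
```

Only the best-matching document is added to the prompt, which is how retrieval keeps the context small even when the document collection is large.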

14.7.3 Vibe Coding

Vibe coding refers to a style of software development in which a programmer describes goals in natural language and relies heavily on an AI assistant to generate, modify, and explain code.

  • This approach can dramatically accelerate prototyping, reduce repetitive coding tasks, and make it easier to explore alternative solutions.
  • Tools such as Claude Code, Positron Assistant, GitHub Copilot, Cursor, and other AI-enabled development environments have made this workflow increasingly common.
  • However, the effectiveness of vibe coding depends heavily on the developer’s ability to evaluate the resulting code.
    • AI-generated solutions can contain subtle bugs, incorrect assumptions, inefficient algorithms, or architectural problems that may not become apparent until much later.
  • The most successful practitioners use AI to accelerate implementation while still applying the same testing, debugging, and code review practices they would use for human-written code.
Warning 14.1

A Common Risk

If you do not know what a correct solution should look like, it becomes much harder to recognize incorrect code generated by an AI assistant.

AI-generated code should be treated as a starting point for review rather than unquestioned truth.

14.7.4 Human-in-the-Loop AI

Human-in-the-loop systems combine AI automation with human review and decision making.

Rather than allowing an AI system to operate completely independently, important actions require human approval. Examples include reviewing generated code before merging it, approving an agent’s proposed plan before execution, or validating analytical results before publication.

Many modern AI development tools use this approach. Claude Code asks for approval before potentially destructive actions, Positron Assistant requires confirmation before applying edits, and production agent systems often pause at predefined checkpoints.

The goal is not to eliminate human involvement but to combine the speed of automation with human judgment and accountability.

14.7.5 Persistent and Multi-Agent Workflows

When people talk about agents running “all day” or “24/7,” they usually do not mean that a model is continuously reasoning in the background.

  • More often, they mean that an agent has been connected to a scheduler, event trigger, queue, or cloud environment.
  • The agent wakes up when something happens, checks relevant information, takes a limited action, records the result, and then stops until the next trigger.
  • For example, a price-monitoring agent might check a product page every hour, a market-monitoring agent might look for large stock-price changes, and a coding-summary agent might review Git commits at the end of each day and suggest ways to improve the developer’s workflow.
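
The wake, check, act, record, and stop cycle can be sketched as a single function that a scheduler calls once per trigger. The example below is a minimal base R illustration: the function name, threshold, and prices are hypothetical, and the "action" is only a label rather than a real notification.

```r
# One triggered run: check a value, choose a limited action, record the result.
# The function name, threshold, and prices are hypothetical.
run_price_check <- function(current_price, threshold, log = NULL) {
  action <- if (current_price < threshold) "notify" else "none"
  entry <- data.frame(
    checked_at = format(Sys.time(), "%Y-%m-%d %H:%M:%S"),
    price = current_price,
    action = action,
    stringsAsFactors = FALSE
  )
  # Append this run to the log and return; nothing keeps running afterwards
  rbind(log, entry)
}

# A scheduler would call the function once per trigger;
# here two triggers are simulated by hand.
run_log <- run_price_check(105, threshold = 100)
run_log <- run_price_check(95, threshold = 100, log = run_log)
run_log[, c("price", "action")]
```

Between calls nothing is running. The log is the only state that persists from one trigger to the next.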

In coding workflows, this can take several forms.

  • With terminal-based tools such as Claude Code, a developer may open one or more terminal sessions, give each session a different scoped task, and let the agents work independently on separate branches or files.
    • One agent might refactor plotting functions, another might write tests, and another might update documentation.
  • With cloud-based coding agents such as Codex, the work may run in separate cloud environments in the background, including in parallel, rather than in visible local terminal windows.
  • In either case, the practical pattern is the same:
    • split a large project into smaller tasks,
    • give each agent clear instructions,
    • isolate its workspace, and
    • review its changes before merging them.
  • Codex Cloud is explicitly designed for background and parallel coding tasks, while Claude Code is designed to work from a terminal, IDE, web interface, or other connected development surface.

For monitoring personal or online activity, the workflow is similar but less code-centered.

  • A practitioner defines the target to monitor, the trigger condition, the tools the agent can use, and the notification or storage mechanism.
    • A simple version might be: every morning, read yesterday’s Git commits, summarize what changed, identify unfinished work, and write a short planning note.
    • A more complex version might monitor product prices, public web pages, RSS feeds, GitHub issues, API endpoints, or analytics dashboards.
    • The important design choice is that the agent should not have unlimited authority.
    • It might be allowed to read data and draft recommendations, but sending emails, placing orders, trading stocks, deleting files, or changing production systems should require explicit human approval.
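
One way to express limited authority in code is an explicit permission check that runs before any action. The sketch below is a hypothetical base R illustration: the action names and the `authorize_action()` function are assumptions rather than part of any agent framework, and unknown actions are blocked by default.

```r
# Hypothetical policy: actions the agent may take on its own versus
# actions that need explicit human approval.
read_only_actions <- c("read_data", "draft_recommendation")
gated_actions <- c("send_email", "place_order", "delete_file")

authorize_action <- function(action, approved_by = NULL) {
  if (action %in% read_only_actions) {
    return("allowed")
  }
  if (action %in% gated_actions && !is.null(approved_by)) {
    return("allowed_with_approval")
  }
  # Gated actions without approval and unknown actions are both refused
  "blocked"
}

authorize_action("read_data")
authorize_action("send_email")
authorize_action("send_email", approved_by = "analyst")
authorize_action("format_disk")
```

Refusing anything that is not explicitly listed means a new capability has to be added to the policy deliberately before the agent can use it.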

Multi-agent systems extend this pattern by assigning different roles to different agents.

  • One agent might gather data, another might validate sources, another might write a summary, and another might review the output for errors.
  • This can be more reliable than asking a single large agent to do everything, because each agent has a narrower task and clearer success criteria.
  • The tradeoff is operational complexity: multiple agents require coordination, logging, cost controls, permissions, and a way to resolve conflicts when agents disagree or modify overlapping files.

The safest way to interpret claims about “always-on agents” is to translate them into familiar infrastructure terms:

  • scheduled jobs,
  • event-driven workflows,
  • background tasks,
  • logs,
  • permissions, and
  • review steps.

The novelty is that an LLM can now decide which tools to call and how to synthesize the results.

  • The responsibility remains the same: define scope carefully, monitor what the agent does, and keep humans in control of consequential actions.

14.8 Responsible Use of AI, Assistants, and Agents

The tools discussed throughout this chapter can accelerate many aspects of data science and software development. Whether you are working with a chat assistant, a repository-aware coding tool, or a fully autonomous agent, the same fundamental responsibilities remain.

14.8.1 Five Principles for Data Science and AI

  1. Verify results before using them.
    • AI-generated outputs can be convincing while still being incorrect. Always validate important code, analyses, and conclusions.
  2. Understand the data.
    • No AI system can replace knowledge of how data were collected, what variables mean, or what limitations exist in the dataset.
  3. Maintain reproducible workflows.
    • Analyses should remain transparent and reproducible through code, documentation, version control, and clear records of how results were produced.
  4. Protect sensitive information.
    • Before sharing data with cloud-based AI systems, understand privacy requirements, ownership restrictions, and applicable regulations.
  5. Remain accountable for decisions.
    • AI tools can assist with analysis and implementation, but responsibility for the final work remains with the human using the tool.

14.8.2 Five Additional Principles for Agents

  1. Limit an agent’s scope.
    • Give agents only the permissions and tools needed to complete a task. More capability also means greater risk.
  2. Keep humans in the loop for important actions.
    • Actions such as modifying files, executing code, sending messages, or updating databases should include opportunities for human review.
  3. Evaluate outputs, not appearances.
    • Agents often produce polished and confident results. Correctness must be verified independently.
  4. Monitor costs and resource usage.
    • Agentic workflows can consume large numbers of tokens, API calls, and computational resources if left unchecked.
  5. Avoid deskilling yourself.
    • The ability to evaluate an agent’s work depends on understanding the underlying methods. Continue developing your analytical and programming skills rather than relying entirely on automation.
Important

A useful rule of thumb is simple:

Never delegate responsibility, only tasks.

AI assistants and agents can generate code, analyze data, search repositories, and automate workflows, but they cannot assume responsibility for the quality, correctness, or consequences of the work they produce.

14.9 Summary

  • Most practitioners use existing assistants and frameworks rather than building agents from scratch.
  • AI assistants can help analyze larger and more complex projects.
  • Project context plays a critical role in agent performance.
  • Skills, subagents, and retrieval help manage complexity and reduce context size.
  • Vibe coding can accelerate development but requires careful oversight.
  • Version control becomes even more important when working with AI agents.
  • Responsible data science practices remain essential regardless of how much assistance is provided by AI systems.