Updated the chapter 7 section on scraping multiple dynamic pages to align with new website selectors and added saved data to show the results.
E.2 September 2025
Added examples of {httr2} functions for multiple requests and other APIs in chapter 6.
Added chapter 19 on working with LLMs and first section on Prompt Engineering.
Updated section on Web Scraping to reflect current IMDB site.
Updated section on Web Scraping to correct issues with the updated Wikipedia site for mosque data.
Updated sections on scraping dynamic pages.
Updated the section on RSelenium configuration to discuss the approach with Chrome for Testing and to replace the no-longer-free call to netstat with the free {httpuv} package.
Greatly expanded and reorganized the SQL chapter to address many more building blocks of the syntax.
E.3 August 2025
Added an appendix of General Instructions for Assignments.
Added a section on scraping JSON-LD scripts with embedded data.
E.4 June 2025
Updated the section 7 URL for the Wikipedia Hurricane data to point to the new page.
Added new appendix discussing projects, folders, files, and paths. Includes using the {here} package.
E.5 April 2025
Completely rewrote section 7.7 to address the websites changing from static HTML to JavaScript-driven pages. Introduced {rplaywright}.
Turned off evaluation in 7.9 because the websites have changed, so not all of the code works, but it remains illustrative of the approach. Added a warning.
Updated a few expired links due to changes in publisher sites.
Updated the list of browsers for html_live in 7.5.7.
E.6 October 2024
Updated the Web scraping section to use {rvest} read_html_live().
E.7 August 2024
Extensive update to section 7 on web scraping to account for changes in IMDB website.
Added a new example using Selector Gadget with UN Aid organizations.
Changed all the IMDB sections to use Developer Tools with new CSS selectors.
Added a section on using the {chromote} package to scroll the IMDB pages.
Adjusted the Taylor Swift example to match new column headers on Wikipedia.
E.8 July 2024
Updated appendix A for installing software.
Updated appendix B to eliminate PATs for GitHub and include two-factor authentication and GCM for both Mac and Windows.
E.9 May 2024
Section 7.9: Updated from the {tabulizer} package for scraping PDFs to its replacement, the {tabulapdf} package.