AU DATA 793 Data Science Practicum

1 Course Information

This course serves as the primary capstone option for the Data Science (MS) program and provides an important bridge between the academic setting and the professional world. Students apply the skills acquired during their program to real-world research problems. Students work with clients from across campus or other companies and institutions on various research problems in data science.

1.1 Purpose

As the capstone experience in the MS program in Data Science, DATA 793 provides a unique opportunity for you to collaborate with a client from academia, government, or various industries and institutions, working on real-world problems in the realm of Data Science. The goal is for you to demonstrate and strengthen your competencies as a professional data scientist while creating a project solution you will be proud to share as part of your portfolio of data science accomplishments.

1.2 Description

Upon successful completion of this course, you will be able to demonstrate competence in developing solutions requiring diverse data science methods.

Specifically, you should be able to:

  • Research and conceptualize a problem, define the scope of the project and break it down into workable components.
  • Gather data necessary to support the solution and clean, shape, explore and analyze the data as required to support the solution.
  • Manage a project effectively by identifying key tasks, projecting and monitoring resources, assessing performance and risk, and ensuring all deadlines are met.
  • Establish a positive working relationship with a client sponsor by understanding the client's needs and deadlines, responding quickly to client requests, and managing their expectations.
  • Deliver an organized, articulate presentation demonstrating the effective application of data science methods and tools and communicating results.
  • Prepare a professional, cleanly written document that encapsulates reproducible results on the topic.

1.3 Prerequisites

This course requires successful completion of DATA-613 as a prerequisite.

1.4 Required Resources

There is no specific text for this course.

Lecture notes can be found at Lecture Notes for AU DATA 793 Data Science Practicum.

1.6 Computing Environment

  • A computing device capable of running one or more programming languages (e.g., R, Python, SQL) and IDEs such as RStudio or VSCode on the device or in the cloud.
  • Access to American University’s Canvas Learning Management System.
  • Internet access capable of supporting Zoom video sessions while using AU’s Zoom license.

1.7 Software Requirements

  • The R statistical programming language, minimum version 4.4.2 (2024-10-31), which is free and may be downloaded from the R website.
  • The free version of the RStudio Desktop Integrated Development Environment (IDE), minimum version 2024.12.0 Build 467.
  • The free software Git.
  • A free personal GitHub account (available at https://github.com/) and access to the GitHub repository website.
  • The free Quarto software to produce documents written in a literate programming style and convert them to HTML/PDF format.
  • See Lecture Notes for AU DATA 413-613 Data Science, Appendices A, B and Chapter 2 for background on installing and using the software.
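
As a rough way to check a machine against the list above, the following Python sketch looks for the command-line tools on the PATH and compares a reported R version string with the stated minimum. The executable names ("R", "git", "quarto") and the helper function names are assumptions made for illustration; on some systems, especially Windows, the tools may be installed without being on the PATH, so a "not found" result is not conclusive.

```python
import re
import shutil

# Minimum R version taken from the software list above.
MIN_R_VERSION = (4, 4, 2)

# Executable names are assumptions; they can differ by platform or install method.
REQUIRED_TOOLS = ("R", "git", "quarto")


def parse_version(text):
    """Return the first dotted version number (x.y.z) found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(version_text, minimum=MIN_R_VERSION):
    """True if the version found in version_text is at least the minimum."""
    return parse_version(version_text) >= minimum


def find_tools(names=REQUIRED_TOOLS):
    """Map each tool name to its location on the PATH, or None if it is not found."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools().items():
        print(f"{name}: {path or 'not found on PATH'}")
```

For example, `meets_minimum("R version 4.4.2 (2024-10-31)")` is true, while a string reporting version 4.3.1 would fail the check.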

2 Schedule of Topics

The following is a general sequence of topics over the 15 weeks of the course.

  • A week starts on the designated course day and ends the day before the next class.
  • Note that Section 001 will have a holiday on 1/20/2025, so after the first week the sections will go in sequence: 003 on Wednesday, 002 on Thursday, and 001 the following Monday.

Week    Topics
1       Course overview, Project Management Overview
2-14    Weekly Project Reviews/Professional Development
15      Project Presentations during Final Exam Period

3 Overall Structure

  • We will use Canvas as our Learning Management System for the course.
  • We will meet each week in person in the designated classroom or on Zoom.
  • The typical class will include individual presentations by each student on the project status and discussion by the group.
  • Students shall also engage online with their peers each week on different discussion topics.
  • Students shall engage with their Client Sponsor each week to discuss the project.
  • Students shall use a GitHub repository established for the class for their course-oriented assignments.
  • Students may use the GitHub repository for client-focused work as permitted by the client.
  • Additional topics on professional development will be covered as time permits.
  • I will use Zoom to record classes. Zoom recordings will be available in the Canvas Media Gallery within a day or two after class.

4 Graded Elements

There are multiple graded elements for the course:

  • Solution Development
  • Final Solution
  • Project Management
  • Client Assessment
  • Professional Development
  • Attendance and Engagement

Each element has one or more assignments with the exception of Client Assessment.

  • Students will be provided Quarto templates for multiple assignments. They shall tailor the documents to fit their project.

Each assignment will have a designated number of possible points.

  • Most assignments will have an associated rubric that shows how the assignment will be evaluated and how points may be allocated.
  • Students are encouraged to review the rubrics before developing the assignments.

4.1 Solution Development

Solution development covers the work focused on creating the solution. Solutions may focus on a research problem, an analytical problem, or a code-focused application. The content and focus of deliverables will vary with the type of solution.

Solution Development includes three assignments: Performance Work Statement (PWS), Solution Framing Report, and Draft Solution Deliverables.

  • The PWS is a foundational document for the course as it formalizes the client’s requirements for the project.
  • The Solution Framing Report focuses on defining the essential aspects of the solution and the technical approach based on a literature review and ethical analysis.
  • The Draft Solution Deliverables are initial versions of reports or working prototypes of applications.

4.2 Final Solution

The Final Solution includes all deliverables as defined by the PWS in their final form as well as the Final Presentation slides and the presentation itself.

4.3 Project Management

Project Management focuses on the plan for the work to create the solution and the management of progress and risk so the final solution meets all client requirements in a timely manner.

Assignments include the Project Plan and Weekly Project Update Presentations (slides and discussion).

Part of the weekly update and final presentation is a reflection on your learning about the content, program management, and client collaboration.

4.4 Client Assessment

The Client Assessment is a significant portion of the course evaluation as the goal is to meet the client’s expectations and requirements established in the PWS.

The client sponsors will provide formal feedback at the end of the course and may provide feedback during the course.

Students will have to submit a client deliverable acceptance form with the final solution.

4.5 Professional Development

This element includes assignments focused on professional development. Topics include Interview Analysis using Big Interview, Resumes for a Job Announcement, and a reflection on Student Online Presence.

4.6 Attendance and Engagement

Attendance will be taken each class. These are small classes. All students are expected to engage during class. This includes asking questions, answering questions from the instructor or other students, and offering your own insights or recommendations.

5 Final Grades

5.1 Weighting of Elements

Element                      Weight
Solution Development         20%
Final Solution               20%
Project Management           15%
Client Assessment            20%
Professional Development     10%
Attendance and Engagement    15%
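
To see how the weights combine, here is a minimal Python sketch of the weighted average. It assumes each element has already been reduced to a percentage score from 0 to 100; the syllabus does not say how assignment points are combined within an element, and the function name and example scores are hypothetical.

```python
# Weights (in percent) copied from the table above; they sum to 100.
WEIGHTS = {
    "Solution Development": 20,
    "Final Solution": 20,
    "Project Management": 15,
    "Client Assessment": 20,
    "Professional Development": 10,
    "Attendance and Engagement": 15,
}


def weighted_score(element_scores, weights=WEIGHTS):
    """Weighted average of per-element percentage scores (0-100 scale assumed)."""
    if set(element_scores) != set(weights):
        raise ValueError("scores must cover exactly the graded elements")
    total_weight = sum(weights.values())
    weighted_sum = sum(weights[name] * element_scores[name] for name in weights)
    return weighted_sum / total_weight


# Hypothetical scores, for illustration only.
example_scores = {
    "Solution Development": 90,
    "Final Solution": 85,
    "Project Management": 95,
    "Client Assessment": 88,
    "Professional Development": 100,
    "Attendance and Engagement": 92,
}
example_result = weighted_score(example_scores)  # 9065 / 100 = 90.65
```

With these made-up scores the weighted average is 90.65, since (20·90 + 20·85 + 15·95 + 20·88 + 10·100 + 15·92) / 100 = 9065 / 100.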

5.2 Converting Scores to Final Grades

Final grades are based on the weighted average of the graded elements rounded to the closest integer with an emphasis on how students demonstrate the course learning outcomes by the end of the course.

Range %        Letter
93 or above    A
90-92          A-
87-89          B+
83-86          B
80-82          B-
77-79          C+
70-76          C
67-69          C-
60-66          D
59 or less     F
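
The conversion can be sketched in Python as a lookup over the lower bound of each range. Two assumptions are made here: the letters A, B, C, D, and F for the ranges 93 or above, 83-86, 70-76, 60-66, and 59 or less follow the usual ordering, and exact halves round up (the syllabus says only "rounded to the closest integer"). The function names are hypothetical.

```python
import math

# Lower bound of each range from the table above, highest first.
# Letters A, B, C, D, and F are assumed from the usual ordering.
GRADE_FLOORS = [
    (93, "A"),
    (90, "A-"),
    (87, "B+"),
    (83, "B"),
    (80, "B-"),
    (77, "C+"),
    (70, "C"),
    (67, "C-"),
    (60, "D"),
]


def round_half_up(value):
    """Round to the closest integer; ties go up (an assumption, not stated in the syllabus)."""
    return math.floor(value + 0.5)


def letter_grade(weighted_average):
    """Map a weighted average (0-100) to a letter using the table's ranges."""
    score = round_half_up(weighted_average)
    for floor, letter in GRADE_FLOORS:
        if score >= floor:
            return letter
    return "F"  # 59 or less
```

For instance, a weighted average of 92.5 rounds to 93 and maps to A, while 92.4 rounds to 92 and maps to A-. This is only the mechanical part of the calculation; the final grade also reflects the emphasis on demonstrated learning outcomes described above.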

6 Getting Help with Your Project

This is your project.

  • All work you submit is expected to be your own.
  • You should be able to explain all aspects of your deliverables.

However, data science is a team sport and most projects will require you to do something you have not been taught.

It is fine to use other sources to learn as the goal is to get to a successful solution.

This is not like many other courses. I see this course as a team effort where we are all on the same team but supporting different clients. We want every project to be a success.

You will be sharing your work with peers every week.

You are encouraged to collaborate with peers and to seek peer feedback and peer reviews of your project solution and all deliverables.

6.1 Collaborating with the Project Client Sponsor

The expectation is that your course work in the program has provided you a foundation of data science knowledge and skills capable of solving many problems. You should not expect the client sponsor to “teach” you course material you have already seen.

However, many projects may require advanced skills in a particular area or specific domain knowledge. Before asking your client sponsor for assistance you are expected to learn as much as you can via other sources. Once you have done that, feel free to solicit additional guidance from the client sponsor.

Your client sponsor may also want to provide you specific guidance/teaching on their domain or specialized methods or tools and that is fine.

If you have questions you can always ask me!

6.2 Peer Collaboration

This course depends upon peer-to-peer learning. You are encouraged to collaborate with peers on all aspects of your project.

Peer learning means you learn from others about your project but you also learn from them about their project.

Make it a practice to review each other's deliverables (documents and code) for structure, content, and proofreading.

However, do not ask or expect other people to write your code or your documents. It must be your work.

Do not write portions of documents or code for others. It is fine to offer comments, feedback on pull requests, etc., but their final work must be their work.

6.3 Searching for Help Online

You can use online resources as a guide to understanding a question or how to implement your solution, but you must actually implement it yourself. Using these resources is expected. However, DO NOT just copy and paste blocks of code into your work.

Students are encouraged to use generative AI tools, such as ChatGPT, Claude, or meta.ai, to improve understanding of the problem and possible solutions, debug code, and develop code snippets or other contributions.

  • Using these tools responsibly means acknowledging where and how they have been used, for example through appropriate citation. Include the prompts as part of your citation.

Warning

  • Caution: ChatGPT and meta.ai are not perfect and will often “hallucinate” instead of telling you they don’t know the answer.

  • They will provide you an answer that is 100% wrong with 100% confidence and they will repeat it. You must be careful, especially on new or innovative challenges and software versions.

  • Students are responsible for assessing the accuracy and value of the output of any generative AI tools, and are accountable for all work they submit.

  • Students are responsible for recognizing the limitations of these tools, and are accountable for AI-generated work that produces invented data or sources.

  • Improper usage may constitute violations of the University’s Academic Integrity Code.
  • Note that representing the outputs of these models as original work usually also violates the terms of service for the tools.

Students are permitted to use software tools to review and revise the organization, content, and syntax of their writing.

  • This includes the use of spell checkers, grammar checkers, and thesauruses included with applications such as MS Word or Google Docs.
  • This also includes standalone tools such as Grammarly, WordTune, Virtual Writing Tutor, or SlickWrite.
    • If you use a standalone tool, include a reference to its use at the end of the document, but you do not have to cite the individual revisions.