Unit 01 Data Science Project

The final project for this unit will be a small research project. Working with the data science tools we explored in this unit, you will ask and answer a question using data we collected about our digital lives. You will then present your findings to your peers.

To be successful in this project, you will need to find a question that is both interesting to you and answerable (at least in part) with data from the social media dataset. Your teachers will help you make sure your question achieves that.

Starter Code

Starter code for the project is provided in the project-data-science repo. Download it onto your laptop.

$ cd cs9/unit_01
$ git clone https://github.com/the-isf-academy/project-data-science-YOUR-GITHUB-USERNAME.git

Project Proposal (Google Doc)

For this project, your proposal will be the first three sections of the final project (Introduction, Data, Analytic Sub-Question).

Before you start working on your project, you are required to write a project proposal and get it approved by a teacher. You should complete this proposal in 1-2 class periods. You can find your design doc in your Google Drive folder.

Once you have completed the above, meet with a teacher to talk through your proposal. Don't start programming until you get your proposal approved, or you might have to change it.

Project (project.ipynb)

This document is where your code and analysis will be located. To complete your project, you should fill out the project.ipynb Jupyter notebook with each of the following sections:

Research Communication (communication.jpeg)

In addition to performing data analysis to answer a question, you will share your results and discussion with the wider community in the form of an Instagram post, a flyer, a YouTube video, or some other broadcast medium.

To do this, you will need to consider:


Example Project

Here is an example project proposal and an example repository containing all of the elements of a complete project (minus the research communication).





You are responsible for assessing your own project, though your teachers will let you know if they disagree. In assessment.md, you are required to explain how your project should be scored, and to give evidence to support your assessment. The rubric is based on claims that you should be able to make about your learning in this unit. Each of these claims comes with examples of evidence that would show that you can make the claim about your learning.

As a reminder, here's a guide for using the rubric:


Each project will be assessed with a rubric tailored to the skills and concepts the project targets. This project is focused on developing the skills learned throughout the data science unit.

Criterion A: Knowing, understanding, and computational thinking

Students appropriately apply computer science concepts and tools in context. On top of computer science concepts and tools, students apply computational thinking practices including habits such as writing pseudocode, developing iteratively, using abstractions, decomposing problems, and debugging.

In this unit, we covered some of the core ideas of computational thinking (like inputs/outputs of functions) and some fundamental tools of computer science (like data types, variables, and data structures).

| Learning Claim | Possible Forms of Evidence |
|---|---|
| I can analyze and visualize data to answer questions I have about what the data means. | |
| I can use appropriate data structures to capture the form of my data and to serve my analytical purpose. | |
| I can use functions to abstract and decompose computational processes. | |
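To make these claims more concrete, here is a minimal sketch of how a data structure and a few small functions can work together to answer a question. Everything in it is an assumption made for illustration: the rows, the column names (`platform`, `hours_per_day`), and the function names are hypothetical and are not taken from the starter code or the real social media dataset.

```python
# Hypothetical sample rows. The real dataset may have different columns.
# A list of dictionaries captures the "one row per response" form of the data.
responses = [
    {"grade": 9, "platform": "Instagram", "hours_per_day": 2.5},
    {"grade": 9, "platform": "YouTube", "hours_per_day": 1.0},
    {"grade": 10, "platform": "Instagram", "hours_per_day": 3.5},
    {"grade": 10, "platform": "YouTube", "hours_per_day": 2.0},
]


def hours_for_platform(rows, platform):
    """Input: a list of row dictionaries and a platform name.
    Output: a list of the hours reported for that platform."""
    return [row["hours_per_day"] for row in rows if row["platform"] == platform]


def mean(values):
    """Input: a list of numbers. Output: their average, or None if the list is empty."""
    if not values:
        return None
    return sum(values) / len(values)


def average_hours_by_platform(rows):
    """Input: a list of row dictionaries.
    Output: a dictionary mapping each platform to its average hours per day."""
    platforms = sorted({row["platform"] for row in rows})
    return {p: mean(hours_for_platform(rows, p)) for p in platforms}


summary = average_hours_by_platform(responses)
print(summary)
```

Each function has one clear input and one clear output, so the larger question ("which platform takes up more time?") is decomposed into steps you can test one at a time.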

Criterion B: Planning and development

Students create personally meaningful projects through an iterative design cycle. Students’ work is grounded in a development plan which students create before beginning the project. Students document the development of their projects in order to create a record of decisions, assumptions, and lingering flaws. Students define the intended functionality and develop towards evaluation.

From idea to communication, data science requires significant forethought: What data will you use? How will you answer your question? Who will care about the answer?

Learning Claim and Possible Forms of Evidence
I can thoughtfully plan a large computer science project.
  • A thorough design document.
  • Updates to your project plan to account for challenges during development.
I can document the development of my project using version control tools such as GitHub.
  • At least 5 regular and descriptive git commits on your project.
  • A description of a GitHub issue / bug you encountered, as well as an explanation of how you fixed it (you may include links that you used, such as StackOverflow).
  • Comments for each of the modules and functions in your project.

Criterion C: Evaluation

Students produce evidence of a testing plan that evaluates the main areas of functionality of the product. Students also reflect on the development process and propose further development to address the shortcomings of the current product.

Data science research requires us to evaluate our work at every stage, from data selection to data analysis to data visualization, to make sure our work is accurate and ethical.

| Learning Claim | Possible Forms of Evidence |
|---|---|
| I can select and clean data to serve as a source for my analysis. | |
| I evaluate the analytical choices I make by producing evidence to show that my choices are sound. | |
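One way to produce evidence for both claims is to clean your data with a function that also records what it removed and why. The sketch below is only an illustration under stated assumptions: the rows, the `id` and `hours_per_day` columns, and the 0 to 24 hour range check are all hypothetical, and your own dataset may need different rules.

```python
# Hypothetical raw survey rows, with answers stored as text.
raw_rows = [
    {"id": 1, "hours_per_day": "2.5"},
    {"id": 2, "hours_per_day": ""},       # missing answer
    {"id": 3, "hours_per_day": "three"},  # not a number
    {"id": 4, "hours_per_day": "30"},     # impossible: a day has 24 hours
    {"id": 5, "hours_per_day": " 1 "},    # extra spaces, but still a valid number
]


def clean_hours(rows):
    """Input: a list of raw row dictionaries.
    Output: (kept, dropped), where kept holds rows with numeric hours
    and dropped holds (id, reason) pairs for every row that was removed."""
    kept, dropped = [], []
    for row in rows:
        try:
            hours = float(row["hours_per_day"])
        except ValueError:
            dropped.append((row["id"], "not a number"))
            continue
        # Assumed rule: hours per day must fall between 0 and 24.
        if not 0 <= hours <= 24:
            dropped.append((row["id"], "out of range"))
            continue
        kept.append({"id": row["id"], "hours_per_day": hours})
    return kept, dropped


kept, dropped = clean_hours(raw_rows)
print(f"Kept {len(kept)} of {len(raw_rows)} rows")
for row_id, reason in dropped:
    print(f"Dropped row {row_id}: {reason}")
```

Keeping the list of dropped rows gives you something to point to when you explain your cleaning choices, instead of silently discarding data.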

Criterion D: Reflection on Tech and Society

Students demonstrate an understanding of their responsibility to society as technology creators by evaluating the implications of their work. Students investigate the applications of their work to specific problems or issues.

Knowledge can change the world. As such, the tools of data science come with great responsibility. Data is one way to investigate truth, but only if we also investigate our biases.

| Learning Claim | Possible Forms of Evidence |
|---|---|
| I can perform research with an audience in mind. | |
| I understand that data analysis can impact understanding and decision-making in the world. | |
| I can identify limitations and potential improvements of my research. | |