Overview
Review the document template and fill in all the sections.
You will deliver the following:
- A document that establishes the “context and scope” of what you are answering. State your “research topic” as 3 bulleted questions.
- Submit to GitHub a link to your Google document or your Markdown content in the document folder.
- Submit to GitHub your folder of raw_data (or links to a Google Folder if over 1GB).
- Target Challenge Goals
- As a guide, your document should be between 1500 and 2500 words (3-5 pages), excluding your datasets. You will be marked down for exceeding the upper limit.
You do not need to have any code written yet, but you may want to use some code to help you learn about the data. For example, you may want to print out the columns or get some statistical information about the data using code.
You can use Excel or other tools to view the data.
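As an illustration of the kind of exploratory code mentioned above, here is a minimal sketch that lists a CSV file's columns and computes basic statistics for the numeric ones. It uses only Python's standard library. The function name `describe_csv`, the sample column names, and the sample values are made up for this example; they are not part of the assignment.

```python
import csv
import io
import statistics


def describe_csv(handle):
    """Return (column_names, stats) for an open CSV file.

    stats maps each column whose non-blank cells are all numeric
    to its count, min, max, and mean. Blank cells are skipped.
    """
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    values = {name: [] for name in columns}
    numeric = {name: True for name in columns}
    for row in reader:
        for name in columns:
            cell = (row[name] or "").strip()
            if not cell:
                continue  # blank cell: ignore it for the statistics
            try:
                values[name].append(float(cell))
            except ValueError:
                numeric[name] = False  # at least one non-numeric value
    stats = {}
    for name in columns:
        if numeric[name] and values[name]:
            nums = values[name]
            stats[name] = {
                "count": len(nums),
                "min": min(nums),
                "max": max(nums),
                "mean": statistics.fmean(nums),
            }
    return columns, stats


# Tiny made-up sample standing in for a real dataset.
sample = io.StringIO(
    "city,year,rainfall_mm\n"
    "Seattle,2020,950.5\n"
    "Tacoma,2021,\n"
    "Spokane,2021,420.0\n"
)
columns, stats = describe_csv(sample)
print("Columns:", columns)
for name, summary in stats.items():
    print(name, summary)
```

To run this against your own data, replace the sample with a real file handle, for example `open("raw_data/my_dataset.csv", newline="")`, where the path is a placeholder for wherever your file lives.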
Things NOT necessary for this deliverable:
- web scraping
- data clean up
- unit testing
- plots
- report
- presentation
Document Purpose
Students often dive into a research project without fully understanding what data is available and what challenges lie ahead. This deliverable is to ensure that the student has the necessary data and understands the pending challenges.
The Discovery Document’s purpose is to:
- Illustrate that you have found appropriate data. The data must be:
  - Large enough (500 lines or more)
  - Available for download (CSV file format)
  - Sufficient, containing the information necessary to successfully conduct your research
- Present a few questions (about the data)
  - This is the focal point of your research
- Express why your questions are of interest (motivation)
- Illustrate that you understand the data:
  - Know how the data was sourced
  - Know how the data may be limited (reliability, accuracy, completeness, messiness)
  - Identify and explain relevant columns: names, format, units, ranges, cleanliness
  - Describe issues or challenges in working with the data (e.g. too big, non-standard key formatting that makes cross-referencing difficult, missing information, too broad or too narrow)
- Establish Challenge Goals:
  - While this may change, it is important to consider what challenges you intend to take on.
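Several of the data requirements above can be checked with a few lines of code. The following is a rough sketch, not a required method: it counts data rows against the 500-line threshold, counts blank cells per column as a simple completeness measure, and shows one possible way to normalize inconsistently formatted keys before cross-referencing. The names `audit_csv` and `normalize_key` and the sample data are hypothetical, and the sketch assumes the 500-line minimum refers to data rows, excluding the header.

```python
import csv
import io

MIN_ROWS = 500  # size threshold from the list above (assumed to exclude the header)


def audit_csv(handle, min_rows=MIN_ROWS):
    """Count data rows and blank cells per column in an open CSV file."""
    reader = csv.DictReader(handle)
    columns = reader.fieldnames or []
    missing = {name: 0 for name in columns}
    row_count = 0
    for row in reader:
        row_count += 1
        for name in columns:
            if not (row[name] or "").strip():
                missing[name] += 1
    return {
        "rows": row_count,
        "large_enough": row_count >= min_rows,
        "missing": missing,
    }


def normalize_key(raw):
    """Make join keys comparable: trim, collapse whitespace, and lowercase."""
    return " ".join(raw.split()).lower()


# Made-up data: 600 rows, with a blank population in every 100th row.
lines = ["county,population"]
for i in range(600):
    population = "" if i % 100 == 0 else str(1000 + i)
    lines.append(f"County {i},{population}")

report = audit_csv(io.StringIO("\n".join(lines)))
print(report["rows"], report["large_enough"], report["missing"])
print(normalize_key("  King   County "))
```

Numbers like these (row count, blank cells per column) are the kind of evidence you can cite when describing how the data may be limited.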
Grading
Grading for this document will follow the rubric guide.