Datasets Summary:

All the data can be found on this Fake link to Google Folder.

The table below shows that we are using only three datasets.

| DataSet | Source | Size | Notes |
| --- | --- | --- | --- |
| Report_Card_Graduation_2018-19.csv | Your link must be a deep link that goes to the data like this: catalog.data.gov | 81,267 | Graduation information for Washington state. |
| teachers_2014.csv | data.gov | 48x10 | Contains full-time teacher pay and benefits by school district |
| geo_wa_counties.json | Natural Earth | NA | Contains geometry data for the counties in Washington state |

Graduation_2018.csv

This dataset contains graduation rates of high school students in the year 2018 only. The rates are by race and school district.

| Column | Description |
| --- | --- |
| DistrictName | string: The name of the school district |
| County | string: A list of county names that the school district is in. A district may span multiple counties |
| StudentGroup | string: The race of the students in this row. Races included are [White, Hispanic/ Latino of any race(s), Black/ African American, Asian...] |
| GraduationRate | double: The percent of students of this race that graduated high school in four years. |

Teachers_2014.csv

This dataset contains salary & benefits information for full-time teachers by school district in the year 2014.

| Column | Description |
| --- | --- |
| DNUM | integer: The number for the school district. For example, Northshore is 417. |
| PERV | integer: The number of personal vacation days that a teacher gets per year. |
| BASE | double: The Base salary of a full-time teacher. |
| HRPAY | double: The additional pay given to a teacher beyond their base salary for simply being a teacher. |
| SPST | double: The average additional pay (stipend) given to a teacher for coaching a sport. |
| APST | double: The additional pay (stipend) given to an AP Teacher. |

Data Challenges

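The datasets come from different years (teacher pay from 2014, graduation rates from 2018) because we could not get accurate data for both from the same year. Any correlation across the two therefore compares pay and graduation rates that are four years apart, so it does not represent a true same-year relationship. We need to highlight this limitation wherever the datasets are combined.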

While the teacher pay dataset is extensive, no single column gives a simple summary of how much an “average” teacher makes, because we don’t know how many teachers receive each type of stipend.
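One way to work around the missing stipend counts is to report a pay range per district instead of a single average. The sketch below is only an illustration: it assumes every full-time teacher receives HRPAY (the column description suggests this but does not confirm it), and the `TeacherPay` class, the `pay_bounds` function, and all dollar amounts are hypothetical placeholders, not values from the dataset.

```python
from dataclasses import dataclass


@dataclass
class TeacherPay:
    """One row of the teacher dataset, named after its columns."""
    dnum: int     # DNUM
    base: float   # BASE
    hrpay: float  # HRPAY
    spst: float   # SPST
    apst: float   # APST


def pay_bounds(row):
    """Return (low, high) total pay for one district.

    low:  base salary plus HRPAY (assumed to go to every teacher).
    high: low plus both stipends, i.e. a teacher who coaches a sport
          and also teaches AP.
    """
    low = row.base + row.hrpay
    high = low + row.spst + row.apst
    return low, high


# Placeholder numbers for illustration only; not taken from the dataset.
example = TeacherPay(dnum=417, base=50000.0, hrpay=2000.0, spst=3000.0, apst=1000.0)
low, high = pay_bounds(example)
```

The true district average lies somewhere between the two bounds, and narrowing it further would need the number of teachers receiving each stipend.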

It would be valuable to track how graduation rates change over time in relation to changes in salary. I will do some extra work to find more datasets that allow graphing over time.

The school districts don’t map easily across datasets. The teacher dataset identifies a district by number (DNUM), while the graduation dataset uses the district name as a string (DistrictName). I may need to manually create a mapping dataset that allows me to join the two together.
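A minimal sketch of how such a hand-built mapping could be used for the join, with only the Python standard library. The mapping file layout, the helper names `read_rows` and `join_on_district`, and every value except the Northshore/417 pair are assumptions made for illustration. The real files would be read from disk rather than from inline strings.

```python
import csv
import io

# Hypothetical hand-built mapping: district number -> district name.
# Only Northshore/417 comes from the dataset notes; 999 is a placeholder.
MAPPING_CSV = """DNUM,DistrictName
417,Northshore
999,Example District
"""

# Tiny stand-ins for the two real files; all numbers are placeholders.
TEACHERS_CSV = """DNUM,BASE
417,50000
999,40000
"""

GRADUATION_CSV = """DistrictName,StudentGroup,GraduationRate
Northshore,White,90.0
Example District,White,80.0
Unmapped District,White,70.0
"""


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_district(teachers, graduation, mapping):
    """Attach BASE pay to each graduation row via the name -> DNUM mapping."""
    # Normalize names so case and stray whitespace do not break the match.
    name_to_dnum = {
        row["DistrictName"].strip().lower(): int(row["DNUM"]) for row in mapping
    }
    pay_by_dnum = {int(row["DNUM"]): float(row["BASE"]) for row in teachers}

    joined, unmatched = [], []
    for row in graduation:
        dnum = name_to_dnum.get(row["DistrictName"].strip().lower())
        if dnum is None or dnum not in pay_by_dnum:
            unmatched.append(row["DistrictName"])  # keep for manual review
            continue
        joined.append({**row, "DNUM": dnum, "BASE": pay_by_dnum[dnum]})
    return joined, unmatched


joined, unmatched = join_on_district(
    read_rows(TEACHERS_CSV), read_rows(GRADUATION_CSV), read_rows(MAPPING_CSV)
)
```

Returning the unmatched names separately makes it easy to see which districts still need an entry in the mapping file, instead of having them silently dropped from the join.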

It would be good to geospatially plot graduation rates, but the geometry data that I’ve found so far is only by county while the school districts can span many counties. I may have to manually pick, or randomly guess, which county a school district mostly represents. Or, perhaps I can locate geometry for the school districts themselves.
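As an alternative to picking a single county per district, a district's rate could be credited to every county it lists and then averaged per county. The sketch below assumes the County field is a comma-separated string, which the dataset description does not confirm. The function names and all rows are hypothetical placeholders. The average is unweighted because the columns described above include no enrollment figure.

```python
from collections import defaultdict


def split_counties(county_field):
    """Split the County field into names, assuming a comma-separated string."""
    return [name.strip() for name in county_field.split(",") if name.strip()]


def county_average_rates(rows):
    """Average GraduationRate per county.

    A district that spans several counties is counted once in each of them.
    """
    rates = defaultdict(list)
    for row in rows:
        for county in split_counties(row["County"]):
            rates[county].append(float(row["GraduationRate"]))
    return {county: sum(values) / len(values) for county, values in rates.items()}


# Placeholder rows; the district names "A" and "B" and all rates are made up.
rows = [
    {"DistrictName": "A", "County": "King, Snohomish", "GraduationRate": "90.0"},
    {"DistrictName": "B", "County": "King", "GraduationRate": "80.0"},
]
by_county = county_average_rates(rows)
```

Because the graduation file has one row per StudentGroup, the rows would need to be filtered to a single group before averaging, otherwise rates for different groups would be mixed together.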