In response to the 2008 U.S. Farm Bill, the U.S. Department of Agriculture’s Economic Research Service compiled a June 2009 report to Congress:
According to data from the latest census (2000), about 23.5 million people, or 8.4 percent of the U.S. population, live in low-income neighborhoods that are more than a mile from a supermarket. Low-income neighborhoods are areas where more than 40 percent of the population has income less than or equal to 200 percent of the Federal poverty threshold ($44,000 per year for a family of four in 2008).
In this assessment, we’ll simulate their analysis by creating geospatial maps to help us understand food access in Washington. There are three geographic region types that we’ll use in this assessment:
- Census tract is a geographic region used in the U.S. census. It is the smallest of the three region types.
- County is a geographic region used for administrative purposes that can include one or more census tracts.
- State is a geographic region such as the State of Washington. It is the largest of the three region types.
A census tract is defined as low access if enough people in the tract do not have nearby access to grocery stores offering affordable and nutritious food. In urban areas, “low access” is defined as 0.5 miles; in rural areas, “low access” is defined as 10 miles.
tl_2010_53_tract00.shp
contains the 2010 US census dataset in geospatial shapefile format only for Washington state (53). The only columns you need to use are CTIDFP00
, the census tract identifier, and geometry
, the geometric shape of the tract.
food_access.csv
contains the food access dataset in tabular CSV format. Each row in the dataset corresponds to a census tract for every state in the country (not just Washington). This dataset has many columns but you only need to understand the following:
CensusTract
is the census tract identifier.State
is the state name for the census tract.County
is the county name for the census tract.Urban
is a flag (0 or 1) that indicates if this census tract is an urban environment.Rural
is a flag that indicates if this census tract is a rural environment.LATracts_half
is a flag that indicates if this census tract is “low access” in a half mile radius.LATracts10
is a flag that indicates if this census tract is “low access” in a 10 mile radius.LowIncomeTracts
is a flag that indicates if this census tract is “low income”.POP2010
is the number of people in this census tract according to the 2010 census.lapophalf
is the number of people in this census tract considered having “low access” in a half mile radius.lapop10
is the number of people in this census tract considered having “low access” in a 10 mile radius.lalowihalf
is similar tolapophalf
but only counts people considered low access and low income.lalowi10
is similar tolapop10
but only counts people considered low access and low income.
!pip install -q folium mapclassify
import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd
# Testing utilities that should not be used for implementation
from matplotlib.axes import Axes
from matplotlib.collections import PatchCollection
from folium.features import GeoJson
from folium.folium import Map
from mapping_utils import assert_patches_allclose, rural_idx, rural_la_idx, urban_ha_idx, lalowi_idx
all_counties = gpd.read_file("counties.geojson")
Outside Sources¶
Update the following Markdown cell to include your name and list your outside sources. Submitted work should be consistent with the curriculum and your sources.
Name: YOUR_NAME_HERE
- Enter your outside sources as a list here, or remove this line if you did not consult any outside sources at all.
Task: Load in data¶
Write a function load_data
that takes path for census dataset and the path for the food access dataset and returns the GeoDataFrame
resulting from merging the two datasets on the census tract identifiers CTIDFP00
/ CensusTract
. Assume the census tract identifier columns exist: use only these two column names. Not all census tracts have food access data, but we want to make sure to include all the census tracts data. in the resulting GeoDataFrame
.
def load_data(shp_path, csv_path):
"""
...
"""
...
state_data = load_data("tl_2010_53_tract00.shp", "food_access.csv")
display(state_data)
assert type(state_data) == gpd.GeoDataFrame, "this function should return a GeoDataFrame"
assert list(state_data.columns) == [
"STATEFP00", "COUNTYFP00", "TRACTCE00", "CTIDFP00", "NAME00", "NAMELSAD00", "MTFCC00",
"FUNCSTAT00", "ALAND00", "AWATER00", "INTPTLAT00", "INTPTLON00", "geometry", "CensusTract",
"State", "County", "Urban", "Rural", "LATracts_half", "LATracts10", "GroupQuartersFlag",
"OHU2010", "NUMGQTRS", "PCTGQTRS", "LowIncomeTracts", "POP2010", "lapophalf", "lalowihalf",
"lapop10", "lalowi10",
], "the returned data frame should include columns from both datasets"
assert state_data["CensusTract"].dtype != "object", "gpd.read_file only accepts geospatial data"
assert len(state_data) == 1318, "incorrect number of rows loaded; the two datasets are"\
+ " not merged using the correct join type"
Task: Plot census tracts¶
Write a function plot_census_map
that takes the merged data and returns the Axes
that contains shapes of all the census tracts in Washington. Title the plot “Washington Census Tracts” and turn off axis labels.
def plot_census_map(state_data):
"""
...
"""
...
ax = plot_census_map(state_data)
assert type(ax) == Axes, "this function should return an Axes object"
layers = ax.findobj(PatchCollection)
assert_patches_allclose(layers[0], geoms=state_data.geometry, num_colors=1, layer=0)
assert len(layers) == 1, f"expected to have 1 layer but got {len(layers)} layers"
assert ax.get_title() == "Washington Census Tracts", "title does not match expected"
assert not ax.axison, "borders and labels must be hidden"
Preparing the state background¶
When given no arguments, the dissolve
method considers the entire GeoDataFrame
as a single group. This will be useful for plotting backgrounds in later tasks. We precompute this because dissolve
is an expensive method to run (in terms of runtime).
entire_state = state_data[["geometry"]].dissolve()
display(entire_state)
ax = entire_state.plot(color="#EEE")
ax.set_axis_off()
Task: Plot census tract populations¶
Write a function plot_census_population_map
that takes the merged data and the entire_state
as state background and return the Axes
that plots all the census tracts in Washington where each census tract is colored according to the POP2010
column. There will be some missing census tracts. Underneath, plot the entire state of Washington in the background color #EEE
. Title the plot “Washington Census Tract Populations”, turn off axis labels, include a legend, and increase the figure size so that the map is the same height as the legend (hint: figure size should be roughly 2 by 1 in terms of width by height).
def plot_census_population_map(state_data, state_background):
"""
...
"""
...
ax = plot_census_population_map(state_data, entire_state)
assert type(ax) == Axes, "this function should return an Axes object"
layers = ax.findobj(PatchCollection)
assert len(layers) == 2, "expected to have 2 layers (one background and one foreground)"\
+f" but got {len(layers)} layer(s)"
assert_patches_allclose(layers[0], geoms=entire_state.geometry, color="#EEE", layer=0)
assert_patches_allclose(layers[1], geoms=state_data.dropna().geometry, num_colors=183, layer=1)
assert ax.get_title() == "Washington Census Tract Populations", "title does not match expected"
assert not ax.axison, "borders and labels must be hidden"
cbar = ax.get_figure().get_axes()[-1]
assert cbar.get_label() == "<colorbar>", "missing legend"
assert ax.bbox.height == cbar.bbox.height, "map can be enlarged"
Task: Plot county populations¶
Write a function plot_county_populations_map
that takes the merged data and the entire_state
as state background and returns the Axes
that plots all the counties in Washington where each county is colored according to the POP2010
column. This will require combining all the census tract data and geometries for each county, though there will be missing data for some counties. Underneath, plot the entire state of Washington in the background color #EEE
. Title the plot “Washington County Populations”, turn off axis labels, include a legend, and increase the figure size so that the map is the same height as the legend (hint: figure size should be roughly 2 by 1 in terms of width by height).
def plot_county_population_map(state_data, state_background):
"""
...
"""
...
ax = plot_county_population_map(state_data, entire_state)
assert type(ax) == Axes, "this function should return an Axes object"
layers = ax.findobj(PatchCollection)
assert len(layers) == 2, "expected to have 2 layers (one background and one foreground)"\
+f" but got {len(layers)} layer(s)"
assert_patches_allclose(layers[0], geoms=entire_state.geometry, color="#EEE", layer=0)
assert_patches_allclose(layers[1], geoms=all_counties.geometry, num_colors=20, layer=1)
assert ax.get_title() == "Washington County Populations", "title does not match expected"
assert not ax.axison, "borders and labels must be hidden"
cbar = ax.get_figure().get_axes()[-1]
assert cbar.get_label() == "<colorbar>", "missing legend"
assert ax.bbox.height == cbar.bbox.height, "map can be enlarged"
Task: Plot food access by county¶
Write a function plot_food_access_by_county_map
that takes the merged data and the entire_state
as state background and returns a 4-tuple of Axes
that represent the subplots in a 2-by-2 figure consisting of 4 choropleth maps:
- Top left plot titled “Low Access: Half Mile” showing the proportion of people per county who have low access to food within a half mile
lapophalf
out of the total populationPOP2010
. - Top right plot titled “Low Access + Low Income: Half Mile” showing the proportion of people per county considered low income who also have low access to food within a half mile
lalowihalf
out of the total populationPOP2010
. - Bottom left plot titled “Low Access: 10 Miles” showing the proportion of people per county who have low access to food within 10 miles
lapop10
out of the total populationPOP2010
. - Bottom right plot titled “Low Access + Low Income: 10 Miles” showing the proportion of people per county considered low income who also have low access to food within 10 miles
lalowi10
out of the total populationPOP2010
.
When calling plot
, specify the keyword arguments vmin=0
and vmax=1
so that the subplots all share the same scale. Underneath, plot the entire state of Washington in the background color #EEE
. We recommend preparing subplots with figsize=(15, 10)
. Turn off axis labels on each subplot.
def plot_food_access_by_county_map(state_data, state_background):
"""
...
"""
...
axs = plot_food_access_by_county_map(state_data, entire_state)
assert axs is not None, "this function should return a 4-tuple of Axes objects"
assert len(axs) == 4, "the returned tuple's length should be 4"
expected_titles = ["Low Access: Half Mile", "Low Access + Low Income: Half Mile",
"Low Access: 10 Miles", "Low Access + Low Income: 10 Miles"]
for ax, expected_num_colors, expected_title in zip(axs, [31, 23, 19, 16], expected_titles):
assert type(ax) == Axes, "each value in the returned tuple should be an Axes object"
layers = ax.findobj(PatchCollection)
assert len(layers) == 2, "expected to have 2 layers (one background and one foreground)"\
+f" but got {len(layers)} layer(s)"
assert_patches_allclose(layers[0], geoms=entire_state.geometry, color="#EEE", layer=0)
assert_patches_allclose(layers[1], geoms=all_counties.geometry, num_colors=expected_num_colors, layer=1)
assert ax.get_title() == expected_title, f"title {ax.get_title()} does not match expected"
assert ax.get_legend() is None, "plot should not have a legend"
assert not ax.axison, "borders and labels must be hidden"
Writeup: Food access by county¶
Setting aside the lack of a legend in plot_food_access_by_county_map
, is it an effective visualization? Using the data visualization principles we learned in class, explain why or why not.
TODO: Replace this text with your answer.
Task: Plot low access census tracts¶
Write a function plot_census_low_access_map
that takes the merged data and the entire_state
as state background and returns the Axes
that plots all census tracts (not counties) considered low access using 5 plot
layers for the following definitions for “low access” in urban and rural tracts. For this task, do not use the LATracts_half
or LATracts10
columns in the merged data; the procedure described below intentionally results in different values. Your 5 plot layers should be in the following order:
Plot the map of Washington in the background with color
#EEE
.Plot all the
Urban
census tracts for which we have food access data in the color#AAA
.Plot all the
Rural
census tracts for which we have food access data in the color#AAA
.Plot all the
Urban
census tracts considered low access in the default (blue) color.Low access in an urban census tract is defined by a
lapophalf
value that is at least 500 people or at least 33% of the census tract population.Plot all the
Rural
census tracts considered low access in the default (blue) color.Low access in a rural census tract is defined by a
lapop10
value that is at least 500 people or at least 33% of the census tract population.
Finally, title the plot “Low Access Census Tracts” and turn off axis labels.
def plot_census_low_access_map(state_data, state_background):
"""
...
"""
...
ax = plot_census_low_access_map(state_data, entire_state)
assert type(ax) == Axes, "this function should return an Axes object"
layers = ax.findobj(PatchCollection)
assert len(layers) == 5, "expected to have 5 layers (one background and 4 foregrounds)"\
+f" but got {len(layers)} layer(s)"
assert_patches_allclose(layers[0], geoms=entire_state.geometry, color="#EEE", layer=0)
urban_idx = state_data["Urban"].notna() & ~state_data.index.isin(rural_idx)
urban_la_idx = urban_idx & ~state_data.index.isin(urban_ha_idx)
try:
assert_patches_allclose(layers[1], geoms=state_data.loc[urban_idx, "geometry"], color="#AAA", layer=1)
assert_patches_allclose(layers[2], geoms=state_data.loc[rural_idx, "geometry"], color="#AAA", layer=2)
assert_patches_allclose(layers[3], geoms=state_data.loc[urban_la_idx, "geometry"], layer=3)
assert_patches_allclose(layers[4], geoms=state_data.loc[rural_la_idx, "geometry"], layer=4)
## Debugging tip: if you want to check one layer, comment out the four lines above
## and use each of the following lines ONE AT A TIME
# assert_patches_allclose(layers[1], geoms=state_data.loc[urban_idx, "geometry"], color="#AAA", layer=1)
# assert_patches_allclose(layers[1], geoms=state_data.loc[rural_idx, "geometry"], color="#AAA", layer=1)
# assert_patches_allclose(layers[1], geoms=state_data.loc[urban_la_idx, "geometry"], layer=1)
# assert_patches_allclose(layers[1], geoms=state_data.loc[rural_la_idx, "geometry"], layer=1)
except AssertionError as e:
# this message was to remind you to plot in the correct order; you can ignore if you just kept one layer
print("Please make sure the order of your plot is correct, and also check any other errors below.")
raise e
assert ax.get_title() == "Low Access Census Tracts", "title does not match expected"
assert not ax.axison, "borders and labels must be hidden"
Writeup: Data-driven decision-making¶
What is one way that government or food access organizations could use plot_food_access_by_county_map
or plot_census_low_access_map
to shape their decisions or policies? Then, explain one limitation or concern with using the plots in the way you suggested.
TODO: Replace this text with your answer.
Task: Interactive map¶
Although the initial report to Congress was completed in June 2009, the Economic Research Service has since then maintained an interactive map for their Food Access Research Atlas. Open this interactive map, turn off the default layer “LI and LA and 1 and 10 miles”, and turn on the layer “LI and LA at 1/2 and 10 miles”. This layer displays:
Low-income census tracts where a significant number or share of residents is more than 1/2 mile (urban) or 10 miles (rural) from the nearest supermarket.
Write a function interactive_map
that returns a interactive Map
of low income and low access census tracts in Washington. Include only LowIncomeTracts
that are either low access at half a mile LATracts_half
for Urban
census tracts or low access at 10 miles LATracts10
for Rural
census tracts. This dataset does not match the Food Access Research Atlas, so some differences are to be expected. Your interactive map won’t appear in the PDF printout as PDF files cannot embed interactive maps so don’t worry about it not showing up in your PDF export.
def interactive_map(state_data):
"""
...
"""
...
map = interactive_map(state_data)
assert type(map) == Map, "this function should return an interactive Map object"
display(map)
last_child = next(reversed(map._children.values()))
assert type(last_child) == GeoJson, "last child should be GeoJson; do not specify column"
geojson = last_child.data
assert set(int(d["id"]) for d in geojson["features"]) == set(lalowi_idx), "wrong selection"
The following cell plots a preview of your interactive map for the PDF printout.
gpd.GeoDataFrame.from_features(geojson, crs="EPSG:4326").plot(ax=entire_state.plot(color="#EEE")).set_axis_off()
Writeup: Build a new supermarket¶
Using the interactive map above, locate the low-income low-access census tract closest to your favorite place in Washington. Then, identify a location (a specific street intersection, such as “University Way NE & NE 45th St”) to add a new supermarket that would serve the people living in that census tract. Finally, explain the considerations that factored into your choice of location.
This dataset is outdated, so assume there are no supermarkets in any low-income low-access census tract even if supermarkets are present today. An effective writeup will demonstrate expertise with the data setting.
TODO: Replace this text with your answer.