Skip to article frontmatterSkip to article content

In response to the 2008 U.S. Farm Bill, the U.S. Department of Agriculture’s Economic Research Service compiled a June 2009 report to Congress:

According to data from the latest census (2000), about 23.5 million people, or 8.4 percent of the U.S. population, live in low-income neighborhoods that are more than a mile from a supermarket. Low-income neighborhoods are areas where more than 40 percent of the population has income less than or equal to 200 percent of the Federal poverty threshold ($44,000 per year for a family of four in 2008).

In this assessment, we’ll simulate their analysis by creating geospatial maps to help us understand food access in Washington. There are three geographic region types that we’ll use in this assessment:

  • Census tract is a geographic region used in the U.S. census. It is the smallest of the three region types.
  • County is a geographic region used for administrative purposes that can include one or more census tracts.
  • State is a geographic region such as the State of Washington. It is the largest of the three region types.

A census tract is defined as low access if enough people in the tract do not have nearby access to grocery stores offering affordable and nutritious food. In urban areas, “low access” is defined as 0.5 miles; in rural areas, “low access” is defined as 10 miles.

tl_2010_53_tract00.shp contains the 2010 US census dataset in geospatial shapefile format only for Washington state (53). The only columns you need to use are CTIDFP00, the census tract identifier, and geometry, the geometric shape of the tract.

food_access.csv contains the food access dataset in tabular CSV format. Each row in the dataset corresponds to a census tract for every state in the country (not just Washington). This dataset has many columns but you only need to understand the following:

  • CensusTract is the census tract identifier.
  • State is the state name for the census tract.
  • County is the county name for the census tract.
  • Urban is a flag (0 or 1) that indicates if this census tract is an urban environment.
  • Rural is a flag that indicates if this census tract is a rural environment.
  • LATracts_half is a flag that indicates if this census tract is “low access” in a half mile radius.
  • LATracts10 is a flag that indicates if this census tract is “low access” in a 10 mile radius.
  • LowIncomeTracts is a flag that indicates if this census tract is “low income”.
  • POP2010 is the number of people in this census tract according to the 2010 census.
  • lapophalf is the number of people in this census tract considered having “low access” in a half mile radius.
  • lapop10 is the number of people in this census tract considered having “low access” in a 10 mile radius.
  • lalowihalf is similar to lapophalf but only counts people considered low access and low income.
  • lalowi10 is similar to lapop10 but only counts people considered low access and low income.
!pip install -q folium mapclassify

import geopandas as gpd
import matplotlib.pyplot as plt
import pandas as pd

# Testing utilities that should not be used for implementation
from matplotlib.axes import Axes
from matplotlib.collections import PatchCollection
from folium.features import GeoJson
from folium.folium import Map
from mapping_utils import assert_patches_allclose, rural_idx, rural_la_idx, urban_ha_idx, lalowi_idx
all_counties = gpd.read_file("counties.geojson")

Outside Sources

Update the following Markdown cell to include your name and list your outside sources. Submitted work should be consistent with the curriculum and your sources.

Name: YOUR_NAME_HERE

  1. Enter your outside sources as a list here, or remove this line if you did not consult any outside sources at all.

Task: Load in data

Write a function load_data that takes path for census dataset and the path for the food access dataset and returns the GeoDataFrame resulting from merging the two datasets on the census tract identifiers CTIDFP00 / CensusTract. Assume the census tract identifier columns exist: use only these two column names. Not all census tracts have food access data, but we want to make sure to include all the census tracts data. in the resulting GeoDataFrame.

def load_data(shp_path, csv_path):
    """
    ...
    """
    ...


state_data = load_data("tl_2010_53_tract00.shp", "food_access.csv")
display(state_data)
assert type(state_data) == gpd.GeoDataFrame, "this function should return a GeoDataFrame"
assert list(state_data.columns) == [
    "STATEFP00", "COUNTYFP00", "TRACTCE00", "CTIDFP00", "NAME00", "NAMELSAD00", "MTFCC00",
    "FUNCSTAT00", "ALAND00", "AWATER00", "INTPTLAT00", "INTPTLON00", "geometry", "CensusTract",
    "State", "County", "Urban", "Rural", "LATracts_half", "LATracts10", "GroupQuartersFlag",
    "OHU2010", "NUMGQTRS", "PCTGQTRS", "LowIncomeTracts", "POP2010", "lapophalf", "lalowihalf",
    "lapop10", "lalowi10",
], "the returned data frame should include columns from both datasets"
assert state_data["CensusTract"].dtype != "object", "gpd.read_file only accepts geospatial data"
assert len(state_data) == 1318, "incorrect number of rows loaded; the two datasets are"\
                                + " not merged using the correct join type"

Task: Plot census tracts

Write a function plot_census_map that takes the merged data and returns the Axes that contains shapes of all the census tracts in Washington. Title the plot “Washington Census Tracts” and turn off axis labels.

def plot_census_map(state_data):
    """
    ...
    """
    ...


ax = plot_census_map(state_data)
assert type(ax) == Axes, "this function should return an Axes object"
layers = ax.findobj(PatchCollection)
assert_patches_allclose(layers[0], geoms=state_data.geometry, num_colors=1, layer=0)
assert len(layers) == 1, f"expected to have 1 layer but got {len(layers)} layers"
assert ax.get_title() == "Washington Census Tracts", "title does not match expected"
assert not ax.axison, "borders and labels must be hidden"

Preparing the state background

When given no arguments, the dissolve method considers the entire GeoDataFrame as a single group. This will be useful for plotting backgrounds in later tasks. We precompute this because dissolve is an expensive method to run (in terms of runtime).

entire_state = state_data[["geometry"]].dissolve()
display(entire_state)
ax = entire_state.plot(color="#EEE")
ax.set_axis_off()

Task: Plot census tract populations

Write a function plot_census_population_map that takes the merged data and the entire_state as state background and return the Axes that plots all the census tracts in Washington where each census tract is colored according to the POP2010 column. There will be some missing census tracts. Underneath, plot the entire state of Washington in the background color #EEE. Title the plot “Washington Census Tract Populations”, turn off axis labels, include a legend, and increase the figure size so that the map is the same height as the legend (hint: figure size should be roughly 2 by 1 in terms of width by height).

def plot_census_population_map(state_data, state_background):
    """
    ...
    """
    ...


ax = plot_census_population_map(state_data, entire_state)
assert type(ax) == Axes, "this function should return an Axes object"
layers = ax.findobj(PatchCollection)
assert len(layers) == 2, "expected to have 2 layers (one background and one foreground)"\
                         +f" but got {len(layers)} layer(s)"
assert_patches_allclose(layers[0], geoms=entire_state.geometry, color="#EEE", layer=0)
assert_patches_allclose(layers[1], geoms=state_data.dropna().geometry, num_colors=183, layer=1)
assert ax.get_title() == "Washington Census Tract Populations", "title does not match expected"
assert not ax.axison, "borders and labels must be hidden"
cbar = ax.get_figure().get_axes()[-1]
assert cbar.get_label() == "<colorbar>", "missing legend"
assert ax.bbox.height == cbar.bbox.height, "map can be enlarged"

Task: Plot county populations

Write a function plot_county_populations_map that takes the merged data and the entire_state as state background and returns the Axes that plots all the counties in Washington where each county is colored according to the POP2010 column. This will require combining all the census tract data and geometries for each county, though there will be missing data for some counties. Underneath, plot the entire state of Washington in the background color #EEE. Title the plot “Washington County Populations”, turn off axis labels, include a legend, and increase the figure size so that the map is the same height as the legend (hint: figure size should be roughly 2 by 1 in terms of width by height).

def plot_county_population_map(state_data, state_background):
    """
    ...
    """
    ...


ax = plot_county_population_map(state_data, entire_state)
assert type(ax) == Axes, "this function should return an Axes object"
layers = ax.findobj(PatchCollection)
assert len(layers) == 2, "expected to have 2 layers (one background and one foreground)"\
                         +f" but got {len(layers)} layer(s)"
assert_patches_allclose(layers[0], geoms=entire_state.geometry, color="#EEE", layer=0)
assert_patches_allclose(layers[1], geoms=all_counties.geometry, num_colors=20, layer=1)
assert ax.get_title() == "Washington County Populations", "title does not match expected"
assert not ax.axison, "borders and labels must be hidden"
cbar = ax.get_figure().get_axes()[-1]
assert cbar.get_label() == "<colorbar>", "missing legend"
assert ax.bbox.height == cbar.bbox.height, "map can be enlarged"

Task: Plot food access by county

Write a function plot_food_access_by_county_map that takes the merged data and the entire_state as state background and returns a 4-tuple of Axes that represent the subplots in a 2-by-2 figure consisting of 4 choropleth maps:

  • Top left plot titled “Low Access: Half Mile” showing the proportion of people per county who have low access to food within a half mile lapophalf out of the total population POP2010.
  • Top right plot titled “Low Access + Low Income: Half Mile” showing the proportion of people per county considered low income who also have low access to food within a half mile lalowihalf out of the total population POP2010.
  • Bottom left plot titled “Low Access: 10 Miles” showing the proportion of people per county who have low access to food within 10 miles lapop10 out of the total population POP2010.
  • Bottom right plot titled “Low Access + Low Income: 10 Miles” showing the proportion of people per county considered low income who also have low access to food within 10 miles lalowi10 out of the total population POP2010.

When calling plot, specify the keyword arguments vmin=0 and vmax=1 so that the subplots all share the same scale. Underneath, plot the entire state of Washington in the background color #EEE. We recommend preparing subplots with figsize=(15, 10). Turn off axis labels on each subplot.

def plot_food_access_by_county_map(state_data, state_background):
    """
    ...
    """
    ...


axs = plot_food_access_by_county_map(state_data, entire_state)
assert axs is not None, "this function should return a 4-tuple of Axes objects"
assert len(axs) == 4, "the returned tuple's length should be 4"
expected_titles = ["Low Access: Half Mile", "Low Access + Low Income: Half Mile",
                   "Low Access: 10 Miles", "Low Access + Low Income: 10 Miles"]
for ax, expected_num_colors, expected_title in zip(axs, [31, 23, 19, 16], expected_titles):
    assert type(ax) == Axes, "each value in the returned tuple should be an Axes object"
    layers = ax.findobj(PatchCollection)
    assert len(layers) == 2, "expected to have 2 layers (one background and one foreground)"\
                             +f" but got {len(layers)} layer(s)"
    assert_patches_allclose(layers[0], geoms=entire_state.geometry, color="#EEE", layer=0)
    assert_patches_allclose(layers[1], geoms=all_counties.geometry, num_colors=expected_num_colors, layer=1)
    assert ax.get_title() == expected_title, f"title {ax.get_title()} does not match expected"
    assert ax.get_legend() is None, "plot should not have a legend"
    assert not ax.axison, "borders and labels must be hidden"

Writeup: Food access by county

Setting aside the lack of a legend in plot_food_access_by_county_map, is it an effective visualization? Using the data visualization principles we learned in class, explain why or why not.

TODO: Replace this text with your answer.

Task: Plot low access census tracts

Write a function plot_census_low_access_map that takes the merged data and the entire_state as state background and returns the Axes that plots all census tracts (not counties) considered low access using 5 plot layers for the following definitions for “low access” in urban and rural tracts. For this task, do not use the LATracts_half or LATracts10 columns in the merged data; the procedure described below intentionally results in different values. Your 5 plot layers should be in the following order:

  1. Plot the map of Washington in the background with color #EEE.

  2. Plot all the Urban census tracts for which we have food access data in the color #AAA.

  3. Plot all the Rural census tracts for which we have food access data in the color #AAA.

  4. Plot all the Urban census tracts considered low access in the default (blue) color.

    Low access in an urban census tract is defined by a lapophalf value that is at least 500 people or at least 33% of the census tract population.

  5. Plot all the Rural census tracts considered low access in the default (blue) color.

    Low access in a rural census tract is defined by a lapop10 value that is at least 500 people or at least 33% of the census tract population.

Finally, title the plot “Low Access Census Tracts” and turn off axis labels.

def plot_census_low_access_map(state_data, state_background):
    """
    ...
    """
    ...


ax = plot_census_low_access_map(state_data, entire_state)
assert type(ax) == Axes, "this function should return an Axes object"
layers = ax.findobj(PatchCollection)
assert len(layers) == 5, "expected to have 5 layers (one background and 4 foregrounds)"\
                         +f" but got {len(layers)} layer(s)"
assert_patches_allclose(layers[0], geoms=entire_state.geometry, color="#EEE", layer=0)
urban_idx = state_data["Urban"].notna() & ~state_data.index.isin(rural_idx)
urban_la_idx = urban_idx & ~state_data.index.isin(urban_ha_idx)
try:
    assert_patches_allclose(layers[1], geoms=state_data.loc[urban_idx, "geometry"], color="#AAA", layer=1)
    assert_patches_allclose(layers[2], geoms=state_data.loc[rural_idx, "geometry"], color="#AAA", layer=2)
    assert_patches_allclose(layers[3], geoms=state_data.loc[urban_la_idx, "geometry"], layer=3)
    assert_patches_allclose(layers[4], geoms=state_data.loc[rural_la_idx, "geometry"], layer=4)
    ## Debugging tip: if you want to check one layer, comment out the four lines above
    ## and use each of the following lines ONE AT A TIME
    # assert_patches_allclose(layers[1], geoms=state_data.loc[urban_idx, "geometry"], color="#AAA", layer=1)
    # assert_patches_allclose(layers[1], geoms=state_data.loc[rural_idx, "geometry"], color="#AAA", layer=1)
    # assert_patches_allclose(layers[1], geoms=state_data.loc[urban_la_idx, "geometry"], layer=1)
    # assert_patches_allclose(layers[1], geoms=state_data.loc[rural_la_idx, "geometry"], layer=1)
except AssertionError as e:
    # this message was to remind you to plot in the correct order; you can ignore if you just kept one layer
    print("Please make sure the order of your plot is correct, and also check any other errors below.")
    raise e
assert ax.get_title() == "Low Access Census Tracts", "title does not match expected"
assert not ax.axison, "borders and labels must be hidden"

Writeup: Data-driven decision-making

What is one way that government or food access organizations could use plot_food_access_by_county_map or plot_census_low_access_map to shape their decisions or policies? Then, explain one limitation or concern with using the plots in the way you suggested.

TODO: Replace this text with your answer.

Task: Interactive map

Although the initial report to Congress was completed in June 2009, the Economic Research Service has since then maintained an interactive map for their Food Access Research Atlas. Open this interactive map, turn off the default layer “LI and LA and 1 and 10 miles”, and turn on the layer “LI and LA at 1/2 and 10 miles”. This layer displays:

Low-income census tracts where a significant number or share of residents is more than 1/2 mile (urban) or 10 miles (rural) from the nearest supermarket.

Write a function interactive_map that returns a interactive Map of low income and low access census tracts in Washington. Include only LowIncomeTracts that are either low access at half a mile LATracts_half for Urban census tracts or low access at 10 miles LATracts10 for Rural census tracts. This dataset does not match the Food Access Research Atlas, so some differences are to be expected. Your interactive map won’t appear in the PDF printout as PDF files cannot embed interactive maps so don’t worry about it not showing up in your PDF export.

def interactive_map(state_data):
    """
    ...
    """
    ...


map = interactive_map(state_data)
assert type(map) == Map, "this function should return an interactive Map object"
display(map)
last_child = next(reversed(map._children.values()))
assert type(last_child) == GeoJson, "last child should be GeoJson; do not specify column"
geojson = last_child.data
assert set(int(d["id"]) for d in geojson["features"]) == set(lalowi_idx), "wrong selection"

The following cell plots a preview of your interactive map for the PDF printout.

gpd.GeoDataFrame.from_features(geojson, crs="EPSG:4326").plot(ax=entire_state.plot(color="#EEE")).set_axis_off()

Writeup: Build a new supermarket

Using the interactive map above, locate the low-income low-access census tract closest to your favorite place in Washington. Then, identify a location (a specific street intersection, such as “University Way NE & NE 45th St”) to add a new supermarket that would serve the people living in that census tract. Finally, explain the considerations that factored into your choice of location.

This dataset is outdated, so assume there are no supermarkets in any low-income low-access census tract even if supermarkets are present today. An effective writeup will demonstrate expertise with the data setting.

TODO: Replace this text with your answer.