
In this lesson, we will practice working with DataFrames.

import pandas as pd
import doctest

Our Dataset!

Run the cell below to see the DataFrame you’ll be working with today. It is a dataset about the competing bakers at the Great Seattle Bake Off.

great_seattle_bakeoff = pd.DataFrame({
    "BakerID": [101, 102, 103, 104, 105, 106, 107, 108, 109, 110, 111],
    "Name": ["Arona", "Hannah", "Renusree", "Arpan", "Mia", "Asmi", "Alyssa", "Vani", "Vatsal", "Laura", "Suh Young"],
    "DessertBaked": ["red velvet cake", "basque cheesecake", "baumkuchen", "pound cake", "german chocolate cake", "victoria sponge cake", "birthday cake", "carrot cake", "matcha cake", "tres leches", "fruit cake"],
    "FlavorScore": [60, 92, 78, 40, 38, 73, 50, 59, 75, 99, 0],
    "PresentationScore": [95, 80, 88, 92, 98, 100, 98, 60, 77, 100, 100],
    "CityOfOrigin": ["Paris", "New York City", "Florence", "Seattle", "Seattle", "Paris", "New York City", "Seattle", "Paris", "New York City", "Florence"],
    "StartedBaking": [2006, 2007, 2010, 2020, 2020, 2015, 2019, 2013, 2018, 2018, 1325]
})

great_seattle_bakeoff

Individual Activity

Display all the column names in the cell below.

# TODO: Display column names
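As a reminder of the general pattern, here is a minimal sketch on a made-up table. The `pets` DataFrame and its columns are hypothetical and are not part of today's dataset.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
pets = pd.DataFrame({"Name": ["Biscuit", "Mochi"], "Weight": [12, 4]})

# .columns is an attribute (no parentheses) that holds the column labels
column_names = list(pets.columns)
print(column_names)
```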

Then, select the bakers who received more than 70 points for their cake’s flavor.

# TODO: Rows where the flavor score is > 70
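Row selection like this is usually done with a boolean mask. The sketch below shows the idea on a hypothetical `pets` table (the names and values are invented for illustration), so the same pattern would still need to be adapted to the bake-off columns.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
pets = pd.DataFrame({
    "Name": ["Biscuit", "Mochi", "Pepper"],
    "Weight": [12, 4, 30],
})

# The comparison produces a boolean Series with one True/False per row
is_heavy = pets["Weight"] > 10

# Indexing with that Series keeps only the rows marked True
heavy_pets = pets[is_heavy]
print(heavy_pets)
```

Note that `>` is strict: a row whose value is exactly 10 would not be selected here.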

Group Activity

Add a new column called CreativityScore with the values [100, 80, 70, 90, 100, 20, 80, 90, 100, 70, 70]. The list has 11 values, one per baker, in the same order as the rows.

# TODO: Create CreativityScore column
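One common way to add a column is to assign a list to a new column label. This is a sketch on a hypothetical `pets` table (the `Age` values are invented), not the solution for the bake-off data.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
pets = pd.DataFrame({
    "Name": ["Biscuit", "Mochi", "Pepper"],
    "Weight": [12, 4, 30],
})

# Assigning a list to a label that does not exist yet creates the column.
# The list must have exactly one value per row, or pandas raises a ValueError.
pets["Age"] = [3, 1, 7]
print(pets)
```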

Now, for each baker, calculate the mean of the FlavorScore, PresentationScore, and CreativityScore columns, and store the result in a new column, TotalScore. (Despite its name, TotalScore holds the average of the three scores, not their sum.)

# TODO: Calculate mean and create TotalScore
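A per-row average across several columns can be computed with `mean(axis=1)`. The sketch below uses a hypothetical `quiz` table with invented column names and values, purely to show the direction of the calculation.

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
quiz = pd.DataFrame({
    "Student": ["Ada", "Lin"],
    "Quiz1": [80, 90],
    "Quiz2": [100, 60],
    "Quiz3": [90, 90],
})

score_columns = ["Quiz1", "Quiz2", "Quiz3"]

# axis=1 averages across the selected columns within each row;
# the default (axis=0) would instead average down each column
quiz["Average"] = quiz[score_columns].mean(axis=1)
print(quiz)
```

Selecting the score columns first keeps non-numeric columns such as `Student` out of the calculation.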

Next, set BakerID as the index of the DataFrame, then look up Baker 105’s CityOfOrigin. What about their StartedBaking year?

# TODO: Update the index
# Who is Baker 105?
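For reference, here is a minimal sketch of re-indexing and label-based lookup on a hypothetical `pets` table (the `PetID` column and its values are invented for illustration).

```python
import pandas as pd

# Hypothetical toy data, unrelated to the bake-off dataset
pets = pd.DataFrame({
    "PetID": [7, 8, 9],
    "Name": ["Biscuit", "Mochi", "Pepper"],
    "Weight": [12, 4, 30],
})

# set_index returns a new DataFrame by default, so reassign the result
pets = pets.set_index("PetID")

# .loc[row_label, column_label] looks up by label, not by position
name_of_8 = pets.loc[8, "Name"]
print(name_of_8)
```

After `set_index`, `PetID` is no longer a regular column, so `pets["PetID"]` would raise a `KeyError`; the values live in `pets.index` instead.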

Final Activity

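Write a function advancing_bakers that labels each baker “Advanced” if their TotalScore is 85 or above and “Eliminated” otherwise. The function should store these labels in a new column, CompetitionStatus, and return the DataFrame. The doctest below looks up Baker 101 with .loc, so it assumes BakerID is already the index.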

def advancing_bakers(data):
    """
    TODO: Create a docstring and an additional doctest
    
    >>> advancing_bakers(great_seattle_bakeoff).loc[101]["CompetitionStatus"]
    'Advanced'
    """

doctest.run_docstring_examples(advancing_bakers, globals())
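The same shape of function (take a DataFrame, add a label column based on a threshold, return the result) is sketched below on hypothetical data. The function name `label_sizes`, the `Weight` column, and the cutoff of 10 are all made up for illustration, and using `apply` with a lambda is just one of several reasonable approaches.

```python
import doctest
import pandas as pd

def label_sizes(data):
    """
    Returns a copy of data with a new column SizeLabel that is
    'Large' where Weight is 10 or above and 'Small' otherwise.

    >>> toy = pd.DataFrame({"Weight": [12, 4]}, index=[7, 8])
    >>> label_sizes(toy).loc[7]["SizeLabel"]
    'Large'
    >>> label_sizes(toy).loc[8]["SizeLabel"]
    'Small'
    """
    # Work on a copy so the caller's DataFrame is left unchanged
    result = data.copy()
    # apply calls the lambda once per value in the Weight column
    result["SizeLabel"] = result["Weight"].apply(
        lambda w: "Large" if w >= 10 else "Small"
    )
    return result

# Prints nothing when every doctest passes; the namespace is passed explicitly here
doctest.run_docstring_examples(label_sizes, {"pd": pd, "label_sizes": label_sizes})
```

Including one doctest on each side of the threshold helps confirm that both labels are produced.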