This activity checks your understanding of how a model learns the importance of each feature in a dataset.
We generate a synthetic dataset from a hidden formula. You will train a model on this data and see whether it “learns” the coefficients of that formula.
To complete the activity, output each feature alongside its importance (coef_).
import pandas as pd
import numpy as np
from sklearn.linear_model import LinearRegression
import matplotlib.pyplot as plt
# --- STEP 1: GENERATE THE DATA ---
np.random.seed(10)
n = 100
data = {
    'Hours_Slept': np.random.uniform(4, 10, n),
    'Practice_Problems': np.random.randint(0, 50, n),
    'Coffee_Cups': np.random.randint(0, 8, n),  # Noise (Mostly)
    'Video_Game_Hours': np.random.uniform(0, 5, n)  # Negative impact
}
df = pd.DataFrame(data)
# The Hidden Formula: Score = (10 * Sleep) + (0.5 * Problems) - (5 * VideoGames) + Random Noise
df['Test_Score'] = (
    (10 * df['Hours_Slept'])
    + (0.5 * df['Practice_Problems'])
    - (5 * df['Video_Game_Hours'])
    + np.random.normal(0, 2, n)
)
# --- STEP 2: YOUR CODE HERE ---
# 1. Define X (all columns except Test_Score) and y (Test_Score)
# 2. Initialize and Fit a LinearRegression model
# 3. Print the .coef_ for each feature
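
One possible way to complete Step 2 is sketched below. It is a minimal example rather than the official solution: the block repeats the Step 1 data generation so it runs on its own, and the names `model` and `importances` are assumptions chosen for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression

# Recreate the Step 1 data so this sketch is self-contained
np.random.seed(10)
n = 100
df = pd.DataFrame({
    'Hours_Slept': np.random.uniform(4, 10, n),
    'Practice_Problems': np.random.randint(0, 50, n),
    'Coffee_Cups': np.random.randint(0, 8, n),
    'Video_Game_Hours': np.random.uniform(0, 5, n),
})
df['Test_Score'] = (
    10 * df['Hours_Slept']
    + 0.5 * df['Practice_Problems']
    - 5 * df['Video_Game_Hours']
    + np.random.normal(0, 2, n)
)

# 1. X holds every column except the target; y is the target
X = df.drop(columns='Test_Score')
y = df['Test_Score']

# 2. Fit an ordinary least-squares linear regression
model = LinearRegression()
model.fit(X, y)

# 3. Pair each feature name with its fitted coefficient and print it
#    (`importances` is a hypothetical name, not part of the exercise)
importances = pd.Series(model.coef_, index=X.columns)
for feature, coef in importances.items():
    print(f"{feature}: {coef:.3f}")
```

If the model has recovered the hidden formula, the printed coefficients should land close to 10, 0.5, 0, and -5 respectively. Note that `coef_` values are per unit of each feature, so they are only comparable as "importance" when the features share a similar scale.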