I have a data frame with a column named 'age'.
The ages range from 6 to 90.
Is there a way to group the ages into intervals such as '5-9', '10-14', etc., so that a graph can show the age ranges instead of individual ages?
> Solution:
One way to do this is to bin the ages with pd.cut and plot the number of rows in each bin:
import pandas as pd
import matplotlib.pyplot as plt
# Sample data
data = {'age': [6, 10, 12, 15, 20, 22, 25, 30, 35, 52, 53, 54, 55, 60, 65, 70, 75, 84, 85, 90]}
df = pd.DataFrame(data)
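# Define the age ranges as inclusive (start, end) pairs.
# The last range is (90, 94) so that the maximum age of 90 is not dropped as NaN.
age_ranges = [(5, 9), (10, 14), (15, 19), (20, 24), (25, 29), (30, 34), (35, 39), (40, 44),
              (45, 49), (50, 54), (55, 59), (60, 64), (65, 69), (70, 74), (75, 79), (80, 84), (85, 89),
              (90, 94)]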
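# Group ages into ranges. Bin edges sit at start - 0.5 (plus a final end + 0.5),
# so every integer age falls strictly inside exactly one bin.
df['age_range'] = pd.cut(
    df['age'],
    bins=[start - 0.5 for start, _ in age_ranges] + [age_ranges[-1][-1] + 0.5],
    labels=[f"{start}-{end}" for start, end in age_ranges],
)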
# Count the occurrences of each age range
age_counts = df['age_range'].value_counts().sort_index()
# Plotting the data
age_counts.plot(kind='bar', rot=0)
plt.xlabel('Age Range')
plt.ylabel('Count')
plt.title('Age Distribution')
plt.show()
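
If typing out every range by hand feels tedious, the edges and labels can also be generated. The following is only a sketch of that variant, not the only way to do it: it assumes a fixed bin width of 5 (the `width` variable is my own name, not something from the question) and integer ages between 5 and 94. Passing `right=False` to `pd.cut` makes each bin closed on the left and open on the right, so the edges can stay whole numbers.

```python
import pandas as pd

# Same sample data as in the answer above
df = pd.DataFrame({'age': [6, 10, 12, 15, 20, 22, 25, 30, 35, 52,
                           53, 54, 55, 60, 65, 70, 75, 84, 85, 90]})

width = 5  # assumed bin width (hypothetical name, chosen for this sketch)

# Edges 5, 10, ..., 95; the final edge 95 closes the 90-94 bin
edges = list(range(5, 96, width))

# One label per bin: "5-9", "10-14", ..., "90-94"
labels = [f"{start}-{start + width - 1}" for start in edges[:-1]]

# right=False gives half-open bins [5, 10), [10, 15), ... so age 10 lands in "10-14"
df['age_range'] = pd.cut(df['age'], bins=edges, labels=labels, right=False)

# Count rows per range; sort_index keeps the ranges in ascending order
age_counts = df['age_range'].value_counts().sort_index()
print(age_counts)
```

The resulting `age_counts` can be passed to `age_counts.plot(kind='bar', rot=0)` exactly as in the code above. Ranges with no rows, such as 40-44 in this sample, still appear with a count of zero because the column is categorical.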