Plotting count values over time for specific country names with pandas

I have a dataframe, df, containing information about companies: the company name, the country it is located in, and the year it was founded. I now need to plot, as a line chart, how the number of companies founded per country develops for each year in the dataset (1995 to 2015). However, all I have managed to create is a pie chart with the total number of companies founded per country, which does not include the Year_founded information.

The data looks like this:

Company Country Year_founded
A USA 1996
B NLD 2004
C CAN 2014
D USA 2000
E NLD 1999
F CAN 2000
etc.

Ideally I would like to plot the number of companies per country in a line chart, with one line per country.

Any ideas on how to approach this problem?

>Solution :

IIUC, you can use crosstab and plot.line:

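import pandas as pd
from matplotlib.ticker import MaxNLocator

# Count companies per (year, country) and draw one line per country
ax = pd.crosstab(df['Year_founded'], df['Country']).plot.line()
ax.set_ylabel('Number of founded companies')
# Keep the year axis on integer ticks
ax.xaxis.set_major_locator(MaxNLocator(integer=True))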

crosstab:

Country       CAN  NLD  USA
Year_founded               
1996            0    0    1
1999            0    1    0
2000            1    0    1
2004            0    1    0
2014            1    0    0
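
The crosstab only has rows for years that actually occur in the data, so the lines jump straight from one founding year to the next. As a rough sketch, assuming the plot should cover every year from 1995 to 2015 (the range mentioned in the question), the table could be reindexed so that missing years count as 0. The names all_years, counts_full and running_total are made up for illustration, and the cumulative sum is only relevant if "total" in the question means a running total rather than a per-year count:

```python
import pandas as pd

# Sample rows copied from the question (the real df has more rows)
df = pd.DataFrame({
    "Company": list("ABCDEF"),
    "Country": ["USA", "NLD", "CAN", "USA", "NLD", "CAN"],
    "Year_founded": [1996, 2004, 2014, 2000, 1999, 2000],
})

# Same table as in the answer: rows are years, columns are countries
counts = pd.crosstab(df["Year_founded"], df["Country"])

# Assumption: every year from 1995 to 2015 should appear on the x-axis,
# with 0 for years in which no company was founded
all_years = pd.RangeIndex(1995, 2016, name="Year_founded")
counts_full = counts.reindex(all_years, fill_value=0)

# Assumption: "total" may mean a running total of companies per country
running_total = counts_full.cumsum()
```

Calling counts_full.plot.line() or running_total.plot.line() then draws one line per country, in the same way as the answer above.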
