I have a dataframe, df, containing companies, the country each one is located in, and the year each one was founded. I need to plot the number of companies founded per country for each year in the dataset (1995 to 2015) as a line chart. However, all I have managed to create is a pie chart with the total number of companies founded per country, which does not include the Year_founded information.
The data looks like this:
| Company | Country | Year_founded |
|---|---|---|
| A | USA | 1996 |
| B | NLD | 2004 |
| C | CAN | 2014 |
| D | USA | 2000 |
| E | NLD | 1999 |
| F | CAN | 2000 |
| etc. |
Ideally I would like to plot the number of companies founded per year in a line chart, with a separate line for each country.
Any ideas on how to approach this problem?
>Solution :
IIUC, you can use crosstab and plot.line:
import pandas as pd
from matplotlib.ticker import MaxNLocator
ax = pd.crosstab(df['Year_founded'], df['Country']).plot.line()
ax.set_ylabel('Number of founded companies')
ax.xaxis.set_major_locator(MaxNLocator(integer=True))
output:
crosstab:
| Year_founded | CAN | NLD | USA |
|---|---|---|---|
| 1996 | 0 | 0 | 1 |
| 1999 | 0 | 1 | 0 |
| 2000 | 1 | 0 | 1 |
| 2004 | 0 | 1 | 0 |
| 2014 | 1 | 0 | 0 |
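Note that the crosstab only has rows for years that actually occur in the data, so the line is drawn straight from one observed year to the next. If years without any founded company should show up as 0, one option is to reindex onto the full range first. This is a minimal sketch: it assumes the 1995 to 2015 range from the question and that a missing year means zero companies. The names `all_years`, `counts_full` and `plot_counts` are made up for illustration, and the sample rows are just the six shown in the question.

```python
import pandas as pd

# Sample rows copied from the table in the question
df = pd.DataFrame({
    "Company": list("ABCDEF"),
    "Country": ["USA", "NLD", "CAN", "USA", "NLD", "CAN"],
    "Year_founded": [1996, 2004, 2014, 2000, 1999, 2000],
})

# Count companies per (year, country) pair
counts = pd.crosstab(df["Year_founded"], df["Country"])

# Assumption: the dataset spans 1995..2015 and a missing year means 0 companies
all_years = pd.RangeIndex(1995, 2016, name="Year_founded")
counts_full = counts.reindex(all_years, fill_value=0)


def plot_counts(counts):
    """Draw one line per country (hypothetical helper; needs matplotlib)."""
    ax = counts.plot.line()
    ax.set_ylabel("Number of founded companies")
    return ax
```

Calling `plot_counts(counts_full)` then draws every year from 1995 to 2015 on the x-axis, with the lines dropping to 0 in years where a country has no entries. Whether that is preferable to connecting only the observed years depends on how complete the data is.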
