How to subtract rows of a Pandas dataframe based upon some conditions?

April 20, 2022

I am performing analysis on this dataset.

After using the code below, I am left with the cleaned version of the data.

covid_df.drop(columns = ["Sno", "Time"], inplace = True)

covid_df["State/UnionTerritory"] = covid_df["State/UnionTerritory"].replace({
    "Bihar****": "Bihar",
    "Maharashtra***": "Maharashtra", 
    "Madhya Pradesh***": "Madhya Pradesh", 
    "Karanataka": "Karnataka",
    "Telangana": "Telengana",
    "Himanchal Pradesh": "Himachal Pradesh",
    "Dadra and Nagar Haveli": "Dadra and Nagar Haveli and Daman and Diu",
    "Daman & Diu": "Dadra and Nagar Haveli and Daman and Diu"
    })

invalid_states = ["Cases being reassigned to states", "Unassigned"]

for invalid_state in invalid_states:
  invalid_state_index = covid_df.loc[covid_df["State/UnionTerritory"] == invalid_state, :].index
  covid_df.drop(index = invalid_state_index, inplace = True)

covid_df = covid_df.groupby(["State/UnionTerritory", "Date"], as_index = False).sum()

covid_df["Date"] = pd.to_datetime(covid_df["Date"])
covid_df.sort_values(by = ["State/UnionTerritory", "Date"], inplace = True)

This cleaned data has the cumulative cases for each State/UnionTerritory for each Date. How can I extract the daily new cases for each State/UnionTerritory?

One way I have in mind is by subtracting the previous row from the current row if both have the same State/UnionTerritory value. What would be the most efficient way to do this?

I would also appreciate any suggestions for a better way of cleaning the data.

Solution:

You could use shift.
For example:

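import pandas as pd
df = pd.DataFrame({'cumul': [0, 2, 3, 5, 7]})
# Subtract the previous row's value from each row; the first row has no predecessor
df['quantity'] = df['cumul'] - df['cumul'].shift(1)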

quantity will be:

   quantity
0    NaN
1    2.0
2    1.0
3    2.0
4    2.0

The first value is NaN because the first row has no previous row to subtract. You can then call fillna, or simply set the first value of quantity to the first value of cumul.
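As a minimal sketch of that last step, the snippet below fills the missing first value with the first cumulative value. It uses diff(), which for a single column gives the same result as subtracting shift(1); the cast to int is an optional assumption that the counts are whole numbers.

```python
import pandas as pd

df = pd.DataFrame({"cumul": [0, 2, 3, 5, 7]})

# diff() gives the same result as df["cumul"] - df["cumul"].shift(1)
df["quantity"] = df["cumul"].diff()

# The first row has no previous row, so fall back to its cumulative value
df["quantity"] = df["quantity"].fillna(df["cumul"]).astype(int)

print(df)
```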

Edit: prepare the dataframe by applying your conditions first (here, that rows are only compared within the same State/UnionTerritory), then take the difference 🙂
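One way to apply the "same State/UnionTerritory" condition, sketched here under assumptions, is to group by state before taking the difference. The toy data, the "Confirmed" column name and the "NewConfirmed" output name are hypothetical stand-ins for illustration; substitute the cumulative columns your cleaned dataframe actually has.

```python
import pandas as pd

# Toy stand-in for the cleaned data (values and the "Confirmed" name are assumptions)
covid_df = pd.DataFrame({
    "State/UnionTerritory": ["Bihar"] * 3 + ["Kerala"] * 3,
    "Date": pd.to_datetime(["2020-03-01", "2020-03-02", "2020-03-03"] * 2),
    "Confirmed": [0, 2, 5, 1, 1, 4],
})

# Rows must be in date order within each state for the difference to be meaningful
covid_df = covid_df.sort_values(["State/UnionTerritory", "Date"])

# Current row minus previous row, computed separately for each state
daily = covid_df.groupby("State/UnionTerritory")["Confirmed"].diff()

# The first date of each state has no previous row, so use its cumulative value
covid_df["NewConfirmed"] = daily.fillna(covid_df["Confirmed"]).astype(int)

print(covid_df)
```

Because the difference is taken per group, the first row of one state is never subtracted from the last row of the previous state.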

pandas

by MR

Published April 20, 2022
