Creating a DataFrame that contains two specific years from an existing DataFrame

I’m using pandas and I’m stuck trying to build a new DataFrame for the years 2012 and 2015 and the ‘Team’ values TOR and NYA. I have imported a .csv file into a DataFrame, and I want to pull the rows for 2012 and 2015 out of it and put them into a single DataFrame.

df_2012 = pd.DataFrame(df_baseball[(df_baseball['Year '] == 2012) & 
                                   (df_baseball['Year '] == 2015) & 
                                   (df_baseball['Team '] == 'TOR') & 
                                   (df_baseball['Team '] == 'NYA')], 
                       columns = ['Games_Won', 'Runs_Scored','At_Bats','Hits',
                                  'Doubles','Triples','Home_Runs','Walks', 
                                  'Runs_Against','Earned_Runs',
                                  'Earned_Run_Average','Complete_Games',
                                  'Shutout','Saves','Infield_Put_Outs',
                                  'Hits_Allowed','Home_Run_Allowed', 
                                  'Walks_Allowed','Strikeouts_Allowed',
                                  'Errors','Fielding_Percentage'])

Am I using the wrong operator, or is my syntax wrong? I would highly appreciate any responses!

Solution:

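You want the OR operator (|) within each group of conditions, because a year cannot be 2012 and 2015 at the same time; similarly, a team cannot be TOR and NYA at the same time. With AND (&) between all four comparisons, no row can ever match, so the result is always empty. The year group and the team group are still combined with AND. You could also use isin instead of writing OR between every condition.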
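To make the difference concrete, here is a minimal sketch of the OR version. The sample rows and their values are made up for illustration; only the column names ('Year ' and 'Team ' with trailing spaces, and 'Games_Won') come from the question. Each comparison is wrapped in parentheses because | and & bind more tightly than == in Python.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
# The trailing spaces in 'Year ' and 'Team ' mirror the question's column names.
df_baseball = pd.DataFrame({
    'Year ': [2012, 2012, 2015, 2015, 2013, 2012],
    'Team ': ['TOR', 'BOS', 'NYA', 'TOR', 'NYA', 'NYA'],
    'Games_Won': [73, 69, 87, 93, 85, 95],
})

# The question's logic: every condition joined with AND, which no row can satisfy.
and_mask = ((df_baseball['Year '] == 2012) & (df_baseball['Year '] == 2015) &
            (df_baseball['Team '] == 'TOR') & (df_baseball['Team '] == 'NYA'))

# OR within each group, AND between the two groups.
year_mask = (df_baseball['Year '] == 2012) | (df_baseball['Year '] == 2015)
team_mask = (df_baseball['Team '] == 'TOR') | (df_baseball['Team '] == 'NYA')
df_or = df_baseball[year_mask & team_mask]

print(and_mask.any())  # False: the all-AND mask never matches
print(df_or)
```

On this sample the all-AND mask is False for every row, while the OR version keeps the four rows whose year is 2012 or 2015 and whose team is TOR or NYA.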

Also, isin (or OR) produces a boolean mask that you can use to filter df_baseball directly. Indexing a DataFrame with such a mask already returns a DataFrame, so you don’t need to pass the result into the DataFrame constructor. The following should suffice:

df_2012 = df_baseball[df_baseball['Year '].isin([2012, 2015]) & df_baseball['Team '].isin(['TOR','NYA'])]
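If you also want to keep only certain statistic columns, as the columns= argument in the question was trying to do, one option is to pass the mask and a column list to .loc together. This is a sketch under assumptions: the sample rows are invented, and stat_columns is a shortened, hypothetical stand-in for the full list in the question.

```python
import pandas as pd

# Hypothetical sample data; the values are invented for illustration.
df_baseball = pd.DataFrame({
    'Year ': [2012, 2012, 2015, 2015, 2013, 2012],
    'Team ': ['TOR', 'BOS', 'NYA', 'TOR', 'NYA', 'NYA'],
    'Games_Won': [73, 69, 87, 93, 85, 95],
    'Runs_Scored': [716, 734, 764, 891, 650, 804],
})

# Row filter: year is 2012 or 2015, and team is TOR or NYA.
mask = (df_baseball['Year '].isin([2012, 2015]) &
        df_baseball['Team '].isin(['TOR', 'NYA']))

# Shortened stand-in for the column list in the question.
stat_columns = ['Games_Won', 'Runs_Scored']

# .loc takes the row mask and the column list in one step.
df_selected = df_baseball.loc[mask, stat_columns]
print(df_selected)
```

The trailing spaces in 'Year ' and 'Team ' are kept here because the question uses them. If they come from the CSV header, renaming with df_baseball.columns.str.strip() would be one way to remove them, but whether that applies depends on the actual file.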