Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Python: How to query data from ranges in another dataframe using pandas

Have a df_subject_tracker has columns subject , marks_greater , marks_lower and marks_to_be_given and df_students_marks with columns name subject , rol , marks

using marks column in df_students_marks dataframe how to query the marks which falls in between marks_greater and marks_lower.

if marks fall between them we have to assign the marks_to_be_given column as alloted in df_marks_final dataframe. if marks does’nt fall under these ranges then we have to give alloted as 0

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df_subject_tracker

subject      marks_greater       marks_lower    marks_to_be_given

english            50                  100              20
english            150                 200              40
english            250                 300              60

social             50                  100              30
social             150                 200              50
social             250                 300              70

df_students_marks

name     subject          rol   marks

mark     english          123   75
mark     social           123   75
tom     english          124   20
tom     social           124   289

df_marks_final

name     subject          rol   marks   alloted

mark     english          123   75        20
mark     social           123   75        20
tom     english          124   20         0
tom     social           124   289        70

tried this :

df_subject_tracker.loc[(df_subject_tracker['marks_greater'] >= int(marks_greater)) & (df_subject_tracker['marks_greater'] <=int(marks_lower)),'alloted'] = int(marks_to_be_given)

>Solution :

You can merge your two dataframes then keep only rows where marks is between marks_greater and marks_lower.

# df1 <- df_subject_tracker
# df2 <- df_students_marks
df2['alloted'] = (
    df2.reset_index().merge(df1, on='subject', how='left')
       .query("marks.between(marks_greater, marks_lower, inclusive='left')")
       .set_index('index')['marks_to_be_given'].reindex(df2.index, fill_value=0)
)
print(df2)

# Output
   name  subject  rol  marks  alloted
0  mark  english  123     75       20
1  mark   social  123     75       30
2   tom  english  124     20        0
3   tom   social  124    289       70

Step-by-step:

>>> out = df2.reset_index().merge(df1, on='subject', how='left')
    index  name  subject  rol  marks  marks_greater  marks_lower  marks_to_be_given
0       0  mark  english  123     75             50          100                 20
1       0  mark  english  123     75            150          200                 40
2       0  mark  english  123     75            250          300                 60
3       2   tom  english  124     20             50          100                 20
4       2   tom  english  124     20            150          200                 40
5       2   tom  english  124     20            250          300                 60
6       1  mark   social  123     75             50          100                 30
7       1  mark   social  123     75            150          200                 50
8       1  mark   social  123     75            250          300                 70
9       3   tom   social  124    289             50          100                 30
10      3   tom   social  124    289            150          200                 50
11      3   tom   social  124    289            250          300                 70

>>> out = out.query("marks.between(marks_greater, marks_lower, inclusive='left')")
    index  name  subject  rol  marks  marks_greater  marks_lower  marks_to_be_given
0       0  mark  english  123     75             50          100                 20
6       1  mark   social  123     75             50          100                 30
11      3   tom   social  124    289            250          300                 70

>>> out = out.set_index('index')['marks_to_be_given'].reindex(df2.index,fill_value=0)
0    20
1    30
2     0
3    70
Name: marks_to_be_given, dtype: int64
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading