Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Can you help me finding a row content from a pandas df into another df and then add the count of the findings into a new column of the first df?

I just started using python and I am trying to improve myself by engaging in different projects.

At the moment I am trying to create a work allocation tool. I loaded 2 xlsx files as df.

In df "dfau" there is a list of the employees with their capacity left and the languages that they speak.

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Employees Capacity Left Languages
E1 0 EN, GER
E2 -11 EN, IT, ES
E3 6 EN, CH
E4 2 EN, ES, FR
E5 -1 EN
E6 1 EN
E7 -9 EN, GER
E8 8 EN, GER

In df "dfln" there is a list of the languages.

Languages
EN
IT
ES
GER
CH
FR
NL
PT

I would like to look for the languages listed in "dfln" into "dfau" and add the count of these languages as a second column in "dfln" (and this is where i am stuck).

Afterwards, in "dfln", I would also like to add a column which would show the capacity per each language by summing the capacities left of each employee which can speak the language.

The employees which have a "Capacity Left" below 1 should not be counted nor summed.

This is how it should appear:

Languages Count Capacity
EN 6 85
IT 0 0
ES 3 15
GER 1 8
CH 1 6
FR 1 2
NL 1 45
PT 2 13

I tried to re-write the code in different ways but still this is where i got stuck (the result remains the basic dfln.

import pandas as pd
dfau = pd.read_excel (r'C:\Users\Projects\pywp\SampleAu.xlsx', sheet_name='Employees')
dfln = pd.read_excel (r'C:\Users\Projects\pywp\SampleAu.xlsx', sheet_name='Languages')
    
LCount=0
dfln.insert(1,"Count",LCount)
    
for language in dfln["Languages"]:
    if dfau.loc[dfau["Languages"].str.contains(language, case=False)] is True:
        LCount+=1
    print(dfln)

Can someone help me please?

>Solution :

I cannot make sense of the final output. However, you can achieve what you are asking by running the following code:

dfIn['Count'] = 0
dfIn['Capacity'] = 0

for idx,row in dfIn.iterrows():
    dfIn.iloc[idx,1] = dfau['Languages'][dfau['Capacity Left'] >= 1].str.contains(row.Languages).sum()
    dfIn.iloc[idx, 2] = dfau['Capacity Left'][dfau['Capacity Left'] >= 1][dfau['Languages'].str.contains(row.Languages)].sum()

Notice that the final output looks like this, and not what you posted above:

enter image description here

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading