Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Find the column name of the highest weighted average

I have a DF that looks like this. In the investor columns, 1 means invested and 0 mean no. If there are more than 1 investors invested in a project, we can assume that they share the investment equally (eg: Investor A and B each invested 50000 in project "something"). I want to calculate the total amount that each investor put in and find out who invested the most.

Project Invested Amount Investor A Investor B Investor C
Something 100000 1 1 0
Another 5000000 0 0 1
Last 25000000 1 1 1

Right now I am thinking of filtering by each investor that has a 1 and then divide by them sum of all the investor column. Here is what I tried but I am still missing something:

Investor_A = df[df['Investor A'] == 1]
test = Investor_A['Invested Amount'] / (df.iloc[:,3:5].sum())

Expected output:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

Investor A would have put in a total of 100000/2 + 25000000/3 = 8383333.33333. Investor B would have put in a total of 100000/2 + 25000000/3 = 8383333.33333. Investor C would have put in a total of 5000000 + 25000000/3 = 13333333.3333 –> Investor C invested the most amount of money.

>Solution :

Call filter() on the Investor columns and divide the Invested Amount column by the row-wise sum of the Investor columns. Then multiply it by the investors columns again to get the total share of each investor for each item. Then calling sum() would find the total investment of each investor and idxmax() would fetch the name of the investor.

investors = df.filter(like='Investor')
avg_invested_amount = df['Invested Amount'] / investors.sum(1)
investment_shares = investors.mul(avg_invested_amount, axis=0)
investment_per_investor = investment_shares.sum()
investment_per_investor.idxmax()
#'Investor C'

FYI, this code can be written in 2 lines (but much less legible):

investors = df.filter(like='Investor')
investors.mul(investors.sum(1).rdiv(df['Invested Amount']), axis=0).sum().idxmax()
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading