How to create and populate pandas columns based on cell values

I have created a dataframe called df as follows:

import pandas as pd
d = {'feature1': [1, 22,45,78,78], 'feature2': [33, 2,2,65,65], 'feature3': [100, 2,359,87,2],}
df = pd.DataFrame(data=d)
print(df)

The dataframe looks like this:

enter image description here

I want to create two new columns called Freq_1 and Freq_2 that count, for each record, how many times the number 1 and number 2 appear respectively. So, I’d like the resulting dataframe to look like this:

enter image description here

So, let’s take a look at the column called Freq_1:

  • for the first record, it’s equal to 1 because the number 1 appears only once across the whole first record;
  • for the other records, it’s equal to 0 because the number 1 never appears.

Let’s take a look now at the column called Freq_2:

  • for the first record, Freq_2 is equal to 0 because number 2 doesn’t appear;
  • for second record, Freq_2 is equal to 2 because the number 2 appears twice;
  • and so on …

How do I create the columns Freq_1 and Freq_2 in pandas?

>Solution :

Try this:

freq = {
    i: df.apply(lambda col: col.eq(i)).sum(axis=1)
    for i in range(10)
}

pd.concat([df, pd.DataFrame(freq).add_prefix("Freq_")], axis=1)

Result:

 feature1  feature2  feature3  Freq_0  Freq_1  Freq_2  Freq_3  Freq_4  Freq_5  Freq_6  Freq_7  Freq_8  Freq_9
        1        33       100       0       1       0       0       0       0       0       0       0       0
       22         2         2       0       0       2       0       0       0       0       0       0       0
       45         2       359       0       0       1       0       0       0       0       0       0       0
       78        65        87       0       0       0       0       0       0       0       0       0       0
       78        65         2       0       0       1       0       0       0       0       0       0       0

Leave a Reply