How to fill (based on the index of a dataframe) an empty column

I’m trying to add the column ‘Information’ to my dataframe (df3) and fill it with string values (‘True’ if the index is 0 and ‘False’ otherwise). The problem is that pandas puts 'False' in every single row, even in the ones with index 0 (see the output below).

Input :

import pandas as pd
import numpy as np

df1 = pd.DataFrame({'Column1': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J'],
                    'Column2': [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
                    'Column3': ['I', 'II', 'III', 'IV', 'V', 'VI', 'VII', 'VIII', 'IX', 'X'],
                    'Column4': ['K', 'L', 'M', 'N', 'O', 'P', 'Q', 'R', 'S', 'T'],
                    'Column5': [11, 12, 13, 14, 15, 16, 17, 18, 19, 20],
                    'Column6': ['XI', 'XII', 'XIII', 'XIV', 'XV', 'XVI', 'XVII', 'XVIII', 'XIX', 'XX'],
                    'Column7': ['U', 'V', 'W', 'X', 'Y', 'Z', '', '', '', ''],
                    'Column8': [21, 22, 23, 24, 25, 26, pd.NA, pd.NA, pd.NA, pd.NA],
                    'Column9': ['XXI', 'XXII', 'XXIII', 'XXIV', 'XXV', 'XXVI', '', '', '', '']})

column_names = ['Letters', 'Numbers', 'RomanNumerals']
df3 = pd.DataFrame(columns = column_names)

i=0
while i<len(df1.columns):
    df2 = df1.iloc[:, i:i+3]
    df2.columns = column_names
    df3 = pd.concat([df3, df2])
    i+=3

df3.dropna(inplace=True)

for index, row in df3.iterrows():
    df3['Information'] = np.where(index == 0, True,  False)

display(df3)

Output :

Letters Numbers RomanNumerals Information
0 A 1 I FALSE
1 B 2 II FALSE
2 C 3 III FALSE
3 D 4 IV FALSE
4 E 5 V FALSE
5 F 6 VI FALSE
6 G 7 VII FALSE
7 H 8 VIII FALSE
8 I 9 IX FALSE
9 J 10 X FALSE
0 K 11 XI FALSE
1 L 12 XII FALSE
2 M 13 XIII FALSE
3 N 14 XIV FALSE
4 O 15 XV FALSE
5 P 16 XVI FALSE
6 Q 17 XVII FALSE
7 R 18 XVIII FALSE
8 S 19 XIX FALSE
9 T 20 XX FALSE
0 U 21 XXI FALSE
1 V 22 XXII FALSE
2 W 23 XXIII FALSE
3 X 24 XXIV FALSE
4 Y 25 XXV FALSE
5 Z 26 XXVI FALSE

Is there an explanation for this behavior?

>Solution :

Replace the for loop with this snippet:


df3['Information']= df3.index.map(lambda x: x==0)

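What happens in the for loop is that each iteration assigns a single scalar to the whole column, overwriting the previous result. After the loop, every row holds the value from the last iteration, whose index (5) is not 0. Note that you typed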


df3['Information'] = np.where(index == 0, True,  False)

Instead of


row['Information'] = np.where(index == 0, True,  False)

But even the code above won’t work, because the row returned by iterrows() is a copy here, so assigning to it never reaches df3.
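
To make the overwrite visible, here is a minimal sketch on a small hypothetical frame (demo, which stands in for df3 and is not part of the original code). It records the column's first value after each iteration, then checks that assigning to row leaves the frame untouched. The names demo, seen and Flag are assumptions made for illustration only.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame with a repeated index, mimicking df3 after concat.
demo = pd.DataFrame({'Letters': ['A', 'B', 'K', 'L']}, index=[0, 1, 0, 1])

seen = []
for index, row in demo.iterrows():
    # index is a scalar, so np.where returns a single value and pandas
    # broadcasts it to the whole column, replacing the previous one.
    demo['Information'] = np.where(index == 0, True, False)
    seen.append(bool(demo['Information'].iloc[0]))

# Only the last iteration (index 1) survives in the column.
loop_result = demo['Information'].tolist()

for index, row in demo.iterrows():
    # row is a separate Series, so this write does not reach demo.
    row['Flag'] = index == 0

flag_added = 'Flag' in demo.columns
```

In this sketch, seen alternates between True and False as the loop runs, while loop_result ends up all False and flag_added stays False.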

Edit:

Another way to do this, with a named function instead of a lambda (for further explanation you can check pandas dataframe apply; the code below uses the analogous Index.map):


def get_information(index):
    if index==0:
        return True
    else:
        return False

df3['Information']= df3.index.map(get_information)
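
Since the question asks for string values, a vectorized variant that avoids calling a Python function per row might look like the sketch below. It assumes a small hypothetical frame demo in place of df3, and the column name InformationStr is made up for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical small frame with a repeated index, standing in for df3.
demo = pd.DataFrame({'Letters': ['A', 'B', 'K', 'L']}, index=[0, 1, 0, 1])

# Comparing the whole index to 0 yields one boolean per row.
demo['Information'] = demo.index == 0

# With an array condition, np.where picks a string for each row.
demo['InformationStr'] = np.where(demo.index == 0, 'True', 'False')
```

The difference from the loop in the question is that the condition passed to np.where is an array covering every row, not a single scalar.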
