Why can't I directly replace a column of NaN values with values from another column?

I have a pandas dataframe which I created as follows:

import numpy as np
import pandas as pd

df = pd.DataFrame(columns=[["A", "B", "C"]])
df["A"] = np.arange(1, 8761, 1)

Column A contains the values 1 to 8760, while columns B and C contain only NaN values.

df.info() returns the following:

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 8760 entries, 0 to 8759
Data columns (total 3 columns):
 #   Column  Non-Null Count  Dtype 
---  ------  --------------  ----- 
 0   (A,)    8760 non-null   int64 
 1   (B,)    0 non-null      object
 2   (C,)    0 non-null      object
dtypes: int64(1), object(2)
memory usage: 205.4+ KB

I’d like to have the same value in column B as column A.
When I try

df["B"] = df["A"]

column B still has NaN values. But when I create a new column,

df["D"] = df["A"]

column D has the same values as column A.

I can get the same values in column B as in A, using

df.iloc[:,1] = df.iloc[:, 0]

But I am curious why the following did not work the first time:

df["B"] = df["A"]

Solution:

This is because passing a list of lists as columns creates a MultiIndex:

df = pd.DataFrame(columns=[["A","B","C"]]) # <- note the list of lists
df["A"] = np.arange(1, 8761, 1)
df.columns

output:

MultiIndex([('A',),
            ('B',),
            ('C',)],
           )

Thus you would need df[('B',)] = df[('A',)] to make the correct assignment.
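
To check whether a frame has this problem, it can help to inspect the columns before assigning. The following is a minimal sketch, assuming a 5-row frame instead of 8760 rows for brevity; the variable names is_multi, n_levels, labels and b_values are only illustrative:

```python
import numpy as np
import pandas as pd

# Nested list of column names -> one-level MultiIndex, as in the question.
# (5 rows here instead of 8760, purely to keep the example small.)
df = pd.DataFrame(columns=[["A", "B", "C"]], index=range(5))
df[("A",)] = np.arange(1, 6)

# Inspect the column index: each label is a 1-tuple, not a plain string.
is_multi = isinstance(df.columns, pd.MultiIndex)
n_levels = df.columns.nlevels
labels = list(df.columns)
print(is_multi, n_levels, labels)

# Using the full tuple key on both sides addresses exactly one column.
df[("B",)] = df[("A",)]
b_values = df[("B",)].tolist()
print(b_values)
```

If is_multi is True even though only one level of names was intended, the constructor was most likely given a nested list by accident.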

If you want a simple index, the code should probably have been written as:

df = pd.DataFrame(columns=["A","B","C"])
df["A"] = np.arange(1, 8761, 1)
df["B"] = df["A"]

output:

>>> df.head()
   A  B    C
0  1  1  NaN
1  2  2  NaN
2  3  3  NaN
3  4  4  NaN
4  5  5  NaN
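
If the frame already exists with the accidental MultiIndex and rebuilding it is inconvenient, one possible fix is to flatten the columns in place. This is only a sketch, again assuming a small 5-row frame; it uses MultiIndex.get_level_values to pull out the single level as a plain Index:

```python
import numpy as np
import pandas as pd

# Frame created with the accidental one-level MultiIndex (5 rows for brevity).
df = pd.DataFrame(columns=[["A", "B", "C"]], index=range(5))

# Replace the MultiIndex with a plain Index holding the labels of level 0.
df.columns = df.columns.get_level_values(0)

# Plain string keys now refer to single columns, so the assignment works.
df["A"] = np.arange(1, 6)
df["B"] = df["A"]
print(df.head())
```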