Pandas: how to use the initially generated column names without renaming them

I was curious whether there is any way to use the column names that Pandas generates initially when reading CSV/text files, as follows:

df = pd.read_csv("some_text_file.txt", header = None)

which will give something like

          0         1         2
0     data1     data2     data3
1  r2 data1  r2 data2  r2 data3

Because we used header = None, Pandas generated the column names 0, 1, 2 by default.

When I try to access them like

-->    df['0'] = sometask

it throws an error:

raise KeyError(key) from err
KeyError: '0'

Aren’t they column names at all? I’ve seen some people refer to them as levels, like

level0 - column 0
level1 - column 1
level2 - column 2 

I’ve also tried

-->    df[level0] = sometask

which threw

NameError: name 'level0' is not defined

I know we can rename the columns and then use them, like

df.columns =['col1','col2'.....]

But I am wondering whether there is any way to use these Pandas-generated column names without renaming them as shown above.

>Solution :

Inside pd.read_csv, you can pass a list to the names parameter. E.g.:

df = pd.read_csv('some_text_file.txt', header=None, 
                 names=[f'col_{i}' for i in range(1,4)])

print(df)

      col_1     col_2     col_3
0     data1     data2     data3
1  r2 data1  r2 data2  r2 data3

Note that the list of names cannot contain any duplicates (e.g. ['col', 'col', 'col2'] will cause an error).
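As a rough illustration of the duplicate-name restriction, the sketch below feeds a list with a repeated name to pd.read_csv and catches the resulting exception. The in-memory text is a hypothetical stand-in for some_text_file.txt, and the exact wording of the error message may differ between pandas versions.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of some_text_file.txt
raw = "data1,data2,data3\nr2 data1,r2 data2,r2 data3\n"

try:
    # 'col' appears twice, so the names list is rejected
    pd.read_csv(io.StringIO(raw), header=None, names=["col", "col", "col2"])
    outcome = "no error"
except ValueError as exc:
    outcome = f"ValueError: {exc}"

print(outcome)
```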


The default column "names" 0, 1, 2, etc. are integers rather than strings. You can check this as follows:

print(df.columns)

Int64Index([0, 1, 2], dtype='int64')

So, to access column 0, use df[0] or df.loc[:, 0], not df['0'].
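A minimal sketch of using the generated labels directly, without renaming anything. The in-memory text is again a hypothetical stand-in for some_text_file.txt, and the upper-casing step is only a placeholder for "sometask". Depending on the pandas version, print(df.columns) may show a different index type than the one quoted above, but the labels should be integers either way.

```python
import io
import pandas as pd

# Hypothetical stand-in for the contents of some_text_file.txt
raw = "data1,data2,data3\nr2 data1,r2 data2,r2 data3\n"
df = pd.read_csv(io.StringIO(raw), header=None)

# The auto-generated labels are integers, not strings
print(df.columns.tolist())   # [0, 1, 2]
print('0' in df.columns)     # False

# Read a column with an integer key, either directly or through .loc
first = df[0]
same = df.loc[:, 0]

# Assigning back to the same integer label also works
# (upper-casing is just a placeholder for "sometask")
df[0] = df[0].str.upper()
print(df)
```

The string '0' fails only because no column carries that label; the integer 0 is the actual column name.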
