Reassigning unique column values to easier names

May 5, 2022

I am parsing a large CSV that looks roughly like this:

time   id        angle
0.0   1_2_3       ...
0.0   ad_42       ...
0.0   34_02_03    ...
0.1   1_2_3       ...
0.1   ad_42       ...
0.1   f_1         ...
....

As you can see, the id field follows no consistent naming scheme, but its values definitely repeat. My goal is to read in the CSV and reassign id values in the order they appear, so that every repeat of an id gets the same new name. It would be nice to write the result back into the dataframe and get output like this:

time   id      angle
0.0   id1       ...
0.0   id2       ...
0.0   id3       ...
0.1   id1       ...
0.1   id2       ...
0.1   id4       ...
....

Here each new id corresponds to one original id but has a more human-readable form (i.e. numbered 1 through x).

Any advice would be greatly appreciated.

>Solution :

You can do:

# unique ids, in order of first appearance
ids = df['id'].unique()
# old id -> 'id1', 'id2', ...
id_dict = {old: f'id{i}' for i, old in enumerate(ids, start=1)}
df['id'] = df['id'].map(id_dict)

Here ids holds the unique id values, and id_dict assigns each unique id a new name of the form id + number. Mapping the dict onto the column then replaces the old values with the new ones.
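To see this end to end, here is a minimal sketch on made-up data. The dataframe below is only a stand-in for the CSV in the question (the angle column is left out), and new_ids is a name introduced just for this illustration.

```python
import pandas as pd

# Hypothetical sample standing in for the question's CSV (angle column omitted)
df = pd.DataFrame({
    "time": [0.0, 0.0, 0.0, 0.1, 0.1, 0.1],
    "id": ["1_2_3", "ad_42", "34_02_03", "1_2_3", "ad_42", "f_1"],
})

# Unique ids in order of first appearance
ids = df["id"].unique()

# Build the lookup: original id -> "id1", "id2", ...
id_dict = {old: f"id{i}" for i, old in enumerate(ids, start=1)}

# Replace every original id with its new name
df["id"] = df["id"].map(id_dict)

new_ids = df["id"].tolist()
print(new_ids)  # ['id1', 'id2', 'id3', 'id1', 'id2', 'id4']
```

Keeping id_dict around is useful if you later need to translate the readable names back to the original ids.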

And note that you don't need to worry about the order of the values: unique() returns them in the order in which they first appear.
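If you want to check the ordering yourself, the sketch below does so on a small made-up series. It also shows pd.factorize as a possible alternative, not something the answer above relies on: it returns integer codes in order of first appearance, which can be turned into the same labels. The names first_seen and labels are introduced only for this illustration.

```python
import pandas as pd

# Hypothetical id column, same shape as in the question
ids = pd.Series(["1_2_3", "ad_42", "34_02_03", "1_2_3", "ad_42", "f_1"])

# unique() keeps first-appearance order (it does not sort)
first_seen = ids.unique().tolist()
print(first_seen)  # ['1_2_3', 'ad_42', '34_02_03', 'f_1']

# factorize() assigns codes 0, 1, 2, ... in the same first-appearance order
codes, uniques = pd.factorize(ids)

# Shift codes by one so the labels start at id1
labels = ["id" + str(code + 1) for code in codes]
print(labels)  # ['id1', 'id2', 'id3', 'id1', 'id2', 'id4']
```

One thing to keep in mind with this variant: factorize marks missing values with the code -1, so a column containing NaN would need separate handling.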