Verify that a column name is a unique identifier

I have a dataset called df_authors, and in that dataset I have a column called author. I have to verify that df_authors.author is a unique identifier.

I tried len(df_authors) == len(df_authors['author'].unique()), and it returns True.

My question is whether I have done this right. I found this line of code online and am not 100% sure it does what I think it does.

My understanding of a unique identifier is that each row in that column has a unique value, and that this line of code checks each row against the rest of the dataset to see whether it is unique.

If someone could tell me whether I'm right or way off here, I'd greatly appreciate it. Thank you.

>Solution :

Your understanding of a unique identifier is correct, but this line of code works a bit differently:

len(df_authors) gives you the number of rows in the DataFrame. len(df_authors['author'].unique()) gives you the number of distinct values in the author column. If the two lengths are the same, no value is repeated, which means that author is unique.
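To see the comparison at work, here is a minimal sketch on two small made-up DataFrames (the sample names and the helper is_unique_identifier are hypothetical, not part of the question). It also shows Series.is_unique, a pandas property that should give the same answer here in a single call.

```python
import pandas as pd

# Hypothetical sample data, for illustration only.
unique_authors = pd.DataFrame({"author": ["Ann", "Bob", "Cy"]})
duplicate_authors = pd.DataFrame({"author": ["Ann", "Bob", "Ann"]})


def is_unique_identifier(df, column):
    """Return True if the number of rows equals the number of distinct values."""
    return len(df) == len(df[column].unique())


# Three rows, three distinct values: the lengths match.
check_unique = is_unique_identifier(unique_authors, "author")

# Three rows, but only two distinct values ("Ann" repeats): the lengths differ.
check_duplicate = is_unique_identifier(duplicate_authors, "author")

# Built-in pandas property that performs the same kind of check.
builtin_unique = unique_authors["author"].is_unique
builtin_duplicate = duplicate_authors["author"].is_unique

print(check_unique, check_duplicate, builtin_unique, builtin_duplicate)
```

Note that the check compares counts over the whole column; it does not tell you which values are repeated.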

You can also leverage pandas more directly by using set_index:

df_with_index = df_authors.set_index('author', verify_integrity=True)

If author contains duplicate values, that statement raises a ValueError (because of verify_integrity=True). If it succeeds, you can also use author as an index, e.g.:

df_with_index.loc[author]
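As a rough illustration of both outcomes, the sketch below uses made-up data (the books column and the sample names are assumptions, not from the question): set_index raises a ValueError when author has duplicates, and allows lookups by author when it does not.

```python
import pandas as pd

# Hypothetical sample data; the "books" column is invented for illustration.
duplicate_authors = pd.DataFrame(
    {"author": ["Ann", "Bob", "Ann"], "books": [3, 1, 2]}
)
unique_authors = pd.DataFrame({"author": ["Ann", "Bob"], "books": [3, 1]})

# With duplicates, verify_integrity=True makes set_index raise a ValueError.
try:
    duplicate_authors.set_index("author", verify_integrity=True)
    failed = False
except ValueError as exc:
    failed = True
    print(f"author is not a unique identifier: {exc}")

# Without duplicates, the call succeeds and author becomes the index.
indexed = unique_authors.set_index("author", verify_integrity=True)

# Look up a single row (and column) by author.
ann_books = indexed.loc["Ann", "books"]
print(ann_books)
```

Catching the exception is optional; letting it propagate is a reasonable choice if a non-unique author column should stop the script.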