How to ignore (or convert) "\N" in a CSV with Pandas?

I have a very large CSV file with many cells that have "\N" as the value. For example:

OBJ_1 OBJ_2 TCA
16908 37152 2019-07-29 01:13:37
37152 16908 2019-07-29 01:13:37
16908 37152 2019-07-29 01:13:37
\N 16908 2019-07-29 01:13:37
19483 23132 \N
22829 \N 2019-07-29 01:13:37

When I run the function to read the file: pd.read_csv("path")

I get the error: ValueError: could not convert string to float: '\\N'

How can I read a CSV file with "\N" values and have them either ignored or replaced with some default value (like zero or undefined)?

>Solution :

According to the pandas docs, read_csv accepts an na_values argument that lists extra strings to treat as missing, so every "\N" cell is read as NaN:

import pandas as pd
df = pd.read_csv("path", na_values="\\N")
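
For anyone who wants a default value instead of NaN, here is a minimal sketch of the same idea on a few rows modelled on the question. The comma delimiter, the in-memory sample and the choice of 0 as the default are assumptions made for illustration, not something the original post specifies.

```python
import io

import pandas as pd

# Sample rows modelled on the question; the comma delimiter is an assumption.
csv_text = (
    "OBJ_1,OBJ_2,TCA\n"
    "16908,37152,2019-07-29 01:13:37\n"
    "\\N,16908,2019-07-29 01:13:37\n"
    "19483,23132,\\N\n"
    "22829,\\N,2019-07-29 01:13:37\n"
)

# na_values marks every "\N" cell as missing while the file is parsed;
# parse_dates asks pandas to convert TCA to datetimes (missing cells become NaT).
df = pd.read_csv(io.StringIO(csv_text), na_values=["\\N"], parse_dates=["TCA"])

# Optional: replace missing object IDs with a default of 0 and restore
# an integer dtype (columns containing NaN are read as float).
df_filled = df.fillna({"OBJ_1": 0, "OBJ_2": 0}).astype(
    {"OBJ_1": "int64", "OBJ_2": "int64"}
)

print(df_filled)
```

Filling with 0 only makes sense if 0 can never be a real object ID. Otherwise it is probably safer to keep the NaN values and drop or filter those rows with dropna.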
