Select rows with exact text in dataframes in Python

I am trying to select the rows whose "Column_A" contains ".2.". My code is as follows.

import pandas as pd

data = {'Column_A':['L.9922070.128.1.020','L.9922080.125.1.001','F.1622002.001.2.001','F.1622002.001.2.001','F.1622002.001.2.001']}
df1 = pd.DataFrame(data)
c = df1[df1["Column_A"].str.contains(".2.")==True]
print(c[["Column_A"]])

However, the output is as follows.

              Column_A
0  L.9922070.128.1.020
1  L.9922080.125.1.001
2  F.1622002.001.2.001
3  F.1622002.001.2.001
4  F.1622002.001.2.001

The output that I want is as follows.

              Column_A
2  F.1622002.001.2.001
3  F.1622002.001.2.001
4  F.1622002.001.2.001

Please help me to find the error. Thank you.

>Solution :

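Because . is a special regex metacharacter (it matches any single character) and str.contains treats the pattern as a regex by default, ".2." matches any "2" that has a character on each side. Use regex=False to match the text literally: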

c = df1[df1["Column_A"].str.contains(".2.", regex=False)]

Or escape it:

c = df1[df1["Column_A"].str.contains(r"\.2\.")]
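
To see why the original filter returned every row, here is a minimal sketch that runs the unescaped, literal, and escaped patterns side by side on the data from the question. The variable names (needle, unescaped, literal, escaped) are illustrative, and re.escape is an addition that is not part of the answer above; it is one way to escape a search string without writing the backslashes by hand.

```python
import re

import pandas as pd

data = {'Column_A': ['L.9922070.128.1.020',
                     'L.9922080.125.1.001',
                     'F.1622002.001.2.001',
                     'F.1622002.001.2.001',
                     'F.1622002.001.2.001']}
df1 = pd.DataFrame(data)

needle = ".2."  # the text we want to find literally

# Default regex=True: each "." matches any character, so "922" already matches.
unescaped = df1[df1["Column_A"].str.contains(needle)]

# Literal substring match, no regex involved.
literal = df1[df1["Column_A"].str.contains(needle, regex=False)]

# Regex match with the dots escaped; re.escape builds the pattern \.2\.
escaped = df1[df1["Column_A"].str.contains(re.escape(needle))]

print(len(unescaped))       # number of rows matched by the unescaped pattern
print(literal[["Column_A"]])
```

With this data, the literal and escaped filters should both keep only rows 2, 3 and 4, while the unescaped pattern keeps all five rows. If the search text is fixed and contains no regex features, regex=False is probably the simpler choice.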