To check whether a value starts with punctuation, I thought of combining string.punctuation with str.startswith. But when I do
df['start_with_punct']=df['Name'].str.startswith(string.punctuation)
I get False when the names actually start with punctuation.
Example of data is
Name
_faerrar_
!gfaherr_!£
nafjetes_
Expected output
Name start_with_punct
_faerrar_ True
!gfaherr_!£ True
nafjetes_ False
I would like to understand how to get the right output, as I also need to run a similar test for names starting with a capital letter.
>Solution :
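Pass a tuple to Series.str.startswith to test several prefixes at once. A plain string such as string.punctuation is treated as one single prefix, which is why every row came back False: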
df['start_with_punct']=df['Name'].str.startswith(tuple(string.punctuation))
print(df)
Name start_with_punct
0 _faerrar_ True
1 !gfaherr_!£ True
2 nafjetes_ False
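To see why the original call returned False, it may help to look at plain Python's str.startswith, which treats its argument the same way. The snippet below is only an illustrative sketch using one name from the question; it does not touch pandas at all.

```python
import string

name = "_faerrar_"

# A single string argument is ONE prefix: the name would have to begin
# with the entire punctuation string, character for character.
print(name.startswith(string.punctuation))         # False

# A tuple is a collection of alternative prefixes: any one may match.
print(name.startswith(tuple(string.punctuation)))  # True

# string.punctuation is ASCII-only, so a symbol such as "£" is not in it.
print("£" in string.punctuation)                   # False
```

Note the last line: a name that starts with a non-ASCII symbol such as £ would not be flagged by this check.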
To test whether the first character is uppercase, select it with str[0] and call Series.str.isupper:
df['start_with_upper']=df['Name'].str[0].str.isupper()
print(df)
Name start_with_upper
0 Aaerrar_ True
1 dgfaherr_! False
2 Nafjetes_ True
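Both checks can also be written against the first character only, which may be handy if your pandas version does not accept a tuple in str.startswith (I believe tuple support arrived around pandas 1.5, but treat that as an assumption). This is an alternative sketch, not the method from the answer above; the sample frame is made up from names in this thread.

```python
import string
import pandas as pd

# Sample data assembled from the names used in the question and answer.
df = pd.DataFrame({"Name": ["_faerrar_", "!gfaherr_!£", "nafjetes_", "Aaerrar_"]})

# Take the first character of every name once and reuse it.
first_char = df["Name"].str[0]

# Membership test against the individual punctuation characters.
df["start_with_punct"] = first_char.isin(list(string.punctuation))

# isupper() is False for characters without case, such as "_" or "!".
df["start_with_upper"] = first_char.str.isupper()

print(df)
```

With this data the first two names are flagged as starting with punctuation and only Aaerrar_ as starting with an uppercase letter.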