I want to use str.extract to capture all the characters in a DataFrame cell that match a regular expression and write them to another column. Specifically, I want every letter except W, with no digits and no dashes. When I run the code below, it captures only the first matching character.
Here’s the code:
df["type"] = df["callsign"].str.extract(r'([^W0-9-])')
Currently the DataFrame shows this result:
| callsign | type |
|---|---|
| 1AB3-W9 | A |
| 23DC-W0 | D |
But I need it to produce:
| callsign | type |
|---|---|
| 1AB3-W9 | AB |
| 23DC-W0 | DC |
>Solution :
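Assuming you want the letters that precede the -W: note that ([a-zA-Z]+)-W only matches when the letters sit directly before the dash, so it returns NaN for 1AB3-W9, where a 3 sits in between. Allowing optional digits between the letters and the dash handles both sample rows:
df["type"] = df["callsign"].str.extract(r'([a-zA-Z]+)\d*-W')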
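For the first run of characters that are not W, a digit, or a dash, you’re missing a +. Without it, the character class matches exactly one character, which is why only the first letter comes back. Keep in mind that this class accepts any character outside the excluded set, not only letters:
df["type"] = df["callsign"].str.extract(r'([^W0-9-]+)')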
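One thing to watch: extract only returns the first match, so if the wanted letters can be split up by digits, you would still lose some of them. Here is a minimal sketch that compares the extract pattern with a different technique, str.findall followed by str.join, which collects every matching character in the cell. The third callsign (A1B-W2) is a made-up row added only to show the difference, and the column names first_run and all_chars are illustrative, not part of the original question.

```python
import pandas as pd

# First two rows are from the question; "A1B-W2" is a hypothetical row
# added to show letters separated by a digit.
df = pd.DataFrame({"callsign": ["1AB3-W9", "23DC-W0", "A1B-W2"]})

# First contiguous run of characters that are not W, a digit, or a dash.
# expand=False returns a Series instead of a one-column DataFrame.
first_run = df["callsign"].str.extract(r"([^W0-9-]+)", expand=False)

# Every such character anywhere in the cell: findall returns a list of
# matches per row, and str.join concatenates each list into one string.
all_chars = df["callsign"].str.findall(r"[^W0-9-]").str.join("")

df["first_run"] = first_run
df["all_chars"] = all_chars
print(df)
```

For the two rows in the question both columns agree (AB and DC). They differ only on the made-up row, where extract stops at the first run and findall plus join keeps going.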