Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Remove Specific characters in dataframe

I have a dataframe as below

Example 1
Some data frame always has 9 characters in tagname column before Last full stop ‘.’

| id | tagname                         | datetime                |
|----|---------------------------------|-------------------------|
| 1  | A2.WAM_A.ACCESS_CLOSE_FAULT_LH  | 2022-01-20 15:07:36.310 |
| 2  | A2.WAM_A.ACCESS_SENSOR_FAULT_RH | 2022-01-20 15:07:36.310 |
| 3  | A2.WAM_A.OUTPUT_POWER_CP_FAULT  | 2022-01-20 15:07:36.310 |

If I use,

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

df['tagname'] = df['tagname'].str[9:]

Output: –

| id | tagname                | datetime                |
|----|------------------------|-------------------------|
| 1  | ACCESS_CLOSE_FAULT_LH  | 2022-01-20 15:07:36.310 |
| 2  | ACCESS_SENSOR_FAULT_RH | 2022-01-20 15:07:36.310 |
| 3  | OUTPUT_POWER_CP_FAULT  | 2022-01-20 15:07:36.310 |

Example 2
But in some tables I have diff length characters & multiple dot’s in tagname before Last full stop ‘.’, like below

| id | tagname                                      | datetime                |
|----|----------------------------------------------|-------------------------|
| 1  | A2.AC.CH.CONDITION.1ST_VACUUM_CH             | 2021-09-28 17:31:48.191 |
| 2  | A2.AC.CH.CONDITION.SMALL_LEAK_TEST_VACUUM_CH | 2021-09-28 17:31:48.193 |
| 3  | A2.AC.CH.CONDITION.VACC_VALUE_CH_FLOAT_R270  | 2021-09-28 17:31:48.196 |
| 4  | A2.CP01.PRL2_TRIM1.ONCHANGE.PRL2             | 2021-09-28 17:31:48.199 |
| 5  | AY2.CP01.DL5_TRIM1.ONCHANGE.DL5              | 2021-09-28 17:31:48.199 |

Requirement: Output
Remove all characters in column tagname before that last dot ‘.’ in tagname column

| id | tagname                   | datetime                |
|----|---------------------------|-------------------------|
| 1  | 1ST_VACUUM_CH             | 2021-09-28 17:31:48.191 |
| 2  | SMALL_LEAK_TEST_VACUUM_CH | 2021-09-28 17:31:48.193 |
| 3  | VACC_VALUE_CH_FLOAT_R270  | 2021-09-28 17:31:48.196 |
| 4  | PRL2                      | 2021-09-28 17:31:48.199 |
| 5  | DL5                       | 2021-09-28 17:31:48.199 |

>Solution :

Use str.rsplit:

Example 1:

df1['tagname'] = df1['tagname'].str.rsplit('.', 1).str[1]
print(df1)

# Output
   id                 tagname                 datetime
0   1   ACCESS_CLOSE_FAULT_LH  2022-01-20 15:07:36.310
1   2  ACCESS_SENSOR_FAULT_RH  2022-01-20 15:07:36.310
2   3   OUTPUT_POWER_CP_FAULT  2022-01-20 15:07:36.310

Example 2:

df2['tagname'] = df2['tagname'].str.rsplit('.', 1).str[1]
print(df2)

# Output
   id                    tagname                 datetime
0   1              1ST_VACUUM_CH  2021-09-28 17:31:48.191
1   2  SMALL_LEAK_TEST_VACUUM_CH  2021-09-28 17:31:48.193
2   3   VACC_VALUE_CH_FLOAT_R270  2021-09-28 17:31:48.196
3   4                       PRL2  2021-09-28 17:31:48.199
4   5                        DL5  2021-09-28 17:31:48.199
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading