Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

pandas groupby middle heading

How to group this data with pandas by heading in column sl. no
I have data like below in CSV or excel.

sl. No v1 v2 v3
Heading1
1 243 45 3244
2 3 25 33
3 43 324 54
Head2
1 3 45 54
2 24 4 42

Here is the dummy csv file

sl. No,v1,v2,v3,v4
Heading1,,,,
1,243,45,3244,
2,3,25,33,
3,43,324,54,
Head2,,,,
1,3,45,54,
2,24,4,42,

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

>Solution :

By pattern

To group by string/pattern match, you can use a string operation (str.startswith/str.contains/…) and cumsum to increment a group number:

group = df['sl. No'].str.startswith('Head').cumsum()

Or, to get the original names combine with where and ffill:

group = df['sl. No'].where(df['sl. No'].str.startswith('Head')).ffill()

By "type"

To identify the headings as non-integers, use a variant with pandas.to_numeric:

group = df['sl. No'].where(pd.to_numeric(df['sl. No'], errors='coerce').isna()).ffill()

To use the fact the other columns are empty/NaN, isna/isnull/eq+cumsum:

group = df['v1'].isna().cumsum()
# or for empty strings
# group = df['v1'].eq('').cumsum()

Or:

group = df['sl. No'].where(df['v1'].isna()).ffill()
# or for empty strings
# group = df['sl. No'].where(df['v1'].eq('')).ffill()

Demo:

df.assign(group1 = df['sl. No'].str.startswith('Head').cumsum(),
          group2 = df['sl. No'].where(df['sl. No'].str.startswith('Head')).ffill(),
          group3 = df['sl. No'].where(pd.to_numeric(df['sl. No'], errors='coerce').isna()).ffill(),
          group4 = df['v1'].isna().cumsum(),
          group5 = df['sl. No'].where(df['v1'].isna()).ffill(),
         )

     sl. No     v1     v2      v3  v4  group1    group2    group3  group4    group5
0  Heading1    NaN    NaN     NaN NaN       1  Heading1  Heading1       1  Heading1
1         1  243.0   45.0  3244.0 NaN       1  Heading1  Heading1       1  Heading1
2         2    3.0   25.0    33.0 NaN       1  Heading1  Heading1       1  Heading1
3         3   43.0  324.0    54.0 NaN       1  Heading1  Heading1       1  Heading1
4     Head2    NaN    NaN     NaN NaN       2     Head2     Head2       2     Head2
5         1    3.0   45.0    54.0 NaN       2     Head2     Head2       2     Head2
6         2   24.0    4.0    42.0 NaN       2     Head2     Head2       2     Head2
Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading