Expand a string from a column into different separate columns in Pandas

I have a df in the following form:

id   sid      steps
A     1       step1
A     1    step1-step2
A     1  step1-step2-step3

This contains data of how a user A went through a certain series of pages (steps) in a given session (sid). I want to take these dash delimitated steps and create individual columns for every step.

Result:

id     sid      steps       page_step1 page_step2  page_step3
  A     1       step1         step1        NA           NA
  A     1    step1-step2      step1      step2          NA
  A     1  step1-step2-step3  step1      step2        step3

I don’t know exactly how many steps so I was hoping that they’re created dynamically. Been stuck on this all week, thanks!

>Solution :

Use str.split:

>>> df.join(df["steps"].str.split("-",expand=True).add_prefix("page_step"))

 id  sid              steps page_step0 page_step1 page_step2
0  A    1              step1      step1       None       None
1  A    1        step1-step2      step1      step2       None
2  A    1  step1-step2-step3      step1      step2      step3

If the numbering must start from 1 instead of 0:

steps = df["steps"].str.split("-",expand=True)
output = df.join(steps.rename(columns={i: f"page_setup{i+1}" for i in steps.columns}))

>>> output
  id  sid              steps page_setup1 page_setup2 page_setup3
0  A    1              step1       step1        None        None
1  A    1        step1-step2       step1       step2        None
2  A    1  step1-step2-step3       step1       step2       step3

Leave a Reply