How to transform an R dataframe to rows of indicator values

February 17, 2022

I have a data frame:

A 2
B 4
C 3

and I would like to make a data frame with the following rows:

A 0
A 1
B 0
B 0
B 0
B 1
C 0
C 0
C 1

So for B, I want to make 4 rows and each one is 0 except for the last one which is 1. Similarly, for A, I’ll have 2 rows and the first one has a 0 and the second one has a 1.

In general, if I have a row in the original table with X n, I want to return n rows in the new table with n-1 of them being X 0 and the final one as X 1.

Is there a way to do this in R? Or Python or SQL?

Solution:

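In R, we can use uncount from tidyr to repeat each row as many times as the second column says, then overwrite that column with a binary indicator. !duplicated(v1, fromLast = TRUE) is TRUE only for the last row of each v1 value, and the unary + converts that logical to 0/1. This assumes each value of v1 appears in only one row of the input.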

library(tidyr)
library(dplyr)
df1 %>% 
 uncount(v2) %>%
 mutate(v2 = +(!duplicated(v1, fromLast = TRUE)))


Or in Python

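import pandas as pd
df1 = pd.DataFrame({"v1": ["A", "B", "C"], "v2": [2, 4, 3]})
# repeat each row v2 times
df2 = df1.reindex(df1.index.repeat(df1.v2))
# 1 for the last row of each v1 group, 0 otherwise; check duplicates on v1
# (the group label), since two groups may share the same count in v2
df2['v2'] = (~df2.duplicated(subset = ['v1'], keep = "last")) + 0
df2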
   v1   v2
0   A   0
0   A   1
1   B   0
1   B   0
1   B   0
1   B   1
2   C   0
2   C   0
2   C   1
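
Both versions above detect the last row of a group by its label, so they assume each label occurs in only one input row. If a label can repeat (say, two separate "A" rows), one option is to flag the last copy by the repeated index instead. This is only a sketch under the assumption that df1 has a unique index (the default RangeIndex does); the sample data here is made up to show the repeated label.

```python
import pandas as pd

# Hypothetical input where the label "A" occurs in two separate rows.
df1 = pd.DataFrame({"v1": ["A", "B", "A"], "v2": [2, 3, 1]})

# Repeat each row v2 times; the repeated index remembers the source row.
df2 = df1.loc[df1.index.repeat(df1["v2"])].copy()

# Flag the last copy of each source row by index position, not by label.
df2["v2"] = (~df2.index.duplicated(keep="last")).astype(int)

# Drop the repeated index once it is no longer needed.
df2 = df2.reset_index(drop=True)
print(df2)
```

Rows with v2 equal to 0 simply disappear from the result, which matches the "n rows for X n" rule in the question.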

data

df1 <- structure(list(v1 = c("A", "B", "C"), v2 = c(2L, 4L, 3L)),
  class = "data.frame", row.names = c(NA, -3L))
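
The question also asks about SQL, which the answer does not cover. One possible approach is a recursive common table expression that counts from 1 up to v2 for each row and marks the final step with 1. The sketch below runs it through Python's built-in sqlite3 module against an in-memory table; the table name df1 and the CTE name expanded are chosen for illustration, and other databases may need slightly different syntax.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE df1 (v1 TEXT, v2 INTEGER)")
con.executemany("INSERT INTO df1 VALUES (?, ?)", [("A", 2), ("B", 4), ("C", 3)])

query = """
WITH RECURSIVE expanded(v1, n, i) AS (
    -- start one counter per input row (rows with n < 1 produce nothing)
    SELECT v1, v2, 1 FROM df1 WHERE v2 >= 1
    UNION ALL
    -- keep counting until the counter reaches n
    SELECT v1, n, i + 1 FROM expanded WHERE i < n
)
SELECT v1, CASE WHEN i = n THEN 1 ELSE 0 END AS v2
FROM expanded
ORDER BY v1, i
"""

rows = con.execute(query).fetchall()
for row in rows:
    print(row)
con.close()
```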

by MR
