Pandas data frame – Group a column values then Randomize new values of that column

January 2, 2023

I have one column (X) that contains duplicated values: several consecutive rows share the same value.
I need to randomize the values in that column to test an issue, so I tried:

np.random.seed(RSEED)
df["X"] = np.random.randint(100, 500, df.shape[0])

But this is not enough, because I need to keep the sequences. That is, I want to group the rows by their original value, assign one new random number to all rows of that group, and do this for every distinct value in the original column. For example:

X	new X (randomized)
210	500
210	500
.	.
.	.
340	100
340	100
.	.
.	.

I started by looking for something built into pandas. I can group with pandas.DataFrame.groupby, but I couldn't find a pandas.DataFrame.random that can be applied per group.

Solution:

A simple approach is to use groupby and transform to broadcast one random integer per group:

df.groupby('X')['X'].transform(lambda _: np.random.randint(100, 500))

0    137
1    137
2    .
3    .
4    335
5    335
Name: X, dtype: int64
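For anyone who wants to try this end to end, here is a minimal self-contained sketch of the same groupby/transform idea. The sample data, the seed value 42, and the new_X column name are assumptions made for illustration; they are not from the original post.

```python
import numpy as np
import pandas as pd

RSEED = 42  # assumed seed value; the original post does not state one

# Hypothetical sample data: runs of repeated values in column X.
df = pd.DataFrame({"X": [210, 210, 210, 340, 340, 125, 125]})

np.random.seed(RSEED)

# The lambda is called once per group and returns a single integer in
# [100, 500); transform broadcasts that integer to every row of the group.
df["new_X"] = df.groupby("X")["X"].transform(
    lambda _: np.random.randint(100, 500)
)

# Sanity check: every original value maps to exactly one new value.
one_value_per_group = df.groupby("X")["new_X"].nunique().eq(1).all()

print(df)
print(one_value_per_group)
```

Writing the result to a separate column keeps the original values available for comparison; assigning to df["X"] instead would overwrite them in place. Note that two different groups can draw the same number by chance, and that groupby("X") treats every row with the same value as one group even if those rows are not adjacent.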

pandas

by MR

Published January 02, 2023
