Assigning random value for categories in pandas

I have a df

Name        Week
Google      1
Google      1
Amazon      1
Tesla       1
Tesla       1
Google      2
Google      2
Tesla       2
Tesla       2
Uber        3
Uber        3

I am trying to create a new column, Value, holding a random integer between x and y for each combination of Name and Week, like so:

Name        Week        Value
Google      1           100
Google      1           100
Amazon      1           150
Tesla       1           170
Tesla       1           170
Google      2           250
Google      2           250
Tesla       2           157
Tesla       2           157
Uber        3           500
Uber        3           500

The same value should be assigned to every row that shares a combination of Name and Week.

I tried:

import itertools
import numpy as np

def random_group_int(df_):
    # every possible (Name, Week) pair, including pairs that never occur in df_
    combinations = list(itertools.product(df_.Name.unique(), df_.Week.unique()))

    # one random integer in [100, 200) per pair
    rand_values_dict_by_combination = {combination: np.random.randint(100, 200) for combination in combinations}

    # return value by the combination on the line
    # don't know how to do that

And I feel like this is not the best approach. I also tried:

# one row per (Name, Week) pair, then one random integer in [100, 200) per row
df_rand = df.groupby(['Name', 'Week']).size().reset_index(name='Value')
df_rand['Value'] = np.random.randint(100, 200, size=len(df_rand))
# attach the per-pair value back onto every matching row
df = df.merge(df_rand, on=['Name', 'Week'], how='left')

Which does work but again, I am not sure if that’s the approach I should be using.

Solution:

You can use GroupBy.transform and generate a random value in the transform:

import random
x, y = 100, 200
df['Value'] = (df.groupby(['Name', 'Week'])['Name'] # the column doesn't matter
                 .transform(lambda _: random.randint(x, y))
               )

Example output (the values are random, so they will differ between runs):

      Name  Week  Value
0   Google     1    153
1   Google     1    153
2   Amazon     1    196
3    Tesla     1    198
4    Tesla     1    198
5   Google     2    122
6   Google     2    122
7    Tesla     2    180
8    Tesla     2    180
9     Uber     3    106
10    Uber     3    106
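
A vectorized variant of the same idea, as a sketch rather than the answer's own method: label each row with its group number via GroupBy.ngroup, draw one random integer per group with NumPy, and index the draws by group number. The seed value and the names rng, group_ids and group_values are assumptions introduced here for illustration; the frame is rebuilt from the question's data so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Name": ["Google", "Google", "Amazon", "Tesla", "Tesla",
             "Google", "Google", "Tesla", "Tesla", "Uber", "Uber"],
    "Week": [1, 1, 1, 1, 1, 2, 2, 2, 2, 3, 3],
})

x, y = 100, 200
rng = np.random.default_rng(0)  # arbitrary seed (an assumption), only for repeatable runs

# ngroup() gives each row the integer id (0 .. n_groups - 1) of its (Name, Week) group.
group_ids = df.groupby(["Name", "Week"]).ngroup()

# One draw per group; endpoint=True makes y inclusive, matching random.randint(x, y).
group_values = rng.integers(x, y, size=group_ids.max() + 1, endpoint=True)

# Look up each row's value by its group id.
df["Value"] = group_values[group_ids.to_numpy()]
```

This avoids calling a Python lambda once per group, which may matter on frames with many groups. Two different groups can still receive the same value by chance, exactly as with the transform version.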