Creating a Pandas DataFrame as a cross-product between family x city x member

Sorry if this seems like a simple question, but I am new to Python.
I would like to create a DataFrame containing 10 family names and 10 cities of birth and, for each family name-city of birth pair, 3 members of that family, each with a "name" that is a random string of up to 8 characters.
How can I create such a DataFrame?
I don't really know how to reuse the same family name-city of birth pair for more than one "member" value.

Solution:

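There are a few ways to go about this, but here's a simple one that's easy to follow (with 5 values instead of the required 10, but you get the idea). Note that it pairs each family name with one city by position, rather than forming every name-city combination, and gives each pair 3 members: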

import random
import string

import pandas as pd

cities = ["New York", "London", "Paris", "Beijing", "Casablanca"]
names = ["Smith", "Heston", "Dupont", "Torvalds", "Clooney"]

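# Build one row per member: each family (paired with its city by position)
# gets three members with a random 8-letter lowercase first name.
df = pd.DataFrame(
    [
        {
            "city": city,
            "family_name": name,
            "first_name": "".join(random.choices(string.ascii_lowercase, k=8)),
        }
        for city, name in zip(cities, names)
        for _ in range(3)
    ]
)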

print(df)
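
If what you actually want is the full cross-product from the title (every family name combined with every city, 3 members each), one possible variation is to iterate over `itertools.product` instead of pairing by position. This is only a sketch: the helper names `random_name` and `build_members` are made up for illustration, and it assumes "up to 8 characters" means a random length between 1 and 8.

```python
import itertools
import random
import string

import pandas as pd


def random_name(max_len=8):
    """Return a random lowercase string of 1 to max_len characters (assumed reading of 'up to 8')."""
    length = random.randint(1, max_len)
    return "".join(random.choices(string.ascii_lowercase, k=length))


def build_members(names, cities, members_per_pair=3):
    """Return one row per (family name, city, member) combination."""
    rows = [
        {"family_name": name, "city": city, "first_name": random_name()}
        # itertools.product yields every (name, city) pair, not just matching positions
        for name, city in itertools.product(names, cities)
        for _ in range(members_per_pair)
    ]
    return pd.DataFrame(rows)


cities = ["New York", "London", "Paris", "Beijing", "Casablanca"]
names = ["Smith", "Heston", "Dupont", "Torvalds", "Clooney"]

full_df = build_members(names, cities)
print(full_df.shape)  # (75, 3): 5 names x 5 cities x 3 members
```

With 10 names and 10 cities, the same function would produce 300 rows.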