Randomly draw a sample for 2 columns

A well-known function for this in Python is random.sample().

However, my dataset consists of multiple columns, and I need the 'lat' and 'lng' coordinates to be sampled. As these two are related, I cannot call random.sample() on each column separately: that would pair random lat values with lng values that do not correspond to them.

What would be the most elegant solution for this?

Perhaps by first creating a third column in which I combine lat and lng,
then sampling from it,
then splitting it again?

If so, how should I do this? The fact that both lat and lng values are floats with different lengths doesn't make it easier. Probably by adding a '-' in between?

Solution:

Essentially, you’re talking about sampling an entire row which has values [lat_i, lng_i]. This leads to a very simple (but perhaps too verbose) solution:

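import random
# Pick one row index uniformly at random, then take that whole row so lat and lng stay paired.
random_row_index = random.randint(0, number_of_rows_in_dataset - 1)
random_row = dataset[random_row_index, :]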
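
If more than one row is needed, the same idea can be sketched with random.sample() from the question by sampling (lat, lng) pairs instead of individual values. The lists `lats` and `lngs` and their values are made up for illustration and stand in for the two columns of the real dataset.

```python
import random

# Hypothetical coordinates; in practice these come from the dataset's columns.
lats = [52.37, 48.85, 51.51, 40.71, 35.68]
lngs = [4.90, 2.35, -0.13, -74.01, 139.69]

# Pair each lat with its lng so that a pair is always sampled as one unit.
pairs = list(zip(lats, lngs))

# Draw 3 distinct pairs (sampling without replacement).
sampled_pairs = random.sample(pairs, k=3)

# Split back into two aligned tuples if separate columns are needed.
sampled_lats, sampled_lngs = zip(*sampled_pairs)
```

Because the pairs are tuples, there is no need to merge the floats into a string with a separator and split them again afterwards.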

If you have a pandas DataFrame, simply use DataFrame.sample, which samples whole rows and so keeps each lat with its lng.
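
A minimal sketch of the pandas route, assuming the data is in a DataFrame with columns named 'lat' and 'lng' as in the question. The frame `df`, its `name` column, and all values are hypothetical.

```python
import pandas as pd

# Hypothetical dataset with an extra column besides the coordinates.
df = pd.DataFrame({
    "name": ["a", "b", "c", "d", "e"],
    "lat": [52.37, 48.85, 51.51, 40.71, 35.68],
    "lng": [4.90, 2.35, -0.13, -74.01, 139.69],
})

# Select the two coordinate columns, then sample 3 whole rows.
# random_state is optional; it only makes the draw reproducible.
sampled = df[["lat", "lng"]].sample(n=3, random_state=0)
```

The sampled rows keep their original index labels, so they can still be matched back to the other columns of `df` if needed.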
