I faced this problem while I was doing stratified sampling of the dataset

March 30, 2022

In [16] : Strat_d3=d3.groupby('Label', group_keys=False).apply(lambda x: x.sample(1000))
Traceback (most recent call last):

  File "<ipython-input-16-f54910ba8f95>", line 1, in <module>
    Strat_d3=d3.groupby('Label', group_keys=False).apply(lambda x: x.sample(1000))

  File "C:\Users\Msi\anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 894, in apply
    result = self._python_apply_general(f, self._selected_obj)

  File "C:\Users\Msi\anaconda3\lib\site-packages\pandas\core\groupby\groupby.py", line 928, in _python_apply_general
    keys, values, mutated = self.grouper.apply(f, data, self.axis)

  File "C:\Users\Msi\anaconda3\lib\site-packages\pandas\core\groupby\ops.py", line 238, in apply
    res = f(group)

  File "<ipython-input-16-f54910ba8f95>", line 1, in <lambda>
    Strat_d3=d3.groupby('Label', group_keys=False).apply(lambda x: x.sample(1000))

  File "C:\Users\Msi\anaconda3\lib\site-packages\pandas\core\generic.py", line 5350, in sample
    locs = rs.choice(axis_length, size=n, replace=replace, p=weights)

  File "mtrand.pyx", line 959, in numpy.random.mtrand.RandomState.choice

ValueError: Cannot take a larger sample than population when 'replace=False'

Solution:

The message means that at least one group has fewer than 1000 rows, so 1000 rows cannot be drawn from it without replacement.
There are two solutions:

Use replace=True to get 1000 samples per group, accepting some duplicates:

# You don't need apply here
Strat_d3 = d3.groupby('Label', group_keys=False).sample(1000, replace=True)
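To see what replace=True does to an undersized group, here is a minimal sketch on made-up data. The toy frame, the target size n = 4 (standing in for 1000), and the random_state value are assumptions for illustration only, not part of the original question:

```python
import pandas as pd

# Hypothetical toy data: label "a" has 5 rows, label "b" has only 2.
d3 = pd.DataFrame({"Label": ["a"] * 5 + ["b"] * 2, "value": range(7)})

# Ask for 4 rows per label (a stand-in for 1000).
# Label "b" can only reach 4 rows by repeating some of its 2 rows.
n = 4
strat = d3.groupby("Label", group_keys=False).sample(n, replace=True, random_state=0)

# Every label now has exactly n rows ...
counts = strat["Label"].value_counts()
# ... but some original rows appear more than once.
has_duplicates = bool(strat.index.duplicated().any())
```

Each label ends up with exactly n rows, and the small group necessarily contains repeated rows, which may or may not be acceptable depending on what the sample is used for.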

Use this trick if you accept that some groups will have fewer than 1000 samples:

Strat_d3 = d3.groupby('Label', group_keys=False).apply(lambda x: x.sample(min(len(x), 1000)))
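The same capping idea can be sketched without apply, by sampling each group in a loop and concatenating the pieces. This is only an alternative formulation under assumed toy data (the frame, n = 4 in place of 1000, and random_state are illustrative); it does not depend on how a given pandas version treats the grouping column inside apply:

```python
import pandas as pd

# Hypothetical toy data: label "a" has 5 rows, label "b" has only 2.
d3 = pd.DataFrame({"Label": ["a"] * 5 + ["b"] * 2, "value": range(7)})

n = 4  # stand-in for 1000

# Sample at most n rows from each group, without replacement.
# Groups smaller than n are returned whole (in shuffled order).
parts = [
    group.sample(min(len(group), n), random_state=0)
    for _, group in d3.groupby("Label")
]
strat = pd.concat(parts)

counts = strat["Label"].value_counts()
```

Here label "a" contributes n rows and label "b" contributes all of its rows, so no row is duplicated but the groups are no longer the same size.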

To debug your groups, use the following code to list the labels whose number of samples is below 1000:

d3.value_counts('Label').loc[lambda x: x < 1000]
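A minimal sketch of the same check, written with Series.value_counts and a boolean mask. The toy frame and the threshold n = 4 (in place of 1000) are assumptions for illustration:

```python
import pandas as pd

# Hypothetical toy data: label "a" has 5 rows, label "b" has only 2.
d3 = pd.DataFrame({"Label": ["a"] * 5 + ["b"] * 2, "value": range(7)})

n = 4  # stand-in for 1000

# Count rows per label, then keep only the labels with fewer than n rows.
counts = d3["Label"].value_counts()
too_small = counts[counts < n]
```

Any label that shows up in too_small is one that would trigger the ValueError when sampling n rows without replacement.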

dataset

byMR
