python random.choice TypeError: '(array([ 109, 1280, 427, 531, 1563, 102, 1774, 802, 560, 0]), slice(None, None, None))' is an invalid key

I'm trying to read in an Excel document and then select x rows at random, without replacement. I get the error below when I run the code and would appreciate some guidance. I'm writing a Jupyter Notebook in VS Code.

#import libraries.
import os
import subprocess
import sys
import pandas as pd
import numpy as np
import tkinter as tk

#allow user to browse for specific excel file
from tkinter import filedialog
root = tk.Tk()
root.withdraw()
file_path = filedialog.askopenfilename()
sizeOfSample = 10

#read in excel as dataframe after user selects file in explorer
df = pd.read_excel (file_path)

#select random rows from df to display.
number_of_rows = df.shape[0]
random_indices = np.random.choice(number_of_rows, size=sizeOfSample, replace=False)
random_rows = df[random_indices, :]

print (random_rows)

This is the output I’m getting.

---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
~\AppData\Local\Temp/ipykernel_1716/1509119795.py in <module>
     21 number_of_rows = initArr.shape[0]
     22 random_indices = np.random.choice(number_of_rows, size=sizeOfSample, replace=False)
---> 23 random_rows = initArr[random_indices, :]
     24 
     25 print (random_rows)

C:\Python39\lib\site-packages\pandas\core\frame.py in __getitem__(self, key)
   3456             if self.columns.nlevels > 1:
   3457                 return self._getitem_multilevel(key)
-> 3458             indexer = self.columns.get_loc(key)
   3459             if is_integer(indexer):
   3460                 indexer = [indexer]

C:\Python39\lib\site-packages\pandas\core\indexes\base.py in get_loc(self, key, method, tolerance)
   3359             casted_key = self._maybe_cast_indexer(key)
   3360             try:
-> 3361                 return self._engine.get_loc(casted_key)
   3362             except KeyError as err:
   3363                 raise KeyError(key) from err

C:\Python39\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

C:\Python39\lib\site-packages\pandas\_libs\index.pyx in pandas._libs.index.IndexEngine.get_loc()

TypeError: '(array([ 109, 1280,  427,  531, 1563,  102, 1774,  802,  560,    0]), slice(None, None, None))' is an invalid key

Solution:

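Replace:

random_rows = df[random_indices, :]

With:

random_rows = df.iloc[random_indices, :]

Plain df[...] looks up column labels, so the tuple (random_indices, :) is rejected as an invalid key. The values returned by np.random.choice are row positions, which is what .iloc selects by. df.loc[random_indices, :] also works here, but only because read_excel gives the frame a default 0..n-1 index, so labels and positions coincide.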

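Alternatively, let pandas do the sampling directly. Since you want rows without replacement, use replace=False (which is also the default):

random_rows = df.sample(n=sizeOfSample, replace=False)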
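
To check both approaches without an Excel file, here is a minimal sketch on a made-up DataFrame. The column name, the index values, the seeds, and the variable names are assumptions for illustration only; the index is deliberately not 0..n-1 to show that .iloc selects by position regardless of the labels.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the Excel data (not the asker's file).
# The index starts at 1000 so that labels and positions differ.
df = pd.DataFrame({"value": range(100, 120)}, index=range(1000, 1020))
size_of_sample = 10

# Approach 1: draw unique row positions, then select them with .iloc.
rng = np.random.default_rng(seed=0)
positions = rng.choice(len(df), size=size_of_sample, replace=False)
rows_by_position = df.iloc[positions, :]

# Approach 2: let pandas sample the rows; replace=False avoids duplicates.
rows_by_sample = df.sample(n=size_of_sample, replace=False, random_state=0)

print(rows_by_position)
print(rows_by_sample)
```

Both results hold 10 distinct rows of the original frame. With this non-default index, df.loc[positions, :] would raise a KeyError, because the positions 0..19 are not labels in the index.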