I am trying to split a dataset into two separate ones at the minimum point of its first column. I used idxmin to find the location of the minimum entry, then iloc to slice the DataFrame from 0 to that point.
The error I encounter is:
TypeError: cannot do positional indexing on RangeIndex with these indexers [1 96
dtype: int64] of type Series
An example dataset is as shown:
x y
0 1.000000 6
1 1.000000 2
2 0.999999 5
3 0.999996 3
4 0.999986 4
.. ... ...
196 0.999987 3
197 0.999996 3
198 0.999999 2
199 1.000000 1
200 1.000000 4
The x column starts at 1, decreases to a minimum near zero, then increases back to 1. I want to find the smallest x and its corresponding y, and split the data at that point.
This is the current code I have written:
data = pd.DataFrame(data)
minimum = pd.DataFrame.idxmin(data)
lower_surface = data.iloc[:minimum]
I understand that the variable minimum will return a location in the DataFrame, and hence I thought I could use iloc to separate the array from the beginning to the minimum point but this is not the case.
>Solution :
You should pick one column as the reference. Called on the whole DataFrame, idxmin returns a Series with one index label per column, which cannot be used as a slice bound:
data.idxmin()
x 4
y 199
dtype: int64
You should instead run:
minimum = data['x'].idxmin()
Also, technically you have to use loc to slice, not iloc, since idxmin returns an index label, not a position.
data.loc[:minimum]
Output:
x y
0 1.000000 6
1 1.000000 2
2 0.999999 5
3 0.999996 3
4 0.999986 4
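The difference between a label and a position only shows up when the index is not the default 0, 1, 2, ... range. As a minimal sketch, assuming a small made-up DataFrame with labels starting at 10 (the values and the index are hypothetical, not the asker's data):

```python
import pandas as pd

# Hypothetical data with a non-default index, so labels and positions differ.
data = pd.DataFrame(
    {"x": [1.0, 0.5, 0.1, 0.6, 1.0], "y": [6, 2, 5, 3, 4]},
    index=[10, 11, 12, 13, 14],
)

label = data["x"].idxmin()         # index label of the smallest x (12 here)

by_label = data.loc[:label]        # label-based slice, end label included
by_position = data.iloc[:label]    # treats 12 as a position: silently keeps every row

print(len(by_label), len(by_position))
```

With the default RangeIndex the label and the position happen to coincide, which is why the mistake is easy to miss.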
If you want to slice with iloc you have to use numpy.argmin:
data.iloc[:np.argmin(data['x'])]
The output is however slightly different since iloc excludes the end of the slice:
x y
0 1.000000 6
1 1.000000 2
2 0.999999 5
3 0.999996 3
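Since the goal is two datasets, the same label can bound both halves. The following is only a sketch under assumptions: the x and y values are invented to mimic the described shape, and the names first_half and second_half are placeholders.

```python
import pandas as pd

# Hypothetical stand-in for the question's data: x falls to a minimum, then rises again.
data = pd.DataFrame({
    "x": [1.0, 0.8, 0.4, 0.1, 0.3, 0.7, 1.0],
    "y": [6, 2, 5, 3, 4, 3, 1],
})

minimum = data["x"].idxmin()        # label of the row with the smallest x

first_half = data.loc[:minimum]     # start up to and including the minimum
second_half = data.loc[minimum:]    # minimum through the end

print(first_half)
print(second_half)
```

Because loc includes both ends of a label slice, the minimum row appears in both halves here. With the default integer index, data.loc[minimum + 1:] would keep it in the first half only.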