I have a 2d data set of (x,y). x and y are integer values.
How can I use only Pandas code to find all x values where y reaches its maximum values (there are multiple and same absolute maxima)?
I also want to plot (with pandas.DataFrame.plot) x vs. y and mark the maxima positions.
Example code:
import numpy as np
import pandas as pd
np.random.seed(10)
x = np.arange(100)*0.2
y = np.random.randint(0, 20, size=100)
data = np.vstack((x, y)).T
df = pd.DataFrame(data, columns=['x', 'y'])
ymax = df['y'].max()
df_ymax = df[df['y'] == ymax]
print(df_ymax)
# x y
# 13 2.6 19.0
# 24 4.8 19.0
# 28 5.6 19.0
# 86 17.2 19.0
# 88 17.6 19.0
df.plot(x='x', y='y', figsize=(8, 4),
ylabel='y', legend=False, style=['b-'])
I have no idea how to mark the maxima values (df_ymax) in the same plot, e.g. using circles. How can that be solved?
The final plot should look like this (here I programmed everything with numpy and matplotlib):
>Solution :
Get the Axes returned by df.plot and reuse it to plot the maxima values:
ax = df.plot(x='x', y='y', figsize=(8, 4), ylabel='y', legend=False, style=['b-'])
df_ymax.plot.scatter(x='x', y='y', color='r', ax=ax)


