How to add outliers as separate colored markers to a line plot

val             time
5.6     2021-11-18 03:00:00
2.034   2021-11-18 05:00:00
1.171   2021-11-18 07:00:00
3.023   2021-11-18 09:00:00
4.202   2021-11-18 16:00:00
1.202   2021-11-18 17:00:00
5.202   2021-11-18 18:00:00
7.202   2021-11-18 19:00:00
2.202   2021-11-18 20:00:00
12.202  2021-11-18 21:00:00
1.202   2021-11-18 21:00:00

Above is my dataframe. I want to plot it (x=time, y=val) and draw the points in red where val > 5.

plt.plot(ab['time'], ab['val'], '-gD', markevery=marks, label='line with select markers')

Here marks = [7.202, 12.202] is a list that I created manually, but this does not work.
Error: markevery is iterable but not a valid numpy fancy index

I found a related question, "Line plot with different markers if condition is true python 3", but that approach is time-consuming when there are a lot of points.

>Solution :

  • The easiest solution is to use Boolean indexing to create a separate dataframe for the values greater than 5, and then plot them as a scatter plot with pandas.DataFrame.plot, on the same axes as the line plot.
  • The x-axis is formatted as %m-%d %H automatically. The format will change when there's more data, and there are other answers discussing how to format a pandas datetime axis.
import pandas as pd
import matplotlib.pyplot as plt

# sample data
data = {'val': [5.6, 2.034, 1.171, 3.023, 4.202, 1.202, 5.202, 7.202, 2.202, 12.202, 1.202], 'time': ['2021-11-18 03:00:00', '2021-11-18 05:00:00', '2021-11-18 07:00:00', '2021-11-18 09:00:00', '2021-11-18 16:00:00', '2021-11-18 17:00:00', '2021-11-18 18:00:00', '2021-11-18 19:00:00', '2021-11-18 20:00:00', '2021-11-18 21:00:00', '2021-11-18 21:00:00']}
df = pd.DataFrame(data)

# convert the time column to a datetime dtype
df.time = pd.to_datetime(df.time)

# get the values greater than 5
masked = df[df.val.gt(5)]

# plot the line plot
ax = df.plot(x='time', marker='o', figsize=(15, 5), zorder=0)

# plot those greater than 5
masked.plot(kind='scatter', x='time', y='val', color='red', ax=ax, s=30, label='outliers')
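
If you would rather keep a single plot call, as in the question, note that markevery takes the integer positions of the points to mark (or a boolean mask), not the y-values themselves. That is why a list like [7.202, 12.202] raises the error. A minimal sketch, assuming the same sample data as above; the red marker colors are my own choice and not part of the original answer:

```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# same sample data as above
data = {
    'val': [5.6, 2.034, 1.171, 3.023, 4.202, 1.202,
            5.202, 7.202, 2.202, 12.202, 1.202],
    'time': ['2021-11-18 03:00:00', '2021-11-18 05:00:00',
             '2021-11-18 07:00:00', '2021-11-18 09:00:00',
             '2021-11-18 16:00:00', '2021-11-18 17:00:00',
             '2021-11-18 18:00:00', '2021-11-18 19:00:00',
             '2021-11-18 20:00:00', '2021-11-18 21:00:00',
             '2021-11-18 21:00:00'],
}
df = pd.DataFrame(data)
df['time'] = pd.to_datetime(df['time'])

# integer positions of the rows where val > 5 (positions, not values)
marks = np.flatnonzero(df['val'].gt(5).to_numpy()).tolist()

fig, ax = plt.subplots(figsize=(15, 5))

# green line; diamond markers are drawn only at the positions in marks
ax.plot(df['time'], df['val'], '-gD', markevery=marks,
        markerfacecolor='red', markeredgecolor='red',
        label='line with select markers')
ax.legend()
```

With this approach the markers belong to the line itself, so they share its legend entry; the scatter version above gives the outliers a separate legend entry.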

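If the automatic tick labels are not what you want, one option is to set the format explicitly with matplotlib.dates.DateFormatter. This sketch draws the line and the markers with matplotlib directly instead of DataFrame.plot, on the assumption that this keeps pandas from applying its own date handling to the axis. The '%m-%d %H:%M' format string is only an example.

```python
import pandas as pd
import matplotlib.pyplot as plt
import matplotlib.dates as mdates

# same sample data as above
data = {
    'val': [5.6, 2.034, 1.171, 3.023, 4.202, 1.202,
            5.202, 7.202, 2.202, 12.202, 1.202],
    'time': ['2021-11-18 03:00:00', '2021-11-18 05:00:00',
             '2021-11-18 07:00:00', '2021-11-18 09:00:00',
             '2021-11-18 16:00:00', '2021-11-18 17:00:00',
             '2021-11-18 18:00:00', '2021-11-18 19:00:00',
             '2021-11-18 20:00:00', '2021-11-18 21:00:00',
             '2021-11-18 21:00:00'],
}
df = pd.DataFrame(data)
df['time'] = pd.to_datetime(df['time'])

# rows with val > 5
masked = df[df['val'].gt(5)]

fig, ax = plt.subplots(figsize=(15, 5))
ax.plot(df['time'], df['val'], marker='o', zorder=0, label='val')
ax.scatter(masked['time'], masked['val'], color='red', s=30, label='outliers')

# explicit tick label format: month-day hour:minute
ax.xaxis.set_major_formatter(mdates.DateFormatter('%m-%d %H:%M'))
fig.autofmt_xdate()  # rotate the labels so they do not overlap
ax.legend()
```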
