ValueError with scipy.signal.find_peaks

February 22, 2022

Right now I'm struggling to plot a graph that should show the peaks of my dataset. It looks like the find_peaks function drops every data point that is not detected as a peak.
Does anybody know how I can still plot the graph, maybe by replacing the data points that are not peaks with zeros, or is there another possibility?

I am getting the following error message:
"ValueError: x and y must have same first dimension, but have shapes (800,) and (105,)"

import matplotlib.pyplot as plt
import pandas as pd
import scipy.signal

def plot():
    i = 1
    d_time, d_x, d_y, d_z = [], [], [], []
    columns = ["Time", "y", "x", "z"]
    df = pd.read_csv("mydata.csv", usecols=columns)
    for zeile in df.Time:
        if i % 30 == 0:
            d_time.append(df.Time[i])
            d_x.append(df.x[i])
            d_y.append(df.y[i])
            d_z.append(df.z[i])
            i += 1
        elif i > 24000:
            break
        else:
            i += 1

    fig = plt.figure(dpi=64, figsize=(100, 60))

    p_z, _ = scipy.signal.find_peaks(d_z, 0, distance=5)

    plt.plot(d_time, d_z, c='red', label="Z-Achse")
    plt.plot(d_time, p_z, "x", c='blue', label="Peaks Z-Achse")

    plt.title("Peak Detection", fontsize=16)
    plt.xlabel('t(s)', fontsize=16)

    fig.autofmt_xdate()
    plt.ylabel("a(m/s²)", fontsize=16)
    plt.tick_params(axis='both', which='major')

    plt.legend()
    plt.show()


plot()

Link to Mydata.csv:
https://cdn.discordapp.com/attachments/635516210473336844/945630182415405106/mydata.csv

>Solution :

The problem is, as you also noticed, that find_peaks returns only the indices of the detected peaks, so p_z (105 entries) is much shorter than d_time (800 entries). Plotting one against the other therefore raises the error. What you can do is create a time vector with np.linspace that has the same length as p_z and plot the peak heights against that new time vector. Following is my solution:

import matplotlib.pyplot as plt
import pandas as pd
from scipy import signal
import numpy as np

def plot():
    i = 1
    d_time, d_x, d_y, d_z = [], [], [], []
    columns = ["Time", "y", "x", "z"]
    df = pd.read_csv("mydata.csv", usecols=columns)
    for zeile in df.Time:
        if i % 30 == 0:  # keep every 30th sample
            d_time.append(df.Time[i])
            d_x.append(df.x[i])
            d_y.append(df.y[i])
            d_z.append(df.z[i])
            i += 1
        elif i > 24000:
            break
        else:
            i += 1


    fig = plt.figure(dpi=64, figsize=(100, 60))

    p_z, properties = signal.find_peaks(d_z, height=0, distance=5)  # p_z holds peak indices

    new_time = np.linspace(d_time[0], d_time[-1], len(p_z))  # same length as p_z
    plt.plot(d_time, d_z, c='red', label="Z-Achse")
    # plt.plot(d_time, p_z, "x", c='blue', label="Peaks Z-Achse")  # original call that failed
    plt.plot(new_time, properties['peak_heights'], "x", c='blue', label="Peaks Z-Achse")

    plt.title("Peak Detection", fontsize=16)
    plt.xlabel('t(s)', fontsize=16)
    plt.yscale("log")

    fig.autofmt_xdate()
    plt.ylabel("a(m/s²)", fontsize=16)
    plt.tick_params(axis='both', which='major')

    plt.legend()
    plt.show()


plot()

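As you can see in line 28 (the np.linspace call), I have created a new time vector with the same length as p_z, which removes the shape mismatch. I have also changed the y-axis to a log scale (line 35, plt.yscale("log")) to make the results easier to see. Note that np.linspace spaces the markers evenly between the first and last time value, so they show the peak heights but not necessarily the exact times at which the peaks occur.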
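
If the markers should sit exactly on the peaks, an alternative is to use the indices returned by find_peaks to pick the matching entries out of the time and z data. The following is only a minimal sketch of that idea, not the code from the question: the sine wave is synthetic stand-in data (an assumption, since mydata.csv is not reproduced here), and the names peak_idx, props and z_peaks_only are made up for illustration. It also shows a variant of the "replace the other points" idea from the question, using NaN instead of zeros.

```python
import numpy as np
from scipy import signal

# Synthetic stand-in for the downsampled Time and z columns (assumption:
# the real values come from mydata.csv, which is not reproduced here).
d_time = np.linspace(0.0, 10.0, 800)
d_z = np.sin(2 * np.pi * d_time)

# find_peaks returns the indices of the peaks, not the peak values.
peak_idx, props = signal.find_peaks(d_z, height=0, distance=5)

# Option 1: index both arrays with the peak indices, so x and y match in length.
peak_times = d_time[peak_idx]
peak_values = d_z[peak_idx]

# Option 2: keep the full length and blank out every non-peak sample with NaN
# (matplotlib skips NaN values, whereas zeros would be drawn as markers at y=0).
z_peaks_only = np.full(d_z.shape, np.nan)
z_peaks_only[peak_idx] = d_z[peak_idx]

print(peak_times.shape, peak_values.shape, z_peaks_only.shape)
```

With either option the failing call could become plt.plot(peak_times, peak_values, "x") or plt.plot(d_time, z_peaks_only, "x"). If d_time and d_z are plain Python lists, as in the question, they would need to be wrapped with np.asarray first so that they can be indexed with an index array.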