How do I remove outliers from a column in a dataframe?

December 5, 2022

The solutions I found online only show removing outliers from the entire dataframe, not just a specific column. So I’m having trouble figuring out how to perform outlier removal on a single column.

I tried writing a function; the code is shown below.

def find_outlier(df, column):
    # Find first and third quartile
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    
    # Find interquartile range
    IQR = q3 - q1
    
    # Find lower and upper bound
    lower_bound = q1 - 1.5 * IQR
    upper_bound = q3 + 1.5 * IQR
    
    # Remove outliers
    df[column] = df[column][df[column] > lower_bound]
    df[column] = df[column][df[column] < upper_bound]
    
    return df

But when I ran the code, I got the error "Columns must be same length as key".

The code I used to call the function is shown below.

df['no_of_trainings'] = find_outlier(df, 'no_of_trainings')

Any help is appreciated.

>Solution :

The comparison returns a boolean Series aligned to the DataFrame's index, so you can use it as a mask to filter the DataFrame's rows instead of assigning back to the column:

    df = df[df[column] > lower_bound]
    df = df[df[column] < upper_bound]
    return df

Or, more concisely:

    ...
    return df[(df[column] > lower_bound) & (df[column] < upper_bound)]
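For reference, here is a minimal, self-contained sketch of the whole approach. The function name `remove_outliers`, the `k` parameter, and the sample data are assumptions made for illustration, not part of the original question. Note that the result is assigned back to `df` as a whole. Assigning a returned DataFrame to a single column, as in `df['no_of_trainings'] = ...`, is the likely cause of the "Columns must be same length as key" error.

```python
import pandas as pd


def remove_outliers(df, column, k=1.5):
    """Return a copy of df without rows whose `column` value is outside the IQR fences."""
    # First and third quartiles of the chosen column only
    q1 = df[column].quantile(0.25)
    q3 = df[column].quantile(0.75)
    iqr = q3 - q1

    # Lower and upper fences (k = 1.5 is the conventional multiplier)
    lower_bound = q1 - k * iqr
    upper_bound = q3 + k * iqr

    # Boolean mask aligned to df's index; rows where it is False are dropped
    mask = (df[column] > lower_bound) & (df[column] < upper_bound)
    return df[mask].copy()


# Hypothetical sample data: the value 10 lies above the upper fence
df = pd.DataFrame({
    "employee_id": [1, 2, 3, 4, 5, 6],
    "no_of_trainings": [1, 1, 2, 2, 3, 10],
})

# Assign the filtered DataFrame to a variable, not to a single column
cleaned = remove_outliers(df, "no_of_trainings")
print(cleaned)
```

Only the named column is used to decide which rows to keep, and the other columns are carried along unchanged. Rows where the column is NaN are also dropped, because comparisons with NaN evaluate to False.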

jupyter-notebook

by MR
