# Finding a median in SQL Server

I need to find the median of a column, rounded to 4 decimal places. Since SQL Server doesn’t have a `MEDIAN()` function, I needed to take the smallest number from the top 50% of the list and the largest from the bottom 50%, add them, and divide by 2. I…
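The halves-based calculation described above can be sketched in Python to check the arithmetic. This is only an illustration under stated assumptions, not the asker's query: the helper name `median_by_halves` is hypothetical, and each half is assumed to hold `ceil(n / 2)` rows, on the understanding that `TOP (50) PERCENT` rounds a fractional row count up.

```python
import math


def median_by_halves(values, ndigits=4):
    """Median as the mean of max(bottom half) and min(top half), rounded."""
    if not values:
        raise ValueError("cannot take the median of an empty list")
    ordered = sorted(values)
    # Assumption: each half holds ceil(n / 2) rows, so an odd-length list
    # shares its middle element between the two halves.
    half = math.ceil(len(ordered) / 2)
    bottom_max = ordered[half - 1]            # largest value of the bottom half
    top_min = ordered[len(ordered) - half]    # smallest value of the top half
    return round((bottom_max + top_min) / 2, ndigits)
```

For an even number of rows the two picked values are the two middle rows; for an odd number they are the same row, so the result is that row's value.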

# Writing a filtered median function in Python

I have an input list: `signal = [0, 5, 1, 1, 0, 1]`. The values I want are `y1 = 0` (`signal[0]`), `y2 = median(0, 5, 1) = 1`, `y3 = median(5, 1, 1) = 1`, `y4 = median(1, 1, 0) = 1`, `y5 = median(1, 0, 1) = 1`, and `y6 = 1` (`signal[-1]`). The expected output is `[0, 1, 1, 1, 1, 1]`…
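One way to read the example above is a three-sample sliding median with the first and last samples copied through unchanged. A minimal sketch under that reading, using only the standard library (the function name `median_filter` and the `window` parameter are illustrative choices, not from the question):

```python
from statistics import median


def median_filter(signal, window=3):
    """Sliding median; samples without a full window are copied unchanged."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    out = list(signal)  # start from a copy so the edge samples stay as-is
    for i in range(half, len(signal) - half):
        # Median of the sample and its `half` neighbours on each side.
        out[i] = median(signal[i - half:i + half + 1])
    return out


print(median_filter([0, 5, 1, 1, 0, 1]))  # [0, 1, 1, 1, 1, 1]
```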

# How to specify data on a Pearson correlation heatmap?

I have a Pearson correlation heatmap coded, but it's showing data from my DataFrame that I don't need. Is there a way to specify which columns I'd like to include? Thanks in advance. `sb.heatmap(df['POPDEN', 'RoadsArea', 'MedianIncome', 'MedianPrice', 'PropertyCount', 'AvPTAI2015', 'PTAL'].corr(), annot=True, fmt='.2f')` This raises: `TypeError Traceback (most recent call last) <ipython-input-54-832fc3c86e3e> in <module> ----> 1`…
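A common way to restrict a correlation matrix to chosen columns is to index the DataFrame with a list of column names before calling `.corr()`. The sketch below assumes pandas is installed; the numbers and the `Unwanted` column are made-up stand-ins, and only three of the column names from the question are used.

```python
import pandas as pd

# Hypothetical data: the values are invented purely for illustration.
df = pd.DataFrame({
    "POPDEN": [1.0, 2.0, 3.0, 4.0],
    "RoadsArea": [2.0, 1.0, 4.0, 3.0],
    "MedianIncome": [10.0, 20.0, 15.0, 30.0],
    "Unwanted": [9.0, 8.0, 7.0, 6.0],
})

columns = ["POPDEN", "RoadsArea", "MedianIncome"]

# Indexing with a list (note the list inside the brackets) selects
# several columns at once; .corr() uses Pearson correlation by default.
corr = df[columns].corr()
print(corr.round(2))
```

The resulting `corr` frame could then be handed to the plotting call from the question in place of the full DataFrame's correlation matrix.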

# How to transform dataframe to binary based on values being above/below the row median (if > median, 1, else 0)?

I am looking to transform a DataFrame to binary based on the row median. Please see my input and expected output below. `import pandas as pd` `df_input = pd.DataFrame({'row1': [5, 10, 20], 'row2': [1, 30, 40],}, index = ['2021-02-24', '2021-02-25', '2021-02-26'])` `df_expected_output = pd.DataFrame({'row1': [1, 0, 0], 'row2': [0, 1, 1],}, index = ['2021-02-24', '2021-02-25', '2021-02-26'])`…
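The transformation described above can be sketched with a row-wise median and an aligned comparison. This is one possible approach, assuming pandas is installed; the names `row_median` and `df_binary` are illustrative.

```python
import pandas as pd

df_input = pd.DataFrame(
    {"row1": [5, 10, 20], "row2": [1, 30, 40]},
    index=["2021-02-24", "2021-02-25", "2021-02-26"],
)

# Median across the columns of each row (axis=1).
row_median = df_input.median(axis=1)

# Compare every cell with its own row's median (axis=0 aligns the
# medians on the row index), then turn True/False into 1/0.
df_binary = df_input.gt(row_median, axis=0).astype(int)
print(df_binary)
```

With the input from the question this gives `[1, 0, 0]` for `row1` and `[0, 1, 1]` for `row2`, matching the expected output shown there.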

# Plot looks different every time I run the code

I have a problem with my code. The plot looks different every time I run it. Any ideas? I don’t see any problem. I have been looking at this code for 2 hours and I can’t find the problem… `import numpy as np` `import matplotlib.pyplot as plt` `from scipy import stats` `x = np.arange(0, 24, 1)` `y = stats.poisson.pmf(x, mu=13)`…

# Compute median every 12 values

`ex_array = [-8.23294593e-02, -4.07239507e-02, 6.08131029e-02, 2.72433402e-02, -4.73587631e-02, 5.15452252e-02, 1.32902476e-01, 1.22322232e-01, 2.71845990e-02, -1.16927038e-01, -2.62239877e-01, -1.46526396e-01, -1.82859136e-01, -1.02089602e-01, -1.91863501e-04, -5.42572200e-02, -1.41798506e-01, 2.32538185e-02, 1.44525705e-01, 1.33945461e-01, 5.01618120e-02, -1.32664337e-01, -2.97395262e-01, -1.02531532e-01, -7.80204566e-02, -5.46991495e-02, 1.05868862e-01, 7.25526818e-03, 5.04192997e-02, 7.41281286e-02, 1.75069159e-01, 1.64488914e-01, 7.55396024e-02, -6.23800645e-02, -1.76950023e-01, -5.91491004e-02, -4.00535768e-02, 6.59473071e-04, 5.98125666e-02, -1.49608356e-02, -1.45519585e-02, 1.49876707e-01, 1.92880709e-01, 2.33158881e-01, 7.59751625e-02, -2.46659059e-02, -1.40025102e-01, -3.02416639e-02]` I need to compute the…
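The excerpt is cut off, so the exact requirement is unknown. Going by the title, one plausible reading is a median over consecutive, non-overlapping blocks of 12 values. A minimal standard-library sketch under that assumption (the name `block_medians` is hypothetical, and the demo uses stand-in data rather than `ex_array`):

```python
from statistics import median


def block_medians(values, block_size=12):
    """Median of each consecutive, non-overlapping block of `block_size` values."""
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    # A trailing block shorter than block_size is still included;
    # its median is taken over whatever values remain.
    return [
        median(values[start:start + block_size])
        for start in range(0, len(values), block_size)
    ]


# Stand-in data: 24 values give two blocks of 12.
print(block_medians(list(range(24))))  # [5.5, 17.5]
```

Applied to the 48-element `ex_array` above, this would return four medians, one per block of 12.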

# Pandas Rolling Function is not working properly

I have the following DataFrame sample: `df = pd.DataFrame({'date': ['2021-05-03', '2021-05-10', '2021-05-17', '2021-05-24', '2021-05-31', '2021-06-07', '2021-06-14', '2021-06-21', '2021-06-28', '2021-07-05', '2021-07-12', '2021-07-19', '2021-05-26'], 'spend': [1418, 4130, 4216, 3374, 3587, 3665, 4118, 4534, 4829, 3156, 2998, 3025, 3397]})` This is the code used: `df['spend avg'] = df['spend'].rolling(7).median()` This is the output that I got: `df = pd.DataFrame({'date': ['2021-05-03', '2021-05-10', '2021-05-17', '2021-05-24', '2021-05-31', '2021-06-07', '2021-06-14', '2021-06-21', '2021-06-28', '2021-07-05', '2021-07-12', '2021-07-19', '2021-05-26'], 'spend': [1418, 4130, 4216, 3374, 3587, 3665, 4118, 4534, 4829, 3156, 2998, 3025, 3397], 'spend_avg': [np.nan, np.nan, np.nan, np.nan, np.nan, np.nan, 3665.0, 4118.0, 4118.0, 3665.0, 3665.0, 3665.0, 3397.0]})` As you can see, it is not calculating the average with the rolling averages (window = 7). I understand…
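To see what a window of 7 produces row by row, the rolling median can be recomputed by hand. The sketch below is a plain-Python re-implementation for checking purposes, not pandas itself; the name `rolling_median` is hypothetical, and `None` stands in for the `NaN` that pandas reports while the window is still filling.

```python
from statistics import median


def rolling_median(values, window):
    """Median of each row together with the window - 1 rows before it."""
    out = []
    for i in range(len(values)):
        if i + 1 < window:
            out.append(None)  # not enough rows yet
        else:
            out.append(median(values[i + 1 - window:i + 1]))
    return out


spend = [1418, 4130, 4216, 3374, 3587, 3665, 4118,
         4534, 4829, 3156, 2998, 3025, 3397]
result = rolling_median(spend, 7)
print(result)
```

The non-missing values agree with the `spend_avg` column quoted above, which suggests `rolling(7).median()` is returning the median of each 7-row window. If a rolling average was intended, swapping `median` for `statistics.mean` in the sketch shows how the numbers would differ.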

# Rewriting `summarise_all` without deprecated `funs`, using Simple list and Auto-named list

I’m trying to count the number of NA values in each of 2 columns. The code below works. `temp2 %>% select(c18basic, c18ipug) %>% summarise_all(funs(sum(is.na(.))))` But I get this warning: Warning message: `funs()` was deprecated in dplyr 0.8.0. Please use a list of either functions or lambdas: `# Simple named list: list(mean = mean, median =`…

# Vectorised argument for a function in R. The function gives out multiple data frames, whereas I'd like it to output only one

I’d like to compute the trimmed mean for each trimming proportion alpha, and then see which trimming proportion gives the minimal variance of the trimmed means when bootstrap simulations of size N=200 are applied. The problem I have is that when I try to create a data frame of column1 = mean and column2 =…
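The procedure described above can be sketched in Python as a single table with one row per alpha. This is an illustration only, not the asker's R code, and it rests on assumptions: N=200 is taken to be the number of bootstrap resamples, the trimmed mean drops `floor(alpha * n)` values from each end, and the function names and demo sample are hypothetical.

```python
import random
from statistics import mean, variance


def trimmed_mean(values, alpha):
    """Mean after dropping floor(alpha * n) values from each end."""
    ordered = sorted(values)
    k = int(alpha * len(ordered))
    kept = ordered[k:len(ordered) - k]
    if not kept:
        raise ValueError("alpha trims away every value")
    return mean(kept)


def bootstrap_trimmed_variance(sample, alphas, n_boot=200, seed=0):
    """One row per alpha: mean and variance of the bootstrap trimmed means."""
    rng = random.Random(seed)
    # Draw the resamples once so every alpha is judged on the same data.
    resamples = [rng.choices(sample, k=len(sample)) for _ in range(n_boot)]
    rows = []
    for alpha in alphas:
        estimates = [trimmed_mean(r, alpha) for r in resamples]
        rows.append({"alpha": alpha,
                     "mean": mean(estimates),
                     "variance": variance(estimates)})
    return rows


# Made-up sample with one outlier, purely for demonstration.
sample = [2.1, 1.9, 2.4, 2.0, 1.8, 2.2, 9.5, 2.3, 1.7, 2.0]
rows = bootstrap_trimmed_variance(sample, [0.0, 0.1, 0.2])
best = min(rows, key=lambda row: row["variance"])
print(best["alpha"])
```

Collecting the results in one list of rows, rather than building a table inside the loop for each alpha, is what keeps the output to a single table.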

# Calculate median of column with multiple values per cell (ranges)

I have this code: `df = pd.DataFrame({'R': {0: '1', 1: '2', 2: '3', 3: '4', 4: '5', 5: '6', 6: '7'}, 'a': {0: 1.0, 1: 1.0, 2: 2.0, 3: 3.0, 4: 3.0, 5: 2.0, 6: 3.0}, 'nv1': {0: [-1.0], 1: [-1.0], 2: [], 3: [], 4: [-2.0], 5: [-2.0, -1.0, -3.0, -1.0], 6: [-2.0,`…