Follow

Follow

Contact

Home Convert DataFrame with 5*5 grid of choice with string of ranges

Questions

Convert DataFrame with 5*5 grid of choice with string of ranges

byMR

January 4, 2023

Score and Score2 column ranges are not working with df.eq(s) because of the input as ranges.

Default values:-

import pandas as pd

df = pd.DataFrame({'Score': ['>285', '285-280', '280-275', '275-260', '<260'],'Grade1': ['A1', 'A2', 'A3', 'A4', 'A5'],'Score2': ['>270', '270-260', '260-250', '250-200', '<200'],'Grade3': ['D1', 'D2', 'D3', 'D4', 'D5'],'Grade4': ['ID1', 'ID2', 'ID3', 'ID4', 'ID5']})

Input:-

df_input = pd.DataFrame({'Score': [290], 'Grade1': ['A1'], 'Score2': [190], 'Grade3': ['D3'], 'Grade4': ['ID2']})

Expected output:-

df_output = {'Tiers': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5}, 'Score': {0: 290, 1: '', 2: '', 3: '', 4: ''}, 'Grade1': {0: '', 1: '', 2: '', 3: '', 4: 'A1'}, 'Score2': {0: '', 1: '', 2: '', 3: '', 4: 190}, 'Grade3': {0: '', 1: '', 2: 'D3', 3: '', 4: ''}, 'Grade4': {0: '', 1: 'ID2', 2: '', 3: '', 4: ''}}

s = df_input.loc[0]

mask = df.eq(s)

mask.to_dict() = {'Score': {0: False, 1: False, 2: False, 3: False, 4: True}, 'Grade1': {0: True, 1: False, 2: False, 3: False, 4: False}, 'Score2': {0: True, 1: False, 2: False, 3: False, 4: False}, 'Grade3': {0: False, 1: False, 2: True, 3: False, 4: False}, 'Grade4': {0: False, 1: True, 2: False, 3: False, 4: False}}

>Solution :

Modification of my previous answer to handle missing values and arbitrary order:

# get df_input as Series
s = df_input.loc[0]

# find identical values
mask = df.eq(s)

# match ranges
def match_range(val, ref):
    # extract the upper value, for the last replace by np.inf
    s = pd.to_numeric(ref.str.extract('[<-](\d+)', expand=False)).fillna(np.inf)
    # define default output as False
    out = np.zeros(len(ref), dtype=bool)
    # find matching position
    order = np.argsort(s)
    match = np.searchsorted(s[order], val)
    if match < len(s):
        out[order[match]] = True
    return out

# apply match_range on columns for which a direct match failed
m = ~mask.any()
mask.loc[:, m] = df.loc[:, m].apply(lambda x: match_range(s[x.name], x))

# generate output
out = (pd.DataFrame(np.where(mask, s, ''),
                    index=np.arange(len(df))+1,
                    columns=df.columns)
         .rename_axis('Tiers').reset_index()
      )

out

Output:

   Tiers Score Grade1 Score2 Grade3 Grade4
0      1   290     A1                     
1      2                               ID2
2      3                         D3       
3      4                 190              
4      5

numpy

byMR

Published January 04, 2023

Add a comment

Leave a ReplyCancel reply

Read more

Questions

Unity c# transform Function

byMR

January 4, 2023

Questions

Turning the dataframe upside down and performing groupby

byMR

January 4, 2023

Questions

Realated Apache Kafka producer and consumer part

byMR

January 4, 2023

Questions

Future function call in init flutter doesn't update UI

byMR

January 4, 2023

Questions

Do not want to close modal after submit data in modal PHP

byMR

January 4, 2023

Questions

how to transfer string to number in javascript from nested array

byMR

January 4, 2023