pandas read_json dtype=pd.CategoricalDtype does not work but dtype='category' does

Is this a known issue that specifying a CategoricalDtype dtype in read_json does not convert the column dtype, or is there a mistake in the code? import pandas as pd df = pd.read_json( "./data/data.json", dtype={ #"facility": pd.CategoricalDtype, # does not work "facility": 'category', # does work "supplier": pd.CategoricalDtype, # does not work } ) df.info() …
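
One possible explanation, offered as an assumption rather than a confirmed diagnosis, is that `pd.CategoricalDtype` in the excerpt is the class itself, whereas pandas conversions normally expect either the string alias `'category'` or an instance such as `pd.CategoricalDtype()`. The sketch below uses hypothetical inline data in place of `./data/data.json`, reads one column with the string alias the question reports as working, and converts the other after loading with an instance.

```python
from io import StringIO

import pandas as pd

# Hypothetical inline data standing in for ./data/data.json
raw = (
    '[{"facility": "A", "supplier": "X"},'
    ' {"facility": "B", "supplier": "Y"},'
    ' {"facility": "A", "supplier": "X"}]'
)

# The string alias is the form the question reports as working.
df = pd.read_json(StringIO(raw), dtype={"facility": "category"})

# Convert the second column after loading, passing an *instance*
# (note the parentheses) rather than the bare class.
df["supplier"] = df["supplier"].astype(pd.CategoricalDtype())

print(df.dtypes)
```

Converting after the read keeps the example independent of how `read_json` handles each dtype specification internally.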

Converting currency to numeric value

Since the prices in my df are stored as object dtype, I am trying to create a new variable by converting them to numeric values. I tried to remove the ',' and '$' from the numbers in the column and then convert them to a different type with pd.to_numeric df_l['price_MXN2'] = df_l['price_MXN'].str.replace(',','') df_l['price_MXN2'] = df_l['price_MXN'].str.replace('$','') df_l['price_MXN2']…
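
In the excerpt, the second `str.replace` reads from `price_MXN` again, so it would overwrite the result of the first replacement. A minimal sketch of one way to chain both removals before calling `pd.to_numeric` follows; the sample prices are invented for illustration, and `regex=False` is used so that `$` is treated as a literal character rather than a regular-expression anchor.

```python
import pandas as pd

# Invented sample data; only the column names come from the question.
df_l = pd.DataFrame({"price_MXN": ["$1,234.50", "$99", "$12,000"]})

# Chain the replacements so the second works on the output of the first.
cleaned = (
    df_l["price_MXN"]
    .str.replace(",", "", regex=False)
    .str.replace("$", "", regex=False)
)

# Convert the cleaned strings to a numeric dtype.
df_l["price_MXN2"] = pd.to_numeric(cleaned)

print(df_l.dtypes)
```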

pandas astype doesn't work as expected (fails silently and badly)

I've encountered this strange behavior of pandas .astype() (I'm using version 1.5.2). When trying to cast a column as integer, and later requesting dtypes, all seems fine. Until you try to extract the values by row, when you get inconsistent types. Code: import pandas as pd import numpy as np df = pd.DataFrame(np.random.randn(3, 3))…
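
The excerpt is cut off, so the following is only a guess at the effect being described: when a row is pulled out of a DataFrame whose columns have different dtypes, pandas returns a Series with a single common dtype, which turns integers into floats. The sketch shows that upcast and, as one alternative, `itertuples`, which yields each value from its own column.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.random.randn(3, 3))
df[0] = df[0].astype(int)  # column 0 is replaced by an integer column

print(df.dtypes)  # column 0 is integer, columns 1 and 2 are float64

# A row spans columns of different dtypes, so the resulting Series
# is upcast to one common dtype (float64 here).
row = df.iloc[0]
print(row.dtype)

# itertuples takes each value from its own column, so the integer survives.
first = next(df.itertuples(index=False))
print(type(first[0]), type(first[1]))
```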

How do I create a compound dtype numpy array from existing individual vectors?

I am learning about dtypes in numpy and I have the following question. I can define a compound type as follows: myrecord = np.dtype([ ('col1', 'u4'), ('col2', 'f8') ]) If I have two individual numpy arrays: a=np.array([1,2,3,4]) b=np.array([10.1,20.1,30.1,40.1]) How would I generate a third array c of type myrecord? This is what I tried, which…
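
One common way to build such an array, sketched here without knowing what the truncated attempt looked like, is to allocate an empty structured array of the compound dtype and assign each field from the corresponding vector.

```python
import numpy as np

myrecord = np.dtype([("col1", "u4"), ("col2", "f8")])

a = np.array([1, 2, 3, 4])
b = np.array([10.1, 20.1, 30.1, 40.1])

# Allocate the structured array, then fill it field by field.
c = np.empty(len(a), dtype=myrecord)
c["col1"] = a  # values are cast to unsigned 32-bit integers
c["col2"] = b  # values are stored as 64-bit floats

print(c)
print(c.dtype)
```

Assigning by field avoids building an intermediate list of tuples, and each vector is cast to its field's type on assignment.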

Dataframe conditional replacement with integers

I have a dataframe column like this: df['col_name'].unique() >>>array([-1, 'Not Passed, On the boundary', 1, 'Passed, On the boundary', 'Passed, Unclear result', 'Passes, Unclear result, On the boudnary', 'Rejected, Unclear result'], dtype=object) In this column, if an element contains the word 'Passed' as a field or as a substring, then replace the entire field with…
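
The replacement value is cut off in the excerpt, so the sketch below uses a placeholder, `REPLACEMENT = 1`, which is purely an assumption for illustration. It shows one way to test a mixed integer/string column for a substring and overwrite the matching fields.

```python
import pandas as pd

# Hypothetical placeholder: the actual replacement value is truncated in the excerpt.
REPLACEMENT = 1

df = pd.DataFrame(
    {
        "col_name": [
            -1,
            "Not Passed, On the boundary",
            1,
            "Passed, On the boundary",
            "Rejected, Unclear result",
        ]
    }
)

# Cast to str first so the integer entries can be searched without errors.
contains_passed = df["col_name"].astype(str).str.contains("Passed", regex=False)

# Overwrite only the matching fields; every other value is left untouched.
df["col_name"] = df["col_name"].mask(contains_passed, REPLACEMENT)

print(df["col_name"].tolist())
```

A plain substring test also matches 'Not Passed, On the boundary'; if that case needs different handling, the condition would have to be tightened.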

pandas MultiIndex from product – ignore same row comparison

I have a pandas dataframe as shown below: Company,year T123 Inc Ltd,1990 T124 PVT ltd,1991 ABC Limited,1992 ABCDE Ltd,1994 tf = pd.read_clipboard(sep=',') tf['Company_copy'] = tf['Company'] I would like to compare each value from tf['Company'] against each value of tf['Company_copy'] but exclude comparisons where the row number or index number is the same. For example: I want T123…
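
A minimal sketch of one way to enumerate every cross-row pair while skipping same-row pairs: build the product of the index with itself via `pd.MultiIndex.from_product`, then keep only the entries whose two levels differ. The level names `left` and `right` and the output column names are made up for illustration, and the data is built inline instead of being read from the clipboard.

```python
import pandas as pd

tf = pd.DataFrame(
    {
        "Company": ["T123 Inc Ltd", "T124 PVT ltd", "ABC Limited", "ABCDE Ltd"],
        "year": [1990, 1991, 1992, 1994],
    }
)

# All (row, row) combinations of the index with itself.
idx = pd.MultiIndex.from_product([tf.index, tf.index], names=["left", "right"])

# Drop the combinations where both sides point at the same row.
pairs = idx[idx.get_level_values("left") != idx.get_level_values("right")]

# Look up the company names for each side of the remaining pairs.
result = pd.DataFrame(
    {
        "Company": tf.loc[pairs.get_level_values("left"), "Company"].to_numpy(),
        "Company_copy": tf.loc[pairs.get_level_values("right"), "Company"].to_numpy(),
    }
)

print(result)
```

With four rows this leaves 12 ordered pairs (16 combinations minus the 4 same-row ones), which can then be passed to whatever comparison function is needed.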

How to handle "The given header was not found" when paging records in C# API GET request?

I'm requesting data from an API that requires paging records based on a custom header called "cursor". Only 100 records may be retrieved per call and as such I've created a while loop to execute. The loop functions… until it doesn't. Once all records are paged, the headers get dropped and my program errors out…