How to normalize a list from a JSON file using Python?

I have a sample file, sample.json, like this:

    {
      "school": [
        {
          "testId": 123,
          "testName": "test1",
          "Teachers": [
            { "Tid": 111, "Tname": "aaa" },
            { "Tid": 222, "Tname": "bbb" },
            { "Tid": 333, "Tname": "ccc" },
            { "Tid": 444, "Tname": "ddd" }
          ],
          "location": "India"
        }
      ]
    }

I need to normalize the Teachers list from this file. My code:

    import json
    import pandas as pd
    with open('sample.json',…
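A minimal sketch of one way to flatten the Teachers list, assuming pandas 1.0 or later (where `pd.json_normalize` is available). The JSON is inlined here so the example runs on its own; with the real file you would load it with `json.load` instead. Carrying `testId`, `testName` and `location` along as `meta` columns is an assumption about what the output should look like, not something the question states.

```python
import json
import pandas as pd

# Inlined copy of the sample.json content from the question.
raw = """
{ "school": [ { "testId": 123, "testName": "test1",
    "Teachers": [ { "Tid": 111, "Tname": "aaa" }, { "Tid": 222, "Tname": "bbb" },
                  { "Tid": 333, "Tname": "ccc" }, { "Tid": 444, "Tname": "ddd" } ],
    "location": "India" } ] }
"""
data = json.loads(raw)

# One output row per entry in "Teachers"; the parent fields are repeated on each row.
teachers = pd.json_normalize(
    data["school"],
    record_path="Teachers",
    meta=["testId", "testName", "location"],
)
print(teachers)
```

If only the teacher fields are needed, dropping the `meta` argument leaves just the `Tid` and `Tname` columns.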

Converting iterrows into itertuples and accessing namedtuples

Trying to reduce the overhead of iterrows by changing it to itertuples (there are many columns). I'm trying to turn this, with iterrows:

    def named_tuple_issue_iterrows(df: pd.DataFrame, column_name: str):
        for index, series in df.iterrows():
            result = series[column_name]
            # Do something with result

into itertuples:

    def named_tuple_issue_itertuples(df: pd.DataFrame, column_name: str):
        for namedtuple in df.itertuples():
            result = namedtuple[column_name]…
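The rows produced by itertuples are namedtuples, so indexing them with a string column name raises a TypeError (tuple indices must be integers or slices). A minimal sketch of two ways around that, with hypothetical function names and a made-up two-column frame: look the field up with `getattr`, or resolve the column position once and index by integer. Note that itertuples renames columns that are not valid Python identifiers to positional names, so the `getattr` variant assumes identifier-like column names.

```python
import pandas as pd

def collect_with_getattr(df: pd.DataFrame, column_name: str) -> list:
    """Read one column per row via attribute lookup on the namedtuple."""
    results = []
    for row in df.itertuples(index=False):
        # getattr works because each column becomes a namedtuple field.
        results.append(getattr(row, column_name))
    return results

def collect_by_position(df: pd.DataFrame, column_name: str) -> list:
    """Read one column per row by integer position (assumes unique column names)."""
    pos = df.columns.get_loc(column_name)  # resolved once, outside the loop
    # name=None yields plain tuples, which avoids the column-renaming issue.
    return [row[pos] for row in df.itertuples(index=False, name=None)]

# Hypothetical data for illustration only.
df = pd.DataFrame({"a": [1, 2, 3], "b": ["x", "y", "z"]})
print(collect_with_getattr(df, "b"))
print(collect_by_position(df, "a"))
```

The positional variant is the safer choice when column names contain spaces or start with an underscore.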

Get column values that do not exist in another column (pandas)

I have two dataframes, df1 and df2, and both have a column "A". I want an output df3 with a column "A" holding the values of df1 that do not exist in df2.

df1:

    A
    I-13856942
    I-13856914
    I-13861633
    I-13875002
    I-13875673

df2:

    A
    I-13856942
    I-13856914
    I-13861633

Output df3:

    A
    I-13875002
    I-13875673

Solution: A possible solution:

    df1.loc[df1.merge(df2, how='left', indicator=True)['_merge'].eq('left_only'), :]

Output:…
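For reference, here is a runnable sketch of the merge-with-indicator approach on the data from the question, followed by an `isin` variant. The `isin` version is a swapped-in alternative, not part of the quoted solution; it may be preferable when df2 can contain duplicate values, since a left merge would then repeat rows and the boolean mask would no longer line up with df1.

```python
import pandas as pd

df1 = pd.DataFrame({"A": ["I-13856942", "I-13856914", "I-13861633",
                          "I-13875002", "I-13875673"]})
df2 = pd.DataFrame({"A": ["I-13856942", "I-13856914", "I-13861633"]})

# Approach from the quoted solution: a left merge with indicator=True adds a
# "_merge" column; rows marked "left_only" have no match in df2.
mask = df1.merge(df2, how="left", indicator=True)["_merge"].eq("left_only")
df3_merge = df1.loc[mask, :]

# Alternative (not from the quoted solution): keep rows whose "A" is not in df2["A"].
df3_isin = df1[~df1["A"].isin(df2["A"])]

print(df3_merge)
print(df3_isin)
```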

How to create a list of dataframes based on two other dataframes

I have two dataframes. The first one has thousands of columns which represent a given city followed by the year, like "London_2001", "London_2002", and some measurements in the rows. The second dataframe has two columns. The first one is a territorial region, and the second one a list of cities. Something like: df1 <- data.frame(London_2021…

Pandas DataFrame: .replace() and .strip() methods returning NaN values

I read a PDF file into a DataFrame using tabula and used .concat() to combine it all into one DataFrame by doing the following:

    import pandas as pd
    import tabula
    df = tabula.read_pdf('card_details.pdf', pages='all')
    df = pd.concat(df, ignore_index=True)

I want to clean some of this data, as a column which contains card numbers also has…
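The excerpt is cut off before the failing call is shown, so the following is only a sketch of one common cause, assuming the column holds a mix of strings and numbers: the `.str` accessor returns NaN for every element that is not a string. Casting the column to `str` first avoids that. The column name `card_number`, the stray `?` characters and the values are all hypothetical.

```python
import pandas as pd

# Hypothetical column mixing strings (with stray characters) and a plain integer.
df = pd.DataFrame({"card_number": ["??1234", 5678, " 9012 "]})

# On a mixed-type column, .str methods yield NaN for the non-string element.
naive = df["card_number"].str.replace("?", "", regex=False)

# Casting to str first gives every element a string to operate on.
cleaned = (
    df["card_number"]
    .astype(str)
    .str.replace("?", "", regex=False)  # regex=False treats "?" literally
    .str.strip()
)

print(naive)
print(cleaned)
```

If the NaN values come from somewhere else (for example, cells that were already empty after the PDF extraction), the cast will turn them into the string "nan" instead, so it is worth checking `df["card_number"].isna().sum()` before cleaning.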