first value from selective column with groupby and .first() function

Existing DataFrame:

    Id  col_1  col_2  col_3  col_4
    A       3      6      6      2
    A       3      6      6      5
    A       3      6      6      4
    B       2      4      4      6
    B       2      4      4      6

Expected DataFrame:

    Id  col_1  col_2  col_3
    A       3      6      6
    B       2      4      4

I am trying to find first Appearance…
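
A minimal sketch of one way to get the expected frame above, assuming the goal is the first row per Id restricted to col_1 through col_3. The variable names `selected` and `result` are illustrative and do not come from the question.

```python
import pandas as pd

# Rebuild the example frame from the question.
df = pd.DataFrame({
    "Id": ["A", "A", "A", "B", "B"],
    "col_1": [3, 3, 3, 2, 2],
    "col_2": [6, 6, 6, 4, 4],
    "col_3": [6, 6, 6, 4, 4],
    "col_4": [2, 5, 4, 6, 6],
})

# Columns to keep (assumption: col_4 is simply dropped).
selected = ["col_1", "col_2", "col_3"]

# Select the columns on the grouped object, then take the first row per Id.
# as_index=False keeps Id as a regular column instead of the index.
result = df.groupby("Id", as_index=False)[selected].first()
print(result)
```

Note that `.first()` returns the first non-null value per column within each group, which matches the first row here because the example has no missing values.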

pandas dataframe as json file using index value

I have a pandas DataFrame created by groupby as below:

    airbus_df.groupby(['mfr']).apply(lambda row: row.to_json(orient='records')).to_frame()

The output looks like:

                                                                                  0
    mfr
    AIRBUS                          [{"mfr mdl code":"3940005","icao":"A00C7A","se…
    AIRBUS CANADA LTD PTNRSP        [{"mfr mdl code":"1400010","icao":"A062EC","se…
    AIRBUS HELICOPTERS DEUTSCHLAND  [{"mfr mdl code":"5620040","icao":"A32422","se…
    AIRBUS HELICOPTERS INC          [{"mfr mdl code":"1145005","icao":"A1F846","se…
    AIRBUS INDUSTRIE                [{"mfr mdl code":"3930402","icao":"A009A4","se…
    AIRBUS SAS                      [{"mfr mdl code":"3940312","icao":"A3AD89","se…

I wanted to create…
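
The question is cut off, so the following is only a guess at the intent based on the title: writing one JSON file per group, named after the index value (`mfr`). The sample rows reuse three of the values visible in the output above; the output directory and the `written` dictionary are hypothetical.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd

# Tiny stand-in for airbus_df; only "mfr", "mfr mdl code" and "icao" come from
# the excerpt, and only three of its rows are reproduced here.
airbus_df = pd.DataFrame({
    "mfr": ["AIRBUS", "AIRBUS SAS", "AIRBUS INDUSTRIE"],
    "mfr mdl code": ["3940005", "3940312", "3930402"],
    "icao": ["A00C7A", "A3AD89", "A009A4"],
})

out_dir = Path(tempfile.mkdtemp())  # hypothetical output location
written = {}

# Iterate over the groups directly and write each one to its own file.
for mfr, group in airbus_df.groupby("mfr"):
    path = out_dir / f"{mfr}.json"  # file named after the index value
    group.to_json(path, orient="records")
    written[mfr] = path

# Read one file back to confirm it holds a list of records.
print(json.loads(written["AIRBUS"].read_text()))
```

Iterating over the groups avoids building the intermediate frame of JSON strings, but whether that matches what the asker wanted is an assumption.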

Pandas filter rows by last 12 months in data frame

I need to keep only the rows (along with their other columns) whose month falls within the past 12 months. The max date here is 2022-08-01, so the resulting dataframe should have data from 2021-09-01 to 2022-08-01.

Input data frame:

    d = {'MONTH': ['2021-01-01', '2021-02-01', '2021-03-01', '2021-04-01',
                   '2021-05-01', '2021-06-01', '2021-07-01', '2021-08-01',
                   '2021-09-01', '2021-10-01', '2021-11-01', '2021-12-01',
                   '2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01',
                   '2022-05-01', '2022-06-01', '2022-07-01', '2022-08-01',
                   '2022-01-01', '2022-02-01', '2022-03-01', '2022-04-01',
                   '2022-05-01', '2022-06-01', '2022-07-01', '2022-08-01'],
         'col2': [3,4,1,2,…
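
A sketch of one possible approach, assuming the window is the 12 calendar months ending at the latest date in MONTH. The `col2` values are truncated in the question, so placeholder integers stand in for them; `cutoff` and `recent` are illustrative names.

```python
import pandas as pd

# MONTH values as listed in the question: Jan 2021 to Aug 2022,
# followed by a repeat of Jan 2022 to Aug 2022.
months = pd.date_range("2021-01-01", "2022-08-01", freq="MS").strftime("%Y-%m-%d").tolist()
months += months[12:]

df = pd.DataFrame({
    "MONTH": months,
    "col2": range(len(months)),  # placeholder values, not the question's data
})

# Parse the strings so that date arithmetic and comparisons work.
df["MONTH"] = pd.to_datetime(df["MONTH"])

# Latest month minus 11 months gives the first month of the 12-month window.
cutoff = df["MONTH"].max() - pd.DateOffset(months=11)
recent = df[df["MONTH"] >= cutoff]

print(recent["MONTH"].min(), recent["MONTH"].max())
```

Deriving the cutoff from `max()` rather than from today's date keeps the result tied to the data, which is what the question describes.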

Returning count of 0 if value doesn't exist Pandas DataFrame

I am trying to set 2 variables as counts of a specific value in a column. I have one variable here:

    missing_gmt = missing_records._merge.value_counts().Missing_in_GMTLib

That correctly returns:

    44

I also have another variable here:

    missing_nsl = missing_records._merge.value_counts().Missing_in_NSL

Which should return 0 (no records exist) but instead is throwing:

    AttributeError: 'Series' object has no attribute 'Missing_in_NSL'…
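
Attribute access on the result of `value_counts()` only works for labels that are actually present. A minimal sketch of one alternative is `Series.get` with a default of 0; the `missing_records` frame below is a made-up stand-in with 44 matching rows.

```python
import pandas as pd

# Stand-in data: 44 rows labelled Missing_in_GMTLib, none labelled Missing_in_NSL.
missing_records = pd.DataFrame({"_merge": ["Missing_in_GMTLib"] * 44})

counts = missing_records["_merge"].value_counts()

# Series.get returns the default instead of raising when the label is absent.
missing_gmt = counts.get("Missing_in_GMTLib", 0)
missing_nsl = counts.get("Missing_in_NSL", 0)

print(missing_gmt, missing_nsl)
```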

Convert string column to array of fixed length strings in pandas dataframe

I have a pandas dataframe with a few columns. I want to convert one of the string columns into an array of strings with fixed length. Here is what the current table looks like:

    +-----+--------------------+--------------------+
    |col1 | col2               | col3               |
    +-----+--------------------+--------------------+
    | 1   |Marco               | LITMATPHY          |
    | 2   |Lucy                | NaN                |
    | 3…
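
The excerpt ends before the desired output is shown, so the chunk length is a guess. A sketch assuming that col3 should be cut into 3-character pieces (LITMATPHY becoming LIT, MAT, PHY) and that missing values should stay missing; the `chunks` column name is illustrative.

```python
import pandas as pd

# The two complete rows visible in the question.
df = pd.DataFrame({
    "col1": [1, 2],
    "col2": ["Marco", "Lucy"],
    "col3": ["LITMATPHY", float("nan")],
})

CHUNK = 3  # assumed fixed length; not stated in the excerpt

# str.findall returns every non-overlapping match as a list per row;
# rows with a missing value stay missing.
df["chunks"] = df["col3"].str.findall(f".{{{CHUNK}}}")

print(df["chunks"].iloc[0])
```

With this pattern a trailing fragment shorter than the chunk length is dropped; using `.{1,3}` instead would keep it.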

Why does my monthly frequency date range use the last day of the month rather than the first?

I am creating a date range as follows:

    contract_start_date = pd.to_datetime('2022-11-01')
    contract_length_months = 3 * 12  # three years
    pd.date_range(start=contract_start_date, periods=contract_length_months, freq='M')

    >> DatetimeIndex(['2022-11-30', '2022-12-31', '2023-01-31', '2023-02-28',
                      '2023-03-31', '2023-04-30', '2023-05-31', '2023-06-30',
                      '2023-07-31', '2023-08-31', '2023-09-30', '2023-10-31',
                      '2023-11-30', '2023-12-31', '2024-01-31', '2024-02-29',
                      '2024-03-31', '2024-04-30', '2024-05-31', '2024-06-30',
                      '2024-07-31', '2024-08-31', '2024-09-30', '2024-10-31',
                      '2024-11-30', '2024-12-31', '2025-01-31', '2025-02-28',
                      '2025-03-31', '2025-04-30', '2025-05-31', '2025-06-30',…
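
In pandas, the `'M'` alias means month end (spelled `'ME'` in recent releases), which is why every date lands on the last day of the month. The month-start alias is `'MS'`. A minimal sketch using the same variables as the question; `month_starts` is an illustrative name.

```python
import pandas as pd

contract_start_date = pd.to_datetime("2022-11-01")
contract_length_months = 3 * 12  # three years

# "MS" anchors each period on the first day of the month.
month_starts = pd.date_range(
    start=contract_start_date,
    periods=contract_length_months,
    freq="MS",
)

print(month_starts[0], month_starts[-1])
```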

Can't use function in data frame which is converted from Html File

I have an HTML file that contains a table, and I load that table into a pandas DataFrame like this:

    from bs4 import BeautifulSoup
    import pandas as pd

    table = BeautifulSoup(open('/home/lenovo/Downloads/F4311.html','r').read()).find('table')
    # You are passing a <class 'bs4.element.Tag'> element into pandas read_html.
    # You need to convert it to a string.
    df = pd.read_html(str(table))

It worked…

How to convert a list with nested dictionary into dataframe to save as a csv?

I have the following list:

    a = [{'cluster_id': 0,
          'points': [{'id': 1, 'name': 'Alice', 'lat': 52.523955, 'lon': 13.442362},
                     {'id': 2, 'name': 'Bob', 'lat': 52.526659, 'lon': 13.448097}]},
         {'cluster_id': 0,
          'points': [{'id': 1, 'name': 'Alice', 'lat': 52.523955, 'lon': 13.442362},
                     {'id': 2, 'name': 'Bob', 'lat': 52.526659, 'lon': 13.448097}]},
         {'cluster_id': 1,
          'points': [{'id': 3, 'name': 'Carol', 'lat': 52.525626, 'lon':…
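
A sketch of one way to flatten such a list, assuming one CSV row per point with the cluster id repeated alongside it. Only the two complete entries from the question are used, since the third is truncated; `flat` and `csv_text` are illustrative names.

```python
import pandas as pd

# The two complete entries visible in the question.
a = [
    {"cluster_id": 0, "points": [
        {"id": 1, "name": "Alice", "lat": 52.523955, "lon": 13.442362},
        {"id": 2, "name": "Bob", "lat": 52.526659, "lon": 13.448097},
    ]},
    {"cluster_id": 0, "points": [
        {"id": 1, "name": "Alice", "lat": 52.523955, "lon": 13.442362},
        {"id": 2, "name": "Bob", "lat": 52.526659, "lon": 13.448097},
    ]},
]

# record_path expands each dict in "points" into its own row;
# meta carries the parent's cluster_id onto every expanded row.
flat = pd.json_normalize(a, record_path="points", meta=["cluster_id"])

# With no path argument, to_csv returns the CSV text instead of writing a file.
csv_text = flat.to_csv(index=False)
print(csv_text)
```

Passing a filename to `to_csv` instead would write the same content to disk.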