Follow

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use
Contact

Create a DataFrame from list in lists (Pandas)

I´m having trouble creating a dataframe on my list.

The list contains four columns, but instead it says on presente one column with data:

ValueError: 4 columns passed, passed data had 1 columns.

The list itself is presented in this way:

MEDevel.com: Open-source for Healthcare and Education

Collecting and validating open-source software for healthcare, education, enterprise, development, medical imaging, medical records, and digital pathology.

Visit Medevel

[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 556.72, 7.96, 0.5410827096899603]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 688.67, 9.84, 0.5845350761787548]]
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 625.3, 8.94, 0.5612954767824924]]

I know there is something happening due to the double [], but i can´t figure it out. Can´t someone help me?

Here is the code so far:

   for i in range(6):
    excel_file = pd.read_excel(input_file, sheet_name=sheet[i])
    excel_file = excel_file.values.tolist()

    filtered = [x for x in excel_file if 'TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)' in x
                or 'TOTAL DAS DESPESAS DE CUSTEIO (A)' in x
                ]

    sheet_file = sheet[i]
    sheet_variable.append(sheet_file)
    wb_name.append(file_name)
    conab_data.append(filtered)

    print(filtered)

df_conab = pd.DataFrame(conab_data, columns=['Descrição', 'Preço/ha', 'Scs/ha', 'Part. %'])
df_conab['Local/UF/Ano'] = sheet_variable
df_conab['Fonte'] = wb_name

print(df_conab)

>Solution :

you could fix this with a for loop

overly_nested = [[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 559.64, 8.01, 0.5520765512479038]],
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 520.34, 7.44, 0.5393857093988743]],
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 556.72, 7.96, 0.5410827096899603]],
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 688.67, 9.84, 0.5845350761787548]],
[['TOTAL DAS DESPESAS DE CUSTEIO DA LAVOURA (A)', 625.3, 8.94, 0.5612954767824924]]]

for i, sub_list in enumerate(overly_nested):
    overly_nested[i]=sub_list[0]
df = pd.DataFrame(overly_nested)
print(df)

I’m sure theres a way to do this with zip(), let me experiment and I’ll edit if I find it

Add a comment

Leave a Reply

Keep Up to Date with the Most Important News

By pressing the Subscribe button, you confirm that you have read and are agreeing to our Privacy Policy and Terms of Use

Discover more from Dev solutions

Subscribe now to keep reading and get access to the full archive.

Continue reading