Python: Extracting Datasets in Dataset

I have an odd-looking dataset in which every row describes another dataset. "data" in this case is a list, which I converted to a DataFrame:

result_df = pd.DataFrame(data)

Looking at the first entry of the DataFrame above, I see a DataFrame with 5 rows. The same is true of every other row. Here is the DataFrame for the first row (row zero):

result_df[0][0]
    _embedded.results|className _embedded.results|classId   _embedded.results|uri   _embedded.results|searchHit _embedded.results|title _embedded.results|preferredLabel    _embedded.results|isTopConceptInScheme  _embedded.results|isInScheme    _embedded.results|hasSkillType  _embedded.results|hasReuseLevel _embedded.results|broaderHierarchyConcept   _embedded.results|_links    _embedded.results|broaderSkill  BC_name
   0    Skill   http://data.europa.eu/esco/model#Skill  http://data.europa.eu/esco/skill/237db40b-4600...   range of project control principles project management principles   {'de': 'Prinzipien des Projektmanagements', 'n...   [http://data.europa.eu/esco/concept-scheme/mem...   [http://data.europa.eu/esco/concept-scheme/ski...   [http://data.europa.eu/esco/skill-type/knowledge]   [http://data.europa.eu/esco/skill-reuse-level/...   [http://data.europa.eu/esco/isced-f/0413]   {'self': {'href': 'https://ec.europa.eu/esco/a...   NaN Project Financials Control
   1    Skill   http://data.europa.eu/esco/model#Skill  http://data.europa.eu/esco/skill/abb9c7f1-6d69...   Operate projection equipment manually or with ...   operate projector   {'de': 'Projektoren bedienen', 'no': 'betjene ...   [http://data.europa.eu/esco/concept-scheme/mem...   [http://data.europa.eu/esco/concept-scheme/ski...   [http://data.europa.eu/esco/skill-type/skill]   [http://data.europa.eu/esco/skill-reuse-level/...   [http://data.europa.eu/esco/skill/S8.6.2]   {'self': {'href': 'https://ec.europa.eu/esco/a...   NaN Project Financials Control
   2    Skill   http://data.europa.eu/esco/model#Skill  http://data.europa.eu/esco/skill/25a713ba-cbc0...   Manage the overall planning, coordination, and...   manage railway construction projects    {'de': 'Bahnbauprojekte leiten', 'no': 'admini...   NaN [http://data.europa.eu/esco/concept-scheme/ski...   [http://data.europa.eu/esco/skill-type/skill]   [http://data.europa.eu/esco/skill-reuse-level/...   [http://data.europa.eu/esco/skill/S4.2.1]   {'self': {'href': 'https://ec.europa.eu/esco/a...   [http://data.europa.eu/esco/skill/fff5bc45-b50...   Project Financials Control
   3    Skill   http://data.europa.eu/esco/model#Skill  http://data.europa.eu/esco/skill/d37bc902-f640...   prepare financial projections   prepare financial projections   {'de': 'Finanzprognosen erstellen', 'no': 'for...   [http://data.europa.eu/esco/concept-scheme/mem...   [http://data.europa.eu/esco/concept-scheme/ski...   [http://data.europa.eu/esco/skill-type/skill]   [http://data.europa.eu/esco/skill-reuse-level/...   [http://data.europa.eu/esco/skill/S2.7.3]   {'self': {'href': 'https://ec.europa.eu/esco/a...   NaN Project Financials Control
   4    Skill   http://data.europa.eu/esco/model#Skill  http://data.europa.eu/esco/skill/7106b5df-e017...   PRojects IN Controlled Environments, version 2  Prince2 project management  {'de': 'Prince2-Projektmanagement', 'no': 'Pri...   NaN [http://data.europa.eu/esco/concept-scheme/ski...   [http://data.europa.eu/esco/skill-type/knowledge]   [http://data.europa.eu/esco/skill-reuse-level/...   [http://data.europa.eu/esco/isced-f/0413]   {'self': {'href': 'https://ec.europa.eu/esco/a...   [http://data.europa.eu/esco/skill/bec4359e-cb9...   Project Financials Control

Is it possible to extract the dataset from every row and append them all to one big DataFrame? The resulting DataFrame should have 1716 x 5 = 8580 rows.

I tried something like this without success:

column_names = ["_embedded.results|className", "_embedded.results|classId", "_embedded.results|uri","_embedded.results|searchHit", "_embedded.results|title ", "_embedded.results|preferredLabel", "_embedded.results|isTopConceptInScheme", "embedded.results|isInScheme","_embedded.results|hasSkillType","_embedded.results|hasReuseLevel","_embedded.results|broaderHierarchyConcept","_embedded.results|_links","_embedded.results|broaderSkill","BC_name"]
my_df = pd.DataFrame(columns = column_names)

for index, i in result_df.iterrows():
  for j in i:
    my_df.append(j)

Solution:

If I understand correctly, and each value first needs to be converted to a DataFrame, use:

result_df = pd.concat([pd.DataFrame(x) for x in data], ignore_index=True)

Or, if "data" is already a list of DataFrames:

result_df = pd.concat(data, ignore_index=True)
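As a minimal sketch of both variants, here is a toy example. The two small tables and the names frames, records, combined and combined_from_records are made up for illustration; they only stand in for the real "data" list of 1716 five-row tables. It may also help to know why the loop in the question has no effect: DataFrame.append returned a new DataFrame instead of modifying my_df in place, and the method was removed in pandas 2.0, so collecting the pieces and calling pd.concat once is the usual approach.

```python
import pandas as pd

# Hypothetical stand-in for `data`: two small per-row tables
# (the real list is assumed to hold 1716 tables of 5 rows each).
frames = [
    pd.DataFrame({"title": ["a", "b"], "BC_name": ["X", "X"]}),
    pd.DataFrame({"title": ["c", "d"], "BC_name": ["Y", "Y"]}),
]

# Variant 1: the list already holds DataFrames, so concatenate directly.
# ignore_index=True renumbers the rows 0..n-1 instead of repeating 0..1.
combined = pd.concat(frames, ignore_index=True)

# Variant 2: the list holds raw records (here, lists of dicts), so each
# element is converted to a DataFrame before concatenating.
records = [df.to_dict("records") for df in frames]
combined_from_records = pd.concat(
    [pd.DataFrame(x) for x in records], ignore_index=True
)

print(combined.shape)
print(combined)
```

With the real data, the row count of the combined frame should equal the sum of the row counts of the individual tables, which is a quick way to check that nothing was lost.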