I am not getting the correct output when extracting particular key values from nested JSON. Please help me correct the code.

I want to extract the task name (task_id) and the config of each task into a new variable.
The code I have shared does not give the desired output: it extracts some of the entries, but not all of the required details.

Here is the JSON:

old = {
        "tasks": [
            {
                "task_group_id": "Task_group_1",
                "branch": [
                    {
                        "task_id": "Task_Name_1",
                        "code_file_path": "tasks/base_creation/final_base_logic.hql",
                        "language": "hive",
                        "config": {
                            "k1": "v1",
                            "Q1":"W1"
                        },
                        "sequence": 1,
                        "condition": "in_start_date in range [2021-10-01 , 2023-11-04]"
                    }
                ],
                "default": {
                    "task_id": "Task_group_1_default",
                    "code_file_path": "tasks/base_creation/default_base_logic.hql",
                    "language": "hive",
                    "config": {}
                }
            },
            {
                "task_group_id": "Task_group_2",
                "branch": [
                    {
                        "task_id": "Task_Name_2",
                        "code_file_path": "tasks/variables_creation/final_cas_logic.py",
                        "language": "pyspark",
                        "config": {
                            "k2": "v2"
                        },
                        "sequence": 1,
                        "condition": "in_start_date in range [2022-02-01 , 2023-11-04]"
                    },
                    {
                        "task_id": "Task_Name_3",
                        "code_file_path": "tasks/variables_creation/final_sor_logic.py",
                        "language": "pyspark",
                        "config": {
                            "k3": "v3"
                        },
                        "sequence": 2,
                        "condition": "in_start_date in range [2021-10-01 , 2022-01-31]"
                    }
                ],
                "default": {
                    "task_id": "Task_group_2_default",
                    "code_file_path": "tasks/variables_creation/default_variables_logic.py",
                    "language": "pyspark",
                    "config": {}
                }
            }
        ],
        "dependencies": " ['task_group_id_01_Name >> task_group_id_02_Name']"
    }

Here is my code for extracting the info:

o_mod = []
for grp in range(len(old['tasks'])):
    for task_id in range(len(old['tasks'][grp]['branch'])):
        o_mod.append({})
        o_mod[grp]['task_id'] = old['tasks'][grp]['branch'][task_id]['task_id']
        o_mod[grp]['config'] = old['tasks'][grp]['branch'][task_id]['config']
            
print(o_mod)

Here is the output, which is wrong:

[{'task_id': 'Task_Name_1', 'config': {'k1': 'v1', 'Q1': 'W1'}},
 {'task_id': 'Task_Name_3', 'config': {'k3': 'v3'}},
 {}]

I want the output to look like this (correct output):

[{'task_id': 'Task_Name_1', 'config': {'k1': 'v1', 'Q1': 'W1'}},
 {'task_id': 'Task_Name_2', 'config': {'k2': 'v2'}},
 {'task_id': 'Task_Name_3', 'config': {'k3': 'v3'}}]

Solution:

You could use a nested list comprehension over tasks and branch:

o_mod = [ { 'task_id' : b['task_id'], 'config' : b['config'] } for t in old['tasks'] for b in t['branch'] ]

Output:

[
    {
        "task_id": "Task_Name_1",
        "config": {
            "k1": "v1",
            "Q1": "W1"
        }
    },
    {
        "task_id": "Task_Name_2",
        "config": {
            "k2": "v2"
        }
    },
    {
        "task_id": "Task_Name_3",
        "config": {
            "k3": "v3"
        }
    }
]
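
For readers who prefer an explicit loop, the same fix can be sketched as below. The original code went wrong because it wrote into o_mod[grp]: o_mod grows by one entry per branch, but grp only counts task groups, so the second branch of Task_group_2 overwrote the first and the last appended dict stayed empty. Building a fresh dict for each branch and appending it avoids indexing altogether. The `old` dictionary here is a trimmed copy of the question's data, kept only so that the snippet runs on its own.

```python
# Trimmed copy of the question's data: only the keys the loop reads.
old = {
    "tasks": [
        {"branch": [
            {"task_id": "Task_Name_1", "config": {"k1": "v1", "Q1": "W1"}},
        ]},
        {"branch": [
            {"task_id": "Task_Name_2", "config": {"k2": "v2"}},
            {"task_id": "Task_Name_3", "config": {"k3": "v3"}},
        ]},
    ]
}

o_mod = []
for group in old["tasks"]:
    for branch in group["branch"]:
        # Append one new dict per branch instead of writing into
        # o_mod[grp], which addresses the wrong slot once a group
        # has more than one branch.
        o_mod.append({"task_id": branch["task_id"], "config": branch["config"]})

print(o_mod)
```

This is equivalent to the list comprehension above; which form to use is a matter of taste.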