Parse a file that contains lists of objects in Python

I have a JSON-like file in the format below. I would like to store the BLEU scores in one list and the chrF2++ scores in another list.

The file format:

[
{
 "name": "BLEU",
 "score": 38.8,
 "signature": "nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.0.0",
 "verbose_score": "75.0/45.5/30.0/22.2 (BP = 1.000 ratio = 1.000 hyp_len = 12 ref_len = 12)",
 "nrefs": "1",
 "case": "lc",
 "eff": "no",
 "tok": "13a",
 "smooth": "exp",
 "version": "2.0.0"
},
{
 "name": "chrF2++",
 "score": 49.6,
 "signature": "nrefs:1|case:mixed|eff:yes|nc:6|nw:2|space:no|version:2.0.0",
 "nrefs": "1",
 "case": "mixed",
 "eff": "yes",
 "nc": "6",
 "nw": "2",
 "space": "no",
 "version": "2.0.0"
}
]
[
{
 "name": "BLEU",
 "score": 19.2,
 "signature": "nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.0.0",
 "verbose_score": "61.5/33.3/18.2/5.0 (BP = 0.926 ratio = 0.929 hyp_len = 13 ref_len = 14)",
 "nrefs": "1",
 "case": "lc",
 "eff": "no",
 "tok": "13a",
 "smooth": "exp",
 "version": "2.0.0"
},
{
 "name": "chrF2++",
 "score": 38.8,
 "signature": "nrefs:1|case:mixed|eff:yes|nc:6|nw:2|space:no|version:2.0.0",
 "nrefs": "1",
 "case": "mixed",
 "eff": "yes",
 "nc": "6",
 "nw": "2",
 "space": "no",
 "version": "2.0.0"
}
]
....

I tried:

with open(sys.argv[1]) as f:
    for jsonObj in f:
        list_of_scores = json.loads(jsonObj)
        print(list_of_scores)
        bleuScores.append(list_of_scores[0])
        chrfScores.append(list_of_scores[1])

but it did not work.

> Solution:

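Your file is not valid JSON: it contains several top-level arrays written one after another, and each array spans multiple lines, so neither parsing the whole file nor parsing it line by line works. One option is to rewrite the text into a single valid JSON array first. After that, a simple for loop gives you the desired lists: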

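import json
import sys

# Read the whole file; it holds several JSON arrays back to back.
with open(sys.argv[1]) as f:
    text = f.read()

# Drop the array brackets, put a comma after every object, collapse the
# doubled commas, then wrap everything in one pair of brackets.
# This assumes that no string value contains "[", "]" or "}".
text = text.replace("[", "").replace("]", "").replace("}", "},") \
    .replace("},,", "},").strip().strip(",")
text = "[" + text + "]"
records = json.loads(text)  # a flat list of dicts

bleus = []
chrs = []
for value in records:
    if value["name"] == "BLEU":
        bleus.append(value)
    elif value["name"] == "chrF2++":
        chrs.append(value)
print(bleus)
print(chrs)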

Output

[{'name': 'BLEU', 'score': 38.8, 'signature': 'nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.0.0', 'verbose_score': '75.0/45.5/30.0/22.2 (BP = 1.000 ratio = 1.000 hyp_len = 12 ref_len = 12)', 'nrefs': '1', 'case': 'lc', 'eff': 'no', 'tok': '13a', 'smooth': 'exp', 'version': '2.0.0'}, {'name': 'BLEU', 'score': 19.2, 'signature': 'nrefs:1|case:lc|eff:no|tok:13a|smooth:exp|version:2.0.0', 'verbose_score': '61.5/33.3/18.2/5.0 (BP = 0.926 ratio = 0.929 hyp_len = 13 ref_len = 14)', 'nrefs': '1', 'case': 'lc', 'eff': 'no', 'tok': '13a', 'smooth': 'exp', 'version': '2.0.0'}]
[{'name': 'chrF2++', 'score': 49.6, 'signature': 'nrefs:1|case:mixed|eff:yes|nc:6|nw:2|space:no|version:2.0.0', 'nrefs': '1', 'case': 'mixed', 'eff': 'yes', 'nc': '6', 'nw': '2', 'space': 'no', 'version': '2.0.0'}, {'name': 'chrF2++', 'score': 38.8, 'signature': 'nrefs:1|case:mixed|eff:yes|nc:6|nw:2|space:no|version:2.0.0', 'nrefs': '1', 'case': 'mixed', 'eff': 'yes', 'nc': '6', 'nw': '2', 'space': 'no', 'version': '2.0.0'}]
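
The lists above hold the whole objects, while the question asks only for the score values. As a possible alternative that avoids string replacement, here is a minimal sketch that walks through the concatenated arrays with `json.JSONDecoder.raw_decode` and keeps only the `score` field. The helper names `iter_json_values` and `collect_scores` and the shortened `sample` text are made up for this illustration; they are not part of the original answer.

```python
import json


def iter_json_values(text):
    """Yield each top-level JSON value found back to back in `text`."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        # raw_decode does not skip leading whitespace, so skip it here.
        while pos < len(text) and text[pos].isspace():
            pos += 1
        if pos == len(text):
            break
        # raw_decode returns the parsed value and the index where it ended.
        value, pos = decoder.raw_decode(text, pos)
        yield value


def collect_scores(text):
    """Return a dict mapping each metric name to its list of scores."""
    scores = {}
    for block in iter_json_values(text):
        for entry in block:
            scores.setdefault(entry["name"], []).append(entry["score"])
    return scores


# Shortened stand-in for the file contents (hypothetical sample data
# that reuses the names and scores from the question).
sample = """
[{"name": "BLEU", "score": 38.8}, {"name": "chrF2++", "score": 49.6}]
[{"name": "BLEU", "score": 19.2}, {"name": "chrF2++", "score": 38.8}]
"""
scores = collect_scores(sample)
print(scores["BLEU"])     # [38.8, 19.2]
print(scores["chrF2++"])  # [49.6, 38.8]
```

To use it on a real file, pass the result of `f.read()` to `collect_scores`. Any text that is not JSON between or after the arrays makes `raw_decode` raise `json.JSONDecodeError`, so the file must contain only the arrays and whitespace.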