Python: how to get the next sequence of a list of lists based on a condition?

I used an NLP chunker that incorrectly splits the terms ‘C++’ and ‘C#’ into: C (NN), + (SYM), + (SYM), C (NN), # (SYM).

The resulting list of incorrect chunking looks like this:

l = [['C', 'NN'], ['+', 'SYM'], ['+', 'SYM'], ['C', 'NN'], ['#', 'NN']]

I would like to post-process this list by finding each item whose string at index 0 is ‘C’ and is followed by ‘+’, ‘+’ or ‘#’, and then concatenating those strings, so that ‘C’, ‘+’, ‘+’ becomes ‘C++’. This has to be generalisable: it should work on lists that contain many different words and still concatenate the desired strings.

Desired result:

l_desired = [['C++', 'NN'], ['C#', 'NN']]

I can identify the items in the list independently (index 0) but I don’t know how to go about identifying the desired sequence. My idea was to use the next() function, although I do not know where to begin.

>Solution:

You can loop over the list and check whether the first character of each item's string is a letter. If it is, append a copy of the item as a new entry; otherwise, concatenate the string onto the last entry:

from string import ascii_letters

letters = set(ascii_letters)

out = []
for e in l:
    if e[0][0] in letters:
        out.append(e.copy()) # making a copy not to affect original list
    elif out: # this is to check that out is not empty (edge case)
        out[-1][0] += e[0]

Or using a blacklist of symbols:

symbols = set('+#')

out = []
for e in l:
    if e[0] in symbols and out:
        out[-1][0] += e[0]
    else:
        out.append(e.copy())

output:

[['C++', 'NN'], ['C#', 'NN']]
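
Both loops above attach a symbol to whatever item precedes it, so in a longer sentence a stray ‘+’ after an unrelated word would be merged too. A stricter variant, sketched below, merges symbols only when they follow a known base token such as ‘C’. The function name `merge_symbols` and its `symbols` and `bases` parameters are hypothetical and chosen for illustration; it also assumes every item is a two-element `[word, tag]` pair.

```python
def merge_symbols(tokens, symbols=frozenset("+#"), bases=frozenset({"C"})):
    """Return a new list in which symbol tokens that directly follow
    a base token are concatenated onto it (hypothetical helper)."""
    out = []
    mergeable = False  # True while the last entry in out started as a base token
    for word, tag in tokens:
        if word in symbols and mergeable:
            out[-1][0] += word        # extend e.g. 'C' -> 'C+' -> 'C++'
        else:
            out.append([word, tag])   # fresh list, so the input is not modified
            mergeable = word in bases
    return out


tokens = [['C', 'NN'], ['+', 'SYM'], ['+', 'SYM'], ['and', 'CC'],
          ['C', 'NN'], ['#', 'SYM'], ['or', 'CC'],
          ['a', 'DT'], ['+', 'SYM'], ['b', 'NN']]

print(merge_symbols(tokens))
# [['C++', 'NN'], ['and', 'CC'], ['C#', 'NN'], ['or', 'CC'], ['a', 'DT'], ['+', 'SYM'], ['b', 'NN']]
```

The merged entry keeps the tag of the base token, as in the desired result above. To merge after other bases (for example ‘F’ for ‘F#’), pass a larger `bases` set.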