How to substitute unstressed vowel?

I have a CSV file with the following data:

bel.lez.za;bellézza
e.la.bo.ra.re;elaboràre
a.li.an.te;alïante
u.mi.do;ùmido

The first value is the word divided into syllables and the second shows where the stress falls.
I’d like to merge the two pieces of information and obtain the following output:

bel.léz.za
e.la.bo.rà.re
a.lï.an.te
ù.mi.do

I computed the position of the stressed vowel and tried to substitute it for the matching unstressed vowel in the first value, but the full stops make indexing difficult. Is there a way to tell Python to ignore full stops while counting? Or is there an easier way to do this? Thanks.


After splitting the two values for each line I computed the position of the stressed vowels:

    char_list = ['ò', 'à', 'ù', 'ì', 'è', 'é', 'ï']
    for character in char_list:
        if character in value[1]:
            position_of_stressed_vowel = value[1].index(character)

Solution:

I’d suggest merging/aligning the two forms in parallel instead of trying to substitute things via indexing. The idea is to iterate through the plain form and, for every character that is not a dot, consume the next character from the accented form, keeping the dots as they are.

(Put another way, this adds the dots to the accented form instead of adding the accented characters to the syllabified form.)

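    def merge_accents(plain, accented):
        """Copy the syllable dots from `plain` into `accented`."""
        output = ""
        # The iterator hands out the accented characters one at a time.
        acc_chars = iter(accented)

        for char in plain:
            if char == ".":
                # Syllable break: keep the dot, consume nothing.
                output += char
            else:
                # Letter: take the corresponding accented character.
                output += next(acc_chars)

        return output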
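The other way of looking at it mentioned above, adding the dots to the accented form, could also be sketched by slicing the accented word according to the syllable lengths. This is only an illustration: the name `add_dots` is hypothetical, and the NFC normalization step is an assumption that each accented letter should be a single precomposed character, so that both forms have the same number of letters.

```python
import unicodedata


def add_dots(plain, accented):
    """Re-insert the syllable breaks of `plain` into `accented` (hypothetical helper)."""
    # Assumption: normalizing to NFC makes each accented letter one character,
    # so the lengths of the two forms line up.
    accented = unicodedata.normalize("NFC", accented)

    syllables = []
    start = 0
    for syllable in plain.split("."):
        # Cut a slice of the accented word as long as the plain syllable.
        end = start + len(syllable)
        syllables.append(accented[start:end])
        start = end

    return ".".join(syllables)


print(add_dots("bel.lez.za", "bellézza"))  # bel.léz.za
```

Both approaches rely on the two forms containing the same number of letters once the dots are left out.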

Test:

    data = [['bel.lez.za', 'bellézza'],
            ['e.la.bo.ra.re', 'elaboràre'],
            ['a.li.an.te', 'alïante'],
            ['u.mi.do', 'ùmido']]

    for plain, accented in data:
        print(merge_accents(plain, accented))

    # Prints:
    # bel.léz.za
    # e.la.bo.rà.re
    # a.lï.an.te
    # ù.mi.do
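Since the question starts from a semicolon-separated CSV file, here is a minimal sketch of applying the same function to such input with the standard `csv` module. The `merge_rows` helper is hypothetical, and an in-memory `io.StringIO` stands in for the real file so the example runs on its own; `merge_accents` is repeated from the answer for the same reason.

```python
import csv
import io


def merge_accents(plain, accented):
    """Same function as in the answer, repeated so this block is self-contained."""
    output = ""
    acc_chars = iter(accented)
    for char in plain:
        output += char if char == "." else next(acc_chars)
    return output


def merge_rows(handle):
    """Read 'syllabified;accented' rows and return the merged words (hypothetical helper)."""
    merged = []
    for row in csv.reader(handle, delimiter=";"):
        # Skip blank or malformed lines that do not have exactly two fields.
        if len(row) != 2:
            continue
        plain, accented = row
        merged.append(merge_accents(plain, accented))
    return merged


# Stand-in for the real CSV file from the question.
sample = io.StringIO(
    "bel.lez.za;bellézza\n"
    "e.la.bo.ra.re;elaboràre\n"
    "a.li.an.te;alïante\n"
    "u.mi.do;ùmido\n"
)

for word in merge_rows(sample):
    print(word)
```

With a real file, the `StringIO` object would be replaced by something like `open("words.csv", encoding="utf-8", newline="")`, where the file name and the UTF-8 encoding are assumptions about your data.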