I have a CSV file with the following data:
bel.lez.za;bellézza
e.la.bo.ra.re;elaboràre
a.li.an.te;alïante
u.mi.do;ùmido
The first value is the word divided into syllables, and the second shows where the stress falls.
I’d like to merge the two pieces of information and obtain the following output:
bel.léz.za
e.la.bo.rà.re
a.lï.an.te
ù.mi.do
I computed the position of the stressed vowel and tried to substitute it for the corresponding unstressed vowel in the first value, but the full stops make indexing difficult. Is there a way to tell Python to ignore full stops while counting, or is there an easier way to do this? Thanks.
After splitting the two values for each line I computed the position of the stressed vowels:
char_list=['ò','à','ù','ì','è','é','ï']
for character in char_list:
    if character in value[1]:
        position_of_stressed_vowel = value[1].index(character)
>Solution :
I’d suggest merging the two forms in parallel instead of trying to substitute characters via indexing. The idea is to iterate through the plain form: each dot is copied as it is, and every other character is replaced by the next character taken from the accented form. This assumes both forms contain the same number of letters.
(Put another way, the idea is to add the dots to the accented form instead of adding the accented characters to the syllabified form.)
def merge_accents(plain, accented):
    output = ""
    acc_chars = iter(accented)
    for char in plain:
        if char == ".":
            output += char
        else:
            output += next(acc_chars)
    return output
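The alternative reading mentioned in the parenthesis above, adding the dots to the accented form, could be sketched roughly as follows. The name `add_dots` is made up for illustration, and the sketch assumes every accented vowel is stored as a single precomposed character, so that both forms have the same number of letters.

```python
def add_dots(plain, accented):
    """Cut the accented word at the syllable lengths found in the plain form."""
    pieces = []
    start = 0
    for syllable in plain.split("."):
        end = start + len(syllable)
        # Take as many accented characters as the plain syllable has.
        pieces.append(accented[start:end])
        start = end
    return ".".join(pieces)


print(add_dots("bel.lez.za", "bellézza"))  # bel.léz.za
```

Both versions should give the same result on well-formed input; this one simply works a syllable at a time rather than a character at a time.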
Test:
data = [['bel.lez.za', 'bellézza'],
        ['e.la.bo.ra.re', 'elaboràre'],
        ['a.li.an.te', 'alïante'],
        ['u.mi.do', 'ùmido']]
# Prints:
# bel.léz.za
# e.la.bo.rà.re
# a.lï.an.te
# ù.mi.do
for plain, accented in data:
    print(merge_accents(plain, accented))
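Since the data comes from a semicolon-separated CSV file, the same function could be applied row by row with the standard `csv` module. This is only a sketch: `merge_csv` is a hypothetical helper, and an in-memory `io.StringIO` stands in for the real file, whose name is not given in the question.

```python
import csv
import io


def merge_accents(plain, accented):
    # Same logic as the function in the answer, repeated so this runs on its own.
    output = ""
    acc_chars = iter(accented)
    for char in plain:
        output += char if char == "." else next(acc_chars)
    return output


def merge_csv(lines):
    """Yield the merged form for every 'plain;accented' row."""
    for row in csv.reader(lines, delimiter=";"):
        if len(row) != 2:
            continue  # skip blank or malformed rows
        plain, accented = row
        yield merge_accents(plain, accented)


# Stand-in for the real file; with a file on disk you would pass the object
# returned by open(path, encoding="utf-8", newline="") instead.
sample = io.StringIO("bel.lez.za;bellézza\nu.mi.do;ùmido\n")
for merged in merge_csv(sample):
    print(merged)
```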