Replace the keys of a dictionary with their associated values wherever those keys occur in an input string

import re, datetime

input_text = "el 30 de octubre de 2021"   #example 1
input_text = "del dia 1 de diciembre del año 2020"   #example 2
input_text = "de este 12 del mes de septiembre de 2020"   #example 3
input_text = "de este 12 del mes de septiembre"   #example 4
input_text = "del dia 21 del mes de noviembre"   #example 5

es_month_dict = { "enero":"01", "febrero":"02", "marzo":"03", "abril":"04", "mayo":"05", "junio":"06", "julio":"07", "agosto":"08","septiembre":"09", "octubre":"10", "noviembre":"11", "diciembre":"12" }

#replacement order is important
input_text = input_text.replace(" dia ", " ")
input_text = input_text.replace(" del mes ", " ")
input_text = input_text.replace(" mes ", " ")
input_text = input_text.replace(" del año ", " ")
input_text = input_text.replace(" año ", " ")

#replacement, for example:  "enero" --> "01"

#only if the year is not indicated, the current year is placed
if():
    input_text = input_text + str(datetime.datetime.today().strftime('%Y'))

print(repr(input_text)) #output

The correct output for each of the examples would be:

"el 30 de 10 de 2021"   #for example 1
"del 1 de 12 del 2020"   #for example 2
"de este 12 de 09 de 2020"   #for example 3
"de este 12 de 09 de 2022"   #for example 4
"del 21 de 11 de 2022"   #for example 5

How should I replace the keys of es_month_dict with their values inside the input string?
How should I append the year when it is not given? Should I do that before or after replacing the month names from es_month_dict?

>Solution :

  • iterate over the key/value pairs of your dict: the key is the text to replace, the value is its replacement
  • put all the words to delete in one list, and iterate over it to remove them
  • use a regex to check whether the string ends with 4 digits (a year) before appending the current year
import datetime
import re

es_month_dict = {"enero": "01", "febrero": "02", "marzo": "03", "abril": "04", "mayo": "05", "junio": "06",
                 "julio": "07", "agosto": "08", "septiembre": "09", "octubre": "10", "noviembre": "11",
                 "diciembre": "12"}

# Order matters: "del mes" and "del año" must be removed before "mes" and "año"
deletion_words = ["dia", "del mes", "mes", "del año", "año"]

def transform(value):
    # Remove filler words; the surrounding spaces keep partial words intact
    for deletion_word in deletion_words:
        value = value.replace(f" {deletion_word} ", " ")

    # Replace each month name with its two-digit number
    for month_name, month_nb in es_month_dict.items():
        value = value.replace(month_name, month_nb)

    # Append the current year only if the string does not already end with 4 digits
    if not re.search(r"\d{4}$", value):
        value += " de " + datetime.datetime.today().strftime('%Y')

    return value

examples = [
    "el 30 de octubre de 2021",
    "del dia 1 de diciembre del año 2020",
    "de este 12 del mes de septiembre de 2020",
    "de este 12 del mes de septiembre",
    "del dia 21 del mes de noviembre",
]
for input_text in examples:
    print(transform(input_text))
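Output (the last two lines use the current year, which was 2022 when this was run):

el 30 de 10 de 2021
del 1 de 12 2020
de este 12 de 09 de 2020
de este 12 de 09 de 2022
del 21 de 11 de 2022

Note that the second line is "del 1 de 12 2020" rather than the "del 1 de 12 del 2020" asked for in the question, because " del año " is removed as a single unit.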
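
As a possible variant, the same three steps can be sketched with compiled regexes instead of chained str.replace calls. This is only an illustration under a few assumptions that are not part of the original answer: the names ES_MONTHS, FILLER_RE, MONTH_RE and the optional year parameter are hypothetical; word boundaries (\b) are used so that a month name inside a longer word (for example "mayo" in "mayor") is left alone; and only "año" is stripped, not "del año", so that example 2 matches the output requested in the question.

```python
import datetime
import re

# Hypothetical names for this sketch; the mapping mirrors es_month_dict.
ES_MONTHS = {"enero": "01", "febrero": "02", "marzo": "03", "abril": "04",
             "mayo": "05", "junio": "06", "julio": "07", "agosto": "08",
             "septiembre": "09", "octubre": "10", "noviembre": "11",
             "diciembre": "12"}

# Filler words plus any whitespace that follows them.
# "del mes" is listed before "mes" so the longer phrase is tried first.
FILLER_RE = re.compile(r"\b(?:del mes|dia|mes|año)\b\s*")

# Month names as whole words only.
MONTH_RE = re.compile(r"\b(?:" + "|".join(ES_MONTHS) + r")\b")


def transform(text, year=None):
    """Strip filler words, replace month names, append a year if missing."""
    if year is None:
        # Assumption: fall back to the current year, as in the answer above.
        year = datetime.date.today().year
    text = FILLER_RE.sub("", text)
    text = MONTH_RE.sub(lambda m: ES_MONTHS[m.group(0)], text)
    # Only append when the text does not already end with a 4-digit number.
    if not re.search(r"\b\d{4}$", text):
        text = f"{text} de {year}"
    return text


print(transform("del dia 1 de diciembre del año 2020"))
print(transform("de este 12 del mes de septiembre", year=2022))
```

Passing the year explicitly makes the result reproducible in tests. As for the ordering question, the year check only looks for four trailing digits and month numbers have two, so appending the year before or after the month replacement should give the same result for these inputs.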