Fastest way of replacing strings with their counterparts

I need to replace slang acronyms within a string with their expansions. The slang dataset I use is this one, with over 3k items. This is my current code for the process:

import pandas as pd

slangs = pd.read_csv('slang.csv', index_col=[0])

def expand_slang_acronyms():
    word_list = 'foo brb bar'.split(' ')
    for i in range(len(word_list)):
        for j in range(len(slangs)):
            if word_list[i] == slangs.loc[j, 'acronym']:
                word_list[i] = slangs.loc[j, 'expansion']

    print(' '.join(word_list)) # 'foo be right back bar'

A single run is quite fast, but I need to process thousands of strings. Timing just 100 executions of the code:

from timeit import timeit
timeit(expand_slang_acronyms, number=100)

In this instance it output 6.519681000005221 (seconds), which is really slow for only 100 runs. I need a faster way to do this.

>Solution :

There are many ways to do this. One way to speed it up is to load the acronyms into a dictionary once and look each word up in it:

import pandas as pd
from timeit import timeit

slangs = pd.read_csv('slang.csv')
# Build the acronym -> expansion lookup once, outside the timed function.
slang_dict = dict(zip(slangs['acronym'], slangs['expansion']))

def expand_slang_acronyms():
    word_list = 'foo brb bar'.split(' ')
    for i in range(len(word_list)):
        # Average O(1) dictionary lookup instead of scanning the DataFrame.
        if word_list[i] in slang_dict:
            word_list[i] = slang_dict[word_list[i]]

    print(' '.join(word_list)) # 'foo be right back bar'

timeit(expand_slang_acronyms, number=100)

This should give a substantial speedup: a dictionary lookup is O(1) on average, whereas the original code scans all 3k+ rows of the DataFrame for every word, which is O(n) per word, and each .loc access adds overhead of its own.
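
Since the goal is to process thousands of strings, the same dictionary lookup can be wrapped in a function that takes the text as an argument and returns the result instead of printing it. The following is only a minimal sketch: the small in-memory slang_dict is a hypothetical stand-in for the mapping built from slang.csv, and the mapping parameter and the sentences list are illustrative names that do not appear in the original code.

```python
# Hypothetical stand-in for the dictionary built from slang.csv.
slang_dict = {
    "brb": "be right back",
    "idk": "I don't know",
}

def expand_slang_acronyms(text, mapping):
    """Return text with each space-separated word found in mapping replaced."""
    # mapping.get(word, word) falls back to the word itself
    # when it is not a known acronym.
    return " ".join(mapping.get(word, word) for word in text.split(" "))

# Illustrative input; in practice this would be the thousands of strings.
sentences = ["foo brb bar", "idk what that means"]
expanded = [expand_slang_acronyms(s, slang_dict) for s in sentences]
print(expanded)
```

Like the code above, this only matches whole words separated by single spaces, so an acronym followed by punctuation (for example "brb!") or written in a different case is left unchanged.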
