I have a problem in my project.
What I want to do:
I need to get all the words whose characters are separated by underscores, in Python code:
text = "i.am - M_o_h_a_m_m_e_d - and - _1_5y_o - name - moh_mmed - 2_8_j - i___a_m"
re.findall(r'?????' , text)
I need to get:
['M_o_h_a_m_m_e_d','_1_5y_o','i___a_m']
NOTE: if the word starts with a number (like 15yo), there will be an underscore before the leading number.
WRONG : M_o_h_a_m_m_e_d , M_o_h_a_m_m_e_d
WRONG : 1_5_y_o
>Solution :
Here’s a non-regex approach:
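text = "i.am - M_o_h_a_m_m_e_d - and - _1_5y_o - name - moh_mmed - 2_8_j - i___a_m"
result = []
for word in text.split(' - '):
    # Skip words that start with a digit, or with an underscore not followed by a digit.
    if word[0].isdigit() or (word[0] == '_' and not word[1].isdigit()):
        continue
    for char in word.split('_'):
        # A piece longer than one character is accepted only if exactly one
        # of its first two characters is a digit (e.g. '5y').
        if len(char) > 1 and not (char[0].isdigit() ^ char[1].isdigit()):
            break
    else:
        # The inner loop never hit break, so every piece passed: keep the word.
        result.append(word)
print(result)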
Output:
['M_o_h_a_m_m_e_d', '_1_5y_o', 'i___a_m']
Assuming the rules are:
- The characters of a word are separated by underscores, except:
- A digit and a letter next to each other (as in 5y) are not split, and
- A word that starts with a digit is prefixed with an underscore.
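Since the question asked for a re.findall pattern, the rules above can also be sketched as a regex. This is only one possible reading, not a definitive pattern. It assumes that words are delimited by whitespace, that letters are ASCII, and that each piece between underscores is either a single letter or a single digit optionally followed by one letter. That makes it stricter than the loop above, which also accepts a letter followed by a digit. The names pattern and matches are made up for this example.

```python
import re

text = "i.am - M_o_h_a_m_m_e_d - and - _1_5y_o - name - moh_mmed - 2_8_j - i___a_m"

# Assumed grammar (one interpretation of the rules, not confirmed by the asker):
#   piece: a single letter, or a digit optionally followed by one letter
#   word:  a letter, or '_' plus a digit piece, followed by one or more
#          groups of (underscores + piece)
pattern = re.compile(
    r"(?<!\S)"                         # the word must start at a whitespace boundary
    r"(?:[A-Za-z]|_\d[A-Za-z]?)"       # first piece; a leading digit needs '_' before it
    r"(?:_+(?:[A-Za-z]|\d[A-Za-z]?))+" # at least one more piece after underscore(s)
    r"(?!\S)"                          # the word must end at a whitespace boundary
)

# All groups are non-capturing, so findall returns the full matched words.
matches = pattern.findall(text)
print(matches)
```

With the sample text this prints the same three words as the loop. If the real data contains other separators or non-ASCII letters, the boundary lookarounds and character classes would need adjusting.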