I have a string in Python that I want to split into a list. The string is formatted as a numbered list.
Example string: "1. one. 2. two. 3. three." Something like re.split("\d+.", string) works, except that sometimes the string contains integers followed by a ".", such as amounts of currency. Example string: "1. I have $2.2 million. 2. blah blah." This breaks "2.2 million" off into a separate string. How can I use regular expressions to get around this? Thanks.
>Solution :
Using a regex is a good idea. Judging from the examples you have given, the separator just needs to be more precise.
If the dot follows the index of an item (1. something), it is followed by a space. If it is part of a decimal number (2.2 million), it is not.
Therefore, you can split your string this way:
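import re
string = "1. test 2. test2 3. 3.4 million"
# Raw string, so the backslashes reach the regex engine unchanged.
splitted_string = re.split(r"\d+\. ", string)
# Result: ['', 'test ', 'test2 ', '3.4 million'] (the first element is empty because the string starts with a separator)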
Of course, this new separator only works if a number followed by a dot and a space never appears anywhere other than at an item index. A sentence that ends in a number, such as "I own 7. ", would be split as well.
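If a sentence in your data can end in a number, one possible workaround is to accept a marker only when it is the next index in the sequence 1, 2, 3, and so on. The sketch below is a heuristic, not a complete parser: the function name split_numbered is made up for illustration, and it assumes the list starts at 1 and that every marker is preceded by whitespace or the start of the string.

```python
import re

def split_numbered(text):
    """Split a numbered list, accepting only markers that count up 1, 2, 3, ..."""
    items = []
    expected = 1   # index the next real marker should carry
    start = None   # position where the current item's text begins
    # (?<!\S) requires the number to sit at the start or right after whitespace,
    # so the "2." inside "$2.2" is never considered.
    for match in re.finditer(r"(?<!\S)(\d+)\.\s+", text):
        if int(match.group(1)) != expected:
            continue  # e.g. a sentence ending in a number, not a list marker
        if start is not None:
            items.append(text[start:match.start()].strip())
        start = match.end()
        expected += 1
    if start is not None:
        items.append(text[start:].strip())
    return items

print(split_numbered("1. I have $2.2 million. 2. blah blah."))
# ['I have $2.2 million.', 'blah blah.']
```

It can still be fooled when a sentence happens to end in exactly the next index (for example "1. I own 2. 2. blah"), so treat it as a starting point.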
Notice I also added a backslash before the dot: in a regex, an unescaped dot matches any character except a newline by default (including a literal dot, of course), but here you are specifically looking for a literal dot in the string.
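To see the difference the backslash makes, here is a small comparison on the example string from the question. The variable names loose and strict are only for illustration.

```python
import re

text = "1. I have $2.2 million. 2. blah blah."

# Unescaped dot: "." matches any character, so the digits inside "2.2"
# are treated as separators and the amount is torn apart.
loose = re.split(r"\d+.", text)

# Escaped dot followed by a space: only "1. " and "2. " are separators.
strict = re.split(r"\d+\. ", text)

print(loose)
print(strict)  # ['', 'I have $2.2 million. ', 'blah blah.']
```

With the strict pattern, the amount "$2.2 million" stays in one piece. The leading empty string is still there because the text starts with a separator.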