I am a RegEx beginner and trying to identify the endings of different statements in sms. See screenshot below.
How can I avoid selecting the next letter following by a full-stop that indicates ending of a statement. Note that some statements have <.> while some have <.>
Regex used: r". ?[\D]"
Sample SMS: – I want to select just the full-stop and space if any.
Txn of USD 00.00 done using TC XX at POS*MERCH on 30-Feb-22. Avl bal:USD 00.00. Call xxxxxx for dispute or SMS BLOCK xxxx to xxxxxxx
Acct XX debited with USD XX.00 on some date.Info: ABC*BDECS-XYZ.Avbl Bal:USD yy,xxx.95.Call xxxxxx for dispute or SMS BLOCK xx to xxxxx
screenshot from RegExr on regular pattern
>Solution :
What you’re looking for is a look-ahead group. Whether you make that a positive look-ahead and use the negated character set \D or a negative look-ahead with the character set \d doesn’t really matter- I’ll outline both below:
regex = r". ?(?=\D)" # asserts that the following character matches \D
regex = r". ?(?!\d)" # asserts the following character does NOT match \d
There’s also look-behind variants (?<!pattern) and (?<=pattern), which assert that the pattern doesn’t/does match just before the current position.
None of these groups capture the matched text- they just "look ahead" or "look behind" without changing state.