Python 3.9.18. If the behavior below isn’t a bug in re, why am I getting different results from code that is supposed to be equivalent? (NOTE: I am not looking for alternative ways to achieve the expected result; I already have plenty of such alternatives.)
>>> import re
>>> s = '{"merge":"true","from_cache":"true","html":"true","links":"false"}'
>>> re.sub(r'"(true|false)"', r'\1', s, re.I)
'{"merge":true,"from_cache":true,"html":"true","links":"false"}'
^^^ Note that only the 1st and 2nd "true" were replaced; the 3rd "true" and the final "false" still have quotes " around them.
Whereas the following, which is supposed to be equivalent ((?i) instead of re.I), works as expected:
>>> import re
>>> s = '{"merge":"true","from_cache":"true","html":"true","links":"false"}'
>>> re.sub(r'(?i)"(true|false)"', r'\1', s)
'{"merge":true,"from_cache":true,"html":true,"links":false}'
^^^ all instances of "true" and "false" were replaced.
>Solution :
The function re.sub() has the following signature:
Signature: re.sub(pattern, repl, string, count=0, flags=0)
If you pass re.I as the fourth positional argument, it is interpreted as count, not as flags.
When converted to an integer, re.I is equal to 2.
>>> print(int(re.I))
2
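So passing the flag positionally limits re.sub() to 2 replacements, and case-insensitive matching is never actually enabled. The first two matches still succeed only because the input is already lowercase.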
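As a quick check of that explanation, the sketch below compares the positional call from the question with an explicit count=2. The variable names (pattern, positional, explicit_count) are only for illustration. The warnings filter is there because, as far as I know, Python 3.13 and later emit a DeprecationWarning when count or flags are passed positionally; on 3.9 it has no effect.

```python
import re
import warnings

s = '{"merge":"true","from_cache":"true","html":"true","links":"false"}'
pattern = r'"(true|false)"'

# Newer Python versions warn about positional count/flags, so silence
# DeprecationWarning for this one deliberately "wrong" call.
with warnings.catch_warnings():
    warnings.simplefilter("ignore", DeprecationWarning)
    # re.I lands in the count slot, so this asks for at most 2 replacements.
    positional = re.sub(pattern, r'\1', s, re.I)

# The same request, spelled out explicitly.
explicit_count = re.sub(pattern, r'\1', s, count=2)

print(positional == explicit_count)  # True
print(positional)
```

Both calls leave the last two values quoted, which matches the output shown in the question.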
Instead, I suggest passing the flag as a keyword argument:
re.sub(r'"(true|false)"', r'\1', s, flags=re.I)
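To confirm that the keyword form both lifts the replacement limit and really turns on case-insensitive matching, here is a small sketch. The second input string (mixed) is a made-up example, not from the question, chosen so that the flag makes a visible difference.

```python
import re

s = '{"merge":"true","from_cache":"true","html":"true","links":"false"}'

# flags=re.I goes to the flags parameter; count keeps its default of 0,
# which means "replace every match".
fixed = re.sub(r'"(true|false)"', r'\1', s, flags=re.I)
print(fixed)  # {"merge":true,"from_cache":true,"html":true,"links":false}

# Hypothetical mixed-case input: these only match because re.I is in effect.
mixed = '{"a":"TRUE","b":"False","c":"true"}'
unquoted = re.sub(r'"(true|false)"', r'\1', mixed, flags=re.I)
print(unquoted)  # {"a":TRUE,"b":False,"c":true}
```

If a replacement limit is also wanted, count and flags can be given together as keywords, for example count=2, flags=re.I.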