Find a string between two substrings, BUT the end of the first is the start of the next one

March 17, 2024

So I have a string that goes like this:

...<p><noop><fademusic:23,0><26:1><wait:30> <speed:10><30:2><5D:1><color:3>August 3, 9:47 AM<b>District  Court<b>Defendant Lobby No. 2<color:0><p><hidetextbox:1><5D:0> <speed:255><music:8,0><wait:30><26:0><bgcolor:513,1,31><wait:7> <person:0,0,0><bg:2><bgcolor:258,1,31><wait:15><wait:30><hidetextbox:0> <name:512><shake:30,0><color:2>(Boy am I nervous!)<color:0><p> <hidetextbox:1><wait:45><name:1792><hidetextbox:0><bgcolor:769,8,31> Wright!<p>...

What I need: find everything between the <p> tags. (Note that each closing <p> is also the opening <p> for the next segment.)

My code:

...
filetext = open(fn).read()
tag = '<p>'
result = re.findall(tag+"(.*?)"+tag,filetext,re.DOTALL)
print(result)
...

Expected output:

['<noop><fademusic:23,0><26:1><wait:30>\n<speed:10><30:2><5D:1><color:3>August 3, 9:47 AM<b>District \nCourt<b>Defendant Lobby No. 2<color:0>', '<hidetextbox:1><5D:0>\n<speed:255><music:8,0><wait:30><26:0><bgcolor:513,1,31><wait:7>\n<person:0,0,0><bg:2><bgcolor:258,1,31><wait:15><wait:30><hidetextbox:0>\n<name:512><shake:30,0><color:2>(Boy am I nervous!)<color:0>', '\n<hidetextbox:1><wait:45><name:1792><hidetextbox:0><bgcolor:769,8,31>\nWright!']

Resulting output:

['<noop><fademusic:23,0><26:1><wait:30>\n<speed:10><30:2><5D:1><color:3>August 3, 9:47 AM<b>District \nCourt<b>Defendant Lobby No. 2<color:0>', '\n<hidetextbox:1><wait:45><name:1792><hidetextbox:0><bgcolor:769,8,31>\nWright!']

>Solution :

No need for the re module; just use str.split('<p>'). You may not want the empty strings that appear in the result when <p> starts or ends the string, so here is a solution for that case:

s = '<p><noop><fademusic:23,0><26:1><wait:30> <speed:10><30:2><5D:1><color:3>August 3, 9:47 AM<b>District  Court<b>Defendant Lobby No. 2<color:0><p><hidetextbox:1><5D:0> <speed:255><music:8,0><wait:30><26:0><bgcolor:513,1,31><wait:7> <person:0,0,0><bg:2><bgcolor:258,1,31><wait:15><wait:30><hidetextbox:0> <name:512><shake:30,0><color:2>(Boy am I nervous!)<color:0><p> <hidetextbox:1><wait:45><name:1792><hidetextbox:0><bgcolor:769,8,31> Wright!<p>'
result = s.split('<p>')
for n in (0, -1):
    if result and not result[n]:
        del result[n]
print(result)

Output:

['<noop><fademusic:23,0><26:1><wait:30> <speed:10><30:2><5D:1><color:3>August 3, 9:47 AM<b>District  Court<b>Defendant Lobby No. 2<color:0>', '<hidetextbox:1><5D:0> <speed:255><music:8,0><wait:30><26:0><bgcolor:513,1,31><wait:7> <person:0,0,0><bg:2><bgcolor:258,1,31><wait:15><wait:30><hidetextbox:0> <name:512><shake:30,0><color:2>(Boy am I nervous!)<color:0>', ' <hidetextbox:1><wait:45><name:1792><hidetextbox:0><bgcolor:769,8,31> Wright!']
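To see why only the first and last elements are checked, here is a minimal sketch on a made-up short string (the name `sample` and its contents are illustrative, not taken from the question's data). A delimiter at the very start or end of the string makes split produce an empty string at that end:

```python
# Hypothetical short input, used only to illustrate str.split behaviour.
sample = '<p>one<p>two<p>'

parts = sample.split('<p>')
print(parts)  # ['', 'one', 'two', '']

# Same end-trimming as in the answer: drop an empty first and/or last element.
for n in (0, -1):
    if parts and not parts[n]:
        del parts[n]
print(parts)  # ['one', 'two']
```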

If you don't want any empty strings at all (e.g., 'abc<p><p>def' would otherwise return ['abc', '', 'def']), then use:

result = [n for n in s.split('<p>') if n]
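If a regular expression is still preferred, one possible variation (not part of the answer above) is to put the closing tag in a lookahead. The original pattern skips every second segment because re.findall returns non-overlapping matches: the closing <p> is consumed by one match and is no longer available to open the next. The sketch below uses a made-up short string; `text` and `segments` are illustrative names.

```python
import re

# Hypothetical short input; the real data would come from the file.
text = '<p>one<p>two\nthree<p>four<p>'
tag = '<p>'

# (?=...) is a lookahead: it checks that the closing tag follows without
# consuming it, so the same tag can also open the next match.
pattern = re.escape(tag) + r'(.*?)(?=' + re.escape(tag) + r')'
segments = re.findall(pattern, text, re.DOTALL)
print(segments)  # ['one', 'two\nthree', 'four']
```

Unlike str.split, this only returns text that has a tag on both sides, so anything before the first <p> or after the last one is ignored.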

python-re

byMR

Published March 17, 2024
