Beautiful Soup extract string before tag

January 25, 2022

I have an XML file that has ref tags nested inside para tags:

<para>here be text<ref> REF 1 </ref>and here be some more text</para>

Is there a way to use Beautiful Soup to extract the string between the opening para tag and the opening ref tag, i.e.:

here be text

I’ve tried various things to no avail, including find_previous:

from bs4 import BeautifulSoup

soup = BeautifulSoup(file, 'xml')

ref = soup.find('ref')
ref_before = ref.find_previous('para')

But (obviously) ref_before is the whole para tag, so its text is everything inside it, i.e.:

here be text REF 1 and here be some more text

I think this ought to be really simple but I don’t have much experience and just can’t crack it. Any help much appreciated.

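Solution:

A tag's contents attribute is a list of its direct children. Here the first child of para is the text that precedes ref, so you can select the first element: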

soup.find('para').contents[0]

Output:

'here be text'
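Indexing contents[0] relies on the text being the first child of para. As a minimal sketch of an alternative that starts from the ref tag itself, previous_sibling returns the node immediately before it. The helper name text_before is hypothetical, and the snippet uses the built-in "html.parser" only so it runs without lxml; with the 'xml' parser from the question it should behave the same way for this input.

```python
from bs4 import BeautifulSoup, NavigableString

xml = "<para>here be text<ref> REF 1 </ref>and here be some more text</para>"

# "html.parser" is an assumption made to avoid the lxml dependency;
# the question itself uses the 'xml' parser.
soup = BeautifulSoup(xml, "html.parser")


def text_before(tag):
    """Return the string directly before `tag`, or None if the node
    before it is not a string (another tag, or nothing at all)."""
    prev = tag.previous_sibling
    if isinstance(prev, NavigableString):
        return str(prev)
    return None


ref = soup.find("ref")
before = text_before(ref)
print(before)
```

To handle several ref tags, the same helper can be applied to each result of soup.find_all("ref").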

beautifulsoup

by MR
