I'd like to parse a Confluence page, read a table, and create a list for each row.
My Table looks like
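My code:
from bs4 import BeautifulSoup

# confluence is an already-authenticated client; p_id is the page ID
x = confluence.get_page_by_id(p_id, expand="body.storage")
soup = BeautifulSoup(x["body"]["storage"]["value"], 'html.parser')
for tables in soup.select("table tr"):
    # one list of cell texts per table row
    data = [item.get_text() for item in tables.select("td")]
    print(data)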
The problem is that one of the columns contains line breaks, so the lines in that cell run together in the output of the code:
['Karnataka','Bangalore','BangaloreMysoreTumkur']
And I want the output to look like this:
['Karnataka','Bangalore','Bangalore Mysore Tumkur']
Can you please provide the code to fix this?
Thanks for the help!
>Solution :
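By default, get_text() joins the text fragments inside a tag with an empty string, so text that is separated only by tags (such as <br> or <p>) runs together. To use a custom separator, pass the separator argument: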
data = [item.get_text(separator=" ") for item in tables.select("td")]
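For reference, here is a minimal self-contained sketch of the same idea that runs without a Confluence connection. The sample HTML, the column headers, and the helper name table_rows are assumptions made up for illustration; your page's storage format may wrap the lines differently (for example with <br> instead of <p>). It also passes strip=True, which trims whitespace around each text fragment.

```python
from bs4 import BeautifulSoup

# Hypothetical stand-in for x["body"]["storage"]["value"]
html = """
<table>
  <tr><th>State</th><th>Capital</th><th>Cities</th></tr>
  <tr>
    <td>Karnataka</td>
    <td>Bangalore</td>
    <td><p>Bangalore</p><p>Mysore</p><p>Tumkur</p></td>
  </tr>
</table>
"""


def table_rows(html):
    """Return one list of cell texts per table row that has <td> cells."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for tr in soup.select("table tr"):
        # separator=" " puts a space between text fragments inside a cell;
        # strip=True trims each fragment and drops empty ones
        cells = [td.get_text(separator=" ", strip=True) for td in tr.select("td")]
        if cells:  # header rows contain only <th>, so they yield no cells
            rows.append(cells)
    return rows


print(table_rows(html))
# [['Karnataka', 'Bangalore', 'Bangalore Mysore Tumkur']]
```

If you would rather keep each line as its own item, td.stripped_strings yields the fragments one at a time, and list(td.stripped_strings) gives a list per cell.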