Python: read an HTML table from Confluence and print each row as a list


I'd like to parse a Confluence page, read a table, and create a list for each row.

My Table looks like

My code

# expand="body.storage" asks Confluence to include the page body in the response
x = confluence.get_page_by_id(p_id, expand="body.storage")

soup = BeautifulSoup(x["body"]["storage"]["value"], 'html.parser')

for tables in soup.select("table tr"):
    data = [item.get_text() for item in tables.select("td")]

The problem is that the second column contains line breaks, so the output of the code is


I want the output to look like

['Karnataka','Bangalore','Bangalore Mysore Tumkur']

Can you please provide the code to fix this?

Thanks for the help!

>Solution :

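By default, get_text() joins the text fragments inside a cell with an empty string, so values that are separated only by tags (such as <br/> or <p>) run together. Pass a custom separator to keep them apart: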

data = [item.get_text(separator=" ") for item in tables.select("td")]
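
To try the separator approach without a Confluence connection, here is a minimal sketch. The `html` string is a made-up fragment standing in for `x["body"]["storage"]["value"]`, and `table_rows` is a hypothetical helper name. It also passes `strip=True`, which is an addition to the one-liner above that trims stray whitespace around each fragment.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup; a real page body would come from
# x["body"]["storage"]["value"] as in the question.
html = """
<table><tbody>
<tr><th>State</th><th>Capital</th><th>Cities</th></tr>
<tr><td>Karnataka</td><td>Bangalore</td><td><p>Bangalore</p><p>Mysore</p><p>Tumkur</p></td></tr>
</tbody></table>
"""

def table_rows(markup):
    """Return one list of cell texts per table row that has <td> cells."""
    soup = BeautifulSoup(markup, "html.parser")
    rows = []
    for tr in soup.select("table tr"):
        # separator=" " keeps multi-line cell values apart;
        # strip=True drops leading/trailing whitespace of each fragment.
        cells = [td.get_text(separator=" ", strip=True) for td in tr.select("td")]
        if cells:  # header rows contain only <th>, so they yield no cells
            rows.append(cells)
    return rows

rows = table_rows(html)
for row in rows:
    print(row)
```

For the sample markup this prints a single row: ['Karnataka', 'Bangalore', 'Bangalore Mysore Tumkur'].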
