How to Parse XY Coordinate Tuples and Split Them Into Separate X and Y Lists

This is my first time asking for help on Stack Overflow, and I am in way over my head.

I am currently working on a project where I need to take percentage-based coordinate tuples from very large, variable-length XML files, split them into separate X and Y lists, and then find the average difference between values in the lists.

I am currently stuck on splitting the tuples into X and Y lists.

import xml.etree.ElementTree as ET

lem = []

tree = ET.parse('testdata.xml')
root = tree.getroot()

# Collect the 'Value' attribute of every left-eye gaze point
for GazePointOnDisplayArea in root.findall("./GazeData/Left/GazePointOnDisplayArea"):
    le = GazePointOnDisplayArea.get('Value')
    lem.append(le)

print(lem)

# A test XML file shortened to five elements gives the following output:

['(0.48734050, 0.50727710)', '(0.48989120, 0.50335540)', '(0.48709830, 0.50172430)', '(0.48531740, 0.50473010)', '(0.48797150, 0.51031550)']

Ideally, I’d like to end up with:

x = [0.48734050, 0.48989120, 0.48709830, 0.48531740, 0.48797150]
y = [0.50727710, 0.50335540, 0.50172430, 0.50473010, 0.51031550]

I’ve tried zip(*...) and map-based approaches, but nothing seems to work. I’m unsure whether I’ve made a parsing error, whether the decimal points are the problem, or whether it is something else entirely.

I am open to using Python, NumPy, or pandas.

Please advise.

Solution:

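The output you already have is one line away from the result you want. Each element of the list is a string, not a tuple, so the numbers have to be extracted first.
Extract the numbers from each string with a regular expression, convert them to floats, and then use NumPy to transpose the result into separate x and y arrays: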

import re
import numpy as np

text = ['(0.48734050, 0.50727710)', '(0.48989120, 0.50335540)', '(0.48709830, 0.50172430)', '(0.48531740, 0.50473010)', '(0.48797150, 0.51031550)']

# Pull the decimal numbers out of each string, then transpose (.T) so x and y end up in separate arrays
x, y = np.array([[float(num) for num in re.findall(r"(\d+\.\d+)", line)] for line in text]).T
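
Note that the pattern above only matches unsigned decimals, so it would miss values such as "-0.1" or "1e-05" if they ever appear in the data. As a possible alternative, here is a minimal sketch that parses each string with ast.literal_eval instead of a regular expression and then transposes with zip. It assumes every 'Value' string is a valid Python tuple literal like the five samples in the question. The last lines show one possible reading of "average difference" (the mean absolute change between consecutive samples); that interpretation and the names raw, points, mean_step_x, and mean_step_y are assumptions for illustration, not something stated in the question.

```python
import ast

import numpy as np

# Sample strings copied from the question's output.
raw = ['(0.48734050, 0.50727710)', '(0.48989120, 0.50335540)',
       '(0.48709830, 0.50172430)', '(0.48531740, 0.50473010)',
       '(0.48797150, 0.51031550)']

# ast.literal_eval safely turns the string "(0.1, 0.2)" into the tuple (0.1, 0.2).
points = [ast.literal_eval(s) for s in raw]

# zip(*points) transposes the list of (x, y) pairs into one tuple per axis.
x, y = (list(axis) for axis in zip(*points))

# Assumed interpretation of "average difference":
# the mean absolute change between consecutive samples on each axis.
mean_step_x = float(np.mean(np.abs(np.diff(x))))
mean_step_y = float(np.mean(np.abs(np.diff(y))))

print(x)
print(y)
print(mean_step_x, mean_step_y)
```

If the intended difference is between the x and y values of each sample instead, np.mean(np.abs(np.array(x) - np.array(y))) would be the corresponding expression.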