Group data by a tolerance

I have a sorted list of floats:

L = [301.148986835,
301.148986835,
301.148986835,
301.161562835,
301.161562835,
301.16156333500004,
301.167179835,
301.167179835,
301.167179835,
301.167179835,
301.167179835,
301.179755835,
301.179755835,
301.179755835,
301.646611835,
301.659187335,
301.659187335,
301.659187335,
301.659187335,
302.138619335,
302.142316335,
302.151194835,
302.1568118349999,
302.15681183500004,
302.15681183500004,
302.15681183500004,
302.156812335,
302.156812335,
302.156812335,
302.169387835,
302.169387835,
302.169387835,
302.169387835,
302.169387835,
302.169388335,
302.636243335,
302.636243835,
302.648819835,
302.648819835,
303.137565335,
303.140827335,
303.140827335,
303.146443835,
303.146443835,
303.146444335,
303.159019835,
303.159019835,
303.15901983500004,
303.159020335,
303.159020335,
303.15902033500004,
303.63283533500004,
303.638451335,
304.130459335,
304.130459335,
304.14370483499994,
304.14370483499994,
304.14370483499994,
304.148651835,
304.148652335,
304.148652335]

I want to group its values using a tolerance of ±0.5.


The expected output is:

 R = [[301.148986835,
  301.148986835,
  301.148986835,
  301.161562835,
  301.161562835,
  301.16156333500004,
  301.167179835,
  301.167179835,
  301.167179835,
  301.167179835,
  301.167179835,
  301.179755835,
  301.179755835,
  301.179755835,
  301.646611835,
  301.659187335,
  301.659187335,
  301.659187335,
  301.659187335,
  302.138619335],[302.142316335,
  302.151194835,
  302.1568118349999,
  302.15681183500004,
  302.15681183500004,
  302.15681183500004,
  302.156812335,
  302.156812335,
  302.156812335,
  302.169387835,
  302.169387835,
  302.169387835,
  302.169387835,
  302.169387835,
  302.169388335,
  302.636243335,
  302.636243835,
  302.648819835,
  302.648819835,
  303.137565335,
  303.140827335,
  303.140827335,
  303.146443835,
  303.146443835,
  303.146444335,
  303.159019835,
  303.159019835,
  303.15901983500004,
  303.159020335,
  303.159020335,
  303.15902033500004],
[303.63283533500004,
  303.638451335,
  304.130459335,
  304.130459335,
  304.14370483499994,
  304.14370483499994,
  304.14370483499994],[304.148651835,
  304.148652335,
  304.148652335]]

When I use this code (my question is not a duplicate):

def grouper(iterable):
    prev = None
    group = []
    for item in iterable:
        if prev is None or item - prev <= 1:
            group.append(item)
        else:
            yield group
            group = [item]
        prev = item
    if group:
        yield group

I get the same list back as the output: everything ends up in a single group.

calculate within a tolerance

Solution:

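You update prev on every iteration, so each item is only ever compared with its immediate predecessor. Consecutive values in your list never differ by more than 1, so the else branch never runs and everything lands in one group. You want to update prev only when you start a new group.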
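As a minimal sketch of that first fix, the version below keeps prev but only sets it when a group starts. The name grouper_fixed_prev, the max_span parameter, and the sample values are illustrative assumptions, not part of the original question:

```python
def grouper_fixed_prev(iterable, max_span=1):
    """Group sorted values; prev holds the FIRST value of the current group."""
    prev = None   # start of the current group (hypothetical fix of the question's code)
    group = []
    for item in iterable:
        if prev is None:
            # Very first item: it opens the first group.
            prev = item
            group.append(item)
        elif item - prev <= max_span:
            # Still within max_span of the group's start: keep collecting.
            group.append(item)
        else:
            # Too far from the group's start: emit the group and open a new one.
            yield group
            group = [item]
            prev = item
    if group:
        yield group


sample = [1.0, 1.4, 1.9, 2.3, 3.5]  # made-up data for illustration
print(list(grouper_fixed_prev(sample)))
# [[1.0, 1.4, 1.9], [2.3], [3.5]]
```

Here 2.3 starts a new group because it is more than 1 away from 1.0, even though it is only 0.4 away from its neighbor 1.9.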

Better yet, get rid of prev altogether and always compare against the first element of the group.

I’d also suggest including a tol argument so that the function is more flexible:

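def grouper(iterable, tol=0.5):
    """Yield lists of consecutive items lying within 2*tol of the group's first item."""
    # A +-tol margin spans 2*tol when measured from the start of the group.
    span = abs(tol * 2)
    group = []
    for item in iterable:
        # Compare against the first element of the group, not the previous item.
        if not group or item - group[0] <= span:
            group.append(item)
        else:
            yield group
            group = [item]
    # Emit the last group, which is still open when the loop ends.
    if group:
        yield group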
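A short usage sketch may help show how tol changes the grouping. The function is repeated so the snippet runs on its own, and the sample list is made-up data chosen so that the float arithmetic is exact:

```python
def grouper(iterable, tol=0.5):
    span = abs(tol * 2)  # width of a group, measured from its first element
    group = []
    for item in iterable:
        if not group or item - group[0] <= span:
            group.append(item)
        else:
            yield group
            group = [item]
    if group:
        yield group


sample = [0.0, 0.25, 0.5, 1.0, 1.25, 2.5]  # illustrative, must be sorted ascending

print(list(grouper(sample)))            # default tol=0.5, groups span at most 1.0
# [[0.0, 0.25, 0.5, 1.0], [1.25], [2.5]]

print(list(grouper(sample, tol=0.25)))  # groups span at most 0.5
# [[0.0, 0.25, 0.5], [1.0, 1.25], [2.5]]
```

Because every group is anchored at its first element, the boundaries depend on where each group happens to start. On real data the result may therefore not match a grouping drawn by hand around chosen centers.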
