I have an Array of 3-tuples in scala as such: (a:Int, b:Int, val:Double), and I need to return an Array which has for all pairs (a, b) the minimum val. It is clear to me how to do this by going to a Map:
a.map(t => ((t._1,t ._2), t._3)).groupBy(_._1).mapValues(_.map(_._2).min).toArray
but I would like to avoid maps for the purposes of memory optimization. Is there a clean way to do this without Map?
>Solution :
Try groupMapReduce, it does all the same, but in a single pass:
tuples.groupMapReduce(t => (t._1, t._2))(_._3)(_ min _)
Runnable example:
val tuples = List(
(0, 0, 58),
(1, 1, 100),
(0, 0, 1),
(0, 1, 42),
(1, 0, 123),
(1, 0, 3),
(0, 1, 2),
(1, 1, 4)
)
tuples.groupMapReduce(t => (t._1, t._2))(_._3)(_ min _).foreach(println)
Output:
((0,0),1)
((1,1),4)
((1,0),3)
((0,1),2)
It should strictly decrease the load on GC, because it doesn’t generate any intermediate lists for the grouped values, and neither for those mapped grouped values in the _.map(_._2)-step in your original solution. That should reduce amount of garbage quite substantially.