Efficient group aggregation in Scala collections

I often need to do something like

coll.groupBy(f(_)).mapValues(_.foldLeft(x)(g(_,_)))

What is the best way to achieve the same effect, but avoid explicitly building intermediate collections with groupBy?

2 answers

You can fold the initial collection over a map that holds the intermediate results:

def groupFold[A,B,X](as: Iterable[A], f: A => B, init: X, g: (X,A) => X): Map[B,X] = 
  as.foldLeft(Map[B,X]().withDefaultValue(init)){
    case (m,a) => {
      val key = f(a)
      m.updated(key, g(m(key),a))
    }
  }

You said "collection" and I wrote Iterable, but you need to consider whether the traversal order matters for the fold in your question.
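
As a rough illustration of how groupFold is called and where order comes into play, here is a minimal sketch. The object name GroupFoldDemo and the sample data are assumptions made for this example; the groupFold definition is repeated only so the block is self-contained. Type arguments are given explicitly because, with a single parameter list, the compiler may not infer the lambda parameter types.

```scala
object GroupFoldDemo {
  // Same definition as in the answer, repeated so the sketch is self-contained.
  def groupFold[A, B, X](as: Iterable[A], f: A => B, init: X, g: (X, A) => X): Map[B, X] =
    as.foldLeft(Map[B, X]().withDefaultValue(init)) {
      case (m, a) =>
        val key = f(a)
        m.updated(key, g(m(key), a))
    }

  // Hypothetical sample data.
  val fish = List("salmon", "herring", "haddock")

  // Addition is commutative and associative, so traversal order is irrelevant here.
  val lengths: Map[Char, Int] =
    groupFold[String, Char, Int](fish, _.head, 0, _ + _.length)

  // Concatenation is order-sensitive: each group's result follows the List's iteration order.
  val joined: Map[Char, String] =
    groupFold[String, Char, String](fish, _.head, "", _ + _)
}
```

With a List the second result is deterministic; with an unordered collection such as a Set, an order-sensitive g could produce a different string per group.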

If you need efficient code, you will probably want to use a mutable map, as in Rex's answer.
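
For comparison, a minimal sketch of the same fold written against Scala's own mutable HashMap. The names MutableGroupFold and groupFoldMutable are hypothetical and do not come from either answer; the only point is that the accumulator map is updated in place instead of a new immutable map being built for every element. No performance figures are claimed for it.

```scala
import scala.collection.mutable

object MutableGroupFold {
  // Hypothetical helper: folds each group into a mutable HashMap,
  // then returns an immutable snapshot of the result.
  def groupFoldMutable[A, B, X](as: Iterable[A])(f: A => B)(init: X)(g: (X, A) => X): Map[B, X] = {
    val acc = mutable.HashMap.empty[B, X]
    as.foreach { a =>
      val key = f(a)
      // Start from init the first time a key is seen, then overwrite in place.
      acc.update(key, g(acc.getOrElse(key, init), a))
    }
    acc.toMap
  }

  // Hypothetical sample data: total name length per first letter.
  val lengths: Map[Char, Int] =
    groupFoldMutable(List("salmon", "herring", "haddock"))(_.head)(0)(_ + _.length)
}
```

The curried parameter lists let the compiler infer A, B and X from left to right, so the lambdas need no type annotations.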


If efficiency is the goal, you can fold into a mutable map (using a small mutable "Var" holder):

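// Mutable cell, so a group's accumulator can be updated in place.
final case class Var[A](var value: A) { }
def multifold[A,B,C](xs: Traversable[A])(f: A => B)(zero: C)(g: (C,A) => C) = {
  import scala.collection.JavaConverters._
  val m = new java.util.HashMap[B, Var[C]]
  xs.foreach{ x =>
    // Look up the cell for this key, creating it from zero the first time the key is seen.
    val v = {
      val fx = f(x)
      val op = m.get(fx)
      if (op != null) op
      else { val nv = Var(zero); m.put(fx, nv); nv }
    }
    v.value = g(v.value, x)
  }
  // View the Java map as a Scala map and unwrap the cells.
  m.asScala.mapValues(_.value)
}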

For example:

scala> multifold(List("salmon","herring","haddock"))(_(0))(0)(_ + _.length)
res1: scala.collection.mutable.HashMap[Char,Int] = Map(h -> 14, s -> 6)        

One thing to note here: this uses a Java HashMap. That is because Java HashMaps are 2-3 times faster than Scala's. (The same thing can be written with a Scala HashMap.)
