multithreading - Scala parallel frequency calculation using aggregate doesn't work -
i'm learning scala working exercises book "scala impatient". please see following question , answer , code. i'd know if answer correct. code doesn't work (all frequencies 1). where's bug?
q10: harry hacker reads file string , wants use parallel collection update letter frequencies concurrently on portions of string. uses following code:
val frequencies = new scala.collection.mutable.hashmap[char, int] (c <- str.par) frequencies(c) = frequencies.getorelse(c, 0) + 1
why terrible idea? how can parallelize computation?
my answer: not idea because if 2 threads concurrently updating same frequency, result undefined.
my code:
def parfrequency(str: string) = { str.par.aggregate(map[char, int]())((m, c) => { m + (c -> (m.getorelse(c, 0) + 1)) }, _ ++ _) }
unit test:
"method parfrequency" should "return frequency of each character in string" in { val freq = parfrequency("harry hacker") freq should have size 8 freq('h') should be(2) // fails freq('a') should be(2) freq('r') should be(3) freq('y') should be(1) freq(' ') should be(1) freq('c') should be(1) freq('k') should be(1) freq('e') should be(1) }
edit: after reading this thread, updated code. test works if ran alone, fails if ran suite.
def parfrequency(str: string) = { val freq = immutablehashmap[char, int]() str.par.aggregate(freq)((_, c) => immutablehashmap(c -> 1), (m1, m2) => m1.merged(m2)({ case ((k, v1), (_, v2)) => (k, v1 + v2) })) }
edit 2: see solution below.
++
not combine values of identical keys. when merge maps, (for shared keys) 1 of values (which in case 1), not sum of values.
this works:
def parfrequency(str: string) = { str.par.aggregate(map[char, int]())((m, c) => { m + (c -> (m.getorelse(c, 0) + 1)) }, (a,b) => b.foldleft(a){case (acc, (k,v))=> acc updated (k, acc.getorelse(k,0) + v) }) } val freq = parfrequency("harry hacker") //> map(e -> 1, y -> 1, -> 2, -> 1, c -> 1, h -> 2, r -> 3, k -> 1)
the foldleft iterates on 1 of maps, updating other map key/values found.
Comments
Post a Comment