multithreading - Scala parallel frequency calculation using aggregate doesn't work -


i'm learning scala working exercises book "scala impatient". please see following question , answer , code. i'd know if answer correct. code doesn't work (all frequencies 1). where's bug?

q10: harry hacker reads file string , wants use parallel collection update letter frequencies concurrently on portions of string. uses following code:

val frequencies = new scala.collection.mutable.hashmap[char, int] (c <- str.par) frequencies(c) = frequencies.getorelse(c, 0) + 1 

why terrible idea? how can parallelize computation?

my answer: not idea because if 2 threads concurrently updating same frequency, result undefined.

my code:

def parfrequency(str: string) = {   str.par.aggregate(map[char, int]())((m, c) => { m + (c -> (m.getorelse(c, 0) + 1)) }, _ ++ _) } 

unit test:

"method parfrequency" should "return frequency of each character in string" in {   val freq = parfrequency("harry hacker")    freq should have size 8    freq('h') should be(2) // fails   freq('a') should be(2)   freq('r') should be(3)   freq('y') should be(1)   freq(' ') should be(1)   freq('c') should be(1)   freq('k') should be(1)   freq('e') should be(1) } 

edit: after reading this thread, updated code. test works if ran alone, fails if ran suite.

def parfrequency(str: string) = {   val freq = immutablehashmap[char, int]()   str.par.aggregate(freq)((_, c) => immutablehashmap(c -> 1), (m1, m2) => m1.merged(m2)({       case ((k, v1), (_, v2)) => (k, v1 + v2)   })) } 

edit 2: see solution below.

++ not combine values of identical keys. when merge maps, (for shared keys) 1 of values (which in case 1), not sum of values.

this works:

def parfrequency(str: string) = {   str.par.aggregate(map[char, int]())((m, c) => { m + (c -> (m.getorelse(c, 0) + 1)) },   (a,b) => b.foldleft(a){case (acc, (k,v))=> acc updated (k, acc.getorelse(k,0) + v) }) }   val freq = parfrequency("harry hacker") //> map(e -> 1, y -> 1, -> 2,   -> 1, c -> 1, h -> 2, r -> 3, k -> 1) 

the foldleft iterates on 1 of maps, updating other map key/values found.


Comments

Popular posts from this blog

Magento/PHP - Get phones on all members in a customer group -

php - .htaccess mod_rewrite for dynamic url which has domain names -

Website Login Issue developed in magento -