Scala - Assigned variable not passed to a map function in Spark


I'm using Spark 1.3.1 with Scala 2.10.4. I've tried a basic scenario that consists of parallelizing an array of 3 strings and mapping them with a variable that I define in the driver.

Here is the code:

    import org.apache.spark.{SparkConf, SparkContext}

    object BasicTest extends App {
      val conf = new SparkConf().setAppName("Simple Application").setMaster("spark://xxxxx:7077")
      val sc = new SparkContext(conf)

      val test = sc.parallelize(Array("a", "b", "c"))
      val a = 5 // field of the App object, referenced inside the closure below

      test.map(row => row + a).saveAsTextFile("output/basictest/")
    }

This piece of code works in local mode, where I get the list:

a5 b5 c5 

But on a real cluster, I get:

a0 b0 c0 

I've also tried this code:

    import org.apache.spark.{SparkConf, SparkContext}

    object BasicTest extends App {

      def test(sc: SparkContext): Unit = {
        val test = sc.parallelize(Array("a", "b", "c"))
        val a = 5 // local variable, captured by value in the closure

        test.map(row => row + a).saveAsTextFile("output/basictest/")
      }

      val conf = new SparkConf().setAppName("Simple Application").setMaster("spark://xxxxx:7077")
      val sc = new SparkContext(conf)

      test(sc)
    }

This works in both cases.

I need to understand the reasons in each case. Thanks in advance for any advice.

I believe this relates to the use of App. It was "resolved" in SPARK-4170, in the sense that Spark now warns against using App:

    if (classOf[scala.App].isAssignableFrom(mainClass)) {
      printWarning("Subclasses of scala.App may not work correctly. Use a main() method instead.")
    }

Notes from the ticket:

This bug seems to be an issue with the way Scala has been defined and the differences between what happens at runtime versus compile time with respect to the way App leverages the delayedInit function.
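The delayedInit behaviour can be reproduced without Spark. The following is a minimal sketch, assuming Scala 2.x (where App is built on DelayedInit); the names Holder and DelayedInitDemo are hypothetical and used only for illustration. The body of an object that extends App is deferred until its main() method is called, so a field such as a keeps its default value until then. This most likely explains the a0 b0 c0 output: on the cluster, the executor JVMs load the object but never call its main(), whereas in local mode the closure runs in the same JVM where main() has already initialized the field.

```scala
// Hypothetical names (Holder, DelayedInitDemo), for illustration only.
object Holder extends App {
  // On Scala 2.x this assignment is part of the deferred App body,
  // so it only runs when Holder.main(...) is invoked.
  val a = 5
}

object DelayedInitDemo {
  def main(args: Array[String]): Unit = {
    // Holder.main has not been called yet at this point.
    println(s"before main: ${Holder.a}")

    // Running main executes the deferred body and assigns the field.
    Holder.main(Array.empty[String])
    println(s"after main: ${Holder.a}")
  }
}
```

On Scala 2.x this should print 0 before the call to main and 5 after it. In the second version of the question's code, a is a local variable inside a method, so its value is captured in the serialized closure itself and does not depend on the App object being initialized on the executors.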

