scala - Assigned variable not passed to a map function in Spark
I'm using Spark 1.3.1 with Scala 2.10.4. I've tried a basic scenario which consists in parallelizing an array of 3 strings and mapping them with a variable defined in the driver.
Here is the code:
import org.apache.spark.{SparkConf, SparkContext}

object BasicTest extends App {
  val conf = new SparkConf().setAppName("Simple Application").setMaster("spark://xxxxx:7077")
  val sc = new SparkContext(conf)
  val test = sc.parallelize(Array("a", "b", "c"))
  val a = 5
  test.map(row => row + a).saveAsTextFile("output/basictest/")
}
This piece of code works in local mode and produces:
a5
b5
c5
But on a real cluster, I get:
a0
b0
c0
I've also tried this code:
import org.apache.spark.{SparkConf, SparkContext}

object BasicTest extends App {
  def test(sc: SparkContext): Unit = {
    val test = sc.parallelize(Array("a", "b", "c"))
    val a = 5
    test.map(row => row + a).saveAsTextFile("output/basictest/")
  }

  val conf = new SparkConf().setAppName("Simple Application").setMaster("spark://xxxxx:7077")
  val sc = new SparkContext(conf)
  test(sc)
}
This works in both cases.
I need to understand the reasons for the behaviour in each case. Thanks in advance for any advice.
I believe this relates to the use of App. This was "resolved" in SPARK-4170, in that spark-submit now warns against using App:
if (classOf[scala.App].isAssignableFrom(mainClass)) {
  printWarning("Subclasses of scala.App may not work correctly. Use a main() method instead.")
}
Notes from the ticket:
This bug seems to be an issue with the way Scala has been defined and the differences between what happens at runtime versus compile time with respect to the way App leverages the delayedInit function.
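As a rough illustration of that delayedInit behaviour (a plain Scala sketch I've added, not code from the question): with extends App, field initializers such as val a = 5 are deferred and only executed when main() is invoked, so any code that loads the object without running main() sees the field's JVM default value.

// Sketch: field initializers of an App subclass are deferred by DelayedInit
// until main() runs, so other code sees the default value (0 for Int).
object AppFields extends App {
  val a = 5
}

object Check {
  def main(args: Array[String]): Unit = {
    // AppFields.main() has never run here, so the deferred assignment
    // a = 5 has not executed and the field still holds 0.
    println(AppFields.a) // prints 0
  }
}

This mirrors the cluster run: the map closure references the field a of the BasicTest singleton, and on the executors the object is loaded but its main() never runs, so a is still 0. Following the warning's advice, a sketch of the job rewritten with an explicit main() method (keeping the same placeholder master URL as in the question) avoids the issue, because a becomes a local variable whose value is captured directly by the closure:

import org.apache.spark.{SparkConf, SparkContext}

object BasicTest {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf().setAppName("Simple Application").setMaster("spark://xxxxx:7077")
    val sc = new SparkContext(conf)
    val test = sc.parallelize(Array("a", "b", "c"))
    val a = 5 // local to main(), so the closure captures its value directly
    test.map(row => row + a).saveAsTextFile("output/basictest/")
    sc.stop()
  }
}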