scala - How to access accumulators in object outside the place where they were defined? -
i'm defining helper map function separate def in helper object , not "see" accumulator defined earlier in code. spark docs seams recommending keep "remote" functions inside object, how make work accumulators?
object mainlogic{ val counter = sc.accumulator(0) val data = sc.textfile(...)// load logic here val myrdd = data.mappartitionswithindex(mapfunction) } object helper{ def mapfunction(...)={ counter+=1 // not compiling } }
something need passed in parameter other code:
object mainlogic{ val counter = sc.accumulator(0) val data = sc.textfile(...)// load logic here val myrdd = data.mappartitionswithindex(mapfunction(counter, _, _)) } object helper{ def mapfunction(counter: accumulator[int], ...)={ counter+=1 // not compiling } } make sure remember note docs though:
for accumulator updates performed inside actions only, spark guarantees each task’s update accumulator applied once, i.e. restarted tasks not update value. in transformations, users should aware of each task’s update may applied more once if tasks or job stages re-executed.
Comments
Post a Comment