Apache Spark - java.lang.ClassCastException: scala.Tuple2 cannot be cast to java.lang.Iterable
I am working with Java in Spark and want to parse a text document called artist_data.txt. First I created a JavaRDD:

    JavaRDD<String> rawArtistData = sc.textFile("src/main/resources/artist_data.txt");

The document uses a tab separator, but it has bad lines: a number of lines appear to be corrupted. They don't contain a tab, or they inadvertently include a newline character. So I need to use the flatMap method.

When I ran the code below, I got this error: java.lang.ClassCastException: scala.Tuple2 cannot be cast to java.lang.Iterable

    JavaRDD<Tuple2<Integer, String>> artistById0 = rawArtistData
        .flatMap(new FlatMapFunction<String, Tuple2<Integer, String>>() {
            private static final long serialVersionUID = 1L;

            @SuppressWarnings("unchecked")
            public Iterable<Tuple2<Integer, String>> call(String s) {
                String[] sArray = s.split("\t");
                return (Iterable<Tuple2<Integer, String>>) new Tuple2<Integer, String>(
                    Integer.parseInt(sArray[0]), sArray[1].trim());
            }
        });
    JavaPairRDD<Integer, String> artistById = JavaPairRDD.fromJavaRDD(artistById0);
    System.out.println(artistById.count());
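This is happening because flatMap expects your function to return an Iterable (a list of results) for each input line, and it then flattens all of those lists into one list. A Tuple2 is not an Iterable, so the cast fails at runtime. Since you are splitting and parsing in one go and producing exactly one result per line, you need the map function, which returns the tuple directly.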
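To make the failure concrete, here is a minimal sketch in plain Java with no Spark dependency. It assumes `java.util.AbstractMap.SimpleEntry` as a stand-in for `scala.Tuple2`, and the class and method names (`CastDemo`, `castToIterableFails`, `wrap`) are made up for illustration. It shows that a single pair cannot be cast to an Iterable, while a one-element list holding that pair can.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Collections;
import java.util.List;
import java.util.Map;

class CastDemo {
    // Mimics the failing line: the value is forced into an Iterable with a cast.
    static boolean castToIterableFails(Object value) {
        try {
            Iterable<?> ignored = (Iterable<?>) value;
            return false;
        } catch (ClassCastException e) {
            return true;
        }
    }

    // What a flatMap-style function can return for a single result: a one-element list.
    static List<Map.Entry<Integer, String>> wrap(Map.Entry<Integer, String> pair) {
        return Collections.singletonList(pair);
    }

    public static void main(String[] args) {
        // SimpleEntry is only a stand-in for Tuple2 (assumption for this sketch).
        Map.Entry<Integer, String> pair = new SimpleEntry<>(1, "Foo");
        System.out.println(castToIterableFails(pair));       // true
        System.out.println(castToIterableFails(wrap(pair))); // false
    }
}
```

In the Spark code itself, the same idea means either switching to map so the function returns the Tuple2 as-is, or keeping flatMap and returning a list that contains the tuple.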
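A more typical use case for flatMap is to return the result of the split directly (wrapped as a list, for example with Arrays.asList). That gives you one list of words per line, and flatMap flattens them into a single list of words instead of a bunch of separate lists of words.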
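The word-splitting case can be sketched with Java streams, whose `flatMap` flattens in the same spirit. This is only an analogy to the Spark method, not Spark code, and the class name `FlattenWords` is hypothetical.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

class FlattenWords {
    // Each line yields several words; flatMap merges them into one flat list.
    static List<String> words(List<String> lines) {
        return lines.stream()
                .flatMap(line -> Arrays.stream(line.split(" ")))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        System.out.println(words(Arrays.asList("a b", "c"))); // [a, b, c]
    }
}
```

With map instead of flatMap, the result would be one array per line rather than a single list of words.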
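Per your comment, it sounds like the code sample shown does not display your true use case. If there is a possibility of returning nothing because of bad data, you would want the following:

    import java.util.ArrayList;
    import java.util.List;
    import org.apache.spark.api.java.JavaRDD;
    import org.apache.spark.api.java.function.FlatMapFunction;
    import scala.Tuple2;

    JavaRDD<Tuple2<Integer, String>> artistById0 = rawArtistData
        .flatMap(new FlatMapFunction<String, Tuple2<Integer, String>>() {
            private static final long serialVersionUID = 1L;

            public Iterable<Tuple2<Integer, String>> call(String s) {
                String[] sArray = s.split("\t");
                // Starts empty, so a bad line contributes nothing to the result.
                List<Tuple2<Integer, String>> returnList = new ArrayList<Tuple2<Integer, String>>();
                if (sArray.length >= 2) {
                    returnList.add(new Tuple2<Integer, String>(
                        Integer.parseInt(sArray[0]), sArray[1].trim()));
                }
                return returnList;
            }
        });

Notice that the returned list only has an item in it if the split produced 2 or more items. For a line without a tab, the list is empty and flatMap simply drops that line.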
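The per-line logic can be tried without a cluster. The sketch below is plain Java, with `SimpleEntry` standing in for `Tuple2` and hypothetical names (`ArtistLineParser`, `parseLine`, `parseAll`). It also skips lines whose ID is not a number, which is an assumption on my part; the code above only checks the number of fields.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class ArtistLineParser {
    // Returns zero or one (id, name) pair, the shape a flatMap function body needs.
    static List<Map.Entry<Integer, String>> parseLine(String line) {
        List<Map.Entry<Integer, String>> result = new ArrayList<>();
        String[] parts = line.split("\t");
        if (parts.length < 2) {
            return result; // no tab: treat as a bad line
        }
        try {
            result.add(new SimpleEntry<>(Integer.parseInt(parts[0].trim()), parts[1].trim()));
        } catch (NumberFormatException e) {
            // Assumption: a non-numeric ID also counts as a bad line and is skipped.
        }
        return result;
    }

    // Stream flatMap stands in for the Spark flatMap: empty lists vanish.
    static List<Map.Entry<Integer, String>> parseAll(List<String> lines) {
        return lines.stream()
                .flatMap(line -> parseLine(line).stream())
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList("1\tFoo", "no tab here", "abc\tBar", "2\t Baz ");
        System.out.println(parseAll(lines).size()); // 2
    }
}
```

One thing to watch for: in the Spark 2.x Java API, FlatMapFunction.call returns an Iterator rather than an Iterable, so there the function would end with `return returnList.iterator();`.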