PredictionIO evaluation fails with Text Classification template
I am trying to predict a text field based on other text fields in PredictionIO. I used this guide as a reference. I created a new app using
pio app new mytextapp
and followed the guide up to evaluation, using the data source provided in the template. Everything is fine up to evaluation. When I evaluate with that data source, I get the error pasted below.
[INFO] [CoreWorkflow$] runEvaluation started
[WARN] [Utils] Your hostname, my-thinkcentre-edge72 resolves to a loopback address: 127.0.0.1; using 192.168.65.27 instead (on interface eth0)
[WARN] [Utils] Set SPARK_LOCAL_IP if you need to bind to another address
[INFO] [Remoting] Starting remoting
[INFO] [Remoting] Remoting started; listening on addresses :[akka.tcp://sparkDriver@192.168.65.27:59649]
[INFO] [CoreWorkflow$] Starting evaluation instance ID: au29p8j3fkwdnkfum_ke
[INFO] [Engine$] DataSource: org.template.textclassification.DataSource@faea4da
[INFO] [Engine$] Preparator: org.template.textclassification.Preparator@69f2cb04
[INFO] [Engine$] AlgorithmList: List(org.template.textclassification.NBAlgorithm@45292ec1)
[INFO] [Engine$] Serving: org.template.textclassification.Serving@1ad9b8d3
Exception in thread "main" java.lang.UnsupportedOperationException: empty.maxBy
    at scala.collection.TraversableOnce$class.maxBy(TraversableOnce.scala:223)
    at scala.collection.AbstractTraversable.maxBy(Traversable.scala:105)
    at org.template.textclassification.PreparedData.<init>(Preparator.scala:152)
    at org.template.textclassification.Preparator.prepare(Preparator.scala:38)
    at org.template.textclassification.Preparator.prepare(Preparator.scala:34)
Do I have to edit any config files to make this work? I have already run the tests on the MovieLens data.
This particular error message occurs when your data isn't getting read in through the DataSource class. If you're using a different text data set, make sure you are correctly reflecting the changes to eventNames, entityType, and the respective property field names in the readEventData method.
The maxBy method is used to pull the class with the highest number of observations. If the category-to-label map is empty, it means there are no classes being recorded, which tells you that no data is being fed in.
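To see the failure in isolation, here is a minimal sketch in plain Scala with no PredictionIO dependency. The object name EmptyMaxByDemo and the helper mostFrequent are hypothetical and are not part of the template; the map only stands in for the template's category counts, and the "ham" label is invented for the demonstration.

```scala
import scala.util.Try

object EmptyMaxByDemo {
  // Hypothetical helper: returns the key with the highest count,
  // or None when there is nothing to compare.
  def mostFrequent(counts: Map[String, Int]): Option[String] =
    if (counts.isEmpty) None else Some(counts.maxBy(_._2)._1)

  def main(args: Array[String]): Unit = {
    val noData = Map.empty[String, Int]

    // maxBy on an empty collection throws UnsupportedOperationException,
    // which is the exception at the top of the stack trace in the question.
    println(Try(noData.maxBy(_._2)))

    // With at least one entry, maxBy picks the largest count.
    println(mostFrequent(Map("spam" -> 3, "ham" -> 5)))
  }
}
```

A guard like the one in mostFrequent only makes the symptom easier to read. The underlying fix is still to get events flowing through the DataSource.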
For example, I built a spam detector using this engine. My e-mail data is of the form:
{"entityType": "content", "eventTime": "2015-06-04T00:22:39.064+0000", "entityId": 1, "event": "e-mail", "properties": {"label": "spam", "text": "content"}}
To use the engine with this data, I made the following changes in the DataSource class:
entityType = Some("source"), // specify data entity type
eventNames = Some(List("documents")) // specify data event name
changes to:
entityType = Some("content"), // specify data entity type
eventNames = Some(List("e-mail")) // specify data event name
and
)(sc).map(e => Observation(
  e.properties.get[Double]("label"),
  e.properties.get[String]("text"),
  e.properties.get[String]("category")
)).cache
changes to:
)(sc).map(e => {
  val label = e.properties.get[String]("label")
  Observation(
    if (label == "spam") 1.0 else 0.0,
    e.properties.get[String]("text"),
    label
  )
}).cache
After this, I'm able to go through building, training, and deployment, as well as evaluation.
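The effect of the two changes can be sketched without a running event server. The snippet below is only an illustration under stated assumptions: RawEvent, Obs, and read are hypothetical stand-ins, not PredictionIO's Event class or the template's Observation class, and the sample e-mail texts and the "ham" label are invented. It shows that the template's default filter values match none of the e-mail events, while the adjusted values match all of them and yield numeric labels.

```scala
object EventFilterSketch {
  // Hypothetical, simplified stand-ins (not PredictionIO or template classes).
  case class RawEvent(entityType: String, event: String, label: String, text: String)
  case class Obs(label: Double, text: String, category: String)

  // Keep only events matching the entity type and event names,
  // then map the string label to a numeric one (spam = 1.0, anything else = 0.0).
  def read(events: Seq[RawEvent], entityType: String, eventNames: List[String]): Seq[Obs] =
    events
      .filter(e => e.entityType == entityType && eventNames.contains(e.event))
      .map(e => Obs(if (e.label == "spam") 1.0 else 0.0, e.text, e.label))

  // Invented sample data in the shape of the e-mail events above.
  val sample: Seq[RawEvent] = Seq(
    RawEvent("content", "e-mail", "spam", "win a prize now"),
    RawEvent("content", "e-mail", "ham", "meeting at noon")
  )

  def main(args: Array[String]): Unit = {
    // Template defaults: nothing matches, so the downstream maxBy would fail.
    println(read(sample, "source", List("documents")).size)
    // Adjusted values: every e-mail event is read.
    println(read(sample, "content", List("e-mail")).size)
  }
}
```

If the first call's situation is what your DataSource produces, the Preparator has no categories to count and the empty.maxBy error follows.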