Constructing a graph from streaming data using Spark Streaming
I am new to Spark. I need to construct a co-occurrence graph from streaming Twitter data: within a tweet, words become nodes, and if two words appear in the same tweet, an edge is added between them. Can I use Spark Streaming to construct a live co-occurrence graph of Twitter? Is Spark Streaming meant for this use case? I am not sure whether this can be done using Spark Streaming. If not, what are the alternatives?
The co-occurrence frequency can be seen as a graph or an adjacency matrix, that is, a large sparse histogram (frequency count) over the product space of the word list. If you wish to detect moving-window correlation, you should design a sketch data structure to track unusual increases or decreases in the rate of occurrence in the stream, e.g. a counting Bloom filter or a count-min sketch applied to every word pair. See http://twitter.github.io/algebird/#com.twitter.algebird.cmscounting
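To make the idea concrete, here is a rough sketch in plain Python of counting word pairs with a count-min sketch and comparing two consecutive windows. It does not use Spark or the Algebird API; the names `cooccurrence_pairs`, `pair_key`, `CountMinSketch`, `count_window` and `rate_change`, as well as the `width` and `depth` defaults and the whitespace tokenization, are assumptions made up for illustration.

```python
import hashlib
from itertools import combinations


def cooccurrence_pairs(tweet):
    """Return the unique, sorted word pairs (graph edges) of one tweet."""
    # Assumption: naive lowercase whitespace tokenization.
    words = sorted(set(tweet.lower().split()))
    return list(combinations(words, 2))


def pair_key(a, b):
    """Order-independent string key for a word pair."""
    lo, hi = sorted((a, b))
    # Tab is a safe separator because split() never yields words with tabs.
    return lo + "\t" + hi


class CountMinSketch:
    """Approximate counter: estimates may overcount but never undercount."""

    def __init__(self, width=2048, depth=4):
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _cells(self, key):
        # One salted hash per row, deterministic across runs.
        for row in range(self.depth):
            digest = hashlib.blake2b(
                key.encode("utf-8"), digest_size=8, salt=bytes([row])
            ).digest()
            yield row, int.from_bytes(digest, "big") % self.width

    def add(self, key, count=1):
        for row, col in self._cells(key):
            self.table[row][col] += count

    def estimate(self, key):
        return min(self.table[row][col] for row, col in self._cells(key))


def count_window(tweets, width=2048, depth=4):
    """Build one sketch from all tweets that fall into a single window."""
    sketch = CountMinSketch(width, depth)
    for tweet in tweets:
        for a, b in cooccurrence_pairs(tweet):
            sketch.add(pair_key(a, b))
    return sketch


def rate_change(previous, current, a, b):
    """Difference of a pair's estimated counts between two windows."""
    key = pair_key(a, b)
    return current.estimate(key) - previous.estimate(key)


if __name__ == "__main__":
    earlier = count_window(["spark streaming graph", "kafka streaming"])
    later = count_window(["spark graph", "graph spark twitter", "spark graph"])
    print(rate_change(earlier, later, "spark", "graph"))
```

In a streaming job, each micro-batch or window would presumably build its own sketch like this, and a pair whose `rate_change` is unusually large or small would be flagged. Memory stays fixed at `width * depth` counters no matter how many distinct word pairs appear, at the cost of possible overcounting when pairs collide.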