Partially indexing a Cassandra table with Solr

September 15, 2014

One of the tables inside our Cassandra (DSE 4.7) cluster contains south of 15 billion records. With the number of servers we have, it is impossible to index all of them with Solr.

So, is it possible to somehow index the data partially (a sample), and/or start indexing and then "pause" indexing after, let's say, 500MM records?

I assume the other option is to dump 500MM records, reload them into a "temp" table, and index that...?

The point is to start indexing, have the ability to search, and then grow: add more servers, index more, and pause again.

Is this possible?

Thanks!

There is no way to index only a few rows. I agree that a parallel table (probably with a TTL) is your best bet.
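The parallel-table idea can be sketched as follows. This is only a minimal illustration in Python that builds the CQL statements as strings; the keyspace, table, and column names, the 30-day TTL, and the helper functions are all hypothetical and not from the original post. It assumes the sample table gets a table-level `default_time_to_live`, so copied rows expire on their own.

```python
from itertools import islice

# Assumed retention for the sample table: 30 days, expressed in seconds.
SAMPLE_TTL_SECONDS = 30 * 24 * 3600


def create_sample_table_cql(keyspace, table, columns, primary_key, ttl_seconds):
    """Build a CREATE TABLE statement with a table-level default TTL.

    `columns` is a list of (name, cql_type) pairs. All names are hypothetical.
    """
    cols = ", ".join(f"{name} {cql_type}" for name, cql_type in columns)
    return (
        f"CREATE TABLE IF NOT EXISTS {keyspace}.{table} "
        f"({cols}, PRIMARY KEY ({primary_key})) "
        f"WITH default_time_to_live = {ttl_seconds};"
    )


def insert_cql(keyspace, table, column_names):
    """Build a parameterized INSERT suitable for a prepared statement."""
    names = ", ".join(column_names)
    marks = ", ".join("?" for _ in column_names)
    return f"INSERT INTO {keyspace}.{table} ({names}) VALUES ({marks});"


def take_sample(rows, limit):
    """Lazily yield at most `limit` rows from any row iterable."""
    return islice(rows, limit)
```

You would run the generated DDL once, create the search index on the sample table only, and then feed `take_sample(source_rows, 500_000_000)` into the prepared insert. Raising the limit later, after adding servers, grows the searchable subset without touching the main table.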

Here are some (pretty effective) tactics to minimize the size of a DSE Search index. You can shrink it by ~50% if you're not using things like highlighting (term...) or boosts (omitNorms):

• Set termVectors="false"

• Set termPositions="false"

• Set termOffsets="false"

• Set omitNorms="true"

• Only index the fields you intend to search on
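As a rough illustration of the settings listed above, the following Python sketch applies them to the `<field>` elements of a Solr schema document. The `slim_schema` helper and the field names used with it are hypothetical. It also assumes that fields you do not search on can simply be marked `indexed="false"`; the unique key field should be kept in the searchable set.

```python
import xml.etree.ElementTree as ET

# Attributes from the list above; names are case-sensitive in the Solr schema.
SLIM_ATTRS = {
    "termVectors": "false",
    "termPositions": "false",
    "termOffsets": "false",
    "omitNorms": "true",
}


def slim_schema(schema_xml, searchable):
    """Return schema XML with index-shrinking attributes set on every field.

    `searchable` is the set of field names that should remain indexed
    (include the unique key). Every other field is marked indexed="false".
    """
    root = ET.fromstring(schema_xml)
    for field in root.iter("field"):
        field.attrib.update(SLIM_ATTRS)
        is_searchable = field.get("name") in searchable
        field.set("indexed", "true" if is_searchable else "false")
    return ET.tostring(root, encoding="unicode")
```

Review the result before uploading it, because highlighting and index-time boosts stop working once term vectors and norms are turned off.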
