Elasticsearch: find documents with distinct values and then aggregate over them

April 15, 2010

My index has a log-like structure: I insert a new version of a document whenever an event occurs. For example, here are the documents in the index:

{ "key": "a", "subkey": 0 }
{ "key": "a", "subkey": 0 }
{ "key": "a", "subkey": 1 }
{ "key": "a", "subkey": 1 }
{ "key": "b", "subkey": 0 }
{ "key": "b", "subkey": 0 }
{ "key": "b", "subkey": 1 }
{ "key": "b", "subkey": 1 }

I'm trying to construct a query in Elasticsearch that is equivalent to the following SQL query:

select count(*), key, subkey from (select distinct key, subkey from t) group by key, subkey

The answer to this query would be:

(1, a, 0) (1, a, 1) (1, b, 0) (1, b, 1)
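To make the target result concrete, here is a minimal sketch that runs the SQL locally with Python's built-in sqlite3 module. The table name `t` comes from the query above; the in-memory database, the column types, and the `group by` / `order by` clauses are assumptions added so that the statement runs and returns one row per (key, subkey) pair in a stable order.

```python
import sqlite3

# The eight sample documents from the question, as (key, subkey) rows.
rows = [("a", 0), ("a", 0), ("a", 1), ("a", 1),
        ("b", 0), ("b", 0), ("b", 1), ("b", 1)]

# Assumption: an in-memory SQLite table stands in for the index.
conn = sqlite3.connect(":memory:")
conn.execute("create table t (key text, subkey integer)")
conn.executemany("insert into t (key, subkey) values (?, ?)", rows)

# De-duplicate the (key, subkey) pairs first, then count rows per pair.
query = """
    select count(*), key, subkey
    from (select distinct key, subkey from t)
    group by key, subkey
    order by key, subkey
"""
result = conn.execute(query).fetchall()
print(result)  # [(1, 'a', 0), (1, 'a', 1), (1, 'b', 0), (1, 'b', 1)]
```

Every count is 1 because the inner `select distinct` has already collapsed the duplicates before the outer query counts them.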

How can I replicate this query in Elasticsearch? I came up with the following:

GET test_index/test_type/_search?search_type=count
{
  "aggregations": {
    "count_aggr": {
      "terms": {
        "field": "concatenated_key"
      },
      "aggs": {
        "sample_doc": {
          "top_hits": {
            "size": 1
          }
        }
      }
    }
  }
}

concatenated_key is the concatenation of key and subkey. This query creates a bucket for each (key, subkey) combination and returns a sample document from each bucket. However, I don't know how I can aggregate on the fields of _source.
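For illustration, a concatenated_key field like the one described could be computed before indexing roughly as follows. This is only a sketch: the helper name `concatenated_key` as a function, the `|` separator, and the choice to compute the field client-side are all assumptions, since the question does not say how the field is built.

```python
def concatenated_key(doc, sep="|"):
    """Join key and subkey into one string, e.g. 'a|0' (separator is an assumption)."""
    return f"{doc['key']}{sep}{doc['subkey']}"

# The eight sample documents from the question.
docs = [
    {"key": "a", "subkey": 0}, {"key": "a", "subkey": 0},
    {"key": "a", "subkey": 1}, {"key": "a", "subkey": 1},
    {"key": "b", "subkey": 0}, {"key": "b", "subkey": 0},
    {"key": "b", "subkey": 1}, {"key": "b", "subkey": 1},
]

# Add the extra field to a copy of each document, as would be done before indexing.
indexed = [dict(doc, concatenated_key=concatenated_key(doc)) for doc in docs]

# A terms aggregation on this field would produce one bucket per distinct value.
distinct_keys = sorted({doc["concatenated_key"] for doc in indexed})
print(distinct_keys)  # ['a|0', 'a|1', 'b|0', 'b|1']
```

The separator should be a character that cannot appear in key, otherwise two different (key, subkey) pairs could collapse into the same string.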

I would appreciate any ideas. Thanks!

If you don't have the possibility to re-index the documents and add your own concatenated key field, here is a way of doing it:

GET /my_index/my_type/_search?search_type=count
{
  "aggs": {
    "key_agg": {
      "terms": {
        "field": "key",
        "size": 10
      },
      "aggs": {
        "sub_key_agg": {
          "terms": {
            "field": "subkey",
            "size": 10
          }
        }
      }
    }
  }
}

It will give you something like this:

"buckets": [
   {
      "key": "a",
      "doc_count": 4,
      "sub_key_agg": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": 0,
               "doc_count": 2
            },
            {
               "key": 1,
               "doc_count": 2
            }
         ]
      }
   },
   {
      "key": "b",
      "doc_count": 4,
      "sub_key_agg": {
         "doc_count_error_upper_bound": 0,
         "sum_other_doc_count": 0,
         "buckets": [
            {
               "key": 0,
               "doc_count": 2
            },
            {
               "key": 1,
               "doc_count": 2
            }
         ]
      }
   }
]

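Here each outer bucket holds a key ("key": "a"), and its sub-buckets list every subkey seen for that key together with the number of docs that match the combination, i.e. key=a and subkey=0, or key=a and subkey=1: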

"buckets": [
   {
      "key": 0,
      "doc_count": 2
   },
   {
      "key": 1,
      "doc_count": 2
   }
]

The same goes for the other key.
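Since each innermost bucket corresponds to exactly one distinct (key, subkey) combination, the nested response can be flattened client-side into the rows the SQL query would return. The following is a minimal sketch, assuming the response has already been parsed into a Python dict and that the buckets sit under the aggregation names key_agg and sub_key_agg used in the query; the function name `flatten_buckets` is hypothetical.

```python
def flatten_buckets(aggregations, outer="key_agg", inner="sub_key_agg"):
    """Return one (key, subkey, doc_count) tuple per innermost bucket."""
    rows = []
    for key_bucket in aggregations[outer]["buckets"]:
        for sub_bucket in key_bucket[inner]["buckets"]:
            rows.append((key_bucket["key"], sub_bucket["key"], sub_bucket["doc_count"]))
    return rows

# The "aggregations" part of the response shown above, trimmed to the fields used here.
aggregations = {
    "key_agg": {
        "buckets": [
            {"key": "a", "doc_count": 4, "sub_key_agg": {"buckets": [
                {"key": 0, "doc_count": 2},
                {"key": 1, "doc_count": 2},
            ]}},
            {"key": "b", "doc_count": 4, "sub_key_agg": {"buckets": [
                {"key": 0, "doc_count": 2},
                {"key": 1, "doc_count": 2},
            ]}},
        ]
    }
}

rows = flatten_buckets(aggregations)
print(rows)  # [('a', 0, 2), ('a', 1, 2), ('b', 0, 2), ('b', 1, 2)]

# Each bucket is one distinct combination, so counting it once mirrors the SQL result.
distinct_rows = [(1, key, subkey) for key, subkey, _ in rows]
print(distinct_rows)  # [(1, 'a', 0), (1, 'a', 1), (1, 'b', 0), (1, 'b', 1)]
```

Note that doc_count is the number of matching documents (2 here), not the distinct count; the distinct count per combination is 1 by construction, because a combination either has a bucket or it does not.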
