Elasticsearch "_index" field optimizations

I ran some queries against an alias that covers multiple indices, using the profiler in Kibana. It seems that when the query has a top-level filter on the _index field, it runs very fast MatchNoDocsQuery queries on every index except the one requested.

E.g.: let's say we have two indices, test.book and test.film, and an alias test with the pattern test.*. Each index also has a product_type field whose value is either "book" or "film".

It seems that this query:

GET test/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "_index": ["test.film"]
          }
        }
      ]
    }
  }
}

is much faster than this query:

GET test/_search
{
  "query": {
    "bool": {
      "filter": [
        {
          "terms": {
            "product_type": ["film"]
          }
        }
      ]
    }
  }
}

Are there any optimizations when running a filter on the "_index" field?

>Solution :

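The first query is faster because a filter on _index can be resolved per index without reading any documents. On the shards of test.book the filter is rewritten to a query that matches nothing (the MatchNoDocsQuery you saw in the profiler), so those shards return immediately and only the segments of test.film are actually searched. In the second query, product_type is an ordinary field, so Elasticsearch cannot tell in advance which index holds the matching documents and has to evaluate the filter against the segments of every shard behind the alias.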

Say both indexes have 10 segments each, which means 20 segments in total behind the alias. The first query only searches 10 segments, whereas the second has to search all 20 to find the documents that satisfy the condition.
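
To make the arithmetic concrete, here is a minimal Python sketch of that cost model. It is only a toy illustration, not how Elasticsearch is implemented: the Shard class and the segments_visited function are hypothetical names, and the one-shard-per-index layout with 10 segments each is an assumption taken from the example above.

```python
from dataclasses import dataclass


@dataclass
class Shard:
    index: str     # name of the index this shard belongs to
    segments: int  # number of Lucene segments in the shard


def segments_visited(shards, index_filter=None):
    """Count segments scanned under a toy cost model.

    Assumption: with a filter on _index, a shard whose index name does not
    match can answer "no documents" without scanning its segments, while a
    filter on an ordinary field is evaluated against every segment.
    """
    visited = 0
    for shard in shards:
        if index_filter is not None and shard.index not in index_filter:
            continue  # treated like a match-nothing query: nothing to scan
        visited += shard.segments
    return visited


# Hypothetical layout: one shard per index, 10 segments each.
shards = [Shard("test.book", 10), Shard("test.film", 10)]

by_index = segments_visited(shards, index_filter={"test.film"})
by_field = segments_visited(shards)  # e.g. a filter on product_type

print(by_index, by_field)
```

Under these assumptions the _index filter touches half as many segments as the product_type filter. In a real cluster the gap depends on shard counts, segment sizes and caching, so the profiler output remains the thing to trust.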
