ElasticSearch: Possible to Query with Regex Field?

I have indexed data into ElasticSearch using the following index settings:

KNN_INDEX = {
    "settings": {
        "index.knn": True,
        "index.knn.space_type": "cosinesimil",
        "index.mapping.total_fields.limit": 10000,
        "analysis": {
          "analyzer": {
            "default": {
              "type": "standard",
              "stopwords": "_english_"
            }
          }
        }
    },
    "mappings": {
        "dynamic_templates": [
            {
                "sentence_vector_template": {
                    "match": "sent_vec*",
                    "mapping": {
                        "type": "knn_vector",
                        "dimension": 384,
                        "store": True
                    }
                }
            },
            {
                "sentence_template": {
                    "match": "sentence*",
                    "mapping": {
                        "type": "text",
                        "store": True
                    }
                }
            }
        ],
        "properties": {
            "metadata": {
                "type": "object"
            }
        }
    }
}

Following are a couple of example documents that I am indexing into ElasticSearch:

{
    # DOC 1
    "sentence_0": "Machine learning for aquatic plastic litter detection, classification and quantification (APLASTIC-Q)Large quantities of mismanaged plastic waste are polluting and threatening the health of the blue planet.",
    "sentence_1": "As such, vast amounts of this plastic waste found in the oceans originates from land.",
    "sentence_2": "It finds its way to the open ocean through rivers, waterways and estuarine systems."
},
{
    # DOC 2
    "sentence_0": "What predicts persistent early conduct problems?",
    "sentence_1": "Evidence from the Growing Up in Scotland cohortBackground There is a strong case for early identification of factors predicting life-course-persistent conduct disorder.",
    "sentence_2": "The authors aimed to identify factors associated with repeated parental reports of preschool conduct problems.",
    "sentence_3": "Method Nested caseecontrol study of Scottish children who had behavioural data reported by parents at 3, 4 and 5 years.",
    "sentence_4": "Results 79 children had abnormal conduct scores at all three time points ('persistent conduct problems') and 434 at one or two points ('inconsistent conduct problems')."
}

Each indexed document can contain a different number of sentences. When querying, I want to search across all sentences in all documents.
I am able to search a particular sentence field (for example "sentence_0") in all documents using the query below:

query_body = {
    "query": {
        "match": {
            "sentence_0": "persistent"
        }
    }
}
result = client.search(index=INDEX_NAME, body=query_body)
print(result)

But what I am looking for is something like below:

query_body = {
    "query": {
        "match": {
            "sentence_*": "persistent"
        }
    }
}
result = client.search(index=INDEX_NAME, body=query_body)
print(result)

The above query does not work, though. Is it possible to perform such a search? Thanks.

>Solution :

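Use query_string. Its "fields" parameter accepts wildcard patterns in field names (simple * wildcards, not full regular expressions), so "sentence*" covers sentence_0, sentence_1, and so on: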

{
  "query": {
    "query_string": {
      "fields": ["sentence*"],
      "query": "persistent"
    }
  }
}
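For the Python client used in the question, the same request could be built as in the sketch below. The helper names (build_wildcard_field_query, build_multi_match_query, fields_matching) are hypothetical and only for illustration. fields_matching is a local glob check that shows which document keys a pattern such as "sentence*" would select; it is not how the server resolves field names. The multi_match variant is an alternative that also accepts wildcards in "fields" and does not parse the query text as query syntax, which may be preferable when the search text comes from user input.

```python
from fnmatch import fnmatchcase


def build_wildcard_field_query(text, field_pattern="sentence*"):
    """Build a query_string body that searches every field matching the pattern."""
    return {
        "query": {
            "query_string": {
                "fields": [field_pattern],
                "query": text,
            }
        }
    }


def build_multi_match_query(text, field_pattern="sentence*"):
    """Alternative body: multi_match also accepts wildcards in its fields list."""
    return {
        "query": {
            "multi_match": {
                "query": text,
                "fields": [field_pattern],
            }
        }
    }


def fields_matching(doc, field_pattern):
    """Local illustration only: list the document keys a glob pattern selects."""
    return sorted(name for name in doc if fnmatchcase(name, field_pattern))


query_body = build_wildcard_field_query("persistent")
# Assumes `client` and `INDEX_NAME` are defined as in the question:
# result = client.search(index=INDEX_NAME, body=query_body)
# print(result)
```

Note that "sentence*" does not match the "sent_vec*" vector fields from the mapping, so only the text fields are searched.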