In Elasticsearch, the nested data type is a specialized version of the object data type, used for indexing fields that contain arrays of objects. It is the right choice when each object in the array must be indexed and queried independently of the others.
Ordinary JSON object arrays in Elasticsearch do not preserve the boundaries between objects. For example, consider a document field containing personnel information, which includes multiple roles and skills associated with each role.
Without the nested type, a query for personnel with a specific role and a skill belonging to that role may yield incorrect results. By default, Elasticsearch flattens the array into one multi-valued field per property (roles.role and roles.skills), so the link between each role and its own skills is lost.
With the nested data type, each array element (object) is indexed internally as a separate hidden document, so each object can be matched on its own and values from different objects are never combined by mistake.
For example, consider the following document structure:
```json
{
  "person_id": 1,
  "name": "John Doe",
  "roles": [
    { "role": "developer", "skills": ["Java", "Elasticsearch"] },
    { "role": "designer", "skills": ["Photoshop", "Illustrator"] }
  ]
}
```
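To make the flattening concrete, here is a minimal Python sketch of what happens to the roles array when it is mapped as a plain object field. It is a simplified model for illustration only, not Elasticsearch's actual implementation; the function name `flatten_object_array` is hypothetical, and text analysis (tokenizing, lowercasing) is ignored.

```python
from collections import defaultdict


def flatten_object_array(field, objects):
    """Simplified model (an assumption, not Elasticsearch internals):
    collapse an array of objects into one multi-valued list per leaf
    field, which discards the boundaries between the objects."""
    flat = defaultdict(list)
    for obj in objects:
        for key, value in obj.items():
            # Treat scalars and lists uniformly as lists of values
            values = value if isinstance(value, list) else [value]
            flat[f"{field}.{key}"].extend(values)
    return dict(flat)


roles = [
    {"role": "developer", "skills": ["Java", "Elasticsearch"]},
    {"role": "designer", "skills": ["Photoshop", "Illustrator"]},
]

flat = flatten_object_array("roles", roles)
print(flat["roles.role"])    # ['developer', 'designer']
print(flat["roles.skills"])  # ['Java', 'Elasticsearch', 'Photoshop', 'Illustrator']
```

After flattening, nothing records that "Elasticsearch" belongs to the "developer" role rather than the "designer" role.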
In this case, suppose we want to find personnel who have the role "developer" with skills that include "Elasticsearch". Without the nested type, the query might incorrectly return a person who has the role "developer" but lists "Elasticsearch" only under a different role, because roles and skills are flattened and each condition is checked against the whole array rather than against a single object.
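The difference between the two matching behaviors can be sketched in a few lines of Python. This is a conceptual illustration under simplifying assumptions (exact string comparison, no text analysis); the helper names and the sample person are hypothetical and not part of any Elasticsearch API.

```python
def matches_flattened(roles, role, skill):
    # Flattened semantics: each condition is checked against the
    # whole array independently, so values may come from different objects.
    has_role = any(r["role"] == role for r in roles)
    has_skill = any(skill in r["skills"] for r in roles)
    return has_role and has_skill


def matches_nested(roles, role, skill):
    # Nested semantics: both conditions must hold within the same object.
    return any(r["role"] == role and skill in r["skills"] for r in roles)


# Hypothetical person: a developer who lists Elasticsearch under another role
other_roles = [
    {"role": "developer", "skills": ["Java"]},
    {"role": "analyst", "skills": ["Elasticsearch"]},
]

print(matches_flattened(other_roles, "developer", "Elasticsearch"))  # True (false positive)
print(matches_nested(other_roles, "developer", "Elasticsearch"))     # False
```

The flattened check reports a match even though no single role combines "developer" with "Elasticsearch", which is exactly the false positive the nested type prevents.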
To make this query work correctly in Elasticsearch, we need to define the roles field as a nested type in the index mapping:
```json
{
  "mappings": {
    "properties": {
      "roles": {
        "type": "nested",
        "properties": {
          "role": { "type": "text" },
          "skills": { "type": "keyword" }
        }
      }
    }
  }
}
```
Then we can use a nested query, which evaluates its inner query against each object in roles separately:
```json
{
  "query": {
    "nested": {
      "path": "roles",
      "query": {
        "bool": {
          "must": [
            { "match": { "roles.role": "developer" } },
            { "term": { "roles.skills": "Elasticsearch" } }
          ]
        }
      }
    }
  }
}
```
This query returns only the correct documents, i.e., personnel who have a single roles entry whose role is "developer" and whose skills include "Elasticsearch". Preserving this per-object association is the purpose of the nested data type in Elasticsearch.
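When the role and skill vary, the same request body can be assembled in code. The following is a small Python sketch that only constructs the dictionary; the helper name `build_nested_role_query` is hypothetical, and the field names assume the roles mapping with role and skills properties described above.

```python
import json


def build_nested_role_query(role, skill, path="roles"):
    """Build a nested-query request body (hypothetical helper).
    Assumes `path` is mapped as nested with `role` (text) and
    `skills` (keyword) sub-fields."""
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {
                    "bool": {
                        "must": [
                            # Analyzed match on the text field
                            {"match": {f"{path}.role": role}},
                            # Exact match on the keyword field
                            {"term": {f"{path}.skills": skill}},
                        ]
                    }
                },
            }
        }
    }


body = build_nested_role_query("developer", "Elasticsearch")
print(json.dumps(body, indent=2))
```

The resulting dictionary serializes to the same JSON as the nested query above and can be sent as the body of a search request with whichever HTTP or Elasticsearch client is in use.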