1. Elasticsearch Field Types Overview
Elasticsearch is a distributed search and analytics engine, and the field types chosen in its mappings directly affect indexing performance, query efficiency, and data accuracy. Choosing the wrong field type can lead to unexpected tokenization, failed aggregations, or wasted storage. This article walks through Elasticsearch's core field types and gives practical selection guidelines based on real-world scenarios, to help developers build efficient and reliable search applications.
1.1 Common Field Types
Elasticsearch provides a rich set of built-in types, categorized as follows:
- Core text types: `text` (for full-text search) and `keyword` (for exact matching) are fundamental for handling text data.
- Numeric types: `integer`, `long`, `float`, `double` for numerical calculations.
- Boolean type: `boolean` for binary values.
- Date-time type: `date` for time-series analysis.
- Special types: `ip` (IP addresses), `object` (nested objects), `nested` (complex nested structures), etc.
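Note: With dynamic mapping, Elasticsearch maps a previously unseen string field as `text` with a `keyword` subfield (`ignore_above: 256`). This default is not specific to 8.0 and applies only to fields without an explicit mapping: a field declared as plain `text` gets no `keyword` subfield unless you add one. Declaring mappings explicitly avoids indexing every string twice when only one form is needed.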
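To make the note concrete, here is a small sketch that writes out, as a Python dict, the mapping that dynamic mapping generates for a string field, next to an explicit `text`-only declaration. The helper `indexed_representations` is a hypothetical illustration, not an Elasticsearch API.

```python
import json

# Mapping that dynamic mapping generates for an unseen string field:
# a `text` field plus a `keyword` subfield capped at 256 characters.
dynamic_string_mapping = {
    "type": "text",
    "fields": {
        "keyword": {"type": "keyword", "ignore_above": 256}
    },
}

# Explicit declaration for a field that only needs full-text search.
explicit_text_mapping = {"type": "text"}


def indexed_representations(mapping: dict) -> int:
    """Count how many times a value is indexed: once for the field itself
    plus once per multi-field listed under `fields` (hypothetical helper)."""
    return 1 + len(mapping.get("fields", {}))


print(json.dumps({"title": dynamic_string_mapping}, indent=2))
print(indexed_representations(dynamic_string_mapping))  # 2
print(indexed_representations(explicit_text_mapping))   # 1
```

The count is only a rough proxy for indexing cost, but it shows why an explicit mapping can be cheaper than the dynamic default.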
1.2 Detailed Type Explanations
Text Type
- Purpose: Full-text search, such as for article titles or content.
- Characteristics: Analyzed (tokenized) at index time, supports analyzed queries (e.g., `match`), but not exact matching.
- Example:
```json
"title": { "type": "text", "analyzer": "standard" }
```
- Best practices: Use only where tokenization is wanted. Avoid `term` queries on `text` fields: the query value is not analyzed, so it is compared against the already tokenized (and typically lowercased) terms and often matches nothing.
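The difference between `match` and `term` on a `text` field can be sketched in plain Python. The function `approx_standard_analyze` below is a rough stand-in for the standard analyzer (lowercasing and splitting on non-word characters); the real analyzer follows Unicode text segmentation rules, so treat this as an approximation under that assumption.

```python
import re


def approx_standard_analyze(text: str) -> list[str]:
    """Rough stand-in for the standard analyzer: split on non-word
    characters and lowercase every token (an approximation only)."""
    return [tok.lower() for tok in re.findall(r"\w+", text)]


def match_hits(indexed_tokens: list[str], query: str) -> bool:
    """`match` analyzes the query text, then looks for any resulting token."""
    return any(tok in indexed_tokens for tok in approx_standard_analyze(query))


def term_hits(indexed_tokens: list[str], value: str) -> bool:
    """`term` compares the value as-is, without analysis."""
    return value in indexed_tokens


indexed = approx_standard_analyze("Elasticsearch Field Types")
print(indexed)                               # ['elasticsearch', 'field', 'types']
print(match_hits(indexed, "Elasticsearch"))  # True
print(term_hits(indexed, "Elasticsearch"))   # False
print(term_hits(indexed, "elasticsearch"))   # True
```

The `term` lookup only succeeds when the caller happens to supply the exact indexed token, which is why it is unreliable on analyzed fields.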
Keyword Type
- Purpose: Exact matching, such as filtering by status or aggregating tags.
- Characteristics: Not tokenized, preserves original values, supports `term` queries and aggregations.
- Example:
```json
"status": { "type": "keyword", "ignore_above": 256 }
```
- Best practices: For fields requiring exact matching (e.g., status codes), use `keyword`. For example:
```json
"user_id": { "type": "keyword" }
```
Avoid using `text` for `user_id` queries.
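As a sketch of how a `keyword` field is typically used, the snippet below models the effect of `ignore_above` and builds two request bodies as Python dicts: an exact-match filter and a `terms` aggregation. The status value `published` and the aggregation name `by_status` are made up for illustration.

```python
import json


def keyword_is_indexed(value: str, ignore_above: int = 256) -> bool:
    """A keyword value longer than `ignore_above` characters is skipped
    at index time, so term queries and aggregations cannot see it."""
    return len(value) <= ignore_above


# Exact-match filter on the `status` keyword field (value is hypothetical).
filter_body = {
    "query": {"bool": {"filter": [{"term": {"status": "published"}}]}}
}

# Count documents per distinct status; `size: 0` skips returning hits.
aggregation_body = {
    "size": 0,
    "aggs": {"by_status": {"terms": {"field": "status"}}},
}

print(keyword_is_indexed("published"))  # True
print(keyword_is_indexed("x" * 300))    # False
print(json.dumps(filter_body))
print(json.dumps(aggregation_body))
```

Placing the `term` clause in a `filter` context skips relevance scoring, which suits yes/no conditions such as a status check.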
Numeric Types
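- integer/long: Integers, such as `age`.
- float/double: Floating-point numbers, such as `price`.
- Example:
```json
"price": { "type": "float" }
```
- Best practices: Choose the smallest numeric type that covers the expected range and precision. For currency, where binary floating point introduces rounding error, consider `scaled_float` with a `scaling_factor` (e.g., 100 for two decimal places). Avoid storing numbers in `text` fields.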
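For monetary values, one option worth considering is the `scaled_float` type, which stores a value as an integer after multiplying it by a `scaling_factor`. The sketch below imitates that arithmetic in Python to show why it sidesteps floating-point surprises; the helper names are hypothetical, and Python's `round` may treat exact ties differently from Elasticsearch.

```python
# Mapping fragment for a price stored with two decimal places.
price_mapping = {"price": {"type": "scaled_float", "scaling_factor": 100}}


def scaled_float_stored(value: float, scaling_factor: int = 100) -> int:
    """Integer a scaled_float field would keep: the value multiplied by
    the scaling factor and rounded to the nearest whole number."""
    return round(value * scaling_factor)


def scaled_float_read(stored: int, scaling_factor: int = 100) -> float:
    """Convert the stored integer back to the original scale."""
    return stored / scaling_factor


# Binary floating point cannot represent 0.1, 0.2, or 0.3 exactly.
print(0.1 + 0.2 == 0.3)            # False

# The scaled integers compare exactly.
print(scaled_float_stored(19.99))  # 1999
print(
    scaled_float_stored(0.1) + scaled_float_stored(0.2)
    == scaled_float_stored(0.3)
)                                  # True
```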
Date Type
- Purpose: Date-time values, such as log timestamps.
- Characteristics: Automatically parses date strings, supports time-range queries.
- Example:
```json
"created_at": { "type": "date", "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ" }
```
- Best practices: Specify `format` to avoid parsing errors. For example, `created_at` should use the `date` type, not `text`.
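A client can check timestamps against the declared layout before indexing. The sketch below uses Python's `strptime` with a format string that roughly corresponds to `yyyy-MM-dd'T'HH:mm:ss.SSSZ` (Python's `%f` accepts one to six fractional digits, so it is slightly more lenient), and then builds a range query whose bounds use the same layout. The sample timestamps are invented.

```python
from datetime import datetime

# Approximate Python counterpart of yyyy-MM-dd'T'HH:mm:ss.SSSZ
PY_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def parse_created_at(value: str) -> datetime:
    """Return a timezone-aware datetime, or raise ValueError if the
    string does not follow the expected layout."""
    return datetime.strptime(value, PY_FORMAT)


ts = parse_created_at("2024-01-15T10:30:00.000+0000")
print(ts.isoformat())

# Range query on the date field; bounds follow the mapping's format.
range_query = {
    "query": {
        "range": {
            "created_at": {
                "gte": "2024-01-01T00:00:00.000+0000",
                "lt": "2024-02-01T00:00:00.000+0000",
            }
        }
    }
}

try:
    parse_created_at("2024-01-15 10:30:00")  # wrong separator, no offset
except ValueError as exc:
    print("rejected:", exc)
```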
IP Type
- Purpose: IP addresses, such as user access sources.
- Characteristics: Parses IPv4 and IPv6 addresses and supports network range queries in CIDR notation.
- Example:
```json
"ip_address": { "type": "ip" }
```
- Best practices: Use only for IP address fields. Avoid `text` for IP filtering: it cannot answer CIDR range queries and performs worse.
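The network-range behavior can be illustrated with Python's standard `ipaddress` module, alongside the request body for a CIDR lookup on an `ip` field. The addresses are arbitrary examples, and `in_cidr` is a hypothetical helper that mirrors what the range match means, not how Elasticsearch implements it.

```python
import ipaddress


def in_cidr(address: str, cidr: str) -> bool:
    """True if the address belongs to the network written in CIDR notation."""
    return ipaddress.ip_address(address) in ipaddress.ip_network(cidr)


# A term query on an `ip` field accepts CIDR notation directly.
cidr_query = {"query": {"term": {"ip_address": "192.168.0.0/16"}}}

print(in_cidr("192.168.1.17", "192.168.0.0/16"))  # True
print(in_cidr("10.0.0.5", "192.168.0.0/16"))      # False

# String prefix matching gives the wrong answer for network membership:
print("192.168.10.5".startswith("192.168.1"))     # True
print(in_cidr("192.168.10.5", "192.168.1.0/24"))  # False
```

The last two lines show why treating addresses as strings is unreliable: a textual prefix is not the same thing as a subnet.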
Nested Type
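- Purpose: Handle arrays of objects, such as product tags.
- Characteristics: Each array element is indexed as its own hidden document instead of being flattened into the parent, so elements can be queried independently with the `nested` query.
- Example:
```json
"tags": { "type": "nested", "properties": { "name": { "type": "keyword" } } }
```
- Best practices: Use `nested` when a query must match several fields within the same array element. For example:
```json
"tags": { "type": "nested", "properties": { "tag_name": { "type": "keyword" } } }
```
Avoid the `object` type in that case: it flattens arrays of objects, so conditions can match across different elements.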
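The flattening problem is easiest to see with two fields per element. The sketch below simulates how an array of objects is matched under `object` versus `nested` semantics, then shows the corresponding `nested` query body. The `source` field and the sample values are hypothetical additions for illustration; the mappings above define only a name field.

```python
def flatten_object_array(items: list[dict]) -> dict:
    """How the `object` type indexes an array of objects: one multi-valued
    list per field, with the pairing between fields lost."""
    flat: dict = {}
    for item in items:
        for key, value in item.items():
            flat.setdefault(key, []).append(value)
    return flat


def object_match(items: list[dict], conditions: dict) -> bool:
    """Each condition is checked against the flattened lists separately."""
    flat = flatten_object_array(items)
    return all(value in flat.get(key, []) for key, value in conditions.items())


def nested_match(items: list[dict], conditions: dict) -> bool:
    """All conditions must hold within one and the same array element."""
    return any(
        all(item.get(key) == value for key, value in conditions.items())
        for item in items
    )


tags = [
    {"name": "sale", "source": "editor"},
    {"name": "new", "source": "user"},
]
wanted = {"name": "sale", "source": "user"}  # no single tag satisfies both

print(object_match(tags, wanted))  # True  (false positive across elements)
print(nested_match(tags, wanted))  # False

# Equivalent nested query: both clauses apply to the same element of `tags`.
nested_query = {
    "query": {
        "nested": {
            "path": "tags",
            "query": {
                "bool": {
                    "must": [
                        {"term": {"tags.name": "sale"}},
                        {"term": {"tags.source": "user"}},
                    ]
                }
            },
        }
    }
}
```

Nested documents add indexing and query cost, so the type is best reserved for arrays where this per-element matching actually matters.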

1.3 How to Choose the Right Field Types
Select field types based on these principles, considering real-world scenarios:
- Query requirements first:
  - Full-text search: Use the `text` type (e.g., the `title` field).
  - Exact matching: Use the `keyword` type (e.g., the `status` field).
  - Numeric ranges: Use numeric types (e.g., the `price` field).
  - Date filtering: Use the `date` type (e.g., the `created_at` field).
- Analysis requirements:
  - Aggregation operations: Prioritize `keyword` or `date` types. For example, aggregating on `status` requires `keyword`.
  - Text analysis: Use `text` for tokenization; use `keyword` to preserve original values.
- Storage efficiency:
  - `text` types consume more storage (post-tokenization) for large text; `keyword` types are smaller for small-value fields.
  - For high-frequency query fields, prioritize `keyword` to reduce indexing overhead.
- Avoid common pitfalls:
  - Incorrect example: Executing a `term` query on a `text` field:
```json
"query": { "term": { "title": { "value": "Elasticsearch" } } }
```
  The query value is not analyzed, while the standard analyzer indexed the title as lowercase tokens such as `elasticsearch`, so the query returns no hits.
  - Correct approach: Add a `keyword` subfield to `text` fields, or query `text` fields with `match`.
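The correct approach can be sketched as a multi-field mapping plus two small query builders. The subfield name `keyword` follows the common convention, and the helper functions are hypothetical conveniences rather than part of any client library.

```python
# `title` is analyzed for search; `title.keyword` keeps the exact value.
title_mapping = {
    "title": {
        "type": "text",
        "fields": {"keyword": {"type": "keyword", "ignore_above": 256}},
    }
}


def full_text_query(field: str, text: str) -> dict:
    """Analyzed search: use `match` against the text field."""
    return {"query": {"match": {field: text}}}


def exact_query(field: str, value: str, subfield: str = "keyword") -> dict:
    """Exact lookup: use `term` against the keyword subfield."""
    return {"query": {"term": {f"{field}.{subfield}": {"value": value}}}}


print(full_text_query("title", "Elasticsearch"))
print(exact_query("title", "Elasticsearch Field Types Overview"))
```

Note that the exact query must supply the complete original value, including case, because the keyword subfield stores the string untouched.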
Code Example: Index Mapping Design
Here is a practical index mapping example showing correct mixed-type field selection:
```json
{
  "mappings": {
    "properties": {
      "title": { "type": "text", "analyzer": "standard" },
      "status": { "type": "keyword", "ignore_above": 256 },
      "price": { "type": "float" },
      "created_at": { "type": "date", "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ" },
      "ip_address": { "type": "ip" },
      "tags": {
        "type": "nested",
        "properties": { "name": { "type": "keyword" } }
      }
    }
  }
}
```
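One way to apply such a mapping is to send it as the body of a create-index request (`PUT /<index>`). The sketch below only builds the request with Python's standard library and does not send it. The index name `articles` and the address `http://localhost:9200` are assumptions, as is an unsecured local node; a secured cluster would also need authentication.

```python
import json
import urllib.request

# Same fields as the mapping above; `price` carries no `format`
# parameter because numeric types do not define one.
MAPPING = {
    "mappings": {
        "properties": {
            "title": {"type": "text", "analyzer": "standard"},
            "status": {"type": "keyword", "ignore_above": 256},
            "price": {"type": "float"},
            "created_at": {
                "type": "date",
                "format": "yyyy-MM-dd'T'HH:mm:ss.SSSZ",
            },
            "ip_address": {"type": "ip"},
            "tags": {
                "type": "nested",
                "properties": {"name": {"type": "keyword"}},
            },
        }
    }
}


def build_create_index_request(base_url: str, index: str) -> urllib.request.Request:
    """Prepare (but do not send) a PUT request that creates the index."""
    return urllib.request.Request(
        url=f"{base_url}/{index}",
        data=json.dumps(MAPPING).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="PUT",
    )


request = build_create_index_request("http://localhost:9200", "articles")
print(request.get_method(), request.full_url)
# Against a running node: urllib.request.urlopen(request)
```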
Best practices summary:
- Always explicitly declare field types to avoid defaults.
- Match type to use case: `text` for search, `keyword` for exact matches.
- Optimize for performance: Use `keyword` subfields for `text` fields when needed.
- Validate with real data: Test queries to ensure correct type handling.