How to Specify Which Fields Are Indexed in Elasticsearch
In Elasticsearch, you control which fields are indexed through the index mapping. A mapping is similar to a schema definition in a database: it defines the names and types of the fields in an index and how their data is parsed and indexed. The steps and examples below show how to use it.

1. Understanding Default Behavior

First, it is important to understand Elasticsearch's default behavior. If no mapping is explicitly specified, Elasticsearch infers each field's type automatically and indexes the field. This means that, by default, every field in a document is searchable.

2. Custom Mapping

Although Elasticsearch can index all fields automatically, practical applications rarely need every field to be searchable. Unnecessary indexing consumes additional storage space and can affect performance.

Example: Creating a Custom Mapping

Suppose we have an index containing user data in which certain fields, such as the user description, never need to be searched. To handle this, define the mapping when you create the index and set "index": false on the user description field. The field is then not indexed, which saves resources, and it cannot be searched by queries.

3. Updating an Existing Mapping

Once an index has been created and data written to it, modifying the mapping becomes complex. Elasticsearch does not allow changing the data type of an existing field. If you need to change the indexing properties of a field (e.g., from "index": true to "index": false), the typical approach is to recreate the index.

Example: Reindexing

First, create a new index and apply the new mapping settings. Then use the _reindex API to copy the data from the old index to the new index.

4. Using Templates

For similar indices that are created frequently, you can use index templates to predefine mappings and other settings. Elasticsearch then applies these predefined settings automatically whenever it creates a matching index.

Example: Creating an Index Template

An index template pairs an index name pattern with the mappings to apply, so a field can be marked "index": false once and every new index that matches the pattern inherits that setting.

By using these methods, you can control which fields are indexed and optimize both indexing performance and storage. This matters most with large volumes of data, where a leaner index can improve search efficiency and reduce costs.
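As a rough illustration of the three examples above, the Python sketch below builds the requests for creating an index with a custom mapping, copying data with _reindex, and registering an index template. It only constructs and prints the requests rather than sending them. The index, template, and field names (users_v1, users_v2, users_template, users-*, name, email, description) are hypothetical, and the template endpoint assumes Elasticsearch 7.8 or later.

```python
import json

# All names below (users_v1, users_v2, users_template, users-*, and the
# field names) are hypothetical, chosen only for illustration.


def user_mappings():
    """Mapping in which `description` is not indexed."""
    return {
        "properties": {
            "name": {"type": "text"},
            "email": {"type": "keyword"},
            "description": {"type": "text", "index": False},
        }
    }


def create_index_request(index):
    """PUT /<index>: create an index with an explicit mapping."""
    return ("PUT", f"/{index}", {"mappings": user_mappings()})


def reindex_request(source, dest):
    """POST /_reindex: copy documents from the old index into the new one."""
    body = {"source": {"index": source}, "dest": {"index": dest}}
    return ("POST", "/_reindex", body)


def index_template_request(name, pattern):
    """PUT /_index_template/<name>: apply the mapping to matching new indices."""
    body = {
        "index_patterns": [pattern],
        "template": {"mappings": user_mappings()},
    }
    return ("PUT", f"/_index_template/{name}", body)


def indexed_fields(mappings):
    """Names of fields that remain indexed (`index` defaults to true)."""
    return [
        name
        for name, spec in mappings["properties"].items()
        if spec.get("index", True)
    ]


requests_to_send = [
    create_index_request("users_v2"),
    reindex_request("users_v1", "users_v2"),
    index_template_request("users_template", "users-*"),
]

if __name__ == "__main__":
    for method, path, body in requests_to_send:
        print(method, path)
        print(json.dumps(body, indent=2))
```

Each tuple holds a method, a path, and a JSON body, so it can be sent with any HTTP client or pasted into the Kibana Dev Tools console. Keeping the mapping in one function means the new index and the template share the same "index": false setting.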