Elasticsearch is a free and open, distributed, RESTful search engine built on Lucene.




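### 1. Confirm the environment and install the required software

First, make sure an Elasticsearch instance is set up and running. Depending on your needs, you may also have to install a language runtime such as Python along with the relevant libraries, for example `elasticsearch-py` (the Python client for Elasticsearch).

### 2. Prepare the JSON file

Make sure you have one or more JSON files ready to import into Elasticsearch. Each file must be valid JSON whose structure matches the documents you want to index. For example: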

```json
{
  "id": "1",
  "product_name": "Apple iPhone 12",
  "category": "Electronics",
  "price": 799
}
```

### 3. Write a script to process and upload the data

I will use Python as the example to show how to import the data. First, install the `elasticsearch` library with pip:

```bash
pip install elasticsearch
```

Then write a Python script that reads the JSON file and indexes its contents into Elasticsearch. A simple example:

```python
from elasticsearch import Elasticsearch
import json

# Connect to the Elasticsearch instance
es = Elasticsearch("http://localhost:9200")

# Read the JSON file
with open('data.json', 'r', encoding='utf-8') as file:
    data = json.load(file)

# Index the document, using the id field in data as the document ID
response = es.index(index="products", id=data["id"], document=data)
print(response['result'])
```

### 4. Verify the data

After the import, query through Kibana or the Elasticsearch API to confirm that the data was indexed correctly.

```bash
curl -X GET "localhost:9200/products/_doc/1"
```

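This returns the document that was just indexed, so you can confirm the data is accurate.

### 5. Bulk import

If you have many JSON files or a single very large JSON file, consider the Bulk API for better throughput. In Python it can look like this: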

```python
from elasticsearch import Elasticsearch, helpers
import json

es = Elasticsearch("http://localhost:9200")

def bulk_index(file_path):
    with open(file_path, 'r') as file:
        data = json.load(file)
        actions = [
            {
                "_index": "products",
                "_id": doc["id"],
                "_source": doc
            }
            for doc in data
        ]
        helpers.bulk(es, actions)

bulk_index('large_data.json')
```

This example assumes `large_data.json` contains a JSON list in which each element is a document to index.
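For a file too large to load with a single `json.load` call, one option is to store one JSON document per line (NDJSON) and yield the actions lazily. The sketch below assumes that layout; `iter_actions` is a hypothetical helper name, not part of the `elasticsearch` package, and it uses only the standard library. The generator it returns could be passed to `helpers.bulk(es, ...)` in place of a list.

```python
import io
import json

def iter_actions(lines, index):
    """Yield one bulk action per non-empty NDJSON line (hypothetical helper)."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines between documents
        doc = json.loads(line)
        yield {"_index": index, "_id": doc["id"], "_source": doc}

# An in-memory stand-in for an open NDJSON file
sample = io.StringIO('{"id": "1", "price": 799}\n\n{"id": "2", "price": 699}\n')
actions = list(iter_actions(sample, "products"))
print(len(actions))
```

Because the actions are produced one at a time, memory use stays roughly constant regardless of file size.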

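### 6. Monitor and optimize

Depending on the data volume and the complexity of the index, you may need to monitor the performance of the Elasticsearch cluster and adjust its configuration or hardware resources accordingly.

Those are the basic steps, plus a few more advanced techniques, for importing JSON files into Elasticsearch.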

How to Import/Index a JSON file into Elasticsearch

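`connect` and `createConnection` are not APIs or functions provided officially by Elasticsearch. The two names probably come from a specific context or library; for example, some client libraries may offer such methods to manage connections to an Elasticsearch cluster.

Assuming you mean a particular Elasticsearch client library, then typically:

- A `connect` method would establish a connection to the Elasticsearch cluster. It is likely a convenience method that connects to the cluster and confirms the connection is alive. It may take few parameters, or fall back on default configuration.

- A `createConnection` method would be more flexible, letting the developer specify more configuration options such as the address, port, protocol, and authentication details. It would probably return a connection instance to use for subsequent operations and queries.

For example, with a Node.js Elasticsearch client, the two methods might be used like this (pseudocode):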

```javascript
// Assume this is an imaginary Elasticsearch client library
const esClient = require('elasticsearch-client');

// Use connect to simply connect to the Elasticsearch cluster
esClient.connect('http://localhost:9200');

// Use createConnection to create a connection with detailed configuration
const connection = esClient.createConnection({
  host: 'http://localhost:9200',
  log: 'trace',
  auth: {
    username: 'user',
    password: 'pass'
  }
});
```

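In real Elasticsearch clients, such as the official `elasticsearch.js` or the newer `@elastic/elasticsearch`, you normally pass the configuration parameters when instantiating the client; there is no separate `connect` or `createConnection` method. For example: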

```javascript
const { Client } = require('@elastic/elasticsearch');

const client = new Client({
  node: 'http://localhost:9200',
  auth: {
    username: 'user',
    password: 'pass'
  }
});
```

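In the official client example above, you only create a `Client` instance and pass the configuration through its constructor to connect to the Elasticsearch cluster.

So, to give an accurate answer, I need to know which client library or application the `connect` and `createConnection` in question belong to. With more context or detail, a more specific answer is possible.

> My understanding of the official documentation is that mongoose.connect() is generally used when there is only one connection, and mongoose.createConnection() when there are multiple connection instances.

Yes. To be precise, `.connect()` actually creates a pool of sockets/connections that stay open (sized by the `poolSize` connection setting, default 5), so there are in fact multiple connections, but in a single pool. That said, if you want multiple connection pools with different properties, you should use `createConnection`.

> Also, if my understanding is correct, what are the drawbacks of using mongoose.createConnection() for a single connection? Why is it not recommended to use mongoose.createConnection() in every case to standardize connections?

That is a good question, and [the documentation](https://mongoosejs.com/docs/connections.html#multiple_connections) already has an answer:

> > When you call mongoose.connect(), Mongoose creates a default connection. You can access the default connection using mongoose.connection.

Basically, `.connect` is shorthand for `createConnection` with a set of (mostly) best-practice settings.

In most simple projects, you don't have to worry about specifying different read or write settings, pool sizes, separate connections to different replica servers, and so on, which is why `.connect` exists.

However, if you have more demanding requirements (for legal or performance reasons, for example), you may have to use `createConnection`.

A few weeks ago I ran into a situation where one of my (internal) statistics packages needed sporadic but heavy database access. Since I did not want to pass the db/mongoose object into the package, to keep it as modular as possible, I simply created a new connection. That worked very well, because I only needed to access a model defined inside the package, not the models defined in my "parent" package. Since the new package needs only read access and can read from a different slave/replica database to reduce load on the primary, I switched to `createConnection` on both ends to separate the connections from the primary database.

For me, a big drawback of creating multiple connections in the same module is that you cannot access models directly through `mongoose.model` if they were defined on a different connection. This answer explains the problem in detail: [https://stackoverflow.com/a/22838614/2856218](https://stackoverflow.com/a/22838614/2856218)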

What is the difference between connect and createConnection in Elasticsearch?

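Elasticsearch originally shipped without any user authentication mechanism. Starting with 5.x, the Elastic Stack introduced the X-Pack plugin, and later, in the 7.x line, the basic security features of Elasticsearch and Kibana, including password protection, became part of the free Basic license. In those versions they still have to be switched on with `xpack.security.enabled: true`.

When you first set up security, you need to initialize the passwords of the built-in users. Elasticsearch has several built-in users, such as `elastic`, `kibana`, and `logstash_system`. Of these, `elastic` is the superuser and can be used to log in to Kibana and operate the Elasticsearch cluster.

In Elasticsearch versions with the basic security features enabled, there is no default password. Instead, you use the `elasticsearch-setup-passwords` command during setup to set passwords for the built-in users. For example, the following command sets passwords for all built-in users: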

```bash
bin/elasticsearch-setup-passwords auto
```

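This command generates a random password for each built-in user and prints it on the command line. Alternatively, use the `interactive` mode to set the password you want for each user.

For an Elasticsearch instance running in a Docker container, you can also set the `ELASTIC_PASSWORD` environment variable to specify the password of the `elastic` user.

For security reasons, avoid default or weak passwords, and set strong passwords for all built-in users at deployment time. For production environments, it is also best to configure user roles according to the principle of least privilege to reduce security risk.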
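Once a password is set, every HTTP request has to carry the credentials. As a minimal sketch using only the Python standard library, the following builds (without sending) a request with an HTTP Basic `Authorization` header; `authed_request` is a hypothetical helper name, and the URL, user, and password are placeholder values.

```python
import base64
import urllib.request

def authed_request(url, username, password):
    """Build a GET request carrying HTTP Basic credentials (not sent here)."""
    raw = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(raw).decode("ascii")
    return urllib.request.Request(url, headers={"Authorization": f"Basic {token}"})

req = authed_request("http://localhost:9200", "elastic", "my_own_password")
print(req.get_header("Authorization"))
# To send it against a running cluster: urllib.request.urlopen(req)
```

This is the same header that `curl -u user:password` generates; over plain HTTP the credentials are only encoded, not encrypted.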

For older X-Pack versions (the example below is from 5.6.2), [the defaults are](https://www.elastic.co/guide/en/x-pack/current/security-getting-started.html):

    user: elastic
    password: changeme
    

So:

    $ curl -u elastic:changeme localhost:9200
    {
      "name" : "5aEHJ-Y",
      "cluster_name" : "docker-cluster",
      "cluster_uuid" : "3FmaYN7rS56oBTqWOyxmKA",
      "version" : {
        "number" : "5.6.2",
        "build_hash" : "57e20f3",
        "build_date" : "2017-09-23T13:16:45.703Z",
        "build_snapshot" : false,
        "lucene_version" : "6.6.1"
      },
      "tagline" : "You Know, for Search"
    }
    

Learn more about [changing the defaults](https://www.elastic.co/guide/en/x-pack/current/security-getting-started.html).

Setting a username and password for Elasticsearch (ES version 7.5.2, Ubuntu 18.04)
--------------------------------------------------

Step 1: First enable X-Pack security in the elasticsearch.yml file

    root@flax:/etc/elasticsearch# vim elasticsearch.yml
    
    Add the following line to the end of file:
        xpack.security.enabled: true
    
    File Contents:
    # ======================== Elasticsearch Configuration =========================
    #
    # NOTE: Elasticsearch comes with reasonable defaults for most settings.
    #       Before you set out to tweak and tune the configuration, make sure you
    #       understand what are you trying to accomplish and the consequences.
    #
    # The primary way of configuring a node is via this file. This template lists
    # the most important settings you may want to configure for a production cluster.
    #
    # Please consult the documentation for further information on configuration options:
    # https://www.elastic.co/guide/en/elasticsearch/reference/index.html
    #
    # ---------------------------------- Cluster -----------------------------------
    #
    # Use a descriptive name for your cluster:
    #
    #cluster.name: my-application
    #
    # ------------------------------------ Node ------------------------------------
    #
    # Use a descriptive name for the node:
    #
    #node.name: node-1
    #
    # Add custom attributes to the node:
    #
    #node.attr.rack: r1
    #
    # ----------------------------------- Paths ------------------------------------
    #
    # Path to directory where to store the data (separate multiple locations by comma):
    #
    path.data: /var/lib/elasticsearch
    #
    # Path to log files:
    #
    path.logs: /var/log/elasticsearch
    #
    # ----------------------------------- Memory -----------------------------------
    #
    # Lock the memory on startup:
    #
    #bootstrap.memory_lock: true
    #
    # Make sure that the heap size is set to about half the memory available
    # on the system and that the owner of the process is allowed to use this
    # limit.
    #
    # Elasticsearch performs poorly when the system is swapping the memory.
    #
    # ---------------------------------- Network -----------------------------------
    #
    # Set the bind address to a specific IP (IPv4 or IPv6):
    #
    #network.host: 192.168.0.1
    network.host: 127.0.0.1
    http.host: 0.0.0.0
    #
    # Set a custom port for HTTP:
    #
    http.port: 9200
    #
    # For more information, consult the network module documentation.
    #
    # --------------------------------- Discovery ----------------------------------
    #
    # Pass an initial list of hosts to perform discovery when this node is started:
    # The default list of hosts is ["127.0.0.1", "[::1]"]
    #
    #discovery.seed_hosts: ["host1", "host2"]
    #
    # Bootstrap the cluster using an initial set of master-eligible nodes:
    #
    #cluster.initial_master_nodes: ["node-1", "node-2"]
    #
    # For more information, consult the discovery and cluster formation module documentation.
    #
    # ---------------------------------- Gateway -----------------------------------
    #
    # Block initial recovery after a full cluster restart until N nodes are started:
    #
    #gateway.recover_after_nodes: 3
    #
    # For more information, consult the gateway module documentation.
    #
    # ---------------------------------- Various -----------------------------------
    #
    # Require explicit names when deleting indices:
    #
    #action.destructive_requires_name: true
    xpack.security.enabled: true
    

Step 2: Go to the /usr/share/elasticsearch folder:

    root@flax:/usr/share/elasticsearch# systemctl start elasticsearch
    
    root@flax:/usr/share/elasticsearch# ./bin/elasticsearch-setup-passwords interactive
    
    Initiating the setup of passwords for reserved users elastic,apm_system,kibana,logstash_system,beats_system,remote_monitoring_user.
    You will be prompted to enter passwords as the process progresses.
    Please confirm that you would like to continue [y/N]y
    
    
    Enter password for [elastic]: 
    Reenter password for [elastic]: 
    Enter password for [apm_system]: 
    Reenter password for [apm_system]: 
    Enter password for [kibana]: 
    Reenter password for [kibana]: 
    Enter password for [logstash_system]: 
    Reenter password for [logstash_system]: 
    Enter password for [beats_system]: 
    Reenter password for [beats_system]: 
    Passwords do not match.
    Try again.
    Enter password for [beats_system]: 
    Reenter password for [beats_system]: 
    Enter password for [remote_monitoring_user]: 
    Reenter password for [remote_monitoring_user]: 
    Changed password for user [apm_system]
    Changed password for user [kibana]
    Changed password for user [logstash_system]
    Changed password for user [beats_system]
    Changed password for user [remote_monitoring_user]
    Changed password for user [elastic]
    
    root@flax:/usr/share/elasticsearch# systemctl restart elasticsearch
    
    root@flax:/usr/share/elasticsearch# systemctl restart elasticsearch.service

Pay attention to the Elasticsearch version. In **7.2** the `ELASTIC_PASSWORD` parameter works.

    docker run -p 9200:9200 \
               -p 9300:9300 \
               -e "discovery.type=single-node" \ 
               -e "ELASTIC_PASSWORD=my_own_password" \
    

But you should also add this line to elasticsearch.yml:

    xpack.security.enabled: true
    

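It is not present by default.

[List of security settings](https://www.elastic.co/guide/en/elasticsearch/reference/current/security-settings.html#general-security-settings)

If you enable basic X-Pack security with `xpack.security.enabled: true` in Elasticsearch 7.7 (the current version at the time of writing this answer), there is no default `changeme` password as there was in older versions of X-Pack.

As mentioned in [the official security getting-started documentation](https://www.elastic.co/guide/en/x-pack/6.2/security-getting-started.html)

> X-Pack security provides a built-in `elastic` superuser you can use to start setting things up. The `elastic` user has full access to the cluster, including all indices and data, **so the `elastic` user does not have a password set by default.**

So you need to set the password for `elastic`. If you want to change the password after installation, follow [the guide to setting built-in user passwords in interactive mode](https://www.elastic.co/guide/en/elasticsearch/reference/current/built-in-users.html#set-built-in-user-passwords)

This requires running the following command from the Elasticsearch home directory (the one that contains `bin`).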

    bin/elasticsearch-setup-passwords interactive

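**Set a username and password**

SSH into the system, stop the Elasticsearch and Kibana services, then run the following command

    sudo nano /etc/elasticsearch/elasticsearch.yml


Update this file to enable security by adding the following line

    xpack.security.enabled: true


**Change the passwords**

Follow these steps to change the passwords

Step 1:

     cd /usr/share/elasticsearch/


Step 2:

    sudo bin/elasticsearch-setup-passwords auto


> auto - uses randomly generated passwords; interactive - uses passwords entered by the user

or

    sudo bin/elasticsearch-setup-passwords interactive


> You can run the command in "interactive" mode, which prompts you to enter new passwords for the elastic, kibana\_system, logstash\_system, beats\_system, apm\_system, and remote\_monitoring\_user users:

The commands above set the passwords.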

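**Start Elasticsearch**

1.  Start the Elasticsearch service with the systemctl command:
    
    sudo systemctl start elasticsearch.service
    

The system may take some time to start the service. If it succeeds there is no output.

2.  Enable Elasticsearch to start at boot:
    
    sudo systemctl enable elasticsearch.service
    

**Start and enable Kibana**

1.  Start the Kibana service:
    
    sudo systemctl start kibana
    

If the service starts successfully there is no output.

2.  Next, configure Kibana to start at boot:
    
    sudo systemctl enable kibana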

What is the default user and password for elasticsearch

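In Elasticsearch, data is usually inserted by submitting JSON documents to the chosen index with HTTP PUT or POST requests. Here are several common ways to insert data:

### Insert a single document with HTTP PUT

If you already know the ID of the document you want to insert, use the PUT method directly. For example: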

```http
PUT /index_name/_doc/document_id
{
  "field1": "value1",
  "field2": "value2",
  ...
}
```

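In this example, `index_name` is the name of the index to insert into, `_doc` is the document endpoint (it takes the place of the custom mapping types that were deprecated in Elasticsearch 7.x), `document_id` is the unique identifier of the document, and the JSON document body follows.

### Insert a single document with HTTP POST

If you don't care about the document ID, Elasticsearch can generate one for you. Use the POST method: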

```http
POST /index_name/_doc
{
  "field1": "value1",
  "field2": "value2",
  ...
}
```

In this example, Elasticsearch generates the document ID automatically and inserts the supplied data.
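The choice between PUT and POST can be sketched in Python with the standard library alone. The function below only builds the request object, so it runs without a cluster; `index_request` is a hypothetical helper name and the base URL is an assumed local node.

```python
import json
import urllib.request

def index_request(base_url, index, doc, doc_id=None):
    """Build (but do not send) a request that indexes one document."""
    if doc_id is None:
        # No ID supplied: POST lets Elasticsearch generate one
        url, method = f"{base_url}/{index}/_doc", "POST"
    else:
        # Known ID: PUT the document at that ID
        url, method = f"{base_url}/{index}/_doc/{doc_id}", "PUT"
    return urllib.request.Request(
        url,
        data=json.dumps(doc).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method=method,
    )

with_id = index_request("http://localhost:9200", "index_name", {"field1": "value1"}, "document_id")
auto_id = index_request("http://localhost:9200", "index_name", {"field1": "value1"})
print(with_id.get_method(), with_id.full_url)
print(auto_id.get_method(), auto_id.full_url)
```

Passing either request to `urllib.request.urlopen` would send it to a running node.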

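### Bulk insert documents

When you need to insert many documents, use the Elasticsearch bulk API (`_bulk`) for better efficiency. It inserts multiple documents in a single request, for example: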

```http
POST /_bulk
{ "index" : { "_index" : "index_name", "_id" : "1" } }
{ "field1" : "value1" }
{ "index" : { "_index" : "index_name", "_id" : "2" } }
{ "field1" : "value2" }
...
```

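The bulk API accepts a sequence of operations. Each `index` operation takes two lines: the first names the operation and its metadata (such as `_index` and `_id`), and the second holds the actual document data.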
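The two-lines-per-operation layout is newline-delimited JSON, and the request body has to end with a newline. A minimal sketch of assembling such a body in Python follows; `build_bulk_body` is a hypothetical helper name, and the index name and documents are placeholders.

```python
import json

def build_bulk_body(index, docs):
    """Return an NDJSON string for POST /_bulk (hypothetical helper).

    `docs` maps document IDs to document bodies.
    """
    lines = []
    for doc_id, doc in docs.items():
        # Line 1: the operation and its metadata
        lines.append(json.dumps({"index": {"_index": index, "_id": doc_id}}))
        # Line 2: the document itself
        lines.append(json.dumps(doc))
    # The final line must also be terminated by a newline
    return "\n".join(lines) + "\n"

body = build_bulk_body(
    "index_name",
    {"1": {"field1": "value1"}, "2": {"field1": "value2"}},
)
print(body, end="")
```

When sending this body over HTTP, the request is normally made with the `Content-Type: application/x-ndjson` header.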

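### Use a client library

Besides sending HTTP requests directly, many developers prefer to talk to Elasticsearch through a client library, which offers a more convenient API and error handling. Taking JavaScript as an example, the official `@elastic/elasticsearch` client library can insert data like this: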

```javascript
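const { Client } = require('@elastic/elasticsearch');
const client = new Client({ node: 'http://localhost:9200' });

async function run() {
  // Index one document; omit `id` to let Elasticsearch generate one.
  // Newer (8.x) clients prefer a `document` key over `body`.
  const result = await client.index({
    index: 'index_name',
    id: 'document_id',
    body: {
      field1: 'value1',
      field2: 'value2'
    }
  });
  console.log(result);
}

// Log any connection or indexing error
run().catch(console.error);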
```

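In this example, we create an Elasticsearch client instance and use its `index` method to insert a document. You can specify the document ID or let Elasticsearch generate one.

To summarize, inserting data into Elasticsearch generally means sending HTTP requests containing JSON documents to the appropriate index, whether for a single document or a batch. Client libraries simplify this process and provide a more convenient and robust programming interface.

You must first install the `curl` binary on your PC. [You can download it here](http://curl.haxx.se/gknw.net/7.29.0/dist-w32/curl-7.29.0-rtmp-ssh2-ssl-sspi-zlib-idn-static-bin-w32.zip).

Then unzip it into a folder, say `C:\curl`. In that folder you will find `curl.exe` along with several `.dll` files.

Now open a command prompt by typing `cmd` in the `start menu`. Then type `cd c:\curl` there, which takes you to the curl folder. Now run the `curl` command you have.

One thing to note: Windows does not support single quotes around fields, so you must use double quotes. For example, I have converted your curl command to the appropriate form.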

    curl -H "Content-Type: application/json" -XPOST "http://localhost:9200/indexname/typename/optionalUniqueId" -d "{ \"field\" : \"value\"}"

If you use Kibana with Elasticsearch, you can use the REST requests below to create an index and put a document into it.

Create an index
----

    http://localhost:9200/company
    PUT company
    {
      "settings": {
        "index": {
          "number_of_shards": 1,
          "number_of_replicas": 1
        },
        "analysis": {
          "analyzer": {
            "analyzer-name": {
              "type": "custom",
              "tokenizer": "keyword",
              "filter": "lowercase"
            }
          }
        }
      },
      "mappings": {
        "employee": {
          "properties": {
            "age": {
              "type": "long"
            },
            "experience": {
              "type": "long"
            },
            "name": {
              "type": "text",
              "analyzer": "analyzer-name"
            }
          }
        }
      }
    }
    

Create a document
----

    POST http://localhost:9200/company/employee/2/_create
    {
    "name": "Hemani",
    "age" : 23,
    "experienceInYears" : 2
    }

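Let me explain clearly. If you are familiar with an RDBMS, an index is a database and an index type is a table. That is, an index is a collection of index types, just as a database (DB) is a collection of tables.

In NoSQL terms, an index is a database and an index type is a collection; a database is a group of collections.

To run these queries, you need to install cURL for Windows.

Curl is nothing but a command-line REST tool. If you want a graphical tool, try the Sense plugin for Chrome.

Hope this helps.

To test and try curl requests from Windows, you can use the Postman client Chrome extension. It is very simple to use and very powerful.

Or, as suggested, you can install the cURL utility.

A sample curl request follows.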

    curl -X POST -H "Content-Type: application/json" -H "Cache-Control: no-cache" -d '{
    "user" : "Arun Thundyill Saseendran",
    "post_date" : "2009-03-23T12:30:00",
    "message" : "trying out Elasticsearch"
    }' "http://10.103.102.56:9200/sampleindex/sampletype/"
    

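I am also just starting out and exploring ES extensively, so let me know if you have any other questions.

Edit: updated the index name and type name to all lowercase to avoid errors and follow convention.

I started with `curl`, but later migrated to `kibana`. Here is more information about the ELK stack from elastic.co (E for Elasticsearch, K for Kibana): [https://www.elastic.co/elk-stack](https://www.elastic.co/elk-stack)

With Kibana, your `POST` request is a bit simpler: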

    POST /<INDEX_NAME>/<TYPE_NAME>
    {
        "field": "value",
        "id": 1,
        "account_id": 213,
        "name": "kimchy"
    }

How to insert data into elasticsearch

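In Elasticsearch, the name of an index cannot be changed directly once the index has been created, but you can indirectly "rename" an index by creating an alias for it or by reindexing.

### Method 1: Use an alias

Although an index cannot be renamed directly, you can create one or more aliases for it and then access the existing index through the new alias.

The steps to create an alias are:

1. Use a `POST` or `PUT` request to create an alias for the existing index: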

```json
POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "original_index_name",
        "alias": "new_alias"
      }
    }
  ]
}
```

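2. Confirm that the alias was created and that the data can be accessed through it.

3. Keep the original index in place. The alias is only a pointer to it, so deleting the original index would delete the data as well. Before relying on the alias, make sure all reads and writes have switched to it.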
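An alias can later be repointed to a different index by combining a `remove` and an `add` action in one `_aliases` request. The sketch below only builds that request body in Python; `alias_swap_body` is a hypothetical helper name, and the alias and index names are made-up examples.

```python
import json

def alias_swap_body(alias, old_index, new_index):
    """Build a POST /_aliases body that repoints `alias` in one request."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }

body = alias_swap_body("products", "products_v1", "products_v2")
print(json.dumps(body, indent=2))
```

Putting both actions in the same request means clients using the alias never see a moment in which it points to no index.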

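### Method 2: Reindex

If you need a more thorough rename, you can reindex. This means copying the data from the old index into a new index, after which you can delete the old index if you wish.

The steps to reindex are:

1. Create the new index and specify the settings and mappings you need.

2. Use the `_reindex` API to copy the data from the old index to the new index: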

```json
POST /_reindex
{
  "source": {
    "index": "old_index_name"
  },
  "dest": {
    "index": "new_index_name"
  }
}
```

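3. After reindexing completes, make sure the new index contains all of the data.

4. Update all applications and services to use the new index name.

5. Delete the old index (if you are sure it is no longer needed):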

```json
DELETE /old_index_name
```

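**Note:** Renaming an index (reindexing in particular) can consume considerable time and resources. For large indices or production environments, proceed with care and account for possible downtime, data consistency issues, and the impact on queries and indexing operations in progress. In production you may want to do this during a low-traffic window and make sure a complete backup exists in case something goes wrong.

You can use [REINDEX](https://www.elastic.co/guide/en/elasticsearch/reference/current/docs-reindex.html) to do this.

> Reindex does not attempt to set up the destination index. It does not copy the settings of the source index. You should [set up the destination index](https://www.elastic.co/guide/en/elasticsearch/reference/5.2/indices-create-index.html) before running a \_reindex action, including setting up mappings, shard counts, replicas, etc.

1.  First copy the index to a new name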

    POST /_reindex
    {
      "source": {
        "index": "twitter"
      },
      "dest": {
        "index": "new_twitter"
      }
    }
    

2.  Now delete the index

    DELETE /twitter

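Starting with Elasticsearch 7.4, the best way to rename an index is to copy it with the newly introduced [Clone Index API](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-clone-index.html) and then delete the original index with the [Delete Index API](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-delete-index.html).

Compared with using the Snapshot API or the Reindex API for the same purpose, the main advantage of the Clone Index API is speed, because it hard-links segments from the source index to the target index without reprocessing any of their content (on file systems that support hard links, obviously; otherwise the files are copied at the file system level, which is still more efficient than the alternatives). Cloning also guarantees that the target index is identical to the source index in every respect (that is, there is no need to copy settings and mappings manually, unlike with the reindex approach), and it does not require a local snapshot directory to be configured.

**_Side note:_** _Although this procedure is much faster than the previous solutions, it still implies downtime. There are real use cases that justify renaming an index (for example, as a step in a split, shrink, or backup workflow), but renaming indices should not be part of day-to-day operations. If your workflow requires frequent index renaming, you should consider using [index aliases](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html) instead._

Here is an example of the full sequence of operations to rename `source_index` to `target_index`. It can be run in an Elasticsearch-specific console, such as the [console integrated in Kibana](https://www.elastic.co/guide/en/kibana/current/console-kibana.html). See [this gist](https://gist.github.com/mjameswh/59c1f59497c03a5cf3697eeb6ca2445d) for an alternative version of this example that uses `curl` instead of the Elasticsearch console.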

    # Make sure the source index is actually open
    POST /source_index/_open
    
    # Put the source index in read-only mode
    PUT /source_index/_settings
    {
      "settings": {
        "index.blocks.write": "true"
      }
    }
    
    # Clone the source index to the target name, and set the target to read-write mode
    POST /source_index/_clone/target_index
    {
      "settings": {
        "index.blocks.write": null 
      }
    }
    
    # Wait until the target index is green;
    # it should usually be fast (assuming your filesystem supports hard links).
    GET /_cluster/health/target_index?wait_for_status=green&timeout=30s
    
    # If it appears to be taking too much time for the cluster to get back to green,
    # the following requests might help you identify eventual outstanding issues (if any)
    GET /_cat/indices/target_index
    GET /_cat/recovery/target_index
    GET /_cluster/allocation/explain
    
    # Delete the source index
    DELETE /source_index

To rename an index, you can use the Elasticsearch snapshot module.

First, take a snapshot of the index. When restoring it, you can rename the index.

        POST /_snapshot/my_backup/snapshot_1/_restore
        {
         "indices": "jal",
         "ignore_unavailable": "true",
         "include_global_state": false,
         "rename_pattern": "jal",
         "rename_replacement": "jal1"
         }
    

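rename\_replacement: the new index name into which the data is restored.

If you cannot REINDEX, a **workaround** is to use _aliases_. From the [official](https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-aliases.html) documentation:

_APIs in Elasticsearch accept an index name when working against a specific index, and several indices when applicable. The index aliases API allows aliasing an index with a name, with all APIs automatically converting the alias name to the actual index name. An alias can also be mapped to more than one index, and when specifying it, the alias will automatically expand to the aliased indices. An alias can also be associated with a filter that will automatically be applied when searching, and with routing values. An alias cannot have the same name as an index._

Note that this solution does not work if you use the More Like This feature. [https://github.com/elastic/elasticsearch/issues/16560](https://github.com/elastic/elasticsearch/issues/16560)

Another, different way to rename an index or change its mapping is to reindex with Logstash. Here is a sample Logstash 2.1 configuration: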

    input {
      elasticsearch {
       hosts => ["es01.example.com", "es02.example.com"]
       index => "old-index-name"
       size => 500
       scroll => "5m"
      }
    }
    filter {
    
     mutate {
      remove_field => [ "@version" ]
     }
    
     date {
       "match" => [ "custom_timestamp", "MM/dd/YYYY HH:mm:ss" ]
       target => "@timestamp"
     }
    
    }
    output {
     elasticsearch {
       hosts => ["es01.example.com", "es02.example.com" ]
       manage_template => false
       index => "new-index-name"
     }
    }

How to rename an index in a cluster in elasticsearch

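The main difference between Lucene and Elasticsearch is that they sit at different layers of search technology. Lucene is an open-source full-text search library that can be used to build search engines, whereas Elasticsearch is built on top of Lucene: it is an open-source search and analytics engine that provides distributed, multi-tenant full-text search with an HTTP web interface and schema-free JSON documents.

Here are some of the main differences between Lucene and Elasticsearch:

### Lucene:

1. **Core search library**: Lucene is a Java library that provides a low-level API for full-text search. It is not a complete search engine but a tool that helps developers build one.

2. **Foundational technology**: It handles core functions such as index creation, query parsing, and searching.

3. **Development complexity**: Using Lucene requires a deep understanding of index structures and search algorithms, and the developer has to write more code to handle indexing, querying, and ranking of search results.

4. **Distribution**: Lucene itself does not support distributed search; a developer who needs it has to implement it.

5. **Interface**: Lucene is exposed mainly through a Java API; non-Java environments need an additional wrapper or bridge.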

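### Elasticsearch:

1. **Complete search engine**: Elasticsearch is a real-time distributed search and analytics engine that can be used in production directly.

2. **Built on Lucene**: Under the hood Elasticsearch uses Lucene for indexing and searching, but it offers an easy-to-use RESTful API, so developers can index and query data with JSON.

3. **Simplified operations**: Elasticsearch simplifies the complex process of building a search engine and provides ready-made solutions for cluster management, data analysis, monitoring, and more.

4. **Distributed architecture**: Elasticsearch is distributed and scalable by design and can handle data at petabyte scale.

5. **Multi-language clients**: Elasticsearch provides clients for many languages, which makes it easy to integrate from different development environments.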

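### Example application:

Suppose we want to build an article search feature for a website:

- With **Lucene**, we need to define a custom data model, build the index, process search queries, implement the ranking algorithm, handle highlighting, and also work out how to integrate these search functions into the website. This places high demands on the developer, who must know Lucene in depth and be able to deal with low-level details.

- With **Elasticsearch**, we can index the article content directly through HTTP requests. When a user types a query into the search box, we send an HTTP request to Elasticsearch, which processes the query and returns well-formed JSON results that include the top-ranked documents and highlighted search terms. This greatly simplifies developing and maintaining the search system.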

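_Lucene is a **Java library**_. You can include it in your project and call its functions directly.

Elasticsearch is a _JSON-based_, _distributed_ _web server_ built on Lucene. Although the actual work underneath is done by Lucene, Elasticsearch gives us a convenient layer on top of it. Every shard created in Elasticsearch is a separate Lucene instance. So, to summarize:

1.  Elasticsearch is built on Lucene and provides a **_JSON-based REST API_** to access Lucene features.
2.  Elasticsearch provides a **_distributed system on top of Lucene_**. Lucene is not aware of, or built for, distributed systems. Elasticsearch provides the abstraction for this distributed structure.
3.  Elasticsearch provides other supporting features such as thread pools, queues, node/cluster monitoring APIs, data monitoring APIs, cluster management, and so on.

In addition to what **@Vineeth Mohan** said:

**High availability:** Elasticsearch is distributed, so it can manage data replication, meaning the cluster holds multiple copies of the data. This enables high availability.

**Powerful Query DSL:** Elasticsearch gives us a JSON interface for reading and writing queries on top of Lucene. With Elasticsearch you can write complex queries without knowing Lucene syntax.

**Schemaless (Schema-Free):** Fields (name/value pairs) do not have to be defined in a `schema` beforehand. When you index data, Elasticsearch can create the schema automatically at runtime, as if by magic.

I will add another angle to the discussion.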

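### Elasticsearch index vs. Lucene index

An Elasticsearch **index** is a large collection of documents, much as a database in the relational world is made up of tables.  
To scale, we spread an Elasticsearch index across multiple physical nodes/servers.

To do that, we split the Elasticsearch index into smaller units called **shards**.

**Q: How does this relate to a Lucene index?**  
If we want to search for a specific term (for example "_Cake_" or "_Cookie_"), we have to check every shard and look for it (leaving aside for now how shards are located and replicated on each node).

That operation would take a lot of time, so we need an **efficient data structure for this search**, and this is where the **Lucene index** comes in.

Each **Elasticsearch shard is based on the Lucene index structure** and stores statistics about terms to make term-based search more efficient.

(!) This is very confusing because of the word "index", and because an Elasticsearch shard is part of an Elasticsearch index yet is based on the data structure of a Lucene index.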

* * *

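### Bonus - The Lucene index as an inverted index

As the example below shows, a Lucene index stores the content of the original documents plus additional information, such as the term dictionary and term frequencies, which makes searching more efficient: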

    Term           Document                 Frequency
    Cake           doc_id_1, doc_id_8       4 (2 in doc_id_1, 2 in doc_id_8)
    Cookie         doc_id_1, doc_id_6       3 (2 in doc_id_1, 1 in doc_id_6)
    Spaghetti      doc_id_12                1 (1 in doc_id_12)
    

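The Lucene index belongs to a family of indexes called inverted indexes, because for a given term it can list the documents that contain that term.  
This is the inverse of the natural relationship, in which documents list terms.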
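The table above can be reproduced with a toy inverted index in Python. This is only an illustration of the idea, not Lucene's actual on-disk format; `build_inverted_index` is a hypothetical helper name and the document texts are made up to match the term counts in the table.

```python
from collections import Counter, defaultdict

def build_inverted_index(docs):
    """Map each term to {doc_id: term frequency} (toy sketch, not Lucene's format)."""
    index = defaultdict(dict)
    for doc_id, text in docs.items():
        # Count how often each lowercase, whitespace-separated term occurs
        for term, freq in Counter(text.lower().split()).items():
            index[term][doc_id] = freq
    return index

docs = {
    "doc_id_1": "cake cake cookie cookie",
    "doc_id_6": "cookie",
    "doc_id_8": "cake cake",
    "doc_id_12": "spaghetti",
}
index = build_inverted_index(docs)
print(index["cake"])
print(sum(index["cookie"].values()))
```

Looking up a term is a single dictionary access, whereas without the index every document would have to be scanned.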

* * *

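### (Reminder) How did we get from shards to terms?

(1) A shard is a directory of files that contains documents.  
(2) A document is a sequence of fields.  
(3) A field is a named sequence of terms.

I will answer from a usage point of view.

Lucene is a search engine **library**. You would use it to build your own search engine: either a new competitor to Elasticsearch or Solr, or a search engine tailored to your use case (text analysis, for example).

Elasticsearch is a **search engine**. Most people use it for log aggregation, product search, or variations of the two (such as social media analytics or finding relevant people based on some search criteria). It is built on top of Lucene, so it **exposes most (though not all) of its features**. It also adds a lot, most importantly:

*   REST API
*   Query DSL
*   Distributed system (sharding, replication, cluster management)
*   Facets/[aggregations](https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html)
*   Additional features for common uses (such as [ingest](https://www.elastic.co/guide/en/elasticsearch/reference/current/ingest.html) processing) and for administration (APIs for monitoring its [relevant metrics](https://sematext.com/blog/top-10-elasticsearch-metrics-to-watch/), backup and restore, and so on)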


What is the difference between lucene and elasticsearch