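The previous two chapters set up the Elasticsearch environment and introduced the concepts of indices and documents. This chapter covers the basic document operations, CRUD (create, read, update, delete), and the batch APIs. Every single CRUD operation requires its own API call, whereas a batch request performs many operations in one call, which reduces overhead.
Document CRUD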
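| Operation | Description | Kibana command |
|---|---|---|
| Index | If the document does not exist, it is indexed as a new document; otherwise the existing document is deleted, the new one is indexed in its place, and the version is incremented by 1 | PUT my_index/_doc/1 {"user":"mike","comment":"hello world"} |
| Create | Indexes a new document via PUT or POST. PUT: the ID must be specified, and a duplicate ID is an error. POST: no ID is required; one is generated automatically | PUT my_index/_create/1 {"user":"mike","comment":"hello world"}; POST my_index/_doc (no ID, auto-generated) {"user":"mike","comment":"hello world"} |
| Read | Fetches a document by ID. Found: HTTP 200. Not found: HTTP 404 | GET my_index/_doc/1 |
| Update | Does not replace the whole document; merges the supplied fields into the existing document (partial update) | POST my_index/_update/1 {"doc":{"user":"mike","comment":"hello es"}} |
| Delete | Deletes the specified document | DELETE my_index/_doc/1 |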
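Note: the type is always _doc here, for forward compatibility with Elasticsearch 8.0. Before 6.0 an index could hold multiple types, from 6.0 an index supports a single type, and in 8.0 types are removed from the APIs altogether. To ease the transition, the index creation, index template, and mapping APIs accept an include_type_name parameter that switches type names in requests and responses on or off: it defaults to true in 6.8, defaults to false in 7.0, and is removed in 8.0.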
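The table above maps each operation to an HTTP method and a path. A minimal sketch of that mapping in Python follows; the function name crud_request is an assumption made for illustration, and nothing is sent over the network.

```python
def crud_request(op, index, doc_id=None):
    """Return the (HTTP method, path) pair for one document operation.

    Hypothetical helper that mirrors the table of Kibana commands;
    it only builds the request line and does not contact a cluster.
    """
    if op == "create" and doc_id is None:
        # POST without an id lets Elasticsearch generate one.
        return ("POST", f"/{index}/_doc")
    if doc_id is None:
        raise ValueError(f"{op} needs a document id")
    routes = {
        "index": ("PUT", f"/{index}/_doc/{doc_id}"),
        "create": ("PUT", f"/{index}/_create/{doc_id}"),
        "read": ("GET", f"/{index}/_doc/{doc_id}"),
        "update": ("POST", f"/{index}/_update/{doc_id}"),
        "delete": ("DELETE", f"/{index}/_doc/{doc_id}"),
    }
    if op not in routes:
        raise ValueError(f"unknown operation: {op}")
    return routes[op]


if __name__ == "__main__":
    for op in ("index", "create", "read", "update", "delete"):
        print(op, *crud_request(op, "my_index", "1"))
    print("create (auto id)", *crud_request("create", "my_index"))
```

Note that Index and Create differ only in the endpoint (_doc versus _create) when an id is given, which is why the two are easy to confuse.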
INDEX
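If the document does not exist, it is indexed as a new document; otherwise the existing document is deleted and the new one is indexed, and the version is incremented by 1.
If the document does not exist, a new document is indexed: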
PUT my_index/_doc/1
{
"name":"张三"
}
#A result of created means the document was created; the response also returns _index (the index name), _type, _id, _version, and so on
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 1,
"failed" : 0
},
"_seq_no" : 0,
"_primary_term" : 1
}
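If the document already exists, the existing document is deleted, the new document is indexed, and the version is incremented by 1.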
PUT my_index/_doc/1
{
"xb":"男"
}
#This time result is updated rather than the created returned by the first call, and _version has increased by 1
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 1,
"_primary_term" : 1
}
Now fetch the document with id 1 using GET
GET my_index/_doc/1
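#The document holds the second input, "xb":"男", not the first input, "name":"张三". Although PUT reports updated, it does not modify the original document: it deletes the original and indexes a new one.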
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"_seq_no" : 1,
"_primary_term" : 1,
"found" : true,
"_source" : {
"xb" : "男"
}
}
CREATE
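Create indexes a new document via PUT or POST. With PUT the ID must be specified, and a duplicate ID is an error. With POST no ID is required; one is generated automatically.
Using PUT my_index/_create/1 to create a document that already exists returns an error
Note: the index name is followed by the _create endpoint rather than the type (_doc); the latter form is an Index operation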
PUT my_index/_create/1
{
"name":"李四"
}
#The document with id 1 was already created by the Index request above, so create with id 1 fails with a 409 version conflict
{
"error": {
"root_cause": [
{
"type": "version_conflict_engine_exception",
"reason": "[1]: version conflict, document already exists (current version [2])",
"index_uuid": "huioLo83QmCdtFickQ9S5A",
"shard": "0",
"index": "my_index"
}
],
"type": "version_conflict_engine_exception",
"reason": "[1]: version conflict, document already exists (current version [2])",
"index_uuid": "huioLo83QmCdtFickQ9S5A",
"shard": "0",
"index": "my_index"
},
"status": 409
}
If the specified id does not exist, the document is created successfully
PUT my_index/_create/2
{
"name":"李四"
}
#id 2 does not exist yet, so the response reports a successful creation
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 2,
"_primary_term" : 1
}
Both Index and Create require a document id when used with PUT. When the caller does not need to choose the id, use POST instead
POST my_index/_doc
{
"name":"王五"
}
#The document is created and a randomly generated id is returned
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "76yFonEBuoXto2dXCuRb",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 3,
"_primary_term" : 1
}
Create another document with POST
POST my_index/_doc
{
"name":"马六"
}
#The document is created and a different id is returned
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8KyFonEBuoXto2dXzeQN",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 4,
"_primary_term" : 1
}
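The difference between Index and Create can be summarized with a toy in-memory model. This is only a sketch of the behavior observed in the responses above, not how Elasticsearch is implemented; the ToyIndex class, the VersionConflict exception, and the use of uuid4 for generated ids are all assumptions made for illustration.

```python
import uuid


class VersionConflict(Exception):
    """Stands in for the 409 version_conflict_engine_exception."""


class ToyIndex:
    """In-memory stand-in that mimics Index and Create semantics."""

    def __init__(self):
        self._docs = {}  # doc id -> (version, source)

    def index(self, doc_id, source):
        # Index replaces the whole document and bumps the version.
        exists = doc_id in self._docs
        version = self._docs[doc_id][0] + 1 if exists else 1
        self._docs[doc_id] = (version, dict(source))
        result = "updated" if exists else "created"
        return {"_id": doc_id, "_version": version, "result": result}

    def create(self, source, doc_id=None):
        # Without an id, generate one (uuid4 is a stand-in for the real scheme).
        if doc_id is None:
            doc_id = uuid.uuid4().hex
        if doc_id in self._docs:
            raise VersionConflict(f"[{doc_id}]: document already exists")
        self._docs[doc_id] = (1, dict(source))
        return {"_id": doc_id, "_version": 1, "result": "created"}

    def get(self, doc_id):
        # Return the stored source, or None when the id is unknown.
        entry = self._docs.get(doc_id)
        return None if entry is None else entry[1]


if __name__ == "__main__":
    idx = ToyIndex()
    print(idx.index("1", {"name": "张三"}))
    print(idx.index("1", {"xb": "男"}))
    print(idx.get("1"))
    try:
        idx.create({"name": "李四"}, doc_id="1")
    except VersionConflict as exc:
        print("conflict:", exc)
```

The second index call replaces the stored document entirely, which matches the GET result shown earlier where only "xb" survives.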
READ
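Reading is the most frequent use of Elasticsearch, and most of what follows in later chapters builds on it. This section covers only the simplest cases: fetching a document by id and fetching the first 10 documents
Fetch a document by index, type, and id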
GET my_index/_doc/1
#A successful lookup returns the following; _source holds all the fields that were submitted
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 2,
"_seq_no" : 1,
"_primary_term" : 1,
"found" : true,
"_source" : {
"xb" : "男"
}
}
If no document exists for the given id
GET my_index/_doc/100
#If the document does not exist, the response has found set to false (HTTP 404)
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "100",
"found" : false
}
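Without a document id, the _search endpoint returns the first 10 documents
GET my_index/_search
#Returns up to 10 hits by default; only 4 are shown here because the index holds only 4 documents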
{
"took" : 0,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 4,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_score" : 1.0,
"_source" : {
"xb" : "男"
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_score" : 1.0,
"_source" : {
"name" : "李四"
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "76yFonEBuoXto2dXCuRb",
"_score" : 1.0,
"_source" : {
"name" : "王五"
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "8KyFonEBuoXto2dXzeQN",
"_score" : 1.0,
"_source" : {
"name" : "马六"
}
}
]
}
}
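The 10-document default can be overridden with the size and from parameters of the search body, and the documents themselves sit under hits.hits in the response. A small sketch, where the helper names match_all_body and sources_by_id are assumptions and the sample response is abbreviated from the one above:

```python
def match_all_body(size=10, start=0):
    """Build a match_all search body; size=10 mirrors the default page size."""
    return {"query": {"match_all": {}}, "from": start, "size": size}


def sources_by_id(search_response):
    """Map each hit's _id to its _source."""
    return {hit["_id"]: hit["_source"] for hit in search_response["hits"]["hits"]}


if __name__ == "__main__":
    # Abbreviated copy of the _search response shown above.
    response = {
        "hits": {
            "total": {"value": 4, "relation": "eq"},
            "hits": [
                {"_id": "1", "_source": {"xb": "男"}},
                {"_id": "2", "_source": {"name": "李四"}},
            ],
        }
    }
    print(match_all_body(size=2))
    print(sources_by_id(response))
```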
UPDATE
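With update, the document must already exist, and only the supplied fields are changed; the rest of the document is left as it is
Updating a document that does not exist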
POST my_index/_update/100
{
"doc":{
"name":"张三"
}
}
#The error is as follows
{
"error" : {
"root_cause" : [
{
"type" : "document_missing_exception",
"reason" : "[_doc][100]: document missing",
"index_uuid" : "huioLo83QmCdtFickQ9S5A",
"shard" : "0",
"index" : "my_index"
}
],
"type" : "document_missing_exception",
"reason" : "[_doc][100]: document missing",
"index_uuid" : "huioLo83QmCdtFickQ9S5A",
"shard" : "0",
"index" : "my_index"
},
"status" : 404
}
Now update the document with id 1, which currently holds "xb":"男" at version 2
POST my_index/_update/1
{
"doc":{
"sg":"123"
}
}
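#After the update, fetching id 1 again shows that the version has increased by 1, the old field (not included in the update request) is still present, and the new field has been added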
GET my_index/_doc/1
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 3,
"_seq_no" : 6,
"_primary_term" : 1,
"found" : true,
"_source" : {
"xb" : "男",
"sg" : "123"
}
}
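The merge performed by a "doc" update can be sketched as a recursive dictionary merge: nested objects are merged key by key, while other values (including arrays) are replaced. The function name merge_partial is an assumption, and the sketch models only the resulting _source, not versioning or storage.

```python
def merge_partial(source, partial):
    """Return a new dict with `partial` merged into `source`.

    Nested dicts are merged recursively; any other value in
    `partial` replaces the value in `source`.
    """
    merged = dict(source)
    for key, value in partial.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_partial(merged[key], value)
        else:
            merged[key] = value
    return merged


if __name__ == "__main__":
    stored = {"xb": "男"}
    # Update: the old field survives and the new field is added.
    print(merge_partial(stored, {"sg": "123"}))
    # Index, by contrast, would simply replace `stored` with the new body.
```

This is the practical difference from Index: an Index request with {"sg":"123"} would have left the document without the "xb" field.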
DELETE
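Deletes the document with the given id. If the id does not exist, the result is not_found. The version information is not cleared when a document is deleted
Delete the document with id 1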
DELETE my_index/_doc/1
#A successful delete returns the following
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 4,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 7,
"_primary_term" : 1
}
Delete the document with id 1 again
DELETE my_index/_doc/1
#Deleting an id that no longer exists returns result not_found (HTTP 404); note that _version still increases
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 5,
"result" : "not_found",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 8,
"_primary_term" : 1
}
Now create a document with id 1 again and check the version number
PUT my_index/_create/1
{
"name":"张三"
}
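#After a delete, the version keeps counting from the deleted document's version (6 here) rather than restarting at 1. This holds only while Elasticsearch still retains the deleted document's version (the index.gc_deletes setting, 60s by default); after that, a document written to the same id starts again at version 1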
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 6,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 16,
"_primary_term" : 1
}
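Batch operations
Every CRUD operation needs a network request to call the API, so querying or inserting 100 documents means 100 requests, which is expensive. Batch operations address this: many operations travel in a single network request. This section covers three of them: bulk, mget, and msearch
| API | Description |
|---|---|
| bulk | Supports the C, U, and D operations (create, update, delete) |
| mget | Supports R (read); used for exact lookups, where the index name and document id are known |
| msearch | Supports R (read); used for query-based searches |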
bulk
bulk supports the index, create, update, and delete actions
POST _bulk
{ "index" : { "_index" : "my_index", "_id" : "1" } }
{ "xm" : "xm_index" }
{ "delete" : { "_index" : "my_index", "_id" : "2" } }
{ "create" : { "_index" : "my_index","_id":4} }
{ "xm" : "xm_create" }
{ "update" : {"_id" : "1", "_index" : "my_index"} }
{ "doc" : {"sfzh" : "sfzh_update"} }
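#Every line must end with a newline (\n), including the last one
#Each operation consists of two lines of JSON: one line with the action (index, delete, create, or update) and its metadata (_index, _id), and one line with the data. delete is the exception: it needs only the action and metadata line
#The response reports on each operation in the bulk request separately
#Operations are independent of one another: a single failure does not affect the others
#The response is as follows: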
{
"took" : 281,
"errors" : false,
"items" : [
{
"index" : {
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 9,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 17,
"_primary_term" : 1,
"status" : 200
}
},
{
"delete" : {
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"_version" : 2,
"result" : "deleted",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 18,
"_primary_term" : 1,
"status" : 200
}
},
{
"create" : {
"_index" : "my_index",
"_type" : "_doc",
"_id" : "4",
"_version" : 1,
"result" : "created",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 19,
"_primary_term" : 1,
"status" : 201
}
},
{
"update" : {
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 10,
"result" : "updated",
"_shards" : {
"total" : 2,
"successful" : 2,
"failed" : 0
},
"_seq_no" : 20,
"_primary_term" : 1,
"status" : 200
}
}
]
}
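Because the bulk body is newline-delimited JSON with strict rules (one action line, an optional data line, a trailing newline), it is usually generated rather than typed. A minimal sketch using only the standard library; the helper names build_bulk_body and failed_items are assumptions, and the check on the "error" key assumes that failed items carry an error object alongside their status.

```python
import json


def build_bulk_body(operations):
    """Serialize (action, metadata, source) triples into a _bulk body.

    `source` is ignored for delete, which has no data line. For update,
    the caller passes the {"doc": {...}} wrapper as `source`.
    """
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}, ensure_ascii=False))
        if action != "delete":
            if source is None:
                raise ValueError(f"{action} needs a data line")
            lines.append(json.dumps(source, ensure_ascii=False))
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


def failed_items(bulk_response):
    """List (action, id, status) for every item that reports an error."""
    failures = []
    for item in bulk_response["items"]:
        for action, result in item.items():
            if "error" in result:
                failures.append((action, result["_id"], result["status"]))
    return failures


if __name__ == "__main__":
    body = build_bulk_body([
        ("index", {"_index": "my_index", "_id": "1"}, {"xm": "xm_index"}),
        ("delete", {"_index": "my_index", "_id": "2"}, None),
        ("create", {"_index": "my_index", "_id": "4"}, {"xm": "xm_create"}),
        ("update", {"_index": "my_index", "_id": "1"}, {"doc": {"sfzh": "sfzh_update"}}),
    ])
    print(body, end="")
```

Since operations are independent, a client should inspect each item (or at least the top-level errors flag) instead of relying on the HTTP status of the bulk request as a whole.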
mget
mget performs batched lookups by id, which saves the overhead of separate network round trips and improves performance
GET /_mget
{
"docs" : [
{
"_index" : "my_index",
"_id" : "1"
},
{
"_index" : "my_index",
"_id" : "2"
}
]
}
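#_index and _id must be specified
#All lookups go inside the [] after "docs"
#The response reports on each lookup in the mget request separately
#The response is as follows: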
{
"docs" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "1",
"_version" : 10,
"_seq_no" : 20,
"_primary_term" : 1,
"found" : true,
"_source" : {
"xm" : "xm_index",
"sfzh" : "sfzh_update"
}
},
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "2",
"found" : false
}
]
}
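Since each entry in the mget response carries its own found flag, a client typically separates hits from misses. A short sketch; the helper names mget_body and split_mget are assumptions, and the sample response is abbreviated from the one above.

```python
def mget_body(index, ids):
    """Build the "docs" array for an _mget request against one index."""
    return {"docs": [{"_index": index, "_id": doc_id} for doc_id in ids]}


def split_mget(response):
    """Return ({id: _source} for found docs, [ids that were not found])."""
    found, missing = {}, []
    for doc in response["docs"]:
        if doc.get("found"):
            found[doc["_id"]] = doc["_source"]
        else:
            missing.append(doc["_id"])
    return found, missing


if __name__ == "__main__":
    print(mget_body("my_index", ["1", "2"]))
    # Abbreviated copy of the _mget response shown above.
    response = {
        "docs": [
            {"_id": "1", "found": True,
             "_source": {"xm": "xm_index", "sfzh": "sfzh_update"}},
            {"_id": "2", "found": False},
        ]
    }
    print(split_mget(response))
```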
msearch
msearch runs several searches in one request and is suited to full-text search
GET _msearch
{"index" : "my_index"}
{"query" : {"match_all" : {}}, "from" : 0, "size" : 1}
{"index" : "movies"}
{"query" : {"match_all" : {}}, "from" : 0, "size" : 1}
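#Every line must end with a newline (\n), including the last one
#Each search consists of two lines of JSON: one with the index name, one with the query
#The response reports on each search in the msearch request separately
#The response is as follows: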
{
"took" : 1,
"responses" : [
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 5,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "my_index",
"_type" : "_doc",
"_id" : "76yFonEBuoXto2dXCuRb",
"_score" : 1.0,
"_source" : {
"name" : "王五"
}
}
]
},
"status" : 200
},
{
"took" : 1,
"timed_out" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"skipped" : 0,
"failed" : 0
},
"hits" : {
"total" : {
"value" : 9743,
"relation" : "eq"
},
"max_score" : 1.0,
"hits" : [
{
"_index" : "movies",
"_type" : "_doc",
"_id" : "1660",
"_score" : 1.0,
"_source" : {
"year" : 1997,
"@version" : "1",
"title" : "Eve's Bayou",
"id" : "1660",
"genre" : [
"Drama"
]
}
}
]
},
"status" : 200
}
]
}
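The msearch body follows the same newline-delimited layout as bulk, with a header line and a query line per search, and the responses array comes back in request order. A sketch under those assumptions; build_msearch_body and total_hits are hypothetical helper names.

```python
import json


def build_msearch_body(searches):
    """Serialize (index name, search body) pairs into an _msearch body."""
    lines = []
    for index, search in searches:
        lines.append(json.dumps({"index": index}))  # header line
        lines.append(json.dumps(search))            # query line
    # Every line, including the last, must end with a newline.
    return "".join(line + "\n" for line in lines)


def total_hits(msearch_response):
    """Return the total hit count of each search, in request order."""
    return [r["hits"]["total"]["value"] for r in msearch_response["responses"]]


if __name__ == "__main__":
    first_hit = {"query": {"match_all": {}}, "from": 0, "size": 1}
    print(build_msearch_body([("my_index", first_hit), ("movies", first_hit)]), end="")
```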
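Common errors
| Problem | Cause |
|---|---|
| Cannot connect | Network failure or the cluster is down |
| Connection cannot be closed | Network failure or a node error |
| 429 | The cluster is too busy |
| 4XX | Malformed request body |
| 500 | Internal cluster error |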
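One way a client might act on the status codes in this table is sketched below. The policy itself (retry on 429 with exponential backoff, fix the request on other 4XX codes) is an assumption made for illustration, not something the table prescribes, and the names classify_status and backoff_delays are hypothetical.

```python
def classify_status(status):
    """Map an HTTP status code to a coarse handling decision."""
    if status == 429:
        return "retry"          # cluster too busy: back off and try again
    if 400 <= status < 500:
        return "fix_request"    # client-side problem, retrying will not help
    if status >= 500:
        return "server_error"   # internal cluster error
    return "ok"


def backoff_delays(attempts, base=1.0, cap=30.0):
    """Exponential backoff delays in seconds: base, 2*base, 4*base, ... up to cap."""
    return [min(cap, base * 2 ** i) for i in range(attempts)]


if __name__ == "__main__":
    for code in (200, 400, 429, 500):
        print(code, classify_status(code))
    print(backoff_delays(4))
```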




