Elasticsearch
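Full-text search
Inverted index
Forward index
Inverted index
Elasticsearch is a document-oriented database: each record is a document. Its concepts map roughly onto MySQL as follows: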
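- index (index library)
In ES, "index" does not carry its traditional database meaning: an index is where data is stored. It is comparable to a database in MySQL, so running create database user; is like creating one index. An index is identified by a name, which must be all lowercase, and that name is used whenever the documents in it are indexed, searched, updated, or deleted (CRUD). A cluster can define any number of indices.
- type
A type defines the structure of the data, much like a table in a database. Early versions allowed several types under one index; in the 7.x line used in these notes, types are deprecated and an index effectively holds a single type (_doc by default).
- document
A document is the actual data and the smallest unit of data in ES, comparable to one row in a table.
- Field
A field is comparable to a column in a relational database. A document consists of one or more fields.
- Mapping
A mapping defines the rules for how data is handled, such as a field's data type, default value, analyzer, and whether it is indexed. Well-chosen mappings can improve performance considerably, so it is worth thinking about how to define them.
- shard
A single server cannot store an unlimited amount of data, so ES splits the data of one index into multiple shards. Each shard is itself a fully functional, independent "index" that can be placed on any node in the cluster.
- replica
In a distributed cluster, one or more servers will inevitably go down at some point. Without replicas, a failed shard could no longer serve requests.
An ES cluster therefore keeps several identical copies of the data. The copy that receives writes first is called the primary shard; the other copies are called replica shards.
When you search, either the primary or a replica of each shard can answer. ES chooses which copy to use internally, so you do not need to control this yourself.
By default, ES 7.x creates an index with 1 primary shard and 1 replica, 2 shards in total. Versions before 7.0 defaulted to 5 primary shards with 1 replica each, 10 shards in total.
A primary shard and its replica are never placed on the same node; in other words, two shards holding the same data never share a node.
So with only one node, the replicas stay unassigned, and when a new node joins, ES automatically creates the replicas on it.
- Allocation
Allocation is the process of assigning shards, primary or replica, to nodes. For a replica, it also includes copying data from the primary shard. The master node performs this process.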
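The shard and replica arithmetic above can be sketched in a few lines of Python. Elasticsearch picks a document's primary shard from a hash of its routing value (the document ID by default) modulo the number of primary shards. The real hash is Murmur3; zlib.crc32 below is only a stand-in assumption, so the shard numbers will not match a real cluster, and both function names are hypothetical.

```python
import zlib


def pick_shard(doc_id: str, number_of_primary_shards: int) -> int:
    """Map a document ID to a primary shard number.

    ASSUMPTION: crc32 stands in for the Murmur3 hash Elasticsearch uses,
    so this illustrates the modulo step only.
    """
    return zlib.crc32(doc_id.encode("utf-8")) % number_of_primary_shards


def total_shards(primaries: int, replicas_per_primary: int) -> int:
    """Total shard copies: every primary plus its replicas."""
    return primaries * (1 + replicas_per_primary)


if __name__ == "__main__":
    print(pick_shard("1001", 5))   # some value in 0..4
    print(total_shards(1, 1))      # 7.x defaults
    print(total_shards(5, 1))      # pre-7.0 defaults
```

Because the shard is chosen by a modulo over the primary count, number_of_shards is fixed when an index is created, while number_of_replicas can be changed later.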
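How search engines work
Installation and startup
- After unzipping, go to the bin directory and double-click elasticsearch.bat to start the ES service.
Note: port 9300 is used for communication between the nodes of an Elasticsearch cluster; port 9200 is the HTTP RESTful port accessed from a browser.
- Open a browser and visit http://localhost:9200. The response looks like this: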
{
"name" : "DESKTOP-9GAAVES",
"cluster_name" : "elasticsearch",
"cluster_uuid" : "0IYvTtEzQv69VeX2zLYf5g",
"version" : {
"number" : "7.8.0",
"build_flavor" : "default",
"build_type" : "zip",
"build_hash" : "757314695644ea9a1dc2fecd26d1a43856725e65",
"build_date" : "2020-06-14T19:35:50.234439Z",
"build_snapshot" : false,
"lucene_version" : "8.5.1",
"minimum_wire_compatibility_version" : "6.8.0",
"minimum_index_compatibility_version" : "6.0.0-beta1"
},
"tagline" : "You Know, for Search"
}
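Creating an index
Send a PUT request to the ES server: http://<host IP>:9200/<index name>
Example:
**Note: the request method is PUT**
Create an index named test, similar to creating a database named test in MySQL
http://127.0.0.1:9200/test
Response:
{
"acknowledged": true, // response result
"shards_acknowledged": true, // shard result
"index": "test" // index name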
}
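As a minimal sketch, the same request can be prepared from Python with only the standard library. The helper name build_create_index_request is hypothetical, and the request is only built here, not sent, so the snippet runs without a live ES node.

```python
import urllib.request

ES_URL = "http://127.0.0.1:9200"  # address used throughout these notes


def build_create_index_request(index_name: str) -> urllib.request.Request:
    """Build (but do not send) the PUT request that creates an index."""
    if index_name != index_name.lower():
        raise ValueError("index names must be lowercase")
    return urllib.request.Request(f"{ES_URL}/{index_name}", method="PUT")


if __name__ == "__main__":
    req = build_create_index_request("test")
    print(req.get_method(), req.full_url)
    # To send it against a running node:
    # with urllib.request.urlopen(req) as resp:
    #     print(resp.read().decode("utf-8"))
```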
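Viewing indices
- View all indices
Send a GET request to the ES server: http://<host IP>:9200/_cat/indices?v
Example:
**Note: the request method is GET**
http://127.0.0.1:9200/_cat/indices?v
In the request path, _cat means "view" and indices means "indexes", so the request lists all indices on the current ES server, much like show tables in MySQL.
Response: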
health status index uuid pri rep docs.count docs.deleted store.size pri.store.size
yellow open test MttSb4vtT0CPls-1aE6oHw 1 1 0 0 208b 208b
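Meaning of the columns above:
health: health of the index: green (all primary and replica shards are allocated), yellow (all primaries are allocated but some replicas are not), red (some primaries are not allocated)
status: whether the index is open or closed
index: index name
uuid: unique identifier of the index
pri: number of primary shards
rep: number of replicas
docs.count: number of available documents
docs.deleted: number of deleted documents (logical deletion)
store.size: total space used by primary and replica shards
pri.store.size: space used by primary shards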
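To work with these columns programmatically, the text output can be split into one dictionary per index. This is a small sketch under the assumption that every column has a value (as in the sample above); parse_cat_indices is a hypothetical helper name.

```python
def parse_cat_indices(text: str) -> list[dict[str, str]]:
    """Turn the whitespace-separated output of _cat/indices?v into dicts."""
    lines = [line for line in text.splitlines() if line.strip()]
    if not lines:
        return []
    header = lines[0].split()
    # ASSUMPTION: no column is empty, so a plain split lines up with the header.
    return [dict(zip(header, line.split())) for line in lines[1:]]


SAMPLE = (
    "health status index uuid pri rep docs.count docs.deleted store.size pri.store.size\n"
    "yellow open test MttSb4vtT0CPls-1aE6oHw 1 1 0 0 208b 208b\n"
)

if __name__ == "__main__":
    for row in parse_cat_indices(SAMPLE):
        print(row["index"], row["health"], row["pri"], row["rep"])
```

The _cat APIs also accept format=json as a query parameter, which avoids text parsing altogether.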
- View a specific index
Send a GET request to the ES server: http://<host IP>:9200/<index name>
Example:
**Note: the request method is GET**
http://127.0.0.1:9200/test
Response:
{
"test": { // index name
"aliases": {}, // aliases
"mappings": {}, // mappings
"settings": { // settings
"index": { // settings - index
"creation_date": "1617861426847", // settings - index - creation time
"number_of_shards": "1", // settings - index - number of primary shards
"number_of_replicas": "1", // settings - index - number of replicas
"uuid": "J0WlEhh4R7aDrfIc3AkwWQ", // settings - index - unique identifier
"version": { // settings - index - version
"created": "7080099"
},
"provided_name": "shopping" // settings - index - name
}
}
}
}
Deleting an index
Send a DELETE request to the ES server: http://<host IP>:9200/<index name>
Example:
**Note: the request method is DELETE**
http://127.0.0.1:9200/test
Response:
{
"acknowledged": true // the request succeeded
}
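Creating a document
Send a POST request to the ES server: http://<host IP>:9200/<index name>/<type>/<custom ID>, with a JSON request body,
or
http://<host IP>:9200/<index name>/<type>, in which case ES generates the ID automatically.
In ES 7.x, putting a custom type such as product in the URL still works but is deprecated; the typeless form is http://<host IP>:9200/<index name>/_doc/<ID>.
Example:
**Note: the request method is POST**
This is equivalent to adding a record with ID 1001 to the product table of the shopping database
http://127.0.0.1:9200/shopping/product/1001
Put the following data in the request body: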
{
"title":"小米手机",
"category":"小米",
"images":"http://www.gulixueyuan.com/xm.jpg",
"price":3999.00
}
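Response:
{
"_index": "shopping", // index
"_type": "product", // type
"_id": "1001", // unique identifier, comparable to a primary key in MySQL; generated randomly if not specified
"_version": 1, // version
"result": "created", // result; "created" means the document was created
"_shards": { // shards
"total": 2, // shards - total
"successful": 1, // shards - successful
"failed": 0 // shards - failed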
},
"_seq_no": 0,
"_primary_term": 1
}
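The request above could be prepared from Python roughly as follows. This is a sketch using only the standard library; build_index_doc_request is a hypothetical helper, and the request is built without being sent.

```python
import json
import urllib.request

ES_URL = "http://127.0.0.1:9200"  # address used throughout these notes


def build_index_doc_request(index, doc, doc_id=None, doc_type="_doc"):
    """Build (but do not send) a POST request that indexes one document.

    Without doc_id, ES generates the ID itself.
    """
    url = f"{ES_URL}/{index}/{doc_type}"
    if doc_id is not None:
        url += f"/{doc_id}"
    body = json.dumps(doc, ensure_ascii=False).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    product = {
        "title": "小米手机",
        "category": "小米",
        "images": "http://www.gulixueyuan.com/xm.jpg",
        "price": 3999.00,
    }
    req = build_index_doc_request("shopping", product, "1001", doc_type="product")
    print(req.get_method(), req.full_url)
```

Encoding the body as UTF-8 and sending Content-Type: application/json keeps Chinese field values intact.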
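Querying documents
- Query a specific document
Send a GET request to the ES server: http://<host IP>:9200/<index name>/<type>/<document ID>
Example:
**Note: the request method is GET**
This is equivalent to querying the record with ID 1001 in the product table of the shopping database
http://127.0.0.1:9200/shopping/product/1001
Response: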
{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_version": 1,
"_seq_no": 0,
"_primary_term": 1,
"found": true,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}
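- Query all data in an index
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search
Example:
**Note: the request method is GET**
This is equivalent to querying all data in the shopping database
http://127.0.0.1:9200/shopping/_search
Response: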
{
"took": 64,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}]
}
}
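Search responses nest the stored documents under hits.hits[]._source. A small sketch of unpacking that structure, using a trimmed copy of the response above as sample data (the helper names are hypothetical):

```python
def total_hits(response: dict) -> int:
    """Number of matching documents reported by the search."""
    return response["hits"]["total"]["value"]


def sources(response: dict) -> list[dict]:
    """The original documents, one per hit."""
    return [hit["_source"] for hit in response["hits"]["hits"]]


SAMPLE_RESPONSE = {
    "took": 64,
    "timed_out": False,
    "hits": {
        "total": {"value": 1, "relation": "eq"},
        "max_score": 1,
        "hits": [
            {
                "_index": "shopping",
                "_id": "1001",
                "_score": 1,
                "_source": {"title": "小米手机", "category": "小米", "price": 3999},
            }
        ],
    },
}

if __name__ == "__main__":
    print(total_hits(SAMPLE_RESPONSE))
    for doc in sources(SAMPLE_RESPONSE):
        print(doc["title"], doc["price"])
```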
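Updating documents
- Full update
This works like creating a document: send a request to the same URL, and if the request body differs, the existing content is overwritten.
Send a POST request to the ES server: http://<host IP>:9200/<index name>/<type>/<document ID>
Example:
**Note: the request method is POST**
This is equivalent to replacing the record with ID 1001 in the product table of the shopping database
http://127.0.0.1:9200/shopping/product/1001
Request body: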
{
"title":"华为手机",
"category":"华为",
"images":"http://www.gulixueyuan.com/hw.jpg",
"price":1999.00
}
Response:
{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_version": 2,
"result": "updated", // "updated" means the document was updated
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 1,
"_primary_term": 1
}
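- Partial update
Send a POST request to the ES server: http://<host IP>:9200/<index name>/_update/<document ID>
**Note: the request method is POST**
This is equivalent to modifying only some fields of the record with ID 1001 in the product table of the shopping database
http://127.0.0.1:9200/shopping/_update/1001
Request body:
{
"doc": { // the fields to change in the document
"title":"拉面手机"
}
}
Searching the index after the update returns: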
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 1,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": 1,
"_source": {
"title": "拉面手机", // only this field was changed
"category": "华为",
"images": "http://www.gulixueyuan.com/hw.jpg",
"price": 1999
}
}]
}
}
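The difference between the two kinds of update can be modelled locally with plain dictionaries. This is only an illustration of the semantics, assuming flat documents (the real _update API merges nested objects recursively); both function names are hypothetical.

```python
def full_replace(stored: dict, new_doc: dict) -> dict:
    """Full update: the new body replaces the stored document entirely."""
    return dict(new_doc)


def partial_update(stored: dict, doc: dict) -> dict:
    """Partial update: fields under "doc" are merged into the stored document.

    ASSUMPTION: flat documents only; nested objects are not merged here.
    """
    merged = dict(stored)
    merged.update(doc)
    return merged


if __name__ == "__main__":
    stored = {
        "title": "华为手机",
        "category": "华为",
        "images": "http://www.gulixueyuan.com/hw.jpg",
        "price": 1999,
    }
    print(partial_update(stored, {"title": "拉面手机"}))
    print(full_replace(stored, {"title": "拉面手机"}))
```

With the partial update, category, images, and price survive; with the full replacement, only title would remain.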
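Deleting a document
Send a DELETE request to the ES server: http://<host IP>:9200/<index name>/<type>/<document ID>
Example:
**Note: the request method is DELETE**
This is equivalent to deleting the record with ID 1001 from the product table of the shopping database
http://127.0.0.1:9200/shopping/product/1001
Response:
{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_version": 9,
"result": "deleted", // the document was deleted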
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 8,
"_primary_term": 1
}
Conditional queries, pagination, field filtering, and sorting
Add the following data to ES:
[
{
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
},
{
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
},
{
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
},
{
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
},
{
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
},
{
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
},
{
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
]
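- Conditional query
1. Parameters in the URL (not recommended: the query is exposed in the URL and easy to misuse, and non-ASCII values such as Chinese can be garbled unless they are URL-encoded)
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search?q=<field>:<value>
Example:
**Note: the request method is GET**
This is equivalent to querying the shopping database for records whose price field equals 3999
http://127.0.0.1:9200/shopping/_search?q=price:3999
Response: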
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999 // only documents with price=3999 are returned
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7B8uiYEBSDaNywCKnwWG",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999 // only documents with price=3999 are returned
}
}]
}
}
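The garbling risk with URL parameters comes from unencoded non-ASCII characters. As a hedged sketch, percent-encoding the q parameter with the standard library avoids it; build_query_string_url is a hypothetical helper name.

```python
from urllib.parse import urlencode

ES_URL = "http://127.0.0.1:9200"  # address used throughout these notes


def build_query_string_url(index: str, field: str, value) -> str:
    """Build a _search URL whose q parameter is percent-encoded as UTF-8."""
    return f"{ES_URL}/{index}/_search?" + urlencode({"q": f"{field}:{value}"})


if __name__ == "__main__":
    print(build_query_string_url("shopping", "price", 3999))
    print(build_query_string_url("shopping", "category", "小米"))
```

The second URL carries 小米 as %E5%B0%8F%E7%B1%B3, which survives transport regardless of the client's default encoding.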
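2. Query with a request body
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"query":{ // marks this as a query
"match":{ // match query, roughly comparable to a fuzzy like '%value%' in MySQL
"field":"value"
}
}
}
----------------------------
{
"query":{ // marks this as a query
"match_all":{} // match every document
}
}
Example:
**Note: the request method is GET**
This is equivalent to querying the shopping database for records whose price field matches 3999
http://127.0.0.1:9200/shopping/_search
Request body: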
{
"query":{
"match":{
"price":3999
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 1,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7B8uiYEBSDaNywCKnwWG",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}]
}
}
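- Pagination
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"query":{ // marks this as a query
"match_all":{} // match every document, like having no where clause
},
"from":0, // offset of the first result, like the first argument of limit in MySQL
"size":2 // number of results per page, like the second argument of limit in MySQL
}
Example:
**Note: the request method is GET**
This is equivalent to querying all data in the shopping database, paginated at 1 record per page
http://127.0.0.1:9200/shopping/_search
Request body: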
{
"query":{
"match_all":{}
},
"from":0,
"size":1
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 7, // 7 documents in total
"relation": "eq"
},
"max_score": 1, // highest score among all matching documents
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": 1, // relevance score; higher scores rank earlier
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}]
}
}
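User interfaces usually think in page numbers, while ES expects an offset. A minimal sketch of the conversion, with page_body as a hypothetical helper and pages counted from 1:

```python
def page_body(page: int, size: int) -> dict:
    """Build a match_all search body for a 1-based page number."""
    if page < 1 or size < 1:
        raise ValueError("page and size must be at least 1")
    return {
        "query": {"match_all": {}},
        "from": (page - 1) * size,  # offset of the first hit on this page
        "size": size,
    }


if __name__ == "__main__":
    print(page_body(1, 2))
    print(page_body(3, 2))
```

By default ES rejects requests where from + size exceeds 10000 (the index.max_result_window setting), so this style suits shallow paging.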
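- Field filtering
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"query":{
"match_all":{}
},
"_source":["<field to keep 1>","<field to keep 2>"]
}
Example:
**Note: the request method is GET**
This is equivalent to querying all data in the shopping database but returning only the title and price fields
http://127.0.0.1:9200/shopping/_search
Request body: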
{
"query":{
"match_all":{}
},
"_source":["title","price"]
}
Response:
{
"took": 3,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 7,
"relation": "eq"
},
"max_score": 1,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": 1,
"_source": {
"price": 3999, // only the requested fields are returned
"title": "小米手机"
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7B8uiYEBSDaNywCKnwWG",
"_score": 1,
"_source": {
"price": 3999,
"title": "小米手机"
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7R8uiYEBSDaNywCK7QXv",
"_score": 1,
"_source": {
"price": 1999,
"title": "小米手机"
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7h8viYEBSDaNywCKGgW-",
"_score": 1,
"_source": {
"price": 1999,
"title": "小米手机"
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7x8viYEBSDaNywCKQAVb",
"_score": 1,
"_source": {
"price": 1999,
"title": "华为手机"
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "8B8viYEBSDaNywCKbQUz",
"_score": 1,
"_source": {
"price": 1999,
"title": "华为手机"
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "8R8viYEBSDaNywCKmQVq",
"_score": 1,
"_source": {
"price": 1999,
"title": "华为手机"
}
}]
}
}
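- Sorting
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"query":{
"match_all":{}
},
"sort":{ // sort specification
"price":{ // field to sort by
"order":"desc" // sort order: desc (descending) or asc (ascending)
}
}
}
Example:
**Note: the request method is GET**
This is equivalent to querying all data in the shopping database, sorted by price in descending order
http://127.0.0.1:9200/shopping/_search
Request body:
{
"query":{
"match_all":{}
},
"sort":{
"price":{ // sort by price, descending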
"order":"desc"
}
}
}
Response:
{
"took": 21,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 7,
"relation": "eq"
},
"max_score": null,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": null,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
},
"sort": [3999]
}, {
"_index": "shopping",
"_type": "product",
"_id": "7B8uiYEBSDaNywCKnwWG",
"_score": null,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
},
"sort": [3999]
}, {
"_index": "shopping",
"_type": "product",
"_id": "7R8uiYEBSDaNywCK7QXv",
"_score": null,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
},
"sort": [1999]
}, {
"_index": "shopping",
"_type": "product",
"_id": "7h8viYEBSDaNywCKGgW-",
"_score": null,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
},
"sort": [1999]
}, {
"_index": "shopping",
"_type": "product",
"_id": "7x8viYEBSDaNywCKQAVb",
"_score": null,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
},
"sort": [1999]
}, {
"_index": "shopping",
"_type": "product",
"_id": "8B8viYEBSDaNywCKbQUz",
"_score": null,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
},
"sort": [1999]
}, {
"_index": "shopping",
"_type": "product",
"_id": "8R8viYEBSDaNywCKmQVq",
"_score": null,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
},
"sort": [1999]
}]
}
}
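Multi-condition queries and range queries (greater than, less than)
- Multi-condition query
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body as shown below.
A bool query supports the following clauses:
must: every condition must match, like and in MySQL.
must_not: none of the conditions may match, like not in MySQL.
should: at least one condition should match, like or in MySQL.
{
"query":{ // marks this as a query
"bool":{ // boolean logic that combines several query conditions, comparable to where in MySQL
"must":[ // every condition must match (and); should (or) and must_not (not) are used the same way
{
"match":{ // the match rule
"key":"value" // field to match : value to match
}
},{
"match":{ // the match rule
"key":value // field to match : value to match
}
}
]
}
}
}
Example:
**Note: the request method is GET**
http://127.0.0.1:9200/shopping/_search
Request body:
1. The following is roughly equivalent to this SQL:
select * from product where category like '%小米%' and price = '3999.00'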
{
"query":{
"bool":{
"must":[{
"match":{
"category":"小米"
}
},{
"match":{
"price":3999.00
}
}]
}
}
}
------------------------------------------------
2. The following is roughly equivalent to this SQL:
select * from product where category like '%小米%' or category like '%华为%'
{
"query":{
"bool":{
"should":[{
"match":{
"category":"小米"
}
},{
"match":{
"category":"华为"
}
}]
}
}
}
------------------------------------------------
3. The following is roughly equivalent to this SQL:
select * from product where category not like '%小米%' and category not like '%华为%'
{
"query":{
"bool":{
"must_not":[{
"match":{
"category":"小米"
}
},{
"match":{
"category":"华为"
}
}]
}
}
}
Response:
Operation 1:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 2,
"relation": "eq"
},
"max_score": 2.1507282,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": 2.1507282,
"_source": {
"title": "小米手机",
"category": "小米", // category is "小米"
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999 // price is 3999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7B8uiYEBSDaNywCKnwWG",
"_score": 2.1507282,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}]
}
}
------------------------------------------------------------------------------------------------
Operation 2:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 7,
"relation": "eq"
},
"max_score": 1.8889232,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "7x8viYEBSDaNywCKQAVb",
"_score": 1.8889232,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "8B8viYEBSDaNywCKbQUz",
"_score": 1.8889232,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "8R8viYEBSDaNywCKmQVq",
"_score": 1.8889232,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": 1.3862942,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7B8uiYEBSDaNywCKnwWG",
"_score": 1.3862942,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7R8uiYEBSDaNywCK7QXv",
"_score": 1.3862942,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7h8viYEBSDaNywCKGgW-",
"_score": 1.3862942,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}]
}
}
------------------------------------------------------------------------------------------------
Operation 3:
{
"took": 339,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "Bx8BioEBSDaNywCKxAa_",
"_score": 0,
"_source": {
"title": "锤子手机",
"category": "锤子",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1299
}
}]
}
}
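The three request bodies above share one shape, so they can be generated by a small builder. This is a sketch with hypothetical helper names (match, bool_query); it only constructs the JSON body and does not contact ES.

```python
import json


def match(field: str, value) -> dict:
    """One match clause."""
    return {"match": {field: value}}


def bool_query(must=None, should=None, must_not=None) -> dict:
    """Combine match clauses into a bool query, omitting empty clause lists."""
    clauses = {}
    for name, items in (("must", must), ("should", should), ("must_not", must_not)):
        if items:
            clauses[name] = list(items)
    return {"query": {"bool": clauses}}


if __name__ == "__main__":
    # Operation 1: category matches 小米 AND price matches 3999
    body = bool_query(must=[match("category", "小米"), match("price", 3999.00)])
    print(json.dumps(body, ensure_ascii=False, indent=2))
```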
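- Range query
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"query":{
"bool":{
"filter":{ // filter context
"range":{ // range condition
"key":{ // field to filter on
"gt":0 // gt means greater than, lt means less than; 0 is the boundary value
}
}
}
}
}
}
Example:
**Note: the request method is GET**
http://127.0.0.1:9200/shopping/_search
Request body:
The following is equivalent to this SQL:
select * from product where price < 1300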
{
"query":{
"bool":{
"filter":{
"range":{
"price":{
"lt":1300
}
}
}
}
}
}
Response:
{
"took": 1,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 0,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "Bx8BioEBSDaNywCKxAa_",
"_score": 0,
"_source": {
"title": "锤子手机",
"category": "锤子",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1299 // price is below 1300
}
}]
}
}
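Besides gt and lt, the range query also accepts gte (greater than or equal) and lte (less than or equal). The sketch below builds the filter body and, purely for illustration, applies the same bounds to a local list of the prices stored in these notes; range_body and in_range are hypothetical names.

```python
import operator

# Range operators accepted by the range query, mapped to Python comparisons.
OPS = {"gt": operator.gt, "gte": operator.ge, "lt": operator.lt, "lte": operator.le}


def range_body(field: str, **bounds) -> dict:
    """Build a bool/filter/range search body, e.g. range_body("price", lt=1300)."""
    unknown = set(bounds) - set(OPS)
    if unknown:
        raise ValueError(f"unknown range operators: {sorted(unknown)}")
    return {"query": {"bool": {"filter": {"range": {field: dict(bounds)}}}}}


def in_range(value, **bounds) -> bool:
    """Local check that mirrors what the range filter selects."""
    return all(OPS[name](value, bound) for name, bound in bounds.items())


if __name__ == "__main__":
    prices = [3999, 3999, 1999, 1999, 1999, 1999, 1999, 1299]
    print(range_body("price", lt=1300))
    print([p for p in prices if in_range(p, lt=1300)])
```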
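Full-text search, phrase matching, and highlighting
- Full-text search
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"query":{
"match":{
"key" : "value" // field to search : text to search for
}
}
}
Example:
**Note: the request method is GET**
http://127.0.0.1:9200/shopping/_search
Request body:
The following is roughly equivalent to this SQL, because the default analyzer splits "锤华" into the single characters 锤 and 华:
select * from product where category like '%锤%' or category like '%华%'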
{
"query":{
"match":{
"category" : "锤华"
}
}
}
Response:
{
"took": 2,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 4,
"relation": "eq"
},
"max_score": 1.7917595,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "Bx8BioEBSDaNywCKxAa_",
"_score": 1.7917595,
"_source": {
"title": "锤子手机",
"category": "锤子", // contains the character "锤"
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1299
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7x8viYEBSDaNywCKQAVb",
"_score": 0.9444616,
"_source": {
"title": "华为手机", // contains the character "华"
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "8B8viYEBSDaNywCKbQUz",
"_score": 0.9444616,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "8R8viYEBSDaNywCKmQVq",
"_score": 0.9444616,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}]
}
}
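- Phrase match
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"query":{
"match_phrase":{ // phrase match: the terms must appear together and in order
"key" : "value" // field to search : phrase to search for
}
}
}
Example:
**Note: the request method is GET**
http://127.0.0.1:9200/shopping/_search
Request body:
Here "锤华" is no longer matched character by character but as one phrase, roughly equivalent to this SQL:
select * from product where category like '%锤华%'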
{
"query":{
"match_phrase":{
"category" : "锤华"
}
}
}
Response:
{
"took": 12,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": [] // empty, because no stored document has a category containing "锤华"
}
}
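Why match finds results for "锤华" while match_phrase finds none can be shown with a simplified local model. It assumes, as the default-analyzer output later in these notes suggests, that Chinese text is split into one token per character; scoring and every other analyzer detail are ignored, and the function names are hypothetical.

```python
def tokens(text: str) -> list[str]:
    """ASSUMPTION: one token per non-space character, like the default analyzer on Chinese."""
    return [ch for ch in text if not ch.isspace()]


def match(query: str, field_value: str) -> bool:
    """match: at least one query token occurs in the field."""
    field_tokens = set(tokens(field_value))
    return any(tok in field_tokens for tok in tokens(query))


def match_phrase(query: str, field_value: str) -> bool:
    """match_phrase: all query tokens occur consecutively and in order."""
    q, f = tokens(query), tokens(field_value)
    if not q:
        return False
    return any(f[i:i + len(q)] == q for i in range(len(f) - len(q) + 1))


if __name__ == "__main__":
    for category in ("锤子", "华为", "小米"):
        print(category, match("锤华", category), match_phrase("锤华", category))
```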
- Highlighting
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"query":{
"match_phrase":{
"key" : "value"
}
},
"highlight":{ // enable highlighting
"fields":{ // fields to highlight
"key":{} // highlight this field
}
}
}
Example:
**Note: the request method is GET**
http://127.0.0.1:9200/shopping/_search
Request body:
The following phrase-matches "锤子" and highlights the category field
{
"query":{
"match_phrase":{
"category" : "锤子"
}
},
"highlight":{
"fields":{
"category":{}
}
}
}
Response:
{
"took": 54,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 1,
"relation": "eq"
},
"max_score": 3.583519,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "Bx8BioEBSDaNywCKxAa_",
"_score": 3.583519,
"_source": {
"title": "锤子手机",
"category": "锤子",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1299
},
"highlight": {
// The matched text is wrapped in highlight tags. <em> is the default tag and can be replaced
// with a custom one, for example <span style="color:red">锤</span><span style="color:red">子</span>,
// which renders in red on a web page.
"category": ["<em>锤</em><em>子</em>"]
}
}]
}
}
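Custom highlight tags are set with the pre_tags and post_tags options of the highlight section. A minimal sketch of building such a body follows; highlight_body is a hypothetical helper, and the red span is just the example tag mentioned above.

```python
import json


def highlight_body(field: str, phrase: str,
                   pre_tag: str = "<em>", post_tag: str = "</em>") -> dict:
    """Phrase-match `phrase` in `field` and wrap matches in the given tags."""
    return {
        "query": {"match_phrase": {field: phrase}},
        "highlight": {
            "pre_tags": [pre_tag],
            "post_tags": [post_tag],
            "fields": {field: {}},
        },
    }


if __name__ == "__main__":
    body = highlight_body("category", "锤子",
                          pre_tag='<span style="color:red">', post_tag="</span>")
    print(json.dumps(body, ensure_ascii=False, indent=2))
```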
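Aggregations (similar to group by, max, avg, and so on in MySQL)
- Grouping
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"aggs":{ // marks this as an aggregation
"price_group":{ // aggregation name, chosen freely
"terms":{ // group by value; other aggregations include avg, max, and min
"field":"key" // field to group by
}
}
}
}
Example:
**Note: the request method is GET**
http://127.0.0.1:9200/shopping/_search
Request body:
The following groups documents by the price field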
{
"aggs":{
"price_group":{
"terms":{
"field":"price"
}
}
}
}
Response:
{
"took": 21,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 8,
"relation": "eq"
},
"max_score": 1,
"hits": [{
"_index": "shopping",
"_type": "product",
"_id": "1001",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7B8uiYEBSDaNywCKnwWG",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 3999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7R8uiYEBSDaNywCK7QXv",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7h8viYEBSDaNywCKGgW-",
"_score": 1,
"_source": {
"title": "小米手机",
"category": "小米",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "7x8viYEBSDaNywCKQAVb",
"_score": 1,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "8B8viYEBSDaNywCKbQUz",
"_score": 1,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "8R8viYEBSDaNywCKmQVq",
"_score": 1,
"_source": {
"title": "华为手机",
"category": "华为",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1999
}
}, {
"_index": "shopping",
"_type": "product",
"_id": "Bx8BioEBSDaNywCKxAa_",
"_score": 1,
"_source": {
"title": "锤子手机",
"category": "锤子",
"images": "http://www.gulixueyuan.com/xm.jpg",
"price": 1299
}
}]
},
"aggregations": { // statistics for each group
"price_group": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [{
"key": 1999,
"doc_count": 5
}, {
"key": 3999,
"doc_count": 2
}, {
"key": 1299,
"doc_count": 1
}]
}
}
}
The response above also carries the original documents. To see only the grouped result, add "size":0, for example with this request body:
{
"aggs":{
"price_group":{
"terms":{
"field":"price"
}
}
},
"size":0
}
Response:
{
"took": 6,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 8,
"relation": "eq"
},
"max_score": null,
"hits": [] // the original documents are no longer included
},
"aggregations": {
"price_group": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [{
"key": 1999,
"doc_count": 5
}, {
"key": 3999,
"doc_count": 2
}, {
"key": 1299,
"doc_count": 1
}]
}
}
}
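What the terms aggregation computes can be reproduced locally with a counter over the eight prices stored at this point in the notes. This is an illustrative model only; terms_buckets is a hypothetical name, and by default the real aggregation returns at most the 10 largest buckets.

```python
from collections import Counter

# Prices of the eight documents in the shopping index, in insertion order.
PRICES = [3999, 3999, 1999, 1999, 1999, 1999, 1999, 1299]


def terms_buckets(values) -> list[dict]:
    """Group equal values and order the buckets by document count, largest first."""
    return [{"key": key, "doc_count": count}
            for key, count in Counter(values).most_common()]


if __name__ == "__main__":
    for bucket in terms_buckets(PRICES):
        print(bucket)
```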
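- Average
Send a GET request to the ES server: http://<host IP>:9200/<index name>/_search, with a JSON body such as:
{
"aggs":{ // marks this as an aggregation
"price_avg":{ // name, chosen freely
"avg":{ // compute the average
"field":"key" // field to average
}
}
},
"size":0 // exclude the original documents
}
Example:
**Note: the request method is GET**
http://127.0.0.1:9200/shopping/_search
Request body:
The following computes the average of the price field and excludes the original documents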
{
"aggs":{
"price_avg":{
"avg":{
"field":"price"
}
}
},
"size":0
}
Response:
{
"took": 4,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 8,
"relation": "eq"
},
"max_score": null,
"hits": []
},
"aggregations": {
"price_avg": {
"value": 2411.5 // the computed average
}
}
}
Maximum and minimum work the same way: under a name of your choice, use the max or min keyword instead of avg.
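Several metric aggregations can be requested in a single search body, each under its own name. The sketch below builds such a body and checks the expected values against the eight prices locally; the helper names and the price_avg/price_max/price_min naming scheme are illustrative choices.

```python
PRICES = [3999, 3999, 1999, 1999, 1999, 1999, 1999, 1299]


def metrics_body(field: str) -> dict:
    """Request avg, max, and min of one field, without the original documents."""
    return {
        "aggs": {f"{field}_{metric}": {metric: {"field": field}}
                 for metric in ("avg", "max", "min")},
        "size": 0,
    }


def local_metrics(values) -> dict:
    """The same three metrics computed in Python, for comparison."""
    return {"avg": sum(values) / len(values), "max": max(values), "min": min(values)}


if __name__ == "__main__":
    print(metrics_body("price"))
    print(local_metrics(PRICES))
```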
Mappings
Environment setup:
Create a user index
Request method: PUT
http://127.0.0.1:9200/user
- Create a mapping
Send a PUT request to the ES server: http://<host IP>:9200/<index name>/_mapping, with a JSON body such as:
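{
"properties": { // field definitions
"key1":{ // field name
"type": "text", // text: the value is analyzed (tokenized) for full-text search
"index": true // the field is indexed and can be searched
},
"key2":{
"type": "keyword", // keyword: the value is not tokenized and only matches as a whole
"index": true
},
"key3":{
"type": "keyword",
"index": false // the field is not indexed and cannot be searched
}
...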
}
}
Example:
**Note: the request method is PUT**
http://127.0.0.1:9200/user/_mapping
Request body:
{
"properties": {
"name":{
"type": "text",
"index": true
},
"sex":{
"type": "keyword",
"index": true
},
"tel":{
"type": "keyword",
"index": false
}
}
}
Response:
{
"acknowledged": true
}
- View the mapping
Request method: GET
http://127.0.0.1:9200/user/_mapping
Response:
{
"user": {
"mappings": {
"properties": {
"name": {
"type": "text"
},
"sex": {
"type": "keyword"
},
"tel": {
"type": "keyword",
"index": false
}
}
}
}
}
- Add data
Request method: PUT
http://127.0.0.1:9200/user/_create/1001
Request body:
{
"name":"小米",
"sex":"男的",
"tel":"1111"
}
Response:
{
"_index": "user",
"_type": "_doc",
"_id": "1001",
"_version": 1,
"result": "created",
"_shards": {
"total": 2,
"successful": 1,
"failed": 0
},
"_seq_no": 0,
"_primary_term": 1
}
- Query data
1. Find records whose sex contains "男"
Request method: GET
http://127.0.0.1:9200/user/_search
Request body:
{
"query":{
"match":{
"sex":"男"
}
}
}
Response:
{
"took": 548,
"timed_out": false,
"_shards": {
"total": 1,
"successful": 1,
"skipped": 0,
"failed": 0
},
"hits": {
"total": {
"value": 0,
"relation": "eq"
},
"max_score": null,
"hits": [] // empty, because sex was mapped as keyword, so only the exact value "男的" matches the stored document
}
}
2. Query by phone number
Request method: GET
http://127.0.0.1:9200/user/_search
Request body:
{
"query":{
"match":{
"tel":"11"
}
}
}
Response:
The request fails because tel was mapped with "index": false, so the field cannot be searched.
{
"error": {
"root_cause": [{
"type": "query_shard_exception",
"reason": "failed to create query: Cannot search on field [tel] since it is not indexed.",
"index_uuid": "DZfd5Q0nT9OUZdqCgeSV7g",
"index": "user"
}],
"type": "search_phase_execution_exception",
"reason": "all shards failed",
"phase": "query",
"grouped": true,
"failed_shards": [{
"shard": 0,
"index": "user",
"node": "_lxFql-IT2SFR5Y2cgVsNw",
"reason": {
"type": "query_shard_exception",
"reason": "failed to create query: Cannot search on field [tel] since it is not indexed.",
"index_uuid": "DZfd5Q0nT9OUZdqCgeSV7g",
"index": "user",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "Cannot search on field [tel] since it is not indexed."
}
}
}]
},
"status": 400
}
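The two outcomes above follow directly from the mapping. A simplified local model makes the rules explicit: a keyword field matches only the whole value, a text field matches on any shared token, and a field with index false refuses to be searched. One token per character is assumed for text fields, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FieldMapping:
    type: str           # "text" or "keyword"
    index: bool = True  # False means the field cannot be searched


def match(mapping: FieldMapping, stored: str, query: str) -> bool:
    """Simplified model of a match query against one stored field value."""
    if not mapping.index:
        raise ValueError("Cannot search on field since it is not indexed.")
    if mapping.type == "keyword":
        return stored == query  # whole-value comparison, no tokenizing
    if mapping.type == "text":
        # ASSUMPTION: one token per character, as the default analyzer does for Chinese.
        return any(ch in stored for ch in query if not ch.isspace())
    raise ValueError(f"unsupported type: {mapping.type}")


if __name__ == "__main__":
    sex = FieldMapping("keyword")
    print(match(sex, "男的", "男"))    # partial value on a keyword field
    print(match(sex, "男的", "男的"))  # exact value on a keyword field
```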
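Cluster
Windows cluster deployment
IK analyzer
- Installation
Unzip the plugin and place the resulting folder in the plugins directory under the ES root, then restart ES.
- Usage
1. Without IK (default analyzer)
Request method: GET
Request URL: http://localhost:9200/_analyze
Request body: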
{
"text":"测试单词"
}
Response:
{
"tokens": [
{
"token": "测",
"start_offset": 0,
"end_offset": 1,
"type": "<IDEOGRAPHIC>",
"position": 0
},
{
"token": "试",
"start_offset": 1,
"end_offset": 2,
"type": "<IDEOGRAPHIC>",
"position": 1
},
{
"token": "单",
"start_offset": 2,
"end_offset": 3,
"type": "<IDEOGRAPHIC>",
"position": 2
},
{
"token": "词",
"start_offset": 3,
"end_offset": 4,
"type": "<IDEOGRAPHIC>",
"position": 3
}
]
}
2. With IK
Request method: GET
Request URL: http://localhost:9200/_analyze
Request body:
{
"text":"测试单词",
"analyzer":"ik_max_word"
}
ik_max_word: splits the text at the finest granularity.
ik_smart: splits the text at the coarsest granularity.
Response:
{
"tokens": [{
"token": "测试",
"start_offset": 0,
"end_offset": 2,
"type": "CN_WORD",
"position": 0
}, {
"token": "单词",
"start_offset": 2,
"end_offset": 4,
"type": "CN_WORD",
"position": 1
}]
}
- Extending the dictionary
Request method: GET
Request URL: http://localhost:9200/_analyze
Request body:
{
"text":"弗雷尔卓德",
"analyzer":"ik_max_word"
}
Response:
{
"tokens": [{
"token": "弗",
"start_offset": 0,
"end_offset": 1,
"type": "CN_CHAR",
"position": 0
}, {
"token": "雷",
"start_offset": 1,
"end_offset": 2,
"type": "CN_CHAR",
"position": 1
}, {
"token": "尔",
"start_offset": 2,
"end_offset": 3,
"type": "CN_CHAR",
"position": 2
}, {
"token": "卓",
"start_offset": 3,
"end_offset": 4,
"type": "CN_CHAR",
"position": 3
}, {
"token": "德",
"start_offset": 4,
"end_offset": 5,
"type": "CN_CHAR",
"position": 4
}]
}
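As the result shows, every character is split out separately, but we want the analyzer to recognize "弗雷尔卓德" as a single word.
Solution:
1. Go to the plugins folder under the ES root, open the ik folder, enter its config directory, create a file named custom.dic, and write "弗雷尔卓德" into it.
2. Open IKAnalyzer.cfg.xml and register the new custom.dic in it.
The resulting IKAnalyzer.cfg.xml:
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE properties SYSTEM "http://java.sun.com/dtd/properties.dtd">
<properties>
<comment>IK Analyzer extension configuration</comment>
<!-- Configure your own extension dictionary here -->
<entry key="ext_dict">custom.dic</entry>
<!-- Configure your own extension stop-word dictionary here -->
<entry key="ext_stopwords"></entry>
<!-- Configure a remote extension dictionary here -->
<!-- <entry key="remote_ext_dict">words_location</entry> -->
<!-- Configure a remote extension stop-word dictionary here -->
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->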
<!-- <entry key="remote_ext_stopwords">words_location</entry> -->
</properties>
3. Restart the ES server
4. Test again
Request method: GET
Request URL: http://localhost:9200/_analyze
Request body:
{
"text":"弗雷尔卓德",
"analyzer":"ik_max_word"
}
Response:
{
"tokens": [{
"token": "弗雷尔卓德",
"start_offset": 0,
"end_offset": 5,
"type": "CN_WORD",
"position": 0
}]
}
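Maintaining custom.dic by hand gets tedious once the word list grows. The sketch below appends a word only if it is not already listed, one word per line. It assumes the dictionary is a UTF-8 text file; add_dictionary_word is a hypothetical helper, the demo writes to a temporary directory rather than a real plugin folder, and ES still has to be restarted afterwards as described in step 3.

```python
import tempfile
from pathlib import Path


def add_dictionary_word(dic_path: Path, word: str) -> bool:
    """Append `word` on its own line unless already present. Returns True if added."""
    word = word.strip()
    if not word:
        raise ValueError("word must not be empty")
    text = dic_path.read_text(encoding="utf-8") if dic_path.exists() else ""
    if word in (line.strip() for line in text.splitlines()):
        return False
    # Start on a fresh line if the file does not end with a newline.
    prefix = "" if (not text or text.endswith("\n")) else "\n"
    with dic_path.open("a", encoding="utf-8") as handle:
        handle.write(prefix + word + "\n")
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        dic = Path(tmp) / "custom.dic"
        print(add_dictionary_word(dic, "弗雷尔卓德"))  # added
        print(add_dictionary_word(dic, "弗雷尔卓德"))  # already present
        print(dic.read_text(encoding="utf-8"))
```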
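Client tools
Several popular client tools:
Elasticsearch-Head: not used. The Elasticsearch-Head plugin has not been maintained since 5.x, and its interface is dated.
cerebro: not used. Reportedly it does not support ES versions above 5.x (not verified here).
Kibana: not used. Powerful but complex to operate; worth considering later.
Dejavu: not used. Another Elasticsearch web UI whose interface follows current front-end styles and is convenient to use, but little documentation is available online and I did not look into it further.
ElasticHD: recommended. It does not need to be installed as an ES plugin, which makes it more convenient; enter the ES IP and port in the navigation bar to start working with ES.
- Installation and startup
After unzipping, cd to the installation directory and run the following command: ElasticHD -p 127.0.0.1:9800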