单字符串多字段查询
# 一、Dis Max Query
# 1、基础案例
首先来看一个简单的查询案例:
PUT /blogs/_doc/1
{
"title": "Quick brown rabbits",
"body": "Brown rabbits are commonly seen."
}
PUT /blogs/_doc/2
{
"title": "Keeping pets healthy",
"body": "My quick brown fox eats rabbits on a regular basis."
}
POST /blogs/_search
{
"query": {
"bool": {
"should": [
{ "match": { "title": "Brown fox" }},
{ "match": { "body": "Brown fox" }}
]
}
}
}
由于数据一中title和body字段都含有Brown,而数据二只有body字段含有brown fox,should的查询过程如下:
- 查询should语句中的子查询
- 加和子查询的评分
- 乘以匹配语句的总数
- 除以所有语句的总数
通过这样的算分流程,得到数据一的匹配程度更高,但是如果我们仅仅想要的某一个字段匹配最高,比如在这个例子中,数据二的body字段匹配程度最高。
# 2、Dis Max Query
POST blogs/_search
{
"query": {
"dis_max": {
"queries": [
{ "match": { "title": "Quick pets" }},
{ "match": { "body": "Quick pets" }}
]
}
}
}
- 数据一只有title字段能够匹配Quick,但是Quick在title字段匹配程度高。
- 数据二title字段和body字段能够匹配pets,但是pets在两个字短匹配程度相对较低。
这种情况的竞争就是当字段之间相互竞争,又相互关联的时候,我们只想要查询出最佳字段,也就是我们最匹配的字段。
# 二、Multi Match
# 1、基础语法结构
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": [ "title", "title.std" ]
}
}
}
# 2、案例
DELETE /titles
PUT /titles
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english"
}
}
}
}
POST titles/_bulk
{ "index": { "_id": 1 }}
{ "title": "My dog barks" }
{ "index": { "_id": 2 }}
{ "title": "I see a lot of barking dogs on the road " }
GET titles/_search
{
"query": {
"match": {
"title": "barking dogs"
}
}
}
在title字段的匹配上,id为1的语句整体要简短,同时采用了英文分词器,barking dogs会被分成bark和dogs两个单词,所以id为1的语句匹配程度更高。
DELETE /titles
PUT /titles
{
"mappings": {
"properties": {
"title": {
"type": "text",
"analyzer": "english",
"fields": {"std": {"type": "text","analyzer": "standard"}}
}
}
}
}
POST titles/_bulk
{ "index": { "_id": 1 }}
{ "title": "My dog barks" }
{ "index": { "_id": 2 }}
{ "title": "I see a lot of barking dogs on the road " }
GET /titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": [ "title", "title.std" ]
}
}
}
GET /titles/_search
{
"query": {
"multi_match": {
"query": "barking dogs",
"type": "most_fields",
"fields": [ "title^10", "title.std" ] // 权重
}
}
}
通过为title字段增加一个子字段std,这个字段采用标准的分词器,来对算分进行综合的影响。
# 参考
上次更新: 2024/06/29, 15:13:44