Full-Body Search
With the search DSL, requests are sent with the GET method and carry a request body:
GET /_search
{}
GET /index_2014*/type1,type2/_search
{}
GET /_search
{
"from": 30,
"size": 10
}
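The from and size parameters in the last request page through the results: from is the number of hits to skip and size is the number of hits to return. As a minimal sketch, a 1-based page number can be turned into these two values as follows. The helper name page_to_from_size is hypothetical and not part of any ES client.

```python
def page_to_from_size(page: int, page_size: int) -> dict:
    """Translate a 1-based page number into from/size search parameters."""
    if page < 1 or page_size < 1:
        raise ValueError("page and page_size must be positive")
    # Skip all hits that belong to the preceding pages.
    return {"from": (page - 1) * page_size, "size": page_size}


# Page 4 with 10 hits per page reproduces the request body shown above.
body = page_to_from_size(page=4, page_size=10)
print(body)  # {'from': 30, 'size': 10}
```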
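A GET request with a body may look strange, and the HTTP libraries of some languages (JavaScript, for example) do not support it at all. It is nevertheless permitted by RFC 7231; the standard simply does not define how a server should treat a body on a GET request. For this reason some HTTP servers support it and some do not, proxy servers in particular. The ES developers consider GET semantically more appropriate than POST, but because some libraries cannot send a GET request with a body, some APIs also accept POST requests. For example: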
POST /_search
{
"from": 30,
"size": 10
}
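For clients that cannot send a body with GET, the POST form can be prepared as in the sketch below, which uses only the Python standard library. The base URL http://localhost:9200 and the helper name build_search_request are assumptions for illustration; the request is only constructed here, not sent.

```python
import json
import urllib.request


def build_search_request(body, base_url="http://localhost:9200", path="/_search"):
    """Build (but do not send) a POST search request carrying a JSON body."""
    data = json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        url=base_url + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_search_request({"from": 30, "size": 10})
print(req.get_method(), req.full_url)
```

Passing req to urllib.request.urlopen would send it, provided a node is actually listening at the assumed address.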
Query DSL
To use the Query DSL, pass a query in the query field:
GET /_search
{
"query": YOUR_QUERY_HERE
}
An empty query ({}) is equivalent to using match_all:
GET /_search
{
"query": {
"match_all": {}
}
}
Structure of a Query
A query typically has the following structure:
{
QUERY_NAME: {
ARGUMENT: VALUE,
ARGUMENT: VALUE,...
}
}
If it refers to one particular field, it has the following structure instead:
{
QUERY_NAME: {
FIELD_NAME: {
ARGUMENT: VALUE,
ARGUMENT: VALUE,...
}
}
}
For example, you can use a match query to find documents whose tweet field mentions elasticsearch:
{
"match": {
"tweet": "elasticsearch"
}
}
The complete search request looks like this:
GET /_search
{
"query": {
"match": {
"tweet": "elasticsearch"
}
}
}
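The same structure can be assembled programmatically as plain dictionaries. This is a minimal sketch; the helper names match and search_body are hypothetical and do not come from an ES client library. Falling back to match_all when no query is given mirrors the equivalence described earlier.

```python
import json


def match(field, text):
    """Build a leaf clause of the form {QUERY_NAME: {FIELD_NAME: VALUE}}."""
    return {"match": {field: text}}


def search_body(query=None):
    """Wrap a query clause in a search body, defaulting to match_all."""
    if query is None:
        query = {"match_all": {}}
    return {"query": query}


print(json.dumps(search_body(match("tweet", "elasticsearch")), indent=2))
```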
Combining Multiple Clauses
Query clauses are simple building blocks that can be combined to form complex queries. There are two kinds of clause:
- Leaf clauses (like the match clause) that are used to compare a field (or fields) to a query string.
- Compound clauses that are used to combine other query clauses. For instance, a bool clause allows you to combine other clauses that either must match, must_not match, or should match if possible. They can also include non-scoring filters for structured search:
{
"bool": {
"must": { "match": { "tweet": "elasticsearch" }},
"must_not": { "match": { "name": "mary" }},
"should": { "match": { "tweet": "full text" }},
"filter": { "range": { "age" : { "gt" : 30 }} }
}
}
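Queries and Filters
The ES Query DSL is a collection of query clauses. Each clause can be used in either a filtering context or a query context.
In a filtering context, the clause acts as a non-scoring (filtering) query. It asks "Does this document match?" and the answer is always a simple, binary yes or no.
In a query context, the clause acts as a scoring query. It determines not only whether the document matches, but also how well it matches.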
As a general rule, use query clauses for full-text search or for any condition that should affect the relevance score, and use filters for everything else.
Some commonly used queries:
{ "match_all": {}}
{ "match": { "age": 26 }}
{ "match": { "date": "2014-09-01" }}
{ "match": { "public": true }}
{ "match": { "tag": "full_text" }}
{
"multi_match": {
"query": "full text search",
"fields": [ "title", "body" ]
}
}
{
"range": {
"age": {
"gte": 20,
"lt": 30
}
}
}
{ "term": { "age": 26 }}
{ "term": { "date": "2014-09-01" }}
{ "term": { "public": true }}
{ "term": { "tag": "full_text" }}
{ "terms": { "tag": [ "search", "full_text", "nosql" ] }}
{
"exists": {
"field": "title"
}
}
The term query is used to search by exact values, be they numbers, dates, Booleans, or not_analyzed exact-value string fields.
Combining Queries
The bool query combines multiple sub-queries. It accepts the following parameters:
- must: Clauses that must match for the document to be included.
- must_not: Clauses that must not match for the document to be included.
- should: If these clauses match, they increase the _score; otherwise, they have no effect. They are simply used to refine the relevance score for each document.
- filter: Clauses that must match, but are run in non-scoring, filtering mode. These clauses do not contribute to the score; they simply include or exclude documents based on their criteria.
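Each sub-query independently calculates a relevance score for each document, and the bool query then merges the scores of its sub-queries into a single score.
The following query finds documents whose title matches the query string how to make millions and whose tag is not spam. Documents that are tagged starred, or that are dated 2014 or later, rank higher than documents that match neither condition. Documents that satisfy both conditions rank higher still: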
{
"bool": {
"must": { "match": { "title": "how to make millions" }},
"must_not": { "match": { "tag": "spam" }},
"should": [
{ "match": { "tag": "starred" }},
{ "range": { "date": { "gte": "2014-01-01" }}}
]
}
}
TIP: If there are no must clauses, at least one should clause has to match. However, if there is at least one must clause, no should clause is required to match.
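The matching rules for bool, including the rule in the TIP, can be illustrated with a toy in-memory model. This is only a sketch: clauses are modelled as Python predicates over dictionaries, and the score is simply the number of matching must and should clauses, which is an assumption made for illustration and not how ES calculates relevance. The function name bool_score and the sample documents are hypothetical.

```python
def bool_score(doc, must=(), must_not=(), should=(), filter_=()):
    """Return None if doc is excluded, otherwise a toy score."""
    if not all(clause(doc) for clause in must):
        return None
    if not all(clause(doc) for clause in filter_):
        return None  # filter clauses exclude documents but never add to the score
    if any(clause(doc) for clause in must_not):
        return None
    should_hits = sum(1 for clause in should if clause(doc))
    # Rule from the TIP: without any must clause, at least one should clause must match.
    if should and not must and should_hits == 0:
        return None
    return len(must) + should_hits


must = [lambda d: "millions" in d["title"]]
must_not = [lambda d: d["tag"] == "spam"]
should = [lambda d: d["tag"] == "starred", lambda d: d["date"] >= "2014-01-01"]

a = {"title": "how to make millions", "tag": "starred", "date": "2014-06-01"}
b = {"title": "how to make millions", "tag": "misc", "date": "2013-01-01"}
c = {"title": "how to make millions", "tag": "spam", "date": "2015-01-01"}

print(bool_score(a, must, must_not, should))  # 3
print(bool_score(b, must, must_not, should))  # 1
print(bool_score(c, must, must_not, should))  # None
print(bool_score(b, should=should))           # None
```

Document b is still returned when a must clause is present, but it is excluded once the should clauses are the only clauses in the query.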
If we do not want the date to affect the score, we can use a filter clause to restrict the set of documents instead:
{
"bool": {
"must": { "match": { "title": "how to make millions" }},
"must_not": { "match": { "tag": "spam" }},
"should": [
{ "match": { "tag": "starred" }}
],
"filter": {
"range": { "date": { "gte": "2014-01-01" }}
}
}
}
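Any query clause can be moved into the filter clause in this way, which automatically turns it into a non-scoring query.
If you need to filter on several different criteria, a bool query can itself be placed inside filter and used as a non-scoring query: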
{
"bool": {
"must": { "match": { "title": "how to make millions" }},
"must_not": { "match": { "tag": "spam" }},
"should": [
{ "match": { "tag": "starred" }}
],
"filter": {
"bool": {
"must": [
{ "range": { "date": { "gte": "2014-01-01" }}},
{ "range": { "price": { "lte": 29.99 }}}
],
"must_not": [
{ "term": { "category": "ebooks" }}
]
}
}
}
}
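Nested bool queries like the one above become easier to assemble with a small builder. The sketch below uses plain dictionaries; the helper name bool_query is hypothetical, and the trailing underscore in filter_ only avoids shadowing the Python builtin.

```python
import json


def bool_query(must=None, must_not=None, should=None, filter_=None):
    """Build a bool query, leaving out any clause type that was not supplied."""
    clauses = {"must": must, "must_not": must_not, "should": should, "filter": filter_}
    return {"bool": {name: value for name, value in clauses.items() if value}}


# The inner bool runs in filtering mode because it sits under "filter".
non_scoring = bool_query(
    must=[
        {"range": {"date": {"gte": "2014-01-01"}}},
        {"range": {"price": {"lte": 29.99}}},
    ],
    must_not=[{"term": {"category": "ebooks"}}],
)

query = bool_query(
    must={"match": {"title": "how to make millions"}},
    must_not={"match": {"tag": "spam"}},
    should=[{"match": {"tag": "starred"}}],
    filter_=non_scoring,
)
print(json.dumps(query, indent=2))
```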
Validating Queries
The validate-query API checks whether a query is valid:
GET /gb/tweet/_validate/query
{
"query": {
"tweet" : {
"match" : "really powerful"
}
}
}
{
"valid" : false,
"_shards" : {
"total" : 1,
"successful" : 1,
"failed" : 0
}
}
To find out why the query is invalid, add the explain parameter to the query string:
GET /gb/tweet/_validate/query?explain
{
"query": {
"tweet" : {
"match" : "really powerful"
}
}
}
{
"valid" : false,
"_shards" : { ... },
"explanations" : [ {
"index" : "gb",
"valid" : false,
"error" : "org.elasticsearch.index.query.QueryParsingException: [gb] No query registered for [tweet]"
} ]
}
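A response of this shape can be inspected in code to collect the error reported for each index. This is a minimal sketch that assumes the response has already been parsed into a dictionary; the helper name validation_errors is hypothetical, and the sample omits the elided _shards field.

```python
def validation_errors(response):
    """Return (index, error) pairs for every invalid explanation in a validate response."""
    if response.get("valid"):
        return []
    return [
        (item.get("index"), item.get("error"))
        for item in response.get("explanations", [])
        if not item.get("valid", False)
    ]


sample = {
    "valid": False,
    "explanations": [
        {
            "index": "gb",
            "valid": False,
            "error": "org.elasticsearch.index.query.QueryParsingException: "
                     "[gb] No query registered for [tweet]",
        }
    ],
}

for index, error in validation_errors(sample):
    print(index, "->", error)
```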
The explain parameter also helps in understanding how ES interprets a query:
GET /_validate/query?explain
{
"query": {
"match" : {
"tweet" : "really powerful"
}
}
}
{
"valid" : true,
"_shards" : { ... },
"explanations" : [ {
"index" : "us",
"valid" : true,
"explanation" : "tweet:really tweet:powerful"
}, {
"index" : "gb",
"valid" : true,
"explanation" : "tweet:realli tweet:power"
} ]
}