Brokerage Platform Service : ElasticSearch - REST

REST API 기본 구조
Document API
Search API

Term query
match_all query
match query
match_phrase query
bool query
filtered query
aggregation
facets
APIs

Basic API
Bulk API
Indexing data
User Query DSL
References

This page summarizes the ElasticSearch REST API.

REST API 기본 구조

ElasticSearch provides a [http://ko.wikipedia.org/wiki/REST REST (Representational State Transfer)] API, so it can be used from many different environments.

'''Main uses of the HTTP methods in REST'''

POST : create (Create)
PUT : update (Replace); creates the data if it does not exist yet (Create)
DELETE : delete (Delete)
GET : retrieve (List, Retrieve)

'''Basic URI format'''

http://localhost:9200/index/type/id?parameters
http://localhost:9200/
http://localhost:9200/[index/][type/]action?parameters
*index : corresponds to a database in a DBMS
*type : corresponds to a table in a DBMS
*id : ID of the Document, which corresponds to a record in a DBMS
To specify several indexes, types, or ids, separate them with ",". Use "*" to select all of them.
*action : requests a specific operation
*Common parameters
**pretty : pretty-prints the JSON response, if one is returned
**v : verbose. Shows detailed information
**help : shows the available columns
**h=column1,column2 : headers. Shows only the listed columns
**bytes=b : shows a plain number such as 1024 instead of 1kb
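The URI format above can be sketched in a few lines of Python. This is only an illustration that assembles strings and does not contact a cluster; the function name `build_uri` and its argument names are hypothetical and not part of ElasticSearch.

```python
from urllib.parse import urlencode


def build_uri(host="localhost:9200", index=None, doc_type=None, doc_id=None,
              action=None, **params):
    """Assemble http://host/[index/][type/][id|action]?parameters (hypothetical helper)."""
    def join(value):
        # Several indexes, types, or ids are separated with ","
        if isinstance(value, (list, tuple)):
            return ",".join(str(v) for v in value)
        return str(value)

    parts = [join(p) for p in (index, doc_type, doc_id, action) if p is not None]
    uri = "http://" + host + "/" + "/".join(parts)
    if params:
        uri += "?" + urlencode(params)
    return uri


print(build_uri(index="customer", doc_type="external", doc_id=1, pretty="true"))
print(build_uri(index=["index1", "index2"], action="_search"))
```

The first call addresses a single document and the second one a search action across two indexes.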

'''Action'''

{| border="1" cellspacing="0" cellpadding="2" style="line-height: 20.7999992370605px; width: 100%;"
|-
| style="text-align: center; background-color: rgb(204, 204, 204);" | Action
| style="text-align: center; background-color: rgb(204, 204, 204);" | Details
|-
| _cluster
| Cluster operations
|-
| _nodes
| Node operations
|-
| _aliases
| Index alias operations
|-
| _analyze
| Analyzer operations
|-
| _cache
| Cache operations
|-
| _flush
| Flushes the transaction log or frees memory
|-
| _optimize
| Merges segment files
|-
| _stats
| Statistics for the system or an index
|-
| _search
| Search
|-
| _msearch
| Multi search
|-
| _mget
| Multi document fetch
|-
| _validate
| Validates a query
|-
| _suggest
| Search term autocompletion
|-
| _bulk
| Bulk indexing
|-
| _count
| Counts documents
|-
| _settings
| Retrieves the global settings configured in elasticsearch.yml
http://localhost:9200/index/_settings?pretty=true

number_of_shards : number of shards
number_of_replicas : number of replicas
index.refresh_interval : time before an index change is reflected in search results
analysis : analyzer and tokenizer settings
store : storage options

|-
| _mapping
| Mapping information
http://localhost:9200/index/_mapping?pretty=true
|}

'''Core Type Attribute'''

{| border="1" cellspacing="0" cellpadding="2" style="width: 100%;"
|-
| style="text-align: center; background-color: rgb(204, 204, 204);" | Attribute
| style="text-align: center; background-color: rgb(204, 204, 204);" | Default
| style="text-align: center; background-color: rgb(204, 204, 204);" | Details
|-
| style="text-align: center;" | store
| style="text-align: center;" | no
| Whether to store the original value
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Indexing mode: no, not_analyzed, analyzed
|-
| style="text-align: center;" |
| style="text-align: center;" |
| How meta information about index terms is stored: yes, no, with_offsets, with_positions, with_positions_offsets
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Boost value
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Default value used when the field value is null
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Whether to use Lucene norms
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Meta information to store at index time: positions, docs
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Global analyzer used for both indexing and searching
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Analyzer used at index time
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Analyzer used at search time
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Whether to include the field in the _all field, which holds every searchable field
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Ignores the characters of a string field beyond the given length
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Gap between the preceding and following text in phrase searches
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Maximum number_term setting
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Ignores malformed number and date values
|-
| style="text-align: center;" |
| style="text-align: center;" |
| Date format
|}

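Field types

string, number, boolean, date
ip

Field kinds

_id : primary key of the Document
_source : stores all fields of the indexed document
_all :
Search Field : ordinary field that is searched
Facet Field : performs a group-by operation on the search results
*terms : group by over the listed fields
*statistical : statistics over the listed fields
*terms statistical facet : statistics of a value field per key field
Sort Field : field used for sorting; by default, results are sorted by _score in descending order
Boost Field : field used for boosting
Highlight Field : field to highlight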

'''Create/update/delete/retrieve examples'''

{| cellspacing="0" cellpadding="2" border="1" width="100%" bgcolor="#FFFFFF" align="center" style="line-height: 20.7999992370605px;"
|-
| width="20%" align="center" valign="middle" style="background-color: rgb(238, 238, 238);" | Create
(POST / PUT)
| width="80%" |

Create the customer index

 curl -XPUT 'node201.hadoop.com:9200/customer?pretty'
 
 curl -XGET 'node201.hadoop.com:9200/_cat/indices?v'

Add a document to the external type
- the document ID is generated automatically

 curl -XPOST 'node201.hadoop.com:9200/customer/external?pretty' -d '
 {
 "name": "Mountain Lover"
 }'
 
 curl -XGET 'node201.hadoop.com:9200/customer/external/1lz2jL6CQui07FnZGd_R9w?pretty'

Add document 1 to the external type

 curl -XPUT 'node201.hadoop.com:9200/customer/external/1?pretty' -d '
 {
 "name": "Mountain Lover"
 }'
 
 curl -XGET 'node201.hadoop.com:9200/customer/external/1?pretty'

|-
| align="center" valign="middle" style="background-color: rgb(238, 238, 238);" | Update
(PUT / POST)
|

Update document 1 of the external type (partial update with _update)

 curl -XPOST 'node201.hadoop.com:9200/customer/external/1/_update?pretty' -d '
 {
 "doc": { "name": "Mountain Lover!", "age": 20 }
 }'
 
 curl -XGET 'node201.hadoop.com:9200/customer/external/1?pretty'

Update document 1 of the external type (full replacement with PUT)

 curl -XPUT 'node201.hadoop.com:9200/customer/external/1?pretty' -d '
 {
 "name": "Mountain Lover!"
 }'
 
 curl -XGET 'node201.hadoop.com:9200/customer/external/1?pretty'

|-
| align="center" valign="middle" style="background-color: rgb(238, 238, 238);" | Delete
(DELETE)
|

Delete documents

 curl -XDELETE 'node201.hadoop.com:9200/customer/external/1?pretty'
 
 curl -XGET 'node201.hadoop.com:9200/customer/external/1?pretty'
 
 curl -XDELETE 'node201.hadoop.com:9200/customer/external/_query?pretty' -d '
 {
 "query": { "match": { "name": "Mountain Lover!" } }
 }'
 
 curl -XGET 'node201.hadoop.com:9200/customer/external/1?pretty'

Delete the customer index

 curl -XDELETE 'node201.hadoop.com:9200/customer?pretty'
 
 curl -XGET 'node201.hadoop.com:9200/_cat/indices?v'

|-
| align="center" valign="middle" style="background-color: rgb(238, 238, 238);" | Retrieve
(GET)
|

Retrieve

 curl -XGET 'node201.hadoop.com:9200/_cat/indices?v'
 
 curl -XGET 'node201.hadoop.com:9200/customer/external/1?pretty'

Structure of the returned data
- _index
- _type
- _id
- _version : 1, 2, 3, ...
- _source : { name1: value1, name2: value2 }

|-
| align="center" valign="middle" style="background-color: rgb(238, 238, 238);" | Search
(GET / POST)
|

REST request URI

 curl -XGET 'node201.hadoop.com:9200/customer/_search?q=*&pretty'

REST request body

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "query": { "match_all": {} }
 }'

Document API

index api : -XPUT : create, -XPOST : create (the id is generated automatically)

/_create
- ?op_type=create : fails if the document already exists
?routing=~ : uses the hash of the routing value to choose the node to work on
?version=n
?parent=~
?timestamp=2014-11-15T14%3A12%3A12
?ttl=34 : time to live (milliseconds)
?consistency=one, quorum, all
?replication=async, sync
?refresh=true
?timeout=5m

get api : -XGET

?fields=,
?routing=~ : uses the hash of the routing value to choose the node to work on
?version=n
?realtime=false
/_source : returns only the _source field (-XHEAD can be used)
?_source=false : does not return the _source field
- ?_source_include, _source_exclude
?preference=_primary, _local, ~
?refresh=true

delete api : -XDELETE

?routing=~ : uses the hash of the routing value to choose the node to work on
?version=n
?parent=~
?consistency=one, quorum, all
?replication=async, sync
?refresh=true
?timeout=5m

update api : -XPOST

/_update
How to use the "script" field
- ctx._source.field_name

 "script" : "ctx._source.counter += count",
 "params" : {                           #--- passes arguments to the script
 "count" : 4
 } 
 
 ctx._source.remove(\"text\")           #--- removes a field
 ctx._source.tags.contains(tag) ? (ctx.op = \"delete\") : (ctx.op = \"none\")
 if (ctx._source.tags.contains(tag)) { ctx.op = \"none\" } else { ctx._source.tags += tag }

"upsert" : if the document exists it is updated; otherwise the "upsert" content is indexed as a new document
"doc_as_upsert": true : if the document exists it is updated; otherwise "doc" is indexed as a new document
?routing=~ : uses the hash of the routing value to choose the node to work on
?parent=~
?replication=async, sync
?timeout=5m
?consistency=one, quorum, all
?refresh=true
?fields=,
?version=n
?version_type
?timestamp=2014-11-15T14%3A12%3A12
- ctx._timestamp
?ttl=34 : time to live (milliseconds)
- ctx._ttl

multi get api : -XGET, /_mget

 curl -XGET 'node201.hadoop.com:9200/_mget' -d '{
 "docs": [
 {
  "_index": "~",
  "_type": "~",
  "_id": "~"
 }
 ]
 }'
 
 curl -XGET 'node201.hadoop.com:9200/customer/external/_mget' -d '{
 "ids": ["~", "~"]
 }'

"_source"

 "_source": false
 "_source": ["field1", "field2"]
 "_source": {
 "include": ["~"],
 "exclude": ["~", "~"]
 }

"fields": "~~", "~~"
"_routing": "~"

bulk api : /_bulk

requests file

 #--- index, create, update, delete
 { "index": { "_index": "~", "_type": "~", "_id": "~" } }
 { "field1": "value1" }
 { "update": { "_index": "~", "_type": "~", "_id": "~" } }
 { "doc": { "field1": "value1" }, "doc_as_upsert": true }
 #--- the upsert, doc_as_upsert, script, params, and lang parameters are supported

bulk api

 curl -s -XPOST 'node201.hadoop.com:9200/_bulk' --data-binary @requests

_version, _routing, _parent, _timestamp, _ttl, _consistency
?refresh=true
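A requests file in this format can be generated programmatically. The following is a minimal sketch that builds the newline-delimited body in memory; the helper name `bulk_body` and the tuple layout of its input are assumptions made for this example.

```python
import json


def bulk_body(operations):
    """Build a bulk request body from (action, metadata, source) tuples (hypothetical helper)."""
    lines = []
    for action, meta, source in operations:
        lines.append(json.dumps({action: meta}))   # action and metadata line
        if source is not None:                     # "delete" has no source line
            lines.append(json.dumps(source))
    return "\n".join(lines) + "\n"                 # the body must end with a newline


body = bulk_body([
    ("index", {"_id": "1"}, {"name": "John Doe"}),
    ("update", {"_id": "1"}, {"doc": {"name": "Jane Doe"}, "doc_as_upsert": True}),
    ("delete", {"_id": "2"}, None),
])
print(body, end="")
```

The resulting text can be written to a file and sent with `--data-binary`, which keeps the line breaks intact.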

delete by query api : -XDELETE, /_query

 curl -XDELETE 'node201.hadoop.com:9200/customer/external/_query?q=user:~'
 #--- q : query
 #--- df : default field
 #--- analyzer : query analyzer
 #--- default_operator : OR (default), AND
 
 curl -XDELETE 'node201.hadoop.com:9200/customer/external/_query' -d '{
 "query": {
 "term": { "user": "~" }
 }
 }'

?routing=~ : uses the hash of the routing value to choose the node to work on
?replication=async, sync
?consistency=one, quorum, all

bulk udp api

Settings

 bulk.udp.enabled: true
 bulk.udp.bulk_actions: 1000
 bulk.udp.bulk_size: 5m                 #-- 5MB
 bulk.udp.flush_interval: 5s
 bulk.udp.concurrent_requests: 4
 bulk.udp.host:                         #--- defaults to the value of network.host
 bulk.udp.port: 9700-9800
 bulk.udp.receive_buffer_size: 10mb

Usage

 cat requests | nc -w 0 -u node201.hadoop.com 9700

term vectors api

 curl -XGET 'node201.hadoop.com:9200/customer/external/1/_termvector?pretty=true&fields=~,~'

multi termvectors api

 curl -XGET 'node201.hadoop.com:9200/_mtermvectors' -d '{
 "docs": [
 {
  "_index": "~",
  "_type": "~",
  "_id": "~",
  "term_statistics": true
 }
 ]
 }'

Search API

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/_the_search_api.html
http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/query-dsl.html

REST request uri search

q : Query String Query
analyzer
default_operator : OR (default), AND
_source = false, _source_include, _source_exclude
df : default field
fields : selects the fields to return
sort : field:asc, field:desc
explain
track_scores = true
timeout
from : index of the first record to return (0, 1, 2, ...)
size : number of records to return (default 10)
search_type : query_then_fetch (default), dfs_query_then_fetch, dfs_query_and_fetch, query_and_fetch, count, scan
lowercase_expanded_terms
analyze_wildcard : false (default), true
scroll=5m
preference : _primary, _primary_first, _local, _only_node:xyz, _prefer_node:xyz, _shards:2,3

 curl -XGET 'node201.hadoop.com:9200/customer/_search?q=*&pretty'
 curl -XGET 'node201.hadoop.com:9200/customer/_search?pretty&q=user:kimchi'
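When the URI search parameters are assembled by a program, the values need URL encoding. The sketch below uses only the Python standard library; the helper name `uri_search` is hypothetical.

```python
from urllib.parse import urlencode


def uri_search(index, query, **options):
    """Build the path and query string of a URI search (hypothetical helper)."""
    params = {"q": query}
    params.update(options)               # e.g. sort, size, df, default_operator
    return "/" + index + "/_search?" + urlencode(params)


print(uri_search("customer", "user:kimchi", sort="balance:desc", size=5))
# "from" is a Python keyword, so it has to be passed through a dict
print(uri_search("customer", "*", **{"from": 10, "size": 10}))
```

Note that `urlencode` escapes ":" and "*", which is harmless because the server decodes them again.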

REST request body search

_all : reserved word that refers to every index
?routing=~ : uses the hash of the routing value to choose the node to work on
"from" : 0
"size" : 10
"sort" : { "post_date" : {"order" : "asc"}}, "_score"
"_source": false
"_source": { "include": [ "obj1.*", "obj2.*" ], "exclude": "*.description" }
"fields" : "postDate"
"script_fields" : creates new fields through computation
"fielddata_fields" : "test2"
"post_filter" : { "term" : { "tag" : "green" } }
"highlight" : adds highlights to the results
"rescore" : adjusts how _score is computed
"explain": true
"version": true

 curl -XGET 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "query": { "match_all": {} },
 "sort": { "balance": { "order": "desc" } },   #--- sort order
 "from": 10,                                   #--- skip the first 10 hits
 "size": 10,                                   #--- return 10 hits
 "_source": ["account_number", "balance"]      #--- fields to return
 }'
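The relationship between "from", "size" and a page number can be sketched as follows. The function `search_body` and its 1-based `page` argument are assumptions made for this example; only the resulting JSON keys come from the request above.

```python
import json


def search_body(page, size=10, sort_field=None, source_fields=None):
    """Build a match_all request body for the given 1-based page (hypothetical helper)."""
    body = {
        "query": {"match_all": {}},
        "from": (page - 1) * size,       # number of hits to skip
        "size": size,                    # number of hits to return
    }
    if sort_field:
        body["sort"] = {sort_field: {"order": "desc"}}
    if source_fields:
        body["_source"] = list(source_fields)
    return body


print(json.dumps(search_body(2, sort_field="balance",
                             source_fields=["account_number", "balance"]), indent=1))
```

Page 2 with a size of 10 yields "from": 10, which matches the request shown above.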

Term query

 curl -XGET 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "query": {
 "term": { "user": "kimchi" }
 }
 }'

minimum_should_match

match_all query

 curl -XGET 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "query": { "match_all": {} }
 }'

match query

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "query": { "match": { "account_number": 20 } }
 }'

match_phrase query

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "query": { "match_phrase": { "address": "mill lane" } }
 }'

bool query

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "query": {
 "bool": {
 #--- "must" : AND, "should" : OR, "must_not" : NOT (~ AND ~)
 "must": [                        
 { "match": { "address": "mill" } },
 { "match": { "address": "lane" } }
 ],
 "must_not": [
 { "match": { "state": "ID" } }
 ]
 }
 }
 }'
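The must / should / must_not semantics can be mimicked on a few in-memory documents. This is a deliberately rough sketch: `matches` is a hypothetical stand-in that only checks whether a lowercase word occurs in the field, with no analysis or scoring, and the sample documents are invented for the example.

```python
def matches(doc, clause):
    """Rough stand-in for a match clause: case-insensitive whole-word lookup."""
    field, text = next(iter(clause["match"].items()))
    return str(text).lower() in str(doc.get(field, "")).lower().split()


def bool_filter(docs, must=(), should=(), must_not=()):
    """Keep documents that satisfy the bool clauses (no scoring)."""
    hits = []
    for doc in docs:
        if not all(matches(doc, c) for c in must):        # must: AND
            continue
        if any(matches(doc, c) for c in must_not):        # must_not: NOT
            continue
        # should: OR; only required here when there is no must clause
        if should and not must and not any(matches(doc, c) for c in should):
            continue
        hits.append(doc)
    return hits


docs = [
    {"address": "990 Mill Road", "state": "AL"},
    {"address": "198 Mill Lane", "state": "ID"},
    {"address": "472 Mill Lane", "state": "MD"},
]
print(bool_filter(
    docs,
    must=[{"match": {"address": "mill"}}, {"match": {"address": "lane"}}],
    must_not=[{"match": {"state": "ID"}}],
))
```

Only the document from MD remains, since it contains both words and is not in state ID.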

filtered query

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "query": {
   "filtered": {
     "query": { "match_all": {} },
     "filter": {
       "range": {                     #--- range filter
         "balance": {
           "gte": 20000,
           "lte": 30000
         }
       }
     }
   }
 }
 }'
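What the range filter selects can be checked on sample data. The sketch below applies the same inclusive bounds to a list of dicts; the function `range_filter` and the sample accounts are hypothetical.

```python
def range_filter(docs, field, gte=None, lte=None):
    """Keep documents whose field value lies inside the inclusive range."""
    kept = []
    for doc in docs:
        value = doc.get(field)
        if value is None:
            continue                     # documents without the field never match
        if gte is not None and value < gte:
            continue
        if lte is not None and value > lte:
            continue
        kept.append(doc)
    return kept


accounts = [
    {"account_number": 1, "balance": 15000},
    {"account_number": 2, "balance": 20000},
    {"account_number": 3, "balance": 25000},
    {"account_number": 4, "balance": 30001},
]
print(range_filter(accounts, "balance", gte=20000, lte=30000))
```

Because "gte" and "lte" are inclusive, a balance of exactly 20000 is kept while 30001 is dropped.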

aggregation

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "size": 0,
 "aggs": {
 "group_by_state": {                #--- returns count(state)
 "terms": {
 "field": "state"
 }
 }
 }
 }'
 
 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty' -d '
 {
 "size": 0,
 "aggs": {
 "group_by_state": {                #--- returns avg(balance) per state
 "terms": {
 "field": "state",
 "order": {
 "average_balance": "desc"
 }
 },
 "aggs": {
 "average_balance": {
 "avg": {
 "field": "balance"
 }
 }
 }
 }
 }
 }'
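The second aggregation groups documents by state, averages the balance in each group, and orders the groups by that average. A minimal in-memory equivalent looks like this; `avg_by_term` and the sample accounts are hypothetical, and real bucket output contains more fields.

```python
from collections import defaultdict


def avg_by_term(docs, term_field, value_field):
    """Group by term_field and order the buckets by the average of value_field, descending."""
    groups = defaultdict(list)
    for doc in docs:
        groups[doc[term_field]].append(doc[value_field])
    buckets = [
        {"key": key, "doc_count": len(values), "average_balance": sum(values) / len(values)}
        for key, values in groups.items()
    ]
    buckets.sort(key=lambda b: b["average_balance"], reverse=True)
    return buckets


accounts = [
    {"state": "AL", "balance": 10000},
    {"state": "ID", "balance": 50000},
    {"state": "AL", "balance": 30000},
]
for bucket in avg_by_term(accounts, "state", "balance"):
    print(bucket)
```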

search shards api : returns which nodes and shards a search request is processed on

 curl -XGET 'node201.hadoop.com:9200/customer/_search_shards'

search template : builds a search request from a template

 curl -XGET 'node201.hadoop.com:9200/customer/_search/template?pretty' -d '{
 "template" : {
 "query": { "match" : { "{{my_field}}" : "{{my_value}}" } },
 "size" : "{{my_size}}"
 },
 "params" : {
 "my_field" : "foo",
 "my_value" : "bar",
 "my_size" : 5
 }
 }'
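How the "params" values are substituted into the "template" can be approximated with a simple placeholder replacement. This sketch is not the server-side template engine: `render` is a hypothetical helper that only handles plain {{name}} placeholders and assumes the parameter values contain no characters that need JSON escaping.

```python
import json
import re


def render(template, params):
    """Replace {{name}} placeholders in a template dict with values from params."""
    text = json.dumps(template)
    text = re.sub(r"\{\{(\w+)\}\}", lambda m: str(params[m.group(1)]), text)
    return json.loads(text)


template = {
    "query": {"match": {"{{my_field}}": "{{my_value}}"}},
    "size": "{{my_size}}",
}
params = {"my_field": "foo", "my_value": "bar", "my_size": 5}
print(render(template, params))
```

Since the placeholder for the size sits inside a quoted string, the rendered value is the string "5" rather than a number.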

facets

Aggregation and statistics processing over the search results

terms : statistics over a field

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty=true' -d '{
 "query" : { ~ },
 "facets": { "terms": { "field": "~" } }
 }'

facets global setting

main : computed over the documents that match the current search
global : computed over all documents, regardless of the search

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty=true' -d '{
 "facets": {
 "myFacets": {
 "terms": { "field": "~" },
 "global": true
 }
 }
 }'

facet filter

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty=true' -d '{
 "facets": {
 "myFacets": {
 "terms": { "field": "~" },
 "facet_filter": {
 "terms": { "user": "kimchi" }
 }
 }
 }
 }'

terms facet : returns the 10 most frequent terms

"all_terms" : true
"exclude" : "term2"
"regex" : "regex expression here", "regex_flags" : "DOTALL"
"script" : "term + 'aaa'"
"script" : "term == 'aaa' ? true : false"
"script_field" : "_source.my_field",

 curl -XPOST 'node201.hadoop.com:9200/customer/_search?pretty=true' -d '{
 "query" : { ~ },
 "facets": { 
 "field": {
 "terms": { 
 "field": "~", 
 "size": 10,
 "order": "count"       #--- count (default), term, reverse_count, reverse_term
 } 
 }
 }
 }'

APIs

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cat.html
Cluster health check : http://node201.hadoop.com:9200/_cat/health?v
Node information : http://node201.hadoop.com:9200/_cat/nodes?v
Index information : http://node201.hadoop.com:9200/_cat/indices?v
- http://node201.hadoop.com:9200/_cat/indices/index_name?v
Master information : http://node201.hadoop.com:9200/_cat/master?v
Shards information : http://node201.hadoop.com:9200/_cat/shards?v
- http://node201.hadoop.com:9200/_cat/shards/shard_name?v
Alias information : http://node201.hadoop.com:9200/_cat/aliases?v
Disk allocation information : http://node201.hadoop.com:9200/_cat/allocation?v
Total document count : http://node201.hadoop.com:9200/_cat/count?v
- Document count of an index : http://node201.hadoop.com:9200/_cat/count/index_name?v
Field data loaded per node : http://node201.hadoop.com:9200/_cat/fielddata?v
http://node201.hadoop.com:9200/_cat/fielddata/field1,field2?v
http://node201.hadoop.com:9200/_cat/fielddata?v&fields=field1,field2
Pending tasks information : http://node201.hadoop.com:9200/_cat/pending_tasks?v
Plugin information : http://node201.hadoop.com:9200/_cat/plugins?v
Recovery information : http://node201.hadoop.com:9200/_cat/recovery?v
Thread pool information : http://node201.hadoop.com:9200/_cat/thread_pool?v

|-
| align="center" valign="middle" | _nodes API |

http://www.elasticsearch.org/guide/en/elasticsearch/reference/current/cluster.html
How to specify nodes
- _nodes/_local : the local node
- _nodes/IP1,IP2
- _nodes/node_name
- _nodes/node_attribute

|-
| align="center" valign="middle" | _cluster API |

References

Basic API

Node

 curl -X GET http://node201.hadoop.com:9200/_status                     #--- check status

Index management (database)

_all : applies to every index

 curl -X POST http://node201.hadoop.com:9200/index001                   #--- create index
 curl -X DELETE http://node201.hadoop.com:9200/index001                 #--- delete index
 
 curl -X GET http://node201.hadoop.com:9200/index001/_mapping           #--- retrieve mapping
 curl -X GET http://node201.hadoop.com:9200/index001/_status            #--- check status
 curl -X GET http://node201.hadoop.com:9200/index001/_search            #--- search
 curl -X GET http://node201.hadoop.com:9200/_all/_search                #--- search

Type management (table)

 #--- create a type; the _id is generated automatically
 curl -X POST http://node201.hadoop.com:9200/index001/type001 -d '{ "title": "Greeting", "body": "Hello World!" }'
 curl -X DELETE http://node201.hadoop.com:9200/index001/type001         #--- delete type
 
 curl -X GET http://node201.hadoop.com:9200/index001/type001/_mapping   #--- retrieve mapping
 curl -X GET http://node201.hadoop.com:9200/index001/type001/_status    #--- check status
 curl -X GET http://node201.hadoop.com:9200/index001/type001/_search    #--- search
 http://node201.hadoop.com:9200/index001/type001/_search?q=title:Gre*ting

Mapping management (table schema)

 #--- create a mapping
 curl -X PUT http://node201.hadoop.com:9200/index001/type001/_mapping -d '{
 "type001": {
 "properties": {
  "title": { 
    "type": "string", 
    "index": "not_analyzed"
  }
 }
 }
 }'
 curl -X GET http://node201.hadoop.com:9200/index001/type001/_mapping   #--- retrieve mapping

Document management (record)

 #--- create a document
 curl -X POST http://node201.hadoop.com:9200/index001/type001/data001 -d '{ "title": "Greeting", "body": "Hello World!" }'
 curl -X POST http://node201.hadoop.com:9200/index001/type001/data001/_update -d '{ "doc": { "title": "Greeting", "body": "Hello World!" } }'
 curl -X DELETE http://node201.hadoop.com:9200/index001/type001/data001 #--- delete document data001
 
 curl -X GET http://node201.hadoop.com:9200/index001/type001/data001    #--- retrieve document data001
 #--- search documents
 curl -X GET http://node201.hadoop.com:9200/index001/type001/_search -d '{ "query": { "text": { "_all": "Hello" } } }'

q : search term, fieldName:fieldValue
default_operator=OR : default operator, AND or OR
fields=_source : fields to return
sort : sort order, field:asc, field:desc
timeout : search timeout, unlimited by default
size=10 : number of documents to return

 http://node201.hadoop.com:9200/index001/type001/_search?q=title:Gre*ting
 curl -X POST http://node201.hadoop.com:9200/index001/type001/_search -d '{ "query": { "term": { "title": "Greeting" } } }'
 curl -X POST http://node201.hadoop.com:9200/index001/type001/_search -d '{ "query": { "bool": { "must": { "match": { "title": "Greeting" } } } } }'

Prefix query

scoring_boolean
constant_score_boolean : does not compute a score
constant_score_filter : uses a filter
top_terms_n : similar to scoring_boolean, but keeps only the top n scoring terms
top_terms_boost_n : similar to top_terms_n, but the score is computed from the boost

 curl -X GET 'http://node201.hadoop.com:9200/index001/type001/_search?pretty' -d '{
 "query": {
 "prefix": {
  "name": {
    "value": "j",                        #--- finds words that start with j
    "rewrite": "constant_score_boolean"
  }
 }
 }
 }'

Rescore

 {
 "fields" : ["title", "available"],
 
 "query" : {
 "match_all" : {}
 },
 
 "rescore" : {
 "query" : {
  "rescore_query" : {
    "custom_score" : {
      "query" : {
        "match_all" : {}
      },
      "script" : "doc['year'].value"
    }
  }
 }
 }
 }

window_size
query_weight
rescore_query_weight
rescore_mode = total, max, min, avg, multiply
- total : original_query_score * query_weight + rescore_query_score * rescore_query_weight
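The "total" formula above can be written out as a small function to see how the two weights interact. This is only a worked example of that one formula; the function name `total_score` is hypothetical, and the default weights of 1.0 are an assumption of this sketch.

```python
def total_score(original_query_score, rescore_query_score,
                query_weight=1.0, rescore_query_weight=1.0):
    """Combine the scores as in rescore_mode "total"."""
    return (original_query_score * query_weight
            + rescore_query_score * rescore_query_weight)


# original score 2.0 weighted by 0.5, rescore score 3.0 weighted by 2.0
print(total_score(2.0, 3.0, query_weight=0.5, rescore_query_weight=2.0))
```

With these inputs the original query contributes 1.0 and the rescore query 6.0, so the rescore query dominates the final ordering.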

Bulk API

Create, update, and delete documents in bulk

 curl -XPOST 'node201.hadoop.com:9200/customer/external/_bulk?pretty' -d '
 {"index":{"_id":"1"}}
 {"name": "John Doe" }
 {"index":{"_id":"2"}}
 {"name": "Jane Doe" }
 '
 
 curl -XPOST 'node201.hadoop.com:9200/customer/external/_bulk?pretty' -d '
 {"update":{"_id":"1"}}
 {"doc": { "name": "John Doe becomes Jane Doe" } }
 {"delete":{"_id":"2"}}
 '

Bulk indexing with the documents.json file

 curl -XPOST http://node201.hadoop.com:9200/customer/external/_bulk?pretty --data-binary @documents.json

Multi Get

 curl http://node201.hadoop.com:9200/library/book/_mget?fields=title -d '{
 "ids" : [1,3]
 }'

MultiSearch

 curl http://node201.hadoop.com:9200/library/books/_msearch?pretty --data-binary '
 { "type" : "book" }
 { "filter" : { "term" : { "year" : 1936} }}
 { "search_type": "count" }
 { "query" : { "match_all" : {} }}
 { "index" : "library-backup", "type" : "book" }
 { "sort" : ["year"] }
 '

Sort

 {
 "query" : {
 "terms" : {
  "title" : ["crime", "front", "punishment"],
  "minimum_match" : 1
 }
 },
 "sort" : [
 { "section" : "desc" }
 #-- {"release_dates" : { "order" : "asc", "mode" : "min" }}
 #-- min, max, avg, sum
 ]
 }

Indexing data

A shard is usually sized at a minimum of 1 to 10 GB (and kept within 50 GB at most)

Number of nodes <= number of shards

Number of replicas : 1 to (number of nodes - 1)

{| border="1" cellspacing="0" cellpadding="2" style="width: 972px;"
|-
| style="text-align: center; background-color: rgb(241, 241, 241);" | API type
| style="text-align: center; background-color: rgb(241, 241, 241);" | Details
|-
| style="text-align: center;" | REST API
| curl -XPUT http://localhost:9200/blog/article/1 -d "{~}"
|-
| style="text-align: center;" | Bulk API
| curl -s -XPUT http://localhost:9200/market/_bulk --data-binary @market.json
|-
| style="text-align: center;" | UDP Bulk API
|
|-
| style="text-align: center;" | River API
|
|}

User Query DSL

[Lucene Query language](http://www.jopenbusiness.com/mediawiki/Lucene%23Lucene%20Query%20language)

TF/IDF (Term Frequency / Inverse Document Frequency)

Document boost
Field boost
Coord
Inverse document frequency
Length norm
Term frequency
Query norm

http://www.jopenbusiness.com/mediawiki/images/3/36/LuceneScore02.png http://www.jopenbusiness.com/mediawiki/images/5/5a/LuceneScore01.png http://www.jopenbusiness.com/mediawiki/images/3/36/LuceneScore02.png

q : Query
d : Document

Query type

custom_boost_factor
constant_score
custom_score

References

ElasticSearch
[[Category:Search]] Category: BigData

Last modified: 2024-09-30 12:26:18