Mat Brown
@0utoftime
http://github.com/outoftime
A standalone distributed HTTP-based fulltext search index
Runs as its own instance (or cluster)
Designed to distribute both data and processing over many machines
All interaction is via a RESTful, JSON API
Efficiently retrieve documents whose text content matches user input
$ wget https://github.com/downloads/elasticsearch/elasticsearch/elasticsearch-0.17.6.tar.gz
$ tar -xvf elasticsearch-0.17.6.tar.gz
$ cd elasticsearch-0.17.6
$ ./bin/elasticsearch -f
> post('/myapp/post', { :title => 'My First Post', :blog_id => 1 })
{
"ok" => true,
"_index" => "myapp",
"_type" => "post",
"_id" => "lVRL_5FERsWW2tf_oX1WJw",
"_version" => 1
}
JSON in, JSON out
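The `post`, `put`, and `get` calls in these console examples are never defined on the slides. Here is a minimal sketch of how such helpers might be built on Ruby's standard library, assuming a node on localhost at port 9200 (ElasticSearch's default HTTP port). The names `build_request`, `es_call`, `ES_HOST`, and `ES_PORT` are hypothetical and not part of any client library.

```ruby
require 'net/http'
require 'json'

# Assumed location of a local node; adjust for your own setup.
ES_HOST = 'localhost'
ES_PORT = 9200

# Map the verbs used in the console examples onto Net::HTTP request classes.
REQUEST_CLASSES = {
  get: Net::HTTP::Get,
  post: Net::HTTP::Post,
  put: Net::HTTP::Put
}.freeze

# Build (but do not send) a request whose body is the JSON form of `body`.
def build_request(verb, path, body = nil)
  request = REQUEST_CLASSES.fetch(verb).new(path)
  unless body.nil?
    request['Content-Type'] = 'application/json'
    request.body = JSON.generate(body)
  end
  request
end

# Send the request and parse the JSON reply into a Ruby hash.
def es_call(verb, path, body = nil)
  response = Net::HTTP.start(ES_HOST, ES_PORT) do |http|
    http.request(build_request(verb, path, body))
  end
  JSON.parse(response.body)
end

def get(path)
  es_call(:get, path)
end

def post(path, body)
  es_call(:post, path, body)
end

def put(path, body)
  es_call(:put, path, body)
end
```

With these in place, `post('/myapp/post', { :title => 'My First Post', :blog_id => 1 })` would send the hash as JSON and return the parsed reply, provided a node is actually listening.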
You can have as many indexes as you like.
> put('/myapp/post/_mapping', {
:post => {
:properties => {
:title => { :type => 'string', :index => 'analyzed' }
}
}
})
Almost everything in ElasticSearch is configurable at runtime, via the JSON API.
> post('/myapp/post/_search', { :query => { :term => { :blog_id => 1 }}})
{
"hits" => {
"total" => 1,
"max_score" => 1.0,
"hits" => [
[0] {
"_index" => "myapp",
"_type" => "post",
"_id" => "sHUhJNq6R7-8atpaXdzcfw",
"_score" => 1.0,
"_source" => {
"title" => "My First Post",
"blog_id" => 1
}
}
]
}
}
JSON search DSL exposes the underlying Lucene search API in tremendous detail.
_source
ElasticSearch retains the original JSON document that you sent it, and returns it with search results (by default).
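Because each hit carries its `_source`, the original documents can be pulled straight out of a parsed search response. A small sketch, assuming the response has already been parsed into a Ruby hash shaped like the search result above; `SAMPLE_RESPONSE` and `sources` are hypothetical names used only for illustration.

```ruby
# A search response shaped like the one in the search example, as a parsed hash.
SAMPLE_RESPONSE = {
  'hits' => {
    'total' => 1,
    'max_score' => 1.0,
    'hits' => [
      {
        '_index' => 'myapp',
        '_type' => 'post',
        '_id' => 'sHUhJNq6R7-8atpaXdzcfw',
        '_score' => 1.0,
        '_source' => { 'title' => 'My First Post', 'blog_id' => 1 }
      }
    ]
  }
}.freeze

# Collect the original documents, skipping any hit that carries no _source.
def sources(response)
  response.fetch('hits').fetch('hits').map { |hit| hit['_source'] }.compact
end

sources(SAMPLE_RESPONSE).each { |doc| puts doc['title'] }
```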
> get('/myapp/post/sHUhJNq6R7-8atpaXdzcfw')
{
"_index" => "myapp",
"_type" => "post",
"_id" => "sHUhJNq6R7-8atpaXdzcfw",
"_version" => 1,
"exists" => true,
"_source" => {
"title" => "My First Post",
"blog_id" => 1
}
}
A standalone distributed HTTP-based fulltext search index.
An HTTP-based distributed document-oriented data store with a search-based query system.
{
"post": {
"properties": {
"author": {
"properties": {
"name": { "type": "string" },
"department_id": { "type": "integer" }
}
}
}
}
}
{ "author": { "name": "Mat", "department_id": 4 }}
{ "query": { "term": { "author.name": "Mat" }}}
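The dotted name `author.name` in the query addresses the `name` property nested under `author`. The sketch below only illustrates how such a path maps onto the nested document; it is not how ElasticSearch resolves fields internally, and it ignores analysis (a real `term` query compares against the indexed tokens, not the raw value). `value_at` is a hypothetical helper.

```ruby
# Follow a dotted path such as "author.name" through nested hashes.
# Returns nil when any key along the path is missing.
def value_at(document, dotted_path)
  document.dig(*dotted_path.split('.'))
end

document = { 'author' => { 'name' => 'Mat', 'department_id' => 4 } }

puts value_at(document, 'author.name')
puts value_at(document, 'author.department_id')
```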
Index the same data in different ways.
We might want to index a title three ways: as a scalar string, as standard analyzed fulltext, and as substrings.
{
"post": {
"properties": {
"title": {
"type": "multi_field",
"fields": {
"title": { "type": "string", "index": "analyzed" },
"scalar": { "type": "string", "index": "not_analyzed" },
"substrings": { "type": "string", "index": "analyzed", "analyzer": "n_gram" }
}
}
}
}
}
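To make the "substrings" variant concrete, here is a rough sketch of the kind of tokens an n-gram analysis produces. The `n_gram` analyzer named in the mapping is presumably a custom analyzer defined in index settings that the slides do not show, so the lowercasing and the gram sizes used here are assumptions, and `ngrams` is a hypothetical helper rather than anything ElasticSearch exposes.

```ruby
# Return every substring of `text` whose length is between min and max,
# lowercased, shortest grams first.
def ngrams(text, min, max)
  chars = text.downcase.each_char.to_a
  (min..max).flat_map do |size|
    chars.each_cons(size).map(&:join)
  end
end

# Grams of length 2 and 3 for a single word.
p ngrams('Pizza', 2, 3)
# => ["pi", "iz", "zz", "za", "piz", "izz", "zza"]
```

Indexing these grams is what lets a partial input such as "izz" match the title, at the cost of a larger index.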
{
"query": {
"filtered": {
"query": {
"query_string": {
"query": "delicious pizza",
"fields": ["title", "body", "tags", "author.name"]
}
},
"filter": {
"or": [
{ "term": { "blog_id": 1 }},
{
"geo_distance": {
"distance": "5mi",
"location": {
"lat": 40,
"lon": -70
}
}
}
]
}
}
}