Elasticsearch is a distributed search and analytics engine for indexing, searching, and analyzing large volumes of data in near real time. It is built on Apache Lucene, a high-performance indexing and search library, and exposes a REST API that lets you work with your data through search queries, aggregations, and more.
In this tutorial, we will cover the basics of working with Elasticsearch and provide you with a step-by-step guide on how to set up Elasticsearch on your own machine, index data, and perform basic search and aggregation queries.
Prerequisites
Before getting started, you’ll need the following:
- A basic understanding of REST APIs
- A machine running Ubuntu 18.04 (or any other recent Linux distribution)
- Optionally, Java 8 or later installed on your machine (the Elasticsearch archive used in this tutorial bundles its own JDK)
For the purpose of this tutorial, we’ll be installing Elasticsearch on Ubuntu 18.04.
Step 1: Install Elasticsearch
The first step to getting started with Elasticsearch is to install it on your machine. Here are the steps for installing Elasticsearch on Ubuntu 18.04:
1.1: Install Java 8 or Later
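Elasticsearch runs on the Java Virtual Machine. The 7.x archive used in this tutorial ships with a bundled JDK, so this step is only needed if you want Elasticsearch to use a system-wide Java installation (selected through the JAVA_HOME environment variable). Here’s how to install Java 8: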
sudo apt-get update
sudo apt-get install openjdk-8-jdk
Once you’ve installed Java, you can verify the installation by running the following command:
java -version
1.2: Download and Install Elasticsearch
The next step is to download and extract the Elasticsearch archive. The same Linux archive is used whichever Java version you have installed, because it includes its own JDK. Here’s how to download, extract, and start Elasticsearch 7.9.3:
wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.3-linux-x86_64.tar.gz
tar -xzf elasticsearch-7.9.3-linux-x86_64.tar.gz
cd elasticsearch-7.9.3
./bin/elasticsearch
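This will start Elasticsearch in the foreground, so leave it running and open a second terminal for the rest of the tutorial. By default the node listens on http://localhost:9200, which you can check with curl http://localhost:9200. Note that Elasticsearch refuses to start when run as the root user.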
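As a quick check that the node is up, the root endpoint can be queried from a script. The following is a minimal sketch using only the Python standard library; the function names are hypothetical, and the default address http://localhost:9200 is an assumption that holds only for an unmodified local install.

```python
import json
import urllib.error
import urllib.request


def parse_version(root_response_text):
    """Extract the version number from the JSON returned by the root endpoint."""
    info = json.loads(root_response_text)
    return info["version"]["number"]


def elasticsearch_version(base_url="http://localhost:9200", timeout=5):
    """Return the node's version string, or None if the node is not reachable."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return parse_version(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, and similar network errors.
        return None
```

Calling elasticsearch_version() shortly after starting the node may return None for a few seconds while Elasticsearch is still booting.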
Step 2: Index Data
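Once Elasticsearch is running, you can start indexing data. In Elasticsearch, data is stored in indices, which are loosely comparable to tables in a relational database. The requests in the rest of this tutorial are written in the console format used by Kibana’s Dev Tools (HTTP method, path, then a JSON body); to run them with curl instead, send the same method and body to http://localhost:9200 followed by the path, with a Content-Type: application/json header. Here’s how to create an index and add data to it: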
2.1: Create an Index Mapping
Elasticsearch can infer field types automatically when you index a document (dynamic mapping), but defining an explicit mapping up front gives you control over how each field is indexed. A mapping defines the fields that your index will contain and their data types. Here’s an example mapping for a blog post index:
PUT /blog_post
{
  "mappings": {
    "properties": {
      "title": {
        "type": "text"
      },
      "content": {
        "type": "text"
      },
      "tags": {
        "type": "keyword"
      },
      "date": {
        "type": "date"
      }
    }
  }
}
This mapping defines four fields: title, content, tags, and date. The title and content fields are of type text, which means their values are analyzed (split into individual terms) so they can be matched by full-text search. The tags field is of type keyword, which means each value is indexed as a single exact term, suitable for exact-match filtering, sorting, and aggregations. The date field is of type date, which means it can be used for date-based queries such as ranges.
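To make the difference between text and keyword concrete, here is a rough Python sketch of what happens to a value at index time. It is only an approximation: the real standard analyzer uses Unicode text segmentation rather than a regular expression, and both function names are hypothetical.

```python
import re


def analyze_as_text(value):
    """Approximate a text field: split the value into word tokens and lowercase them."""
    return [token.lower() for token in re.findall(r"\w+", value)]


def analyze_as_keyword(value):
    """A keyword field keeps the whole value as one unmodified term."""
    return [value]


title = "Getting started with Elasticsearch"
print(analyze_as_text(title))     # ['getting', 'started', 'with', 'elasticsearch']
print(analyze_as_keyword(title))  # ['Getting started with Elasticsearch']
```

This is why a search for elasticsearch can match the title above when it is mapped as text, whereas a keyword field would only match the complete, exact string.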
2.2: Index Data
Now that you have a mapping for your index, you can start indexing data. Here’s how to index a blog post:
PUT /blog_post/_doc/1
{
  "title": "Getting started with Elasticsearch",
  "content": "Elasticsearch is a distributed search and analytics engine that is used to index, search, and analyze large volumes of data.",
  "tags": ["elasticsearch", "tutorial", "search"],
  "date": "2020-11-18"
}
This will create a new document in the blog_post index with an ID of 1.
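The same indexing request can be sent from a script instead of a console. The sketch below uses only the Python standard library and assumes a node at http://localhost:9200; build_request and send are hypothetical helper names, not part of any Elasticsearch client.

```python
import json
import urllib.request
from datetime import date

ES_URL = "http://localhost:9200"  # assumption: default address of a local node


def build_request(method, path, body=None):
    """Turn a console-style request (method, path, JSON body) into an HTTP request."""
    data = None if body is None else json.dumps(body).encode("utf-8")
    return urllib.request.Request(
        ES_URL + path,
        data=data,
        headers={"Content-Type": "application/json"},
        method=method,
    )


def send(request):
    """Send the request and decode the JSON response (needs a running node)."""
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read().decode("utf-8"))


post = {
    "title": "Getting started with Elasticsearch",
    "content": "Elasticsearch is a distributed search and analytics engine "
               "that is used to index, search, and analyze large volumes of data.",
    "tags": ["elasticsearch", "tutorial", "search"],
    # ISO 8601 dates such as 2020-11-18 are accepted by the default date format.
    "date": date(2020, 11, 18).isoformat(),
}

index_post = build_request("PUT", "/blog_post/_doc/1", post)
print(index_post.get_method(), index_post.full_url)
# To actually index the document, call: send(index_post)
```

Building the request separately from sending it keeps the example inspectable without a running node; urllib raises an HTTPError if Elasticsearch rejects the request.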
Step 3: Search and Analyze Data
Now that you’ve indexed some data, you can start querying it. Both search queries and aggregations are sent to the _search endpoint of the REST API.
3.1: Simple Search Query
Here’s an example of a simple search query that searches for blog posts that contain the word elasticsearch in the title:
GET /blog_post/_search
{
  "query": {
    "match": {
      "title": "elasticsearch"
    }
  }
}
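This query uses the match query to search for the term elasticsearch in the title field. Because title is a text field, the query text is analyzed the same way as the indexed text, so the match is case-insensitive and the sample post titled "Getting started with Elasticsearch" is returned. The query does not look at the content field; to search several fields at once, use a multi_match query and list the fields to search.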
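As an illustration, the request body for a single-field or multi-field search can be assembled programmatically, and the titles pulled out of the response. This is a sketch with hypothetical helper names; it builds and reads plain dictionaries and leaves sending the request to whichever HTTP client you use.

```python
import json


def build_search_body(text, fields):
    """Build a match query for one field, or a multi_match query for several."""
    fields = list(fields)
    if len(fields) == 1:
        return {"query": {"match": {fields[0]: text}}}
    return {"query": {"multi_match": {"query": text, "fields": fields}}}


def hit_titles(response):
    """Collect the title of every document in a decoded _search response."""
    return [hit["_source"]["title"] for hit in response["hits"]["hits"]]


title_only = build_search_body("elasticsearch", ["title"])
title_or_content = build_search_body("elasticsearch", ["title", "content"])
print(json.dumps(title_or_content, indent=2))
```

The title_only body is identical to the query shown above, while title_or_content matches posts that mention the word in either field.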
3.2: Aggregations
Elasticsearch also supports aggregations, which allow you to summarize and analyze data. Here’s an example of a simple aggregation that counts the number of blog posts for each tag:
GET /blog_post/_search
{
  "aggs": {
    "tags": {
      "terms": {
        "field": "tags"
      }
    }
  }
}
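This query uses the terms aggregation to group the documents by the values in their tags field. Elasticsearch will return the most common tags (the top 10 by default), along with the number of documents that have each tag. The response also includes the matching documents themselves; add "size": 0 to the request body if you only need the aggregation results.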
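The aggregation results come back as a list of buckets, each with a key and a doc_count. The following sketch, with hypothetical function names, builds the request body (with hits suppressed) and flattens the buckets of a decoded response into a dictionary.

```python
def build_tag_count_body(max_tags=10):
    """Build a terms aggregation on the tags field that returns no search hits."""
    return {
        "size": 0,  # skip the hits; only the aggregation results are needed
        "aggs": {
            "tags": {
                "terms": {"field": "tags", "size": max_tags}
            }
        },
    }


def tag_counts(response):
    """Map each tag to its document count from a decoded _search response."""
    buckets = response["aggregations"]["tags"]["buckets"]
    return {bucket["key"]: bucket["doc_count"] for bucket in buckets}
```

The "tags" key under "aggregations" is the name chosen for the aggregation in the request, so it changes if you rename the aggregation.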
Conclusion
In this tutorial, we’ve covered the basics of working with Elasticsearch, including how to install Elasticsearch, index data, and perform basic search and aggregation queries. Elasticsearch is a powerful tool for search and analytics, and it’s used by some of the world’s largest companies to index and analyze large volumes of data. With Elasticsearch, you can search, filter, and aggregate your data in near real time, which makes it a strong foundation for search and analytics features.