Greetings!
Elasticsearch provides several methods for updating documents. However, not all of them are suitable for every use case. For instance, data migrations or scheduled ETL processes might lead to unexpected behavior and degrade the user experience. In these scenarios, Elasticsearch aliases can be leveraged to achieve seamless zero-downtime updates.


What is an Alias?
An alias is a logical name that acts as a pointer to one or more physical indexes. Instead of interacting directly with the index itself, applications or users can perform operations like indexing and search using an alias.
Alias Switching
Now, imagine we have index_v1 and want to transition to index_v2. The process involves simply updating the alias: removing index_v1 and adding index_v2. Since the applications interact with the alias rather than the indices directly, they remain unaware of the change, so the update occurs seamlessly without any downtime.
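To make the pointer idea concrete, here is a minimal in-memory sketch in Python. It is a toy model, not the Elasticsearch API: the AliasTable class and the alias name my_alias are hypothetical and exist only for illustration. It mimics how a batch of add/remove actions moves an alias from index_v1 to index_v2 in one step.

```python
class AliasTable:
    """Toy stand-in for a cluster's alias bookkeeping (not the Elasticsearch API)."""

    def __init__(self):
        self._aliases = {}  # alias name -> set of index names

    def apply(self, actions):
        """Apply every action to a copy, then publish the result in one assignment."""
        updated = {name: set(indices) for name, indices in self._aliases.items()}
        for action in actions:
            (kind, spec), = action.items()  # each action has exactly one key
            targets = updated.setdefault(spec["alias"], set())
            if kind == "add":
                targets.add(spec["index"])
            elif kind == "remove":
                targets.discard(spec["index"])
            else:
                raise ValueError(f"unsupported action: {kind}")
        self._aliases = updated

    def resolve(self, alias):
        """Return the indices an alias currently points to."""
        return sorted(self._aliases.get(alias, set()))


table = AliasTable()
table.apply([{"add": {"index": "index_v1", "alias": "my_alias"}}])
before = table.resolve("my_alias")

# The switch: remove the old index and add the new one in a single batch.
table.apply([
    {"remove": {"index": "index_v1", "alias": "my_alias"}},
    {"add": {"index": "index_v2", "alias": "my_alias"}},
])
after = table.resolve("my_alias")
print(before, after)
```

A caller that only ever asks for my_alias never sees a state in which the alias points to nothing, which is the property the real switch relies on.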
Steps for Zero Downtime Update
Before jumping in, let’s summarize the necessary steps:
- Create an index template with index patterns.
- Create movies_v1 and movies_v2 indexes.
- Attach movies_v1 as the initial index to the alias movies.
- Verify the alias and index.
- Insert data into movies_v1.
- Verify the data by fetching it from the movies alias.
- Updates: Insert data into movies_v2.
- Switch the alias to point to movies_v2.
- Verify the data by fetching it from the movies alias.
Step 1: Create the index template
The first step is to create a template for our index, specifying the necessary mappings. The key aspect to note in this step is the use of "index_patterns". Any index whose name matches one of the specified patterns will inherit the same configuration, which lets us reuse the template definition.

PUT /_index_template/movies_template
{
  "index_patterns": ["movies_v1", "movies_v2"],
  "template": {
    "mappings": {
      "properties": {
        "id": { "type": "keyword" },
        "title": { "type": "text" },
        "genre": { "type": "keyword" }
      }
    }
  }
}
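Index patterns may also contain the * wildcard, so a single pattern such as movies_v* (a hypothetical alternative, not used in this article) would cover both indexes and any later version. The Python sketch below approximates the matching with the standard library's fnmatchcase. It is only an approximation: fnmatch also gives special meaning to ? and [], which Elasticsearch index patterns do not.

```python
from fnmatch import fnmatchcase


def template_applies(index_name, index_patterns):
    """Return True if the index name matches at least one pattern.

    Approximation: only the '*' wildcard behaves like an Elasticsearch
    index pattern here.
    """
    return any(fnmatchcase(index_name, pattern) for pattern in index_patterns)


explicit_patterns = ["movies_v1", "movies_v2"]  # the patterns used in this article
wildcard_patterns = ["movies_v*"]               # hypothetical alternative

for name in ["movies_v1", "movies_v3", "books_v1"]:
    print(name,
          template_applies(name, explicit_patterns),
          template_applies(name, wildcard_patterns))
```

With the explicit list, an index named movies_v3 would not pick up the template; with the wildcard it would.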
Step 2: Create indexes
After creating the index template, we can create an index without specifying any mappings, as the configuration will automatically be inherited from the template.

PUT /movies_v1
GET /movies_v1

PUT /movies_v2
GET /movies_v2
Step 3: Attach the alias
Our application will only be aware of the alias when performing searches. Therefore, we need to associate an index with the alias.

POST /_aliases
{
  "actions": [
    {
      "add": {
        "index": "movies_v1",
        "alias": "movies"
      }
    }
  ]
}
Step 4: Verify the alias
After associating the alias, we can inspect it and perform searches through it.

GET /_alias
GET /_alias/movies
GET /movies/_alias

GET /movies
GET /movies/_search
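A scheduled job can use the same lookup to find out which index is currently live. The Python sketch below assumes the response of GET /_alias/movies has the documented shape, with index names as top-level keys and an "aliases" object under each. The sample_response value is hand-written for illustration, not captured from a cluster, and indices_behind_alias is a hypothetical helper name.

```python
def indices_behind_alias(alias_response, alias):
    """List the indices that carry the given alias in a GET /_alias/<alias> response."""
    return sorted(
        index_name
        for index_name, info in alias_response.items()
        if alias in info.get("aliases", {})
    )


# Hand-written sample modelled on the response shape (an assumption, not real output).
sample_response = {
    "movies_v1": {"aliases": {"movies": {}}},
}

active = indices_behind_alias(sample_response, "movies")
print(active)
```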
Step 5: Insert initial data
During the initial run (such as an ETL or other resource-intensive process), we insert the data into the first index.

PUT /movies_v1/_doc/1
{
  "id": "1",
  "title": "Rurouni Kenshin",
  "genre": "Action"
}
Step 6: Verify initial data
Now, we can verify the data by querying through the alias.

GET /movies/_search
Step 7: Updates: Insert data into the second index
Now, imagine we want to update our data using an ETL job. Instead of updating movies_v1 directly, we insert the data into movies_v2.

PUT /movies_v2/_doc/1
{
  "id": "1",
  "title": "Rurouni Kenshin",
  "genre": "Action"
}

PUT /movies_v2/_doc/2
{
  "id": "2",
  "title": "The Great Battle",
  "genre": "Historical"
}
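This article indexes documents one at a time. A real ETL job would more likely batch them through the _bulk endpoint, which is a swapped-in technique rather than part of the walkthrough. As a rough Python sketch, the hypothetical build_bulk_body helper below produces the newline-delimited JSON body that could be sent to POST /movies_v2/_bulk with the Content-Type header set to application/x-ndjson.

```python
import json


def build_bulk_body(docs):
    """Build an NDJSON bulk body: one action line plus one source line per document."""
    lines = []
    for doc in docs:
        lines.append(json.dumps({"index": {"_id": doc["id"]}}))  # action line
        lines.append(json.dumps(doc))                            # document source
    # The bulk API expects the body to end with a newline.
    return "\n".join(lines) + "\n"


movies = [
    {"id": "1", "title": "Rurouni Kenshin", "genre": "Action"},
    {"id": "2", "title": "The Great Battle", "genre": "Historical"},
]

bulk_body = build_bulk_body(movies)
print(bulk_body)
```

Because the target index is given in the request path, the action lines only need to carry the document id.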
Step 8: Switch the alias
After step 7, our latest data resides in movies_v2, but the application is still pointing to movies_v1 through the alias. Instead of making any changes to the application, we simply remove movies_v1 from the alias and associate movies_v2 with it in a single request.

POST /_aliases
{
  "actions": [
    {
      "remove": {
        "index": "movies_v1",
        "alias": "movies"
      }
    },
    {
      "add": {
        "index": "movies_v2",
        "alias": "movies"
      }
    }
  ]
}
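When the switch is triggered from a script rather than from the console, the same request can be assembled in Python with only the standard library. This is a sketch under assumptions: the base URL http://localhost:9200 and the build_alias_swap_request helper are hypothetical, there is no authentication, and the request is built but not sent.

```python
import json
from urllib.request import Request

ES_URL = "http://localhost:9200"  # assumption: local, unsecured cluster


def build_alias_swap_request(alias, old_index, new_index, base_url=ES_URL):
    """Build (but do not send) the POST /_aliases request that swaps the alias."""
    body = {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }
    return Request(
        url=f"{base_url}/_aliases",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_alias_swap_request("movies", "movies_v1", "movies_v2")
print(request.get_method(), request.full_url)
# To send it against a running cluster: urllib.request.urlopen(request)
```

Putting both actions in one request matters: they are applied together, so searches through movies never hit an alias with no index behind it.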
Step 9: Verify data
As usual, verify the data by fetching it through the alias.

GET /movies/_search
Step 10: Delete data before indexing
If this is a recurring scheduled task, creating new indexes every day (or every week, depending on the schedule) could become problematic. Instead, we might prefer to maintain two indexes and alternate between them:
- If the active index is movies_v2, insert the latest data into movies_v1.
- If the active index is movies_v1, insert the latest data into movies_v2.
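Before loading, clear out the index that is about to receive the data (movies_v1 here), either by deleting its documents:

POST /movies_v1/_delete_by_query
{
  "query": {
    "match_all": {}
  }
}

or, depending on the need, by deleting and recreating the index, which can be faster than deleting individual records:

DELETE /movies_v1
PUT /movies_v1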
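The alternation between the two indexes can be expressed as a small Python helper. This is a minimal sketch: the function name standby_index is hypothetical, and it assumes the job already knows which index the alias currently points to.

```python
INDEX_PAIR = ("movies_v1", "movies_v2")


def standby_index(active_index, pair=INDEX_PAIR):
    """Return the index that is NOT behind the alias, i.e. the one to reload next."""
    if active_index not in pair:
        raise ValueError(f"{active_index!r} is not one of {pair}")
    first, second = pair
    return second if active_index == first else first


print(standby_index("movies_v2"))  # the index to clear and refill on this run
print(standby_index("movies_v1"))
```

Each run clears and refills the standby index, then switches the alias to it, so the two indexes trade roles on every run.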
Conclusion
In this article, we explored how to leverage Elasticsearch aliases and index templates to achieve zero-downtime updates. This approach is a clean and efficient method for handling data migrations, scheduled jobs, and a variety of use cases where minimizing downtime is critical. By using aliases, we can seamlessly switch between indices without affecting the application's operation, allowing for smooth updates and ensuring a better user experience during data transitions.
References
https://www.elastic.co/guide/en/elasticsearch/reference/current/aliases.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-put-template.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-get-template.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/indices-delete-index.html