A Beginner’s Guide to Elasticsearch CRUD Operations and Search with Python

Greetings!

Search functionality is a common feature in most applications today. Elasticsearch, one of the most powerful and widely used search engines, offers efficient solutions for building search applications. In this article, we will create a simple REST application with a search endpoint using Elasticsearch and Python Flask. The code is straightforward and easy to understand.

Setup Elasticsearch

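We use the Elasticsearch Docker image to run a local single-node cluster alongside Kibana, but any other way of setting up a cluster works too. Save the following Compose file as es-docker.yml. Note that it disables security (xpack.security.enabled=false), which is convenient for local development but should not be carried over to production.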

version: "3.9"
services:
  elasticsearch:
    image: elasticsearch:8.6.2
    environment:
    - xpack.security.enabled=false
    - discovery.type=single-node
    - ES_JAVA_OPTS=-Xms1g -Xmx1g
    volumes:
    - es_data:/usr/share/elasticsearch/data
    ports:
    - target: 9200
      published: 9200
    networks:
    - elastic

  kibana:
    image: kibana:8.6.2
    ports:
    - target: 5601
      published: 5601
    depends_on:
    - elasticsearch
    networks:
    - elastic

volumes:
  es_data:
    driver: local

networks:
  elastic:
    name: elastic
    driver: bridge

docker-compose -f es-docker.yml up

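Once the containers are up, you can verify the setup by opening http://localhost:9200 in a browser. Elasticsearch responds with a JSON document describing the cluster, including its version.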
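
The same check can be scripted. The sketch below uses only the standard library and assumes the cluster is reachable at http://localhost:9200 with security disabled, as in the Compose file above. The helper names parse_version and cluster_version are made up for this example.

```python
import json
from urllib.request import urlopen


def parse_version(body):
    """Extract the version number from the JSON served at the cluster root."""
    return json.loads(body)["version"]["number"]


def cluster_version(base_url="http://localhost:9200"):
    """Fetch the root endpoint and return the reported Elasticsearch version."""
    with urlopen(base_url, timeout=5) as response:
        return parse_version(response.read())
```

With the containers running, calling cluster_version() should return the version of the image in use.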

Install dependencies

First, let’s install the necessary dependencies.

python3 -m venv venv
source venv/bin/activate

pip install Flask elasticsearch

Initialize Flask

Next, create app.py with a simple hello route to confirm that Flask is working.

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route('/hello', methods=['GET'])
def hello():
    return jsonify({ "message": "Hello Elasticsearch" }), 200
    
if __name__ == '__main__':
    app.run(port=5000, debug=True)

python3 app.py

Elasticsearch Mappings and Index Creation

An Elasticsearch index is a logical namespace that stores a collection of related documents, similar to a database in relational systems.
Mappings define the structure and data types of fields within an index, specifying how data should be indexed and stored to enable precise search and analysis.
We will create a "movies" index with the fields we need, using a helper function that creates the index only if it does not already exist. Rather than relying on Elasticsearch's automatic type detection, we supply explicit mappings. This gives more control over search behavior: title and director are mapped as text, so they are analyzed for full-text search, while genre and language are mapped as keyword, so they are stored as exact values that can be matched precisely, for example when searching by genre.

import json
from elasticsearch import Elasticsearch

es = Elasticsearch("http://localhost:9200")
index_name = "movies"

def create_index():
    mappings = {
        "mappings": {
            "properties": {
                "id": {"type": "keyword"},
                "title": {"type": "text"},
                "director": {"type": "text"},
                "year": {"type": "integer"},
                "language": {"type": "keyword"},
                "genre": {"type": "keyword"},
                "rating": {"type": "float"},
            }
        }
    }
    if not es.indices.exists(index=index_name):
        es.indices.create(index=index_name, body=mappings)
        print(f"Index '{index_name}' created.")
    else:
        print(f"Index '{index_name}' already exists.")
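
To illustrate why genre is mapped as keyword, here is a minimal sketch of a genre lookup. The article does not define a genre endpoint, so the helper name build_genre_query is hypothetical; it only builds the request body.

```python
def build_genre_query(genre):
    """Return a search body that matches movies tagged with exactly this genre."""
    return {
        "query": {
            # A term query is not analyzed, so the value must equal the stored
            # keyword exactly, including case ("Sci-Fi", not "sci-fi").
            "term": {"genre": genre}
        }
    }
```

The body could then be passed to the client in the same way as the other queries in this article, for example es.search(index=index_name, body=build_genre_query("Sci-Fi")). Because genre holds a list, a movie matches when any one of its genres equals the given value.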

With the index helper in place, we can load a few sample movies from a JSON file for testing.

movies.json

[
    {
        "id": "tt0133093",
        "title": "The Matrix",
        "director": "Lana Wachowski, Lilly Wachowski",
        "year": 1999,
        "language": "English",
        "genre": [
            "Sci-Fi",
            "Action"
        ],
        "rating": 8.7
    },
    {
        "id": "tt0245429",
        "title": "Spirited Away",
        "director": "Hayao Miyazaki",
        "year": 2001,
        "language": "Japanese",
        "genre": [
            "Animation",
            "Fantasy"
        ],
        "rating": 8.6
    }
]

Elasticsearch provides a bulk endpoint that is better suited to inserting large amounts of data. To keep things simple for learning purposes, the route below indexes the documents one at a time.

@app.route('/init', methods=['GET'])
def init():
    create_index()
    with open("movies.json") as movies_file:
        movies = json.load(movies_file)
        for movie in movies:
            es.index(document=movie, id=movie["id"], index=index_name)
    return jsonify({ "message": "Movies initiated in movies index"}), 200
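
For comparison, the bulk approach mentioned above could look roughly like this. It is a sketch, assuming the bulk helper from elasticsearch.helpers that ships with the Python client; the function names bulk_actions and bulk_load are invented for illustration.

```python
def bulk_actions(movies, index_name):
    """Yield one action per movie in the shape elasticsearch.helpers.bulk accepts."""
    for movie in movies:
        yield {
            "_index": index_name,
            "_id": movie["id"],  # reuse the movie id as the document id
            "_source": movie,
        }


def bulk_load(client, movies, index_name):
    """Index all movies through the bulk helper; returns (success_count, errors)."""
    # Imported here so bulk_actions stays usable without the client installed.
    from elasticsearch.helpers import bulk

    return bulk(client, bulk_actions(movies, index_name))
```

Inside the /init route, the loop could then be replaced with a single call such as bulk_load(es, movies, index_name). The helper splits the actions into chunks and sends each chunk as one request.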

Get all movies

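This endpoint provides a simple way to view the movie collection. It uses a match_all query, which matches every document in the index. Note that a search returns only the first 10 hits by default, so a larger collection needs the size parameter or pagination.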

@app.route('/movies', methods=['GET'])
def get_movies():
    query = {
        "query": {
            "match_all": {}
        }
    }
    result = es.search(index=index_name, body=query)
    return jsonify([hit["_source"] for hit in result["hits"]["hits"]]), 200
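
Since only the first 10 hits come back by default, a paginated version of the query body might look like the sketch below. The helper name build_list_query, the 1-based page parameter, and the cap of 100 are assumptions made for this example.

```python
def build_list_query(page=1, size=10):
    """Return a match_all body for the given 1-based page."""
    page = max(page, 1)
    size = max(1, min(size, 100))  # arbitrary upper bound chosen for this sketch
    return {
        "from": (page - 1) * size,  # number of hits to skip
        "size": size,               # number of hits to return
        "query": {"match_all": {}},
    }
```

The route could read page and size from request.args and pass the resulting body to es.search. Elasticsearch limits from plus size to 10,000 by default, so this style suits small collections like the one here.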

Get a movie by ID

Fetch the details of a specific movie by its document ID. The route wraps Elasticsearch's get-document API, which corresponds to a request like this:

GET /movies/_doc/tt1375666

@app.route('/movies/<id>', methods=['GET'])
def get_movie_by_id(id):
    try:
        result = es.get(index=index_name, id=id)
        return jsonify(result["_source"])
    except Exception as e:
        # Return the error as a JSON object, matching the other routes.
        return jsonify({"error": str(e)}), 404
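
The catch-all except above reports every failure as 404, including a cluster that is unreachable. One way to separate "not found" from other errors is to check for the document first. This is a sketch, assuming the 8.x Python client; get_movie_or_none is a hypothetical helper, and it costs a second round trip.

```python
def get_movie_or_none(client, index_name, movie_id):
    """Return the stored movie, or None when no document has this id."""
    # exists() sends a HEAD request and evaluates to False for a missing document.
    if not client.exists(index=index_name, id=movie_id):
        return None
    return client.get(index=index_name, id=movie_id)["_source"]
```

The route can then return 404 only when the helper returns None, and let other exceptions surface as server errors.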

Create a movie

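Add a new movie to the Elasticsearch index with details like title, director, year, language, genre, and rating. Because no id is passed to es.index here, Elasticsearch generates the document ID itself, and the route returns it in the response.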

@app.route('/movies', methods=['POST'])
def create_movie():
    data = request.json
    res = es.index(index=index_name, document=data)
    return jsonify({"message": "Movie added successfully.", "id": res['_id']}), 201
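
The route above indexes whatever JSON it receives. A light check before indexing can catch payloads that do not fit the mapping. The sketch below is one possible approach, not part of the original application; FIELD_TYPES and validate_movie are names introduced for this example.

```python
# Expected Python types per field, mirroring the mapping defined earlier.
FIELD_TYPES = {
    "title": str,
    "director": str,
    "year": int,
    "language": str,
    "genre": (str, list),  # a keyword field accepts one value or a list
    "rating": (int, float),
}


def validate_movie(data):
    """Return a list of problems found in the payload; empty means it looks fine."""
    if not isinstance(data, dict):
        return ["payload must be a JSON object"]
    errors = []
    for field, expected in FIELD_TYPES.items():
        if field not in data:
            errors.append(f"missing field: {field}")
        # bool is a subclass of int in Python, so reject it explicitly.
        elif isinstance(data[field], bool) or not isinstance(data[field], expected):
            errors.append(f"wrong type for field: {field}")
    return errors
```

In create_movie, a non-empty result could be returned to the caller with a 400 status instead of calling es.index.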

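Update or delete a movie

Modify the details of an existing movie or remove a movie from the index. The update route sends the request data as a partial document under "doc", so only the fields supplied in the request are changed.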

@app.route('/movies/<id>', methods=['PUT'])
def update_movie(id):
    data = request.json
    try:
        res = es.update(index=index_name, id=id, body={"doc": data})
        return jsonify({"message": "Movie updated successfully.", "id": res['_id']}), 200
    except Exception as e:
        return jsonify({"error": str(e)}), 404

@app.route('/movies/<id>', methods=['DELETE'])
def delete_movie(id):
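    try:
        es.delete(index=index_name, id=id)
        # 200 rather than 204, because a 204 response must not carry a body.
        return jsonify({"message": "Movie deleted successfully."}), 200
    except Exception as e:
        # A failed delete (typically an unknown id) is reported as 404, not 204.
        return jsonify({"error": str(e)}), 404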

Search for a movie

Search is the main reason to use Elasticsearch. For this exercise, we build a search feature that finds movies by title or director. The multi_match query searches both fields at once, and the phrase_prefix type lets the last word of the search term match as a prefix, so partial input such as "spir" can find "Spirited Away".

http://127.0.0.1:5000/movies/search?q=king

@app.route('/movies/search', methods=['GET'])
def search():
    search_term = request.args.get("q", "")
    query = {
        "query": {
            "multi_match": {
                "query": search_term,
                "fields": [ "title", "director" ],
                "type": "phrase_prefix"
            }
        }
    }
    result = es.search(index=index_name, body=query)
    return jsonify([hit["_source"] for hit in result["hits"]["hits"]]), 200
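
When q is missing, the route above sends an empty string to multi_match, which will most likely match nothing. A small variation is to fall back to match_all in that case. This is a sketch; build_search_query is a hypothetical helper that only builds the request body.

```python
def build_search_query(search_term):
    """Return a search body for the q parameter, falling back to match_all."""
    term = (search_term or "").strip()
    if not term:
        # Nothing to search for, so list movies instead of sending an empty query.
        return {"query": {"match_all": {}}}
    return {
        "query": {
            "multi_match": {
                "query": term,
                "fields": ["title", "director"],
                "type": "phrase_prefix",
            }
        }
    }
```

Inside search(), the body would be built with build_search_query(request.args.get("q")) and passed to es.search as before.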

Summary

This article demonstrates how to build a simple CRUD and search application using Elasticsearch and Python. It covers setting up Elasticsearch, as well as creating, retrieving, updating, and deleting movie records, and searching by title or director.

Manju