
Implementing Faceted Search and Dynamic Filtering in Elasticsearch

Greetings!

Faceting is widely used in search and data filtering. It is common in e-commerce, healthcare, travel, and many other fields, where it lets users narrow results by attributes such as category or price and see how many results match each option. Many databases and search engines support it. In this article, I will focus on building a faceted search with Elasticsearch, using movie data.

Elasticsearch Aggregation

Elasticsearch aggregations summarize the documents that match a query. A single request can return both the search hits and the bucket counts needed to render filters, so the result list and the facets always reflect the same query.

Terms Aggregation

This is used to group documents based on unique values of a specified field.
{
  "aggs": {
    "genres": {
      "terms": {
        "field": "genre",
        "size": 100
      }
    }
  }
}
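To make the bucket semantics concrete, here is a small pure-Python sketch of what a terms aggregation computes. It only illustrates the result, not how Elasticsearch implements it; the function name `terms_buckets`, the sample documents, and the tie-break by key are assumptions of this sketch.

```python
from collections import Counter


def terms_buckets(docs, field, size=10):
    """Count documents per distinct value of `field`, like a terms aggregation."""
    counts = Counter()
    for doc in docs:
        values = doc.get(field, [])
        if not isinstance(values, list):
            values = [values]
        # A document with several values is counted once in each matching bucket.
        counts.update(set(values))
    # Largest buckets first; ties are broken by key (an assumption of this sketch).
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [{"key": key, "doc_count": n} for key, n in ranked[:size]]


# Hypothetical documents, only for illustration.
movies = [
    {"title": "A", "genre": ["Action", "Drama"]},
    {"title": "B", "genre": ["Action"]},
    {"title": "C", "genre": "Drama"},
    {"title": "D", "genre": ["Action", "Comedy"]},
]
buckets = terms_buckets(movies, "genre")
```

Because a movie can carry several genres, the bucket counts can add up to more than the number of documents.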

Range Aggregation

This groups documents into predefined numerical or date ranges. It is useful for filtering data into meaningful segments, such as price ranges or age groups. Each range includes its "from" value and excludes its "to" value, so a rating of exactly 7 falls into the 7 to 9 bucket below.
{
  "aggs": {
    "rating_ranges": {
      "range": {
        "field": "rating",
        "ranges": [
          { "to": 5 },
          { "from": 5, "to": 7 },
          { "from": 7, "to": 9 },
          { "from": 9 }
        ]
      }
    }
  }
}
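The boundary rule of the range aggregation ("from" is inclusive, "to" is exclusive) is easy to get wrong in client code that labels the buckets. The following Python sketch mirrors that rule for the ranges above; `range_bucket` is a hypothetical helper, not part of any Elasticsearch client.

```python
def range_bucket(value, ranges):
    """Return the first range containing `value` (from inclusive, to exclusive)."""
    for r in ranges:
        lower_ok = "from" not in r or value >= r["from"]
        upper_ok = "to" not in r or value < r["to"]
        if lower_ok and upper_ok:
            return r
    return None


# Same ranges as in the aggregation above.
rating_ranges = [
    {"to": 5},
    {"from": 5, "to": 7},
    {"from": 7, "to": 9},
    {"from": 9},
]

# A rating of exactly 7 belongs to the 7-9 range, not the 5-7 range.
bucket_for_seven = range_bucket(7, rating_ranges)
```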

Histogram Aggregation

This creates evenly spaced numerical intervals (buckets). It is useful for data distribution analysis, such as grouping products by price or movies by release year.
{
  "aggs": {
    "movies_by_decade": {
      "histogram": {
        "field": "year",
        "interval": 10
      }
    }
  }
}
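Each histogram bucket is identified by a key that is the value rounded down to the nearest multiple of the interval. The sketch below reproduces that rounding in Python so the bucket keys can be predicted; the function name `histogram_key` and the sample years are assumptions for illustration.

```python
import math
from collections import Counter


def histogram_key(value, interval, offset=0):
    """Round `value` down to the start of its histogram bucket."""
    return math.floor((value - offset) / interval) * interval + offset


# A few release years, grouped into decades with interval 10.
years = [2003, 2021, 2018, 2001, 1954, 2014]
decades = Counter(histogram_key(year, 10) for year in years)
```

With an interval of 10, every year from 2000 through 2009 shares the key 2000, which a UI can render as "2000s".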
Putting this together, a facet-only search request looks like this:
GET /facet-movies/_search
{
  "from": 0,
  "size": 0,
  "query": {
    "bool": {
      "must": [],
      "filter": []
    }
  },
  "aggs": {
    "genre_buckets": {
      "terms": {
        "field": "genre",
        "size": 10
      }
    }
  }
}
Because "size" is 0, this query returns only the aggregations and no documents; set "size" to a positive number to get hits in the same response. The buckets in the response are what we use to render filters and to filter results in subsequent queries.
"aggregations": {
"genre_buckets": {
"doc_count_error_upper_bound": 0,
"sum_other_doc_count": 0,
"buckets": [
{
"key": "Action",
"doc_count": 5
},
{
"key": "Adventure",
"doc_count": 3
},
{
"key": "Drama",
"doc_count": 7
}
]
}

}
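On the client side, the buckets usually have to be turned into something a UI can render. As a minimal sketch, assuming the response has already been parsed into a Python dict, a hypothetical helper `facet_options` could extract label and count pairs like this:

```python
def facet_options(response, agg_name):
    """Return (key, doc_count) pairs for one aggregation in a search response."""
    aggregation = response.get("aggregations", {}).get(agg_name, {})
    return [(b["key"], b["doc_count"]) for b in aggregation.get("buckets", [])]


# The response fragment shown above, as a Python dict.
response = {
    "aggregations": {
        "genre_buckets": {
            "doc_count_error_upper_bound": 0,
            "sum_other_doc_count": 0,
            "buckets": [
                {"key": "Action", "doc_count": 5},
                {"key": "Adventure", "doc_count": 3},
                {"key": "Drama", "doc_count": 7},
            ],
        }
    }
}

genre_options = facet_options(response, "genre_buckets")
```

Returning an empty list for a missing aggregation keeps the rendering code free of special cases.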

Planning a search

Search functionality varies based on business requirements. However, a typical search generally includes the following key features:
  • Free Text Search – Allows users to search using keywords or phrases.
  • Filters – Enables users to refine results based on specific attributes (e.g., category, price range, release year).
  • Pagination – Splits search results into multiple pages to enhance performance and usability.
  • Sorting – Orders results based on relevance, date, rating, or other criteria.
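These elements work together to create a seamless and efficient search experience, helping users find relevant information quickly. In Elasticsearch, each one maps to a part of the search request: "query.bool.must" for free text, "query.bool.filter" for filters, "from" and "size" for pagination, "sort" for sorting, and "aggs" for the facet counts.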
GET /facet-movies/_search
{
"from": 0,
"size": 20,
"query": {
"bool": {
"must": [],
"filter": []
}
},
"sort": [],
"aggs": {}
}
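An application typically fills in this skeleton from user input. The following Python sketch shows one way to do that; the function `build_search_body` and its parameters are hypothetical, and it assumes free text is matched against a `title` field.

```python
def build_search_body(text=None, filters=None, page=1, page_size=20,
                      sort=None, aggs=None):
    """Assemble a search request body from the four search features."""
    # Free text goes into `must` so it contributes to the relevance score.
    must = [{"match": {"title": text}}] if text else []
    return {
        # Pagination: pages are 1-based, `from` is a 0-based offset.
        "from": (page - 1) * page_size,
        "size": page_size,
        "query": {"bool": {"must": must, "filter": list(filters or [])}},
        "sort": list(sort or []),
        "aggs": dict(aggs or {}),
    }


# Third page of a text search, newest movies first.
body = build_search_body(
    text="kenshin",
    page=3,
    page_size=20,
    sort=[{"year": "desc"}],
)
```

Calling `build_search_body()` with no arguments reproduces the empty skeleton above.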

Facet with Movies

For this example, let’s use the following movie dataset.
PUT /facet-movies
{
"mappings": {
"properties": {
"imdbid": { "type": "keyword" },
"title": { "type": "text" },
"genre": { "type": "keyword" },
"year": { "type": "integer" },
"rating": { "type": "float" },
"language": { "type": "keyword" }
}
}
}
POST _bulk
{ "index": { "_index": "facet-movies", "_id": "tt0167260" } }
{ "imdbid": "tt0167260", "title": "The Lord of the Rings: The Return of the King", "genre": ["Adventure", "Action", "Drama"], "year": 2003, "rating": 8.9, "language": "English" }
{ "index": { "_index": "facet-movies", "_id": "tt0167261" } }
{ "imdbid": "tt0167261", "title": "Kenshin", "genre": ["Action", "Adventure"], "year": 2021, "rating": 7.8, "language": "Japanese" }
{ "index": { "_index": "facet-movies", "_id": "tt8367810" } }
{ "imdbid": "tt8367810", "title": "The Great Battle", "genre": ["Action", "War"], "year": 2018, "rating": 7.1, "language": "Korean" }
{ "index": { "_index": "facet-movies", "_id": "tt0167263" } }
{ "imdbid": "tt0167263", "title": "Planet of the Apes", "genre": ["Sci-Fi", "Adventure"], "year": 2001, "rating": 8.5, "language": "English" }
{ "index": { "_index": "facet-movies", "_id": "tt0167264" } }
{ "imdbid": "tt0167264", "title": "Baahubali", "genre": ["Action", "Fantasy", "Drama"], "year": 2015, "rating": 8.1, "language": "Telugu" }
{ "index": { "_index": "facet-movies", "_id": "tt0167265" } }
{ "imdbid": "tt0167265", "title": "Seven Samurai", "genre": ["Drama", "Action", "Adventure"], "year": 1954, "rating": 8.6, "language": "Japanese" }
{ "index": { "_index": "facet-movies", "_id": "tt0167266" } }
{ "imdbid": "tt0167266", "title": "Andhadhun", "genre": ["Thriller", "Mystery", "Comedy"], "year": 2018, "rating": 8.2, "language": "Hindi" }
{ "index": { "_index": "facet-movies", "_id": "tt0167267" } }
{ "imdbid": "tt0167267", "title": "The Lord of the Rings: The Two Towers", "genre": ["Adventure", "Action", "Drama"], "year": 2002, "rating": 8.7, "language": "English" }
{ "index": { "_index": "facet-movies", "_id": "tt1979319" } }
{ "imdbid": "tt1979319", "title": "Rurouni Kenshin", "genre": ["Action", "Adventure"], "year": 2012, "rating": 7.4, "language": "Japanese" }
{ "index": { "_index": "facet-movies", "_id": "tt3029556" } }
{ "imdbid": "tt3029556", "title": "Rurouni Kenshin: Kyoto Inferno", "genre": ["Action", "Adventure"], "year": 2014, "rating": 7.5, "language": "Japanese" }
{ "index": { "_index": "facet-movies", "_id": "tt3029630" } }
{ "imdbid": "tt3029630", "title": "Rurouni Kenshin: The Legend Ends", "genre": ["Action", "Adventure"], "year": 2014, "rating": 7.6, "language": "Japanese" }
{ "index": { "_index": "facet-movies", "_id": "tt9314288" } }
{ "imdbid": "tt9314288", "title": "Rurouni Kenshin: The Final", "genre": ["Action", "Drama"], "year": 2021, "rating": 7.2, "language": "Japanese" }
{ "index": { "_index": "facet-movies", "_id": "tt10758050" } }
{ "imdbid": "tt10758050", "title": "Rurouni Kenshin: The Beginning", "genre": ["Action", "Drama"], "year": 2021, "rating": 7.4, "language": "Japanese" }
When a user lands on the initial page, we want to display the first 10 movies along with the available filters. In Elasticsearch, the "from" and "size" properties are used to paginate results. For the initial load, the search term will be empty. However, we will use aggregations to retrieve the available filters.

Building filters

It's important to define our requirements first. For this example, the filters follow the fields in the mapping: Genre, Language, Release Year, and Rating. Together they exercise different aggregation types: terms, histogram, and range. The request below has an empty query, so it matches every movie and returns the first 10 along with the counts for each filter.
GET /facet-movies/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [],
"filter": []
}
},
"aggs": {
"genre_buckets": {
"terms": {
"field": "genre",
"size": 10
}
},
"language_buckets": {
"terms": {
"field": "language",
"size": 10
}
},
"release_year_buckets": {
"terms": {
"field": "year",
"order": {
"_key": "desc"
}
}
},
"ratings_buckets": {
"histogram": {
"field": "rating",
"interval": 1
}
},
"ratings_ranges_buckets": {
"range": {
"field": "rating",
"ranges": [
{
"key": "Poor",
"to": 3
},
{
"key": "Average",
"from": 3,
"to": 6
},
{
"key": "Good",
"from": 6,
"to": 8
},
{
"key": "Excellent",
"from": 8
}
]
}
}
}
}
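Since the same aggregations are needed on every request, it can help to define the facets once and generate the "aggs" section from that definition. This is a sketch of the idea in Python; the `FACETS` table and `build_aggs` are hypothetical names, and only some of the facets above are listed to keep it short.

```python
# One entry per facet: aggregation name -> (aggregation type, options).
FACETS = {
    "genre_buckets": ("terms", {"field": "genre", "size": 10}),
    "language_buckets": ("terms", {"field": "language", "size": 10}),
    "release_year_buckets": ("terms", {"field": "year", "order": {"_key": "desc"}}),
    "ratings_buckets": ("histogram", {"field": "rating", "interval": 1}),
}


def build_aggs(facets):
    """Expand the facet table into the `aggs` section of a search request."""
    return {name: {kind: dict(options)} for name, (kind, options) in facets.items()}


aggs = build_aggs(FACETS)
```

Adding a facet then means adding one line to the table rather than editing every query that uses it.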

Search using filters

Once the facets are defined, it is crucial to apply these filters during the search process. This helps refine the results based on the specified criteria, ensuring a more relevant and personalized user experience.

GET /facet-movies/_search
{
"from": 0,
"size": 10,
"query": {
"bool": {
"must": [
{
"match": {
"title": "kenshin"
}
}
],
"filter": [
{
"terms": {
"genre": [
"Action",
"Drama"
]
}
},
{
"term": {
"language": "Japanese"
}
},
{
"range": {
"year": {
"gte": 2010,
"lte": 2020
}
}
},
{
"range": {
"rating": {
"gte": 6,
"lte": 7.5
}
}
}
]
}
},
"aggs": {
"genre_buckets": {
"terms": {
"field": "genre",
"size": 10
}
},
"language_buckets": {
"terms": {
"field": "language",
"size": 10
}
},
"release_year_buckets": {
"terms": {
"field": "year",
"order": {
"_key": "desc"
}
}
},
"ratings_buckets": {
"histogram": {
"field": "rating",
"interval": 1
}
},
"ratings_ranges_buckets": {
"range": {
"field": "rating",
"ranges": [
{
"key": "Poor",
"to": 3
},
{
"key": "Average",
"from": 3,
"to": 6
},
{
"key": "Good",
"from": 6,
"to": 8
},
{
"key": "Excellent",
"from": 8
}
]
}
}
}

}
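The "filter" array in this query follows a simple pattern: a list of selected values becomes a terms clause, a single value becomes a term clause, and a pair of bounds becomes a range clause. A minimal Python sketch of that mapping, with the hypothetical helper `build_filters` and a selection keyed by field name, could look like this:

```python
def build_filters(selected):
    """Translate the user's facet selection into bool filter clauses."""
    filters = []
    for field, value in selected.items():
        if value is None or value == [] or value == {}:
            continue  # nothing selected for this facet
        if isinstance(value, dict):
            # Bounds such as {"gte": 2010, "lte": 2020}.
            filters.append({"range": {field: value}})
        elif isinstance(value, (list, tuple)):
            # Several values: a document matches if it has any of them.
            filters.append({"terms": {field: list(value)}})
        else:
            filters.append({"term": {field: value}})
    return filters


# The same selection as in the query above.
selection = {
    "genre": ["Action", "Drama"],
    "language": "Japanese",
    "year": {"gte": 2010, "lte": 2020},
    "rating": {"gte": 6, "lte": 7.5},
}
filters = build_filters(selection)
```

Clauses in "filter" do not affect the relevance score, so only the "match" on the title decides the ranking.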
It is possible to maintain two separate queries:
  • The initial search
  • Another query for searches with terms and filters.
However, this approach increases maintenance complexity, because the aggregation definitions have to be kept in sync in both places. Instead, we can simplify and streamline the process using Elasticsearch search templates.

Enhancing Search with Search Templates

As the name suggests, a search template allows us to create a reusable query structure for our search requirements. This reduces the workload on the client application, as it no longer needs to manage or construct the query manually.
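POST /_scripts/facet_movies_search_template
{
  "script": {
    "lang": "mustache",
    "source": """
    {
      "from": "{{from}}",
      "size": "{{size}}",
      "query": {
        "bool": {
          "must": [
            {{#title}}
            { "match": { "title": "{{title}}" } }
            {{/title}}
          ],
          "filter": [
            {{#genres}}
            { "terms": { "genre": {{#toJson}}genres{{/toJson}} } },
            {{/genres}}
            {{#language}}
            { "term": { "language": "{{language}}" } },
            {{/language}}
            {{#release_year}}
            { "range": { "year": { "gte": "{{release_year.gte}}", "lte": "{{release_year.lte}}" } } },
            {{/release_year}}
            {{#ratings}}
            { "range": { "rating": { "gte": "{{ratings.gte}}", "lte": "{{ratings.lte}}" } } },
            {{/ratings}}
            { "match_all": {} }
          ]
        }
      },
      "aggs": {
        "genre_buckets": {
          "terms": { "field": "genre", "size": 10 }
        },
        "language_buckets": {
          "terms": { "field": "language", "size": 10 }
        },
        "release_year_buckets": {
          "terms": { "field": "year", "order": { "_key": "desc" } }
        },
        "ratings_buckets": {
          "histogram": { "field": "rating", "interval": 1 }
        }
      }
    }
    """
  }
}
Two details matter here. First, a template that contains Mustache sections such as {{#title}} is not valid JSON by itself, so "source" has to be a string; the triple quotes are Kibana Console syntax, and outside Console the template must be sent as a regular escaped JSON string. Second, every optional filter ends with a comma, and the closing "match_all" clause keeps the "filter" array valid JSON no matter which filters are present. A Mustache section over a list renders once per element, so the genre clause is repeated for each selected genre; the repeated clauses are identical and do not change the results.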
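Once the template is created, we only need to pass the request parameters. The three requests below cover the initial load, a text search with a single filter, and a search with every filter applied.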
POST /facet-movies/_search/template
{
"id": "facet_movies_search_template",
"params": {
"from": 0,
"size": 10
}

}
POST /facet-movies/_search/template
{
"id": "facet_movies_search_template",
"params": {
"from": 0,
"size": 10,
"title": "kenshin",
"language": "Japanese"
}

}
POST /facet-movies/_search/template
{
"id": "facet_movies_search_template",
"params": {
"from": 0,
"size": 10,
"title": "kenshin",
"genres": ["Action", "Drama"],
"language": "Japanese",
"release_year": {
"gte": 2010,
"lte": 2020
},
"ratings": {
"gte": 6,
"lte": 7.5
}
}

}
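On the application side, the request body for the template can be assembled from whatever the user has selected. The sketch below is one way to do it in Python; `template_request` is a hypothetical helper, and it leaves out empty parameters on the assumption that omitting a parameter is the most reliable way to make its Mustache section render nothing.

```python
def template_request(template_id, page=1, page_size=10, **filters):
    """Build the body for POST /<index>/_search/template."""
    params = {"from": (page - 1) * page_size, "size": page_size}
    for name, value in filters.items():
        # Skip unset filters so their template sections are not rendered.
        if value in (None, "", [], {}):
            continue
        params[name] = value
    return {"id": template_id, "params": params}


# Second page of a text search with two filters; `genres` is empty and dropped.
request_body = template_request(
    "facet_movies_search_template",
    page=2,
    title="kenshin",
    genres=[],
    language="Japanese",
    release_year={"gte": 2010, "lte": 2020},
)
```

The resulting dict can be serialized to JSON and sent to the `_search/template` endpoint with any HTTP client.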

Conclusion

In this article, we explored how to implement facet search using Elasticsearch, covering key concepts such as pagination, filtering, sorting, and aggregations. We also discussed how to improve query maintainability and efficiency by leveraging Elasticsearch search templates, allowing for a more scalable and reusable search solution.
With Elasticsearch's powerful capabilities, businesses can optimize search performance, reduce query complexity, and deliver a seamless user experience.

References

https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-template.html
https://www.elastic.co/search-labs/tutorials/search-tutorial/full-text-search/facets
