Querying

Romain Ruaud edited this page Nov 13, 2017 · 3 revisions

Querying content

In this chapter you will learn how Elasticsuite retrieves data from Elasticsearch.

This guide does not cover Elasticsearch basics such as "what is a query?" or "what is a boolean clause?". You are expected to know the main concepts of Elasticsearch before exploring this guide.

This guide covers the following topics:

  • How to build a search request
  • How to send the query to the engine
  • How to exploit the results

Search Requests

Elasticsuite follows Magento2's architecture of defining Search Containers (or Search Requests). However, because Elasticsuite changes the logic slightly, its Search Requests are defined in an elasticsuite_search_request.xml file.

Let's see how the ElasticsuiteCatalog module describes its requests:

<requests xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
          xsi:noNamespaceSchemaLocation="urn:magento:module:Smile_ElasticsuiteCore:etc/elasticsuite_search_request.xsd">

    <request name="quick_search_container" label="Catalog Product Search" index="catalog_product" type="product" fulltext="true" />

    <request name="catalog_product_autocomplete" label="Catalog Product Autocomplete" index="catalog_product" type="product" fulltext="true" />

    <request name="catalog_view_container" label="Category Product View" index="catalog_product" type="product" fulltext="false" />

    <request name="category_search_container" label="Catalog Category Search" index="catalog_category" type="category" fulltext="true" />
</requests>

As you can see, Search Requests have the following properties:

  • A name, to identify them technically
  • A label
  • The index where the search request is querying
  • The entity type the search request is using
  • A boolean indicating if the request supports fulltext search or not.

If you are familiar with Elasticsearch, you'll notice that this resembles the entry point of a search query, which basically looks like http://elasticsearch:9200/index/type/_search.

You will also notice that several requests query the catalog_product index and the product type. This allows us to write a different configuration for each of these requests, so that the query configuration can be customized depending on the context.

These search requests are the ones listed in the Relevance Configuration.
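To make the properties listed above concrete, here is a minimal sketch that reads a declaration like the one shown earlier with PHP's SimpleXML. It is illustrative only: Elasticsuite loads this file through Magento's configuration system, and the readSearchRequests() helper is a hypothetical name introduced for this example.

```php
<?php
// Illustrative only: not the way Elasticsuite itself loads the file.
$xml = <<<XML
<requests>
    <request name="quick_search_container" label="Catalog Product Search" index="catalog_product" type="product" fulltext="true" />
    <request name="catalog_view_container" label="Category Product View" index="catalog_product" type="product" fulltext="false" />
</requests>
XML;

// Hypothetical helper: returns the declared requests indexed by name.
function readSearchRequests(string $xml): array
{
    $containers = [];
    foreach (simplexml_load_string($xml)->request as $request) {
        $containers[(string) $request['name']] = [
            'label'    => (string) $request['label'],
            'index'    => (string) $request['index'],
            'type'     => (string) $request['type'],
            // The attribute is the literal string "true" or "false".
            'fulltext' => ((string) $request['fulltext']) === 'true',
        ];
    }

    return $containers;
}

$containers = readSearchRequests($xml);
```

Both containers point at the same index and type, but only quick_search_container is flagged as supporting fulltext search.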

Building Search Requests

You can build a Search Request by using \Smile\ElasticsuiteCore\Search\Request\Builder.

Its create() method takes the following parameters:

  • $storeId Search request store id.
  • $containerName Search request name.
  • $from Search request pagination from clause.
  • $size Search request pagination size.
  • $queryText Search request fulltext query.
  • $sortOrders Search request sort orders.
  • $filters Search request filters.
  • $queryFilters Search request filters prebuilt as QueryInterface.
  • $facets Search request facets.

The first four parameters speak for themselves: the store id and the search request name are mandatory because they determine which Elasticsearch index is queried, and $from and $size drive pagination.
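As a small illustration of the pagination parameters, the sketch below converts a 1-based page number into a $from / $size pair. The paginationForPage() helper is hypothetical and not part of Elasticsuite.

```php
<?php
// Hypothetical helper, not part of Elasticsuite: converts a 1-based page
// number into the from / size pair of a paginated search request.
function paginationForPage(int $page, int $pageSize): array
{
    if ($page < 1 || $pageSize < 1) {
        throw new InvalidArgumentException('Page and page size must be positive.');
    }

    return ['from' => ($page - 1) * $pageSize, 'size' => $pageSize];
}

// Third page of 20 results: skip the first 40 documents.
$pagination = paginationForPage(3, 20);
```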

Let's dig a bit more into the other parameters.

Fulltext search query.

This parameter indicates the query text we are searching for in the index. It can be null, since we are able to query with filters only, as explained later.

If you specify a $queryText, Elasticsuite will automatically build a fulltext search query via Smile\ElasticsuiteCore\Search\Request\Query\Fulltext\QueryBuilder.

We will not cover here what exactly the Fulltext Query Builder does, since it's part of Elasticsuite's magic. Basically, it builds a fulltext query according to the current search request configuration defined in the Back-Office.

Sort orders.

At this step, sort orders can be a basic PHP array with the field as the key and the sort direction as the value.

E.g.:

$sortOrders = ['entity_id' => 'asc'];
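For readers who think in raw Elasticsearch terms, the sketch below shows how such an array could map to an Elasticsearch "sort" list. The buildSortClause() helper is hypothetical. Elasticsuite's own sort builders handle more cases than this sketch does.

```php
<?php
// Hypothetical helper: turns ['field' => 'direction'] pairs into the list
// of clauses expected under the "sort" key of an Elasticsearch request.
function buildSortClause(array $sortOrders): array
{
    $sort = [];
    foreach ($sortOrders as $field => $direction) {
        $direction = strtolower((string) $direction);
        if (!in_array($direction, ['asc', 'desc'], true)) {
            throw new InvalidArgumentException("Invalid sort direction for $field");
        }
        $sort[] = [$field => ['order' => $direction]];
    }

    return $sort;
}

$sort = buildSortClause(['entity_id' => 'asc']);
echo json_encode($sort), PHP_EOL; // prints [{"entity_id":{"order":"asc"}}]
```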

Filters

The $filters parameter is meant to contain filtering conditions, expressed with Magento's collection syntax.

If you are familiar with Magento, you already know this condition style: it's basically an array containing the field to test and a condition composed of one or several operators and a value.

You can build conditions just as you would express them when using Magento's addFieldToFilter method:

    // Simple condition
    $condition = ['entity_id' => 128];

    // Conditions with an operator
    $condition = ['entity_id' => ['eq' => 128]];
    $condition = ['entity_id' => ['in' => [1, 2, 3, 4]]];
    $condition = ['name' => ['like' => 'test']];
    $condition = ['price.price' => ['from' => 10, 'to' => 100]];

Note that the field must be named as it is declared in the mapping. Notice price.price in the examples: for products, the price is a nested field. See the indexing guide for more information about field types.

Please also note that, for now, not all operators are supported for building conditions. Here is the list of currently supported operators:

  • 'eq'
  • 'seq'
  • 'in'
  • 'from'
  • 'moreq'
  • 'gteq'
  • 'to'
  • 'lteq'
  • 'like'
  • 'in_set'
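To give an intuition of how the operators above could translate into Elasticsearch clauses, here is a minimal sketch. The operator-to-clause mapping is an assumption made for illustration, not a copy of Elasticsuite's filter builder: equality-like operators become terms, bounds become range, and like becomes match. The conditionToQuery() function is hypothetical, and nested fields such as price.price would additionally need a nested query, which is left out.

```php
<?php
// Hypothetical translation of one Magento-style condition into a raw
// Elasticsearch clause. The mapping used here is an assumption.
function conditionToQuery(string $field, $condition): array
{
    // A bare value is treated like ['eq' => value].
    if (!is_array($condition)) {
        $condition = ['eq' => $condition];
    }

    $values = [];
    $bounds = [];
    $queryText = null;

    foreach ($condition as $operator => $value) {
        switch ((string) $operator) {
            case 'eq':
            case 'seq':
            case 'in':
            case 'in_set':
                $values = array_merge($values, (array) $value);
                break;
            case 'from':
            case 'moreq':
            case 'gteq':
                $bounds['gte'] = $value;
                break;
            case 'to':
            case 'lteq':
                $bounds['lte'] = $value;
                break;
            case 'like':
                $queryText = $value;
                break;
            default:
                throw new InvalidArgumentException("Unsupported operator: $operator");
        }
    }

    if ($bounds !== []) {
        return ['range' => [$field => $bounds]];
    }
    if ($queryText !== null) {
        return ['match' => [$field => $queryText]];
    }

    return ['terms' => [$field => $values]];
}

$rangeQuery = conditionToQuery('price.price', ['from' => 10, 'to' => 100]);
```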

Query Filters

We have seen that basic filtering can be done easily with the operators given above. What if you need more complex filtering?

You can also express Query Filters. If you already know Elasticsearch, you'll quickly understand what is available to you here, and will probably love it.

Query Filters are exactly Elasticsearch query parts. This means that you can build complex filters just as you would in JSON when querying Elasticsearch directly.

To build these filters, first instantiate a Smile\ElasticsuiteCore\Search\Request\Query\QueryFactory, then start combining conditions.

Many common Elasticsearch queries come out of the box with Elasticsuite. Here is the list of available queries:

  • 'boolean'
  • 'common'
  • 'filtered'
  • 'function_score'
  • 'match'
  • 'missing'
  • 'multi_match'
  • 'nested'
  • 'not'
  • 'range'
  • 'term'
  • 'terms'

This is actually enough to build all the filtering needed for fulltext-search and catalog navigation.

But you can also extend the QueryFactory to support more query types. See the Extending the Query and Aggregation Factory section below.

To use one of these queries, just call the query factory with the proper parameters:

    $termQuery = $queryFactory->create(QueryInterface::TYPE_TERM, ['field' => 'entity_id', 'value' => 128]);

This guide will not cover the specifics of each of these queries, since you are supposed to already know most of them. You can find more information about query types in the Elasticsearch documentation.

E.g.: you may have noticed that the 'nin' operator is not handled out of the box by Elasticsuite. You could easily mimic it by wrapping a 'terms' query in a 'not' query, like this:

    $termsQuery = $queryFactory->create(
        \Smile\ElasticsuiteCore\Search\Request\QueryInterface::TYPE_TERMS,
        ['field' => 'entity_id', 'values' => $excludeProductIds]
    );

    $query = $queryFactory->create(
        \Smile\ElasticsuiteCore\Search\Request\QueryInterface::TYPE_NOT,
        ['query' => $termsQuery]
    );
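In raw Elasticsearch terms, a "not in" filter can be written as a bool query with a must_not clause. The sketch below builds that body as a plain PHP array. The notInQuery() helper is hypothetical, and the exact body emitted by Elasticsuite's builders may differ.

```php
<?php
// Hypothetical helper: raw Elasticsearch body for "field NOT IN values".
function notInQuery(string $field, array $values): array
{
    return [
        'bool' => [
            'must_not' => [
                ['terms' => [$field => array_values($values)]],
            ],
        ],
    ];
}

$body = notInQuery('entity_id', [12, 15]);
echo json_encode($body), PHP_EOL;
// prints {"bool":{"must_not":[{"terms":{"entity_id":[12,15]}}]}}
```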

Here is also a small example of building a complex condition to determine if a product is new, based on the news_from_date and news_to_date fields :

$now = (new \DateTime())->format(\Magento\Framework\Stdlib\DateTime::DATETIME_PHP_FORMAT);

$clauses = [];

$newFromDateEarlier = $this->queryFactory->create(
    QueryInterface::TYPE_RANGE,
    ['field' => 'news_from_date', 'bounds' => ['lte' => $now]]
);

$newsToDateLater = $this->queryFactory->create(
    QueryInterface::TYPE_RANGE,
    ['field' => 'news_to_date', 'bounds' => ['gte' => $now]]
);

$missingNewsFromDate = $this->queryFactory->create(QueryInterface::TYPE_MISSING, ['field' => 'news_from_date']);
$missingNewsToDate   = $this->queryFactory->create(QueryInterface::TYPE_MISSING, ['field' => 'news_to_date']);

// Product is new if "news_from_date" is earlier than now and it has no "news_to_date".
$clauses[] = $this->queryFactory->create(
    QueryInterface::TYPE_BOOL,
    ['must' => [$newFromDateEarlier, $missingNewsToDate]]
);

// Product is new if "news_to_date" is later than now and it has no "news_from_date".
$clauses[] = $this->queryFactory->create(
    QueryInterface::TYPE_BOOL,
    ['must' => [$missingNewsFromDate, $newsToDateLater]]
);

// Product is new if now is between "news_from_date" and "news_to_date".
$clauses[] = $this->queryFactory->create(
    QueryInterface::TYPE_BOOL,
    ['must' => [$newFromDateEarlier, $newsToDateLater]]
);

// Product is new if one of previously built queries match.
$queryFilter = $this->queryFactory->create(QueryInterface::TYPE_BOOL, ['should' => $clauses]);

Which will produce the following query when sent to the engine :

"bool": {
    "must": [],
    "must_not": [],
    "should": [
      {
        "bool": {
          "must": [
            {
              "range": {
                "news_from_date": {
                  "lte": "2017-10-24 10:29:52",
                  "boost": 1
                }
              }
            },
            {
              "missing": {
                "field": "news_to_date"
              }
            }
          ],
          "must_not": [],
          "should": [],
          "minimum_should_match": 1,
          "boost": 1
        }
      },
      {
        "bool": {
          "must": [
            {
              "missing": {
                "field": "news_from_date"
              }
            },
            {
              "range": {
                "news_to_date": {
                  "gte": "2017-10-24 10:29:52",
                  "boost": 1
                }
              }
            }
          ],
          "must_not": [],
          "should": [],
          "minimum_should_match": 1,
          "boost": 1
        }
      },
      {
        "bool": {
          "must": [
            {
              "range": {
                "news_from_date": {
                  "lte": "2017-10-24 10:29:52",
                  "boost": 1
                }
              }
            },
            {
              "range": {
                "news_to_date": {
                  "gte": "2017-10-24 10:29:52",
                  "boost": 1
                }
              }
            }
          ],
          "must_not": [],
          "should": [],
          "minimum_should_match": 1,
          "boost": 1
        }
      }
    ],
    "minimum_should_match": 1,
    "boost": 1
}
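The boolean logic of this filter can be checked independently of the engine with a plain PHP predicate that mirrors the three "should" clauses. This is only a sketch for reasoning about the conditions: isNewProduct() is a hypothetical function, and it assumes dates are 'Y-m-d H:i:s' strings, so that string comparison follows chronological order.

```php
<?php
// Plain-PHP mirror of the three "should" clauses, for checking the logic only.
// A null date plays the role of a missing field in the index.
function isNewProduct(?string $newsFromDate, ?string $newsToDate, string $now): bool
{
    $fromEarlier = $newsFromDate !== null && strcmp($newsFromDate, $now) <= 0;
    $toLater     = $newsToDate !== null && strcmp($newsToDate, $now) >= 0;

    return ($fromEarlier && $newsToDate === null)    // started, no end date
        || ($newsFromDate === null && $toLater)      // no start date, not ended yet
        || ($fromEarlier && $toLater);               // now is between both dates
}

$now = '2017-10-24 10:29:52';
$startedWithoutEnd = isNewProduct('2017-10-01 00:00:00', null, $now);
$alreadyEnded      = isNewProduct('2017-10-01 00:00:00', '2017-10-20 00:00:00', $now);
```

A product with neither date set matches none of the clauses and is therefore not considered new.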

Facets

When preparing a search request, you are also able to ask the engine to provide some faceting on fields. Facets are now known as Aggregations in the Elasticsearch eco-system and are used to provide aggregated data based on a search query.

As usual, you can learn more about aggregations on the Elasticsearch Documentation.

We use them to properly display navigation filters when browsing search result pages or categories.

A facet is a combination of the following parameters :

  • type : the facet type.
  • config : an array containing the facet configuration.

Facet type

For now, Elasticsuite supports the following three aggregation types out of the box:

  • Histogram
  • Terms
  • Filters Aggregation (called QueryGroup in Elasticsuite)

This is actually enough to build all the layered navigation needed in Magento2. But you can also extend the Aggregation factory to support more bucket types. See the Extending the Query and Aggregation Factory section below.

Facet configuration

When building an aggregation, you can specify its configuration.

The available parameters depend on the aggregation type, in accordance with the Elasticsearch documentation.

Histogram, as described here, can handle the following parameters:

  • interval
  • minDocCount

Terms will allow :

  • sortOrder
  • size

and QueryGroup will allow :
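For the histogram and terms parameters listed above, the sketch below shows how a facet (type plus config) could translate into a raw Elasticsearch aggregation body. The facetToAggregation() function, the 'histogram' and 'terms' type labels, and the default values are assumptions made for this example. The sortOrder parameter and QueryGroup are left out because their expected format is not described here.

```php
<?php
// Hypothetical translation of a facet into an Elasticsearch aggregation body.
// Type labels and defaults are assumptions made for this sketch.
function facetToAggregation(string $field, string $type, array $config): array
{
    switch ($type) {
        case 'histogram':
            return ['histogram' => [
                'field'         => $field,
                'interval'      => $config['interval'] ?? 1,
                'min_doc_count' => $config['minDocCount'] ?? 0,
            ]];
        case 'terms':
            return ['terms' => [
                'field' => $field,
                'size'  => $config['size'] ?? 10,
            ]];
        default:
            throw new InvalidArgumentException("Unsupported facet type: $type");
    }
}

$priceFacet = facetToAggregation('price.price', 'histogram', ['interval' => 10, 'minDocCount' => 1]);
$brandFacet = facetToAggregation('brand', 'terms', ['size' => 5]);
```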

Extending the Query and Aggregation Factory

You can define your own query and aggregation factories if you plan to use more than what Elasticsuite provides out of the box. You can inject your own objects via the DI.

Let's take a look at how this is done by Elasticsuite:

    <type name="Smile\ElasticsuiteCore\Search\Request\Query\QueryFactory">
        <arguments>
            <argument name="factories" xsi:type="array">
                <item name="boolQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\BooleanFactory</item>
                <item name="filteredQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\FilteredFactory</item>
                <item name="nestedQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\NestedFactory</item>
                <item name="notQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\NotFactory</item>
                <item name="missingQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\MissingFactory</item>
                <item name="termQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\TermFactory</item>
                <item name="termsQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\TermsFactory</item>
                <item name="rangeQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\RangeFactory</item>
                <item name="matchQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\MatchFactory</item>
                <item name="commonQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\CommonFactory</item>
                <item name="multiMatchQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\MultiMatchFactory</item>
                <item name="functionScore" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Query\FunctionScoreFactory</item>
            </argument>
        </arguments>
    </type>

    <type name="Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder">
        <arguments>
            <argument name="builders" xsi:type="array">
                <item name="boolQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Boolean\Proxy</item>
                <item name="filteredQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Filtered\Proxy</item>
                <item name="nestedQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Nested\Proxy</item>
                <item name="notQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Not\Proxy</item>
                <item name="missingQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Missing\Proxy</item>
                <item name="termQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Term\Proxy</item>
                <item name="termsQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Terms\Proxy</item>
                <item name="rangeQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Range\Proxy</item>
                <item name="matchQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Match\Proxy</item>
                <item name="commonQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\Common\Proxy</item>
                <item name="multiMatchQuery" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\MultiMatch\Proxy</item>
                <item name="functionScore" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Query\Builder\FunctionScore\Proxy</item>
            </argument>
        </arguments>
    </type>

    <type name="Smile\ElasticsuiteCore\Search\Request\Aggregation\AggregationFactory">
        <arguments>
            <argument name="factories" xsi:type="array">
                <item name="termBucket" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Aggregation\Bucket\TermFactory</item>
                <item name="histogramBucket" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Aggregation\Bucket\HistogramFactory</item>
                <item name="queryGroupBucket" xsi:type="object">Smile\ElasticsuiteCore\Search\Request\Aggregation\Bucket\QueryGroupFactory</item>
            </argument>
        </arguments>
    </type>

    <type name="Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Aggregation\Builder">
        <arguments>
            <argument name="builders" xsi:type="array">
                <item name="termBucket" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Aggregation\Builder\Term\Proxy</item>
                <item name="histogramBucket" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Aggregation\Builder\Histogram\Proxy</item>
                <item name="queryGroupBucket" xsi:type="object">Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Request\Aggregation\Builder\QueryGroup\Proxy</item>
            </argument>
        </arguments>
    </type>

You can see that, for both queries and aggregations, this is done in two parts:

  • The Query Factory: it's a factory returning either a Smile\ElasticsuiteCore\Search\Request\QueryInterface (for queries) or a Smile\ElasticsuiteCore\Search\Request\BucketInterface (for aggregations)
  • The Query Builder: it's the object that converts the previous object into a basic PHP array usable by the Elasticsearch client.

Once you have understood these two concepts, you just have to implement a factory and a builder for your new query/aggregation type, and then inject them via the DI. That's all.
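The two-part design can be sketched in plain PHP as follows. Every class name here (ExistsQuery, SketchQueryFactory, SketchQueryBuilder) is hypothetical and only mimics the idea of a factory registry and a builder registry. In a real module, the registration would go through the DI configuration and the classes would implement Elasticsuite's interfaces.

```php
<?php
// All names below are hypothetical; they only illustrate the two-part design.

// A query object describing "field exists", independent of any engine syntax.
final class ExistsQuery
{
    public $field;

    public function __construct(string $field)
    {
        $this->field = $field;
    }

    public function getType(): string
    {
        return 'existsQuery';
    }
}

// Part 1: a factory registry that creates query objects by type.
final class SketchQueryFactory
{
    private $factories;

    public function __construct(array $factories)
    {
        $this->factories = $factories;
    }

    public function create(string $type, array $params)
    {
        if (!isset($this->factories[$type])) {
            throw new LogicException("No factory found for query type $type");
        }

        return ($this->factories[$type])($params);
    }
}

// Part 2: a builder registry that converts a query object into a plain array.
final class SketchQueryBuilder
{
    private $builders;

    public function __construct(array $builders)
    {
        $this->builders = $builders;
    }

    public function buildQuery($query): array
    {
        $type = $query->getType();
        if (!isset($this->builders[$type])) {
            throw new LogicException("No builder found for query type $type");
        }

        return ($this->builders[$type])($query);
    }
}

// Registering the new type in both registries plays the role of the DI configuration.
$factory = new SketchQueryFactory([
    'existsQuery' => function (array $params) {
        return new ExistsQuery($params['field']);
    },
]);
$builder = new SketchQueryBuilder([
    'existsQuery' => function (ExistsQuery $query) {
        return ['exists' => ['field' => $query->field]];
    },
]);

$body = $builder->buildQuery($factory->create('existsQuery', ['field' => 'news_from_date']));
```

Keeping the query object separate from its array form means the same request description could be rendered differently without touching the code that builds requests.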

Querying the engine

Now that we have built our search request, we will see how to run it against the engine to retrieve results.

This part is the easiest of this guide. All you have to do is inject the SearchEngine and pass the previously built Request to its search() method.

Like this:

    $searchRequest = $requestBuilder->create(
        $storeId,
        $searchRequestName,
        $from,
        $size,
        $queryText,
        $sortOrders,
        $filters,
        $queryFilters,
        $facets
    );

    /** @var \Magento\Search\Model\SearchEngine $searchEngine */
    $queryResponse = $searchEngine->search($searchRequest);

What you obtain here is a Smile\ElasticsuiteCore\Search\Adapter\Elasticsuite\Response\QueryResponse. This object exposes the following methods:

  • getIterator() : retrieve an iterable object on result documents
  • count() : use this to get the number of objects returned by the request
  • getAggregations() : will return the aggregations data, if any
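To show how an object with these three methods is typically consumed, here is a sketch using a hypothetical stand-in class, SketchResponse, built on PHP's IteratorAggregate and Countable interfaces. The document and aggregation layouts are assumptions made for this example. The real QueryResponse returns its own document and aggregation objects.

```php
<?php
// Hypothetical stand-in for a query response: iterable, countable,
// and carrying aggregation data.
final class SketchResponse implements IteratorAggregate, Countable
{
    private $documents;
    private $aggregations;

    public function __construct(array $documents, array $aggregations = [])
    {
        $this->documents = $documents;
        $this->aggregations = $aggregations;
    }

    public function getIterator(): Traversable
    {
        return new ArrayIterator($this->documents);
    }

    public function count(): int
    {
        return count($this->documents);
    }

    public function getAggregations(): array
    {
        return $this->aggregations;
    }
}

$response = new SketchResponse(
    [['entity_id' => 12], ['entity_id' => 15]],
    ['brand' => ['acme' => 2]]
);

// foreach relies on getIterator(), count() on the Countable interface.
$ids = [];
foreach ($response as $document) {
    $ids[] = $document['entity_id'];
}
$total = count($response);
```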

Going Further / Practicals

If you look closely at Smile\ElasticsuiteCatalog\Model\ResourceModel\Product\Fulltext\Collection, you'll be able to understand how we implemented it:

  • we prepare the Search Request in prepareRequest, which is called before the collection is loaded.
  • the request is processed, and the matching entity_id values are extracted
  • we use them as a standard id filter for the collection
  • layered navigation filters apply facets and filters if needed
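The second and third steps can be sketched in plain PHP as below. The applySearchResult() helper and the row layout are hypothetical, and returning rows in the engine's order is an assumption of this sketch rather than a description of the collection's code.

```php
<?php
// Hypothetical helper: keeps only the rows whose id was matched by the
// engine, and returns them in the order the engine ranked them.
function applySearchResult(array $matchedIds, array $rowsById): array
{
    $result = [];
    foreach ($matchedIds as $id) {
        if (isset($rowsById[$id])) {
            $result[] = $rowsById[$id];
        }
    }

    return $result;
}

// Ids extracted from the search response, best match first.
$matchedIds = [15, 12];

// Rows as they could be loaded from the database, indexed by entity_id.
$rowsById = [
    12 => ['entity_id' => 12, 'name' => 'Blue shirt'],
    14 => ['entity_id' => 14, 'name' => 'Red shirt'],
    15 => ['entity_id' => 15, 'name' => 'Green shirt'],
];

$products = applySearchResult($matchedIds, $rowsById);
```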

We also already have a module for indexing CMS Pages, which is a good tutorial for learning how to index and query custom content from an external module.

This module is available here
