ESQL an alternative to DQL here is why

In the last month or two, I’ve hacked around a concept that I call “PHP Extended SQL - an alternative to Doctrine Query Language”. In this post I’ll showcase the ESQL library with a complex use case, illustrating the reasons behind the creation of the ESQL library.

The use case

I have a collection of Products each belonging to a Category. The data is using the vocabulary of https://schema.org/Product which represents the Category property of my Product as: “A category for the item. Greater signs or slashes can be used to informally indicate a category hierarchy. “.

The stack is using API Platform which handles pagination, filters and the JSON-LD format completed by the Hydra specification.

Dive in

For the sake of readability of the code we removed non-essential parts. The code is available on github with instructions on how to launch it yourself.

Let’s create two doctrine entities, first the Category:

<?php

namespace App\Entity;

use ApiPlatform\Core\Annotation\ApiProperty;
use ApiPlatform\Core\Annotation\ApiResource;
use Doctrine\Common\Collections\Collection;
use Doctrine\ORM\Mapping as ORM;

/**
 * @ApiResource(iri="http://schema.org/Category")
 * @ORM\Entity
 */
class Category
{
    /**
     * @ORM\Column(type="string", length=255)
     */
    public string $name;

    /**
     * @ORM\Column(type="string", length=30)
     * @ORM\Id
     * @ApiProperty(iri="http://schema.org/identifier", identifier=true)
     */
    public string $identifier;

    /**
     * @ORM\OneToMany(targetEntity="Category", mappedBy="parent")
     * @ApiProperty(readable=false)
     */
    private Collection $children;

    /**
     * @ORM\ManyToOne(targetEntity="Category", inversedBy="children")
     * @ORM\JoinColumn(name="parent_id", referencedColumnName="identifier")
     */
    public ?Category $parent = null;
}

Nothing special, the category has an identifier, a name and uses a self-referencing OneToMany for the hierarchy. Now for the product I’ll choose to name my relation to Category categoryRelation because I want my categories to be displayed with “slashes (…) to informally indicate a category hierarchy” as the JSON schema proposes. As this Entity is also our (JS)Object(N) representation on the API we will define a field that isn’t mapped to doctrine and a getter that builds the breadcrumb displaying the Category hierarchy:

<?php
namespace App\Entity;

class Product {
    /**
     * @ApiProperty(iri="http://schema.org/category")
     */
    private string $category = '';
    
    public function getCategory(): string
    {
        $category = $this->categoryRelation;
        $str = $category->name;
        while ($category = $category->parent) {
            $str = $category->name.' / '.$str;
        }

        return $str;
    }
}

Here is the full Product Entity:

<?php
namespace App\Entity;

use ApiPlatform\Core\Annotation\ApiProperty;
use ApiPlatform\Core\Annotation\ApiResource;
use Doctrine\ORM\Mapping as ORM;
use Symfony\Bridge\Doctrine\IdGenerator\UlidGenerator;
use Symfony\Component\Uid\Ulid;

/**
 * @ApiResource(iri="http://schema.org/Product")
 * @ORM\Entity
 */
class Product
{
    /**
     * @ORM\Id
     * @ORM\Column(type="ulid", unique=true)
     * @ORM\GeneratedValue(strategy="CUSTOM")
     * @ORM\CustomIdGenerator(class=UlidGenerator::class)
     */
    private Ulid $id;

    /**
     * @ORM\Column(type="string", length=255)
     * @ApiProperty(iri="http://schema.org/name")
     */
    public string $name;

    /**
     * @ORM\Column(type="string", length=255)
     * @ApiProperty(iri="http://schema.org/description")
     */
    public string $description;

    /**
     * @ORM\ManyToOne(targetEntity=Category::class)
     * @ORM\JoinColumn(name="category_id", referencedColumnName="identifier")
     * @ApiProperty(readable=false)
     */
    public Category $categoryRelation;

    /**
     * @ApiProperty(iri="http://schema.org/category")
     */
    private string $category = '';

    /**
     * @ApiProperty(iri="https://schema.org/gtin")
     * @ORM\Column(type="string", length=14)
     */
    public string $gtin;

    public function getCategory(): string
    {
        $category = $this->categoryRelation;
        $str = $category->name;
        while ($category = $category->parent) {
            $str = $category->name.' / '.$str;
        }

        return $str;
    }
}

The GTIN field stands for Global Trade Item Number and often represents the product’s bar code

Having our two entities, we will add some fixtures and try this out. For this I’m using the AliceBundle:

App\Entity\Product:
    product_{1..100}:
        gtin: '<ean13()>'
        name: '<name()>'
        description: '<realText(140)>'
        categoryRelation: '@<randomElement([vegetables_specified,bagged_salads,herbs_aromatic,fresh_fruits,dried_fruits,iceberg])>' 

The full file is availabe on Github. After loading the fixtures (bin/console hautelook:fixtures:load) run the API endpoint of products and enjoy:

{
  "@context": "/api/contexts/Product",
  "@id": "/api/products",
  "@type": "hydra:Collection",
  "hydra:member": [
    {
      "@id": "/api/products/01774d7b-a79e-a138-6ff2-e94806f06837",
      "@type": "http://schema.org/Product",
      "name": "Tianna Ziemann",
      "description": "Knave was standing before them, in chains, with a round face, and was gone across to the croquet-ground. The other guests had taken his.",
      "gtin": "5794390900075",
      "id": "01774d7b-a79e-a138-6ff2-e94806f06837",
      "category": "Vegetables / Bagged salads / Iceberg"
    },
    {
      "@id": "/api/products/01774d7b-a79e-a138-6ff2-e94806f06838",
      "@type": "http://schema.org/Product",
      "name": "Dr. Samir Weimann DDS",
      "description": "Alice knew it was over at last: 'and I do wonder what I eat\" is the capital of Paris, and Paris is the reason and all her wonderful.",
      "gtin": "4173048045459",
      "id": "01774d7b-a79e-a138-6ff2-e94806f06838",
      "category": "Vegetables / Herbs and Aromatic Plants"
    },
    {
      "@id": "/api/products/01774d7b-a79e-a138-6ff2-e94806f06839",
      "@type": "http://schema.org/Product",
      "name": "Anais Lockman IV",
      "description": "Alice did not feel encouraged to ask his neighbour to tell me who YOU are, first.' 'Why?' said the King said to Alice. 'What IS the same.",
      "gtin": "7892793312223",
      "id": "01774d7b-a79e-a138-6ff2-e94806f06839",
      "category": "Fruits / Dried Fruits"
    },
    ...
  ],
  "hydra:totalItems": 100,
  "hydra:view": {
    "@id": "/api/products.jsonld?perPage=10&page=1",
    "@type": "hydra:PartialCollectionView",
    "hydra:first": "/api/products.jsonld?perPage=10&page=1",
    "hydra:last": "/api/products.jsonld?perPage=10&page=10",
    "hydra:next": "/api/products.jsonld?perPage=10&page=2"
  }
}

It’s working just fine isn’t it?

Understanding the magic behind Doctrine and API Platform

When you request /api/products.jsonld a few things happen.

  1. Symfony’s HTTP Kernel process the request
  2. API Platform receives the information that you’re looking for a collection of the Product resource
  3. Doctrine fetches the collection and hydrates the Product class
  4. API Platform serializes the data using the Symfony’s serializer

None of these actors got the information about the Category resource, remember we execute the following code when retrieving a Product’s category:

<?php
public function getCategory(): string
{
    $category = $this->categoryRelation;
    $str = $category->name;
    while ($category = $category->parent) {
    	$str = $category->name.' / '.$str;
    }

    return $str;
}

This code path is called during the serializer phase. It’s the phase where the Product object gets transformed to JSON. At this moment, the only information Doctrine knows is the categoryRelation identifier. When you call $category->name, if Doctrine doesn’t have this object in memory it’ll fetch it from the Database. To illustrate this let’s take a look at the Symfony’s Profiler, on the Performance tab:

Doctrine profiler performance

On this example, I called api/products.jsonld?perPage=100 which fetches all my products. The consequence here is that Doctrine creates 8 additional queries, one for each category. We can observe this on the Profiler’s Doctrine tab:

Doctrine profiler performance

8 queries to fetch categories, the product collection query and a count query used for the pagination.

This behavior is functional and even has acceptable performances. But what if instead of having only 8 categories we talked about complex hierarchies of Employees or Companies having Groups and many connections? Would this still be acceptable?

Improving data retrieval

If you’re familiar with SQL, you probably heard about Common Table Expression (CTE). It’s a temporary named result set that, among other things, can be recursive. Therefore it’s really useful to create hierarchical queries. For example, let’s write such a query to fetch every categories using a CTE named descendants:

WITH RECURSIVE
    descendants(identifier, name, parent_id) AS (
        SELECT c.identifier, c.name, c.parent_id FROM category c WHERE c.parent_id IS NULL
        UNION ALL
        SELECT c.identifier, c.name, c.parent_id FROM descendants, category c WHERE c.parent_id = descendants.identifier
    )
SELECT c.identifier, c.name, c.parent_id FROM descendants c

On Oracle databases, you’d rather use the CONNECT BY expression which computes a very useful Level pseudo-column on top of handling the hierarchical model

This query would help us reduce the previous number of 8 queries to a single one. However, API Platform works with the Doctrine Query Language (DQL) and as far as I know there is no implementation for CTEs. I found a research paper and this cte-builder that may help you add such a CTE on top of the doctrine if you really want to it is definitely possible. We could use native SQL with Doctrine and this would require us to use a ResultSetMapping, the class behind Doctrine’s mapping between fetched data (arrays) and PHP classes. Even so, we would loose API Platform’s filters and pagination as they also work with the Doctrine Query Language.

Another viable tip with Doctrine is to use a Nested Set for example using the Doctrine Tree Extension

ESQL as a viable alternative

The ESQL library offers an API Platform bridge that integrates seamlessly with your current stack. Let’s add the library to the project:

composer require soyuka/esql jane-php/automapper

Load the bundles (we use JanePHP’s AutoMapper for the example, a mapper using symfony’s serializer is also available):

<?php
# config/bundles.php
return [
    ...
    Jane\AutoMapper\Bundle\JaneAutoMapperBundle::class => ['all' => true],
    Soyuka\ESQL\Bridge\Symfony\Bundle\ESQLBundle::class => ['all' => true],
];

Then, let’s enable the bridge using the esql attribute:

<?php
/**
 * @ApiResource(iri="http://schema.org/Product", attributes={"esql"=true})
 * @ORM\Entity
 */
class Product

Reload /api/products.jsonld and you should stumble upon an error:

The property "App\Entity\Category::$name" is not readable because it is typed "string". You should initialize it or declare a default value instead.

This is expected, check the Doctrine tab of the profiler again. We executed two queries but did not fetch the categories:

doctrine queries with esql

We’ll add these ourselves by adding a DataProvider. In API Platform, a Data Provider is the main extension point when it comes to data retrieval. Our ProductDataProvider will:

The service declaration stands as:

App\DataProvider\ProductDataProvider:
	tags: 
		- {name: 'api_platform.collection_data_provider', priority: 20}
	arguments: 
		$decorated: '@esql.api_platform.default.collection_data_provider'

And here’s the ProductDataProvider (read comments):

<?php
namespace App\DataProvider;

use ApiPlatform\Core\DataProvider\CollectionDataProviderInterface;
use ApiPlatform\Core\DataProvider\ContextAwareCollectionDataProviderInterface;
use ApiPlatform\Core\DataProvider\RestrictedDataProviderInterface;
use Doctrine\Persistence\ManagerRegistry;
use Soyuka\ESQL\ESQL;
use Soyuka\ESQL\ESQLInterface;
use Soyuka\ESQL\ESQLMapperInterface;
use App\Entity\Category;
use App\Entity\Product;
use Symfony\Component\HttpFoundation\RequestStack;
use Symfony\Component\HttpKernel\Exception\BadRequestHttpException;

final class ProductDataProvider implements RestrictedDataProviderInterface, CollectionDataProviderInterface, ContextAwareCollectionDataProviderInterface
{
    private RequestStack $requestStack;
    private ManagerRegistry $managerRegistry;
    private ESQLMapperInterface $mapper;
    private ESQLInterface $esql;
    private ContextAwareCollectionDataProviderInterface $decorated;

    public function __construct(RequestStack $requestStack, ManagerRegistry $managerRegistry, ESQLMapperInterface $mapper, ESQLInterface $esql, ContextAwareCollectionDataProviderInterface $decorated)
    {
        $this->requestStack = $requestStack;
        $this->managerRegistry = $managerRegistry;
        $this->mapper = $mapper;
        $this->esql = $esql;
        $this->decorated = $decorated;
    }

    public function supports(string $resourceClass, string $operationName = null, array $context = []): bool
    {
        return Product::class === $resourceClass;
    }

    public function getCollection(string $resourceClass, string $operationName = null, array $context = [])
    {
        // Call the DataProvider to retrieve products
        $data = $this->decorated->getCollection($resourceClass, $operationName, $context);
        // Fetch categories
        $categories = $this->getCategories();

        // Map categories to the product ourselves
        foreach ($data as $product) {
            foreach ($categories as $category) {
                if ($product->categoryRelation->identifier === $category->identifier) {
                    $product->categoryRelation = $category;
                }
            }
        }

        return $data;
    }

    private function getCategories(): array
    {
        $categoryParameter = null === ($request = $this->requestStack->getCurrentRequest()) ? null : $request->query->get('category');
        
        if (\is_array($categoryParameter)) {
            throw new BadRequestHttpException();
        }

        $connection = $this->managerRegistry->getConnection();
        // If a category is present we can load only the tree for this specific category
        $categoryPredicate = $categoryParameter ? 'c.identifier = :category' : 'c.parent_id IS NULL';
        $category = $this->esql->__invoke(Category::class);

        $query = <<<SQL
WITH RECURSIVE
    ancestors(identifier, name, parent_id) AS (
        SELECT c.identifier, c.name, c.parent_id FROM category c WHERE {$categoryPredicate}
        UNION ALL
        SELECT c.identifier, c.name, c.parent_id FROM ancestors, category c WHERE c.identifier = ancestors.parent_id
    ),
    descendants(identifier, name, parent_id) AS (
        SELECT c.identifier, c.name, c.parent_id FROM category c WHERE {$categoryPredicate}
        UNION ALL
        SELECT c.identifier, c.name, c.parent_id FROM descendants, category c WHERE c.parent_id = descendants.identifier
    )
SELECT {$category->columns()} FROM ancestors {$category->alias()}
UNION
SELECT {$category->columns()} FROM descendants {$category->alias()}
SQL;

        // Use doctrine's executeQuery method, it calls prepare and execute
        $stmt = $connection->executeQuery($query, $categoryParameter ? ['category' => $categoryParameter] : []);
        $data = $stmt->fetchAll();
        $categories = $this->mapper->map($data, Category::class);

        // We need to map categories to themselves to obtain a tree
        foreach ($categories as $category) {
            if (null === $category->parent) {
                continue;
            }

            foreach ($categories as $parent) {
                if ($parent->identifier === $category->parent->identifier) {
                    $category->parent = $parent;
                }
            }
        }

        return $categories;
    }
}

And here we go, try out calling /api/products.jsonld?perPage=100, only 3 queries will ever get called hence:

ESQL queries reduced to 3

Filter over categories

Now to go further, we will add a custom filter on this endpoint. The particularity of this filter is that it filters on the whole hierarchy. Meaning that:

The ESQL bridge provides a QueryCollectionExtensionInterface allowing you to modify the original SQL query and adapt it to your needs. This preserves the pagination and the already available filters.

<?php

namespace App\Extension;

use Doctrine\Persistence\ManagerRegistry;
use Soyuka\ESQL\Bridge\ApiPlatform\Extension\QueryCollectionExtensionInterface;
use Soyuka\ESQL\ESQLInterface;
use App\Entity\Category;
use App\Entity\Product;
use Symfony\Component\HttpFoundation\RequestStack;
use Symfony\Component\HttpKernel\Exception\BadRequestHttpException;

final class CategoryFilterExtension implements QueryCollectionExtensionInterface
{
    private RequestStack $requestStack;
    private ESQLInterface $esql;
    private ManagerRegistry $managerRegistry;

    public function __construct(RequestStack $requestStack, ESQLInterface $esql, ManagerRegistry $managerRegistry)
    {
        $this->requestStack = $requestStack;
        $this->esql = $esql;
        $this->managerRegistry = $managerRegistry;
    }

    public function apply(string $query, string $resourceClass, ?string $operationName = null, array $parameters = [], array $context = []): array
    {
        $request = $this->requestStack->getCurrentRequest();

        // Just return early if the category parameter is not present
        if (null === $request || !$request->query->has('category') || null === $categoryParameter = $request->query->get('category')) {
            return [$query, $parameters];
        }

        if (\is_array($categoryParameter)) {
            throw new BadRequestHttpException();
        }

        $product = $this->esql->__invoke($resourceClass);
        $category = $this->esql->__invoke(Category::class);

        // We change the query and use this one instead, it joins descendants over the products
        $query = <<<SQL
WITH RECURSIVE
    descendants(identifier, name, parent_id) AS (
        SELECT c.identifier, c.name, c.parent_id FROM category c WHERE c.identifier = :category
        UNION ALL
        SELECT c.identifier, c.name, c.parent_id FROM descendants, category c WHERE c.parent_id = descendants.identifier
    )
SELECT {$product->columns()} FROM {$product->table()}
JOIN descendants {$category->alias()} ON {$product->join(Category::class)}
SQL;

        $parameters['category'] = $categoryParameter;

        return [$query, $parameters];
    }

    public function supports(string $resourceClass, ?string $operationName = null, array $context = []): bool
    {
        return Product::class === $resourceClass;
    }
}

The service definition:

App\Extension\CategoryFilterExtension:
    tags: ['esql.collection_extension']

And voilĂ , you have a hierarchical filter, with clean queries and still benefit from pagination and filters. For example, querying /api/products.jsonld?category=bagged_salads&description=like.*looking* will add a product.description LIKE %looking% clause!

To know more take a look at ESQL on github! This example with ESQL is also available here with instructions on how to run it.