Playing with Hierarchies: Implementing the Nested Set Model in MySQL

Playing with Hierarchies: Implementing the Nested Set Model in MySQL
Photo by Johann Siemens / Unsplash

Hierarchical data structures, or tree structures, are common in many systems, from representing organizational charts to category structures in e-commerce applications. One way to model such structures in a relational database is by using the Nested Set Model. Unlike the adjacency list model, the nested set model allows efficient querying of both ancestors and descendants with fewer SQL queries. In this article, we'll explore how to implement and manipulate tree data using the nested set model in MySQL.

Why Use the Nested Set Model?

The Nested Set Model is particularly useful when you need to frequently query entire subtrees or get a list of ancestors in hierarchical data. For example:

  • Querying all descendants of a node (subcategories in a category tree).
  • Fetching the entire tree structure or a specific branch efficiently.

Compared to the adjacency list model, which stores only parent-child relationships, the nested set model assigns two numbers (lft and rght) to each node, representing the position of that node in a depth-first traversal of the tree.

Table Structure for the Nested Set Model

Let’s consider a table that represents a simple hierarchy of categories. Here's how the table looks in MySQL:

CREATE TABLE categories (
    id INT AUTO_INCREMENT PRIMARY KEY,
    name VARCHAR(100),
    lft INT NOT NULL,
    rght INT NOT NULL
);
  • id: Unique identifier for each node.
  • name: Descriptive name for each node (e.g., category, department).
  • lft and rght: These columns define the nested set's tree structure.

How the Nested Set Model Works

The lft and rght values indicate a node's position in a tree traversal:

  • For a parent node, all its child nodes will have lft and rght values that are inside the range of the parent's lft and rght.
  • Leaf nodes have consecutive lft and rght values.

For example, consider this tree structure:

1. Electronics
  1.1. Phones
    1.1.1. Smartphones
  1.2. Computers
    1.2.1. Laptops
    1.2.2. Desktops

A corresponding categories table using the nested set model might look like this:

idnamelftrght
1Electronics110
2Phones25
3Smartphones34
4Computers69
5Laptops78

Here is the SQL INSERT INTO if you want to try.

-- Root category "Electronics"
INSERT INTO categories (name, lft, rght) VALUES ('Electronics', 1, 10);

-- Subcategory "Phones" under "Electronics"
INSERT INTO categories (name, lft, rght) VALUES ('Phones', 2, 5);

-- Sub-subcategory "Smartphones" under "Phones"
INSERT INTO categories (name, lft, rght) VALUES ('Smartphones', 3, 4);

-- Subcategory "Computers" under "Electronics"
INSERT INTO categories (name, lft, rght) VALUES ('Computers', 6, 9);

-- Sub-subcategory "Laptops" under "Computers"
INSERT INTO categories (name, lft, rght) VALUES ('Laptops', 7, 8);

Here is the visualization of the lft and rght for each node on the tree.

Electronics (1,12) Phones (2,5) Smartphones (3,4) Computers (6,11) Laptops (7,8) Desktops (9,10) Format: Category Name (left value, right value)

Querying Data Using the Nested Set Model

1. Fetching All Descendants of a Node

To fetch all descendants of a category (e.g., all subcategories under "Electronics"), you can use the following query:

SELECT name
FROM categories
WHERE lft BETWEEN (SELECT lft FROM categories WHERE name = 'Electronics')
AND (SELECT rght FROM categories WHERE name = 'Electronics');

This query efficiently retrieves all categories between the lft and rght values of "Electronics".

We can use WITH RECURSIVE as well for above query, here is the query.

WITH RECURSIVE category_tree AS (
    SELECT id, name, lft, rght
    FROM categories
    WHERE name = 'Electronics'
    
    UNION ALL

    SELECT c.id, c.name, c.lft, c.rght
    FROM categories c
    INNER JOIN category_tree ct ON c.lft > ct.lft AND c.lft < ct.rght
)
SELECT name
FROM category_tree;

The result of above query will include "Electronics" as well, to exclude we can use this query.

SELECT name
FROM categories
WHERE lft > (SELECT lft FROM categories WHERE name = 'Electronics')
AND rght < (SELECT rght FROM categories WHERE name = 'Electronics');

And here is the equal query if using WITH RECURSIVE.

WITH RECURSIVE category_tree AS (
    SELECT id, name, lft, rght
    FROM categories
    WHERE name = 'Electronics'
    
    UNION ALL

    SELECT c.id, c.name, c.lft, c.rght
    FROM categories c
    INNER JOIN category_tree ct ON c.lft > ct.lft AND c.lft < ct.rght
)
SELECT name
FROM category_tree
WHERE name != 'Electronics';
2. Fetching the Entire Tree Structure

To retrieve the entire tree in the correct hierarchical order, you can use:

SELECT name, lft, rght
FROM categories
ORDER BY lft;

This will list all categories based on their lft values, preserving the tree hierarchy.

3. Fetching the Ancestors of a Node

To get all the ancestors of a specific node (e.g., "Smartphones"), use the following query:

SELECT parent.name
FROM categories AS node,
     categories AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rght
AND node.name = 'Smartphones'
ORDER BY parent.lft;

This query returns all parent categories of "Smartphones", starting from the root.

Inserting Nodes into the Nested Set Model

Inserting a node into the tree involves updating the lft and rght values of the existing nodes and then inserting the new node with appropriate lft and rght values. For example, to insert a new category called "Tablets" under "Phones", you need to:

  1. Update the lft and rght values of the existing nodes.
  2. Insert the new node with its corresponding lft and rght values.
-- Update the right values
UPDATE categories
SET rght = rght + 2
WHERE rght >= 5;

-- Update the left values
UPDATE categories
SET lft = lft + 2
WHERE lft > 5;

-- Insert the new node
INSERT INTO categories (name, lft, rght)
VALUES ('Tablets', 5, 6);

This query adds "Tablets" as a child of "Phones".

Deleting Nodes

To delete a node (and all its descendants), you need to:

  1. Remove the node and its children.
  2. Update the lft and rght values of the remaining nodes.

For example, to delete the "Phones" node and all its descendants:

-- Delete the node and its descendants
DELETE FROM categories
WHERE lft BETWEEN 2 AND 5;

-- Update the remaining nodes
UPDATE categories
SET rght = rght - 4
WHERE rght > 5;

UPDATE categories
SET lft = lft - 4
WHERE lft > 5;

What we learned so far?

The Nested Set Model is a powerful technique for managing hierarchical data in MySQL. Although inserting and deleting nodes require more complex operations, the model offers significant advantages for querying entire branches or subtrees. Understanding and applying this model can significantly improve the efficiency of applications that need to work with hierarchical data.

Hope it helps.

Support Us

Subscribe to Buka Corner

Don’t miss out on the latest issues. Sign up now to get access to the library of members-only issues.
[email protected]
Subscribe