Playing with Hierarchies: Implementing the Nested Set Model in MySQL
Hierarchical data structures, or tree structures, are common in many systems, from representing organizational charts to category structures in e-commerce applications. One way to model such structures in a relational database is by using the Nested Set Model. Unlike the adjacency list model, the nested set model allows efficient querying of both ancestors and descendants with fewer SQL queries. In this article, we'll explore how to implement and manipulate tree data using the nested set model in MySQL.
Why Use the Nested Set Model?
The Nested Set Model is particularly useful when you need to frequently query entire subtrees or get a list of ancestors in hierarchical data. For example:
- Querying all descendants of a node (subcategories in a category tree).
- Fetching the entire tree structure or a specific branch efficiently.
Compared to the adjacency list model, which stores only parent-child relationships, the nested set model assigns two numbers (lft
and rght
) to each node, representing the position of that node in a depth-first traversal of the tree.
Table Structure for the Nested Set Model
Let’s consider a table that represents a simple hierarchy of categories. Here's how the table looks in MySQL:
CREATE TABLE categories (
id INT AUTO_INCREMENT PRIMARY KEY,
name VARCHAR(100),
lft INT NOT NULL,
rght INT NOT NULL
);
id
: Unique identifier for each node.name
: Descriptive name for each node (e.g., category, department).lft
andrght
: These columns define the nested set's tree structure.
How the Nested Set Model Works
The lft
and rght
values indicate a node's position in a tree traversal:
- For a parent node, all its child nodes will have
lft
andrght
values that are inside the range of the parent'slft
andrght
. - Leaf nodes have consecutive
lft
andrght
values.
For example, consider this tree structure:
1. Electronics 1.1. Phones 1.1.1. Smartphones 1.2. Computers 1.2.1. Laptops 1.2.2. Desktops
A corresponding categories
table using the nested set model might look like this:
id | name | lft | rght |
---|---|---|---|
1 | Electronics | 1 | 10 |
2 | Phones | 2 | 5 |
3 | Smartphones | 3 | 4 |
4 | Computers | 6 | 9 |
5 | Laptops | 7 | 8 |
Here is the SQL INSERT INTO if you want to try.
-- Root category "Electronics"
INSERT INTO categories (name, lft, rght) VALUES ('Electronics', 1, 10);
-- Subcategory "Phones" under "Electronics"
INSERT INTO categories (name, lft, rght) VALUES ('Phones', 2, 5);
-- Sub-subcategory "Smartphones" under "Phones"
INSERT INTO categories (name, lft, rght) VALUES ('Smartphones', 3, 4);
-- Subcategory "Computers" under "Electronics"
INSERT INTO categories (name, lft, rght) VALUES ('Computers', 6, 9);
-- Sub-subcategory "Laptops" under "Computers"
INSERT INTO categories (name, lft, rght) VALUES ('Laptops', 7, 8);
Here is the visualization of the lft
and rght
for each node on the tree.
Querying Data Using the Nested Set Model
1. Fetching All Descendants of a Node
To fetch all descendants of a category (e.g., all subcategories under "Electronics"), you can use the following query:
SELECT name
FROM categories
WHERE lft BETWEEN (SELECT lft FROM categories WHERE name = 'Electronics')
AND (SELECT rght FROM categories WHERE name = 'Electronics');
This query efficiently retrieves all categories between the lft
and rght
values of "Electronics".
We can use WITH RECURSIVE as well for above query, here is the query.
WITH RECURSIVE category_tree AS (
SELECT id, name, lft, rght
FROM categories
WHERE name = 'Electronics'
UNION ALL
SELECT c.id, c.name, c.lft, c.rght
FROM categories c
INNER JOIN category_tree ct ON c.lft > ct.lft AND c.lft < ct.rght
)
SELECT name
FROM category_tree;
The result of above query will include "Electronics" as well, to exclude we can use this query.
SELECT name
FROM categories
WHERE lft > (SELECT lft FROM categories WHERE name = 'Electronics')
AND rght < (SELECT rght FROM categories WHERE name = 'Electronics');
And here is the equal query if using WITH RECURSIVE.
WITH RECURSIVE category_tree AS (
SELECT id, name, lft, rght
FROM categories
WHERE name = 'Electronics'
UNION ALL
SELECT c.id, c.name, c.lft, c.rght
FROM categories c
INNER JOIN category_tree ct ON c.lft > ct.lft AND c.lft < ct.rght
)
SELECT name
FROM category_tree
WHERE name != 'Electronics';
2. Fetching the Entire Tree Structure
To retrieve the entire tree in the correct hierarchical order, you can use:
SELECT name, lft, rght
FROM categories
ORDER BY lft;
This will list all categories based on their lft
values, preserving the tree hierarchy.
3. Fetching the Ancestors of a Node
To get all the ancestors of a specific node (e.g., "Smartphones"), use the following query:
SELECT parent.name
FROM categories AS node,
categories AS parent
WHERE node.lft BETWEEN parent.lft AND parent.rght
AND node.name = 'Smartphones'
ORDER BY parent.lft;
This query returns all parent categories of "Smartphones", starting from the root.
Inserting Nodes into the Nested Set Model
Inserting a node into the tree involves updating the lft
and rght
values of the existing nodes and then inserting the new node with appropriate lft
and rght
values. For example, to insert a new category called "Tablets" under "Phones", you need to:
- Update the
lft
andrght
values of the existing nodes. - Insert the new node with its corresponding
lft
andrght
values.
-- Update the right values
UPDATE categories
SET rght = rght + 2
WHERE rght >= 5;
-- Update the left values
UPDATE categories
SET lft = lft + 2
WHERE lft > 5;
-- Insert the new node
INSERT INTO categories (name, lft, rght)
VALUES ('Tablets', 5, 6);
This query adds "Tablets" as a child of "Phones".
Deleting Nodes
To delete a node (and all its descendants), you need to:
- Remove the node and its children.
- Update the
lft
andrght
values of the remaining nodes.
For example, to delete the "Phones" node and all its descendants:
-- Delete the node and its descendants
DELETE FROM categories
WHERE lft BETWEEN 2 AND 5;
-- Update the remaining nodes
UPDATE categories
SET rght = rght - 4
WHERE rght > 5;
UPDATE categories
SET lft = lft - 4
WHERE lft > 5;
What we learned so far?
The Nested Set Model is a powerful technique for managing hierarchical data in MySQL. Although inserting and deleting nodes require more complex operations, the model offers significant advantages for querying entire branches or subtrees. Understanding and applying this model can significantly improve the efficiency of applications that need to work with hierarchical data.
Hope it helps.
Comments ()