Site Navigation

SQL Smackdown 02.17.2003

[2003-02-17] < Back

SQL Smack down is back!

It’s been a while since the last issue, but I think this issue should make up for having to wait so long.

The main focus of this article will be trees and hierarchies, and I am going to base most of this article around the example of an employee table, which will show subordination in the structure.

One of the problems with trees when using a database language like SQL, is that trees are hierarchal data, and SQL is designed to store relational data. Below I will go explain the 2 most common ways to model hierarchal data in SQL, and go over how to use one of them.

Adjacency Lists

Most of you have probably seen something similar to this example in a book or on the web, and I am betting that the example told you to build a table with a structure similar to the one below:

Employee
emp boss Other Info
Jamie NULL
Rudy Jamie
Julie Jamie
Mike Julie
Ed Mike
Nate Mike

The above structure represents what is called an adjacency list model, and is fairly easy to use. The problems with this model are that it is not very normalized, and it does not model subordination very well (if you remove Mike, then Ed and Nate will be lost in the hierarchy). Another problem with this model is that in order to query a tree from it, you most likely will be relegated to using a cursor. There must be a better way…

Nested Sets

The answer is nested sets. Let’s consider a table with the following structure:

Employee
emp lft rgt
Jamie 1 12
Rudy 2 3
Julie 4 11
Mike 5 10
Ed 6 7
Nate 8 9

This structure is more normalized, since the emp column is not duplicated when used for the boss column. Also, if we removed Mike from the table, Ed and Nate would not be lost in the hierarchy (you will see why soon). There are also some simple queries we can run to get the tree structure.

In order to understand what is going on in the structure above, we will use a simple nested oval analogy. With the data from the tree above, we can see that Jamie, with the span in between his lft and rgt values encompassing all other lfg and rgt values, will have the largest oval, encompassing all other ovals. Rudy has a lft of 2 and a rgt of 3, which means he will be inside Jamie’s oval. Julie has a lft value of 4 and a rgt value of 11, so she will be inside of Jamie’s oval, but not inside of Rudy’s oval, since Rudy’s lft and rgt values do not fall in between Julie’s range. Mike’s lft and rgt values of 5 and 10 do, however, fall in between Julie’s, so his oval is located inside of Julie’s oval, which is in turn inside of Jamie’s oval. Ed and Nate have lft and rgt values not encompassing each other, but that fall inside of Mike’s range, so they will be inside of Mike’s oval. Below I have included an image to help you visualize this idea.

From the above model I hope you can get a basic understanding of how the lft and rgt fields work to group the employees.

Real world examples

Now that you have a basic understanding of how the Nested Sets models works, you may be thinking that it looks nice, but how is this supposed to make it easier to use the tree? While I will admit it looks like a daunting task to write queries to select trees from this structure, and even more difficult to insert an employee in the middle or even at the end of the tree, with the few examples I give below you should have a good foundation of how to write most, if not all of the queries you will need for the tree.

First, here is the code to create the table, and insert the sample data.

CREATE TABLE Employee (
    emp VARCHAR(10) NOT NULL PRIMARY KEY,
    lft INTEGER NOT NULL UNIQUE CHECK (lft > 0),
    rgt INTEGER NOT NULL UNIQUE CHECK (rgt > 1),
    CONSTRAINT order_okay CHECK (lft < rgt)
)

INSERT Employee (emp, lft, rgt) VALUES ('Jamie', 1, 12)
INSERT Employee (emp, lft, rgt) VALUES ('Rudy', 2, 3)
INSERT Employee (emp, lft, rgt) VALUES ('Julie', 4, 11)
INSERT Employee (emp, lft, rgt) VALUES ('Mike', 5, 10)
INSERT Employee (emp, lft, rgt) VALUES ('Ed', 6, 7)
INSERT Employee (emp, lft, rgt) VALUES ('Nate', 8, 9)

Now we will move on to the examples.

Example 1: Selecting an employee and all of their supervisors.
SELECT b.*
    FROM Employee AS a
    INNER JOIN Employee AS b
        ON (a.lft BETWEEN b.lft AND b.rgt)
    WHERE a.emp = 'Ed'
    ORDER BY b.rgt

This will return the following results, showing Ed, then each one of his superiors in order of their subordination, all the way up to Jamie.

emp lft rgt
Ed 6 7
Mike 5 10
Julie 4 11
Jamie 1 12
Example 2: Selecting a supervisor and all of their subordinates.
SELECT a.*
    FROM Employee AS a
    INNER JOIN Employee AS b
        ON (a.lft BETWEEN b.lft AND b.rgt)
    WHERE b.emp = 'Mike'
    ORDER BY a.rgt DESC

This query will return the following results, showing Mike, then all of his subordinates.

emp lft rgt
Mike 5 10
Nate 8 9
Ed 6 7
Example 3: Adding an employee.

You will need to create a stored procedure to do this one, or execute it as a batch in the query analyzer.

BEGIN
DECLARE @right_most_sibling INTEGER;
SET @right_most_sibling = (SELECT rgt
                            FROM Employee
                            WHERE emp = 'Mike'); -- Supervisor

UPDATE Employee
    SET lft = CASE
                WHEN lft > @right_most_sibling THEN lft + 2
                ELSE lft
            END,
        rgt = CASE
                WHEN rgt >= @right_most_sibling THEN rgt + 2
                ELSE rgt
            END
    WHERE rgt >= @right_most_sibling;

INSERT INTO Employee (emp, lft, rgt)
    VALUES ('Dave', @right_most_sibling, (@right_most_sibling + 1))
END

This will add the new employee, and shift all necessary lft and rgt values accordingly.

Example 4: A list of nodes so you can make an indented listng.
SELECT COUNT(a.emp) AS indentation, a.emp, a.lft
    FROM Employee AS a
    INNER JOIN Employee AS b
        ON (a.lft BETWEEN b.lft AND b.rgt)
    GROUP BY a.emp, a.lft
    ORDER BY a.lft

This will return a straight list that can be used to display the tree in your application, if you want to display the tree in query analyzer, try this:

SELECT CAST(REPLICATE(' ', COUNT(a.emp) - 1) + a.emp AS VARCHAR(15)) AS Tree, a.lft
    FROM Employee AS a
    INNER JOIN Employee AS b
        ON (a.lft BETWEEN b.lft AND b.rgt)
    GROUP BY a.emp, a.lft
    ORDER BY a.lft

This should return a result set similar to the following, depending on which data you have inserted into the table:

emp lft
Jamie 1
Rudy 2
Art 3
Julie 6
Mike 7
Ed 8
Nate 10
Dave 12
Laura 16

If you have any more questions about trees and hierarchies please let me know. This is a confusing subject, and I am sure I have not done a perfect job of explaining them here.

Disclaimer: Much of the information contained in this article was derived from Joe Celko's article at Intelligent Enterprise, various message board posts, and the author's own personal experience.

Well that's it for this edition of SQL Smack down. As always, if you have any questions about these or any other topics, please talk to Barry.

< Back