Follow us on Twitter!
Ideas are far more powerful than guns.
Thursday, April 24, 2014
Navigation
Home
HellBoundHackers Main:
HellBoundHackers Find:
HellBoundHackers Information:
Learn
Communicate
Submit
Shop
Challenges
HellBoundHackers Exploit:
HellBoundHackers Programming:
HellBoundHackers Think:
HellBoundHackers Track:
HellBoundHackers Patch:
HellBoundHackers Other:
HellBoundHackers Need Help?
Other
Members Online
Total Online: 37
Guests Online: 30
Members Online: 7

Registered Members: 82903
Newest Member: Piriformis
Latest Articles

Hierarchical data and SQL

Arrow Image This article explains some basic concepts to store hierarchical data in RDBMSs.



/* Introduction */

Hierarchical data, like in tree structures or other graphs, is hard to use directly in relational database management systems. Some trick has to be thought of in order to easily store and maintain such data. Lots of information can be found on this topic all over the internet, but I thought it would be a good idea to write a short article about this myself, to explain the basics.


/* The Case */

For this article, I thought I\'d make up some sample data to use for examples. Here is the tree (special case of a directed acyclic graph) I\'d like to store:
img407.imageshack.us/img407/9364/sqltree1.png
This tree represents some hierarchical structure, were Adams is Blake\'s manager and Evans is Blake\'s subordinate. The table will be called \'Personnel\' in my examples.


/* Solution 1: Adjacency list */

In this solution, we represent the tree by representing its edges, like this:
Code
CREATE TABLE Personnel (
   name varchar(40) not null primary key,
   mgr varchar(40) references Personnel(name)
);



This means: there is an edge from employees to their managers.

Something you might have to do regularly, is asking if someone is (indirectly) subordinate of someone else. As the number of levels between those persons is unknown, we don\'t know how many times we have to self join Personnel. This means that this task can\'t be solved in traditional SQL. We would have to use a recursive query to solve this, like this (SQL3 format):
Code
WITH
   RECURSIVE Temp AS
      (SELECT name, mgr FROM Personnel)
   UNION
      (SELECT Temp.name, Temp.mgr
      FROM Temp, Personnel
      WHERE Temp.mgr = Personnel.name)
SELECT mgr
FROM Temp
WHERE name = \'James\' AND mgr = \'Adams\';



However, lots of RDBMSs don\'t support this. Updating the table (adding or deleting an employee) can also require multiple queries. Therefore, this method is often not preferred. However, executing SELECT queries can be simplified by making sure that your table is transitively closed.

I\'ll use the transitive relationship \'is greater than\' as an example for this. Imagine A is greater than B and B is greater than C. This also implies that A is greater than C. A transitively closed table would include all such (implied) pairs. This would mean a much bigger database, yet SELECT queries like the one above will presumably execute faster. So whether this method is efficient all depends on the information you want to store and on the way in which you\'ll be using said information.


/* Solution 2: Materialized path */

This solutions stores the complete path to each employee. In my example, James would have a path of 1.2.1.1, which means that James is the first child of 1.2.1 (Garcia), that Garcia is the first child of 1.2 (Clark) and, finally, that Clark is the second child of 1 (Adams).
Code
CREATE TABLE Personnel (
   name varchar(40) not null primary key,
   path varchar(10) not null
);



In this case, showing a given employee and all of his/her supervisors can be done (quite easily) with the LIKE operator:
Code
SELECT P1.name
FROM Personnel P1, Personnel P2
WHERE P2.path LIKE P1.path || \'%\'
AND P2.name = \'James\';



In Oracle, || is the operator for string concatenation. In MySQL, we would use the CONCAT() function.

So querying the table is a lot easier than with the first solution. Updating, however, remains (relatively) hard, depending on the string manipulation functions your RDBMS has built in.


/* Solution 3: Nested set */

This solution uses the preorder tree traversal algorithm. (http://en.wikiped. . ._traversal) For each node, the moment of visit entry and exit is recorded as its left and right values. It is important that you know what this means. I\'m too lazy to make an image to explain this, but you can try Google Images (\"preorder tree traversal\" -> the image with the food tree) for this. Our table is going to look like this:
Code
CREATE TABLE Personnel (
   name varchar(40) not null primary key,
   lft tinyint not null unique check (lft>0),
   rgt tinyint not null unique check (rgt>1),
   check (lft<rgt)
);



Clark\'s record has a lft value of 8 and a rgt value of 15. If we would want to show all of his subordinates, the query would be like this:
Code
SELECT name
FROM Personnel
WHERE lft BETWEEN 8 AND 15
ORDER BY lft ASC;



On the other hand, all of James\' managers can be found with this query:
Code
SELECT name
FROM Personnel
WHERE lft < 10 AND rgt > 11
ORDER BY lft ASC;



This would require a second query to find the lft and rgt values of a given employee though.

Intermezzo: The number of subordinates of a given person can be found with simple math: numDesc = (rgt-lft-1)/2 .

Updating this table is not that hard either. For example, lets hire a new employee called \'Lewis\' who is going to work for Blake. He\'s going to have the following values: lft=5, rgt=6. (That\'s between Evans and Ford.) First, we have to make some space by updating the lft and rgt values of all employees right of Lewis:
Code
UPDATE Personnel SET lft=lft+2 WHERE lft>4;
UPDATE Personnel SET rgt=rgt+2 WHERE rgt>4;



Now Lewis can fill up the new space:
Code
INSERT INTRO Personnel SET name=\'Lewis\', lft=4, rgt=5;




This last solution is probably the hardest to understand fully, yet you can do almost anything with this and it is usually a lot faster.


/* Epilogue */

First of all, I hope you liked this article and found it useful. English is not my native language, but I think I speak it well enough to make myself clear. There could still be some errors in this text though.

This article was meant as a quick overview. For more in-depth information, this link contains a really good article about the same subject:
http://www.dbazin. . .tropashko4

I know my article looks very similar to the one in the link on first sight, but I want to assure you that I have not copied anything directly from this site. I got my information mainly from my college materials, but I found out later on that my lecturers based their slides on that specific article.


Greetings,

GTADarkDude

Comments

ynori7on February 17 2010 - 18:08:34
God I hate working with trees. Nice article though, well organized.
korgon February 17 2010 - 20:41:07
Good information, well written, really liked it. Trees are fun to pee out of.
ShadyTyranton March 07 2010 - 19:55:18
Nice work interesting article.
Post Comment

Sorry.

You must have completed the challenge Basic 1 and have 100 points or more, to be able to post.