CONNECT BY, Part II

In my last post, I used CONNECT BY to query a table holding the bones of the skeleton in the “Dem Dry Bones” song: since the skeleton in the song has a head with no parent bone (making it a root node), and each bone has only one parent bone, it’s hierarchical data.

In this post, I’m going to add an arm and some ribs to the skeleton to make the data more interesting to query.

INSERT INTO skeleton VALUES ('a rib', 'back');
INSERT INTO skeleton VALUES ('b rib', 'back');
INSERT INTO skeleton VALUES ('c rib', 'back');
INSERT INTO skeleton VALUES ('arm', 'shoulder');
INSERT INTO skeleton VALUES ('wrist', 'arm');
INSERT INTO skeleton VALUES ('finger','wrist');
 
commit;

Now let’s see what the skeleton looks like.

SELECT bone, connected_to_the, level, 
lpad(' ',2*level, '-') || bone AS bone_tree  
--,ltrim(sys_connect_by_path(bone,' -> '),' -> ') as path 
FROM skeleton
START WITH connected_to_the IS NULL
CONNECT BY prior bone=connected_to_the
BONE      CONNECTED LEVEL BONE_TREE
--------- --------- ----- ---------------------------------------------
head                    1 - head
neck      head          2 --- neck
shoulder  neck          3 ----- shoulder
arm       shoulder      4 ------- arm
wrist     arm           5 --------- wrist
finger    wrist         6 ----------- finger
back      shoulder      4 ------- back
a rib     back          5 --------- a rib
b rib     back          5 --------- b rib
c rib     back          5 --------- c rib
hip       back          5 --------- hip
thigh     hip           6 ----------- thigh
knee      thigh         7 ------------- knee
leg       knee          8 --------------- leg
ankle     leg           9 ----------------- ankle
heel      ankle        10 ------------------- heel
foot      heel         11 --------------------- foot
toe       foot         12 ----------------------- toe
 
18 rows selected.

Now we’ve got some branches!

But this has potential to get a little ugly, because the various bones hanging off the backbone could come out in any order. It could show the hips above the ribs, or the ribs out of order, or the back above the arm. There’s an ORDER BY clause we can use to guarantee they come out in the order we want: ORDER SIBLINGS. Siblings are child nodes on the same level. For example, ‘arm’ and ‘back’ are both children of the ‘shoulder’ node on the same level. Likewise, all the ribs and the hip are siblings. As it happens, ordering alphabetically by bone will guarantee the order we want.

SELECT bone, connected_to_the, level, lpad(' ',2*level, '-') || bone AS bone_tree 
FROM skeleton
START WITH connected_to_the IS NULL
CONNECT BY prior bone=connected_to_the 
ORDER siblings BY bone ;

There are a couple more pseudocolumns that are fun. Going back to the tree metaphor, we have the root node, the node with no parent. And then we have leaf nodes, which are nodes with no children. In this case the root is the head and the leaves are toes and fingers and ribs. Oracle SQL has a pseudocolumn for root and leaf.

The connect_by_root psuedocolumn gives the root node:

SELECT bone,   
ltrim(sys_connect_by_path(bone,' -> '),' -> ') AS path,
connect_by_root bone AS root
FROM skeleton
START WITH connected_to_the IS NULL
CONNECT BY prior bone=connected_to_the 
ORDER siblings BY 1
BONE      PATH                                                                                                 ROOT
--------- ---------------------------------------------------------------------------------------------------- ---------
head      head                                                                                                 head
neck      head -> neck                                                                                         head
shoulder  head -> neck -> shoulder                                                                             head
arm       head -> neck -> shoulder -> arm                                                                      head
wrist     head -> neck -> shoulder -> arm -> wrist                                                             head
finger    head -> neck -> shoulder -> arm -> wrist -> finger                                                   head
back      head -> neck -> shoulder -> back                                                                     head
a rib     head -> neck -> shoulder -> back -> a rib                                                            head
b rib     head -> neck -> shoulder -> back -> b rib                                                            head
c rib     head -> neck -> shoulder -> back -> c rib                                                            head
hip       head -> neck -> shoulder -> back -> hip                                                              head
thigh     head -> neck -> shoulder -> back -> hip -> thigh                                                     head
knee      head -> neck -> shoulder -> back -> hip -> thigh -> knee                                             head
leg       head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg                                      head
ankle     head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle                             head
heel      head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel                     head
foot      head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot             head
toe       head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot -> toe      head

This is boring because we have specified a START WITH condition which occurs once in the table, so the root will always be the same. Let’s omit the start with and try again,

SELECT bone, 
ltrim(sys_connect_by_path(bone,' -> '),' -> ') AS path,
connect_by_root bone AS root
FROM skeleton
--start with connected_to_the is null
CONNECT BY prior bone=connected_to_the 
ORDER siblings BY 1
BONE      PATH                                                                                                 ROOT
--------- ---------------------------------------------------------------------------------------------------- ---------
heel      heel                                                                                                 heel
foot      heel -> foot                                                                                         heel
toe       heel -> foot -> toe                                                                                  heel
wrist     wrist                                                                                                wrist
finger    wrist -> finger                                                                                      wrist
a rib     a rib                                                                                                a rib
b rib     b rib                                                                                                b rib
c rib     c rib                                                                                                c rib
hip       hip                                                                                                  hip
thigh     hip -> thigh                                                                                         hip
knee      hip -> thigh -> knee                                                                                 hip
leg       hip -> thigh -> knee -> leg                                                                          hip
ankle     hip -> thigh -> knee -> leg -> ankle                                                                 hip
heel      hip -> thigh -> knee -> leg -> ankle -> heel                                                         hip
foot      hip -> thigh -> knee -> leg -> ankle -> heel -> foot                                                 hip
toe       hip -> thigh -> knee -> leg -> ankle -> heel -> foot -> toe                                          hip
toe       toe                                                                                                  toe
...

Now we can really see what this query is doing. Without the “start with” telling us which row is the root, any row could be the root. So we have all the paths we would get if the hip was the root, all the paths we would get if shoulder was root, and so on. Even the leaf nodes are represented in one-node paths.

We can cut out the leaf nodes by using another pseudocolumn, connect_by_isleaf. This takes a value of 1 if the node is a leaf node, 0 if it’s not. So to show only leaf nodes that have children:

SELECT bone, level, 
ltrim(sys_connect_by_path(bone,' -> '),' -> ') AS path,
connect_by_root bone AS root
FROM skeleton
WHERE connect_by_isleaf=1
START WITH connected_to_the IS NULL
CONNECT BY prior bone=connected_to_the 
ORDER siblings BY 1
BONE      LEVEL PATH                                                                                            ROOT
--------- ----- ----------------------------------------------------------------------------------------------- ---------
finger        6 head -> neck -> shoulder -> arm -> wrist -> finger                                              head
a rib         5 head -> neck -> shoulder -> back -> a rib                                                       head
b rib         5 head -> neck -> shoulder -> back -> b rib                                                       head
c rib         5 head -> neck -> shoulder -> back -> c rib                                                       head
toe          12 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot -> toe head

Now! I want to make our skeleton do some yoga. Let’s connect the head bone to the toe bone.

UPDATE skeleton SET connected_to_the='toe' WHERE bone='head';

This curls the skeleton up in a circle, and the path from head to toe and back to head again becomes a loop. We call this loop a “cycle”. And now if we run a hierarchical query we get an error about the loop:

SELECT bone, connected_to_the, level, 
ltrim(sys_connect_by_path(bone,' -> '),' -> ') AS path
FROM skeleton
START WITH bone='head'
CONNECT BY prior bone=connected_to_the 
ORDER siblings BY 1
BONE      CONNECTED LEVEL PATH
--------- --------- ----- -----------------------------------------------------------------------------------------------
head      toe           1 head
neck      head          2 head -> neck
shoulder  neck          3 head -> neck -> shoulder
arm       shoulder      4 head -> neck -> shoulder -> arm
...
toe       foot         12 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot -> toe
head      toe          13 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot -> toe
                           -> head
neck      head         14 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot -> toe
                           -> head -> neck
...
thigh     hip          18 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot -> toe
                           -> head -> neck -> shoulder -> back -> hip -> thigh
 
ERROR:
ORA-01436: CONNECT BY loop in user data

If we know there’s a loop in the data and we still want to get results, we can use the NOCYCLE keyword. The syntax is “connect by nocycle”

SELECT bone, connected_to_the, level, ltrim(sys_connect_by_path(bone,' -> '),' -> ') AS path
FROM skeleton
START WITH bone='head'
CONNECT BY NOCYCLE prior bone=connected_to_the 
ORDER siblings BY 1
BONE      CONNECTED LEVEL PATH
--------- --------- ----- -----------------------------------------------------------------------------------------------
head      toe           1 head
neck      head          2 head -> neck
shoulder  neck          3 head -> neck -> shoulder
arm       shoulder      4 head -> neck -> shoulder -> arm
wrist     arm           5 head -> neck -> shoulder -> arm -> wrist
finger    wrist         6 head -> neck -> shoulder -> arm -> wrist -> finger
back      shoulder      4 head -> neck -> shoulder -> back
a rib     back          5 head -> neck -> shoulder -> back -> a rib
b rib     back          5 head -> neck -> shoulder -> back -> b rib
c rib     back          5 head -> neck -> shoulder -> back -> c rib
hip       back          5 head -> neck -> shoulder -> back -> hip
thigh     hip           6 head -> neck -> shoulder -> back -> hip -> thigh
knee      thigh         7 head -> neck -> shoulder -> back -> hip -> thigh -> knee
leg       knee          8 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg
ankle     leg           9 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle
heel      ankle        10 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel
foot      heel         11 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot
toe       foot         12 head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot -> toe

Oracle also provides a pseudocolumn to tell us when we’re in a cycle: connect_by_iscycle. 0 when not a cycle, 1 when it is a cycle.

SELECT bone, connected_to_the,
ltrim(sys_connect_by_path(bone,' -> '),' -> ') AS path, 
connect_by_iscycle AS iscycle
FROM skeleton
START WITH bone='head'
CONNECT BY NOCYCLE prior bone=connected_to_the 
ORDER siblings BY 1
BONE      CONNECTED PATH                                                                                               ISCYCLE
--------- --------- ----------------------------------------------------------------------------------------------- ----------
head      toe       head                                                                                                     0
neck      head      head -> neck                                                                                             0
shoulder  neck      head -> neck -> shoulder                                                                                 0
arm       shoulder  head -> neck -> shoulder -> arm                                                                          0
wrist     arm       head -> neck -> shoulder -> arm -> wrist                                                                 0
finger    wrist     head -> neck -> shoulder -> arm -> wrist -> finger                                                       0
back      shoulder  head -> neck -> shoulder -> back                                                                         0
a rib     back      head -> neck -> shoulder -> back -> a rib                                                                0
b rib     back      head -> neck -> shoulder -> back -> b rib                                                                0
c rib     back      head -> neck -> shoulder -> back -> c rib                                                                0
hip       back      head -> neck -> shoulder -> back -> hip                                                                  0
thigh     hip       head -> neck -> shoulder -> back -> hip -> thigh                                                         0
knee      thigh     head -> neck -> shoulder -> back -> hip -> thigh -> knee                                                 0
leg       knee      head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg                                          0
ankle     leg       head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle                                 0
heel      ankle     head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel                         0
foot      heel      head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot                 0
toe       foot      head -> neck -> shoulder -> back -> hip -> thigh -> knee -> leg -> ankle -> heel -> foot -> toe          1

The value of connect_by_iscycle is only 1 when we’re about to complete the cycle. The head node is connected to the toe node, but the head node is also an ancestor of the toe. This is useful for figuring out where the cycles are in the data.

In the last two posts, we’ve looked at querying hierarchical data using CONNECT BY. In the next post, we’ll look at newer syntax introduced in 11g: recursive WITH.

Illustration by Barbie de Plastico, used under cc license.

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.