Recursive Relationships - Discussion

tehNellie

Registered User.
Local time
Today, 15:00
Joined
Apr 3, 2007
Messages
751
Sooo, I finally have the chance to tear apart some of the horrible structures that live in one of my databases.

Currently it Tracks systems access levels across the company, and to do this we create Roles that are loosely based on the company structure and comprise of 4 constituent parts in the name and a bunch of stuff that govern the access level of accounts in that role on the system in question.

To handle this I have 4, interrelated, Tables called role 1, role 2 and so on which contain simply the descriptor of the role part that they contain, so that [Role 1] might contain "Finance", [role 2] might contain "payroll", [role 3] "contrator payments", [role 4] "payments administrator".

Role 1 is related to role2,3,4 and so on up the chain and each individual role table is related to the "master" Role definition which contains the access level information for the system in question.

I'm hoping at this point that everyone currently looks similar to :eek:

If not, let me add that A role can currently contain either [role 1],[role 2][role 3] and a placeholder "#no level 4#" or can contain a "proper" descriptor in [Role 4].

Because of the design, we currently have 3000+ "no level 4#"s held in [Role 4] (wheres the slap head smiley when you need it?)

Now I've been looking at a number of ways of trying to Normalise and improve this part of the DB, the obvious solution, because role 1-4 tables are purely descriptors is to just combine all of those into one "role" table, stick a junction table between it and the Role Definition table and be done with it. However this still leaves several problems, we're still, sort of, hardcoded to 4 levels within the database itself (ok so we can just add another column if we need more) and a few other obvious failings.

Still with me?

So I've started to look into the possiblity of using a recursive relationship on what is still, in effect, the Junction table between the descriptors and the Role Definition.

At a basic level I now have 3 tables:
Code:
Role
----
RoleID - PK
Description - varchar

Roleconfig
----------
ConfigID - PK
ParentconfigID - int
RoleID - int

Roledefinition
-------------
ID
RoleconfigID

ParentconfigID relates to ConfigID within RoleConfig

At the moment, this structure, again at a basic level, now appears to work. However the variable elements within a role looked like a potential problem. Finding element one is simple, the [partentconfigID] is NULL. Finding the Top element when you've got 4 is simple, [configID] doesn't appear in [parentconfigID].

Where the fun starts is trying to control the recursion where you've got role1,role2, role3 being a valid role description and a role4 added to it also being a valid role description. Now as far as I can see there are two options to handle this.

1) Create in Roleconfig an entry (ok, entries) for role1,2,3 and use that as your 3 element role description. Create new entries containing the same information for your 1,2,3,4 role element. Less than ideal for, I hope, obvious reasons, we're still basically duplicating information and it is also difficult to build your role description in a query because you don't know how many elements will comprise that description.

2) Add a "valid" boolean column to roleconfig so that you can reuse your existing 1,2,3 and simply tag role 3 as 'valid', then add a role4 element and also tag that as 'valid'. The main downside to this is similar to the last one above, you know that valid means it is a top level description, but you still don't know how many elements there are and outputting a list containing

Finance-Payroll-ContractorPayments-PaymentAdmin
AND
Finance-Payroll-ContractorPayments

As valid roles still looks like it requires some jiggery-Pokery

I still have some concerns about controlling the recursion and ensuring that roledefinition can only relate back to a valid top level role which looks like it will require some careful planning. It's necessary to create a validation rule so that parentconfigID cannot be the configID for example, and I'll need to ensure that Roledefinition cannot relate to a roleconfig that isn't the last element in the chain.

We already "shoehorn" what are effectively 5+ element role descriptions into this structure, using recursion like this, I believe, eliminates the need for future Database changes if the front end code is amended to handle it. Which I guess is where the "discussion" part of the thread title comes in.

Sorry for the length of the thread, but this is melting my brain at the moment and it's not something that seems to come up very often so thought it might be interesting.
 
Rip that dog apart. Stop looking at the stuff you have. Refer to the REAL authority. What is the actual requirement? Because others have used this thing for a while, you are going to have to fight HARD HARD HARD to divorce "what was" from "what needs to be" - perhaps because folks have been immersed in this hellish design.

I suspect you should look at a table of roles and a junction table that lists roles associated with a particular other entity. (Not sure what that other entity might be from the description. You are right, it is very convoluted.)

Recursive relationships are allowed, but rarely actually needed, so your description tells me you are looking at the solution to describe the problem. In other words, the tail is wagging the dog.
 
Recursive relationships are allowed, but rarely actually needed, so your description tells me you are looking at the solution to describe the problem. In other words, the tail is wagging the dog.

Funnily enough that occurred to me a few minutes after writing the novel above. What I've proposed above "fixes" the problems we have, though possibly in a manner more complicated than actually required, but as it usually turns out when you actually ask the people who use it whether it actually meets their requirement it turns out that actually, it doesnt. "fixing" the current solution doesn't actually provide the full functionality that they want.

Now it may well be that we have to settle with improving the crap that currently exists in the DB, and however we do it it will be a nightmare to sort out what is there currently, but if there's a possibility that we can kill two bids with one stone and define a level of functionality that gives the users what they need then it's worth persuing.

I'm still interested in the notion of the recursive relationship, it does appear to offer some interesting flexibility that a standard junction table solution can't, but it also appears to be a bitch to control properly.
 
For discussion only:

The ultimate in recursive relationships is a family tree if it is properly normalized. Think about it.

Everybody in the tree is related to everybody else in some way. You can do this design in two main tables, the effect of which is recursive. (There are other tables for supplemental information)

Table 1 is the person table. Each person is listed, has an ID, birth date, death date, a few other things. Oh, and of course a full name written out in some convenient list of surname, given name, middle name, etc.

Table 2 is the relationships table. It contains three elements minimum. (You can add other elements if you wish, mostly of the optional type.) The elements are
Person1 - ID from person table
Person2 - different ID from person table
NatureOfRelation - a code that can be looked up in a table, perhaps. This third table, however, is essentially static.

Now, you "read" the entry as Person1 RelatedTo Person2

The third table contains codes that allow you to fill in the relation. The table contains "code number" "forward relation" and "backward relation" - so you might see a relation table that says <person1, relation, person2>

1, 1, 2
3, 2, 1
3, 2, 2
4, 2, 1
4, 2, 2

And when you look up the fields you would find that 1 is Adam, 2 is Eve, 3 is Cain, 4 is Able, and for the middle column, 1 is spouse and 2 is child. Reading forwards, this is "Cain" "is child of" "Adam" - and the backwards read would be "Adam" "is parent of" "Cain"

Why is this recursive? Because you can trace generations by following links from person table to relationship table one link-pair at a time.

Search the relationships for all records with X as Person1 and "Child" as relationship - and you have the first generation descendents of person X. Check each of those descendents in turn for having "Child" records (which you have to do by recursive tree analysis techniques) and you have found the second generation.

See, a practical case where recursion makes sense.

But back to your original problem... I'll lay odds that what you REALLY REALLY wanted was just a parent/child table with one//many relationships to those roles. Or at most a parent table with two dependent child tables for different attributes to be one//many.
 
Last edited:
Hmmm, actually a family tree is a very good analogy of what we have at the moment each business division is the top of the tree "parent" for example "Technical" or "Finance". A child of Technical is "Network Security", one its children is "Router Support" and so on.

That said I'm still not convinced that a recursive structure is actually the best method to implement (the perils of listening to a "keen" developer). What I do like about it is that it removes the need to touch the DB structure if they suddenly decide they want to add a 5th level to the Role Description what I'm still not convinced about is the ability to easily control and query the structure in an efficient manner nor, as is the wont of companies, to easily control any re-arrangement of the structure of the "tree".
 
The PRACTICAL problem of using a family tree structure is that reports are a bloody nightmare. Access could handle this structure in a single VBA routine but queries are not inherently recursive.
 
Fortunately I have this in SQL server which gives a bit more flexibility but at the moment it looks like the overhead for controlling the structure, especially as you wont know how many levels you are going to travel before you complete the query, far outweighs the benefits of apparent flexibility. There's a lot of unioning and WHILE loops to perform which means that where I query a "standard" junction table and relationships once I have to potentially query the recursive setup multiple times to return a table view or the information that I want. Adding a "level" and "valid" column to the table helped improve the efficiency and readability somewhat, but from a real world point of view, it is something that I think I'd caution trying to implement.
 
The PRACTICAL problem of using a family tree structure is that reports are a bloody nightmare

Darn. For a second there, Doc_Man, you'd given me a glimmer of how I might actually be able to store taxonomic information for species (Kingdom, Phylum, Order etc) since there are tons of hierarchical taxonomic levels that are not always present for one branch of the family tree (so to speak) compared to others. But if you say getting the info out is a nightmare then I'll wait a few years before delving into that.

Still, thanks for the interesting discussion. :)
 
Sad to say, Craig, the technology is there today to store it, but limitations on the design of SQL make report generation a nightmare.

Why? 'cause set theory don't know nothin' 'bout generations.

You COULD - if you bloody well had to - write a VBA program that tried to populate this stuff in a TREE VIEW (control) but I don't know that you can print those. If you can, that is the path to take. I'll leave it to you to decide whether that makes any sense at all.
 
It occurs to me that this is a problem in other areas too. For example, land ownership and parcel numbers. I designed a database to track changes in land ownership for a lawsuit database. Was told parcel numbers were the thing I could hang my hat on. Then I discovered that parcel numbers are not fixed...parcels get subdivided, merged etc all the dang time. The county electronic records do not seem to have a way to deal with this. Parcel numbers simply vanish and new ones appear in their database.

I solved the problem by getting halfway to your structure (I only store direct child/parent relationships for parcel numbers). But I never thought how to move from that to a tree view control (never actually used one of these).

Fiddly stuff. Sounds like there could be some very good applications for a relational database that can handle 'generations' by design.
 
The problem with tree-view controls isn't putting one on a form - it is in populating it.

But the REAL nightmare for doing a family tree, even in tree view, is to do one for the royal families in Renaissance Europe, where more dotted lines and bar sinisters cross the tree than you could imagine. Bastards (geneaological sense of that word) abound, divorces here and there, intermarriage of cousins here and there, ... ugh. You can still DESCRIBE it with the structure I had - if you allow multiple relationships and time-dependend relationships - but even a tree view has its limits.

From the sound of it, though, land records with original layout, subdividing, re-subdividing, coalescing, and other oddball things you can do with land titles... what a mess. But you want some REAL fun? Come to Louisiana, where all titles are doubled or tripled because of split rights between surface and subsurface. It is possible to own property but not its mineral rights (i.e. oil rights, mostly). And of course, vice-versa. Then watch the fun and games in THAT environment.
 
I put together a recursive design once for a power utility network outage tracker, where a device supplied none or more (1:M) devices, which in turn supplied none or more devices, etc, with no theoretical end.

Each device had exactly one source ('Parent'), so I didn't need more than one table (other than lookups). A zero value for parent simply meant that this device was at the 'top level'.

But I had to use code to do 'path tracing' towards and away from each device, as there could be anywhere from zero to ~20 steps to get to a source, and to create the 'tree' of devices supplied from a given device.
 
I once designed a real-time database in which we had a hierarchical structure that was inherently similar to the network tree we used to retrieve data. It was virtual, demand-paged, and retired buffers with least-recently-used algorithms, with sharing and non-sharing locks, wait queues, and deferred write-back. Now the fun part... it was in assembly language on a PDP-11, the code fit in 8 Kb, and if you could give it only enough physical memory to fit 10% of your virtual db size, it still ran within our time constraints and qualified as real time. (Mostly because the stuff we moved in real time took 2 MINUTES to open or close the valving.) In practice, our hierarchy stopped at level 4 for the hardware and level 5 for a few software structures. But in theory we could have reached 20 levels with no sweat.

It is amazing what you can do if you really put your mind to it.
 
I did come across this:
http://www.sqlteam.com/item.asp?ItemID=8866
While arguing with the developer who desperately wants me to use this structure. Effectively it caches the depth of the record within the hierarch directly in the table. The obvious downsides being that you need to keep that cache up to date and in this particular case, my junction table to control the recursion is now 6 times bigger than just taking my 4 levels, keeping them in their existing tables and normalising them properly.

It's not as "flexible" overall as what you can do with a single table, but it doesn't reduce you to a crumpled, rocking wreck trying to control the structure or get decent information back out if.
 
As is clearly enumerated in that thread (which somehow my site firewall let me get to), your issue is actually quite simple.

SQL, being based on homogenous sets, doesn't understand hierarchies too well. It is up to you to find a viable method for fiddling with them.

I recall once playing the game of just storing the hierarchy depth as a number and using VBA to print indented lists based on the depth field - but it was an adaptation that had other problems and wasn't intended as a long-term solution anyway.
 

Users who are viewing this thread

Back
Top Bottom