I guess I didn't look as hard at your "questions" as I should have, but this still bothers me, just no longer at a fully conscious level. (But then, at this time of night, "fully conscious" might be asking a bit much.) Back to the issue - the nagging feeling I have probably means there is an unresolved issue in my mind that could merely be a prejudice or it could be a bigger issue. I just don't know.
I don't blame you at all. This IS an unusual structure and in all databases I've seen, none has this kind of structure. In that sense, I'm tempted to say the structure may be wrong. But on the other hand, just because you haven't seen it before, doesn't make it wrong. Unusual situation requires unusual solutions. This is why I'm asking so many questions here; I have my doubts that the solution I proposed is the best one. And I fully intend to try this solution; burn it in furnace until it comes out golden or discard it if it comes out as a lump of ash.

Perhaps the problem is that every role potentially has different "questions" to be answered - even if they represent attributes of the role. This makes role unification nigh unto impossible when these "questions" are part of the role tables. I am no expert on the higher levels of normalization, but I believe that by folding the attributes into the role tables and allowing the roles to be so different as to be incompatible, you have exposed a denormalization of a very subtle type.
Hmm. Interesting., especially the "denormalization of a very subtle type." The way I look at this, it was most logic to split out incompatible attributes into 1-1 tables, to avoid creating repeating columns had I created freestanding tables for each roles (e.g. names, personal data and other attributes that are true of all employees, whatever their roles may be.)
A tangent: Based on what I've seen, some time people treat two different kind of people differently. For example, an template for student registration stored entries about students' personal data but teachers were in their own table and no personal data was no asked about the teacher. That's perfectly fine, if you decide that it's HR's job to keep track of teachers' personal data, and the database exists to keep records about what classes students may register for. Possibly because of the false dichotomy between students' data and teachers' data, no one thought about what if we needed personal data that are common to both students and teachers but also needed incompatible attributes about students and teachers.
Now, one thing I should make clear; if we were talking about one or two extra columns for two incompatible roles, then big whoops. Just leave it in the one table; I've wasted a few bytes in exchange of much more simpler table maintenance. Things starts to get hairy ONLY if we're talking about more than two roles, adding column that's more than just Yes/No (1 byte) *AND* would be empty (e.g. not applicable) to large percentage of the employee. With two roles (let's assume that all employees must have a role, and roles are mutually exclusive), you have 50% chance that a column will be empty. Not too bad. With 3 roles, it's now 33%. 4, 25%, and I'm not even factoring the numbers of column associated with each roles. Here, it becomes a performance and possibly data integrity issues to maintain a one big table with lot of empty fields.
With 1-1 tables, I now can assure that all relevant tables are relatively "full", the child tables just has the records where those extra bytes will be put to good use.
It is trivial to unify roles when they are part of a list associated with a role. You just about can't do any unification if not. And I'm in favor of coalescing those three parallel tables to one table (of unified roles) and one child table (of role attributes that you call "questions.") Am I making sense that way?
Last night, I thought very carefully about the possibility of creating a list of questions about roles' attributes and storing them as record rather than fields with the junction tables you proposed earlier. I *think* it could be doable; I'll need to run a small prototype database to test the idea. But it hit me- there is a data integrity problem. Suppose I wanted the questions to be required, something I can do with fields. The advantage of having 1-1 tables is that I could make all fields in each tables required, but do not have to fill in the fields in child record until I associate an employee record to one of child record. Thus I can maintain data integrity; how would I do this if the fields were now a set of record in a "Question table" with answers in a junction table? I'd imagine I'd have to take extra steps to look up whether a employee record is missing answers to certain roles and force creation of record. However, that feels wrong to me because as I said earlier, nobody should be making records "just in case" or deleting records just to get everything else working correctly.
Or maybe you know something that I don't, and can explain how we can maintain data integrity between a pseudo-field record and a answer record?
You were forced to split your employees table into parts because you could not unify the roles. But the question I have is whether increasing the virtual depth of the tables (to have lists of attributes underneath) would allow you to narrow the three-table layer to a single table, thus unifying everything and yet retaining generality.
This, I have to agree; the three-way split can be quite a challenge to maintain, and perhaps even unnecessary. I say so because I know of no other solution. However, my ignorance doesn't mean that no other better solution exist for this problem.
We are talking past each other. You make NO RECORD AT ALL if there is no attendance. Then you don't JOIN anything. You just count records matching the person and class from the attendance table. This is a JUNCTION table, perhaps - but you don't have to exercise a JOIN to get there. Just a DCount. If the DCount of the person/class records is zero, you know what your attendance was like. End of question. Not a null - because DCount is one of the rare Domain Aggregates that does empty sets correctly.
Okay, so we're agreed on that point then. Therefore, having nulls for where there were no attendances is not poor design in itself. It is more about how you join tables (not necessary for this scenario). However, you say that DCount is an exception as it handle empty sets correctly; if we had another scenario using say, DSum or DAverage, does the possibility of having nulls (because the intersection wasn't needed, just like if teacher took roll call and didn't enter a record for an absent) in the junction table put those out of whack?
Last edited: