Is it right that if you have no outer joins between tables A, B and C say (in a line with A being top of the hierarchy in your database) then if you display information from a report that is in tables A and B and sum up a column in C using a measure to display it at the level of the B table then it will return no rows at all if there are no rows to sum up in table C?
Yes.
Anyone know a way round this if this is right?
Not with a relational database. A relational database doesn’t really understand the concept of a “hierarchy”. A relational database isn’t a hierarchical database. You can MIMIC a hierarchy in a relational database, but it isn’t really a hierarchy.
This is causing confusion with some of our users because they can’t understand why they should need to have rows in table C to get back a B hierarchy level report.
Because you have defined the join between B and C to be MANDATORY, not optional. If you are querying tables B and C, for every row in B there MUST be a row in C; and for every row in C there MUST be a row in B.
What ideas do people use on when to use outer joins in the universe. Obviously on the ends where reference data may not be recorded but what about between main database tables?
If you have properly and fully defined the cardinalities in the joins between your tables, then deciding whether to make a join outer or not should not be a problem. The key is looking at the minimum cardinality, which is not usually what we look at when we consider our cardinalities. We typically say things like, “These two tables have a one-to-many relationship.” But that’s an expression of the maximum cardinalities between the two tables, not the minimum. The true, full expression would be something like, “Table A has a (zero-or-one)-to-(one-or-many) relationship with table B.”
You have two tables, B and C, with a maximum cardinality of one-to-many, i.e., one B record may have many corresponding C records. First consider: Must every record in B have at least one corresponding record in C as a rule? If so, then the cardinality becomes one-to-(one-or-many). If not, then the cardinality becomes one-to-(zero-or-many). Assume it is the latter, one-to-(zero-or-many). Now consider: Must every record in C have a corresponding record in B as a rule? If so, then the cardinality becomes (one-and-only-one)-to-(zero-or-many). If not, then the cardinality becomes (zero-or-one)-to-(zero-or-many).
There’s your 2-minute course in defining cardinalities in relational databases!
It should be obvious now, that once your cardinalities are fully defined, deciding whether to make a join inner or outer is trivial. In our case, assume the full cardinality is (one-and-only-one)-to-(zero-or-many). In other words, every B record has zero or more corresponding C records; while every C record must have one and only one corresponding B records. If the two tables are joined on the field F1, the SQL would look like this (in Oracle):
B.F1 = C.F1 (+)
Keep in mind, though, that if your cardinality becomes something like (zero-or-one)-to-(zero-or-many), you have a problem. No relational database that I’m aware of will let you define a join like:
B.F1 (+) = C.F1 (+)
If you really have two tables with that cardinality, introduce a third table to act as an intermediary:
B.F1 (+) = Z.F1 AND
C.F1 (+) = Z.F1
Where Z.F1 contains all possible values of B.F1 and C.F1 (SELECT B.F1 UNION SELECT C.F1).
In a C level report we would not really want to return rows from A and B where no rows in C exist so we don’t use outer joins between A, B and C.
Well, then you’re stuck. Keep in mind that a “C-level” report includes a “B-level” report, since C is a lower level of the hierarchy, and you said before that you DID want to make C optional for your B-level reports.
I shouldn’t say you’re stuck. You can always create pre-defined conditions such as “C.F1 is not null” to effectively force the outer join back to an inner join.
If outer joins are used on all joins, is there any cost to report optimisation or any database issues I need to be aware of?
Well, it may be different for other RDBMS’s, but for Oracle there really isn’t much of a cost to outer joins. The key is to have proper indexing.
Erich Hurst
Compaq Computer Corporation
Listserv Archives (BOB member since 2002-06-25)