Universe Design Issue

I am working on combining 2 star schemas into one universe and I am running into a problem. To try to explain the problem in simple terms, let me use an example. Let’s say you have 2 fact tables which share 2 dimensions (basically,
in the shape of a baseball diamond, with 1st and 3rd base being the fact tables and 2nd base and home plate being the dimensions).

When you define the joins between the fact tables and the dimensions, you get a loop. You can resolve this loop by using contexts, but if you do this then if you want to run a query which combines one object from each fact table (with one object from one of the shared dimensions), Business Objects will split the SQL into 2 statements and then join the results. I believe this final join would be done on the PC. The problem is that there may be a lot of data to join on the PC and this could greatly affect performance.
Ideally, you would want this final join to be done on the server. If you were
to code this in SQL yourself, you could do this with one SQL statement.

The other alternative to using contexts is to use aliases to resolve the loop, but if you choose this method, then you would need to have 2 different dimension objects in the universe (for what is the same data base field);
one associated with each fact table. Unfortunately, then your query would need to
contain both objects in order to have both joins included. This is not something we would want our users to have to do.

The problem is really magnified for us as we often divide total cost (from one fact table) by total members (from another fact table) to come up with a Per Member Per Month (PMPM) amount. This appears to be impossible to implement as an object in the universe (if you use contexts to split the SQL). It seems like we will have to use variables on the reporting side to get these PMPM amounts.

Am I missing something? Is there a simple solution to this scenario?

Bill


Listserv Archives (BOB member since 2002-06-25)

Bill Wolff wrote:

I am working on combining 2 star schemas into one universe and I am
running
into a problem. …

When you define the joins between the fact tables and the dimensions, you get a loop. You can resolve this loop by using contexts, but if you do this then if you want to run a query which combines one object from each fact table (with one object from one of the shared dimensions), Business Objects will split the SQL into 2 statements and then join the results. I believe this final join would be done on the PC. The problem is that
there
may be a lot of data to join on the PC and this could greatly affect performance. Ideally, you would want this final join to be done on the
server.
If you were to code this in SQL yourself, you could do this with one SQL
statement.

Bill,

you seem to be struggling more with understanding Star Schema methodology then with Business Objects. The resolution Business Objects makes (splitting a query involving two fact tables) into 2 SQL statements is referred to generically as Multipass SQL. It IS the only way to effectively link multiple star schemas in an ad-hoc fashion and is used by a number of different ROLAP vendors.

I’m assuming that the two fact tables you are discussing are of a different granularity (data warehouse lingo for level of detail). If so, they can NOT be joined in one SQL statement as you suggest. If the tables are of the exact same granularity, you should look at the potential of consolidating them. Even if they aren’t of the same granularity, if multipass SQL is not performing up to par, you can create a third table at a common level of granularity from your base tables. From within Business Objects, you would treat this table as a summary table and setup @aggregate_aware clauses for your objects using it. This would offset the “merge” process you see in Business Objects to some procedure you develop (in C, PL/SQL, …) and run on a scheduled basis. Of course, this table will not be used if you ask for a dimension that is not in the fact table, but that can not be avoided.

FYI,

Bob Molby
GE Lighting


Listserv Archives (BOB member since 2002-06-25)

I am working on combining 2 star schemas into one universe and I am running into a problem.

Am I missing something? Is there a simple solution to this scenario?

What you suggested has been the only way I could deal with a similar situation and it is driven by the need to resolve loops. Basically, the assumption of BO is that you will most often only need to use one fact table in your queries, therefore, it needs the context to know which path.

If all your queries involve both tables, you could try leaving the loop in, it may work. Other than that, you’ll have to let BO do the re-join. It may not be good for you, but that was one of the reasons we chose it!

David Jelinek


Listserv Archives (BOB member since 2002-06-25)

I find this dilemma interesting. I can neither confirm nor deny that something is missing in this two fact table design senario. However, I do have a suggestion for you to try.

It is my experience that when adding a relationship that would create a loop (as in your example) that if I changed the join type (where appropriate mind you - see below comment) to a short-cut join, that the loops are not detected and therefore contexts/aliases are not needed. You may wish to experiment with this to see how BO behaves in your case.

I have had lengthly discussions with Tech Support concerning short-cut joins (my delimma was that I wanted to use multiple shortcut joins simultaneously and that is not supported - TAKE NOTE) and what one needs to understand concerning thier use is the impact that utilizing a shortcut join has. Once a shortcut join is chosen, the nornal path for which the shortcut join is simplifying (in its standard usage) is no longer available for resolving table relationships; therefore, if additional WHERE clause conditions rely on the “normal path” relationships, then BO is left with throwing out the shortcut join usage and taking the long way or normal path as it must. So when using shortcut joins you must remember: (1) multiple shortcut joins are out and (2) depending on the overall relationships that exist within a particual SQL statement, shortcut joins may not be used.

My curiosity is that in your senario, if only shortcut joins exist, or better yet, some minimum number of them that would alleviate contexts and aliases, how would Business Objects generate the SQL for measures from both fact tables along multiple dimensions (say one fact table utilizing normal joins to the dimension tables and the other fact table using shortcut joins only). Let me know if you choose to try it.

Thanks in advance.

Donald May
MIS Engine Services
E-mail: maydp@pweh.com


Listserv Archives (BOB member since 2002-06-25)

Bill:
We are using a star schema with 5 stars!!! Let us say that in a query the user needs data from Star1 and Star2 ; the dimensions are from three tables Tab1, Tab2 , Tab3. Star1 also has order number, lot number and Star2 also has order number, lot number. The UNIVERSE however has one object built for order number, one lot number both only from Star2. Star1 will have only one row returned for each order number/lot number BUT Star2 will have many rows returned for each order number/lot number!!!
The rows returned do not join-up!!!
Either we must make more objects Star1 order number, lot number and make the users include Star1 and Star2 order number, lot number objects in their query…
or
Join the Star1 table to Star2 table…

From: Bill Wolff[SMTP:William.Wolff@HEALTHNET.COM]
Sent: Thursday, October 29, 1998 11:40 AM

I am working on combining 2 star schemas into one universe and I am running
into a problem.


Listserv Archives (BOB member since 2002-06-25)

On the Shortcut Join Possiblity:

I tried this in the past. It was my initial concept for dealing with the loop years ago. Basically it didn’t work or hosed things completely when the SQL was generated. I think it would be a real problem in the case listed because he is wanting to go down both paths simultaneously. Right off, that knocks the SC Join into trouble. What’s worse, since only one SC could be used, the SQL might not even get all the objects wanted…

Visualize it this way. Tables Fact_A, Fact_B, Dim_A, Dim_B. Normal use in the Universe would have an A_context of Dim_A, Fact_A, and Dim_B. B_Context has Dim_A, Fact_B, and Dim_B. Instead, replace all joins with shortcut joins.

Query has objects from all four tables. SQL generator has to go from Dim_A to both Fact_A and Fact_B. Which SC Join does it use? Once one is picked, BO ignores all other SC Joins as if they don’t exist. So how do you get the other objects? You don’t - Cartesian Product Territory!

So that probably won’t work for him. I’ve found only one good case where a shortcut join fits the bill and it had nothing to do with loops. Resolve loops with Contexts or Aliases. In this case, the context is the best method and deal with the Multipass SQL.

David Jelinek


Listserv Archives (BOB member since 2002-06-25)

Susan Storm wrote:

Bill:
We are using a star schema with 5 stars!!! Let us say that in a query the user needs data from Star1 and Star2 ; the dimensions are from three tables Tab1, Tab2 , Tab3. Star1 also has order number, lot number and Star2 also has order number, lot number. The UNIVERSE however has one object built for order number, one lot number both only from Star2. Star1 will have only one row returned for each order number/lot number BUT Star2 will have many rows returned for each order number/lot number!!!
The rows returned do not join-up!!! …
Susan - unfortunately, while you think you may have 5 stars, you are treating them as 5 black holes. If you have an object that is common to both star schemas, it should have its own associated dimension table. In Business Objects, this dimension table would then be joined to each of the fact tables and the object would be defined once from the dimension table.

In your case, this would mean having a order number, and lot number table and defining joins to star1 and star2 from them. You would then define object order from the “order number” dimension table and lot number from the “lot number” dimension table. You must then create or have Business Objects discover the contexts for each star schema (to discover, you must set up cardinality on your joins properly). At this point, the user would select order number, lot number from the dimension tables and get two resultant SQL statements: one for table 1 and one for table 2. These would then be merged in Business Objects based on the common dimension values.

FYI,

Bob Molby
GE Lighting


Listserv Archives (BOB member since 2002-06-25)

Should I also contribute to the discussion?

I didn’t follow the whole discussion. But there are some issues to be considered in universe design with multi-star schemes and aggregate awareness. Multi-star schemes are no problem at all for BusinessObjects. The only issue you have to consider is that common dimensions should be in separate ‘dimension’ tables, as some of you already mentioned. In this case you must use context detection to correctly let BusinessObjects identify the different sub-stars in the scheme, and, btw, let BusinessObjects generate SQL in a way, that give “correct” results.
The unique dimension objects in the universe are then taken from these lookup tables instead of from the fact tables.
This procedure works for designs, where each of the fact tables contain a different measure. If two (or more) of them contain the same measure on different aggregation levels, you needn’t have common lookup tables between them, but can use @aggregate_aware() for the common “dimensions” as well, in which case they are taken either from the one or the other fact table. This only works if there is no other fact (measure) which uses the same common dimension.

In both cases above there is no need (and, it should be avoided) to join the fact tables. But you NEVER will need joins between facts which are part of an “aggregate aware hierarchy”.

Shortcuts are another thing: they are used to shorten the “lookup” from a fact table to a dimension table.

Hope this helps…
Walter

Morris Phillip I (HBM) schrieb:

Hi folks,

Thought I’d throw my two-pence worth in here.

We use a very similar schema that includes 4 ‘stars’, ‘baseball bases’ or ‘black holes’ whichever way you refer to them. They do represent descending orders of granularity, which we originally had 3 of, and we recently added a 4th ‘summary’ level to improve performance.

This is where I came into the ‘multiple shortcut join’ problem of course, but it has not proved to be a great problem by any account. The point is that if these fact tables share common ‘summary’ level dimensions, then you can get away with joining the lower levels into the upper levels to retrieve those dimensions. Therefore, you need to analyse which dimensions are common to which levels, and then join in the lookup tables at the appropriate level. Use of Aggregate awareness can ensure that only single objects are presented to the user.
If you use a common (or surrogate) key to enable very fast joins between fact tables, then the DBMS will quickly whip up the indexes to join the appropriate rows.

None of this would be required of course if shortcut joins worked how we would like them to work!

Hope this helps someone… It took me a good old while to get it this good!


DI Walter Muellner
Delphi Software GmbH, Vivenotgasse 48, A-1120 Vienna / Austria Tel: +43-1-8151456-12, Fax: +43-1-8151456-21 e-mail: w.muellner@delphi.at, WEB: http://www.delphi.at


Listserv Archives (BOB member since 2002-06-25)

Thanx to all!!
It seems that the design in place is missing a table!! The new table will contain std_prd_info_id (relating to all product info)
fiscal_mo
fiscal_yr
order_number
lot_number
With this table all of the stars will have common columns to relate the returned answer set!!!

Walter wrote
Should I also contribute to the discussion?

I didn’t follow the whole discussion. But there are some issues to be considered in universe design with multi-star schemes and aggregate awareness. Multi-star schemes are no problem at all for BusinessObjects. The only issue youhave to consider is that common dimensions should be in separate ‘dimension’


Listserv Archives (BOB member since 2002-06-25)

But what if query returns too much “order number”, “lot number” ? More than, for example, 2 miljons, but universe have to be set to max 30 000 rows ? Because of “order number”, “lot number” are used only for joining cubes in report, they affect performance drammaticaly or just trigger “partial results”.
This is the problem with no answer for me :frowning:


miganos :latvia: (BOB member since 2004-02-06)

Check a few things:

  1. Context
    Make sure you have context established if there is a loop or traps. (There are a lot of topics available that describe these problems.)

  2. Measures
    If your universe is delivering such number of rows there might be either too many dimensions or the measures cannot be summed. Your measures should have a projected aggregation of sum and the field itself be wrapped in sum() on the first screen. This might or might not reduce the number of rows.

Try this first. If you are still receiving that many number of rows try to reduce the number of dimensions in the report. If your user(s) insist remind them that BusObj is foremost a tool to aggregate information, not a data export.


KMB :uk: (BOB member since 2004-02-11)

Form the discussion above, I’ve gathered shortcut joins don’t work the we expect them to. Can some one tell me how shortcut joins work or point me to some good documentation.

I have a situation where my shortcut join is not part of my SQL, when I’m expecting it to be part of the SQL.

Below are my regular joins:
registeration —> courses---->course_term—>course_faculty—>faculty
registeration —> term

I also used a shortcut join from course_term---->term
(if this is not a shortcut join, It will generate a loop here.)

When I select objects from Course_term and term tables the shortcut join works well.

When a add an object to the above query from courses table. It quits using the shortcut join and goes thru the regular join method (which includes the registeration table). This returns too many rows and is not what I want.

Will Bo not use one shorcut join in conjunction with a regular join?

Is my only solution contexts or Aliases?


umpa (BOB member since 2005-08-15)

Do you actual need this join?
registeration —> term

Could you just have these joins?

registeration ---> courses---->course_term--->course_faculty--->faculty 
                                                          |---->term

Michele Pinti (BOB member since 2002-06-17)

As you might have guessed, the above set of tables belong to a student registeration system.

the term table needs to be joint to the registeration table as well as the course_term table.

Its not very benifical to join the term table to the course_faculty table.

If I take the route of
registeration—>course—>course_term------>course_faculty
course_term ------>term

The queries will work, though it may not be very efficent while quering just registeration information to go thru an additional tables (course_term table) to get to the term table.

Eg. of query:
give be registeration & course information for students registered in Fall 2006 term.
Thus you objects will be selected from the registeration and course classes and the condition will come from the term table

If using the following set of joins for the above query. The SQL will use the following tables: registration, Course, course_term and term table.

It is not necessary to go thru the Course_Term table to get to the Term table in the above mentioned query.
Thus, in addition to the above mentioned regular joins, I tried created a shortcut join from:
registeration table …>term table

But, this short cut join does not get used in the above query eg.
:hb: :hb: I need to understand why??? :roll_eyes:


umpa (BOB member since 2005-08-15)

Sorry this is what I meant my formatting was off.

This would be the setup I would use and the short cut join will be used when your query is just using these 2 tables:
registeration table and term table

In this senerio I don’t think that contexts will help because I guessing that you will need one context that includes all tables to satisfy a query that needs some objects from every table.


Michele Pinti (BOB member since 2002-06-17)

I apoligise I did not give you the complete picture. There is one more twist to this.

The regular joins are:
STUDNET —>registeration—>course—>course_term---->course_faculty
course_term ------>term

shortcut join:
registeration…>term

You are right when querying only the registeration and term table the shortcut join gets used as expected.

When querying student, registerationa nd term table. The shortcut join does not get used. It goes the long way to the term table.

Just for the heck of it I deleted the course ---->course_term table join.
Now the query that selected objects from studnet, registeration and term table, did a cartision join between the registeration and the term table.
I cannot understand why it is ignoring the the shortcut join. :hb:
Will it not use the shortcut join in conjunction with regular joins???


umpa (BOB member since 2005-08-15)

I have short cut joins and they are used in conjunction with regular joins.
Have you reviewed the SQL to make sure there isn’t anything being added that would ‘void’ using the shortcut?


Michele Pinti (BOB member since 2002-06-17)

Yes I have reviewed the SQL:

SELECT
AWADM2.STUDENT.FIRST_NAME,
AWADM2.ST_REGISTRATION.SEC,
AWADM2.ST_REGISTRATION.REG_DATE
FROM
AWADM2.STUDENT,
AWADM2.ST_REGISTRATION,
AWADM2.TERM
WHERE
( AWADM2.STUDENT.STUDENT_KEY=AWADM2.ST_REGISTRATION.STUDENT_KEY )
AND (
AWADM2.TERM.TERM_DESC = ‘2004 SPRING’
)

As you can see there is a cartision join between st_registeration and term table. Bo is choosing to ignore the shortcut join that is in place between the st_registeration and term table.
Why I don’t know.


umpa (BOB member since 2005-08-15)

And you currently do not have any contexts set up for these tables?


Michele Pinti (BOB member since 2002-06-17)

Well, I do have contexts, but all these tables belong to the same context.

Even when I remove all teh contexts it does the same thing.


umpa (BOB member since 2005-08-15)