Star, Snowflake, or 3rd Norm?

system · April 21, 2005, 6:42pm

Hello,

I am creating a small (<100,000 transactions) data warehouse for trending that requires many lookup tables for select criteria in queries.

I’ve identified about 5 dimensions for the fact table, but my problem is that each transaction may require up to 20 different criteria (with lookup descriptions). The transaction dimension and the fact table both represent a specific transaction. My question is, how do I represent the 20 lookup descriptions per transaction? Should I keep this info in the dimension (repeated across all transactions), snowflake the lookups from the dimension, or just normalize the entire database?

Will a BO universe support all three schemas?

mvbokker (BOB member since 2002-09-03)

system · April 22, 2005, 1:22am

Yes, Business Objects will “support” all three solutions. Normalizing the actual/physical data model would be the worst choice for any reporting solution including a Business Objects implementation. Normalizing eliminates redundant data thereby avoiding the problem of synchronizing insert/update/delete operations across multiple tables, but makes it hard to query the data for reports.

Andreas (BOB member since 2002-06-20)

system · April 22, 2005, 3:48am

Stars and snowflakes are always the best for reporting. I know you’ll get other opinions, but you really can’t go wrong with a good snowflake.

Steve Krandel (BOB member since 2002-06-25)

system · April 22, 2005, 8:40am

Reporting on transactional 3nf databases is slow.
Snow flakes are OK as long as the tables that join to the dimensions (that are joined to your fact) are low cardinality as this does not affect performance too much.
In my opinion you will get the best performance from a well designed star schema though…

Mak 1 (BOB member since 2005-01-06)

system · April 22, 2005, 12:32pm

With less than 100,000 fact records you could probably do pretty well with a very simple star schema. But you may want to ask two other questions…

1] Could it ever grow or expand in scope and by how much.

2] How to deal with any changing dimension values. (The Kimball slowly changing dimension types are the most commonly used).

b_demann (BOB member since 2003-03-20)

system · April 22, 2005, 3:47pm

Thanks for the response.

I believe 100,000 is probably the max this reporting system will go in the next several years.

We do not need to change history in this data, but open status transactions will be updated in place. Maybe this is a reason enough to just carry the 20+ codes/descriptions in the transaction dimension? This is my first experience with de-normalizing data to this extreme and I’m struggling with it.

mvbokker (BOB member since 2002-09-03)

system · April 22, 2005, 3:59pm

With 100,000 records, a single table is the way to go.

Steve Krandel (BOB member since 2002-06-25)

system · April 22, 2005, 4:39pm

Thanks,

Will do.

mvbokker (BOB member since 2002-09-03)