Universe for operational reports

My client wants to build reports not just for analysing revenue. They also want to see the data behind each transaction. For example, they want to show a list of transactions and check the status and revenue of each one. Therefore, I need to include the transaction tables in my universe.

However, I don’t think this is a good approach for a data warehouse :hb: . There are a lot of joins and everything ends up in a mess. For example, prompts and filters do not work because the same attribute appears in different tables!

I hope I can get some advice about handling this case. Thank you


summoner :hong_kong: (BOB member since 2008-12-18)

Depends upon how you model your tables. One approach you could consider would be to keep the transaction level data separate from your modelled tables and have hyperlinks through to exception reports.

This way it won’t impact the performance of your existing model or reports. It depends upon how performant you want the different reports to be.

Without detailed analysis it’s difficult to advise more.


dessa :madagascar: (BOB member since 2004-01-29)

To me, there are two types of reporting - strategic and tactical. An example of strategic reporting would be to examine sales for the company for the last 36 months. Strategic reporting is about being able to slice and dice a subject area over multiple dimensions. Strategic reporting is about identifying trends, star performers and dogs. Strategic reporting is best done with a star schema and summary tables.
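
For example, a strategic query against a star schema boils down to a roll-up over fact and dimension tables, something like this (a sketch only; the table and column names are invented for illustration, Oracle-style syntax):

    SELECT d.fiscal_month,
           p.product_category,
           SUM(f.revenue) AS total_revenue
    FROM   sales_fact f
           JOIN date_dim    d ON d.date_key    = f.date_key
           JOIN product_dim p ON p.product_key = f.product_key
    WHERE  d.fiscal_month >= ADD_MONTHS(TRUNC(SYSDATE, 'MM'), -36)  -- last 36 months
    GROUP  BY d.fiscal_month, p.product_category;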

Tactical reporting is about being able to find out everything about one record. Where did this GL expense for $1200 come from? Well, it came from Invoice 12345 from Vendor ABC. What items were on the invoice? What was the purchase order? Who received the goods and when were they received? Was a client billed for this expense? As this type of reporting jumps across multiple subject areas for a very small number of records, a star schema is a poor choice and the source OLTP tables work better for reporting.
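
A tactical query, by contrast, chases a single record across several OLTP tables, along these lines (again, purely hypothetical table and column names):

    SELECT gl.amount,
           inv.invoice_num,
           v.vendor_name,
           po.po_number,
           rcv.received_by,
           rcv.receipt_date
    FROM   gl_lines         gl
           JOIN ap_invoices inv ON inv.invoice_id = gl.invoice_id
           JOIN vendors     v   ON v.vendor_id    = inv.vendor_id
           JOIN po_headers  po  ON po.po_id       = inv.po_id
           JOIN receipts    rcv ON rcv.po_id      = po.po_id
    WHERE  gl.gl_line_id = 12345;   -- one specific expense line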

It sounds like your client wants to do both tactical and strategic reporting. I would make a universe for each type of reporting (like dessa suggested). I wouldn’t bring the source tables into the data warehouse - just build a universe on top of the Production OLTP instance. Building a universe for OLTP tables takes a different approach than a star schema.


Dennis W. Disney :us: (BOB member since 2003-09-17)

Thank you all for your replies

May I ask why a star schema is a poor design for tactical reports? Also, what are the benefits of building the universe on top of the production OLTP server? Why don’t we import the data into another database with the help of ETL tools? And what if the source comes from multiple OLTP databases?

If we build the universe on top of the production OLTP tables, don’t we have to handle a lot of extra conditions? For example, there may be transactions that have been voided by users. To list all the transactions that are not voided in a report, I need to add a condition like “Status <> voided”.
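
For example (a rough sketch, with made-up table and column names), every query over the transaction table would need something like:

    SELECT t.transaction_id,
           t.status,
           t.revenue
    FROM   transactions t
    WHERE  t.status <> 'VOIDED';   -- extra condition needed on every report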

If the client requests the following:
“I want to know which transactions involve sales of iPod, and find the total revenue from those transactions”

This involves a lot of transactions. Isn’t that a strategic report? In that case, I think the transaction detail is the fact, while the transaction (keyed by transaction ID) is the dimension.


summoner :hong_kong: (BOB member since 2008-12-18)

Summoner,

Please re-read Dennis’ reply, as he does a good job of explaining strategic and tactical reporting concepts. In a nutshell, tactical reporting provides up-to-the-minute reporting on transactions that are in the mix right now. For example, a video rental store would have a transactional database to record movies for rent (a somewhat static table), customers (a somewhat static table), and movies that are rented to customers (a very dynamic table). Determining how many movies are out and how many movies are due back in the store today and tomorrow requires running a report against the OLTP system. After the movie is returned, triggers could be employed to “stub out” the transaction to another table to build up a history of what took place – which is then used for strategic purposes. The OLTP model does not lend itself to the star schema (two “static” tables and one dynamic). The “stubbed out” table could lend itself to a star schema, because the portions of data written to that table could be coded and associated reference tables could assist with the decoding at report time (hence the star schema design).
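
To illustrate the “stub out” idea (this is only a sketch; the table and column names are invented), such a trigger might look something like:

    CREATE OR REPLACE TRIGGER trg_rental_returned
    AFTER UPDATE OF return_date ON rentals
    FOR EACH ROW
    WHEN (NEW.return_date IS NOT NULL)
    BEGIN
      -- copy the completed rental into the history table used for strategic reporting
      INSERT INTO rental_history (rental_id, customer_id, movie_id, rented_on, returned_on)
      VALUES (:NEW.rental_id, :NEW.customer_id, :NEW.movie_id, :NEW.rental_date, :NEW.return_date);
    END;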

Based on the above scenario, only a non-transactional type of database is served well by a star schema model.

There is no prohibition against doing it this way; it just becomes a matter of timing. How often do you want to gen up the ETL process? Once an hour, once a day, once a week? This can only be answered based on the tactical reasons for knowing the data and its currency.

This all comes under the “Business Rule usage” of the data. This is why you are very valuable to the company: you know the business and you know the data. The old adage to DBAs to “know the data” is all the more critical in a Business Intelligence environment, because mistakes that occur here (not knowing the rule to exclude voids, etc.) become an exponential problem when generating reports that are used to support decision making. I think this is one reason that BI has gotten a black eye at times: “consultants” were brought in to construct a reporting system (a DSS no less) and got it all wrong because they didn’t know the data, and the DBA didn’t take an interest in the “another thing to do” BI system. So, bang, the new system is “bad”, blows up, gets canceled, and the company continues struggling along, relying on tactical data to make strategic decisions (okay, I’ll get off my soapbox now).

Best wishes with your project. I suggest you keep on asking questions and doing your homework; you’re on the right track.

Thanks,
John


jsanzone :us: (BOB member since 2006-09-12)

Summoner, you need to go back to your client and understand their requirements better. What may be going on with your client is something similar to one of my clients, who provides parts and repairs for massive diesel engines. The repair teams will send in the work that was done for the company to bill, but they will occasionally leave things out or enter something incorrectly. Someone in the office always reviews the invoices for completeness and accuracy before they are sent to the customer.

I will follow John onto the soapbox and say that if you want a reporting solution for an OLTP system, you should always build a tactical solution on the Production tables first, and an ETL-based strategic solution second. There are a number of reasons for this:

There is always a need for up-to-the-minute reporting
In John’s example of the video rental store, if a customer calls asking if you have “Batman Returns” in stock, you have to know whether you have it in stock RIGHT NOW. Whether it was in stock last night or not is irrelevant.

OLTP universes are much more flexible
Let’s say that one day, someone calls the company that owns the video rental store and says that a 16-year-old clerk is selling/renting adult videos to his high school buddies. The company wants to review the sales/rentals of adult videos at the store by age of sales clerk. If the date of birth of the sales clerk is somewhere in the OLTP system, then it is a matter of adding the join(s), creating a new object, running a test report and then migrating the change to Production. If the date of birth isn’t in the data warehouse, then the data modeler has to add it to the data model, then the DBA has to add it to the table, then the ETL developer has to modify the ETL script, the modified ETL script has to go through testing, a special ETL script has to be written to load date of birth information for all previously loaded sales clerks, the special script has to be tested, the universe has to be modified, a report has to be written, etc. A much longer development cycle.
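
In the OLTP universe, that change might be as small as one extra join and one new object, roughly like this (a sketch only; the table names and Oracle-style date arithmetic are assumptions):

    -- new join added in the universe
    rentals.clerk_id = clerks.clerk_id

    -- SELECT expression for a new "Clerk Age at Rental" object
    FLOOR(MONTHS_BETWEEN(rentals.rental_date, clerks.date_of_birth) / 12)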

The tactical reporting experience can provide requirements for the data warehouse
The reality is that it is really hard to get good requirements out of users. Most of the time, what they give you is a listing of their current reports. If the users have an OLTP universe for tactical reporting for a while before discussing the requirements for a data warehouse, they will have a good idea of the reporting that they currently do and want to be able to do better. If you create the data warehouse without having an OLTP universe first, then the requirements are going to be much more random, based upon what they are doing in Excel, the canned reports that come with the OLTP system, or what they did in the data warehouse for the prior OLTP system.

Data warehouses are much more expensive than an OLTP solution, so they should be saved for where they provide value over an OLTP universe
With a universe on top of OLTP tables, the cost of the reporting project is the cost of building the universe. With a data warehouse, the cost is the cost of building the data model, the data warehouse instance, the ETL scripts and the universe. Troubleshooting data warehouse reports is much more difficult than troubleshooting OLTP reports, because the problem could be with the report, the universe, the data model, the way the data was loaded, or that particular day’s load (this is from someone who got calls that “his report is wrong” when it was always an ETL issue). So save the data warehouse for things like sales reporting, where it provides an advantage, and put everything else into an OLTP universe.


Dennis W. Disney :us: (BOB member since 2003-09-17)

Thank you jsanzone and Dennis. It really helps. :+1:

However, I think I need to consider some issues with building a universe on top of the production OLTP database before I propose it to my client.

The first issue is performance. I think the reports may degrade the performance of normal operations. Let’s say some users try to select all the transactions without any filters; that would pull in every transaction from several years, and we cannot force them not to do that. But I think I can set some rules to limit the query results. For example, if the field is “Transaction ID”, then instead of one unrestricted object I could create 3 objects such as “Today’s Transactions”, “Yesterday’s Transactions” and “Transactions in current Year”. Setting an execution time limit may not be a good choice for me because it will show incorrect aggregation results (if the execution is halted by the limit).
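
The date restriction behind those objects could be something along these lines (a rough sketch; I am assuming an Oracle database and a TRX_DATE column, which are made-up names):

    -- "Today's Transactions"
    transactions.trx_date >= TRUNC(SYSDATE)

    -- "Yesterday's Transactions"
    transactions.trx_date >= TRUNC(SYSDATE) - 1
    AND transactions.trx_date < TRUNC(SYSDATE)

    -- "Transactions in current Year"
    transactions.trx_date >= TRUNC(SYSDATE, 'YYYY')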

Another issue is that there will be two universes: one for OLTP and another for OLAP. From the users’ point of view, they want a single universe. They want to check the transactions directly, using the same universe, once they find problems in the strategic reports. I think linked universes are a possible approach, but I am afraid the users may mix everything up, i.e. select objects from both universes. The linkage between the universes may also be a little complex, because every field in a single dimension may correspond to different tables in the OLTP database.


summoner :hong_kong: (BOB member since 2008-12-18)

You can put a stubby join on your table with a prompt. Might be something to think about.
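
For example, a self-restricting join on the transaction table whose expression includes a prompt, roughly like this (just a sketch; the column name is made up):

    transactions.trx_date >= @Prompt('Transactions from which date?', 'D', , mono, free)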


dessa :madagascar: (BOB member since 2004-01-29)

How about the joins and the condition settings? Are there any tips?
Since I am not using ETL tools now, the universe becomes much more complex (more loops, more traps, more joins) and I have to add a lot of joins in the universe. For example, the transaction has 3 levels of detail, i.e. 3 tables joined together.

I need to check whether the transaction is voided. At the highest level, I need to add a condition “Status <> voided” to all selected objects. But what about objects at the lower levels? Should I…

  1. add the condition to the joins?
  2. add the condition to all objects?
  3. create a view that joins the tables in the universe?

For (1), it may be a cleaner approach. But it might get awful if there are joins between the middle-level transaction table and other non-transaction tables (because those objects do not have the condition!)

For example,
Transaction chain
C->B->A

Other joinings
B->D
C->E
where D and E are not transaction tables

If the user only selects objects from B and D, all data (including data from voided transactions) will appear

For (2), any small change can easily break it.
For (3), there may be a performance problem.
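
To make the options concrete, I imagine (1) and (3) would look roughly like this (table and column names are just examples):

    -- Option (1): condition embedded in the join between B and A
    B.a_id = A.a_id AND A.status <> 'VOIDED'

    -- Option (3): a view that pre-filters the voided transactions
    CREATE VIEW active_transactions AS
    SELECT *
    FROM   A
    WHERE  A.status <> 'VOIDED';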


summoner :hong_kong: (BOB member since 2008-12-18)

I add conditions as I build reports

My Oracle EBS universe is still quite small - just GL and some of AP. It currently has 1030 joins.

Use lots and lots of aliases. My EBS universe currently has 637 aliases.

And some design advice off my blog:
[quote:12b12b44e3=“Dennis W. Disney”]To me, OLTP universes should have two types of classes - classes for documents and classes for set-ups. Each type of document - PO, Invoice, Receipt, Journal Entry - should have its own class that is completely independent of any other class. Before I start a new document class, I create set-up classes for all the look-ups. For example, before I started the AP invoice class, I created classes for Suppliers, Payment Terms and Distribution Sets. Then I created the class for the AP invoice. Each document class and each set-up class is a separate context.
[/quote]

Create a condition in the universe, in the class for the transaction, and include it when appropriate. At some point, the users are going to want to see reporting on voided transactions, as a high number of voided transactions is a warning sign of poor training or fraud.

You worry too much about what the users can do. Your users with few IT skills will be running canned reports. Users with enough IT knowledge will quickly figure out not to poke a sharp stick into their eye, i.e. not to run queries without specifying a date range.


Dennis W. Disney :us: (BOB member since 2003-09-17)