BusinessObjects Board

incorrect results with a COUNT DISTINCT measure

Hi,

I am building a universe based on a classic star schema.

To help you understand my point, I will make an analogy between my star schema and the classical “ordering” example.

So I have got a fact table at the order line level. This fact table is linked to several dimensions (including the Date, Customer and Product dimensions) and contains a measure called Number of orders.

The Number of orders measure is defined by the following formula : COUNT (DISTINCT FACT_TABLE.ORDER_KEY).

The FACT_TABLE.ORDER_KEY field contains duplicates since the grain of my fact table is not at the “order” level but at the “order line” level.

As long as the user does not combine the Number of orders measure with objects coming from the Product dimension, there is no problem since an order can reference neither multiple dates nor multiple customers.
But what if the user combine the Number of orders measure with objects coming from the Product dimension ? This combination will cause erroneous results since an order can reference multiple products.

Is there a way to prevent the user from combining the measure with objects coming from the Product dimension ?
Or, can I authorize the user to build a report with such a combination (after all, it can make sense to display the number of orders containing a given product) but prevent him from suming the measure in a report break or grand total ?

The good news is that I am building the universe and the model in the same time so I can modify the model if necessary …

Any help would be greatly appreciated.

Regards,
Emmanuel


eferragu :fr: (BOB member since 2010-11-20)

Exactly, this is a valid question :).
I have worked with an example, almost identical to this and have dealt with it in just the way you describe…
You just need to educate the users on the data and the questions they are answering…


Mak 1 :uk: (BOB member since 2005-01-06)

Alias your fact, and create your order count from the alias. Create another count from the existing fact table and put it in the products class, and call it Product Order Count or something. Make sure that second object references the product table (use the Tables button on the object definition) in order to make sure that it will always pull that table into the SQL.


Dave Rathbun :us: (BOB member since 2002-06-06)

I knew I missed something, earlier, when looking at and answering this 8) .

I went and looked at my Orders, Order Lines universe.
When we did this we had an Orders fact and an Order lines fact.
I made the order count distinct aggregate aware, our product dimension was fully conformed, products were classified at a package / service / group level, hence the counts were correct - as they always hit the Orders table.

I know your described situation is different, but just wondered if you were planning building an Order level fact?

Cheers,

Mark.


Mak 1 :uk: (BOB member since 2005-01-06)

Hello all.

@Dave : I am not sure to have understood what you suggest. Actually I don’t really need two measures. The Product Order Count measure is unnecessary and I think it might confuse our users (getting the number of orders containing a given product is not really a requirement, I was just considering that it could make sense to do this). And how the aliasing will prevent the users from associating the Order Count measure with objects coming from the Product table ?

@Mark : I have considered building two fact tables, one at the Order Line level and another one at the Order level. Then, I could derive my Order Count measure from the Oder Level fact table and prevent the users from combining thi measure with Product objects since they would belong to different contexts. The problem is that I have got several degenerate dimensions in the fact table and building two fact tables would force me to duplicate the degenerate dimensions (having two objects for each degenerate dimension) and would confuse the users. Or maybe I could use aggregate awareness on the degenerate dimensions in order not to duplicate the objects but I doubt it will work …


eferragu :fr: (BOB member since 2010-11-20)

This approach worked very well for me :). Also, it is considered fairly standard.

It works very nicely :yesnod: .
I had degenerate dimensions, across both facts, for example:- OrderNumber, I dealt with this using Aggregate Aware:-

@Aggregate_Aware(dbo.Orders.OrderNumber,dbo.OrderLine.OrderNumber)

The universe created was, pretty much, idiot proof.

The only problem I faced is Business Objects cannot handle shrunken dimension tables, gracefully.
I had to have two physical dimension tables that were virtually identical; one shrunken, belonging to Orders and one with an extra field belonging to OrderLines.


Mak 1 :uk: (BOB member since 2005-01-06)

You are right, Mark. Thank you !

I have just tried it and aggregate awareness works great in this case :smiley:

The drawback is that the universe becomes a lot more complex to maintain with all the @Aggregate_Aware functions and the incompatible objects. Besides that, I have to define and populate a new fact table …

I will propose both solutions to my users and expose to them the advantages and drawbacks of each one.


eferragu :fr: (BOB member since 2010-11-20)

No problem, It was a real life example I gave you ;).
As I said, it is well worth creating the Orders Fact, just for improved query performance, alone, you users will thank you for that.
Also you will be able to handle the product level dimension in a better way.

Ah yes, but I believe it will be a better universe 8).


Mak 1 :uk: (BOB member since 2005-01-06)

Ease of maintenance is never the first priority of universe development, getting the correct answer is. :slight_smile:


Dave Rathbun :us: (BOB member since 2002-06-06)

Hi All,

I am facing similar issue as well.
Scenario-
Tables - Orders_Fct, Lines_Fct, Exceptions_Fct, Others_Fct, Upfront_Fct(Order granualirity)
Cordinality:
a) Orders_Fct(Order_Key) —1:1,N— Lines_Fct(Lines_Key)
b) Orders_Fct(Order_Key) —1:0,N— Exceptions_Fct(Exceptions_Key)
c) Orders_Fct(Order_Key) —1:0,N— Others_Fct(Others_Key)
c) Orders_Fct(Order_Key) —1:1,1— Upfront_Fct(Upfront_Order_Key)

Measures:Count(Order_Key), Count(Lines_Key), Count(Exceptions_Key, Count(Others_Fct), Sum(Upfront_Payment)

Issue:
Order Creation date: 01-Jan-2012
a) Original Order count: Count(Order_Key) = 200
b) Order & Exception Orbjects pulled together: Count(Order_Key) = 83, Sum(Upfront_Payment) = 52K
c) Order & Upfront Orbjects pulled together: Count(Order_Key) = 83, Sum(Upfront_Payment) = 52K
d) Order & Others Orbjects pulled together: Count(Order_Key) = 83, Sum(Upfront_Payment) = 52K
d) Order & Line Objects pulled together: Count(Order_Key) = 240(wired number), Sum(Upfront_Payment) = 152K(wired number)
—>I must see consistent Order count 83 & 52K of payment even when I get Lines also, as number of created orders on particular day/date will not change w.r.t. Lines.

@Aggregate_Aware… can this help me here? or there is any other wayout? please help.

Thanks in advance,
BauV.


BauV (BOB member since 2007-10-09)

Your fact tables “Order” and “Order Lines” pose a fan trap, which must be resolved using alias tables and universe contexts.


Andreas :de: (BOB member since 2002-06-20)

Yes I did, ‘Orders’ Vs ‘Order Lines’ data is fine. But when ‘Upfront Amount’ is pulled it’s failing.

Can anyone help me with better model of joinining ‘Orders’ Vs ‘Order Lines’ to solve this problem please?

Thanks in advance,
BauV.


BauV (BOB member since 2007-10-09)

Is there always an upfront amount? It may be that an outer join or a different join path is required

Upfront is not always used.
Orders —1:1,N— is working fine (as expected).

Upfront Amt should split per ‘Order’ Vs ‘Order Line’ split. For example -

Date:01-Feb-2012
Total Orders Created: 3 (O1,O2,O3)
Total Upfront Amt paid : $400(say 2 Orders only has Upfront each $200)
Order Vs Lines combination: O1 - L1, O1 - L2, O1 - L4, O2 - L2, O2 - L3, O3 - L1, O3 - L2, O3 - L3, O4 - L4
Total Upfront Amt shown when ‘Order Line ID’ or ‘Order Line Action’ is pulled: $1800

Expecation after pulling ‘Order Line ID’ or ‘Order Line Action’:
Number of Orders: 3
Total Upfront Amount: $400

I tried all the possible cobinations of Alias & Contexts as well, still no luck.
Hope atleast 1 of you champs have already experienced this & solved as well. Plzz share the solution you implementd. Or suggest me the better model of joning these 3 tables to solve this problem.

Thanks in advance,
BauV.


BauV (BOB member since 2007-10-09)

If you’ve got an inner join between order and upfront and upfront isn’t always there, then you’ll be short on values. Your SQL will say
order.key = upfront.key but there will be no upfront.key for the order.key to match to.

It’s like when designers don’t alias the calendar table and use the same table for order date and ship date. Unless orders are shipped the same day, they wouldn’t be reported on.

Edit, I just re-read this :slight_smile: .

As well as the point Mark has made, your Order Line and Upfront tables are at different levels of granularity, hence the exageration of the upfront figure when combined with fields from the line table.


Mak 1 :uk: (BOB member since 2005-01-06)

Yes Mark, you are right. I did solve it using Outer join b/w Orders & Upfront already.

My problem is with when ‘Order’ & Upfront Amount’ data pulled along with ‘Orders Lines’ only - as each ‘Order’ can have 1 or N Order Lines.
Orders(Order_Key) —1:1,N— Order Lines(Lines_Key)is the join.

Below is my data model in Univ, can someone correct me here if anything wrong & leading to this problem plzzzz.
a) Orders_Fct(Order_Key) —1:1,N— Lines_Fct(Order_Key)
b) Orders_Fct(Order_Key) —1:0,N— Exceptions_Fct(Order_Key)
c) Orders_Fct(Order_Key) —1:0,N— Others_Fct(Order_Key)
c) Orders_Fct(Order_Key) —1:1,1— Upfront_Fct(Order_Key)

Thanks in advance,
BauV.


BauV (BOB member since 2007-10-09)

Adding 1 more line to my reply as below …

*** ‘Order Count’ & ‘Tpotal Upfront Amount’ are inflating rapidly when Order Line data is pulled along with Orders & Upfront.

Thanks in advance,
BauV


BauV (BOB member since 2007-10-09)

So there is an up front amount for an Order but one Order can have many Order Lines? In that case you’ve got a fan trap - as Mak1 said, there’s an imbalanced granularity in your query.

You need to review what contexts you have.

If you have measures in Orders_Fct, Lines_Fct, Exceptions_Fct, Others_Fct and Upfront_Fct then you would need a context for each. Given that you can pull measures from Orders_Fct and Lines_Fct then you are creating a fan trap if you put them in the same context. Create an alias of Orders_Fct and join it to Orders_Fct. Although it’s a 1:1, create it as a 1:N with N at the alias end. Any measures that you had build against Orders_Fct should now be altered to point to Orders_Fct_Alias

Unfortunately the problem is still unsolved.

Below is what I tried -
a) Orders_Fct(Order_Key) —1:N— Orders_Fct_Alias(Order_Key)
b) Orders_Fct_Alias(Order_Key) —1:1,N— Lines_Fct(Order_Key)
c) Orders_Fct_Alias(Order_Key) —1:1,1— Upfront_Fct(Order_Key)
& All required contexts.

Any further advice frinds?

Thanks in advance,
BauV.


BauV (BOB member since 2007-10-09)