BusinessObjects Board

COUNT (DISTINCT ID) + database delegation + #UNAVAILABLE

Hi,

I need some advice / guidance about warehouse / universe design. Please take the following scenario

DEPARTMENT | STAFF | INVOICE No.
Accounts | Joe | 1
Accounts | Joe | 2
Accounts | Joe | 3
Accounts | Pete | 1
Accounts | Pete | 4

How many invoices has each staff worked on:
Joe = 3
Pete = 2

How many invoices processed by department:
Accounts = 4 (as 1 worked on twice)

If I set the default aggregate to SUM, based on a count (DISTINCT invoice No) then the user can get discrepancies with their data. If they do a query with just DEPARTMENT and INVOICE it is correct, e.g “4” invoices, but if they also add STAFF to their query (even if they don’t show it in the report) then the value is “5”.

Because of this I changed the aggregation to database default. The major issue is that it is now much harder for the user when creating reports as the product doesn’t automatically provide totals. More importantly it removes the ability to do dimensions on formula fields, and start getting lots of #UNAVAILABLE or #MULTIVALUE in existing reports.

This must be a common problem. How do people get over this? (e.g. accurate data, and ability to use totals, and custom dimensions)

My first thought is to have two fact tables, one for department facts, the other for individual facts. However this would cause a massive increase in the size of the warehouse as well as the complexity to the user on what objects to use.

I am pulling my hair out why this isn’t a bigger problem or do people just take the hit of slightly inaccurate data and just set the default to SUM()?

Any advice on my design would be massively appreciated?

Even if the answer is…“that’s life get on with it :)”… at least it will stop me looking for a solution.

Very kind regards

Matt.


injenuity (BOB member since 2010-04-23)

BUMP!
Sorry to be BUMP, but I still need some advice on this matter. I think it is a key design question regarding count (distinct yz).

To guarantee accuracy it should be set to database delegated measure. However as we all know BOXI31 is not that flexible with smart measures. And my users are now constantly complaining of #unavailable etc.

So my question is regarding a COUNT (DISTINCT):
a) Do you avoid and redesign the warehouse so not required
b) Do you set to database delegated and loose a lot of the functionality with formula’s etc
c) Do you set to SUM() and live with the fact that users could get inaccurate data.

I would be really keen on peoples opinions. I cannot be the first person to run into difficulties with COUNT (DISTCINT)

Kind Regards

Matt


injenuity (BOB member since 2010-04-23)

This link may help:-

Projection of Count Distinct should be none. https://bobj-board.org/t/107995


Mak 1 :uk: (BOB member since 2005-01-06)