I’m looking for general guidance on whether to classify database amount field objects as Dimensions or Measures.
The BO User Guide states that
“Dimension objects retrieve the data that will provide the basis for analysis in a report. Dimension objects typically retrieve character-type data… or dates…”
It also states that:
Measure objects retrieve numeric data that is the result of calculations on data in the database.
I am inclinded to set my amount objects as Measures, except that, according to the User Guide, Measures should be the result of a calculation; my amount objects are pulled directly from the database without any manipultaion. However, I don’t feel that I should classify them as Dimensions because I consider amounts to be numeric-type data, not character-type data.
I’ve worked with Universes before where all amount objects were Measures, and didn’t have any problems. Recently, however, a collegue set his amount fields to Dimensions (which I believe may have been the cause of a problem we had aggregating the amount in a formula on a report) and claimed that that was the proper way to classify amounts coming directly from the database.
Our universe is almost exclusively dimensions, and it causes all sorts of headaches. Remember that when you want to link queries together, you must have the same dimensions in both queries. That can cause quite a headache… Does for me anyway.
I vote measures! I vote more details, less dimensions!
Hi trickykid! You are on the right lines, don’t get too hung up with exactly what the user guide says. You want your amounts to be measures, because even if you are pulling them from the database without doing any calculations, when you get them into your report you will, like as not, be summing them, averaging them etc.
Not all numeric data whether calculated or coming directly should be marked as measure
For eg you ahve employee_id defined as integer in the database but it is a dimension not a measure
Measures are basically those numeric values which are continously updated and valued whether computed or coming directly from the table.
I guess the fields coming directly from teh table will be marked as emasures which they come in above category.
Here is the logic I usually use for determining dimensions and measures. I try to think as if I am writing SQL, if the object is something I would want to “group by” when aggregating, then I make it a dimension, if it is something I would aggregate i.e. sum, max, avg, then I made it a measure.
Another way to look at it is this, if you make something a dimension, then when you display data on your report, a dimension will define how much detail is actually shown. For Example, you have Date, Year of date, month of date and amount. You would make all but the amount field dimensions. If you pulled all these fields on a report, you would have all the detail down to the day. So lets say you pull off Date, you are left with year and month, your report should aggregate the amount to the month level. If you pull off month, it will show down to the year level. If you ahve multple years then you can do breaks and such on the year field since it is a dimension.
You can however make an amount a measure and set aggregation to none, this will cause the report to not aggreage and always show full detail unless you add the sum function in the report itself.
Most numeric fields should become measures. However, there are some things to watch for. Numeric identifiers should be dimensions. Standard values (as opposed to actual values) should be dimensions;eg Standard Price should be a dimension but Actual Price should be a measure.
Remember, it is possible to do math with dimensions, it is just easier with measures.
I also put sum() around most measure fields. This makes the database do the summing and returns fewer rows. That reduces network traffic and report computation time.
My philosophy is: If it is a number, make it a measure and put Sum() around it unless there is a reason not to.
No, didn’t confuse me at all. In fact, you helped answer what was going to be my next question (i.e., what to select as the aggregation for the amount objects).
Thanks to everyone for your input. It’s all been very helpful.
Rob
(decided to drop the “trickykid” username and just use my real name)
This has been a great thread, but the ones that have made me scratch my head in the past are the “in betweeners” like tax rates or currency exchance rates. I can’t think of a scenario to ever “group by” the rate, so not likely a dimension. I can’t think of a scenario where I would ever aggregate them (sum / average, though min / max MIGHT happen), so they don’t fit cleanly as a measure either. We usually choose measure though, with no aggregate sql or projection. My logic is that they would most likely be used in a formula with a measure.
I “actively discourage” (to use a polite term) the application of AVG(), STD_DEV(), VAR() functions and the like to database columns as part of their SQL definition – they are most definately not safe when users are manipulating reports to remove objects that were included as dimensions or details in the query.
On more thing I wanted to throw out there… We have some objects that appear twice in our universe; once as a dimension and once as a measure. Namely, appointment dates. Sometimes we want to group by them and count the patients, other times we want to know the length of time between appointments.
This is an interesting thread. Hope the discussion continues…
Measures are always thought of as numeric. But now there are some noises of non-numeric measures/facts which include dates, text etc.
How do you classify a column like COMMENTS (remarks,explaination) as? Dimension? Measure?
The following reasons makes this column NOT a dimension:
COMMENTS are not view points. No reports are based on COMMENTS. There are no queries like:
a) Show the no. of quantity consume which has COMMENTS like “abc”
b) Show top ten products consumed which have COMMENTS like “abc”
The value entered in the COMMENTS field cannot be predetermined and may not be repeated. This value is a description for transaction. This value uniquely pertains to the transaction record (set of dimensions and measures) hence making it a part of the fact table record(as a non-dimension).
COMMENTS are fields which can contain NULL values. Capturing value for COMMENTS cannot be enforced. But dimensional values cannot be null as measures need to pertain to a definite set of dimensions. Hence COMMENTS cannot be made as a dimension.
I understand that COMMENTS can be termed as a detail object. But,considering its properties which were described in the previous message in same thread, where do you store them?? dimension table? fact table?
In the dimension table. If you want to count the number of products you sold then you just look at the fact table. If you subsequently want to count the number of products you sold where the comment on the invoice was “abc” you would then use the join between your dimension table and fact table.
Going ahead with your example, the COMMENTS pertain to the Quantity_Sold(measure). They do not pertain to the dimension(product). If the COMMENTS are stored in dimension table, then it implies that there exists a relation btw dimensional value and the COMMENT. Hence these cannot be placed in the existing dimension table (products).
In case the suggestion is to create a new dimension table for COMMENTS, then the follwoing points go against it:
COMMENTS are not view points. No reports are based on COMMENTS. There are no queries like:
a) Show the no. of quantity consume which has COMMENTS like “abc”
b) Show top ten products consumed which have COMMENTS like “abc”
The value entered in the COMMENTS field cannot be predetermined and may not be repeated. This value is a description for transaction. This value uniquely pertains to the transaction record (set of dimensions and measures) hence making it a part of the fact table record(as a non-dimension).
COMMENTS are fields which can contain NULL values. Capturing value for COMMENTS cannot be enforced. But dimensional values cannot be null as measures need to pertain to a definite set of dimensions. Hence COMMENTS cannot be made as a dimension.
Detail objects could be stored in either the dimension or the fact table – if the comments pertain to transactions in the fact table then store them there. You can soon wrap the column in something like …