Understanding a relational db Universe vs multi-dimensional

system · September 23, 2018, 9:19pm

I’ve never worked directly with a multi-dimensional d.b., and being on the periphery they always confused me. I couldn’t understand what’s the deal with them. It turns out that intuitively everyone thinks in terms of what is described as a multi-dimensional database, and Web Intelligence (and Excel) easily presents multi-dimensional analysis via cross tabs / pivot tables.

This video (https://www.youtube.com/watch?v=q8p_2j5yTHI) did a great job of explaining it, by emphasizing that multi-dimensional is more a way of THINKING about the data than it is structuring the database. Here is another good video that explains multi-dimensional reporting (https://www.youtube.com/watch?v=IhFkNmVmwn4)

As I understand it, tables in a multi-dimensional d.b. (data warehouse) are relational in nature, i.e., the data is not STORED in a multi-dimensional manner, but it’s set up so as to make REPORTING of the data easier in a multi-dimensional way.

Also, one of the main reasons for creating a multi-dimensional data warehouse (or relational) is for performance reasons, i.e., millions of rows are summarized.

So - my main question then is why in IDT are there two types of Universes - relational and OLAP? My understanding is that data in an OLAP database is still (essentially) a relational model, then why are there two different types of Universes?

Secondary question is how is this actually stored? Is it different from a relational database?

Tertiary question is how is it reported on?

NOTE: We use a relational model, and Web Intelligence can certainly do “multi-dimensional analysis” using a relational data model, just use cross-tabs, no?

Maybe some of my assumptions aren’t valid, so feel free to correct!

gadsden_consulting (BOB member since 2015-06-18)

MarkP · September 24, 2018, 8:31am

My understanding is that OLAP is used to build over MS SSAS cubes and other things stored that way.

The old Designer (now UDT) is perfectly adequate for building universes over a data warehouse.

The other big advantage of a warehouse is to allow you to store data over time - historical snapshots and trends if you like. This means that as your warehouse matures, you can move from analysis into predictive analytics without any real difficulty by performing extrapolations of data trends.

system · September 24, 2018, 1:09pm

The distinction between OLAP and relational is more fluid and fuzzy than it used to be.

Traditionally, relational databases store data in tables of rows and columns, optionally using optimization methods such as indexes, partitions, materialized views, etc. Internal to the table, there is no distinction between a “dimension” and a “measure”; they are simply values in a row. The structure of the database is firm – the columns are pre-defined and only change when the table is altered. Generally relational databases are queried with SQL.

OLAP databases store their data differently. Rather than rows & columns, they are aligned by dimensions and measures. There are no “tables” with a predefined structure, as new dimensional values are expected to appear when data is loaded. A key aspect of OLAP is the pre-aggregation of measures for each intersection of its associated dimensions. This speeds up query time but increases load time.

Since SQL was designed for the table/column paradigm, it doesn’t work well with multidimensional data sources. MDX (which is SQL-ish) is a common syntax for querying OLAP data sources.

Since traditional BO universes were built on the relational/SQL design, it needs to take a different approach when querying OLAP cubes for a couple of reasons, not the least of which is that it is not SQL.

I didn’t watch the videos you linked to, but it is not correct to say that multi-dimensional databases are really relational under the hood. That is true for ROLAP databases, but true OLAP/MOLAP databases do have an entirely different data structure.

SSAS Tabular has made things much more squishy. It uses a relational model (hence the name, I assume), but is queried with MDX or DAX as though it was multidimensional (note the small “m”; “Multidimensional” is Tabular’s sibling database in SSAS, and is a true OLAP database).

Joe

joepeters (BOB member since 2002-08-29)

system · September 24, 2018, 5:33pm

Mark P,

per joepeters reply below, your comments make good sense to me. I pulled up the old Universe designer today and it doesn’t have the OLAP option, like IDT.

good point on storing data over time, for analysis and then also predictive analytics.

=================
joepeters,

very good, thank you!

Finally, after watching the above videos, plus this one, “How to think in MDX” (https://www.youtube.com/watch?v=iXrmfxntkfQ), I “see” what is multi-dimensional data, and the associated cube. MDX is building reporting databases that match how people think.

But I still can’t “see” how an OLAP d.b. stores their data, and how it is queried. That can be a topic for another day, but at least I know it’s different from relational. If your have a link or two at the ready on this topic, please advise.

gadsden_consulting (BOB member since 2015-06-18)

system · September 24, 2018, 5:46pm

Each database will be different. But here’s a start: https://en.wikipedia.org/wiki/Online_analytical_processing

joepeters (BOB member since 2002-08-29)

system · September 24, 2018, 6:15pm

joepeters,

great link, thx. I will look around so as to understand MDX database building and querying.

Professor Hasso Plattner advocates in-memory computing (zero response time), which eliminates the need for pre-aggregating data. All of this MDX / data warehouse stuff is done for one primary reason - performance, right?

Professor Hasso Plattner Keynote, from 2015. https://www.youtube.com/watch?time_continue=4&v=80cZQhG2Hhc

per the video -

“OLAP / Analytics is coming back to OLTP”
“the justification for additional systems, for performance reasons, is over”
“a completely new dynamic of software development”

all with Hana of course … but Oracle has in-memory computing

gadsden_consulting (BOB member since 2015-06-18)

system · September 25, 2018, 12:52pm

Improved performance isn’t the only benefit of multidimensional databases. Another is that they are particularly well suited to hierarchical data. As a simple example, consider a company’s organizational structure in a relational database. This would generally be represented with each level of the hierarchy (manager, director, vp, etc.) as a column in a table. If a new level is introduced (ex. svp), then a new column must be added to the table. Any queries that use the table must be modified accordingly. In a multidimensional database, however, the new level can be automatically inserted as part of the data load process.

joepeters (BOB member since 2002-08-29)

system · September 25, 2018, 3:08pm

joepeters,

ok, good point on hierarchies.

But with a standard org structure, I don’t see a column for the role, rather each employee record has Manager_Id, which refers back to the Employee table.

Each employee also has a title, so adding SVP is only a matter of adding to the “role” lookup table, but all the reporting remains the same.

gadsden_consulting (BOB member since 2015-06-18)

system · September 25, 2018, 3:39pm

Sure, there are multiple ways of representing hierarchies in relational databases, but none are truly “pure” like a multidimensional db is.

joepeters (BOB member since 2002-08-29)

system · September 25, 2018, 4:58pm

Ok, thx, and I need to bone up more on MDX.

gadsden_consulting (BOB member since 2015-06-18)