We have a large volume of data (millions of rows). If we design a dimensional model, will performance suffer for ad hoc reporting, or can we go with basic standalone tables? Which would be the better approach? Our main goal is to give users ad hoc reporting ability along with good performance, so please help!
Hi Krishna,
First of all, we need to prepare a strategy for improving the performance of the universe, so that we get good reporting.
In most cases we don't design the universe with a single table; we will have many tables joined together for reporting.
The end user will not worry about how the data is fetched; he will only care whether he has the business metadata he needs for his ad hoc reporting.
Make sure that you incorporate some best practices of universe design.
My choice is to go for dimensional modelling, designed in such a way that you don't run into flaws in the near future (a star-schema sketch follows the list below). Prepare a performance tuning strategy at all levels:
1.Database
2.Network
3.Universe
4.Reports
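To make the dimensional modelling point concrete, here is a minimal star-schema sketch in SQL. The table and column names (SALES_FACT, DATE_DIM, CUSTOMER_DIM, PRODUCT_DIM) are hypothetical placeholders, not your actual model, so adapt them to your own business process and DBMS.

-- Hypothetical star schema: one fact table surrounded by dimension tables.
CREATE TABLE date_dim (
    date_key      INTEGER PRIMARY KEY,   -- surrogate key, e.g. 20240131
    calendar_date DATE,
    month_name    VARCHAR(20),
    year_num      INTEGER
);

CREATE TABLE customer_dim (
    customer_key  INTEGER PRIMARY KEY,   -- surrogate key
    customer_name VARCHAR(100),
    region        VARCHAR(50)
);

CREATE TABLE product_dim (
    product_key   INTEGER PRIMARY KEY,   -- surrogate key
    product_name  VARCHAR(100),
    category      VARCHAR(50)
);

-- Fact table: one row per sale, foreign keys to the dimensions plus additive measures.
CREATE TABLE sales_fact (
    date_key      INTEGER REFERENCES date_dim (date_key),
    customer_key  INTEGER REFERENCES customer_dim (customer_key),
    product_key   INTEGER REFERENCES product_dim (product_key),
    sales_amount  NUMERIC(12,2),
    quantity      INTEGER
);

In the universe you then join the fact table to each dimension (one-to-many, dimension to fact) and expose the dimension attributes and measures as objects, so end users can combine them freely in ad hoc reports.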
A) Go with dimensional modelling, assuming you are working with a data mart / data warehouse (see R. Kimball for more info).
B) Make sure adequate hardware has been purchased for BOE as well as the DBMS (Database Management System, such as Teradata, Oracle, IBM DB2, etc.)
C) Make sure network bandwidth and network latency are adequate between:
end-user --> BOE cluster --> DBMS
D) Explore database partitioning for large fact tables (a sketch for D and E follows this list).
E) Explore aggregate/summary tables for large fact tables.
F) Make sure the DBMS has been tuned for optimal query performance (not for ETL!), for example: current optimizer statistics, database indexes, SGA sizing (Oracle), database block size, etc. A sketch for F follows as well.
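To illustrate points D and E, here is a sketch using Oracle-style syntax and the hypothetical SALES_FACT / DATE_DIM tables from the earlier example; the exact syntax differs on Teradata, DB2, etc., so treat it only as a starting point.

-- D) Range-partition the large fact table by date so queries that filter on a
--    date range only scan the relevant partitions (partition pruning).
--    (Same hypothetical SALES_FACT as before, now created partitioned.)
CREATE TABLE sales_fact (
    date_key      INTEGER,
    customer_key  INTEGER,
    product_key   INTEGER,
    sales_amount  NUMBER(12,2),
    quantity      INTEGER
)
PARTITION BY RANGE (date_key) (
    PARTITION p2023 VALUES LESS THAN (20240101),
    PARTITION p2024 VALUES LESS THAN (20250101),
    PARTITION pmax  VALUES LESS THAN (MAXVALUE)
);

-- E) Pre-aggregate at the grain most ad hoc reports actually use
--    (here: year/month by product) so the detailed fact table is not hit every time.
CREATE MATERIALIZED VIEW sales_month_agg
BUILD IMMEDIATE
ENABLE QUERY REWRITE
AS
SELECT d.year_num,
       d.month_name,
       f.product_key,
       SUM(f.sales_amount) AS total_sales,
       SUM(f.quantity)     AS total_quantity
FROM   sales_fact f
JOIN   date_dim   d ON d.date_key = f.date_key
GROUP  BY d.year_num, d.month_name, f.product_key;

In the universe, @Aggregate_Aware can then point measure objects at SALES_MONTH_AGG first and fall back to SALES_FACT only when the query needs detail the aggregate does not carry.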
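For point F, a minimal sketch of the query-side tuning, again assuming Oracle and the same hypothetical objects; on other DBMSs the equivalents differ (COLLECT STATISTICS on Teradata, RUNSTATS on DB2).

-- Keep optimizer statistics current so the optimizer chooses good plans
-- after each warehouse load ('DWH' is a hypothetical schema name).
BEGIN
    DBMS_STATS.GATHER_TABLE_STATS(
        ownname => 'DWH',
        tabname => 'SALES_FACT',
        cascade => TRUE);   -- also gathers statistics on the table's indexes
END;
/

-- Index the fact table's foreign-key columns that ad hoc reports filter on.
-- On a star schema in Oracle, local bitmap indexes on these columns are a common choice.
CREATE BITMAP INDEX sales_fact_date_bix     ON sales_fact (date_key)     LOCAL;
CREATE BITMAP INDEX sales_fact_customer_bix ON sales_fact (customer_key) LOCAL;

Your DBA most likely schedules statistics gathering already; the point is to confirm it runs against the warehouse tables after each load and that the indexing strategy matches how the reports actually filter.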
Finally, unless this knowledge already exists in-house, get external experts to help you with these tasks.