BusinessObjects Board

Does anyone have a comparison of BODI with other ETL tools?

I’m currently using Informatica 7.1, but may have the opportunity to move to DI in a new job. I used Acta 4.3 a while ago, but it looks as though there have been a number of significant enhancements since then (OEM’d mainframe CDC looks particularly good). What I’d like to know is:

What is DI like, performance wise, compared to other tools?

Are there any new transforms to perform specific functions (such as the Normalizer and Sequence Generator transformations in Informatica)?

Is it possible to build custom, reusable transforms?

Is BO likely to have an Enterprise metadata solution (e.g. a CWM-based metadata repository)?

How easy is it to schedule DI jobs using an enterprise-wide scheduler?

Am I going to be frustrated moving from Informatica back to DI?

Are BO pushing DI as a stand-alone ETL tool, or mainly as part of the total BO suite?

Many thanks,

CJA


CJA (BOB member since 2005-03-15)

Are there any new transforms to perform specific functions (such as the Normalizer and Sequence Generator transformations in Informatica)?

Yes, there are built-in transforms to preserve history, do effective dating, generate artificial keys, and so on.

Is it possible to build custom, reusable transforms?

If by transforms you mean mappings of sources to targets with transformations in the middle, then yes, these can be reused. If you mean custom functions, you can write your own, but I don’t know how reusable they are across jobs etc. Not very, as far as I know.

How easy is it to schedule DI jobs using an enterprise-wide scheduler?

I think scheduling is a DI weakness, at least on the Windows platform.

Are BO pushing DI as a stand-alone ETL tool, or mainly as part of the total BO suite?

I think BO would like to have DI compete on its own – and it probably has the functionality and performance to do so. The reality is of course that it doesn’t have the market presence that Informatica or Ascential (IBM!) does, and so it probably gets sold more as part of total BI solutions vs. standalone ETL. It’s certainly less expensive than those other two tools…


dnewton :us: (BOB member since 2004-01-30)

Thanks for the information.

We also have a lot of ‘fun’ scheduling Informatica jobs, as typically one needs to span different platforms to get data from, say, a Cobol file to an end user report, and the Informatica scheduler only deals with its own objects.

As for the price of the tools, I think you’re right. I was reading the information relating to the new DI release and see it has mainframe CDC, data profiling and data lineage all ‘out of the box’ - these are all additional add-ons with Informatica. And all are expensive.

Has anyone used the DI data profiling? Is the data lineage and profiling available to end users?


CJA (BOB member since 2005-03-15)

I would throw in Embedded Dataflows, which let you reuse part of a flow somewhere else. For example, the same Embedded Dataflow for the extraction is used in both the InitialLoad DataFlow and the DeltaLoad DataFlow, but one loads the target via a truncate, the other via a Table Comparison.

We do not have a scheduler; we generate a batch file (shell script) and use the OS scheduler by default. The disadvantage is that we are as limited as the OS is; on the pro side, you have all the flexibility to use a scheduler of your choice.
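For example, on a Unix box this can be as simple as a crontab entry pointing at the generated launch script (the paths and job name below are invented for illustration; the script itself is whatever DI generates for the job):

```shell
# Hypothetical crontab entry (all paths invented): run the DI-generated
# launch script every night at 02:00 and append its output to a log file.
0 2 * * * /opt/di/jobs/JOB_LoadDW.sh >> /var/log/di/JOB_LoadDW.log 2>&1
```

Any enterprise scheduler that can execute an OS command can call the same script, which is where the flexibility comes from.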


Werner Daehn :de: (BOB member since 2004-12-17)

Lineage, impact analysis, etc. are all web-based, so you can set up users to view this data.

Profiling, in the sense of “look at the data and find patterns”, is done in the Designer only and is limited to basic operations: data distribution (e.g. the column GENDER is 60% “f”, 30% “m”, 10% “x”) plus min/max/count/count of nulls. Enhancements here are on the current plan.


Werner Daehn :de: (BOB member since 2004-12-17)

Correction: you can make custom functions (using the scripting language) and re-use those. For example, a function which converts a Julian date to a “normal” calendar date.
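DI’s own scripting syntax isn’t shown here, but as a rough illustration of the logic such a function would implement (assuming a YYYYDDD-style Julian format, common in Cobol files), here is a Python sketch:

```python
from datetime import date, timedelta

def julian_to_date(jul: str) -> date:
    """Convert a YYYYDDD Julian string (e.g. from a Cobol file) to a calendar date."""
    year = int(jul[:4])
    day_of_year = int(jul[4:])
    # Day 1 of the year is January 1st, so offset from there.
    return date(year, 1, 1) + timedelta(days=day_of_year - 1)

print(julian_to_date("2005074"))  # 2005-03-15
```

In DI you would write the equivalent once as a custom function and then call it from any mapping that needs it.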


dnewton :us: (BOB member since 2004-01-30)

Is the data profiling information available within the DI development environment? For instance, if I’m creating a mapping to extract data on individuals, can I see that 10% of these have ‘x’ in Gender, and therefore perform an appropriate action? Is there information available anywhere on what’s on the DI roadmap?

I also noticed that DI can import metadata in XMI format - is this easy to do?


CJA (BOB member since 2005-03-15)

Yes, in the Designer.
Please contact your local sales rep. He might be able to give you some materials under NDA (although right now is a bad time as we are deciding on the features for the end-of-the-year release).
You create a new XML object and import the XML schema - that’s it. Please be aware that DI handles hierarchical data natively, rather than as a set of relational elements the way other tools do.


Werner Daehn :de: (BOB member since 2004-12-17)

I think the question was whether DI can import metadata (as might be output from other repositories/products), versus whether DI can import/export XML data.

We’ve used the XML data import/export a fair amount, and it works well, once you wrap your arms around the hierarchical-to-relational mappings.
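As a rough illustration of what that hierarchical-to-relational mapping involves (the XML shape below is invented, not anything DI-specific), flattening means repeating each parent key on every child row:

```python
import xml.etree.ElementTree as ET

doc = """<orders>
  <order id="1"><line sku="A" qty="2"/><line sku="B" qty="1"/></order>
</orders>"""

rows = []
for order in ET.fromstring(doc):
    for line in order:
        # Flatten the hierarchy: repeat the parent key on every child row.
        rows.append((order.get("id"), line.get("sku"), int(line.get("qty"))))

print(rows)  # [('1', 'A', 2), ('1', 'B', 1)]
```

In DI the same idea shows up as unnesting an XML schema into flat target tables.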

But as far as importing metadata, I don’t know if it does this. There are some metadata bridges for other BO products, I think, like the BO and Crystal reporting tools.


dnewton :us: (BOB member since 2004-01-30)

Can DI jobs be called from SQL Server’s agent?

Just wondering: if the Data Warehouse is a SQL Server Data Warehouse, then it may be feasible to check that the warehouse is ready and has been backed up as step one, and then call the DI load package as step two.

Just curious because we’re a Microsoft house with DTS and I’m trying to get the boss to sign up for Data Integrator but with little luck at the moment.

SQL Agent can call batch jobs out to the OS, I believe, yes? So then yes, you could do this, but the DI server would have to be on the same machine as SQL Server for it to work.

But there are other means to accomplish your goal – the backup script could set a flag in a table to indicate a successful backup date, and DI checks that before running the dataflows.
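A minimal sketch of that flag check, with a hypothetical control table and written in Python rather than DI script, just to show the logic:

```python
from datetime import date

def backup_is_current(last_backup: date, today: date) -> bool:
    """True when the warehouse backup has already run for today's load window."""
    return last_backup >= today

# In practice a DI script step would read last_backup from a control table
# (table and column names are site-specific) and raise an error when the
# check fails, so the dataflows never run against an unbacked-up warehouse.
print(backup_is_current(date(2005, 3, 15), date(2005, 3, 15)))  # True
```

This decouples the backup from the load: each side only touches the shared flag table, so neither scheduler needs to know about the other.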


dnewton :us: (BOB member since 2004-01-30)