BusinessObjects Board

DI/DS Improvements & Suggestions

  1. I never use lookup_ext in a mapping, only as a new-function-call, for exactly that reason.

https://wiki.sdn.sap.com:443/wiki/display/BOBJ/lookup_ext

  1. Use two lookups for the OR. So if the condition should be col1=1 or col1=2, you can have two lookups, one per condition, and in the next step you decide which of the two you actually take via an ifthenelse() (see the sketch after this list). I know it's not perfect, but with ORs you would need to support brackets as well, and then the GUI gets really complex.

  2. Wait one or two months and DS 12.2 will support that!

  3. True.

  4. True. Can you list the most important ones for us?
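
Regarding point 1, here is a minimal Python sketch of the two-lookup workaround; the lookup functions are hypothetical stand-ins for the two lookup_ext() calls, each returning the looked-up value or None when nothing matches:

```python
def resolve(row, lookup_col1_eq_1, lookup_col1_eq_2):
    """Emulate 'col1 = 1 OR col1 = 2' with two single-condition
    lookups, then an ifthenelse()-style pick of whichever matched."""
    v1 = lookup_col1_eq_1(row)   # lookup_ext with condition col1 = 1
    v2 = lookup_col1_eq_2(row)   # lookup_ext with condition col1 = 2
    # ifthenelse(v1 IS NOT NULL, v1, v2)
    return v1 if v1 is not None else v2
```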


Werner Daehn :de: (BOB member since 2004-12-17)

A new transform, something different from History Preserve, more like “Existing Column Preserve”, that does the following:

  • If the data in the target table has a value, leave it alone
  • If the data in the target table is null, then fill in the incoming value

In this transform, you could click on each column and the tab/dialog area below would have options for:

  • Normal (pass-through)
  • Preserve if not null
  • User-specified condition (and you could put in your own WHERE-like criteria)

This is useful for columns where you always want to preserve a value, like “initial load date”, even if the row is later updated (rather than inserted). But you want to have the same dataflow, and not have to resort to bringing in the target table as a source.
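
To make the behavior concrete, a small sketch of the per-column logic (the option names are made up to mirror the list above; the user-specified-condition option is left out):

```python
PASS_THROUGH = "normal"
PRESERVE_IF_NOT_NULL = "preserve_if_not_null"

def merge_row(incoming, existing, rules):
    """incoming/existing: dicts of column -> value for the same key row;
    rules: column -> one of the hypothetical options above."""
    out = {}
    for col, new_val in incoming.items():
        old_val = existing.get(col)
        if rules.get(col) == PRESERVE_IF_NOT_NULL and old_val is not None:
            out[col] = old_val   # e.g. keep the original "initial load date"
        else:
            out[col] = new_val   # normal pass-through
    return out
```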

In fact, now that I think of it, this could be added into the Table Compare transform. Since you already have to read the entire target table’s row for that transform to work, you could have another tab in the table compare that lets you specify this column-by-column behavior.


dnewton :us: (BOB member since 2004-01-30)

How about the ability to have multiple instances of the Designer open at the same time, with each one pointing to a different repository?

For example, I’m in the development environment and want to go look at something in the production repository. Currently I have to exit Designer and restart it, point it to the prod system, look at what I need to (assuming that CRS hasn’t kicked in and I can remember what I was after), then lather-rinse-repeat to go back to my development environment.


nitrobrother :us: (BOB member since 2004-09-08)

How about a “View Where Used” ability for variables!!! Right now, is there a way to find it?


urk :us: (BOB member since 2005-11-29)

That would be fantastic. We recently tried to clean up our warehouse load because there were tons of variables defined, some of them old, but it was hard to find out where they were used.

Can you enter a new Message with SAP support and suggest this?


dnewton :us: (BOB member since 2004-01-30)

I had a requirement to compute an aggregated fact table, and only show the Top customers or Top products in the result. This was a minor pain to do, as it required several Query steps in succession:

  • Aggregate the measures
  • Sort by the customer/product and then by the Revenue, in DESC order
  • Add a “gen_row_num_by_group” column
  • Then filter to just the top 10.

If there was a “Top” transform, it could do all of the steps above. Just specify which columns are the dimensions, which are the measures, whether you want the highest or lowest values, and how big the TOP should be. It could do all of this in one step.
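
To make the idea concrete, here is a sketch of those steps collapsed into one function (plain Python over dicts, not DS syntax; it assumes step 1, the aggregation, is already done):

```python
from collections import defaultdict

def top_n_per_group(rows, group_col, measure_col, n=10):
    """rows: aggregated dicts, e.g. {'customer': ..., 'revenue': ...}.
    Steps: sort by group then measure DESC -> number rows per group
    (gen_row_num_by_group) -> keep row numbers <= n."""
    ordered = sorted(rows, key=lambda r: (r[group_col], -r[measure_col]))
    row_num = defaultdict(int)
    top = []
    for r in ordered:
        row_num[r[group_col]] += 1          # gen_row_num_by_group
        if row_num[r[group_col]] <= n:      # the "Top" filter
            top.append(r)
    return top
```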


dnewton :us: (BOB member since 2004-01-30)

KattaUR wrote: 
How about "View Where Used" ability for variables!!! right now is there a way to find it? 
That would be fantastic. We recently tried to clean up our warehouse load because there were tons of variables defined, some of them old, but it was hard to find out where they were used. 

Can you enter a new Message with SAP support and suggest this?

I made a suggestion to SAP support about this; hope they will make this feature available in a future release.


urk :us: (BOB member since 2005-11-29)

Here is the reply from SAP support:

We are aware of this limitation; customers have asked for a "where used" functionality for variables in the past. Unfortunately, this is not a small feature to implement, so it is not possible to put it into an intermediate release; it needs to be part of a major release. (It requires changes in the way we store variables in the repository: unlike what we do with jobs, dataflows and workflows, for variables we don't store the relationship between the variable and the places where it is used. So a "where used" today is not possible, or would require a full scan of the textual representation of a job.)
 
As a workaround you could query our repository and scan the AL_LANG and AL_LANGTEXT tables for the variable name. But I realize this is not a user-friendly solution at all ;-(
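
For anyone who wants to try that workaround, here is a hedged sketch of the scan from Python; the join and column names (OBJECT_KEY, PARENT_OBJID, TEXT_VALUE) are assumptions, so check them against your repository version first:

```python
import pyodbc

def find_variable_usage(conn_str, var_name):
    """Scan AL_LANG/AL_LANGTEXT for objects whose ATL text
    mentions the variable name."""
    sql = ("SELECT DISTINCT l.NAME "
           "FROM AL_LANG l "
           "JOIN AL_LANGTEXT t ON t.PARENT_OBJID = l.OBJECT_KEY "
           "WHERE t.TEXT_VALUE LIKE ?")
    with pyodbc.connect(conn_str) as conn:
        cur = conn.cursor()
        cur.execute(sql, f"%{var_name}%")
        return [row.NAME for row in cur.fetchall()]

# e.g. find_variable_usage("DSN=DS_REPO;UID=di;PWD=...", "$G_LoadDate")
```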

urk :us: (BOB member since 2005-11-29)

The facility to lock a datastore so that it won’t be overwritten by an ATL import - maybe an “Are you sure?” prompt.


gpn01 :uk: (BOB member since 2004-11-30)

My apologies if this was already mentioned before!

  • Job chaining:

The ability to call jobs on the same and on different job servers from other jobs - with success, fail, waiting, etc. statuses and associated error alert/reporting options.

This allows you to have master/child jobs; for some of the EDWs I’ve built, it just isn’t possible to stuff everything into a single monster job.

You can already do this today using command-line execution and so on, but it’s quite some work to develop a framework around it, and it would be very useful if Data Services offered this “natively”.
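
To give an idea of what that framework amounts to today, a hedged sketch (it assumes launch scripts exported per job via the Management Console’s export-execution-command option; the paths below are placeholders):

```python
import subprocess

CHILD_JOBS = [
    "/apps/ds/launch/Job_Load_Dims.sh",   # hypothetical exported scripts
    "/apps/ds/launch/Job_Load_Facts.sh",
]

def run_chain(jobs):
    """Run child jobs in order; stop and alert on the first failure."""
    for script in jobs:
        result = subprocess.run([script], capture_output=True, text=True)
        if result.returncode != 0:
            print(f"FAILED: {script}\n{result.stderr}")  # hook alerting here
            return False
        print(f"OK: {script}")
    return True
```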

  • Job, Workflow, Dataflow Process Execution metadata

There is a lot of metadata stored in the DI/DS repository when jobs are executed, but I would like the ability to capture and store that metadata elsewhere - for instance, in an ETL process schema in your data warehouse environment.

The execution data is available through the Web Admin interface, but I also store this data in my data warehouses for reporting on data loading performance etc.

You can already store this information by setting it up manually in your workflow designs, but it would be nice if this were already built in.

It would also be great if the error description could be exposed from within Data Services. Right now I can capture that an error happened (as part of Try/Catch), but it’s tricky to get the error message/description itself - you have to dive into the log file and retrieve it, which is a pretty messy affair (see the sketch below).

Yes, I know that’s all in the Admin interface, but it’s part of the overall process logging schema. I always want to know which records were inserted/updated/deleted under which process execution, and if errors happened, I also want my reports to show the error descriptions without having to provide DS Admin monitoring access to the consumers of the daily loading reports (data stewards and BAs at division level).
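
As a stopgap for the error-description point above, a hedged sketch of the “dive into the log file” step (the log directory layout and error_*.txt naming are assumptions about the job server install; adjust to yours):

```python
import glob
import os

def latest_error_text(log_dir):
    """Return the contents of the newest error log, or None."""
    logs = glob.glob(os.path.join(log_dir, "error_*.txt"))
    if not logs:
        return None
    newest = max(logs, key=os.path.getmtime)   # most recently written log
    with open(newest, encoding="utf-8", errors="replace") as f:
        return f.read()
```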


ErikR :new_zealand: (BOB member since 2007-01-10)

Since it’s been mentioned a couple of times: The ability to schedule Profiling jobs in an easy way.


dnewton :us: (BOB member since 2004-01-30)

I think the graphic ‘readability’ of Data Flows would be enhanced if the lines from the Inner Source(s) to a Query containing an Outer Join could be displayed as dashes ‘-------’ or dots ‘…’, while the lines from the Outer Source remained (as now) solid ‘_______’.

This would avoid having to click into the Query’s Outer Join tab to discover the nature of the join. Ideally, this facility would be configurable in Tools > Options.


GidonAtWork (BOB member since 2008-09-12)

'Nuther one: We recently moved our job server inside the firewall and I am FINALLY able to use the debugger (the port couldn’t/wouldn’t be opened through the 'wall). I would like the ability to set a breakpoint inside a script rather than having to do all those COBOL-like print() statements to show what’s going on (or, more accurately, what’s going wrong).


nitrobrother :us: (BOB member since 2004-09-08)

Definitely! We’ve had our live environment bitten by that before.


clynham :uk: (BOB member since 2007-08-17)

I agree this would be useful. In the meantime, the workaround I use is to rename the variable in a way that’s easy to reverse (e.g. $G_Name becomes $G_NameXXX), then perform a full validation. The various references to it are then reported as errors, and I then restore the variable to its original name. I’ve not tested this enough to prove it reveals absolutely all occurrences (e.g. buried in text strings or GUI dialogues), but it’s a start.


GidonAtWork (BOB member since 2008-09-12)

One of my larger beefs with Windows-based software is unnecessary mouse movement and excessive clicking just to navigate through what end up being fluff panes.

The DataQuality user-defined transform’s Python editor is a prime example of this ergonomic problem.

It takes six clicks to get there, and between each one you have to drag your mouse clear across the screen to the next button. If you right-click to go to the user-defined editor, it takes you to the page that takes you to the page that is the Python editor, and like I said, the buttons to get to the next page are on opposite sides of the screen. It’s the click-drag-click syndrome I never seem to get rid of.

My suggestion for this transform would be to integrate everything into one window and ditch the pretense that you’re using some fancy proprietary Python ‘smart’ editor. I would also ask for the smart editor to highlight reserved words (taking case into consideration) and to provide line numbers. And I would like the ability to redirect my stdout to a logfile for debugging purposes - this is really easy to do in the script, but it would be nice to have it as a simple on/off button.
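
The stdout redirect really is just a couple of lines at the top of the transform’s Python code; a minimal sketch (the logfile path is a placeholder):

```python
import sys

DEBUG_LOG = r"C:\temp\udt_debug.log"   # placeholder path; empty string = off

if DEBUG_LOG:                           # the wished-for on/off switch
    sys.stdout = open(DEBUG_LOG, "a", buffering=1)  # line-buffered append

print("made it past the parse step")    # now lands in the logfile
```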

There is also no longer an option to execute per dataflow. We have some fairly complex processes in DQ that are being migrated into DS, and several contain user-defined transforms that run per dataflow.

I’ve got buckets of Python code that I’m going to have to migrate from DQ 11.7, and the clunkiness of this transform is not going to aid in that one bit. :nonod:


jlynn73 :us: (BOB member since 2009-10-27)

Please, please, pretty please we need this “where used for variables” feature!


dlaplant :us: (BOB member since 2006-03-27)

I would really like to be able to use Active Directory as a data source/target!

I have several ETLs that need to extract data from Active Directory, modify it, and then load it back into AD.

As it currently stands, I had to write a VB script to read AD into a CSV and then read that. To load AD, I had to write another DF to create an LDIFDE upload input file and then execute LDIFDE within the WF.
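
For illustration, here is roughly what the extract half could look like if scripted in Python with the ldap3 library instead of VB (server, credentials and base DN are placeholders, not from my actual setup):

```python
from ldap3 import ALL, Connection, Server

server = Server("ad.example.com", get_info=ALL)
conn = Connection(server, user="EXAMPLE\\svc_etl",
                  password="...", auto_bind=True)
conn.search("dc=example,dc=com", "(objectClass=user)",
            attributes=["sAMAccountName", "mail"])
for entry in conn.entries:       # write these out as the CSV/staging rows
    print(entry.sAMAccountName, entry.mail)
```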

If I could just create a datastore connected to AD, it would be awesome. Or am I missing something?


sreimert (BOB member since 2009-03-11)

Do you have access to a SQL Server instance? You should be able to create a linked server in SQL Server to your AD server, and then maybe DI could access it through SQL instead.
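
A hedged sketch of that linked-server route, driven from Python here just for illustration (the ADSDSOObject/ADSI provider is the standard one for this, but the DSN and domain below are placeholders):

```python
import pyodbc

SETUP = """
EXEC sp_addlinkedserver @server = 'ADSI',
     @srvproduct = 'Active Directory Services',
     @provider = 'ADSDSOObject', @datasrc = 'adsdatasource';
"""
QUERY = """
SELECT * FROM OPENQUERY(ADSI,
  'SELECT sAMAccountName, mail
   FROM ''LDAP://DC=example,DC=com''
   WHERE objectClass = ''user''');
"""

with pyodbc.connect("DSN=SQLSERVER;Trusted_Connection=yes",
                    autocommit=True) as conn:
    conn.execute(SETUP)                      # one-time linked-server setup
    rows = conn.execute(QUERY).fetchall()    # DI could read this same query
```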


dnewton :us: (BOB member since 2004-01-30)

This thread has gotten pretty long, so I re-consolidated the postings into a new one:

This thread will be locked.


dnewton :us: (BOB member since 2004-01-30)