BODS using AWS Redshift as a target

We are trying to send data into AWS Redshift using DS 4.2.5.

We set up the JDBC driver from AWS as a JDBC adapter and found it would only allow a source connection… what?

We tried the AWS ODBC driver, which has no bulk loading, and performance was very slow. We tried the other ODBC drivers available (Simba, DataDirect (Progress), PostgreSQL) - the only one that had decent performance was DataDirect. Performance differed by a factor of 10 (2 minutes vs. 20 minutes to load), but the pricing was also higher by a factor of 100 compared to the others.

Is DataDirect our only option, or is there a setting we missed that would allow the JDBC adapter to act as a target?


Padraig.Martin (BOB member since 2016-03-11)

We ended up going in a different direction. AWS recommended loading the data into an S3 bucket and then running a Redshift COPY command to load the data. That sounded complicated and disconnected, which was the main reason we had gone down the path of JDBC, then ODBC, in the first place.

It turned out to be simpler than it sounded, with much better performance.

Loading data into S3 consisted of installing third-party software to mount S3 as a Windows drive (the process is similar on Linux).

In Data Services, we created file formats and wrote the files directly to the mounted S3 drive.

To load into Redshift, we used the AWS ODBC driver to create a datastore, then ran a SQL script for the COPY command.

sql('Redshift_Target', 'copy <redshift_schema.table> from \'s3://<s3_bucket>/<file>\' CREDENTIALS \'aws_access_key_id=<aws_access_key>;aws_secret_access_key=<aws_secret_key>\' csv dateformat as \'yyyy.mm.dd\'');
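For readability, the same call can be built up from global variables instead of one long literal. This is only a sketch, not the exact script we ran: the $G_* variable names are illustrative and would need to be declared in the job, and the inner single quotes are escaped with a backslash as the Data Services script language expects.

    $G_AccessKey = '<aws_access_key>';
    $G_SecretKey = '<aws_secret_key>';
    $G_S3Path    = 's3://<s3_bucket>/<file>';

    # Build the Redshift COPY statement and run it through the ODBC datastore
    sql('Redshift_Target',
        'copy <redshift_schema.table> from \'' || $G_S3Path || '\' '
        || 'CREDENTIALS \'aws_access_key_id=' || $G_AccessKey
        || ';aws_secret_access_key=' || $G_SecretKey || '\' '
        || 'csv dateformat as \'yyyy.mm.dd\'');

Keeping the path and credentials in variables also makes it easier to move the job between environments without editing the script text.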

8M records created a 4 GB file and took 4 minutes to load into Redshift.


Padraig.Martin (BOB member since 2016-03-11)

Hi Martin,
Why is third-party software required to mount S3 as a Windows drive?
If it's mandatory, what third-party software options are available?


iammanohar (BOB member since 2008-07-22)