We are loading data into temporary tables. To improve the performance we have used dbms_stats.gather_table_stats. It has improved the performance a lot. Are there any other ways to improve performance?? How does dbms_stats.gather_table_stats improves performance?
Having up-to-date statistics for the tables does help when querying data, e.g. joining two tables. For loading it does not help at all.
Just imagine you join table A with B.
B might have zero rows and A millions. If that’s the case, the best join method is to use B as driving table and if it is really empty the answer is simple: join returns no rows.
A & B might be large but still fit into memory - calls for a sort-merge-join.
A & B are large and given the histogram of the joined columns one can expect that only a few rows match - hash join is perfect.
For loading turning on the API bulkloader is quite nice. But make sure you know the limitations of the Oracle direct path load upfront and they do not bother you in the dataflow in question (e.g. perfect for appending data to a table never updated).