While doing ETL performance is of utmost importance. Keeping this is mind how do we load data to Oracle server using SSIS in the most optimum way? We can follow the conventional method of using OLEDB Destination in a Data Flow Task. This is what is used most often. I too used this to perform ETL. But was not very happy with the kind of performance I got. I looked for other options available and came to know about the SQL*Loader. now how do we use this with SSIS?
Step 1. If the source is a text file very good!! If not don't worry, use SSIS Data Flow Task to dump the data to a flat file(C:\Data\File.csv). We have the data in the following format after step 1. Columns are Name, Age & Sex (but we do not have them in the output - not mandatory would need to modify step in that case)
Sam,20,M
Daniel,59,M
John,45,M
Step 2. Create a Control file and name it Loader.ctl and paste the below code in there:
LOAD DATA
INFILE 'C:\Data\File.csv'
REPLACE
INTO TABLE <Schema>.<TableName>
FIELDS Terminated by "," TRAILING NULLCOLS
(Name, Age, Sex)
Step 3. Create a Batch file in the same folder where you create the above control file. In the batch file put the below code:
sqlldr username@server/password control=loader.ctl
Step 4. Create the package, After the 1st data flow task where you dump the data to a flat file. Have an Execute Process Task to point to the above batch file. Execute the package and your data will be loaded to Oracle table in much shorter duration.
Note- You can use SQL loader control file for a number of other scenarios not necessarily the one I showcased here. For more on SQL*Loader check this link.