Getting Started with Adobe After Effects - Part 6: Motion Blur


Upload Image Close it
Select File

Browse by Tags · View All
#SQLServer 46
SQL Server 44
SSRS EXPERT 41
BI 36
#BI 33
brh 28
SSIS BEST PRACTICES 23
TSQL 19
MS BI STACKS 18
SSRS 17

Archive · View All
March 2011 24
July 2011 17
February 2011 15
April 2010 13
June 2011 11
May 2011 11
April 2011 11
July 2010 11
March 2010 8
December 2011 7

Incremental Data Loading – 3 day rule

Feb 24 2011 2:21AM by Sherry Li   

Incremental loading is not a easy topic. I am just concentrating on a very narrow focus here.

Here is the context of this blog.

  1. Data Source: relational database in DB2
  2. Data Destination: a table in SQL database
  3. Reporting: a SSRS report with direct data pull from the above table
  4. ETL: Implemented in a SSIS package with some business rules built-in
  5. Daily Process: newly added data from the data source needs to be loaded into the destination table

Challenge of the ETL Process:

  1. Need to be re-startable: without any manual setting
  2. Need to be fast: the source can contain large number of records
  3. Need to be self-correctable: if source data was corrected and back dated, the destination data should be self-correctable

What it really means is to choose among the following options regarding the incremental daily loading:

  1. truncate and re-populate the destination table daily
  2. start from the last date from the destination table: I will need to pass the last date from my SQL table. I can either read it in as a user variable in my SSIS package, and pass it to my query, or I can just “ETL” the last date to the source DB2 database.
  3. just pull the previous day’s data from the source
  4. pull the previous 3 days’ data from the source

The first choice is simple enough, but will suffer from poor performance.

The second choice can be a little messy, but it sounds like a good choice.

The third choice is simple enough, but not a good one when considering that our SQL job can potentially fail every day, in which case, data will be missing due to job failure.

The last choice is simple, and it will pull data for the last 3 days even when your job failed in the last 2 days. And it also should be fast. Fortunately SQL jobs are fixed Monday to Friday in our environment.

For the sake of simplicity, I picked the last option for some of my ETL processes. So far, the 3 day rule has worked pretty smoothly.

Tags: ETL BEST PRACTICES,


Sherry Li
14 · 12% · 3697
0
Liked
 
0
Lifesaver
 
0
Refreshed
 
0
Learned
 
0
Incorrect



Submit

Your Comment


Sign Up or Login to post a comment.

    Copyright © Rivera Informatic Private Ltd Contact us      Privacy Policy      Terms of use      Report Abuse      Advertising      [ZULU1097]