Pages

Saturday 9 February 2013

Redirect Duplicate Rows Using SSIS Step By Step


Redirect Duplicate Rows Using SSIS

There are many ways to redirect the duplicate into another direction. You can use Script Component, Lookup Transformation etc. I am showing you the simplest way to redirect the duplicate. For this, here I am using Fuzzy Group Transformation to redirect the duplicate into another direction.
Here, I am taking Repeat_Data table from my Employee Database. In this table I have created some duplicate records for better understand.


Now Go to BIDS, Start New projects, Name the Project.
Take Data flow Task and double click on it.
Take OLEDB Data Source, double click on it and configure it.


Choose Table Repeat_Data.


Go to Columns and take all columns.


Take Fuzzy Group Transformation and double Click on it. A Fuzzy Group Transformation editor will open. Leave the first tab that is Connection Manager . Click the second tab i.e. Columns and configure it as directed below.


 Make the Minimum Similarity of name column to 0.25.
Go to Advanced tab and set the Threshold to 0.50 and press OK button.


Take Conditional Split Transformation and double Click on it and write Condition.


Condition: _key_in == _key_out
Make the Default output name as Duplicate Rows.

Your all configuration has completed now you can take any destination where you want to keep your unique records and where you want to keep your duplicate records. Here I am taking Multicast Transformation for just showing you the output of this configuration with the help of Data Viewer.


It is the Preview of your Package.



(Note: It is very easy way to redirect the duplicate into another direction but it has a drawback. It is a blocking transformation so if you use it then may be  you have some performance issue.)