Redirect
Duplicate Rows Using SSIS
There
are many ways to redirect the duplicate into another direction. You can use
Script Component, Lookup Transformation etc. I am showing you the simplest way
to redirect the duplicate. For this, here I am using Fuzzy Group Transformation to
redirect the duplicate into another direction.
Here, I
am taking Repeat_Data table from my Employee Database. In this table I have
created some duplicate records for better understand.
Now Go
to BIDS, Start New projects, Name the Project.
Take Data flow Task
and double click on it.
Take OLEDB Data Source,
double click on it and configure it.
Choose
Table Repeat_Data.
Go to Columns and take all columns.
Take Fuzzy Group
Transformation and double Click on it. A Fuzzy Group Transformation editor
will open. Leave the first tab that is Connection Manager . Click the second
tab i.e. Columns
and configure it as directed below.
Make the Minimum Similarity of
name column to 0.25.
Go to Advanced tab
and set the Threshold
to
0.50 and press OK button.
Take Conditional Split Transformation and
double Click on it and write Condition.
Condition: _key_in == _key_out
Make the
Default output
name as Duplicate Rows.
Your all
configuration has completed now you can take any destination where you want to
keep your unique records and where you want to keep your duplicate records.
Here I am taking Multicast Transformation for just showing you the output of
this configuration with the help of Data Viewer.
It is the Preview of your Package.
(Note: It is very easy way to redirect the duplicate into
another direction but it has a drawback. It is a blocking transformation so if
you use it then may be you have some performance
issue.)