Remove Duplicate Records Using SSIS Step by Step
It is very easy task to remove the duplicate rows from the
table using SSIS package. To remove the duplicate records in SSIS there are multiple ways and I am using SORT
transformation. Below I am showing you a simple example for removing the
duplicates.
Go to the BIDS >> choose new project >> choose
Integration Service Project >> give your package name and press OK button.
Now inside the Toolbox from Control flow item select the Data Flow Task
and drag and drop into design area.
Rename the Data Flow Task.
Double click on Data Flow Task OR Right Click on Data Flow Task
and choose edit.
Inside the Data Flow Task choose OLEDB Data Source and Drag and
drop it into design area.
Rename it.
Double click on OLEDB Data Source.
OLEDB Data Source Editor
will open. Now click on NEW button.
Again a new Editor
will open here we again click on NEW button to configure the connection from
the Database.
Here we create connection to Database. For this I am using AdventureWorks
Database So I use our local Server and Database AdventureWorks. After choosing
the Database just check that everything is ok or not by clicking Test Connection
button. If everything is ok then press OK button.
Click on Name of the Table or View and choose the table then
click on column to select the column.
Now choose the columns which you want to display and click
on OK.
Drag the SORT Transformation and double click on it for
configure.
Here enable the check box i.e. at the bottom of the
SORT Transformation to remove the
duplicate row and press OK.
Now you can choose any destination to load the data. I am
using derived column to show you the result.
Here Data Viewer Shows the output rows.