Pages

Wednesday, 3 October 2012

Remove Duplicate Records Using SSIS


 Remove Duplicate Records Using SSIS Step by Step



It is very easy task to remove the duplicate rows from the table using SSIS package. To remove the duplicate records in SSIS there are multiple ways and I am using SORT transformation. Below I am showing you a simple example for removing the duplicates.

Go to the BIDS >> choose new project >> choose Integration Service Project >> give your package name and press OK button.
Now inside the Toolbox from Control flow item select the Data Flow Task and drag and drop into design area.
Rename the Data Flow Task.


Double click on Data Flow Task OR Right Click on Data Flow Task and choose edit.


Inside the Data Flow Task choose OLEDB Data Source and Drag and drop it into design area.
Rename it.
Double click on OLEDB Data Source.



 OLEDB Data Source Editor will open. Now click on NEW button.

 Again a new Editor will open here we again click on NEW button to configure the connection from the Database.



Here we create  connection to Database. For this I am using AdventureWorks Database So I use our local Server and Database AdventureWorks. After choosing the Database just check that everything is ok or not by clicking Test Connection button. If everything is ok then press OK button.


Click on Name of the Table or View and choose the table then click on column to select the column.


Now choose the columns which you want to display and click on OK.



Drag the SORT Transformation and double click on it for configure.


Here enable the check box i.e. at the bottom of the SORT  Transformation to remove the duplicate row and press OK.
Now you can choose any destination to load the data. I am using derived column to show you the result.
Here Data Viewer Shows the output rows.








6 comments:

  1. yes it is useful for removing duplicates but i want to send all duplicate records into another destination and uniq record into one destination

    ReplyDelete
  2. yes it is very help full but i have to sort all the fields and had some null values init and need to delete the duplicate record..how is it posssible???
    pls let me know...

    ReplyDelete
  3. This post is probably where I got the most useful information for my research. Thanks for posting, maybe we can see more on this.
    Are you aware of any other websites on this
    msbi


    ReplyDelete
  4. Awesome post
    This is such a great resource that you are providing and you give it away for free. I love seeing websites that understand the value of providing a quality resource for free. It?s the old what goes around comes around routine. Did you acquired lots of links and I see lots of trackbacks??
    Hadoop training

    ReplyDelete
  5. This post saved me a lot of trouble. Thank you for sharing this valuable information.

    ReplyDelete
  6. Nice explanation with snapshots, keep it up for more Dotnet online training

    ReplyDelete