Pages

Thursday 21 March 2013

Difference between MDX and T-SQL


MDX vs T-SQL

i)                    MDX can reference multiple dimensions, while SQL refers to only two dimensions, columns and rows, when processing queries.

ii)                  In SQL, the SELECT clause is used to define the column layout for a query, while the WHERE clause is used to define the row layout. However, in MDX the SELECT clause can be used to define several axis dimensions, while the WHERE clause is used to restrict multidimensional data to a specific dimension or member.

iii)                The process of creating an MDX query also differs from that of creating an SQL query.

iv)                The visualization of an SQL result set is intuitive: the set is a two-dimensional grid of columns and rows. The visualization of an MDX result set is less intuitive, however, because a multidimensional result set can have more than three dimensions.

v)                  To refer to such two-dimensional data in SQL, the name of a column and the unique identification of a row, in whatever method is appropriate for the data, are used to refer to a single cell of data, called a field. However, MDX uses a very specific and uniform syntax to refer to cells of data, whether the data forms a single cell or a group of cells.
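To make point (ii) concrete, here is a hedged sketch of the same question asked both ways; the table, cube, dimension, and measure names are all hypothetical:

```sql
-- SQL: the SELECT clause lays out the columns; WHERE restricts the rows.
-- (Hypothetical table and column names.)
SELECT   City, SUM(SalesAmount) AS TotalSales
FROM     dbo.FactSales
WHERE    CalendarYear = 2012
GROUP BY City;
```

```mdx
-- MDX: SELECT places sets on axes; WHERE is a slicer, not a row filter.
-- (Hypothetical cube, dimension, and measure names.)
SELECT
    { [Measures].[Sales Amount] }  ON COLUMNS,
    { [Geography].[City].Members } ON ROWS
FROM  [Sales]
WHERE ( [Date].[Calendar Year].&[2012] )
```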

Saturday 9 March 2013

Difference Between Script Task and Script Component

Script Task VS Script Component :

While both objects provide scripting environments using the VSTA engine, there are inherent differences in their purpose and functions. This is due to their design architecture and their placement in the development environment.

Ø  The main difference between the Script Task and the Script Component is the way each executes its script. The Script Task executes the script only once for each execution instance of the task, while the Script Component generally executes the script once for each row, as it applies the custom transformation to each data row it reads.

Ø  The Script Task is available in the Control Flow task tab, while the Script Component works in the pipeline or the Data Flow Task.

Ø  You can configure the Script Component as a source, a transformation, or a destination. The choice of function, along with the metadata and properties you configure in the editor, results in the creation of different members of the base classes in the auto-generated code. The Script Task does not provide any such facility, as the code generated is always the same.

Ø  The Dts property works with only the Script Task and it is not available in the Script Component.

Ø  The Script Task supports breakpoints, while the Script Component does not.
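As a rough sketch of the execution difference (abbreviated stubs, not the full auto-generated classes; the column name is hypothetical):

```csharp
// Script Task: Main() runs once per execution instance of the task.
public void Main()
{
    // ... control-flow logic runs a single time ...
    Dts.TaskResult = (int)ScriptResults.Success;   // the Dts property exists only here
}

// Script Component (transformation): called once per data row.
public override void Input0_ProcessInputRow(Input0Buffer Row)
{
    Row.City = Row.City.Trim();   // hypothetical per-row transformation
}
```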

Monday 4 March 2013

Nonblocking, Partial Blocking and Blocking Transformations In SSIS


Classifying Data Flow Transformations on performance considerations:

Data Flow transformations fall into three classifications based mainly on performance considerations.

Nonblocking Synchronous Row-Based Transformations:

These transformations work on a row-by-row basis and modify the data by changing the data in columns, adding new columns, or removing columns from the data rows, but they do not add new rows to the data flow. The rows that arrive at the input are the rows that leave at the output of these transformations. These transformations have synchronous outputs and make data rows available to the downstream components straightaway. In fact, they do not cause data to move from one buffer to another; instead, they traverse the data buffer and make changes to the data columns in place. So these transformations are efficient and lightweight from a processing point of view. Transformations that readily pass data to the downstream components in this way are classified as Row Transformations.

The following Row Transformations are examples of Nonblocking synchronous row-based transformations:

Ø  Audit transformation
Ø  Character Map transformation
Ø  Conditional Split transformation
Ø  Copy Column transformation
Ø  Data Conversion transformation
Ø  Derived Column transformation
Ø  Export Column transformation
Ø  Import Column transformation
Ø  Lookup transformation
Ø  Multicast transformation
Ø  OLE DB Command transformation
Ø  Percentage Sampling transformation
Ø  Row Count transformation
Ø  Row Sampling transformation
Ø  Script Component transformation configured as a Nonblocking Synchronous Row-based transformation
Ø  Slowly Changing Dimension transformation

Partially Blocking Asynchronous Row-Set-Based Transformations:

Integration Services provides some transformations that essentially add new rows in the
data flow and hold on to the data buffers before they can perform the transformation.
The nature of such transformations is defined as Partially Blocking because these
transformations hold on to data for a while before they start releasing buffers. As
these transformations add new rows in the data flow, they are asynchronous in
nature. For example, a Merge transformation combines two sorted data sets that may
be the same data, but it essentially adds new rows in the data flow. The time taken
by this transformation before starting to release the buffers depends on when these
components receive matching data from both inputs.

Following is a list of Partially Blocking asynchronous row-set-based transformations:

Ø  Data Mining Query transformation
Ø  Merge Join transformation
Ø  Merge transformation
Ø  Pivot transformation
Ø  Term Lookup transformation
Ø  UnPivot transformation
Ø  Union All transformation
Ø  Script Component transformation configured as a Partially Blocking Asynchronous Row-set-based transformation

Blocking Asynchronous Full Row-Set-Based Transformations:

These transformations require all the rows to be assembled before they can perform
their operation. These transformations can also change the number of rows in the
data flow and have asynchronous outputs. For example, the Aggregate transformation
needs to see each and every row to perform aggregations such as summation or
finding an average. Similarly, the Sort transformation needs to see all the rows to
sort them in proper order. This requirement of collecting all rows before performing
a transformation operation puts a heavy load on both processor and memory of the
server. You can understand from the nature of the function they perform that these
transformations block data until they have seen all the rows and do not pass the data to
the downstream component until they have finished their operation.

The following are Blocking asynchronous full row-set-based transformations:

Ø  Aggregate transformation
Ø  Fuzzy Grouping transformation
Ø  Fuzzy Lookup transformation
Ø  Sort transformation
Ø  Term Extraction transformation



Resources:
For more detailed information regarding this study, please see
Ø  Microsoft SQL Server 2008 Integration Services by Ashwani Nanda
Ø  Microsoft Books Online




Saturday 2 March 2013

STORED PROCEDURE AND RULES FOR CREATING STORED PROCEDURE


STORED PROCEDURE AND RULES FOR CREATING STORED PROCEDURE IN T-SQL  

Stored Procedure:

To create a stored procedure, the CREATE PROCEDURE statement is used. Stored procedures are created in the current database unless a temporary stored procedure is created in tempdb. Because stored procedures are precompiled, they usually provide the best performance of any type of query.

Rules for creating Stored procedure:

i)                    The name must follow the rules for identifiers.
ii)                  Referenced objects must exist when your stored procedure runs.
iii)                You can't create an object, drop it, and then create a new object with the same name within a single stored procedure.
iv)                Temporary tables can be referenced within the stored procedure. Local temporary tables disappear when the procedure ends.
v)                  Up to 2,100 parameters are allowed.
vi)                Procedures can be nested within procedures up to 32 levels deep.
vii)              Stored procedures can't have the following T-SQL CREATE statements in them:
Ø  CREATE DEFAULT
Ø  CREATE RULE
Ø  CREATE PROCEDURE
Ø  CREATE TRIGGER
Ø  CREATE VIEW
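As a minimal sketch of the statements above (the table, column, and procedure names are hypothetical):

```sql
-- Created in the current database; prefix the name with # to create a
-- temporary procedure in tempdb instead.
CREATE PROCEDURE dbo.usp_GetEmployeesByCity
    @City NVARCHAR(50)
AS
BEGIN
    SET NOCOUNT ON;
    -- dbo.Employees must exist when the procedure runs, not when it is created.
    SELECT EmployeeID, FirstName, LastName
    FROM dbo.Employees
    WHERE City = @City;
END;
GO

EXEC dbo.usp_GetEmployeesByCity @City = N'London';
```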


TRIGGERS, TRIGGERS PERFORMANCE AND RULES FOR CREATING TRIGGERS


TRIGGERS, TRIGGERS PERFORMANCE AND RULES FOR CREATING TRIGGERS IN T-SQL

Triggers:

A Trigger is a special type of stored procedure that's automatically invoked when a user tries to modify data that it's designed to protect. Triggers help secure data integrity by preventing unauthorized or inconsistent changes from being made.

Triggers Performance:

From the performance point of view, triggers have a relatively low amount of overhead. Most of the time involved in running a trigger is spent referencing other tables. The referencing can be fast if the other tables are in memory, or a bit slower if they must be read from disk.

Rules for creating Triggers:

i)                    Triggers can’t be created on temporary tables. They can, however, reference views and temporary tables.
ii)                  Triggers can't return result sets. Using an IF EXISTS clause as part of a SELECT statement in trigger code is a common practice.
iii)                Triggers should be used to maintain data integrity, maintain referential integrity, and encapsulate business rules.
iv)                Triggers can be encrypted in the syscomments table if you specify the WITH ENCRYPTION option.
v)                  WRITETEXT statements don't activate triggers. They are used to modify text or image data, and they are nonlogged operations.
vi)                The following SQL Statement can’t be used in a trigger:
Ø  All CREATE Statements
Ø  All DROP Statements
Ø  ALTER TABLE and ALTER DATABASE
Ø  TRUNCATE TABLE
Ø  RECONFIGURE
Ø  LOAD DATABASE or TRANSACTION
Ø  UPDATE  STATISTICS
Ø  SELECT INTO
Ø  All DISK statements
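As a sketch of these rules in practice (the table and columns are hypothetical), note the IF EXISTS pattern and that the trigger itself returns no result set:

```sql
-- Prevent salary reductions on a hypothetical Employees table.
CREATE TRIGGER dbo.trg_Employees_ProtectSalary
ON dbo.Employees
AFTER UPDATE
AS
BEGIN
    IF EXISTS (SELECT 1
               FROM inserted i
               JOIN deleted d ON d.EmployeeID = i.EmployeeID
               WHERE i.Salary < d.Salary)
    BEGIN
        RAISERROR ('Salary reductions are not allowed.', 16, 1);
        ROLLBACK TRANSACTION;
    END
END;
```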

Saturday 9 February 2013

Redirect Duplicate Rows Using SSIS Step By Step


Redirect Duplicate Rows Using SSIS

There are many ways to redirect duplicate rows to another output. You can use a Script Component, the Lookup transformation, and so on. Here I am showing you one of the simplest ways: using the Fuzzy Grouping transformation to redirect duplicates to another output.
Here, I am taking the Repeat_Data table from my Employee database. In this table I have created some duplicate records for better understanding.


Now go to BIDS, start a new project, and name the project.
Take a Data Flow Task and double-click on it.
Take an OLE DB Source, double-click on it, and configure it.


Choose Table Repeat_Data.


Go to Columns and take all columns.


Take the Fuzzy Grouping transformation and double-click on it. The Fuzzy Grouping transformation editor will open. Leave the first tab, Connection Manager, as is. Click the second tab, Columns, and configure it as directed below.


Set the Minimum Similarity of the Name column to 0.25.
Go to the Advanced tab, set the Threshold to 0.50, and press the OK button.


Take the Conditional Split transformation, double-click on it, and write the condition.


Condition: _key_in == _key_out   (a row whose key points to itself is the canonical, unique row)
Name the default output Duplicate Rows.

All your configuration is now complete; you can take any destination where you want to keep your unique records and where you want to keep your duplicate records. Here I am taking a Multicast transformation just to show you the output of this configuration with the help of a Data Viewer.


Here is a preview of your package.



(Note: This is a very easy way to redirect duplicates to another output, but it has a drawback: Fuzzy Grouping is a blocking transformation, so using it may cause performance issues.)
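If your duplicates are exact matches rather than fuzzy ones, a lighter-weight alternative is to number the rows in the source query itself and split on that; this is only a sketch, and the Repeat_Data column names are assumed:

```sql
-- rn = 1 marks the first occurrence of each (Name, City) pair;
-- rn > 1 marks the duplicates, which a Conditional Split (rn > 1)
-- can then redirect without any blocking transformation.
SELECT  *,
        ROW_NUMBER() OVER (PARTITION BY Name, City
                           ORDER BY EmployeeID) AS rn
FROM    dbo.Repeat_Data;
```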



Saturday 2 February 2013

Dynamic Column Report In SSRS Step By Step

Dynamic Column Report In SSRS


I am showing you how to create a dynamic column report. This means that when a user selects two columns, the report displays only those two columns, and if they select five columns, it displays five columns. So we can say that the report displays fields on the basis of the user's selection.
For this we have to create two datasets.
Here is an example from the Northwind database.
Create the first dataset, named Hussain_Dataset.


SQL Scripts:

SELECT        CustomerID, CompanyName, ContactName, ContactTitle, Address, City, PostalCode, Country, Phone FROM  Customers

Insert a Table in design area and create a Table Like that: 


Now go to the datasets and create another dataset, Hussain_Dataset1.

SQL Scripts:

SELECT        1 AS ID, 'Customer Id' AS CustomerID
UNION
SELECT        2 AS ID, 'Company Name' AS CompanyName
UNION
SELECT        3 AS ID, 'Contact Name' AS ContactName
UNION
SELECT        4 AS ID, 'Contact Title' AS ContactTitle
UNION
SELECT        5 AS ID, 'Address' AS Address
UNION
SELECT        6 AS ID, 'Postal Code' AS PostalCode
UNION
SELECT        7 AS ID, 'Country' AS Country
UNION
SELECT        8 AS ID, 'City' AS City
UNION
SELECT        9 AS ID, 'Phone' AS Phone
Now create a parameter named DCParameter and check Allow Multiple Values.

Go to Available Values, choose Get values from a query, set the dataset to Hussain_Dataset1, the Value field to CustomerID, and the Label field to CustomerID.

And now it is time to set the column visibility. For this, we go to the first column of the tablix, right-click, and then click the Column Visibility option.





A popup box will appear. Here we choose Show or hide based on an expression and click the fx button.
Another popup box will appear, where we write the expression.


Expression :

=IIF(InStr(JOIN(Parameters!DCParameter.Value,","),"Customer Id")>0,False,True)

We have to perform this step for each column of the tablix.

(Note:Don’t forget to modify the expression for the respective column name accordingly.)
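For instance, for the Company Name column the expression would presumably become:

```
=IIF(InStr(JOIN(Parameters!DCParameter.Value,","),"Company Name")>0,False,True)
```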

Your dynamic column report is now complete. Run the report. Here is a preview of your dynamic column report.