Tuesday 4 December 2012

New Features In SSIS 2012

1. GUI Improvements -
-Sort packages by name
-Package visualization
-Zoom
-Data flow source/destination wizard
-Grouping in dataflow

2. CDC (Change Data Capture) Task and Components -
-CDC supports incremental loads: instead of reloading everything, only the rows that have changed since the last load are processed.
-CDC needs to keep track of which changes have already been processed.
-The CDC Control task does this by storing LSNs (log sequence numbers) in a tracking table.
-The CDC source component reads from the CDC table function, based on the LSN range it receives from the CDC Control task.
-The CDC splitter transformation splits records into new rows, updated rows and deleted rows.
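Note that CDC must first be enabled on the source database and table before these components can read any changes. A minimal T-SQL sketch (the database and table names here are illustrative):

USE SourceDB
GO
EXEC sys.sp_cdc_enable_db
GO
EXEC sys.sp_cdc_enable_table
    @source_schema = N'dbo',
    @source_name   = N'Customer',
    @role_name     = NULL   -- no gating role for change-data access
GO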

3. Flat File Connection Manager Changes -
-The Flat File connection manager now supports parsing files with embedded qualifiers. The connection manager also by default always checks for row delimiters to enable the correct parsing of files with rows that are missing column fields. The Flat File Source now supports a varying number of columns, and embedded qualifiers.

4. Offline Connection Managers -
-Integration Services now validates all connections before validating all of the data flow components when a package is opened and sets any connections that are slow or unavailable to work offline. This helps to reduce the delay in validating the package data flow. After a package is opened, you can also turn off a connection by right-clicking the connection manager in the Connection Managers area and then clicking Work Offline. This can speed up operations in the SSIS Designer.

5. New Functions/Expressions in SSIS 2012-
LEFT:
- You can now easily return the leftmost portion of a string rather than use the SUBSTRING function. The LEFT syntax is the same as we know in T-SQL:
LEFT(character_expression,number)

REPLACENULL:
- You can use this function to replace NULL values in the first argument with the expression specified in the second argument. This is equivalent to ISNULL in T-SQL:
REPLACENULL(expression, expression)

TOKEN:
- This function allows you to return a substring by using delimiters to separate a string into tokens and then specifying which occurrence to return:
TOKEN(character_expression, delimiter_string, occurrence)

TOKENCOUNT:
- This function uses delimiters to separate a string into tokens and then returns the count of tokens found within the string:
TOKENCOUNT(character_expression, delimiter_string)
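A few hedged examples of how these evaluate (the values are illustrative; MiddleName stands in for any nullable column):

LEFT("Integration", 3) returns "Int"
REPLACENULL(MiddleName, "N/A") returns "N/A" when MiddleName is NULL
TOKEN("192.168.1.1", ".", 2) returns "168"
TOKENCOUNT("192.168.1.1", ".") returns 4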

6. Easy Column Remapping in Data Flow (Mapping Data Flow Columns)
-When modifying a data flow, column remapping is sometimes needed
-SSIS 2012 maps columns by name instead of by lineage ID
-It also has an improved remapping dialog

7. Shared Connection Managers -
-You can create connection managers at the project level that can be shared by multiple packages in the project. A connection manager created at the project level is automatically visible in the Connection Managers tab of the SSIS Designer window for all packages.
-When you convert a shared connection manager back to a regular (package-level) connection manager, it disappears from all other packages.

8. Scripting Enhancements-
-The Script task and Script component now support .NET Framework 4.0.
-Breakpoints are now supported in the Script component.

9. ODBC Source and Destination -
-ODBC was not natively supported in 2008
-SSIS 2012 has ODBC source & destination
-SSIS 2008 could access ODBC via ADO.NET

10. Reduced Memory Usage by the Merge and Merge Join Transformations –
The old SSIS Merge and Merge Join transformations, although helpful, used a lot of system resources and could be memory hogs. In 2012 these transformations are much more robust and reliable. Most importantly, they no longer consume excessive memory when the multiple inputs produce data at uneven rates.

11. Undo/Redo - One thing that annoyed users of SSIS before 2012 was the lack of Undo and Redo support; once you performed an operation, you could not undo it. SSIS 2012 adds undo/redo support in the designer.

Thursday 4 October 2012

Reporting Services Components - Databases

Reporting Services creates and uses two SQL Server 2008 databases named ReportServer and ReportServerTempDB by default. The ReportServer database has tables to store reports, data sources, snapshots, subscriptions, etc. It is the repository for the reporting application data. The ReportServerTempDB database is used for temporary storage needs, as the default name suggests. When you allow the SQL Server 2008 installer to configure Reporting Services, these two databases are created automatically for you. If you want to create them manually, you need to use the Reporting Services Configuration Manager. In this tutorial we assume that the SQL Server installer is used to configure Reporting Services.

Reporting Services Components


SQL Server Reporting Services 2008 (SSRS) is a full-featured application that provides report design, development, testing, and deployment.  In this section we will introduce the main components of the product:

SQL Server Reporting Services

SQL Server Reporting Services 2008 (SSRS) is a feature included in the SQL Server 2008 product. We use SSRS to design, develop, test, and deploy reports. SSRS was originally slated to be released with SQL Server 2005, but it wound up being released a little bit ahead of SQL Server 2005. SSRS leverages the Business Intelligence Development Studio (BIDS) developer tool for all aspects of authoring and deploying reports. BIDS is included with SQL Server 2008.

In this tutorial we will step through a number of topics that you need to understand to successfully build a report.  Our high level outline is as follows:
  • Reporting Services Components
  • Install Reporting Services
  • Business Intelligence Development Studio (BIDS) 
  • Install Sample Database
  • Create a Simple Report with the Wizard
  • Create a Simple Report with the Report Designer
  • Deploy Reports
  • Configure Report Manager Security


Wednesday 26 September 2012

SQL Server GROUPING (Transact-SQL)


Hello Friends,
You may or may not have heard about this function. It is a very interesting function that you should know, so here is a brief overview of GROUPING:
  • The GROUPING function was introduced back in SQL Server 2000 and has been refined in later releases.
  • It indicates whether a specified column expression in a GROUP BY list is aggregated or not.
  • It returns 1 for aggregated or 0 for not aggregated in the result set.
  • Grouping can also be defined as an aggregate function that causes an additional column to be output with a value of 1 when the row is added by either the CUBE or ROLLUP operator, or 0 when the row is not the result of CUBE or ROLLUP.
  • It can be used only in the SELECT list, HAVING, and ORDER BY clauses when GROUP BY is specified.
  • The syntax can be seen as:
GROUPING ( <column_expression> )
Here, the “column_expression” tells about column or an expression that contains a column in a GROUP BY clause.
  • The return type for the GROUPING function is tinyint.
  • GROUPING is used to distinguish the null values that are returned by ROLLUP, CUBE or GROUPING SETS from standard null values.
  • The NULL returned as the result of a ROLLUP, CUBE or GROUPING SETS operation is a special use of NULL. This acts as a column placeholder in the result set and means all.
  • This will become clearer to you after you see this example:
I have created the table "rms.payment" in the database "TEST".
Now, if we want to see how much "Amount" is summed up for each "bank_name", and the status of "Grouping" (when it is 0 and when it is 1), look at this query carefully:

SELECT GROUPING(bank_name) AS 'Grouping', bank_name, SUM(amount) AS Amount
FROM [test].rms.payment
GROUP BY ROLLUP (bank_name)

The result can be seen as: 


As you can see, as soon as the Grouping status becomes 1, a column placeholder comes into play (the special NULL), which means all. If you don't want a NULL in your result set, it can be replaced by any string using a CASE expression, as sketched below. The "Amount" column also sums up all the groups and then displays the result.
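For instance, a hedged sketch against the same rms.payment table that swaps the placeholder NULL for a label:

SELECT CASE WHEN GROUPING(bank_name) = 1 THEN 'All banks'
            ELSE bank_name
       END AS bank_name,
       SUM(amount) AS Amount
FROM [test].rms.payment
GROUP BY ROLLUP (bank_name)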
Note: GROUPING can be used only when GROUP BY is specified; using it in a SELECT statement without a GROUP BY raises an error.




Monday 24 September 2012

Backing up an Analysis Services Database automatically


Backing up an Analysis Services database is one of the easiest things you can do while managing your OLAP databases. It is indeed very straight forward. Log on to SSMS -> right click your OLAP db, select Backup, fill in the details and press OK. Job done. But there are many more things that I want to talk about.
First, apart from the GUI, you can use an XMLA script to back up your database. You really don't need to be an XMLA expert to do that. When you are on the Backup dialog box, use the Script command to generate the XMLA script. See below:


So, what did I do? I simply fired up the backup window from SSMS. I left the default options (you can change them if you want). Then I clicked the down arrow on the Script button and clicked "Script Action to a New Query Window". This generates the XMLA command in a new window as follows:
<Backup xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW 2008</DatabaseID>
  </Object>
  <File>Adventure Works DW 2008.abf</File>
</Backup>


You can execute the above command in SSMS. If you do so, verify the backup file in the default Backup folder, C:\Program Files\Microsoft SQL Server\MSAS10.MSSQLSERVER\OLAP\Backup.
I ran the above script and got the following output in SSMS results window:
<return xmlns="urn:schemas-microsoft-com:xml-analysis">
  <root xmlns="urn:schemas-microsoft-com:xml-analysis:empty" />
</return>

And the Message window output was as follows:
Executing the query ...
<Backup xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW 2008</DatabaseID>
  </Object>
  <File>Adventure Works DW 2008.abf</File>
</Backup>
Backup started.
Execution complete


Verify the backup file in the default folder:


So the above was the first thing I wanted to talk about.
Second, you can schedule this script to run automatically using SSIS or using a SQL Agent job. Backups should be automated, obviously. Typically a best practice is that your backup script runs automatically after an incremental load and cube processing is done. As mentioned, you can do this using SSIS or a SQL Agent job.
Let us see how you can schedule this script using SQL Agent job.
1. Log on to SSMS and connect to the Database Engine.
2. Make sure SQL Server Agent is running. If not, start it.
3. Expand SQL Server Agent, then expand Jobs.
4. Right-click Jobs and select New Job.
5. Specify the name of the job.
6. Click on Steps on the left side.
7. Click on New at the bottom of the dialog box to add a new step.
8. Specify a step name.
9. Change the type to SQL Server Analysis Services Command.
10. Specify the server name on which the command is going to run.
11. Copy/paste the XMLA command that you generated in the previous steps.
12. The output should look like this:

13. Click on OK.
14. You can add a schedule if you want, or the job can be fired by your SSIS package.
15. For testing purposes, let us execute the job manually.
16. Click OK to save the job.
17. Right-click the job and click Start Job at Step…

18. Once done, the output should be as follows:

19. Verify the backup file in the default folder:
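If you prefer to create this job with T-SQL instead of the dialogs, here is a minimal sketch (assuming a local default SSAS instance named localhost; the job and step names are illustrative):

USE msdb;
GO
EXEC dbo.sp_add_job @job_name = N'Backup OLAP DB';

EXEC dbo.sp_add_jobstep
    @job_name  = N'Backup OLAP DB',
    @step_name = N'Backup Adventure Works DW 2008',
    @subsystem = N'ANALYSISCOMMAND',   -- runs an XMLA command against SSAS
    @server    = N'localhost',         -- the SSAS instance to run it on
    @command   = N'<Backup xmlns="http://schemas.microsoft.com/analysisservices/2003/engine">
  <Object>
    <DatabaseID>Adventure Works DW 2008</DatabaseID>
  </Object>
  <File>Adventure Works DW 2008.abf</File>
</Backup>';

-- Attach the job to the local SQL Agent so it can actually run
EXEC dbo.sp_add_jobserver @job_name = N'Backup OLAP DB';
GO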

So, in this post I demonstrated a few things about backing up your OLAP db using scripts and SQL Agent job.

Time saving feature in SSMS: Script Objects directly from SQL Editor = "Go To Definition" for SQL Objects in SQL Server Management Studio

SQL Server Management Studio currently lacks a feature present in other Visual Studio Editors - "go to definition".
The SSMSBoost plug-in solves this problem by adding the tool you need: it scripts the object located at the cursor position into a new window.

You can invoke this feature by
  • pressing Keyboard Shortcut [F2]
  • right-clicking the identifier and selecting "Script object" from the context menu

SSMSBoost searches for valid identifiers at the cursor location and lets you select which one you want to script. If you select part of an identifier, SSMSBoost performs no search and uses the selected text as the object identifier.
If you would like to change the default keyboard shortcut for the "Script Object" feature, you can do so using the Shortcuts Editor feature of SSMSBoost (required only in SSMS 2008, since SSMS 2012 has a built-in shortcuts editor).

By default, objects are scripted as CREATE. If you want to generate ALTER scripts, you can use the "Script as ALTER" command, also provided by the add-in. Use Extras->Settings->"Shortcuts & Macros" to redefine the F2 shortcut to this command.
  Script Object feature in SSMS with SSMSBoost add-in      
Reference URL: http://www.ssmsboost.com


Tuesday 18 September 2012

List all SQL Tables and its record count

This code sample shows how to list all tables and their row counts:

SELECT
    o.name AS TableName,
    i.rows AS TblRowCount
FROM
    sysobjects o INNER JOIN sysindexes i ON (o.id = i.id)
WHERE
    o.xtype = 'U' AND i.indid < 2

Output:
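On SQL Server 2005 and later, the same list can also be pulled from the catalog views, since sysobjects and sysindexes are kept only for backward compatibility (a hedged alternative):

SELECT t.name AS TableName, SUM(p.rows) AS TblRowCount
FROM sys.tables AS t
INNER JOIN sys.partitions AS p
    ON p.object_id = t.object_id AND p.index_id IN (0, 1)   -- heap or clustered index
GROUP BY t.name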

EXISTS Function From SQL

The following query explains how to use EXISTS in a SQL query.

Have a look at the sample given below
USE [database_name]
GO
SELECT column1, column2, ... FROM [table_name]
WHERE EXISTS
(SELECT * FROM [table_name] WHERE (condition))
GO
When a subquery is introduced with the keyword EXISTS, it functions as an existence test. The WHERE clause of the outer query tests for the existence of rows returned by the subquery. The subquery does not actually produce any data; it returns a value of TRUE or FALSE.
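As a concrete, hedged illustration (assuming the AdventureWorks sample database), this returns only the customers that have at least one order:

USE AdventureWorks
GO
SELECT c.CustomerID
FROM Sales.Customer AS c
WHERE EXISTS
    (SELECT * FROM Sales.SalesOrderHeader AS soh
     WHERE soh.CustomerID = c.CustomerID)
GO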

Read SQL error log

SQL Server ships with plenty of extended stored procedures.

To read the error log, we can use xp_readerrorlog, which is an extended stored procedure.

The following code shows how to use it.

This extended stored procedure returns the contents of an error log file.

EXEC master..xp_readerrorlog
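xp_readerrorlog also accepts parameters. Since the procedure is undocumented, treat the following as a hedged sketch (the parameter list may vary by version):

-- Archive number (0 = current log), log type (1 = SQL Server, 2 = SQL Agent)
EXEC master..xp_readerrorlog 0, 1

-- Additionally filter for entries containing the word 'Backup'
EXEC master..xp_readerrorlog 0, 1, N'Backup'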

Configuring the ActiveX Script Task

You can set properties programmatically or through SSIS Designer.

To Set the Properties by SSIS Designer:


  • Place "Activex Script Task" on Control Flow Designer From "Contol Flow Items" Tool Box
  • double Click on this task
  • Activex Script Task Editor will display which contains 3 tabs namely General, Script and Expressions
  • In General Tab, We can update the name and Description of the Task, As shown in the Following Figure



  • In the Script tab, we can update the following details about the script:
  1. Language used in the script; this tells the task which language engine to use to run our code
  2. Script, which will be executed by the ActiveX Script Task at run time
  3. Entry Method; here we need to provide the name of a method that is available in the script (from step 2). It is the entry point from where the script starts executing, much like the Main() function in C, C++ or C#


Next is the Expressions tab; here we can assign property values dynamically by evaluating expressions, calculations, etc.

Example Script for Exercise:

Language: VB Script Language

Script:

Function Msg()
    Msgbox("Hi")
End Function

Entry Method:
Msg

Assign these values in the Script tab of this task, then execute the package.

You will see a message box with the "Hi" message.

ActiveX Script Task


The ActiveX Script Task allows us to write our own custom code using ActiveX scripting languages. At present, this task supports the following 4 scripting languages:

  1. VB Script Language
  2. JScript Language
  3. SignedJavaScript Class
  4. SignedVBScript Class
ActiveX script has typically been used for the following purposes:
  • Including business logic in packages. For example, ActiveX script can use conditional logic to manage package workflow.
  • Writing functions that use conditional logic to set variable values. For example, an ActiveX script can access values in a database table by using ActiveX Data Objects (ADO) and populate variables with the table values.
  • Performing complex computations. For example, an ActiveX script can compute new dates by applying built-in functions such as DateAdd and DateDiff to data values.
  • Accessing data using other technologies that are not supported by built-in connection types. For example, an ActiveX script can use Active Directory Service Interfaces (ADSI) to access and extract the user names from Active Directory.
If you have existing ActiveX scripts that you would like to continue to use until they can be upgraded, you can configure the ActiveX Script task in the following ways:
  • Provide the existing script that the task runs.
  • Specify the scripting language that was used in the script. The ActiveX Script task supports the VBScript and JScript languages and other scripting languages installed on the local computer.
  • Specify the name of the function that is the entry point for the script.

Monday 17 September 2012

MERGE, MERGE JOIN and UNION ALL

Well, it looks like I am really getting fond of writing blogs. I think everyone should give writing blogs a try at least once :)

Today I am going to study some of the data flow transformations: MERGE, MERGE JOIN, and UNION ALL.

Yes, the first question any SQL Server developer may ask is: these operations can very well be done in a T-SQL query, so why do them in SSIS? Well, the answer is that T-SQL can be used only when you are working with data that resides in SQL Server tables; but what if the data comes from a different database such as Oracle or MySQL, or from a flat file, XML or Excel?



Let's take each transformation one by one; the post also covers their differences, which are common questions in SSIS interviews.

Rather than reinventing the wheel, I am taking the text from MSDN for the description of each transformation, which I found covers everything.

MERGE:

The Merge transformation combines two sorted datasets into a single dataset. The rows from each dataset are inserted into the output based on values in their key columns.

By including the Merge transformation in a data flow, you can perform the following tasks:





  • Merge data from two data sources, such as tables and files.
  • Create complex datasets by nesting Merge transformations.
  • Remerge rows after correcting errors in the data.

The Merge transformation is similar to the Union All transformations. Use the Union All transformation instead of the Merge transformation in the following situations:
  • The transformation inputs are not sorted.
  • The combined output does not need to be sorted.
  • The transformation has more than two inputs.

Well, here is the example. Here we are taking input from two different SQL Server databases and merging them into a flat file.

And here is what we need to set in Merge Transformation properties

This transformation is pretty simple and straightforward.


MERGE JOIN: The Merge Join transformation provides an output that is generated by joining two sorted datasets using a FULL, LEFT, or INNER join. For example, you can use a LEFT join to join a table that includes product information with a table that lists the country/region in which a product was manufactured. The result is a table that lists all products and their country/region of origin. For more information, see Using Joins.

You can configure the Merge Join transformation in the following ways:

  • Specify the join is a FULL, LEFT, or INNER join.
  • Specify the columns the join uses.
  • Specify whether the transformation handles null values as equal to other nulls.


Note
If null values are not treated as equal values, the transformation handles null values like the SQL Server Database Engine does.


This transformation has two inputs and one output. It does not support an error output.

Here is the same example with merge join..

And the properties window looks like this; here we need to set which type of join to perform on the datasets (full, left outer or inner), the same as in any RDBMS.

UNION ALL:

The Union All transformation combines multiple inputs into one output. For example, the outputs from five different Flat File sources can be inputs to the Union All transformation and combined into one output.

The transformation inputs are added to the transformation output one after the other; no reordering of rows occurs. If the package requires a sorted output, you should use the Merge transformation instead of the Union All transformation.

The first input that you connect to the Union All transformation is the input from which the transformation creates the transformation output. The columns in the inputs you subsequently connect to the transformation are mapped to the columns in the transformation output.

To merge inputs, you map columns in the inputs to columns in the output. A column from at least one input must be mapped to each output column. The mapping between two columns requires that the metadata of the columns match. For example, the mapped columns must have the same data type.

If the mapped columns contain string data and the output column is shorter in length than the input column, the output column is automatically increased in length to contain the input column. Input columns that are not mapped to output columns are set to null values in the output columns.

This transformation has multiple inputs and one output. It does not support an error output.
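For intuition, these transformations mirror familiar T-SQL set operations when all the data already lives in SQL Server tables (a sketch with hypothetical tables):

-- Union All / Merge ~ UNION ALL (Merge additionally preserves sort order)
SELECT EmpId, Name FROM dbo.EmployeesA
UNION ALL
SELECT EmpId, Name FROM dbo.EmployeesB

-- Merge Join ~ a FULL, LEFT or INNER join over sorted inputs
SELECT p.ProductID, p.Name, o.CountryRegion
FROM dbo.Products AS p
LEFT JOIN dbo.ProductOrigin AS o ON o.ProductID = p.ProductID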


Even though I have used OLE DB for all sources, you can take any type of source and output the data to any type of destination.

Well guys that’s it for today, hope you guys have learned something.

Expressions in SSIS - Part 2

In my previous post I covered event handlers; in this one I will be covering expressions, and where we can actually set expressions to extend SSIS package functionality.


Expressions

Expressions are used to set or evaluate property values at run time in an SSIS package. They provide a great extensibility feature: based on runtime values or workflow status, a property of a control flow task, container or connection manager can be set.

Expressions are made up of identifiers, literals, functions, and operators. These elements are evaluated by SSIS and return a single data value, which is set as the property value.

SSIS expressions are made up of the following functions and elements...





Expressions are evaluated before validation, and they are a good way to evaluate component properties in an SSIS package at run time. A lot can be done using expressions, and it is out of the scope of this blog post to cover everything about them.





Mathematical Functions
The Mathematical Functions category provides support for complex arithmetic including logarithms, exponents, and absolute values.

String Functions
String Functions provide support for the manipulation of character data.

Date/Time Functions
Date/Time Functions allow manipulation and processing of date and time data.

NULL Functions
NULL Functions allow manipulation of NULL values.

Type Casts
Type Casts are functions that allow data to be changed from one type to another.

Operators
Operators denote various mathematical or logical operations.
Where actually Expressions are used?
In SSIS Expressions can be used with below list of elements...
1. Derived Column Transformation

2. Conditional Split Transformation



3. For Loop Container

4. Precedence Constraints


5. Properties 
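As a quick taste, here are two hedged expression examples (the variable names are illustrative): the first builds a connection string from variables, the second builds a month-stamped file name:

"Data Source=" + @[User::ServerName] + ";Initial Catalog=" + @[User::DatabaseName] + ";Provider=SQLNCLI10.1;Integrated Security=SSPI;"

"C:\\Exports\\Sales_" + (DT_WSTR, 4)YEAR(GETDATE()) + RIGHT("0" + (DT_WSTR, 2)MONTH(GETDATE()), 2) + ".txt"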

I hope this piece of information is useful.

Thanks...

Event handlers, variables, expressions in SSIS - Part 1

SSIS is not just about control flows and data flows; there are other options available in SSIS which extend the functionality of an SSIS package and provide great flexibility in designing a successful ETL solution. Let's take a deep look at what those options are and how we can benefit from them.

There are four options which extend the functionality of SSIS packages.

1. Event Handlers
2. Variables
3. Expressions
4. SQL Queries

In this post I will be covering event handlers, and in the later parts of the series I will cover variables, expressions and SQL queries.

Event Handlers:

Like many programming languages (C# or Java), SSIS provides event handlers, which let you perform some task when a specific event occurs at runtime. During the execution of any SSIS task, a series of events takes place, and SSIS Designer provides a tab (Event Handlers) to program any action to be performed on the occurrence of an event.

Event handlers are created in the same manner as control flow but in different tab as shown.




How do we create event handlers?

It is as simple as creating the control flow of a package.

1. Select the executable container to which the handler will be assigned.
2. Select the event to which you wish the event handler to react to.
3. Drag control flow container and task and connect them together with precedence constraints.

How are events actually handled in SSIS?

Events can be handled at the task, container or package level; the following is the order in which they are handled.

1. Task
2. Container
3. Package

For example: if an event is triggered at the task level and no event handler is defined, the event is passed to the container level; if no event handler is defined at the container level, it is passed to the package level.

How many events are available?

There are 12 different events that can be handled in an SSIS package. Below is the list:

OnError  The OnError event is caused by the occurrence of an error.

OnExecStatusChanged  The OnExecStatusChanged event occurs when an executable changes its status.

OnInformation  When an executable reports information during validation or execution, the OnInformation event occurs.

OnPostExecute  OnPostExecute is an event that occurs immediately after an executable has completed running.

OnPostValidate  The OnPostValidate event occurs immediately after an executable has finished its validation.

OnPreExecute  The OnPreExecute event occurs immediately before an executable runs.

OnPreValidate  The OnPreValidate event occurs immediately before an executable begins its validation.

OnProgress  OnProgress is an event that occurs when an executable makes measurable progress.

OnQueryCancel  The OnQueryCancel event occurs automatically at regular intervals to determine if the package should continue running.

OnTaskFailed  When a task has failed execution, the OnTaskFailed event occurs.

OnVariableValueChanged  The OnVariableValueChanged event occurs when a variable's value is changed.

OnWarning  The OnWarning event is raised when a warning occurs.

Event handlers have a number of properties that allow you to

• assign a name and description to the event handler
• enable or disable the event handler
• determine whether the package fails if the event handler fails
• determine the number of errors that can occur before the event handler fails
• override the execution result that would normally be returned at runtime
• determine the transaction isolation level for operations performed by the event handler, and
• determine the logging mode used by the event handler

That's all. I hope this piece of information was useful.

Thanks..

Dataflow Transformation Categorization

Before anyone starts learning or designing data flows in SSIS, one should know what is offered in the SSIS toolbox. Most people don't know how many transformations are actually available in the data flow, and they choose some alternative workaround to perform operations which can be done easily using the available transformations. In this article I am going to provide just a brief overview of all the transformations available in the data flow and categorize them. After reading this article, you should be aware of what is offered in SSIS Designer, and it will be helpful for beginners to become aware of the available transformations.

Also, categorization helps us remember the transformations; there are a lot, so they are difficult to remember :)

Ok so lets start...

Transformations are divided into four categories

1. Row level  transformations
2. Rowset transformations
3. Multiple Input-output transformations
4. Advanced transformations

Row Level Transformations

Row-level transformations perform operations on data row by row. These transformations are the most commonly used and can be configured easily. The following transformations come under this category.


  • Audit
  • Cache Transform
  • Character Map
  • Copy Column
  • Data Conversion
  • Derived Column
  • Export Column
  • Import Column
  • Row Count
Rowset Transformation 

Rowset transformations perform operations on multiple rows. These kinds of transformations are memory intensive but can be functionally very useful. Below is the list of transformations which come under this category.

  • Aggregate
  • Percent Sampling
  • Pivot
  • Row Sampling
  • Sort
  • Unpivot
Multi input/output transformations
As the category name implies, these transformations accept multiple inputs and/or generate multiple outputs.
These transformations provide the functionality to combine or branch data and give the data flow the overall ability to process data from one or more sources to one or more destinations.

Below is the list of transformations which fall under this category:

  • Conditional Split
  • Lookup
  • Merge
  • Merge Join
  • Multicast
  • UnionAll
Advanced Transformations
The transformations that fall under this category provide the ability to perform advanced operations on the data in the pipeline of an SSIS package.

  • OLE DB Command
  • Slowly Changing Dimension
  • Data Mining Query
  • Fuzzy Grouping
  • Fuzzy Lookup
  • Script Component
  • Term Extraction
  • Term Lookup
That's all... these are all the transformations, and we have categorized them as well. I hope this piece of information was helpful.

Thanks..

Control flow task categorization

Like how we categorized the data flow transformations in my previous post, "Dataflow transformation categorization", here I have categorized the various control flow tasks we have in SSIS.

The tasks have been categorized based on the functionality they provide in the SSIS control flow.
Data Flow Task
The Data Flow Task category contains a single task that is used to extract data, apply column-level transformations, and load data in your SSIS package.
Data Preparation Tasks
Data Preparation Tasks allow you to copy files and directories, execute web methods, work with XML documents, and cleanse data in your SSIS packages.
Data Preparation Tasks include File System, FTP, Web Service, XML, and Data Profiling.
Workflow Tasks
Workflow Tasks are used in your SSIS packages to run other packages, programs or batch files, and send or receive messages between packages. These tasks can also send e-mail messages, read WMI data, and watch for WMI events.
Workflow Tasks include Execute Package, Execute Process, Message Queue, Send Mail, WMI Data Reader, and WMI Event Watcher.
SQL Server Tasks
SQL Server Tasks allow you to work with SQL Server objects and data in your SSIS packages.
SQL Server Tasks include Bulk Insert, Execute SQL, Transfer Database, Transfer Error Messages, Transfer Jobs, Transfer Logins, Transfer Master Stored Procedures, and Transfer SQL Server Objects.
Script Task
The Script Task category contains a single task that is used to execute Microsoft Visual Studio Tools for Applications, abbreviated to VSTA, scripts in your SSIS package.
Analysis Services Tasks
Analysis Services Tasks allow you to work with Analysis Services objects in your SSIS packages.
Analysis Services Tasks include Analysis Services Execute DDL, Analysis Services Processing, and Data Mining Query.
Maintenance Tasks
Maintenance Tasks allow you to perform administrative functions on the SQL Server in your SSIS packages.
Maintenance Tasks include Back Up Database, Check Database Integrity, Execute SQL Server Agent Job, Execute T-SQL Statement, History Cleanup, Maintenance Cleanup, Notify Operator, Rebuild Index, Reorganize Index, Shrink Database, and Update Statistics.
Backward Compatibility Tasks
Backward Compatibility Tasks are provided for backward compatibility with SQL Server Data Transformation Services or DTS (SQL SERVER 2000).
The two Backward Compatibility Tasks are ActiveX Script and Execute DTS 2000 Package.
In addition to the tasks included with SQL Server 2008, you can also create custom tasks using any programming language that supports the Component Object Model, or COM, programming model.
Control flow containers provide structure to your package. They are used to group tasks together or allow one or more tasks to be repeated.

There are four types of control flow containers:
Foreach Loop
A Foreach Loop container executes its tasks once for each member of an enumerated collection. For example, the tasks in a Foreach Loop container could be executed once for every row in a table, or once for each file in a folder on the hard drive.
For Loop
A For Loop container executes its tasks repeatedly until a specified expression evaluates to false.
Sequence
A Sequence container simply creates a logical grouping of tasks that can be managed as a unit.
Task Host
The Task Host container is created for you automatically when you add a task to the Control Flow design surface in BIDS. This container contains only the task and allows container properties, such as checkpoints and transactions, to be configured on individual tasks.
Hope this consolidation helps ...
Thanks..

SCD - Slowly Changing Dimensions transformation

The Slowly Changing Dimension transformation was one of the mystery transformations for me till now, as I never had a chance to actually implement it. So today I blocked some time to implement this transformation using a simple example.


The concept of slowly changing dimensions is explained very well and in detail on Wikipedia; I would ask readers to have a look at the link below if you are not aware of the SCD concept.

http://en.wikipedia.org/wiki/Slowly_changing_dimension

Before starting on our example, let me give you a glance at the SCD transformation provided in SSIS 2008. This transformation is used to implement Type 1 and Type 2 SCDs; for other types we need to add some custom logic to our ETL. (Type 1: identify the changes and update the record; no history is recorded. Type 2: when a change is identified, we expire the old record and create a new record with the new values; the history is preserved in the old record.)

OK, let's take a simple Employee dimension example. In this example I am getting an EmployeeFeed.xls file as input for my DimEmployee table (which is my dimension table), and I am using the SCD transformation to identify any changes and implement DimEmployee as Type 2.

DimEmployee :

CREATE TABLE DimEmployee
(
    EmpKey      int identity(1,1),
    EmpId       int,
    Name        varchar(100),
    Designation varchar(100),
    City        varchar(100),
    Phone       varchar(10),
    StartDate   datetime,
    EndDate     datetime
)


So before we start implementing any SCD, we first need to identify the attributes in the dimension table for which we need to track changes. In this example I want to track changes to the Designation, City and Phone attributes. I am expecting no change in the Name attribute (column).

You might have noticed that there are two columns, EmpId and EmpKey. Why are both needed in the dimension table?

Ans:
EmpId: This is the business key, which uniquely identifies an employee across the entire data warehouse system.

EmpKey: This is a surrogate key, which uniquely identifies a record in the dimension table; it is also the key for identifying historical records.

We also have two more columns, StartDate and EndDate; these are used to track the time of changes. If EndDate is null, it means the record is the most recent one in the dimension table.

Steps to Implement SCD in a data flow.

1. After we add the source (which in our case is the Excel file EmployeeFeed.xls), we add a Data Conversion transformation to correct any data type conflicts.
2. Then we add the SCD transformation to the data flow; this opens the SCD wizard. Click Next on the welcome screen.

3. On the Select a Dimension Table and Keys page, select your dimension table (in this case DimEmployee) and map all the columns from the source Excel file to the destination DimEmployee table. One important thing we do here is identify the business key, which in our case is EmpId. Then click Next.



4. On the Slowly Changing Dimension Columns page, we need to select the appropriate change type for each dimension column. Here we have three types:
Fixed Attribute --> No change expected.
Changing Attribute --> Changes are expected, but there is no need to record history; the same record is updated.
Historical Attribute --> If this attribute changes, the old record is expired (by setting its End Date to the current date) and a new record is inserted with the new attribute value.

In our example we don't expect any change to the Name attribute, hence we select it as a Fixed Attribute, and the rest (Phone, Designation and City) are selected as Historical Attributes. Once we are done, click Next.

5. On the Historical Attribute Options screen, we have two options: use a flag column to show which record is expired and which is most recent, or use Start Date and End Date columns. In this example we use the second option and select the Start Date and End Date columns appropriately.


6. For all other screens in this wizard just click Next, and on the last screen click Finish.
That's it, we have implemented the SCD transformation. Your data flow should look as shown below.

 
If you have noticed, we have two outputs from the SCD transformation: New Output and Historical Attribute Output. Any new records which are not present in the dimension table are redirected to the New Output, and all existing records with changed attributes are redirected to the Historical Attribute Output.
Running the data flow...
I have 9 records in my sample EmployeeFeed.xls file.







So when I run my data flow for the first time, all 9 records are redirected to the New Output and inserted into the DimEmployee table.
Next, I made some changes in EmployeeFeed.xls, marked in yellow: 4 records were changed and 2 new records were added.




Looking at the data flow, the two new records are redirected through the New Output pipeline and the 4 changed ones move through the Historical Attribute Output. What happens to those 4 records is that we update their EndDate to the current date, then insert them again with the new changed attribute values, keeping EndDate as null, as shown below.
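In T-SQL terms, the Historical Attribute Output effectively does something like the following for each changed employee (a sketch with hypothetical values):

-- Expire the current version of the record
UPDATE DimEmployee
SET EndDate = GETDATE()
WHERE EmpId = 101 AND EndDate IS NULL

-- Insert the new version with the changed attributes
INSERT INTO DimEmployee (EmpId, Name, Designation, City, Phone, StartDate, EndDate)
VALUES (101, 'John', 'Senior Developer', 'Pune', '9999999999', GETDATE(), NULL)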



So that's it... hope you have learned something out of this ...



Thanks
venkatesh