17 May 2014

Creating a basic SSIS package

 Start BIDS and in the basic tools click New Project or Ctrl+Shift+Nas shown in the image below. You could also go to Menu-> File-> New-> Project

      Microsoft Visual Studio – New Project 

On doing the above the following window (see image below) will pop up, you would need to do the following:

Select Project types to Business Intelligence Projects.
  • In Templates select Integration Services Project.
  •  Set the Name of the Project
  •  Browse the location where you would like to save the project created.
  •  Click OK.
This is the basic settings you need to do to start the package creation.

      Microsoft Visual Studio – New Project 

Once you complete the above process, you see the screen as in the next figure. This is BIDS. If you do not see the Tools panel on the left, click the Tools buttons I have highlighted on the top. This will help you see the various other panels that can be visible to you. On the right, you can see the Solution Explorer. It has the solution with the same name you provided in the previous step. 

Under SSIS Packages you would see a default package by the name Package.dtsx being created. It is a blank package as you see in the middle design pane. It is blank. If you wish, you could rename this package.

      


Once you are familiar with the various aspects of BIDS that you see in the image above, we could start developing the package.

Just below the Package.dtsx[Design]* you would notice 4 tabs and by default you are on the Control Flow tab. From the left hand Toolbox drag drop a Data Flow Task onto the Control Flow designer pane that you see. Once you drop the Data Flow Task onto the designer pane you would see the screen similar to the screen below.

      BIDS – Data Flow Task in Control Flow
Double click the Data Flow Task you dropped onto the Control Flow. You will notice that the tab has changed from Control Flow to Data Flow (see the image below).

If you have a look at the Toolbox now, the tools have changed. These tools are specific to Data Flow Task.

As of now all you need to do is drop 2 transformations (Flat File Source and OLEDB Destination) from the Toolbox onto the Data Flow pane. You will notice that from the Flat File Source, there is a green arrow projecting downwards, click this and drag it to the OLEDB Destination (the other transformation) 

NOTE: The tools on the Control Flow are called Tasks, while in Data Flow tasks the tools are called Transforms/Transformation

Flat File Source: This transform as the name suggests is used to define the source from where the data needs to be pulled. We need to define the metadata of the data we wish to pull. We will see how to do this in the next few steps. 


NOTE: In both the transforms that you pull onto the data flow task pane, you will notice a red cross mark on the right side of the transforms. This is because the transforms are not yet setup correctly and if we execute the package now, we will get an error. 

       Data Flow Task with Flat File Source and OLEDB Destination

We should now start with setting up the Flat File Source. Double click the Flat File Source; you will notice the Flat File Source Editor window (see image below) pop-up. 

In the source, the first thing we need is to create a connection to read the file containing the information to be consumed. Since this is a new package and we do not have existing connections, we need to click the New button to create a new connection
 
       Flat File Source Editor


Flat File Connection Manager Editor (image below):
This is the editor used to create a flat file connection. You may provide a name to the connection for easy tracking and a description else, the names will be defaulted. 

Click the Browse button to select the flat file we need to use.

NOTE: The file needs to exist while development, even though with just one record in it.
If the file is in a simple csv format, the other settings will be set automatically, namely the code page, text qualifier, Header row delimiter, Header rows to skip (set to zero by default)

You need to check the “Column names in the first data row” as the file we are using has header record. If we do not check this box the header row would be treated as data.
 
        Flat File Connection Manager Editor - General


You do not need to make any changes in the Columns and Advanced tab as we are treating all the records as string for simplicity and we leave the default settings to be used. Since we have mentioned that the first row has header information the column names would be set as per the data in the first column.

Go to the Preview tab and see if the data is the format you expect and you do not get any error message. In my case, I can see 3 columns with 3 rows in the file. The headers are set correctly.

Click OK to come out of Flat File Connection manager.

Preview the Flat file source and click OK.

NOTE: Now if you see the Flat file source in the Data Flow Task, you will notice that the red error mark is gone from the Flat file source and yet visible on the OLEDB Destination

Flat File Connection Manager Editor - Preview

Next we need to set up the OLEDB Destination. Double click the OLEDB Destination. Similar to the Flat file source, first we need to set up a connection to the SQL Server. Since this is the first package and we do not have any existing OLEDB Connection we need to set up one. For doing this click on the New button highlighted below in the Figure.

     

OLEDB Destination Editor-- Connection Manager


The next image shows the existing connection on the left pane but we have not yet created one, hence we would need to do that. Click the New button on this pop up.

     Configure OLEDB Connection Manager 

Now provide all the information required for the connection to be setup namely: Server Name, Logon type, and the database.

Note: If you have the SQL server on the local machine with no instance you can use period ‘.’ instead of localhost or the machine name.

Click the Test Connection button to verify the validity of the credentials provided and click OK. If there is no error we are good, else check the credentials again

     Set up OLEDB Connection 

Select the Connection we just created. As I had mentioned earlier the table is not created when the development starts. 

SSIS provides a nice and simple way to create the table during development based on the Meta data we provide for the source and the transformations applied (in our case not transforms).

To do the above, click the New button highlighted below in the image below.

   OLEDB Destination Editor – Create new table 

On clicking the New button, you will see a small text box pop up with the query to create the table in the SQL server. You can make changes to the name and the column types etc as per your wish.This is just a guide and not binding on you to follow the table definition provided. 

I changed the name of the definition for clarity.

     OLDEB Destination Editor – Create Table script


Click Ok after you are satisfied with the table definition. Click Ok on the OLED Destination pop-up as well and come to the Data Flow pane. You will see something similar to the next screenshot. Note that the red mark has gone from both the source and destination.

This means that the Source and destination are setup correctly. If there is no environmental error, the package will execute successfully.

To execute the package (i.e. perform the intended task) click function key F5 or click the green Play button marked in red as seen in the screen shot below.

   

BIDS -- Execute Package


You will notice that the Source and Destination turn yellow then green. It will happen too fast for you to notice the yellow. On turning green, you will see that there is a number 3 alongside the arrow connecting the source and destination. This depicts how many records have been sent from the source to destination.

The Source and Destination turning Green signifies that the 2 have performed their task successfully.

     Data Flow Task – Package execution completed successfully


If you remember, we were doing all this in the Data Flow Task. To see if the data flow is successful overall, we move back to the Control Flow tab and see that the Data Flow task we had turned green. Since our package was a simple package and no additional tasks were there in the Control flow, we do not need to worry about other tasks failing or passing. If the Data Flow is green our package is successful.

      Control Flow – Package execution completed successfully


Go to the Progress tab the last tab to the right of the Control Flow Tab. Check the screenshot below for reference. Here you can see how much time was consumed in the execution what the steps were taken etc
  

    
 



 


No comments:

Post a Comment