Published: May 22, 2020
A SQL Server Integration Services (SSIS) package includes the necessary components, such as connection managers, tasks, control flow, data flow, parameters, event handlers, and variables, to execute a specific ETL task. For an introduction to key SSIS functionality, check out the SSIS Basics guide.
To demonstrate how to get started with SQL Server Integration Services (SSIS), we’re going to walk through the steps to create and test a simple SSIS package. In this example, we will create a package that truncates a staging table and loads that same table with the data found in a flat file.
Looking to learn the basics before you create your first SSIS package? Check out our SSIS Basics guide.
First, we need to create the SSIS project in which the package will reside. In SSDT, select File, New, Project to open the New Project window shown in Figure 1. Choose the Integration Services Project template, and set the name of the project to something meaningful (in this case, Load Staging Tables).
After clicking OK, you’ll see the newly created project in the Solution Explorer window and will probably notice in the SSIS Packages folder that a default package (with the name Package.dtsx) was created automatically. Right-click and rename this package to something meaningful, such as Load Orders Staging.dtsx. (Don’t forget to leave the .dtsx extension in place when renaming.)
Figure 1: Create a New SSIS Project
Since we are loading a volatile staging table in this package, we first need to truncate that table. For this operation, we’ll select Execute SQL Task from the SSIS Toolbox and drag it to the control flow surface. Right-click and select Rename to give this task a descriptive name: SQL – Truncate Orders Staging.
Figure 2: Execute SQL Task Added to Control Flow
You’ll notice an error indicator on this new task, because its connection and SQL statement have not yet been defined. Double-click the task to open the Execute SQL Task Editor, shown in Figure 3.
Figure 3: The Execute SQL Task Editor
To create a new connection manager for this task, use the drop-down list next to the Connection item and choose New Connection. This will open another window to allow you to configure a new OLE DB connection manager. Click New, shown in Figure 4, to create this new connection.
Doing so opens another window, shown in Figure 5, with the new connection manager properties. Set the server name and database name for this connection, and, if necessary, change the authentication type (Windows Authentication is always preferred, if available).
Figure 4: Create New Connection Manager
Figure 5: Configure Connection Manager
Click OK on each of the open Connection Manager windows, and you’ll be back to the Execute SQL Task Editor. The only thing left to configure for this task is to add the SQL statement to execute. In the SQLStatement box, enter the truncate table statement TRUNCATE TABLE [stg].[Orders], as shown in Figure 6.
Figure 6: Fully Configured Execute SQL Task Editor
Click OK, and you’re all done with this task.
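For reference, here is a minimal sketch of what the staging table behind this step might look like. The column names and data types are illustrative only, since the table definition isn’t shown in this guide; only the table name [stg].[Orders] comes from the example.
-- Hypothetical definition of the staging table targeted by the truncate step.
-- Column names and data types are illustrative, not taken from this guide.
CREATE TABLE [stg].[Orders]
(
    OrderID     INT           NOT NULL,
    CustomerID  NVARCHAR(20)  NOT NULL,
    OrderDate   DATE          NOT NULL,
    OrderAmount DECIMAL(18,2) NOT NULL
);

-- The statement executed by the SQL - Truncate Orders Staging task:
TRUNCATE TABLE [stg].[Orders];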
Next, find the Data Flow task in the SSIS Toolbox and drag it onto the control flow surface. Right-click and rename the task to DFT – Load Orders Staging.
To ensure that this data flow is run after the completion of the truncation step, we need to use a precedence constraint to establish the order of operations. To do so, click the previously defined SQL task and note the green arrow coming out of it. With your mouse, grab the end of that arrow and drag it over to the Data Flow task, as shown in Figure 7.
Now, double-click the new Data Flow task to jump to the Data Flow tab. Using the SSIS Toolbox, navigate to the Flat File Source (found under the Other Sources folder) and drag it to the empty canvas of the data flow. Rename this source to FFSRC – Orders File, as shown in Figure 8.
Figure 7: The Data Flow and Precedence Constraint
Figure 8: Flat File Source
Again, we see an error indicator, highlighting that we have an undefined property on this source. Double-click this source to create the connection manager using the Flat File Source Editor shown in Figure 9.
Figure 9: Create New Connection Manager
Clicking New in the Flat File Source Editor opens the Flat File Connection Manager Editor, shown in Figure 10. The two essential pieces of information needed here are the name of the connection manager and the path to the file from which we are getting the data. Optionally, you can also set the file structure type (i.e., delimited, fixed width, ragged right), specify a text qualifier (our sample file uses the backtick as a qualifier), and indicate whether the first row in the file contains column names.
Figure 10: Flat File Connection Manager Editor
With these options set, click OK to close the Flat File Connection Manager Editor, then click OK in the Flat File Source Editor. The configuration of the flat file source is done!
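To make these options concrete, here is a hypothetical example of what the first few rows of such a file might look like, with a header row, comma delimiters, and the backtick used as the text qualifier. The actual Orders file will differ; the trailing spaces in CustomerID are included here only to motivate the cleanup step that follows.
OrderID,CustomerID,OrderDate,OrderAmount
`1001`,`CUST001  `,`2020-04-01`,`150.00`
`1002`,`CUST002 `,`2020-04-02`,`89.95`
`1003`,`CUST003`,`2020-04-03`,`210.50`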
Next, we’ll add a cleanup step to remove extra spaces from one of the columns in the source file. While still on the Data Flow tab, find the Derived Column transformation in the Common folder, and drag it to the data flow. Change the name of this new component to DER – Trim Values.
Next, click the flat file source we configured in the previous step, and you’ll notice two arrows—one blue and one red—emerging from this source. The blue arrow represents valid rows of data coming from that source, and you’ll want to connect the end of that blue arrow to the derived column we just added, as shown in Figure 11.
Figure 11: Flat File Source Connected to New Derived Column Transform
To configure this cleanup operation, double-click the derived column transformation. The Derived Column Transformation Editor that appears looks a bit like the SSIS Expression Editor, but this one has a grid where you can add one or more transformations.
In this list of transformations, you can opt to add a new column or modify the value of an existing column. For our purposes, we’re going to trim the whitespace from the existing CustomerID column. To do this, use the dropdown menu under Derived Column and select Replace ‘CustomerID’. In the Expression box, enter TRIM(CustomerID) to remove the leading and trailing spaces from that value, as shown in Figure 12.
Figure 12: Derived Column Transformation Editor
Once the derived column settings are configured as shown in Figure 12, click OK. That completes the source and transformation configuration, with the destination as the only item remaining to be configured for this data flow.
For this last step, use the SSIS Toolbox to find the OLE DB Destination under the Other Destinations folder, and drag it to the data flow canvas. Rename it to OLEDST – Orders Staging, and then connect the blue output arrow from the derived column transform to this new destination, as shown in Figure 13.
Figure 13: Added the OLE DB Destination
As you can probably guess based on the prior steps, the error indicator on this destination means we have some properties that still need to be defined. Open the OLE DB Destination Editor by double-clicking the destination.
In the OLE DB Destination Editor window, shown in Figure 14, you’ll see a list of available OLE DB connections, as well as settings for the table to write to in this data flow. Because there is only one OLE DB connection in this package (the same one we configured when we set up the Execute SQL Task to truncate the staging table), that connection will automatically be selected. The only other required item on this tab is to select the output table from the dropdown list.
Figure 14: OLE DB Destination Editor
Choose the [stg].[Orders] table from this list. After doing so, click the Mappings tab on the left to confirm that each of the source columns is mapped to its corresponding destination column, as shown in Figure 15. Since the column names and data types in the source and destination are the same in our example, this mapping is done automatically. If your source columns are to be written to destination columns bearing different names, you would need to do this mapping manually.
Figure 15: Source to Destination Column Mappings
Click OK, and the Data Flow task (shown in Figure 16) is now complete. This package is ready for testing!
Figure 16: Data Flow
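Conceptually, the finished data flow performs the equivalent of the following T-SQL. This is only an illustration of the logic: the rows actually come from the flat file rather than from a table, and the column names are the same hypothetical ones used earlier.
-- Illustrative only: the data flow's logic expressed in T-SQL terms.
-- OrdersFile stands in for the flat file source; the package reads the
-- file directly rather than querying a table.
INSERT INTO [stg].[Orders] (OrderID, CustomerID, OrderDate, OrderAmount)
SELECT
    OrderID,
    LTRIM(RTRIM(CustomerID)),  -- the DER - Trim Values step
    OrderDate,
    OrderAmount
FROM OrdersFile;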
You can easily test this package directly in SSDT. In the Solution Explorer window, right-click the name of the package and choose Execute Package. The selected package will begin its execution immediately, starting with the SQL task for truncation and then moving to the Data Flow task. After each of these completes successfully, you’ll see a green check mark indicating a successful execution, as shown in Figure 17.
Figure 17: Successful Control Flow Execution
Clicking the Data Flow tab shows each of the components of the data flow and will even display the row counts for each segment of the load, as shown in Figure 18.
Figure 18: Successful Data Flow Execution
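If you’d like to confirm the results from the database side as well, a quick query against the staging table shows the loaded row count. This assumes you have query access to the target database; the table name is the one used throughout this example.
-- Confirm that the staging table was truncated and reloaded.
SELECT COUNT(*) AS LoadedRows
FROM [stg].[Orders];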
If you’re interested in digging into the detailed logging of this test, or if the test fails and you need to find out why, you can click the Progress tab of the package to see the logging information, as shown in Figure 19.
Figure 19: Logging Information Shown in the Progress Tab
Hopefully you’re ready to take what you have learned from this guide and begin working with simple SSIS packages. To continue expanding your SSIS knowledge, download the in-depth SSIS Basics guide, which covers not only how to create your first SSIS package but also key SSIS functionality.
Tim Mitchell is a Business Intelligence architect, author, and trainer. He has been building data solutions for over 15 years, specializing in data warehousing, ETL/SSIS, and reporting. He holds a Bachelor’s Degree in Computer Science from Texas A&M at Commerce, and is recognized as a Microsoft Data Platform MVP. Tim is the founder and principal data architect at Tyleris Data Solutions, a consulting firm focused on delivering on-premises and cloud-based solutions on the Microsoft Data Platform.
As an active member of the community, Tim has spoken at international, regional, and local venues, including the PASS Summit, SQLBits, SQL Connections, SQLSaturday events, and various user groups and webcasts. Tim is coauthor of the book “SSIS Design Patterns,” and is a contributing author on the charity book project “MVP Deep Dives 2.” He is a member of the Boulder BI Brain Trust, a Melissa Data MVP, and is active in the North Texas SQL Server Users Group in the Dallas area.
You can visit his website and blog at TimMitchell.net or follow him on Twitter.