This guide walks you through the essential moving parts of SSIS. Check out the next guide in this series that focuses on how to build your first SSIS package.
Before we explore the machinery of SSIS, it’s important to understand some of the terms you’ll often hear used by SSIS professionals. Here are some of the key terms and phrases you’ll need to know to be successful with SSIS.
SSIS is licensed as part of SQL Server, and the internals of running an SSIS package can be installed as part of the normal SQL Server install. As an SSIS developer, however, you will need a couple of additional tools installed on your workstation:
When setting up your SSIS development environment, it’s important to remember that you must install the SSDT version of Visual Studio to be able to work with SSIS projects. If you already have Visual Studio installed, you’ll need to download and install the SSDT components.
If you’ve ever worked with Visual Studio, the SSIS development environment will look familiar. SSDT uses the same multiple-document interface (MDI) that enables you to open or close each SSIS package independently, open several packages at once, and conveniently dock them as needed.
Inside this development environment, you’ll be frequently using the following assets:
Figure 1 shows the SSDT designer with each of these assets called out.
Figure 1: The SSIS Development Environment
Within each SSIS package, there are several fundamental moving parts.
A connection manager defines the data structures that can be read from or written to during the execution of the package. A connection manager can refer to a relational database, a flat file, a database file (such as Microsoft Access), a web service, or a cloud structure (such as blob storage).
Once defined inside an SSIS package, a connection manager can be used by any of the tasks or components within that package. Think of the connection manager as a data gateway; it can be used to retrieve data (source), to write data (destination), and to manipulate or validate the data as it passes through the package logic (transformation). In fact, the same connection manager could be used as source, destination, and transformation within the same package.
Connection managers are unique in that they can be defined at the package level or at the project level. Package and project connections have the exact same functionality; the only difference is the scope of visibility. Package connections are usable within the package in which they are created, while project connections are accessible to any package in that project. The latter is very handy when creating a project that uses the same connection in multiple packages.
As shown in Figure 3, both the package and project connections show up in the Connection Managers tray, and any project connections are also shown in the Solution Explorer under the Connection Managers folder. In the Connection Managers tray, any project-scoped connection will display a (project) prefix in the name.
Figure 3: Package-Level and Project-Level Connection Managers
The Control Flow and Data Flow tabs represent the workspace in which most of the package logic resides. We’ll go into greater detail on each of these tabs in the next section.
Like any good programming environment (and yes, SSIS is a programming environment), SSIS allows user-defined variables that can be statically set or dynamically manipulated during package runtime.
Variables are defined at the package level and, by default, are accessible by any task or component within the package. Each variable will have the following attributes:
The Variables window, shown in Figure 4, is used to create or edit variable values.
Figure 4: The SSIS Variables Window
Variables in SSIS provide a great deal of flexibility when designing SSIS packages. Variables can be used to store operational values (such as row counts), configuration values (such as a particular file name when looping through multiple files), and can even be used to process serialized object data.
Parameters were introduced in SSIS with the release of SQL Server 2012 to easily pass in runtime values to a package. Using SSIS parameters makes a project much more flexible by eliminating the reliance on hard-coded values.
Parameters behave a lot like SSIS variables. They both require a data type definition, can both be set with a default value, and can reduce administrative work by eliminating hard-coded static values within a package. There are some important differences, including:
Parameters are incredibly powerful in that they allow you to completely decouple values that might change over time, including database connection strings, file paths, and login credentials, from the source code. By using parameters to pass in these values at runtime, you can eliminate a lot of unnecessary code changes. Parameterizing values such as these also makes testing easier, by allowing a separate set of test values to be passed in at runtime with no code modification.
Figure 5: The Parameters Tab
To create or edit package parameters, open the package and click the Parameters tab. Project parameters can be found by opening the Project.params file in the SSIS project.
SSIS has its own expression language for creating dynamic behaviors. Almost every task and component in SSIS can use expressions for replacing otherwise static values with a short bit of code that is interpreted at runtime.
Some of the practical uses for expressions in SSIS include:
Expressions can be used throughout SSIS. Many properties of tasks, components, and containers can be modified to substitute expression values for static text. Expressions can be used in the value expression for a variable, and an expression can reference other variables, parameters, or environment information (such as the date and time the current package began executing).
To view the expressions available for any given task or component in SSIS, you can select the item (indicated by the arrow) and find the Expressions item in the Properties window (shown inside the rectangle), as shown in Figure 6.
Figure 6: Expressions
After selecting the item and the property in which the expression will apply, you can use the Expression Builder window, shown in Figure 7, to design the expression. The list of variables and parameters (shown in the upper-left pane) and functions (shown in the upper-right pane) can provide a lot of help when building your expression. For the daring, you can code your expression entirely by hand in the Expression box.
Expressions are incredibly powerful in SSIS, but they do take a bit of getting used to. Don’t feel alone if you initially struggle with the syntax of the SSIS expression language; every SSIS developer who has ever worked with it has had moments of frustration when adapting to this new lingo. Use the Expression Builder to get acquainted with the nuance of creating SSIS expressions.
Figure 7: Expression Builder
The control flow in SSIS is the starting point at which package development begins. The control flow is the design surface onto which tasks are added. The SSIS control flow is responsible for directing which tasks will execute and in what order.
The control flow is the default view when creating or opening a package. The Control Flow tab, at the top left of an opened package, reveals the control flow surface. On a new package, this will be blank; Figure 8 shows a package with two tasks added to the control flow.
When working with the SSIS control flow, there are three different types of objects you’ll use: tasks, containers, and precedence constraints.
Figure 8: The SSIS Control Flow
Figure 12 shows an example of using multiple precedence constraint for both success and failure operations.
The setup shown in Figure 12 results in the following workflow:
Using containers and precedence constraints, the workflow of the package can be as simple or as complex as needed.
Figure 12: Precedence Constraint Example
Event handlers are a special type of SSIS control flow. An event handler is designed to execute when a specific event occurs. Figure 13 shows an event handler set to execute when the executable (in this case, the package itself) encounters an error (defined as OnError, which is the default setting).
Note that, in this example, the event handler logic has not yet been set up; to create an event handler for this executable and this event, you can click the hyperlinked message in the middle of the design surface. The SSIS Toolbox for this tab shows all the tasks commonly available in the control flow.
Event handlers are useful tools but be careful that you don’t overuse them. Because the event handler logic can be set up individually for every task and container (as well as for the package itself), the overuse of event handlers creates a labyrinth of logic that is very difficult to maintain and debug.
Figure 13: Event Handler
As mentioned earlier, the Data Flow task is one of many tasks in the SSIS Toolbox. However, this task differs significantly from the other tasks in that it has its own child elements, referred to as components, that can be connected to create end-to-end data flow operations.
The data flow can be accessed through the Data Flow tab in the package editor. If a package contains multiple data flows, you can use the Data Flow Task drop-down menu, shown in Figure 14, to select the data flow to edit.
Figure 14: Data Flow Layout
Tim Mitchell is a Business Intelligence architect, author, and trainer. He has been building data solutions for over 15 years, specializing in data warehousing, ETL/SSIS, and reporting. He holds a Bachelor’s Degree in Computer Science from Texas A&M at Commerce, and is recognized as a Microsoft Data Platform MVP. Tim is the founder and principal data architect at Tyleris Data Solutions, a consulting firm focused on delivering on-premises and cloud-based solutions on the Microsoft Data Platform.
As an active member of the community, Tim has spoken at international, regional, and local venues, including the PASS Summit, SQLBits, SQL Connections, SQLSaturday events, and various user groups and webcasts. Tim is coauthor of the book “SSIS Design Patterns,” and is a contributing author on the charity book project “MVP Deep Dives 2.” He is a member of the Boulder BI Brain Trust, a Melissa Data MVP, and is active in the North Texas SQL Server Users Group in the Dallas area.