3 Common SSIS Struggles and How to Solve Them with Task Factory
SQL Server Integration Services (SSIS) is Microsoft's long-standing ETL product that, with a little skill and effort, can be highly configurable and provide for a broad set of options when it comes to data warehousing and data integration. With some coding knowledge, time, and effort, you can integrate data and systems well beyond the Microsoft data domain.
That said, while you have a great set of features out of the box to get started, many BI developers find themselves researching the proper means to connect and extract data from other sources, then spend hours of manual coding to implement those solutions. Other times, the components available out of the box simply may not perform nearly as well as you suspect they could, again, if you took the time to research and manually code a solution.
In today's ever-increasing world of regulatory data compliance, security is a focus with every step your data takes along its journey. Again, all of this is possible within SSIS development, so long as you have the time to research and manually code your solution. Are you detecting a pattern?
Build vs. Buy
When talking to SentryOne BI Developer customers, the most common stressor they express is time spent not producing and implementing ETL solutions because of the hours spent in research and manual coding. Rarely are these SSIS packages of the "set it and forget it" variety. The data estate is always changing, and with it, the ETL processes. So, anything that can save time in both developing, and ultimately the execution of these processes provides real value. It's the old build vs. buy argument.
SentryOne teammate, Richard Douglas, has a great article on “Build vs. Buy” when it comes to Database Performance Monitoring (DPM) that I highly recommend. I consider this a variation on that theme when it comes to ETL development within SSIS.
The three struggles I will cover here are:
- Connectivity to Outside Sources
- Performance of Loading and Managing Data
- Data Security and Compliance
#1: Connectivity to Outside Sources
Task Factory offers over 70 high-performance SSIS components that are built to accelerate ETL processes. When it comes to external data sources, there are two popular Task Factory components where SentryOne customers have consistently found value. The first is our Salesforce Source and Destination connectors.
SentryOne Task Factory
If you need to import data from Salesforce for in-house processing, the Salesforce.com Source allows you to retrieve any data supported by the Salesforce.com Soap API like leads, accounts, contacts, and opportunities right into your SSIS data flow. Within five minutes, you'll be pulling Salesforce data into your local environment.
If you need to insert, update, delete, or upsert data into any Salesforce object from within SSIS, then the Salesforce.com Destination makes it super simple with minimal configuration. It supports both regular and bulk mode transfer.
The REST Source in Task Factory allows you to connect to almost any web service or cloud application that utilizes a REST API and bring that data right into your data flow.
The component can handle both XML and JSON data being returned from the REST endpoints, which means it supports both legacy (XML based) and newly created (with JSON) REST endpoints.
With its support for Basic REST, OAuth1, and OAuth2 authentication you can connect to and extract data from platforms such as ZenDesk, GoToWebinar, Survey Monkey, and MailChimp among many others using REST Source.
There are many other useful connectors in Task Factory like Microsoft Dynamics, Azure ML Storage, and more, but those are the two most popular. These two options alone have saved BI Developers and ETL engineers countless hours of manual effort.
#2: Performance of Loading and Managing Data
One of my favorite examples of performance gains with Task Factory is the Upsert Destination component. First, it saves time in merely setting up the task to perform the Upsert to either SQL Server or Oracle with a single component. But where it really pays for itself is in the performance of the Upsert process in production. Ultimately SentryOne customers see up to 700x faster performance with that component compared to using the traditional means in SSIS. That is not a typo. It can be up to seven-hundred times faster.
SentryOne customers see up to 700x faster performance with the Upsert Destination component compared to using the traditional means in SSIS.
Users get the same kind of performance boost out of Dimension Merge SCD Transform. The secret comes from the fact that our component does not need the OLE DB Command transform, which is a row by row operation. Instead, our components load the data in memory and then bulk load them to the source, resulting in the massive performance gains to your ETL process.
#3: Data Security and Compliance
One of the most popular security-related Task Factory components is the Secure FTP Task. Many companies transmit files using FTP as a transmission method for highly sensitive files. The Task Factory Secure FTP Task allows you to transmit files over most common secure channels to your partners safely in an easy to use user interface.
Are you working with email in your ETL process? The SSIS Email Source Adapter reads email from a mailbox into a data flow with the ability to filter messages based on sender, message to, subject, date received, body, and priority.
Finally, the Data Validation Transform acts as a gatekeeper for your data. Many companies receive questionable data from various sources, causing a myriad of production problems and inconsistent reporting. The Data Validation Transform is an SSIS component that verifies that your data is clean before insertion. Suspicious rows are flagged for later removal or additional business rules.
The vast majority of these components are fully supported in Azure Data Factory's SSIS Integration Runtime. Wherever you're orchestrating your ETL processes— be that on-prem, in the cloud, or a hybrid deployment— Task Factory likely has a component that will help you save hours of manual coding effort. The time savings aggregated over multiple ETL projects will increase your productivity at work and your overall quality of life.
I've only touched on a few of the more than 70 different components available in SentryOne Task Factory. Still, I hope you've gotten a better idea of how Task Factory helps resolve many common struggles you experienced during ETL pipeline development.
You can check out the full list of components and download a free trial here. A Task Factory license is not required to develop and run packages in Visual Studio or SSDT!
I believe you'll find, like so many SentryOne customers already have, that the hours saved in development and execution time are well worth licensing Task Factory for your SSIS servers. Give it a try and let us know!
Steve (@SQL_Steve) is a fourteen-year veteran of SentryOne, and has held roles in Solutions Engineering, Quality Assurance, Product Management, and Advanced Analytics prior to assuming his current role. He has almost twenty years' experience working with SQL Server, and holds numerous professional certifications including the Microsoft Professional Program, Data Science Certification. His current responsibilities are focused on ensuring everyone throughout the SentryOne family is educated on our customers, their needs, and the solutions we provide that allow us to improve the lives of Microsoft Data Professionals and their customers everywhere.