T-SQL Tuesday: Shipping Database Changes


In a previous life, I was responsible for a production system with somewhere around 1,500 roughly identical databases, one per client. I called it a multi-tenant system for years, but that causes strife, because many people only think of multi-tenant as a database-level concept rather than an instance-level concept. This was, essentially, software as a service (SaaS) before that became a hipster buzzword.

So, we had 1,500 customer databases spread across three production SQL Server 2005 FCIs. Since I always have to defend the choice to completely isolate each customer's data, I'll tell you that this was done for a combination of reasons, but the three most important were:

  • A database per customer allowed us to very quickly migrate a noisy/busy customer or group to a different instance, simplified both onboarding and offboarding (CREATE DATABASE or DROP DATABASE), and made it easy to create very specific, targeted indexes for individual customers.
  • Independent databases also allowed for unique SLAs, specifically around RPO/RTO - not all customers had to have point-in-time recovery, for example, and some customers wanted to have full or differential backups performed at specific times and/or in their own time zones.
  • We served a wide variety of verticals, and some customers had legal or contractual requirements for their data to not reside with anyone else's.

There were multiple other factors that led us to this architectural decision, but the point of today's post is to tell you about one of the downsides:

Manageability

I'm not going to sugar-coat it; managing 1,500 nearly-identical databases was challenging. (Though managing 15 or 150 is really no less challenging.)

Some of the most interesting problems I solved back then had nothing to do with the database or application logic, but rather with how to make non-intrusive updates to that many databases - on a team that hadn't adopted source control for database objects in any way.

Onboarding

When a new customer would come aboard, we would have to allocate a database to them, and it would have to support all of the schema - tables, procedures, etc. The naming was easy - we had a central "control" database where we stored system-wide metadata. It had a dbo.Customers table, and the identity column there drove the name of the database: Customer0000<CustomerID>. This avoided complex naming rules for databases, for example when two companies have the same name, but introduced lookup complexity (just like referring to a patient by their ID instead of their name).

The schema and modules were simply inherited from the model database, which we kept up to date just like all of the existing customer databases. So a new database would inherit the latest version of the schema automatically.
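
Roughly, the onboarding flow boiled down to something like this - a sketch from memory, with a hypothetical column on dbo.Customers and an assumed zero-padding width, not the original script:

-- register the customer; the identity value drives the database name
DECLARE @CustomerID int, @db sysname, @sql nvarchar(max);

INSERT dbo.Customers(CustomerName) VALUES (N'Some New Customer');
SET @CustomerID = SCOPE_IDENTITY();

-- Customer000000000001-style name: 'Customer' + zero-padded identity value
SET @db = N'Customer' + RIGHT(REPLICATE(N'0', 12)
          + CONVERT(nvarchar(12), @CustomerID), 12);

-- CREATE DATABASE copies model, so the new database starts out
-- with the current version of the schema and modules
SET @sql = N'CREATE DATABASE ' + QUOTENAME(@db) + N';';
EXEC sys.sp_executesql @sql;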

Stored Procedures

Changes to stored procedures were pretty easy - we would make the changes in a dev environment, and when we were ready to deploy, we would just run an ALTER PROCEDURE command on all the databases.

And no, that doesn't mean manually typing USE Customer000000000001; or pulling each database from the SSMS dropdown and executing the ALTER over and over again, but I did go through a minor evolution for this.

Initially - when we had ~100 customers, the schema was simple, and changes were infrequent - I just built a dynamic SQL string to run on each database:

-- the ALTER, which needed single-quotes to be escaped
-- which got fun when the procedure generated dynamic SQL too
DECLARE @cmd nvarchar(max) = N'ALTER PROCEDURE dbo.whatever ...';
 
-- then, build a dynamic SQL command to accept the ALTER
-- and run it against each database
DECLARE @sql nvarchar(max) = N'';
 
SELECT @sql = @sql + N'EXEC ' + QUOTENAME(name) + N'.sys.sp_executesql @cmd;'
  FROM sys.databases
  WHERE name = N'model' OR name LIKE N'Customer00%';
 
-- finally, execute them all
EXEC sys.sp_executesql @sql, N'@cmd nvarchar(max)', @cmd;

I went through other iterations of this, such as loops that printed out each database's progress, with a WAITFOR added so I could stop the script if I noticed something off.
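
A rough sketch of what one of those loop versions looked like (reconstructed, not the original script; RAISERROR ... WITH NOWAIT is used here so progress messages appear immediately instead of waiting for the output buffer):

DECLARE @cmd nvarchar(max) = N'ALTER PROCEDURE dbo.whatever ...';
DECLARE @db sysname, @sql nvarchar(max);

DECLARE dbs CURSOR LOCAL FAST_FORWARD FOR
  SELECT name FROM sys.databases
  WHERE name = N'model' OR name LIKE N'Customer00%'
  ORDER BY name;

OPEN dbs;
FETCH NEXT FROM dbs INTO @db;

WHILE @@FETCH_STATUS = 0
BEGIN
  RAISERROR(N'Deploying to %s ...', 0, 1, @db) WITH NOWAIT;

  SET @sql = N'EXEC ' + QUOTENAME(@db) + N'.sys.sp_executesql @cmd;';
  EXEC sys.sp_executesql @sql, N'@cmd nvarchar(max)', @cmd;

  -- short pause between databases, long enough to stop the script if something looked off
  WAITFOR DELAY '00:00:02';

  FETCH NEXT FROM dbs INTO @db;
END

CLOSE dbs;
DEALLOCATE dbs;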

As the schema became more complex, and both customers and development ramped up, it became clear that this would not scale. I went looking for a tool that would help deploy database objects to a number of server/database combinations with one click.

What I found at the time was SQL Farm Combine, still available for download on some freeware sites, but I don't know how mature it is now or how well it has been maintained. At the time, it was a life saver. All you had to do was paste in the script you wanted to deploy, point it at a server, and it would present a list of all the databases - you just checked which ones you wanted to target. In my case it was easier to select all, then uncheck master, tempdb, and other databases I didn't want to affect. This eliminated the dynamic SQL and, in many cases, prevented botched deployments caused by how hard it was to keep track of as many as 14 consecutive single quotes.

Tables

Changes to tables were a little harder to make non-intrusive - adding a column with a default, for example, was certainly not a trivial thing back then. I would do my best to do things in such a way that they would not lock the entire table or kill the log. For some changes, I could use the same technique as above; others had to be planned out more carefully.
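
For example, a minimal sketch of the kind of staged change I mean, against a hypothetical dbo.Orders table (not a real one from that system): add the column as nullable first, backfill in small batches to keep locking and log growth under control, and only then tighten the definition.

-- nullable column: a quick, metadata-only change
ALTER TABLE dbo.Orders ADD IsArchived bit NULL;
GO

-- backfill in batches instead of one giant, log-filling UPDATE
DECLARE @rows int = 1;
WHILE @rows > 0
BEGIN
  UPDATE TOP (5000) dbo.Orders
    SET IsArchived = 0
    WHERE IsArchived IS NULL;

  SET @rows = @@ROWCOUNT;
END
GO

-- once the backfill is done, add the default and make the column required
ALTER TABLE dbo.Orders ADD CONSTRAINT DF_Orders_IsArchived DEFAULT (0) FOR IsArchived;
ALTER TABLE dbo.Orders ALTER COLUMN IsArchived bit NOT NULL;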

Another thing that became complex is that tables in the dev environment would sometimes go through a whole series of changes before arriving at the final version. With stored procedures, this is pretty easy - you ALTER a bunch of times, and the last ALTER is the one you apply to production, without any knowledge of the interim states. With a table - say you add a column as one data type, then change the data type, then change the name, then drop the column and re-add it with a default, then add a check constraint - there is no magical "sum up all my changes and issue a single ALTER" button at the end of that iterative process.
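
To illustrate with a made-up table (not one from that system): the dev history might look like the first batch below, while production only ever needs the net effect at the end.

-- what actually happened in dev over a couple of weeks
ALTER TABLE dbo.Orders ADD Status int NULL;
ALTER TABLE dbo.Orders ALTER COLUMN Status varchar(10) NULL;
EXEC sys.sp_rename N'dbo.Orders.Status', N'OrderStatus', N'COLUMN';
ALTER TABLE dbo.Orders DROP COLUMN OrderStatus;
ALTER TABLE dbo.Orders ADD OrderStatus varchar(10) NOT NULL
  CONSTRAINT DF_Orders_OrderStatus DEFAULT ('New');
ALTER TABLE dbo.Orders ADD CONSTRAINT CK_Orders_OrderStatus
  CHECK (OrderStatus IN ('New','Shipped','Cancelled'));

-- the only statement production actually needs
ALTER TABLE dbo.Orders ADD OrderStatus varchar(10) NOT NULL
  CONSTRAINT DF_Orders_OrderStatus DEFAULT ('New')
  CONSTRAINT CK_Orders_OrderStatus CHECK (OrderStatus IN ('New','Shipped','Cancelled'));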

Enter Redgate SQL Compare, which - even in its early days - was by far the easiest way to generate a script of deltas between two objects. So I would point it at model in dev and model in production, it would do its thing, and I would end up with the "sum up all of my changes" script I had been looking for. I could pick which changes to keep (since sometimes we deployed while multiple separate changes were being developed), generate the script, and plug it into SQL Farm Combine. Three clicks later, I was able to deploy to all 1,500 databases across three instances - without opening Management Studio, and a long time before PowerShell caught on.

Today

If I were back in that position today, those would not be my chosen methods. For one, I would be deploying from source code, not using diffs - and potentially using SSDT and database projects. I'd invest more heavily in Redgate's tools, as they are designed specifically for development and deployment (and in addition to continuing with a very mature Compare product, they also have Multi Script, which would replace SQL Farm Combine in my strategy above).

Keeping it brief: This isn't an ad for Redgate; I just think they make the best tools on the market for this kind of task.

Generally

The biggest concern I have about database deployment is not about how you deploy, or even whether or not you use source control. It's whether your database changes are backward compatible - meaning they won't break the current application in the event the application can't be changed at the same time (and with distributed software, deploying every tier simultaneously is impossible). The largest number of bugs I had a hand in creating were caused by assuming the application or middle tiers would be deployed at the same time.

I quickly became very motivated to make sure my changes would work both before and after the application tier changes were deployed. Add a parameter to a stored procedure? Make sure it's at the end and has a default. Add a column to a table? Make sure views and stored procedures don't expose it (yet) or require it. And a hundred other examples. This inspired a series of tips a few years ago.
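
As a minimal sketch of the first of those rules, using a hypothetical procedure - the new parameter goes at the end with a default, so callers that haven't been updated yet keep working:

ALTER PROCEDURE dbo.GetOrders
  @CustomerID      int,
  @IncludeArchived bit = 0   -- new and optional; existing callers simply omit it
AS
BEGIN
  SET NOCOUNT ON;

  SELECT OrderID, OrderDate
    FROM dbo.Orders
    WHERE CustomerID = @CustomerID
      AND (IsArchived = 0 OR @IncludeArchived = 1);
END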

I will concede that it's not always possible to deploy one tier without the other, but there are clever ways to avoid some of the risk, or at least minimize it. I hope my memories have inspired you to think about your processes. What is your favorite trick for seamless database deployments?
