Bad Habits Revival
A few years ago, I started a relatively thorough series of T-SQL "Bad Habits" - things I see people do, often, that can cause problems in various ways (not always related to performance). Over the past 6+ years I have posted a number of other posts, too, that promote best practices and try to squash bad habits. Skip to the index.
I've given dozens of presentations centered around these bad habits, and plenty more have cropped up in the meantime. So I thought I would jot a few more down, and provide a handy index to all of them. First, though, I want to explain my motivation for writing these posts, because I don't think it always comes across the right way.
I'm not trying to correct you, or make you feel like you are "doing it wrong."
I'm trying to help you be consistent, avoid issues, and set better examples.
In many cases, the people committing these quasi-sins are well aware of the situations where they can break, and know that they are currently safe. For example, many people will declare
VARCHAR without length in this scenario:
SELECT CONVERT(VARCHAR, some_int_column) FROM ...
They can get away with not declaring the length for the
VARCHAR here, because the default SQL Server applies in this case is 30, and no integer value could ever be longer than 11 digits. The problem is that leaving off the length can break in *other* scenarios, where the default is 1 (Microsoft basically laughed at me when I suggested they make this consistent). And "break" doesn't always mean "generate an error message," making it easy to discover during development, and fix long before it makes it to production. In this case, you can irrecoverably lose data, and not even know about it:
DECLARE @x VARCHAR = '123'; SELECT @x; -- result: 1
(That works the same way for input parameters to a procedure, by the way, but I can't demo that in one line.)
So my issue in many cases is not that people take advantage of the situations where they know it can't break - they know the risks in other scenarios. My issue is when those people put code in the codebase like that, or - worse - blog or answer questions on Database Administrators or Stack Overflow that implicitly advocate the bad habit. The people who see that code (junior developers/DBAs, future maintainers, kids fresh out of college) don't get all of the author's knowledge as a by-product, so they may not be aware that - while this tactic is okay in the specific case they're looking at (or maybe it's not!) - it can break severely in others.
Another aspect of this is consistency. It seems wrong to me to use the length above when you need to, but not bother when you don't. So why not just specify the length, always? An even better example is the lazy shorthand people use for functions like
DATEDIFF. You might be surprised by this result (I won't spoil the fun; go ahead and try it):
SELECT DATEPART(Y, GETDATE());
If you can't use
Y in this case, why use
D in other cases? Why not just type out
DAY, always? Not only do you avoid this inconsistency where sometimes you need to spell it out and sometimes you don't, your code also becomes much more self-documenting. Since you type the code once, and others may read it many, many times, I don't buy that the extra cost of spelling out the word (measured in microseconds) is a productivity killer. I have never heard of anyone getting to leave for the long weekend early because they shaved all that time off from not having to type those characters, but I have seen long weekends start late because people were troubleshooting some mess that was harder to debug because of shorthand.
I feel similarly about the
INFORMATION_SCHEMA views. Since they are incomplete, aren't being updated for most new features, and Microsoft themselves recommend against them, I find it hard to justify using them in cases where you can (like getting just a list of columns from a table), but then having to use the catalog views when you need information the
INFORMATION_SCHEMA views simply don't have (like partitions, or filtered indexes, or
INCLUDE columns). Unless you need to write code that will work on multiple RDBMS platforms (in which case you can't use any of the SQL Server-specific features anyway), why not just use the catalog views always?
So, to summarize my long-windedness about T-SQL bad habits: be consistent, avoid issues, and set better examples. :-)
Here is an ongoing list of articles that I consider to be along these lines - either promoting best practices or eradicating bad habits; not all are explicitly framed as a "bad habit," but they do all represent in some way things I wish I observed less often. Some of my opinions are controversial, and many have evoked very passionate comment threads - so I recommend scrolling down for those, too.
Aaron (@AaronBertrand) is a Data Platform MVP with industry experience dating back to Classic ASP and SQL Server 6.5. He is editor-in-chief of the performance-related blog, SQLPerformance.com. Aaron's blog focuses on T-SQL bad habits and best practices, as well as coverage of updates and new features in Plan Explorer, SentryOne, and SQL Server.