This is not a good way to make friends in the international market, and it can be a royal pain to change later. So think hard about these design decisions up front. If I were to weigh the impact of disk space versus the cost of refactoring, I would always lean toward using Unicode for any column that could ever potentially require Unicode, and just dealing with the extra space requirements. So, if in doubt, choose
nvarchar. And above all, be consistent - when you start mixing
nvarchar, you end up with implicit conversions, which can often have drastic effects on your workload.
Making your variable length columns too long - one thing I see frequently is the reluctance to pick an upper bound for a variable length column - e-mail and URL, for example, are often set to 4000 or 8000 characters, or even
max types. Even though an e-mail can't be longer than 320 characters (64 for localpart, 1 for @ symbol, 255 for domain), and a URL can't be longer than 1,024 or 2,048 characters, depending on which browser is your lowest common denominator. For
max types there are inherent inefficiencies and differences in behavior that are well documented. For picking an arbitrary length in the 4000/8000 range, a lesser-known problem is that if you have e-mail address defined as
varchar(8000), but the largest value is 120 characters, and the average is 36 characters, the memory grant for any query is going to be based on the assumption that the average e-mail address is 4,000 characters long. On your local system and with a small data set, that's not really a problem; at scale, though, it's going to be painful.
Storing comma-separated lists - this is a popular one that has been getting further muddled by support for other non-relational data like XML and JSON. Basically, my opinion is this: any single fact that you care about independently, should be stored independently. If you're stuffing pet names into a comma-separated list and you're going to want to search for everyone with a pet named Snowball or Santa's Little Helper, that comma-separated list is going to make your job very hard. Try to think about *all* of the future uses of the data, not just focusing on what is the easiest way for the application to pass a "list" to SQL Server and store it (and in fact, think about it as a set, not a list). Depending on the application, you might actually want a better way to do this end-to-end: table-valued parameters, where the app doesn't have to take a set and build a single string, and the database never has to turn around and split that string back into its original set.
Now, I am not saying there are zero use cases for storing XML or JSON inside of SQL Server - there certainly are valid scenarios. But ideally, either the application(s) should not rely on extracting parts or searching these blobs inside the database, or users should understand that those features are unlikely to be very fast.
I really could go on and on about these things, but I'll save you some agony for now. Please do keep in mind that many of these things I discourage are subjective, and they often do have valid pros, I just don't prioritize the pros as high as the cons.