Native SQL Server Backup Types and How To Guide
Published On: October 1, 2021
When a disaster occurs, the ability to successfully restore SQL Server databases is fundamental for avoiding data loss. It’s not enough to rely on hardware and software technologies to replicate data because these technologies can go wrong or there can be multiple failures during a disaster, meaning you have to rely on backups to recover. SQL Server allows multiple types of backups, and a multitude of options to use with those backups. This article will give you primer on the most important SQL Server native backup types and options, and a future article will cover restoring backups.
Full backups are the most common type of native backup in SQL Server and can be performed at the database, filegroup, or file level, although it’s most common to perform a full database backup.
A full database backup is exactly as the name suggests; a complete backup of all the data in the database. A full database backup also contains enough of the transaction log to be able to run recovery as part of the restore, allowing the database to be restored to a transactionally-consistent point. Note however, the backup only contains the portions of the data file that are actually being used. For example, a database that is 250GB in size might only have 50GB of data in it, so the data portion of the backup will only be 50GB.
A full database backup can be considered the base backup for other types of native SQL Server backups such as differential and transaction log backups. For this reason, a full backup is required before you can perform a differential backup or transaction log backup and is the first backup that must be restored in any restore sequence.
You can use the GUI in SQL Server Management Studio to perform a full database backup, or just use T-SQL. I prefer using T-SQL for better consistency with issuing backup commands.
The command for creating a full database backup is BACKUP DATABASE. To use this command, you must specify which database to backup and the backup device (essentially the file that will store the backup). The code below is an example of a full database backup of a database named MyBigDatabase with the backup device being MyBigDatabase.BAK located in the E:\SQLBACKUPS folder.
It’s a good practice to include a date and time in the backup filename, and what kind of backup it is, and if you’re writing a script to perform backups as part of a SQL Agent job, you can construct the filename programmatically. Note, the three letter file type can be anything you want, but it’s common to use .bak or .bck.
Depending on the size of the database and speed of your storage system, it may be necessary to split the backup across multiple files, and this is common for scenarios like migrating a large database to Azure Blob storage. To split the database backup across multiple files, all you have to do is specify multiple backup devices. For example:
What if you need to write the complete full backup file to two or more locations? An example of this would be if you want to write the backup files locally and to your HA/DR server at the same time. You could do it in a two-step process by performing the backup locally and then using another process to copy the backup to the HA/DR server, but SQL Server includes the option to mirror backups to up to three copies. A limitation of using mirror is each location must be the same media type, meaning you can’t mirror to disk and tape or to disk and a URL. An example of a mirrored backup is:
Similar to full backups, native differential backups in SQL Server can be performed at the database, filegroup, or file level, and it’s most common to perform a differential database backup.
Differential database backups contain all changes since the most recent full database backup, so a differential backup isn’t an incremental backup, as some other database systems provide. SQL Server tracks changes to data file pages, and if a page is changed, the extent it’s part of (simply a group of eight pages) will be marked as changed in a “differential bitmap”. A differential backup examines the various differential bitmaps in the database (one per 4GB per data file) and will back up the extents marked as changed. Note, if a page is updated multiple times, the differential backup will only contain the page image at the time of the backup, not the history of changes. The differential bitmaps are reset whenever a full backup is performed.
Differential backups are often used by organizations to help reduce the overall backup storage requirement by reducing the need for nightly full backups and this works well for databases with moderate daily data change. Consider a database that is 250GB in size and results in a 100GB full backup. A nightly full database backup would take up 700GB of storage to maintain seven days of backup retention of just the full backups. Now consider approximately 5GB of data changes in that database daily. If you made a full backup on Sunday and differential backups Monday – Saturday, your backup sizes should measure 100GB on Sunday, 5GB Monday, 10GB Tuesday, 15GB Wednesday, 20GB Thursday, 25GB Friday, 30GB on Saturday. For the entire week, you’d have about 205GB of backup files compared to 700GB with nightly full backups.
This means if your backup strategy is to perform a weekly full backup and a daily differential backup, successive differential backups will get larger through the week as more data changes. It’s possible for the differential backups to become extremely large (tending towards the size of the full backup) and decrease your recovery time objective because of the time it would take to restore the most recent full backup and most recent, large differential backup. If you start using native differential backups, you need to consider the overall benefit of storage reduction, with recovery time. For many organizations, the cost savings for storage outweighs the extra minutes of recovery time.
To perform a differential backup, you must add the WITH DIFFERENTIAL clause to the BACKUP command. Using our previous backup statement for the full backup, I add the differential clause and change the backup type in the name. For example:
Transaction Log Backups
For business-critical databases with constant data change, native transaction log backups should be being performed regularly because if you need to be able to restore your database to any point in time during the day, you’ll need transaction log backups to accomplish that recovery. To back up the transaction log, you’ll need to be using the FULL recovery model. It’s also possible to perform transaction log backups using the BULK_LOGGED recovery model, but it’s rare to see a database that isn’t using FULL.
A transaction log backup contains all log records created since the previous transaction log backup. In order to restore a database to a specific point in time, you’ll first have to restore the most recent full database backup, then the most recent differential backup (if you’re using them), and then each transaction log backup since the most recent data backup up to the point you want to restore to.
The syntax to create a transaction log backup is similar to that of a full backup, however since we’re backing up the log and not the database, you need to specify log as the backup type. For example:
Notice the extension for the backup file changed from .bak to .trn. Just like with data backups, it doesn’t matter what extension you use, but .trn is common. It’s especially important you include the time for transaction log backups since you’ll typically be creating multiple per hour. A common interval for transaction log backups is every 15 minutes, so 96 log backup files per database per day.
A tail-log backup (sometimes called tail-of-the-log backup) simply backs up any log records that have not yet been backed up by regularly scheduled log backups. If you’re going to restore a production database from a native backup, it’s important you capture any log records that haven’t been backed up by creating a tail-log backup, and that will be the final backup in your restore sequence. So, a tail-log backup is a transaction log backup and one could say every transaction log backup was a tail-log backup at the time it was taken. You can get more information here.
COPY_ONLY is a backup type that should always be used when creating a one-off full database backup if you are performing differential backups. The COPY_ONLY clause for a full database backup prevents the differential bitmaps from being reset and that full backup becoming the base for future differential backups. You can get more information here.
An example of a full database backup with COPY_ONLY is as follows:
It’s also possible to use the COPY_ONLY clause for transaction log backups, but they’re typically not necessary, so I won’t cover them here.
The WITH clause is used with the BACKUP command for cases where there are additional backup requirements, as you saw above when I described DIFFERENTIAL and COPY_ONLY backups. Here are some other important options:
- COMPRESSION: This option enables backup compression which usually results in backups being smaller and saving storage space. Compression isn’t on by default when you install an instance of SQL Server but can be enabled by default at the server level, and then NO_COMPRESSION can be used to turn off backup compression for those databases where it doesn’t reduce the backup size. More information can be found here.
- ENCRYPTION: An encryption algorithm can be specified with BACKUP to encrypt the backup files, and this is useful if a database does not have Transparent Data Encryption enabled. More information can be found here.
- STATS: This option is helpful to see the status of the backup operation as it progresses, and is usually used when performing a manual backup. You specify the interval in percentage of completion from 1 to 100, and it’s common to specify 10.
- CHECKSUM: This option specifies the backup operation verifies the page checksum protection on each page as it’s being backed up and fails the backup if it finds a bad checksum. This option also creates a checksum of the backup file and writes it into the backup file header. This allows the integrity of the backup to be verified easily.
- INIT or NOINIT: These options control whether the backup operation appends to an existing file (NOINIT) or overwrites the existing file (INIT). The default is to append to the most recent backup set on the media, but this can cause much confusion as the RESTORE statement has to then specify which of the backups within the file to restore, so it’s common to use INIT. The practice of putting the date and time in each backup filename also remove the possibility of accidentally appending to or overwriting an existing backup file.
Taking these into account, a full backup statement might become something like this:
Backups are crucial for recovering an organization’s data, especially when you consider data is one of the most important assets an organization possesses. Understanding the native SQL Server backup types and ensuring you have the proper backups to meet your recovery objective can mean all the difference in whether an organization survives a massive failure.
A common backup strategy is to have weekly full backups, daily differential backups, and log backups at 15 or 30 minute intervals. Your mileage may vary, as the saying goes, depending on the downtime and data loss requirements of your environment, but now know you the various backup types and options you’ll need to start putting together a backup strategy. If you want some deeper information about how backups work, check out Paul Randal’s Pluralsight course SQL Server: Understanding and Performing Backups.
Tim is a Principal Consultant with SQLskills, and has worked in a variety of roles, including DBA, Lead DBA, and multi-department manager. He is a SQL Server MVP, has a whole collection of Microsoft and other industry certifications, and is very active and passionate in the SQL Community.