Design a backup strategy
Gather requirements. What level of loss is acceptable? How quickly do we need to recover? Which databases are more important than others?
A backup strategy consists of a set of backup operations and a schedule or frequency of when the backup operations will be performed.
Stakeholders need to define the three main requirements:
- Recovery Time Objective (RTO) – maximum downtime after an incident
- Recovery Point Objective (RPO) – maximum acceptable data loss
- Recovery Level Objective (RLO) – the granularity of the data that needs to be restored. (Server / Database / tables, etc.)
These numbers will make up part of your SLA. You need to test restores to know how long they take and if you can really meet your RTO. You may need to use different backup types to get the restores you need.
Things to consider
- Size of the database – bigger takes longer
- DB file structure – multiple files gives flexibility
- Speed & Throughput of network / storage /processor subsystems
- Volume and size of data modifications in the database – need to keeps tranaction log growth under control
- Type of data modifications – Updates v Inserts – Updates don’t grow the logs the same as inserts?
- Data Compressibility – compression uses CPU. If data is not compressible, why waste CPU on compression
- Is point-in-time recovery required? If so, you have to do log backups.
- Recovery Model – simple recovery model doesn’t allow for log backups
- Is the data important? – Maybe don’t backup test data?
Back up databases
- Full – The entire database and any changes made during the backup
- Differential – Everything between the last full and when the differential was started.
- Log – Everything since the lat log backup
- File – a file or file group that maes up the database
- Partial – Like a Full, except it excludes read-only filegroups. It always includes the primary filegroup
- Tail-Log – all transaction log records not backed up since the last log backup. This is usually the first step in a DR process
- Backup device – disk or tape
- Media Family – single non-mirrored device, or a set of mirrored devices in a media set
- Backup set – a successful backup’s content
- Media Set – a set of backup media that contains one or more backup sets
- Log sequence number (LSN) – numbering sequence inside the transaction log
- Sequence Number – order of the pyysical media in a media family. Media families are also numbered within a media set
There is a header on each backup device listing media name, number of media families, sequence numbers, etc.
Full backups are important. If your full gets corrupted, all the differentials in the world won’t save you. A full backup consists of:
- Checkpoint – flushes buffers
- Record backup start LSN – records the LSN where the backup started
- Backup Data – backs up all the extents (8 8k pages) from all the data files in the db
- Record backup end LSN – records the LSN at the end of the backup
- Calculate minimum LSN – determines the minimum LSN required for the log backup
- Backup log –
Differential backups – get bigger the longer you go without a full backup. Restore times increase as well. Does not support point-in-time recovery.
Log Backups (incremental) – frees up space in the transaction log and support point-in-time recovery
File Backups – used when a multi-file db is really large and you want to backup different files at different times.
Partial Backups – by default only backs up read-write file groups and the default file group
- CHECKSUM | NO_CHECKSUM – makes sure each page does not have a checksum or torn page error. also generates a checksum for the entire backup
- CONTINUE_AFTER_ERROR | STOP_ON_ERROR – tells the DB engine what to do if there is a checksmum error
- COPY_ONLY – does not change the backup sequence number
- COMPRESSION | NO_COMPRESSION – compress the backup
- DESCRIPTION – describe the backup
- ENCRYPTION | NO_ENCRYPTION – encrypt the backup? Default is no encryption
- EXPIREDATE | RETAIN_DAYS – when can this backup be overwritten. RETAINDAYS takes priority
- FORMAT | NO FORMAT – creates a new media set. destructive
- INIT | NOINIT – INIT means all backup sets should be overwritten
- SKIP | NOSKIP – ignore the safety check that checks the backup set’s expiration before overwriting it
- STATS – controls the percentage intervals at which you get feedback on the progress of the backup. Default is 10
- MIRROR TO – mirror the backup to another device
Perform database snapshots
Snapshots are read-only point in time files separate from the database file. Work at the page level. Whenever a page is modified, the non modified version is written to the snapshot. They depend on the underlying database. If you loose the parent DB, they are worthless.
Backup databases to Azure
SQL Server Managed Backup to Microsoft Azure – Uses Azure storage blobs. Automatically backs things up through stored procedures. Does fulls every week or every 1GB of log growth. does log backups every tweo hours or every 5MB of log growth.
Backup a VLDB (Very Large Database)
A VLDB is so big that you can’t meet SLA’s even if you throw hardware at it.
Tricks to shrink backup times on non-VLDBs
- Compress tables to make them smaller
- columnstore indexes can decrease the size of tables
- move old data out to an archive database
- use backup compression to reduce the size of the backup
- backup to multiple devices – can run 64 concurrent threads, backup to 64 devices
Tricks to shrink backup times on VLDBs
- Use Filegroups, multiple data files
- Backup different files on different nights
- Put files on separate disks
- Don’t put anything inside the primary data file for the DB
- partial availability – DB can be up even when some files are not
Manage transaction log backups
Full – every transaction fully logged. Log backups have no dependency on DB files
- can recover up to a point in time
- can recover to a marked transaction
- can restore individual pages
- needed for availability groups, database mirroring, and log shipping
Bulk-Logged – records only the extents modified. Need files. Not for production use
- Minimally logs bulk operations
- No point-in-time recovery
- Otherwise like full
- Use when you want to do a big operation, but don’t have log space for it
Simple – no transaction log backups are possible. Not for production use
- No log management – you can loose everything since the last full or differential
You can switch recovery models whenever you want, but it breaks the log backup chain.
You should backup the transaction logs. Here are options you should never use, but will be on the test.
- NO_TRUNCATE – don’t truncate the log at the end of the backup
- NORECOVERY – backs up the tail log and leaves the DB in the RESTORING state
- STANDBY – backs up the tail of the log and leaves the database in a read-only and STANDBY state
You can use NVDIMM (storage class memory) since 2016 SP1 to hold the transaction log cache
Configure backup automation
- You can schedule stuff from the “SQL Server Agent” node in Management Studio
- You can also do a maintenance plan
- Check Database Integrity
- Clean up History
- Execute SQL Server Agent Job
- Execute T-SQL Statement Task
- Maintenance Cleanup Task
- Notify Operator Task
- Rebuild Index
- Shrink Database – not for prod. fragments tables and logs
- Update Statistics