Log chain broken? How to deal with backups from the future.

Today (2015-10-05) we had to deal with an interesting phenomenon. Backups from the future breaking the log chain!

Problem

Our third party software solution that is responsible for backup and restore tried to take a transaction log backup of a database but always converted the transaction log backup into a full backup as it detected the backup chain being broken.

You might also notice this informational message somewhere in your backup software:

Broken chain detected for database [<PlaceYourDatabaseNameHere>]. Automatically converting the backup to Full. This could happen due to a) Performing backups outside our software. b) Running log backups without running full after database restore of previous cycle. c) May have failed to record previous backup information of the database in <PlaceYourBackupSoftwareNameHere>.

The Backup chain is very important when your database is in recovery model FULL, because you would have lost the ability to restore to a certain point-in-time. The software detects this and automatically tries to fix this by taking a full backup. (A differential would have done the job, says Paul Randal: A SQL Server DBA myth a day: (20/30) restarting a log backup chain requires a full database backup)

So what exactly happend to our SQL Server Instance?

Analysis

Let’s have a look at the backup history for the database:

;WITH backup_cte
 AS ( SELECT database_name ,
 backup_type = CASE type
 WHEN 'D' THEN 'database'
 WHEN 'L' THEN 'log'
 WHEN 'I' THEN 'differential'
 ELSE 'other'
 END ,
 backup_finish_date ,
 rownum = ROW_NUMBER() OVER ( PARTITION BY database_name,
 type ORDER BY backup_finish_date DESC )
 FROM msdb.dbo.backupset
 )
SELECT database_name ,
 backup_type ,
 backup_finish_date
FROM backup_cte
WHERE database_name = DB_NAME(DB_ID())
 AND rownum = 1
ORDER BY database_name;

This results in:

backupdatefromthefuture

Wow – a transaction log backup from the future!

That of course could be the problem for the backup software.

But how would that effect the software to think that the backup chain is broken? The backup chain is intact when the last_lsn of the previous transaction log backup always is the first_lsn of the following transaction log backup. This is how the transaction logs are “chained”.

So let’s take a look at the LSNs:

-- ##################
SELECT bs.database_name
 , bs.backup_start_date
 , bs.backup_finish_date
 , CASE bs.type
 WHEN 'D' THEN 'DATABASE'
 WHEN 'I' THEN 'DIFFERENTIAL'
 WHEN 'L' THEN 'LOG'
 END AS backup_type
 , first_lsn
 , last_lsn
FROM msdb.dbo.backupset bs
WHERE database_name = DB_NAME(DB_ID())
ORDER BY bs.backup_start_date DESC

This results to:

lsns

where you can see that the first_lsn from the transaction log backup having finished at 2016-10-01 03:48:49.000 has a LSN in between the first_lsn and last_lsn of the full backup.

The following transaction log backups now fulfills the chain. Each last_lsn is the new first_lsn.

SQL Server has another source where to find the relevant last_lsn for a given database, the DMV “database_recovery_status“:

select last_log_backup_lsn
from sys.database_recovery_status
where database_id = db_id()

For our special database the last_log_backup_lsn resulted as:

last_log_backup_lsn

but wait, this is exactly the last_lsn from the screenshot above, so everything should be fine, right?

Now this is a shot in the dark, but i assume that the backup software also consideres the date to order the backups, not only the ascending LSNs.

Executing the following T-SQL statement:

SELECT bs.database_name
 , bs.backup_start_date
 , bs.backup_finish_date
 , CASE bs.type
 WHEN 'D' THEN 'DATABASE'
 WHEN 'I' THEN 'DIFFERENTIAL'
 WHEN 'L' THEN 'LOG'
 END AS backup_type
 , first_lsn
 , last_lsn
 , srs.last_log_backup_lsn AS [database_recovery_status.last_log_backup_lsn]
FROM msdb.dbo.backupset bs
 INNER JOIN sys.database_recovery_status srs ON srs.database_guid = bs.database_guid
WHERE database_id = DB_ID()
ORDER BY bs.backup_start_date DESC

results to:

lsns_differ

Here you can see that the LSN from the DMV sys.database_recovery_status differs to the one from msdb.dbo.backupset. This is just because of the column backup_start_date being taken into consideration.

… and in case you want a simple query that tells you, if you have a broken log chain, you can use this:

SELECT  	TOP 1 last_lsn as [msdb.dbo.backupset.last_lsn]
			, srs.last_log_backup_lsn AS [sys.database_recovery_status.last_log_backup_lsn]
			, CASE WHEN bs.last_lsn = srs.last_log_backup_lsn THEN 'log chain intact' ELSE 'LOG CHAIN BROKEN !!!' END AS [Log Chain Status]
FROM		msdb.dbo.backupset bs
			INNER JOIN sys.database_recovery_status srs ON srs.database_guid = bs.database_guid
WHERE		srs.database_id = DB_ID()
AND bs.type = 'L'
ORDER BY	bs.backup_start_date DESC

This results as:

logchainbroken

Solution

Just delete the row from the future from the msdb.

I used the code, that is implemented in the system stored procedure msdb.dbo.sp_delete_backuphistory and slightly changed just two things:

as i had to delete records from the future, i had to change

WHERE backup_finish_date < @oldest_date

into

WHERE backup_finish_date > GETDATE()

So the T-SQL to delete information about backups from the future looks like this:

 DECLARE @backup_set_id TABLE (backup_set_id INT)
DECLARE @media_set_id TABLE (media_set_id INT)
DECLARE @restore_history_id TABLE (restore_history_id INT)

INSERT INTO @backup_set_id (backup_set_id)
SELECT DISTINCT backup_set_id
FROM msdb.dbo.backupset
WHERE backup_finish_date > GETDATE() @oldest_date

INSERT INTO @media_set_id (media_set_id)
SELECT DISTINCT media_set_id
FROM msdb.dbo.backupset
WHERE backup_finish_date > GETDATE() @oldest_date

INSERT INTO @restore_history_id (restore_history_id)
SELECT DISTINCT restore_history_id
FROM msdb.dbo.restorehistory
WHERE backup_set_id IN (SELECT backup_set_id
FROM @backup_set_id)

BEGIN TRANSACTION

DELETE FROM msdb.dbo.backupfile
WHERE backup_set_id IN (SELECT backup_set_id
FROM @backup_set_id)
IF (@@error > 0)
GOTO Quit

DELETE FROM msdb.dbo.backupfilegroup
WHERE backup_set_id IN (SELECT backup_set_id
FROM @backup_set_id)
IF (@@error > 0)
GOTO Quit

DELETE FROM msdb.dbo.restorefile
WHERE restore_history_id IN (SELECT restore_history_id
FROM @restore_history_id)
IF (@@error > 0)
GOTO Quit

DELETE FROM msdb.dbo.restorefilegroup
WHERE restore_history_id IN (SELECT restore_history_id
FROM @restore_history_id)
IF (@@error > 0)
GOTO Quit

DELETE FROM msdb.dbo.restorehistory
WHERE restore_history_id IN (SELECT restore_history_id
FROM @restore_history_id)
IF (@@error > 0)
GOTO Quit

DELETE FROM msdb.dbo.backupset
WHERE backup_set_id IN (SELECT backup_set_id
FROM @backup_set_id)
IF (@@error > 0)
GOTO Quit

DELETE msdb.dbo.backupmediafamily
FROM msdb.dbo.backupmediafamily bmf
WHERE bmf.media_set_id IN (SELECT media_set_id
FROM @media_set_id)
AND ((SELECT COUNT(*)
FROM msdb.dbo.backupset
WHERE media_set_id = bmf.media_set_id) = 0)
IF (@@error > 0)
GOTO Quit

DELETE msdb.dbo.backupmediaset
FROM msdb.dbo.backupmediaset bms
WHERE bms.media_set_id IN (SELECT media_set_id
FROM @media_set_id)
AND ((SELECT COUNT(*)
FROM msdb.dbo.backupset
WHERE media_set_id = bms.media_set_id) = 0)
IF (@@error > 0)
GOTO Quit

COMMIT TRANSACTION
RETURN

Quit:
ROLLBACK TRANSACTION

 

Thank you for reading,

Christoph

 

Advertisements
This entry was posted in Backup, SQL Server. Bookmark the permalink.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s