Cleveland DBA

SQL Server administration in Cleveland, OH

Database Snapshots – Part 3 – Data Recovery

In this final part of our series on SQL Server database snapshots, I’m going to talk about what, I think, is the most forgotten feature/use of snapshots:  data recovery.  If you (or ‘a friend’) makes a mistake and deletes data you shouldn’t have, or perhaps accidentally drops a table, if you have a snapshot, you can restore that lost data.  Now, before I go any further, I am in no way, shape, or form suggesting that you use database snapshots as your backup/restore solution. Let me reiterate that:  DO NOT RELY ON SNAPSHOTS FOR YOUR BACKUP/RESTORE SOLUTION!  Got it?  Excellent!  Now then, where was I?

Restoring from snapshots comes in especially handy in testing scenarios.  You’ve got this mammoth database and a developer wants you to take a backup so he can test some data changes and then have you restore the database to its prior state afterwards.  Depending on the size of the database and the speed of your disk subsystem, this entire process could take hours.  Who’s got that kind of time?  A faster solution would be to create a snapshot of the database prior to the testing, and just restore from that afterwards.  Why is this faster?  Because a) you’re not writing out all of the data when you create the snapshot like you would with a backup and b) you’re not writing back all the data when you restore from a snapshot like you would if you restored from a backup. See a pattern here?  When you recover from a snapshot you only write back the data that was changed since the snapshot was taken.

Let’s look at an example.

/*    create the snapshot    */

USE master;
GO
CREATE DATABASE AW_Snap1
    ON
        ( NAME = 'AdventureWorks_Data',
          FILENAME = 'D:\Program Files\Microsoft SQL Server\MSSQL10.D1DB08MS\MSSQL\DATA\AW_Snap1.ss')
AS SNAPSHOT OF Adventureworks;
GO

/*    Query the database and snapshot    */

SELECT * from AW_Snap1.Sales.SalesOrderDetail where SalesOrderID = 57049;
SELECT * from AdventureWorks.Sales.SalesOrderDetail where SalesOrderID = 57049;
GO

Restoring - first query

For the sake of drama, let’s imagine “someone” dropped a table in the database.

/*    oops!  someone dropped the table!    */

DROP TABLE AdventureWorks.Sales.SalesOrderDetail;
GO
SELECT * from AW_Snap1.Sales.SalesOrderDetail where SalesOrderID = 57049;
SELECT * from AdventureWorks.Sales.SalesOrderDetail where SalesOrderID = 57049;
GO

Restoring - after drop

DON’T PANIC!  Just restore from the snapshot.

/*    no problem, restore from the snapshot    */

USE master;
GO
RESTORE DATABASE AdventureWorks FROM DATABASE_SNAPSHOT = 'AW_Snap1';
GO

/*    table is back    */

SELECT * from AW_Snap1.Sales.SalesOrderDetail where SalesOrderID = 57049;
SELECT * from AdventureWorks.Sales.SalesOrderDetail where SalesOrderID = 57049;
GO

Restoring - after restore

Pretty cool, huh?

As with most things in life, there are some restrictions and considerations to keep in mind when restoring from a snapshot.  For one thing, the snapshot you’re restoring from can be the only current snapshot on the database, so drop all the other ones first.  Also, you can’t do a point in time recovery using snapshots (well, unless that point in time is the moment that snapshot was created).  On the same point, all data changes made since the snapshot was created will be lost.  Snapshot restores don’t fix problems like media failure, and they can’t restore over databases with compressed or read-only filegroups.  Restoring from a snapshot breaks the log backup chain, so always take a full backup after restoring from a snapshot.

I hope this series has helped introduce you to the possibilities for using database snapshots in your own environment.  And if you have any questions, feel free to leave a comment.

Advertisements

March 17, 2011 Posted by | Features, General | , , | 1 Comment

Database Snapshots – Part 2 – Offload reporting

So now that we know a little bit about database snapshots in SQL Server, what else are they good for?  One thing you can do is use snapshots in conjunction with database mirroring to offload reporting onto another server.

Suppose you’re using mirroring as part of a high-availability solution.  Normally that mirrored database/server would sit idle, waiting until the day something happens to the primary server and it gets to take over.  Kinda seems like a waste of perfectly good cpu cycles, doesn’t it?  While you can’t query that mirror directly, you can create a snapshot of it and query that.  Then maybe you can offload your reporting to the mirror server, while keeping your primary server freed up for OLTP processing.  Let’s take a look at how to make that happen.

I’m going to assume you already have mirroring set up and not go into that here.  Right now, we have a primary Bookstore database on MyServer1 being mirrored to MyServer2.  Let’s assume that twice a day we want to refresh the data in the snapshot for reporting.  The first thing we do is create a snapshot on the mirror, and test it.

USE master;
GO
CREATE DATABASE Bookstore_Snap_AM
   ON
       ( NAME = 'Bookstore_data',
         FILENAME = 'D:\Program Files\Microsoft SQL Server\MSSQL10.MyServer2\MSSQL\DATA\Bookstore_Snap_AM.ss')
AS SNAPSHOT OF Bookstore;
GO
USE Bookstore_Snap_AM;
GO
SELECT * FROM MyBooks;
GO

Mirroring - First query of snap

So far so good.  We could just have our reports run against this snapshot.  But when we create the new snapshot for the second half of the day, we’ll have to change the connections in the reports to read from that new snapshot.  That’s a little kludgy.  So instead what we’ll do is create another database called Bookstore_reporting.  And all that database will contain is synonyms pointing to the snapshot.  This way reports can be run against one database, regardless of what the current snapshot is called.

/*    Create separate database with synonyms    */
USE [master]
GO
CREATE DATABASE [Bookstore_reporting] ON  PRIMARY
( NAME = N'Bookstore_data', FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL10.MyServer2\MSSQL\DATA\Bookstore_data.mdf' )
LOG ON
( NAME = N'Bookstore_log', FILENAME = N'D:\Program Files\Microsoft SQL Server\MSSQL10.MyServer2\MSSQL\DATA\Bookstore_log.ldf' )
GO
USE Bookstore_reporting;
GO
CREATE SYNONYM [dbo].[MyBooks] FOR [MyServer2].[Bookstore_Snap_AM].[dbo].[MyBooks]
GO
/*    Test the synonym    */
USE Bookstore_reporting;
GO
SELECT * FROM Mybooks;
GO

Mirroring - First query of synonym

So now all of our reports are directed against Bookstore_reporting, which is really reading from our AM snapshot.  During the course of the morning, we have users making transactions in the primary Bookstore database, and in the afternoon we generate a new snapshot.

/*    update a record in the principal */
USE Bookstore; --this is the original database on the Primary server, MyServer1
GO
UPDATE Mybooks SET Published = '1936' WHERE Title = 'Gone With The Wind'
GO
/*    generate a new snapshot on the mirror server    */
USE master;
GO
CREATE DATABASE Bookstore_Snap_PM
   ON
       ( NAME = 'Bookstore_data',
         FILENAME = 'D:\Program Files\Microsoft SQL Server\MSSQL10.MyServer2\MSSQL\DATA\Bookstore_Snap_PM.ss')
AS SNAPSHOT OF Bookstore;
GO
/*    query the old and new snaps and the synonym        */
SELECT * FROM Bookstore_Snap_AM.dbo.MyBooks;
SELECT * FROM Bookstore_Snap_PM.dbo.MyBooks;
SELECT * FROM Bookstore_reporting.dbo.MyBooks;
GO

Mirroring - after update and new snap

Now we see that the AM snapshot has the original data, the PM snapshot contains the updated record, and Bookstore_reporting is still pointing to the AM snapshot.  So let’s redirect Bookstore_reporting to the PM snapshot.

/*    drop and recreate the synonym to point to the new snap and drop the old snap    */
USE Bookstore_reporting;
GO
DROP SYNONYM MyBooks;
GO
USE Bookstore_reporting;
GO
CREATE SYNONYM MyBooks FOR MyServer2.Bookstore_Snap_PM.dbo.MyBooks;
GO
USE master;
GO
DROP DATABASE Bookstore_Snap_AM;
GO
/*    Query the Synonym again    */
USE Bookstore_reporting;
GO
SELECT * FROM MyBooks;
GO

Mirroring - second query of synonym (snap2)

And even though our report connection hasn’t changed, we’re getting the updated data.  You can schedule the new snapshot creation and synonym switch into a job to automate everything, and it’s relatively seamless.  I would recommend creating a procedure to change all the synonyms in a single transaction.  You can find code examples of how to do this online.

Bear in mind that you may experience some blocking if you try to drop/recreate the synonyms while a report is running, and this may cause your synonym update or the report to timeout.  Don’t be tempted to use READ UNCOMMITTED in your queries to get around this, you could end up querying the old snapshot for one table and the new snapshot for another in the same query.  The dropping/recreating of a synonym is a fast process, so timeouts should be limited.

Coming up in Part 3 we’ll look at using database snapshots to recover data.

March 14, 2011 Posted by | Features, General | , | 1 Comment

Database Snapshots – Part 1 – An Introduction

Let’s say you support the database for a financial system, and periodically, whether it be monthly, quarterly, or yearly, they close out the period and run a bunch of reports before they let users back on the system to start making new transactions.  Depending on your system, the size of your database, and how those reports are written, that could take hours.

Or let’s say you have a datawarehouse environment that pulls from this financial database, and it has to be balanced against the source database.  It can be difficult and time consuming to balance against a moving target.

Or what if you wanted to offload both of the reporting and the datawarehouse extracts onto another server, so they wouldn’t impact other users?

Wouldn’t it be awesome if you could quickly create a read-only copy of that database at a specific point in time that your financial users could report against while opening the main database back up for next-period transactions, without hours of downtime?  Wouldn’t it be nice if your datawarehouse team had a static copy of the data to run their extracts and balancing against?  Wouldn’t it be a little slice of heaven if you could even offload all this onto another server??  Where do I sign?

The thing is, with database snapshots you can do all this, and more (yes more!).  Database snapshots were introduced in SQL Server 2005 and have some feature enhancements in SQL Server 2008.

What is a database snapshot?

In a nutshell, a snapshot is a read-only “copy” of your database as of a point in time.  I put “copy” in quotes because it’s not really a full copy of the database.  How database snapshots work is, initially an empty file is created, called a sparse file.  Only when data is changed in the original database does anything get written to the sparse file.  And what gets written is the data prior to the change.  So if you update a record in the customers table, before it updates the page SQL Server copies it in its original form to the sparse file, then it completes the update.  A picture’s worth a thousand words, so here are some words from the Microsoft.

How NTFS sparse files work

So, how do you know what data to pull from the snapshot and what data to pull from the original database?  You don’t need to know.  To read the snapshot, you treat it just as you would any other database.  It knows what data it has and what’s still in the original database and handles that logic behind the scenes.  Make sense?  Here are a couple more pics from Microsoft to illustrate the read process.  The first is at the initial snapshot creation, before any data changes.  You can see that, even though the user is querying the snapshot, SQL Server is reading entirely from the original database.

Reading from a new snapshot

Now let’s say someone’s gone in and changed a record in the database.  Prior to writing that page, SQL Server copies it to the snapshot.  Now if you query the snapshot, SQL Server knows to grab that page from the snapshot and the rest of the unchanged data from the database.  This is how you’re able to get a view of the data at a specific point in time, even with transactions going on.

image

How do I create a snapshot?

Unfortunately, you can’t create a snapshot via Management Studio, don’t ask me why.  So you’re stuck with T-Sql.  No biggie, though, it’s just a variation of the CREATE DATABASE statement.

USE master;
GO
CREATE DATABASE AW_Snap1
   ON
       ( NAME = 'AdventureWorks_Data',
         FILENAME = 'D:\MSSQL\Data\AW_Snap1.ss')
AS SNAPSHOT OF Adventureworks;
GO

If you take a peek at the snapshot file at the filesystem level, it appears to be the same size as the AdventureWorks datafile.  But if you look at the file properties or query sys.dm_io_virtual_file_stats, you’ll see the actual size is very small, in this case 131072 bytes.

USE AW_Snap1;
GO
SELECT size_on_disk_bytes from sys.dm_io_virtual_file_stats (DB_ID(),1)    ;
GO
image

A quick query of the snapshot and original database returns the same data:

USE AW_Snap1;
GO
SELECT * FROM Person.Contact WHERE ContactID = 200;
SELECT * FROM AdventureWorks.Person.Contact WHERE ContactID = 200;
GO

image

Now let’s update some data and run the query again.

/*    Update the original database    */
USE AdventureWorks;
GO
UPDATE Person.Contact SET MiddleName = 'Eugene' WHERE ContactID = 200;
GO
/*    Query the snap again    */
USE AW_Snap1;
GO
SELECT * FROM Person.Contact WHERE ContactID = 200;
SELECT * FROM AdventureWorks.Person.Contact WHERE ContactID = 200;
GO

image

As you can see, querying the snapshot returns the original data whereas querying the database returns the updated data.  Also, if we check sys.dm_io_virtual_file_stats again, we can see that the physical size of the snapshot file has gone up to 196608 bytes.

You can’t refresh an existing snapshot.  If you want a more current snap of the database, you’ll need to create a new snapshot.  Be careful with this.  SQL Server will continue to maintain any old snapshots until they’re dropped, so the more snapshots you have laying around out there the more overhead you’re going to incur.  As a rule, if you don’t need a snapshot anymore, drop it.

USE master;
GO
DROP DATABASE AW_Snap1;
GO

Stay tuned for future installments in this series focusing on the fun things you can do with snapshots.

March 7, 2011 Posted by | Features, General | , | 1 Comment