Cleveland DBA

SQL Server administration in Cleveland, OH

IntelliSense tip

The IntelliSense feature of SSMS in SQL 2008 is handy; overall I really like it.  There was one thing that was bugging me, though: if I made any schema changes while SSMS was open, IntelliSense didn’t seem to know about it.  It would mark any references to those changes as invalid.  The SQL would still run, mind you, it was just annoying seeing red squiggly lines everywhere.

Well, it seems IntelliSense has its own little cache in SSMS, and you need to refresh it with the current metadata.  This can be done by going to Edit -> IntelliSense -> Refresh Local Cache or by hitting Ctrl+Shift+R.

Just thought I’d share that here in case anyone else was running into the same problem.

May 31, 2011 Posted by | Features | , | 1 Comment

Page compression and performance

I know, I know.  This is the last one on compression, I promise.  Well, the last one for… at least a month.

So, we’ve talked about the different types of data compression and some things to keep in mind when deploying compression.  But we all know that the biggest factor that might keep you from using compression is the potential performance hit.  I mean, saving disk space is great and all, but what if it means your app takes a 10%, 20%, even 50% performance hit?  Well, maybe not so much.  So how do you know what to row compress, what to page compress, and what to leave alone?

The first place I would start is with this awesome article on MSDN:  Data Compression: Strategy, Capacity Planning, and Best Practices.  If you don’t have time to read it now, print it out and read it later.  It’s well worth the paper.

Reads

In general with data compression, you get the biggest bang for your buck on tables/indexes that are scanned.  You’ll still see some performance benefit on single-row lookups, but it won’t be nearly as pronounced.  For this reason, data compression really shines in data warehouses.  To see this for myself, I set up some tests with 2 clustered tables with about 25 million records, one table using no compression, the other using page compression.
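In case you want to mimic the setup, here’s a stripped-down sketch of the two tables.  The real tables have more columns; auindex and aureason are the only two referenced in the tests, and the table names match the output below.

CREATE TABLE dbo.mytab_no_compression
(
    auindex  int     NOT NULL,
    aureason char(4) NULL,
    CONSTRAINT PK_mytab_no_compression PRIMARY KEY CLUSTERED (auindex)
);

CREATE TABLE dbo.mytab_page_compression
(
    auindex  int     NOT NULL,
    aureason char(4) NULL,
    CONSTRAINT PK_mytab_page_compression PRIMARY KEY CLUSTERED (auindex)
);

-- page compress the second copy (can be done before or after loading it)
ALTER TABLE dbo.mytab_page_compression REBUILD WITH (DATA_COMPRESSION = PAGE);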

Before I show the test results, let’s take a quick look at the physical size difference between the tables using sp_spaceused:
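If you want to run the same check in your own environment, it’s just two calls against the two tables:

EXEC sp_spaceused 'mytab_no_compression';
EXEC sp_spaceused 'mytab_page_compression';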

I’d call that some pretty impressive space savings using page compression.  Ok, on to the SELECT tests.

For the first test, I queried on a range of key values (Clustered Index Seek).

set statistics io on
select COUNT(*) from mytab_no_compression where auindex between 215457 and 15555555;
select COUNT(*) from mytab_page_compression where auindex between 215457 and 15555555;

(1 row(s) affected)
Table ‘mytab_no_compression’. Scan count 5, logical reads 252577, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)
Table ‘mytab_page_compression’. Scan count 5, logical reads 56271, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

The mytab_no_compression table required more than 4 times as many logical reads for the same query.

Next I queried on a single key value (Clustered Index Seek).

set statistics io on
select COUNT(*) from mytab_no_compression where auindex = 2153545;
select COUNT(*) from mytab_page_compression where auindex = 2153545;

(1 row(s) affected)
Table ‘mytab_no_compression’. Scan count 1, logical reads 4, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)
Table ‘mytab_page_compression’. Scan count 1, logical reads 3, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Meh.  As I stated earlier, the performance gains on single-row queries aren’t very impressive.

My third test was a select on a non-key value (Clustered Index Scan).

set statistics io on
select COUNT(*) from mytab_no_compression where aureason = '0518'
select COUNT(*) from mytab_page_compression where aureason = '0518'

(1 row(s) affected)
Table ‘mytab_no_compression’. Scan count 5, logical reads 422394, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(1 row(s) affected)
Table ‘mytab_page_compression’. Scan count 5, logical reads 92418, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Here again we see a large performance gain with the page compressed table.

What about writes?

Well, I’m sold, how about you?  I know, I know.  Reads are one thing.  But what about writes?  What kind of impact does page compression have on OLTP applications?  The article I referenced earlier does a great job in exploring the effect of page compression on heavily updated tables, even providing a general formula for determining what tables should use page compression, row compression, or no compression.  In my environment, however, the application I’m looking at doesn’t really do updates.  It does delete/insert instead.  So how would page compression affect that?  And what about its impact on loads in our data warehouse?  More tests were needed.

The first write scenario I looked at was a large insert of roughly 25 million records.  There were 4 separate tests, and I ran each one twice (a sketch of one of them follows the list):

  1. Heap with no compression
  2. Heap with page compression
  3. Clustered index with no compression
  4. Clustered index with page compression
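To give you an idea, test 4 looked roughly like this.  It’s a sketch, not the exact harness — the target table name is made up and the timing capture is left out:

-- Test 4 (sketch): load ~25 million rows into a page-compressed clustered table
CREATE TABLE dbo.load_test_page
(
    auindex  int     NOT NULL,
    aureason char(4) NULL,
    CONSTRAINT PK_load_test_page PRIMARY KEY CLUSTERED (auindex)
);
ALTER TABLE dbo.load_test_page REBUILD WITH (DATA_COMPRESSION = PAGE);

SET STATISTICS TIME ON;

INSERT INTO dbo.load_test_page (auindex, aureason)
SELECT auindex, aureason
FROM dbo.mytab_no_compression;   -- the uncompressed copy from the read tests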

Ok, so we took a little hit on the compressed heap load vs. the non-compressed heap, but personally I don’t think that’s too bad for 25 million records.  But holy schnikes Batman, look at the compressed clustered load!  It’s probably common sense, but really folks, build a heap and add the clustered index after it’s loaded, mkay?

Next I wanted to look at how compression affected singleton deletes/inserts.  I wrote a small bit of code that pulled 250,000 random records into a temp table and then cursored through them to delete the original record and insert a new record (a rough sketch follows the list).  In this case I had 6 separate tests, and again each was executed twice:

  1. Heap with no compression, index with no compression
  2. Heap with page compression, index with no compression
  3. Heap with no compression, index with page compression
  4. Heap with page compression, index with page compression
  5. Clustered index with no compression
  6. Clustered index with page compression
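For the curious, the harness looked more or less like this — shown here against the page-compressed clustered table, with names of my own choosing and the timing capture left out:

-- Grab 250,000 random rows to replay as singleton delete/inserts
SELECT TOP (250000) auindex, aureason
INTO #work
FROM dbo.mytab_page_compression
ORDER BY NEWID();

DECLARE @auindex int, @aureason char(4);
DECLARE c CURSOR FAST_FORWARD FOR SELECT auindex, aureason FROM #work;
OPEN c;
FETCH NEXT FROM c INTO @auindex, @aureason;
WHILE @@FETCH_STATUS = 0
BEGIN
    DELETE FROM dbo.mytab_page_compression WHERE auindex = @auindex;
    INSERT INTO dbo.mytab_page_compression (auindex, aureason) VALUES (@auindex, @aureason);
    FETCH NEXT FROM c INTO @auindex, @aureason;
END
CLOSE c;
DEALLOCATE c;
DROP TABLE #work;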

I’d call the first pairing a wash.  In the second pairing, a slight advantage goes to the compressed heap with the compressed index.  The third pairing, the clustered indexes, is where we see the biggest impact of page compression, with a 5% average increase in the timings.  But nothing here would make me rethink deploying page compression in an environment that had the CPU cycles to spare.  And to that end, I also monitored processor utilization during my tests.  For the delete/insert tests, average CPU utilization was 0.2% higher on the compressed tables, and on the initial load tests the average utilization was about 3% higher.  Keep in mind this was on a pretty beefy server with no concurrent usage, so your mileage may vary.

So there you have it.  I hope this series of posts was helpful; I know I learned quite a bit.  Have a great weekend!

May 26, 2011 Posted by | Features | , , | Leave a comment

A little more about compression

Happy Monday folks!  Ready to talk some more about compression?

Unicode Compression

Another type of compression, new as of SQL Server 2008 R2, is Unicode compression.  This is basically row compression for nchar and nvarchar datatypes.  Per BOL, SQL Server uses the Standard Compression Scheme for Unicode (SCSU) algorithm to achieve up to 50% space savings on some European languages, less so with Asian languages but still respectable.  If you really want to know more about what goes on behind the scenes in Unicode compression, start with the Wikipedia entry for the SCSU algorithm and go from there.
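There’s no separate switch to flip, either — per BOL, Unicode compression is applied automatically to eligible nchar/nvarchar(n) columns whenever row or page compression is enabled on the object.  The table name here is just a made-up example:

-- Hypothetical table: enabling row compression also Unicode-compresses its nchar/nvarchar(n) columns
ALTER TABLE dbo.CustomerNotes REBUILD WITH (DATA_COMPRESSION = ROW);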

Backup Compression

The compression feature that I, personally, have made the most use of thus far would have to be backup compression.  Not only has this greatly reduced the size of my database backups, but it has also increased backup speed (less I/O).  The actual compression ratio is dependent on the data being backed up, whether that data is encrypted (this will not compress as much), and whether the database itself is already compressed.

Like the other types of compression, backup compression does use more CPU, but since we’re usually taking backups during quieter hours, I’ve not noticed any impact on performance.  If performance is a concern, you can always use resource governor to throttle how much CPU the backup process can use.
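If you did want to rein it in, a rough Resource Governor setup might look like the sketch below.  The login name is hypothetical, and note that MAX_CPU_PERCENT only throttles when there’s actual CPU contention:

USE master;
GO
-- Pool and workload group for backup activity
CREATE RESOURCE POOL BackupPool WITH (MAX_CPU_PERCENT = 20);
CREATE WORKLOAD GROUP BackupGroup USING BackupPool;
GO
-- Classifier: route the (hypothetical) backup login into BackupGroup
CREATE FUNCTION dbo.fn_backup_classifier() RETURNS sysname
WITH SCHEMABINDING
AS
BEGIN
    DECLARE @grp sysname;
    SET @grp = 'default';
    IF SUSER_SNAME() = 'backup_svc'   -- hypothetical login used by the backup jobs
        SET @grp = 'BackupGroup';
    RETURN @grp;
END;
GO
ALTER RESOURCE GOVERNOR WITH (CLASSIFIER_FUNCTION = dbo.fn_backup_classifier);
ALTER RESOURCE GOVERNOR RECONFIGURE;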

Backup compression can be used on an ad-hoc basis using the WITH COMPRESSION option of the BACKUP DATABASE command or it can be set globally using sp_configure (“EXEC sp_configure ‘backup compression default’, ‘1’;”).  If you set the compression default at the server level, this can always be overridden at the time of the backup.
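For reference, here’s what both approaches look like (the database name and backup paths are just examples):

-- One-off compressed backup
BACKUP DATABASE AdventureWorks
TO DISK = N'D:\backups\AdventureWorks.bak'   -- example path
WITH COMPRESSION, INIT;

-- Make compression the server-wide default...
EXEC sp_configure 'backup compression default', 1;
RECONFIGURE;

-- ...and override it for a single backup if needed
BACKUP DATABASE AdventureWorks
TO DISK = N'D:\backups\AdventureWorks_uncompressed.bak'
WITH NO_COMPRESSION, INIT;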

Compression Considerations

Now before you run off and start compressing everything, there are some things you might want to keep in mind.

  • Performance – Nothing in life is free and this is no exception.  As I’ve mentioned before, the trade-off for disk savings is increased CPU usage.  If you’re already processor-bound, you might want to think twice before implementing data compression.
  • Edition – Compression is currently available only in the Enterprise and Developer editions.  A compressed database cannot be restored to a Standard Edition instance.
  • Partitioning – You can compress (or not compress) each partition individually, and each partition can use a different type of compression.
  • Indexes – Clustered indexes inherit the compression property of the table, non-clustered indexes do not.  If you change the compression settings on a heap, all non-clustered indexes will be rebuilt.
  • Replication – By default, the initial snapshot will contain the same compression settings as the published object(s).  You can change the replication defaults to control whether compression settings are replicated.  If you try to replicate a compressed table to a Standard edition database, replication will fail.
  • Monitoring – You can monitor compression via the Page compression attempts/sec and Pages compressed/sec counters in the Access Methods object in Perfmon, or via the query shown after this list.
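Here’s a quick way to grab those same counters from T-SQL.  Keep in mind the values are cumulative since the last restart, so sample them twice and diff if you want an actual rate:

SELECT [object_name], counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE [object_name] LIKE '%Access Methods%'
  AND counter_name IN ('Page compression attempts/sec', 'Pages compressed/sec');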

May 23, 2011 Posted by | Features | , , , | Leave a comment

Page compression (!)

Ok kids, last time I introduced row compression; today I want to take the next step and talk about page compression.  Now, whereas row compression removes NULL and 0 values from each individual record, page compression removes redundant values from the page.  Redundant data is stored on the page only once, and subsequent occurrences reference this single value.  So you can probably imagine the kind of space you could save in a table with many duplicate values.

So, how does it work?  Glad you asked!

Row Compression

The first step in the page compression process is something we’re already familiar with:  row compression.  Makes sense, right?  After all, if you’re trying to compress the data on a given page as much as possible, you should probably start by making each row as small as possible.

Prefix Compression

The second step in page compression is something called prefix compression.  What the who?  SQL Server looks at the beginning bytes of each column on the page and sees if any of the values in that column share a common pattern.  If they do, SQL Server takes that byte string (the prefix) and stores it at the beginning of the page, right after the page header.  Any instances of the prefix in that column are then replaced with a reference to it.  Any given prefix is local to a specific column, so a value in columnB can’t reference a prefix for columnA.  For this reason, prefix compression is sometimes referred to as column-prefix compression.  I think a picture would help illustrate the concept, so here’s one I shamelessly “borrowed” from BOL.

Take a look at the first record before and after compression.  That “4b” means the row should contain the first 4 bytes of the referenced prefix, plus the letter b.  Also note how each value is only referencing a prefix in its own column.  Got it?  Good.  If not, I think you will once you see the next step.  Onwards!

Dictionary Compression

The third, and final, step in page compression is dictionary compression.  Dictionary compression works much like prefix compression, except that the stored values can be referenced by any column on the page.  So a given dictionary entry can be referenced by a value in columnA or columnB, just as “4b” is referenced by two different columns below.

And that, my friends, is page compression.  Page compression can be enabled when the table is created, in which case rows are compressed as they are inserted, and the page itself is compressed once it fills up and a new row arrives.  At that point, if the new row fits on the compressed page, it’s placed there; if not, a new page is created and the row is written there.  Additionally, you can enable page compression on an existing table using the ALTER TABLE statement (quick examples below).  Just as with row compression, though, this can be quite a resource-consuming process.  And I keep saying “table,” but in reality compression can be used on heaps, clustered indexes, non-clustered indexes, table and index partitions, and indexed views.
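To put some syntax behind all that, here are the usual variations — the table and index names are made up for the example:

-- New table, page compressed from the start
CREATE TABLE dbo.SalesHistory
(
    SaleID   int      NOT NULL,
    SaleDate datetime NOT NULL,
    Amount   money    NOT NULL
)
WITH (DATA_COMPRESSION = PAGE);

-- Non-clustered indexes don't inherit the table's setting, so be explicit
CREATE NONCLUSTERED INDEX IX_SalesHistory_SaleDate
ON dbo.SalesHistory (SaleDate)
WITH (DATA_COMPRESSION = PAGE);

-- Enabling page compression on an existing table or index
ALTER TABLE dbo.SalesHistory REBUILD WITH (DATA_COMPRESSION = PAGE);
ALTER INDEX IX_SalesHistory_SaleDate ON dbo.SalesHistory
    REBUILD WITH (DATA_COMPRESSION = PAGE);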

In my next post I’ll talk briefly about Unicode compression, Backup compression, and discuss some of the things you might want to keep in mind when using compression.  Until next time, folks…

May 19, 2011 Posted by | Features | , | Leave a comment

Row compression

It’s said that there are 2 things you can count on in life: death and taxes. For DBAs, I’d like to add a third: database growth. Let’s face it, our need to store more and more data is only increasing and so is our need to retain this data for longer periods of time. And when you add those two trends together, you get a database that just keeps getting bigger. Kinda like the blob.

SQL Server 2008 introduced a few new compression features to help alleviate some of the growth problem: backup compression, and two types of data compression, row and page. Today I’m going to briefly discuss row compression.

Row compression is pretty simple, really. In a nutshell, SQL Server takes fixed-length data and stores it in a variable-length storage format. Blank characters aren’t stored, and neither are NULL and 0 values. It doesn’t change the definition of the table or columns, so there’s no impact to your application. Let’s take a look at an example using the AdventureWorks database.

First we’ll check out the physical stats for the Purchasing.PurchaseOrderDetail table.

USE AdventureWorks;
GO

select index_id, record_count, page_count, avg_page_space_used_in_percent, min_record_size_in_bytes, max_record_size_in_bytes, avg_record_size_in_bytes, compressed_page_count
from sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('Purchasing.PurchaseOrderDetail'), NULL , NULL, 'DETAILED')
where index_level = 0;
GO

In particular, note the 3 columns dealing with record size: min_record_size_in_bytes, max_record_size_in_bytes, and avg_record_size_in_bytes. We can also look at the table properties.

Again, note the data space value.

Now let’s enable row compression on this table and re-run our query on sys.dm_db_index_physical_stats.

ALTER TABLE Purchasing.PurchaseOrderDetail REBUILD WITH (DATA_COMPRESSION = ROW)
GO

select index_id, record_count, page_count, avg_page_space_used_in_percent, min_record_size_in_bytes, max_record_size_in_bytes, avg_record_size_in_bytes, compressed_page_count
from sys.dm_db_index_physical_stats (DB_ID(), OBJECT_ID('Purchasing.PurchaseOrderDetail'), NULL , NULL, 'DETAILED')
where index_level = 0;
GO

Note the change in the record size columns. And if we look at the table properties:

Not too shabby. Now, for a small table like this, the alter table took no time at all. But since this process is actually restructuring data pages, it can be quite expensive and time-consuming on a large table. So Microsoft provided a stored procedure that allows you to estimate your space savings before you commit.

EXEC sp_estimate_data_compression_savings 'Sales', 'SalesOrderDetail', NULL, NULL, 'ROW';

In my next post, I’ll go over page compression.

P.S. – I really need to work on more creative and exciting post titles.  Maybe “Row Compression!!!”  Everything’s more exciting with exclamation points!!!

P.P.S – Portland was fantastic!!!  (<– see?  exciting.)

May 17, 2011 Posted by | Features | , | Leave a comment

The Rose City

I’m off to Portland, Oregon for a few days for some much-needed R&R.  I’m *so* looking forward to some of this

Columbia River Gorge

A little of this

And, yes, a bit of this

Oh!  and I have to try one of these!

Voodoo Doughnuts

‘Til next week, folks…

May 9, 2011 Posted by | Uncategorized | | 1 Comment

Scripting out a database

Switching gears a little bit, much of my attention this week has revolved around backups. We have several large databases that have gotten to the point where full backups every night have become unwieldy. So I’ve been redesigning the maintenance plans for them around a full backup once a week and differentials nightly. So far it’s working out very well, saving us a lot of time, disk, and tape. Woot!
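If you’re not a maintenance plan person, the T-SQL equivalent of that schedule boils down to two commands (database name and paths here are just examples):

-- Weekly full backup (e.g. Sunday night)
BACKUP DATABASE BigDB
TO DISK = N'\\backupserver\sql\BigDB_full.bak'
WITH INIT;

-- Nightly differential the rest of the week
BACKUP DATABASE BigDB
TO DISK = N'\\backupserver\sql\BigDB_diff.bak'
WITH DIFFERENTIAL, INIT;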

While I was thinking about backups in general, I also started thinking about a question that’s come up before: how to script out all the database objects in an automated fashion. Whether it’s so you can create a second “empty shell” copy of your database, or so you can restore a single stored procedure that you accidentally dropped, or just because it’s a development environment and you don’t care about the data so much as the object definitions. (Or because it’s just cool.)  Whatever the reason, I personally think this is something that could come in handy, and I hadn’t come across any way in SQL Server to automate the scripting of a database.

So I did some searching and found a few third party tools (nah), and some options that revolved around VB or C# or something (yeah, have I mentioned I’m not a programmer?). Then I came across a few partial solutions using PowerShell. Now, again, me != programmer, but I can handle a little bit of PowerShell. They were pretty straightforward, but they weren’t quite… enough. So I took a little of each, threw in a few touches of my own, and voila, a solution that I’m pretty darn pleased with. Can it be improved upon? I’m fairly confident it can be. And probably will be. But in the meantime it gets the job done. And it can be scheduled either through a database job or via the Windows scheduler, which was the whole point.  And so, without further ado, I present… the script… (ooooooo!)

# SQLExportDBSchema.ps1
# Usage -
# From PS:  .\SQLExportDBSchema.ps1 "[output location]" "[Server]" "[Database]"
# From CMD: powershell -command "& 'D:\powershell\SQLExportDBSchema.ps1' "[output location]" "[Server]" "[Database]" "

param(
	[string]$filepath=$null,
	[string]$srv=$null,
	[string]$database=$null
	)

[System.Reflection.Assembly]::LoadWithPartialName('Microsoft.SqlServer.SMO') | out-null

$filepath = $filepath+"\"

$s = new-object ('Microsoft.SqlServer.Management.Smo.Server') $srv

$db=$s.Databases[$database]

$scrp = new-object ('Microsoft.SqlServer.Management.Smo.Scripter') ($s)

$scrp.Options.AppendToFile = $False
$scrp.Options.ClusteredIndexes = $True
$scrp.Options.DriAll = $True
$scrp.Options.ScriptDrops = $False
$scrp.Options.IncludeHeaders = $True
$scrp.Options.IncludeIfNotExists = $True
$scrp.Options.ToFileOnly = $True
$scrp.Options.Indexes = $True
$scrp.Options.Triggers = $True
$scrp.Options.Permissions = $True

#CREATE DATABASE
$outfile =$filepath+$database+"_create.sql"
$scrp.Options.FileName = $outfile
$scrp.Script($db)

#SECURITY
$outfile = $filepath+$database+"_security.sql"
$scrp.Options.FileName = $outfile
$scrp.Options.IncludeDatabaseRoleMemberships = $true
$scrp.Script($db.Roles)
$scrp.Options.AppendToFile = $True
$scrp.Script($db.ApplicationRoles)
$scrp.Script($db.Users)
$scrp.Script($db.Schemas)

$scrp.Options.AppendToFile = $False
$scrp.Options.WithDependencies = $True

#PARTITIONING
$outfile = $filepath+$database+"_partitioning.sql"
$scrp.Options.FileName = $outfile
$scrp.Script($db.PartitionFunctions)
$scrp.Options.AppendToFile = $True
$scrp.Script($db.PartitionSchemes)

$scrp.Options.AppendToFile = $False

#TABLES
$outfile = $filepath+$database+"_tables.sql"
$scrp.Options.FileName = $outfile
$scrp.Script($db.Tables)

#DDLTRIGGERS
$outfile = $filepath+$database+"_triggers.sql"
$scrp.Options.FileName = $outfile
$scrp.Script($db.Triggers)

#VIEWS
$outfile = $filepath+$database+"_views.sql"
$scrp.Options.FileName = $outfile
$views = $db.Views | where {$_.IsSystemObject -eq $false}
$scrp.Script($views)

#FUNCTIONS
$outfile = $filepath+$database+"_functions.sql"
$scrp.Options.FileName = $outfile
$functions = $db.UserDefinedFunctions | where {$_.IsSystemObject -eq $false}
$scrp.Script($functions)

#STORED PROCS
$outfile = $filepath+$database+"_storedProcs.sql"
$scrp.Options.FileName = $outfile
$storedprocs = $db.StoredProcedures | where {$_.IsSystemObject -eq $false}
$scrp.Script($storedprocs)

#SYNONYMS
$outfile = $filepath+$database+"_synonyms.sql"
$scrp.Options.FileName = $outfile
$scrp.Script($db.Synonyms)

May 5, 2011 Posted by | Features, General, Maintenance | , , | 3 Comments

How is SQL Server using all that memory, anyway?

I want to stick with the subject of memory because I think it’s probably the most misunderstood (and sometimes downright mysterious) component of SQL Server. I know I, for one, could benefit from a better understanding of its inner workings. So today I’d like to share that little bit of code I mentioned last week, which I find very handy for getting an overall picture of how memory is being used by a SQL instance. Personally I run it as a complete script, but I’ll break it down here and provide a little description of each section. I should say, too, that I didn’t write any of this from scratch. For most of it I started with code found here, here, and here, and modified it to suit.

So, without further ado, let’s get started. The first section provides a big-picture look at how SQL is currently allocating memory between the database page cache, procedure cache, and miscellaneous memory usage. It also provides the total memory usage, and I added in the maximum available workspace memory.

--HOW MUCH MEMORY IS MY SQL SERVER USING?

declare @plan_cache_size float, @obj_data_size float , @avail_workspace_size float

-- plan cache size

select @plan_cache_size = (cntr_value*8)/1024.0 FROM sys.dm_os_performance_counters
	WHERE object_name like '%Plan Cache%' and counter_name = 'Cache Pages' and instance_name = '_Total'

-- DATABASE PAGE CACHE (PAGES, INCLUDING INDEXES)

select @obj_data_size = (cntr_value*8)/1024.0  FROM sys.dm_os_performance_counters
	WHERE object_name like '%Buffer Manager%' and counter_name like 'Database pages%'

-- Maximum workspace available for sorts, hashes, etc

select @avail_workspace_size = (cntr_value)/1024.0 FROM sys.dm_os_performance_counters
	WHERE object_name like '%Memory Manager%' and counter_name = 'Maximum Workspace Memory (KB)'

-- CACHE TOTALS
select   @obj_data_size [Database_Page_Cache_MB]
 , @plan_cache_size [Procedure_Cache_MB]
 , [Misc_Memory_Usage_MB]
 , [Misc_Memory_Usage_MB] + @obj_data_size + @plan_cache_size [Total_Memory_Usage_MB]
 , @avail_workspace_size [Maximum_Workspace_MB]
from
(
 select sum(cntr_value)/1024.0 [Misc_Memory_Usage_MB]
 from sys.dm_os_performance_counters
 where object_name like '%memory%'
 and (
   counter_name like '%Connection Memory (KB)%'
   or
   counter_name like '%Granted Workspace Memory (KB)%'
   or
   counter_name like '%Lock Memory (KB)%'
   or
   counter_name like '%Optimizer Memory (KB)%'
   or
   counter_name like '%SQL Cache Memory (KB)%'
 )
) x
GO

The output:

The next section uses sys.dm_os_memory_cache_counters to examine how the procedure cache is being broken down.

-- How the Procedure cache is being used

SELECT  TOP 6
	LEFT([name], 20) as [name],
	LEFT([type], 20) as [type],
	([single_pages_kb] + [multi_pages_kb])/1024 AS cache_mb,
	[entries_count]
FROM sys.dm_os_memory_cache_counters
order by single_pages_kb + multi_pages_kb DESC

The output:

The third part of the script goes deeper into the procedure cache and displays the top 25 cached plans by size. Because it displays some application code, I’m not going to include the results here.

-- Top cached plans by size

select top 25
cacheobjType
, ObjType
, (pagesUsed * 8192)/1024.0/1024.0 [space_used_MB]
, db_name(dbid) [database_name]
, object_name(objid, dbid) [object_name]
, [sql]
from master..syscacheobjects (nolock)
order by pagesUsed desc
GO

And, finally, the last section shows us how much of the buffer pool is being utilized by each database, including the dirty and clean page counts, sorted by total memory.

-- Buffer Pool Memory Per DB

SELECT
(CASE WHEN ([database_id] = 32767) THEN 'Resource Database' ELSE DB_NAME (database_id) END) AS 'Database Name',
SUM(CASE WHEN ([is_modified] = 1) THEN 1 ELSE 0 END) AS DirtyPageCount,
SUM(CASE WHEN ([is_modified] = 1) THEN 0 ELSE 1 END) AS CleanPageCount,
count(*) AS TotalPageCount,
cast(count(*) * 8192.0 / (1024.0 * 1024.0) as decimal(8,2)) as BufferPoolMB
FROM sys.dm_os_buffer_descriptors
GROUP BY database_id
ORDER BY TotalPageCount desc
GO

And the output:

And there you have it. I hope you find this useful!

May 2, 2011 Posted by | General, SQL Tuning, Troubleshooting | , , , | 1 Comment