Twitter

Follow SQLQuill on Twitter

Subscribe

MN Pass

TSQL Tip O’ the Week – 8/30/11 – Almost a Reboot

This weeks tip:

Anytime anyone runs any command on a SQL server, it changes the conditions of that SQL server.  Execution plans get created, different objects are stored in memory, etc.  I don’t like to use time as a performance tuning metric (I’m a big fan of logical reads), but most people do – especially executives, so when measuring execution time, it’s important to compare apples to apples.

Below is a script, which I call “Almost a Reboot” as it’s almost like rebooting server, that I run on a server anytime I’m tesing performance. I got the initial script from Kimberly Tripp (Blog/Twitter) at DevConnections in about 2004 – I’ve tweaked it when SQL2005 came out and it’s probably due for a few changes for SQL2008, so don’t be shy to offer suggestions.

My steps for performance tuning any SQL code are:

  • Run this script
  • Run the original code
  • Record the results
  • Make my change
  • Run this script again
  • Run the changed code
  • Record/Compare the results

--Almost a reboot--
DECLARE @DBName Varchar(255)
Set @DBName = <<SET DBNAME HERE>> --Fill this in to the database you want to flush

CHECKPOINT --writes dirty pages to disk
DBCC FREEPROCCACHE --clears entire plan cache
DBCC DROPCLEANBUFFERS --clears all clean data cache pages

DECLARE @intDBID INTEGER
SET @intDBID = (SELECT dbid FROM master.dbo.sysdatabases WHERE name = @DBName)

--Recompile the stored procedures
DBCC FLUSHPROCINDB (@intDBID) --clears all clean plan cache for specified database

Again, let me know if you have any good tips to update this.

Here are some related posts from the intraweb:

Previous Tips O’ the Week

TSQL Looping Stats

Every once in a while, you need to loop through a bunch of records in TSQL.  It’s not the fastest way to do things, but sometimes it’s necessary.  I’ve been working on a process that requires this and I needed to know the progress of the loop, so I came up with this model.  The output will give you a bunch of statistics including a predicted finishing time every 1000 rows (note you can adjust this).  This should save me from getting some grey hairs when a looping process is necessary.

Declare @counter int,
		@currentRow int,
		@rowsFound int,
		@start datetime,
		@SecSoFar bigint,
		@PredictedTotalSec int,
		@PercentComplete varchar(10),
		@CompleteTime datetime

Set @rowsFound = 200000 -- or set this equal to the count of a table you are working with.

Set @currentRow = 1
Set @start = GETDATE()

while @currentRow <=  @rowsFound
Begin
		--<<DO YOUR WORK HERE>>--

			Set @currentRow = @currentRow + 1  ---Update the counter....
			If (@currentRow % 1000) = 0  --display every 1000 rows or adjust...
			Begin
				Set @SecSoFar = Datediff(s, @start, Getdate())
					Set @PercentComplete = Cast(ROUND(@currentRow/Cast(@rowsFound as float), 3)*100 as varchar(20)) + '%'
					Set @PredictedTotalSec = (@SecSoFar/Cast(@currentRow as float))*@rowsFound
					Set @CompleteTime = Dateadd(s, @PredictedTotalSec, @start)

				Select
					@currentRow as 'Current_Row',
					@rowsFound as 'Total_Rows',
					@start as 'Start_Time',
					GETDATE() as 'Current_Time',
					@SecSoFar as 'SecondsSoFar',
					@PercentComplete as 'Percent_Complete',
					@PredictedTotalSec as 'Predicted_Total_Seconds',
					@CompleteTime as 'Predicted_Complete_Time'

			End
End

Troubleshooting Troubleshooting – sys.dm_exec_query_stats

The Setup:

Our current deploy process runs a big script that adds columns, tables, indexes, constraints, etc.  It performs DDL changes.  It even migrates data to the new tables and columns.  So it also performs DML changes.  One issue we run into is that in our development environment we have a deploy everyday (an incremental build).  In production, we only do it once – officially.  We were working on our staging environment, going for the non-incremental deploy and it was taking a long time – what should take 10 minutes, we stopped at 3 hours.

The Statement:

Looking into the deploy issue I was querying the sys.dm_exec_query_stats DMV and only saw the DML statements.  Digging into it, the DMV doesn’t collect stats for the DDL statements.

The Proof:

To prove this, here’s what I did – first – almost a reboot (cleared the cache, clean buffers, etc):

CHECKPOINT        --writes dirty pages to disk
DBCC FREEPROCCACHE    --clears entire plan cache
DBCC DROPCLEANBUFFERS    --clears all clean data cache pages

--from MS page on sys.dm_exec_query_stats

SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

Here are the results:

Res_1

Then I created a test table and reran the query on sys.dm_exec_query_stats:

Create table test (iid int)
Go
SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

Here are the results:

Res_2

To Prove this is legit, I ran a select statement on the new table and then reran the query on sys.dm_exec_query_stats:

Select * from test
Go

SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

The Select statement on the table and the DMV showes up, but not the create table statement.

Res_3

The same thing happened with adding an index, here’s the script (first almost reboot, then prove there is nothing in the results of the query to dm_exec_query_stats, add the index and rerun the query to dm_exec_query_stats):

CHECKPOINT        --writes dirty pages to disk
DBCC FREEPROCCACHE    --clears entire plan cache
DBCC DROPCLEANBUFFERS    --clears all clean data cache pages

SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;
Go
CREATE UNIQUE CLUSTERED INDEX [UC_Test_iID] ON [dbo].[test]
([iid] ASC) WITH (STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

Here’s the results:

Res_4

In conclusion, remember the fact that sys.dm_exec_query_stats DMV only shows DML statements, not DDL statements!

Code Reviews – What good are they for?

  1. All database changes go through me. 
  2. Do or Review, do or review.

These have been my catch phrases since taking on the lead SQL developer role at my company, on my team.  There have many changes that I’ve had to review – many, many changes, most of them are fine, but occasionally I get a real head scratcher.

Things that I’ve caught reviewing code in the last three weeks (note these are also justifications I’m taking to my boss for more help):

  • VARCHAR(MAX) column – we avoid these for online reindexing
  • A cursor – we avoid these for performance sake – it also didn’t need it
  • Missing comments and Headers in stored procedures – we’ve got a standard header we use
  • An infinite loop – a testing procedure that uses service broker that would have been looping forever.
  • A date that should be UTC Date, but Getdate() was used

Is this good for the company?  This is the question I’m asking myself…all the time actually.  Running the numbers, how much time and money has my reviews saved the company?

For the next six months a guess would be:

varchar(MAX) column 

  • 10 hours of operational DBA/contractor time to write exception statements and meet on how to handle the online re-indexing
  • 10 hours development DBA time to migrate the data on the next release
  • 5 hours testing time

The cursor

  • 10 hours development DBA time to rewrite and test the change
  • 5 hours testing time

Missing Comments and Headers in stored procedures

  • 5 hours developer time

An infinite loop

  • 1 hour to troubleshoot and fix on release night with 30 people waiting around – 30 work hours total

A Date that should be UCT instead of Server time

  • 2 hours development DBA time to locate and fix

48 work hours total for about that the same amount of time reviewing code. 

Note these are just estimates, so I may be way off, but the old saying, “Pay me now or pay me later” really comes into play here.

There are a quite a few intangible benefits as well:

  • I know what’s going on with the changes and I’ve tried to communicate to the other DBAs the changes coming.  That should save lots of time for others
  • I have got to know a lot of developers who I normally wouldn’t talk to.
  • I’ve taught a few developers a few things about SQL.  Hopefully this will save tons of time down the road.
  • I’ve showed this site off to a fairly new SQL guy and he’s used the generate random data script a lot and it’s saved him some testing time.

 

Milt

Big Trys and Small Trans

triceps_0 tran

The catch phrase around the office these days is “Big Trys and small Trans.”  This came about as we were review stored procedure changes and I two suggestions.  First, that the entire procedure has a try…catch around it and second that the transaction be as small as possible.

Big Try

We’ve got standards in place that procedures have a try…catch in them.  One procedure had been copied, then modified with some logic to parse XML before the try.  I said lets move the try to above the XML parsing in case there’s an error in the XML, we’ll catch that as well.  So now we’ve got a bigger try.

Small Tran

The original code had the following steps:

  1. Start the transaction
  2. Look up some data
  3. Insert some data into a table
  4. Look up some other data
  5. Update another table
  6. Finally completed the transaction

In order to keep transactions small, I suggested the following:

  1. Look up all the data
  2. Start the transaction
  3. Insert the row into the table
  4. Update the row in the other table
  5. Finish the transaction

This will help avoid deadlocks and contention and keep things running smoothly.  There’s the smaller tran.

If you stick with the “Big Trys and Small Trans” methodology, it should help the following:

  1. Help avoid contention
  2. Help avoid deadlocks
  3. Catch errors earlier

SQL Performance Trouble Shooting – the basics from Microsoft

  1. Check Activity Monitor
    1. Anything running?
    2. Blocking/wait times
    3. Goto Locks and check lock strategy
  2. SQL Error Log
    1. Any errors? If so investigate
  3. Resource utilizations
    1. DMVs
      1. sys.dm_exec_query_stats
        1. Look at: Total physical reads, logical reads, number to executions, total time, (divide those two to get average)
        2. Join with sys.dm_exec_sql_text
  4. Performance Monitors
    1. Processor Time
    2. Memory
    3. SQL Server Statistics
      1. Batch requests/sec
      2. Compilations
      3. Recompilations – high CPU time (use fully qualified name to avoid recompiatoins)
    4. SQL Wait stats
    5. SQLServer Memory Manager
    6. Physical Disk – Average Disk Read and write – Less than 10 ms on SAN
  5. Trace files – created default
  6. Event Viewer

Get Fairly Accurate Table Counts

This is the fast way to get a count of all the data in the tables in a database…

SELECT
[TableName] = so.name,
[RowCount] = MAX(si.rows)
FROM sysobjects so, sysindexes si
WHERE so.xtype = ‘U’
AND si.id = OBJECT_ID(so.name)
GROUP BY so.name
ORDER BY 2 DESC

Weekly Challenges 4/25/08

Here’s this week’s weekly challenges….

  1. New release went out last Friday, 4/18/08 – lots and lots of SQL servers
    1. Two SQL 2005 Clustered 64 bit
    2. Two SQL 2005 64 bit Log Shipping servers
    3. Three SQL 2005 Utility servers
    4. Eight SQL 2005 Express servers
    5. One SQL 2000 utility server
  2. I spent most of the week in in the “War” room with a consultant from Microsoft.
    1. High CPU, slower performance issue on the biggest database
    2. We updated statistics with full scan and things were under control.
    3. Here’s some information on:
      1. Statistics Used by the Query Optimizer in Microsoft SQL Server 2005
      2. sys.dm_db_missing_index_group_stats
  3. Contiunied to get the performance testing environment up to speed and level set with production – all 16 SQL servers listed above are replicated in our performance testing environment.

Tuning sys.dm_db_index_physical_stats

We had a report that was using sys.dm_db_index_physical_stats to find partition progress information. It was taking 45 minutes reguardless of the data in the results. We started looking at the data it was reading. The slow down was around two tables with LOB_Data in them. One table had about 14 million rows in it and has one column with a data type of XML and the other table has about a million rows and has a column with a data type of text. When SQL 2005 tries to run dm_db_index_physical_stats, it takes a long time due to these large column. Thus we changed the mode from LIMITED to SAMPLED which cut the time down from about 45 minutes to 2.5 minutes for those two tables.