Twitter

Follow SQLQuill on Twitter

Subscribe

MN Pass

SQL Auto Converts True or False to a bit

The title says it all, but here’s the story.  The other day I was modifying a procedure and I didn’t know if the application was going to pass in True/False or 1/0 for a bit column.  Being lazy, I created my procedure with a 1 or 0 and figured I’d update it later if the app was using True/False.  The App was passing in a True/False, but I didn’t need to update my procedure because SQL auto converts it.  More info can be found here.  It’s pretty basic, but it save me some time, so I figured I’d remind people.  Here’s an example to show it:

Create table Test (IDCol int identity(1,1),
     TestBit bit)

Insert Test Values(0)
Insert Test Values(1)

Select * from test

Insert Test Values('False')
Insert Test Values('True')

Select * from test

Here’s an XML example:

Declare @XMLTest xml
Set @XMLTest = '<Test>
     <BitTest>False</BitTest>
     </Test>
     <Test>
     <BitTest>True</BitTest>
     </Test>'

Select x.x.value('BitTest[1]', 'bit')
from @XMLTest.nodes('/Test') as x(x)

How To: Set the Default Scripting Options in SQL Server Management Studio

I just spent 10 minutes looking for this, it’s very simple and fairly obvious.  I’m posting it so I don’t forget it and hopefully it may help others.

I’ve got to script out all tables, indexes, primary keys, foreign keys, defaults and the collation of all tables in all of our databases.  They all have to be the same style and the options I have to choose are slightly different than the defaults.  To make fast work of this, I’ll change the defaults to be exactly what I want then from my Central Management Server, I script each and every database I need.

Here’s how to set the defaults:

In SQL Server Management Studio, go to Tools –> Options.  Expand the SQL Server Object Explorer list and select Scripting.  Change what you want, close this window and Ta-Da the next time you generate scripts, the defaults are set.

 Tools --> Options

BINARY_CHECKSUM and CHECKSUM_AGG – Combo platter

Recently I got an email from a tester claiming that the IDs of values in static data tables are different in one database than they are in another database.  To check this out, I used a combo of the HandsBINARY_CHECKSUM and CHECKSUM_AGG.  Here’s a breakdown of each:
  • The BINARY_CHECKSUM function returns a checksum value for a row of a table or over a list of expressions.
    • A value is computed for each row, this makes it easy to find rows that are different.
    • This function returns a group of values.
    • Will return a different value for most, but not all, changes to the row, and can be used to detect most row modifications.
  • The CHECKSUM_AGG function returns a checksum of the values in a group.
    • If one of the values in the expression list changes, the checksum of the list also generally changes. However, there is a small chance that the checksum will not change.

So, I figured that I could first find the BINARY_CHECKSUM of each row and then find the CHECKSUM_AGG of all the BINARY_CHECKSUMs. If the tables were different, most of the time (as noted in bold above) these values would be different.

There are two situations that I find this very useful for (I’ll post about these later):

  1. Static data  – data that needs to be the same from one environment to another
  2. Data migration – verifying all the columns get migrated correctly.

Note – there are a few data types that BINARY_CHECKSUM does not support.  They are text, ntext, image, cursor, xml, and noncomparable CLR user-defined types.  In my experience, some of the time update dates and insert dates are not important as well as timestamp data types, so I usually ignore these columns.

So to investigate the tester’s claims, I ran the following on each database:

Select CHECKSUM_AGG(BINARY_CHECKSUM(Col1, Col2, Col3)) from Table1

The values were different, so I removed Col1 from the criteria and ran the following on each database and the values were the same.

Select CHECKSUM_AGG(BINARY_CHECKSUM(Col2, Col3)) from Table1

Therefore, Col2 and Col3 had the same values for both Col2 and Col3 **most likely** but Col1 definitely had different values.  The tester was right.

I then removed the CHECKSUM_AGG to find and then compare the values of each row to find the ones that didn’t match.

Select Col1,
BINARY_CHECKSUM(Col1, Col2, Col3)
from Table1
Order by Col1

Here’s an example that shows that this method doesn’t always work – note that the same distinct values are in each table, but each row is not the same.

CREATE TABLE [dbo].[table1](
[col1] [int],
[col2] [varchar](200))

CREATE TABLE [dbo].[table2](
[col1] [int],
[col2] [varchar](200))

Insert table1 Values
(1, 'Row #1 in Table1, Row #2 in Table2'),
(2, 'Row #2 in Table1, Row #1 in Table2')

Insert table2 Values
(2, 'Row #1 in Table1, Row #2 in Table2'),
(1, 'Row #2 in Table1, Row #1 in Table2')

---These are equal
Select CHECKSUM_AGG(BINARY_CHECKSUM(col1, col2)) from table1
Select CHECKSUM_AGG(BINARY_CHECKSUM(col1, col2)) from table2

---These are NOT equal
Select BINARY_CHECKSUM(col1, col2) from table1
Select BINARY_CHECKSUM(col1, col2) from table2

Despite the small chance of a checksum being equal, this method makes it very easy to quickly identify data inconsistencies in different databases or environments.  If you’re ever need to quickly find inconsistent data, give it a try.

The Official SQLFAQ High Level Overview of Data Migration Testing

I’ve been tasked with the SQL portion of our data migration and conversion testing for our big release we’ve got coming up. What’s that mean?  Well, we’ve got an upgrade coming to the application we support.  We’ve got 15 databases on 7 servers.  All of them have database schema changes coming, data moving around from one table to another, new tables getting created, old ones getting dropped, new columns populated from other databases, etc.  I’ve got to make sure all the data that moves makes it there and is correct.  Here was how I attacked it:

The High Level Overview:

  1. Script production databases from our log shipping servers
  2. Create two copies of each database on a test server(s)
  3. Upgrade one database of each set
  4. Compare schema on Pre and Post upgraded databases
  5. List out all the diffences
  6. Catagorize how each new column or table is populated
  7. Ignore the new tables that are populated on use or populated with defaults – no data migration
  8. Create Checksum and row count scripts for all the migrated data
  9. Run the scripts and record the results on our Prod-Stage environment before the upgrade
  10. Upgrade Prod-Stage databases
  11. Run the scripts and compare the results to the Pre-upgrade.
  12. Address any issues

This approached seemed to work well, so my plan is to write a post on a bunch of these individual steps, so stay tuned.

Troubleshooting Troubleshooting – sys.dm_exec_query_stats

The Setup:

Our current deploy process runs a big script that adds columns, tables, indexes, constraints, etc.  It performs DDL changes.  It even migrates data to the new tables and columns.  So it also performs DML changes.  One issue we run into is that in our development environment we have a deploy everyday (an incremental build).  In production, we only do it once – officially.  We were working on our staging environment, going for the non-incremental deploy and it was taking a long time – what should take 10 minutes, we stopped at 3 hours.

The Statement:

Looking into the deploy issue I was querying the sys.dm_exec_query_stats DMV and only saw the DML statements.  Digging into it, the DMV doesn’t collect stats for the DDL statements.

The Proof:

To prove this, here’s what I did – first – almost a reboot (cleared the cache, clean buffers, etc):

CHECKPOINT        --writes dirty pages to disk
DBCC FREEPROCCACHE    --clears entire plan cache
DBCC DROPCLEANBUFFERS    --clears all clean data cache pages

--from MS page on sys.dm_exec_query_stats

SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

Here are the results:

Res_1

Then I created a test table and reran the query on sys.dm_exec_query_stats:

Create table test (iid int)
Go
SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

Here are the results:

Res_2

To Prove this is legit, I ran a select statement on the new table and then reran the query on sys.dm_exec_query_stats:

Select * from test
Go

SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

The Select statement on the table and the DMV showes up, but not the create table statement.

Res_3

The same thing happened with adding an index, here’s the script (first almost reboot, then prove there is nothing in the results of the query to dm_exec_query_stats, add the index and rerun the query to dm_exec_query_stats):

CHECKPOINT        --writes dirty pages to disk
DBCC FREEPROCCACHE    --clears entire plan cache
DBCC DROPCLEANBUFFERS    --clears all clean data cache pages

SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;
Go
CREATE UNIQUE CLUSTERED INDEX [UC_Test_iID] ON [dbo].[test]
([iid] ASC) WITH (STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

Here’s the results:

Res_4

In conclusion, remember the fact that sys.dm_exec_query_stats DMV only shows DML statements, not DDL statements!

Big Trys and Small Trans

triceps_0 tran

The catch phrase around the office these days is “Big Trys and small Trans.”  This came about as we were review stored procedure changes and I two suggestions.  First, that the entire procedure has a try…catch around it and second that the transaction be as small as possible.

Big Try

We’ve got standards in place that procedures have a try…catch in them.  One procedure had been copied, then modified with some logic to parse XML before the try.  I said lets move the try to above the XML parsing in case there’s an error in the XML, we’ll catch that as well.  So now we’ve got a bigger try.

Small Tran

The original code had the following steps:

  1. Start the transaction
  2. Look up some data
  3. Insert some data into a table
  4. Look up some other data
  5. Update another table
  6. Finally completed the transaction

In order to keep transactions small, I suggested the following:

  1. Look up all the data
  2. Start the transaction
  3. Insert the row into the table
  4. Update the row in the other table
  5. Finish the transaction

This will help avoid deadlocks and contention and keep things running smoothly.  There’s the smaller tran.

If you stick with the “Big Trys and Small Trans” methodology, it should help the following:

  1. Help avoid contention
  2. Help avoid deadlocks
  3. Catch errors earlier

SQL 2008 System Generated Names for Defaults

I had to create a script for dropping constraints on a table – no problem if the constaint is a named default, but if the default has a system generated name, it gets a bit hairy.   Here’s an example of how the defaults get created:

Create table TestTable1
	(TestConst datetime default Getdate())
Go

I googled this and found Pinal Dave had a good script to find the constraints in database located here
Here’s what I came up with for a script to find these constraints:

Declare @TableName varchar(500),
	@ColumnName varchar(500),
	@DFName varchar(500),
	@SchemaName varchar(500),
	@SQL varchar(2000)

--Change these for the table and column you are looking for--
Set @TableName = 'TestTable1'
Set @ColumnName = 'TestConst'

SELECT OBJECT_NAME(SO.OBJECT_ID) AS NameofConstraint,
	SCHEMA_NAME(SO.schema_id) AS SchemaName,
	OBJECT_NAME(SO.parent_object_id) AS TableName,
	SC.name as ColumnName,
	SO.type_desc AS ConstraintType
FROM sys.objects SO
join sys.columns SC on SC.default_object_id = SO.object_id
WHERE SO.type_desc LIKE '%CONSTRAINT'
and SC.object_id = object_id(@TableName)
and SC.name = @ColumnName

Finally, here’s the script to show and drop it:

Declare @TableName varchar(500),
	@ColumnName varchar(500),
	@DFName varchar(500),
	@SchemaName varchar(500),
	@SQL varchar(2000)

--Change these for the table and column you are looking for--
Set @TableName = 'TestTable1'
Set @ColumnName = 'TestConst'

SELECT OBJECT_NAME(SO.OBJECT_ID) AS NameofConstraint,
	SCHEMA_NAME(SO.schema_id) AS SchemaName,
	OBJECT_NAME(SO.parent_object_id) AS TableName,
	SC.name as ColumnName,
	SO.type_desc AS ConstraintType
FROM sys.objects SO
join sys.columns SC on SC.default_object_id = SO.object_id
WHERE SO.type_desc LIKE '%CONSTRAINT'
and SC.object_id = object_id(@TableName)
and SC.name = @ColumnName

If exists (Select 1 from sys.objects SO join sys.columns SC on SC.default_object_id = SO.object_id
		where SC.object_id = object_id(@TableName)
		and SC.name = @ColumnName)
Begin
	Select @DFName = SO.name,
		@SchemaName = SCHEMA_NAME(SO.schema_id)
	from sys.objects SO join sys.columns SC on SC.default_object_id = SO.object_id
	where SC.object_id = object_id(@TableName)
	and SC.name = @ColumnName

	Select @SQL = 'ALTER TABLE ' + @SchemaName + '.' + @TableName + ' DROP CONSTRAINT ' + @DFName
	EXEC (@SQL)
End

XML Data via Service Broker

I’ve been working on some development work.  This involves service broker messages and XML data.  We wanted the XML data in basically the same format as the table, but with an additional root tag on it.  To automate testing, I started digging into how to return XML via TSQL.  The FOR XML AUTO, ELEMENTS option almost worked, except we needed a root element added to it.  I toyed with the idea of converting to varchar, adding tags on the ends and converting back to XML – it would have worked, but that’s a hack.  I went back to the basics and reviewed the Basic Syntax of the FOR XML Clause and found the root clause. Here’s an example of what I needed:

 Create table ATest
(iID int identity(1,1),
Descripton varchar(50));

Insert ATest Values ('test1'),
('test2'),
('test3');

Select * from ATest FOR XML RAW ('ATEST'), ELEMENTS, ROOT('ATESTING');

This output is what I wanted:

<atesting>
	<atest>
		<iID>1</iID>
		<descripton>test1</descripton>
	</atest>
	<atest>
		<iID>2</iID>
		<descripton>test2</descripton>
	</atest>
	<atest>
		<iID>3</iID>
		<descripton>test3</descripton>
	</atest>
</atesting>

SQL 2008 Certified

I took my 70-453 test, Upgrade: Transition Your MCITP SQL Server 2005 DBA to MCITP SQL Server 2008, yesterday and passed, so it’s official:

Performance Counters to Monitor for SQL

I would suggest once a quarter, set up a performance monitor log to collect the following counters for a couple of 24 hour periods.  Look at the results in both a graph and report forms to determine the peak usage times and to see if there are any bottlenecks on your SQL server.

Here are the counters to monitor:

Memory: Pages/sec
Average between 0 and 20
spikes ok
Memory: Availible Bytes
Average > 5 MB
SQL Server: Buffer Manager: Buffer Cache Hit Ratio
> 90 closer to 99
Physical Disk: Disk Reads/sec
Watch for Growth over time
value depends on hardward
Physical Disk: Disk Writes/sec
Watch for Growth over time
value depends on hardward
Physical Disk: % Disk time
< 55%
if > 55% for more than 10 minutes may be an issue
Physical Disk: Avg. Disk Queue Length
< 2*individual drive in an array
i.e. < 10 is ok for 5 disc raid)
if > for more than 10 minutes may be an issue
Physical Disk: % Free Space
> 15%
Logical Disk: % Free Space

> 15%
Processor: % Processor Time
< 80%
Spikes ok
if > 80% for more than 10 minutes may be an issue
System: Processor Queue Length
< 2 per CPU
if > 2 per CPU for more than 10 minutes may be an issue
Network Interface: Bytes Received/sec
A sudden unexpected increase could be an external attack
Network Interface: Bytes Sent/sec
A sudden unexpected increase could be large volume of information being accessed
Network Interface: Output Queue Length
< 2
If NDIS hardware – this should always be 0
SQL Server: General: User Connections
< 255
if > 255 for more than 10 minutes may be an issue

These are the basics, keep a report of the findings and compare them to past results each time you run them.