Twitter

Follow SQLQuill on Twitter

Subscribe

MN Pass

SQL Auto Converts True or False to a bit

The title says it all, but here’s the story.  The other day I was modifying a procedure and I didn’t know if the application was going to pass in True/False or 1/0 for a bit column.  Being lazy, I created my procedure with a 1 or 0 and figured I’d update it later if the app was using True/False.  The App was passing in a True/False, but I didn’t need to update my procedure because SQL auto converts it.  More info can be found here.  It’s pretty basic, but it save me some time, so I figured I’d remind people.  Here’s an example to show it:

Create table Test (IDCol int identity(1,1),
     TestBit bit)

Insert Test Values(0)
Insert Test Values(1)

Select * from test

Insert Test Values('False')
Insert Test Values('True')

Select * from test

Here’s an XML example:

Declare @XMLTest xml
Set @XMLTest = '<Test>
     <BitTest>False</BitTest>
     </Test>
     <Test>
     <BitTest>True</BitTest>
     </Test>'

Select x.x.value('BitTest[1]', 'bit')
from @XMLTest.nodes('/Test') as x(x)

Add a Non-Nullable Column to a Table with a Non-Auto Generated Default Name

I’ve been a stickler for default names since reading Jason Strate’s (blog/twitter) post on cleaning up auto-generated names.  Today I had to create a non-nullable column with a default – can I do it in one step?  Yep – here’s an example:

ALTER TABLE <<TABLENAME>> ADD <<COLUMNNAME>> <<TYPE>> NOT NULL CONSTRAINT <<DEFAULTNAME>> DEFAULT (<<Value>>)

Here’s the example in action:

Create table Testtable
(id int)

Insert Testtable Values (1),(2),(3),(4),(5)

ALTER TABLE Testtable ADD DefaultTest BIT NOT NULL CONSTRAINT DF_Testtable_DefaultTest DEFAULT (0)

Select * from Testtable

--/*Now to drop this column, do it in this order */
Alter table Testtable alter column DefaultTest  bit null
Alter table Testtable Drop constraint DF_Testtable_DefaultTest
Alter table Testtable DROP column DefaultTest

Create Insert/Update Scripts for Static Data Tables

Here’s a script that will create an insert/update statement for a list of values.  This is a major time saver for me – I have to create scripts like this for people fairly often. It is designed for a simple look up table with an ID column and a some sort of description column such as this….

Create table TestAndysAwesomeInsertGenerator
	(TestID int,
	TestDesc varchar(50),
	TSTimestamp timestamp)

Now let’s say you needed to generate insert and/or update statements for this list of values like this:

  1. This is line1
  2. This is line2
  3. This is line3
  4. etc…

You could enter the data into a table, then script out the insert statements and worry about the update statements later – or create row update/insert and do a lot of copying and pasting and editing, etc.  Well here’s a shortcut I came up with…use this script that creates a script.

Set nocount on
--/*Declare the variables.*/
Declare @ScriptOutput varchar(Max),
	@tablename varchar(255),
	@Col1 varchar(255),
	@Col2 varchar(255),
	@Counter1 int,
	@Counter2 int,
	@DescValue varchar(255)

Declare @ValuesTable table ([ID] int IDENTITY(1,1), [Description] varchar(255))
Declare @FinalScript table ([Script] varchar(max))

----------------------------------------------------------------------------------------------
--/*  This is the section that needs to be changed each time for each table and the values. */
----------------------------------------------------------------------------------------------
Set @tablename = 'TestAndysAwesomeInsertGenerator'

Insert @ValuesTable
Values('This is line1'),
	('This is line2'),
	('This is line3'),
	('etc...')

--------------------------------------------------------------------------------------------
--/*  That's it - everything else should be automatic - if it's a simple table.   */
--------------------------------------------------------------------------------------------
Select @Counter1 = COUNT(*) from @ValuesTable
Set @Counter2 = 1

--/* if the table has more columns, hard code the column names in here. */
Select @Col1 = SC.name from sys.columns SC
Join sys.types ST on ST.user_type_id = SC.user_type_id
where SC.object_id = OBJECT_ID(@tablename)
and ST.name like '%int%'

Select @Col2 = SC.name from sys.columns SC
Join sys.types ST on ST.user_type_id = SC.user_type_id
where SC.object_id = OBJECT_ID(@tablename)
and ST.name like '%char%'

--Manipulate the string...
Set @ScriptOutput =
'IF NOT EXISTS (Select 1 from <<TableName>> where <<Col1>> = <<IDVal>>)
	Begin
		Insert <<TableName>> (<<Col1>>, <<Col2>>) Values (<<IDVal>>, <<DescVal>>)
	End
Else
	Begin
		IF NOT EXISTS (Select 1 from <<TableName>> where <<Col1>> = <<IDVal>> and <<Col2>> = <<DescVal>>)
		Update <<TableName>> Set <<Col2>> = <<DescVal>> where <<Col1>> = <<IDVal>>
	End
	'
Select @ScriptOutput = Replace(Replace(Replace(@ScriptOutput, '<<TableName>>', @tablename), '<<Col1>>', @Col1), '<<Col2>>', @Col2)
Insert @FinalScript Select '/* Begin Codes for ' + @tablename + '*/'
While @Counter2 <= @Counter1
Begin
	Select @DescValue = '''' + [Description] + '''' from @ValuesTable where [ID] = @Counter2
	Insert @FinalScript
		Select Replace(Replace(@ScriptOutput, '<<IDVal>>', @Counter2), '<<DescVal>>', @DescValue)
	Select @Counter2 = @Counter2 + 1
End

Insert @FinalScript Select '/* End Codes for ' + @tablename + '*/'
Select * from @FinalScript

Here is the output of this script:

/* Begin Codes for TestAndysAwesomeInsertGenerator*/
IF NOT EXISTS (Select 1 from TestAndysAwesomeInsertGenerator where TestID = 1)
	Begin
		Insert TestAndysAwesomeInsertGenerator (TestID, TestDesc) Values (1, 'This is line1')
	End
Else
	Begin
		IF NOT EXISTS (Select 1 from TestAndysAwesomeInsertGenerator where TestID = 1 and TestDesc = 'This is line1')
		Update TestAndysAwesomeInsertGenerator Set TestDesc = 'This is line1' where TestID = 1
	End

IF NOT EXISTS (Select 1 from TestAndysAwesomeInsertGenerator where TestID = 2)
	Begin
		Insert TestAndysAwesomeInsertGenerator (TestID, TestDesc) Values (2, 'This is line2')
	End
Else
	Begin
		IF NOT EXISTS (Select 1 from TestAndysAwesomeInsertGenerator where TestID = 2 and TestDesc = 'This is line2')
		Update TestAndysAwesomeInsertGenerator Set TestDesc = 'This is line2' where TestID = 2
	End

IF NOT EXISTS (Select 1 from TestAndysAwesomeInsertGenerator where TestID = 3)
	Begin
		Insert TestAndysAwesomeInsertGenerator (TestID, TestDesc) Values (3, 'This is line3')
	End
Else
	Begin
		IF NOT EXISTS (Select 1 from TestAndysAwesomeInsertGenerator where TestID = 3 and TestDesc = 'This is line3')
		Update TestAndysAwesomeInsertGenerator Set TestDesc = 'This is line3' where TestID = 3
	End

IF NOT EXISTS (Select 1 from TestAndysAwesomeInsertGenerator where TestID = 4)
	Begin
		Insert TestAndysAwesomeInsertGenerator (TestID, TestDesc) Values (4, 'etc...')
	End
Else
	Begin
		IF NOT EXISTS (Select 1 from TestAndysAwesomeInsertGenerator where TestID = 4 and TestDesc = 'etc...')
		Update TestAndysAwesomeInsertGenerator Set TestDesc = 'etc...' where TestID = 4
	End

/* End Codes for TestAndysAwesomeInsertGenerator*/

This takes care of the data – here are some test cases, run one or both of these scripts, then re-run the output script above and the data will be back to what you initially set it for.

Delete from TestAndysAwesomeInsertGenerator where TestID = 2
--
Update TestAndysAwesomeInsertGenerator set testdesc = 'Row Updated' where testid = 4
--
Select * from TestAndysAwesomeInsertGenerator

Create a List of Links from Google Reader Starred Items with Powershell

Inspired by Jason Strate’s (Blog| Twitter) post “PowerShell: Download List of SQL Server MVPs”, which was inspired by John Samson’s (Blog| Twitter) post “The Best Database Administrators Automate Everything” – which also inspired me.  I decided to create a Powershell script to grab my starred items in Google Reader and format them into an HTML Page with links for my Weekly Challenges/Good Reading Posts.

The Set Up

I had an HTML template created for these pages, so I split it into three parts:

  1. The portion above the Good Reading and Links – I saved this to a text file located at “c:\temp\TopTemplate.txt”
  2. The Good Reading and Links
  3. The portion below the Good reading and Links – I saved this to a text file located at “c:\temp\BotTemplate.txt”

I then looked at my public page for Starred Items on Google Reader.  Note this script works for any public listing on Google Reader, so I can create a Public Page and change the URL in this script and it still works.

The Code

##------------------------##
##--Reader Starred Items--##
##------------------------##
$webclient = New-Object system.net.webclient
$SourceFile = "c:\temp\Reader.txt"
$HTMLPage = "C:\temp\GoodReading.txt"
$TopTemplate = "c:\temp\TopTemplate.txt"
$BotTemplate = "c:\temp\BotTemplate.txt"
$url="http://www.google.com/reader/shared/user/14135896925018987441/state/com.google/starred"
$webclient.DownloadFile($url,$SourceFile)
#Get the titles with links
$itemtitle = Select-String $SourceFile -pattern "h2 class=""item-title""" |
% {$_ -replace "<div id=""items""><div class=""item""><h2 class=""item-title""><div class="""">", ""} |
% {$_ -replace "</div></h2>", ""} |
% {$_ -replace "<div class=""clear""></div></div></div> <div class=""item""><h2 class=""item-title""><div class="""">", ""} |
% {$_ -replace "c:\\temp\\Reader\.txt:[0-9]*:", ""}
#Get the details with links
$iteminfo = Select-String $SourceFile -pattern "<div class=""item-info"">" |
% {$_ -replace "c:\\temp\\Reader\.txt:[0-9]*:<div class=""item-info"">", ""} |
% {$_ -replace "</div>", ""}
$i=0
$FinalContent = " "
foreach ($t in $itemtitle)
{$FinalContent= $FinalContent +"<li>"+ $itemtitle[$i] + " " + $iteminfo[$i]+"</li>" + "`r`n "
$i++}
$FinalContent = "<p><strong>Good Reading from the Week</strong>:</p>" + "`r`n" + "<ul>" + "`r`n" + $FinalContent + "`r`n" + "</ul>"
$TopTemp = Get-Content $TopTemplate
$BotTemp = Get-Content $BotTemplate
$FinalContent = $TopTemp + $FinalContent + $BotTemp
Set-Content $HTMLPage $FinalContent
invoke-item $HTMLPage

When the script completes, I simply update a couple of statuses and change the photo and it’s good to go!

How to: Number Rows – Grouped by a Column – TSQL

There are quite a few times I need to number rows of data based on some grouping column.  I often forget how to do it.  Well, here’s a couple of examples:

Create table TestTable1
(ID int,
InsertDate datetime)
Go
Insert TestTable1 (ID, InsertDate)
Values(1,'8/1/10'),
(1,'7/1/10'),
(1,'6/1/10'),
(2,'3/15/10'),
(2,'3/16/10'),
(2,'3/13/10'),
(2,'3/11/10')
--Note the values are out of order for ID 2
Select *,
(select count(*) from TestTable1 TT2
where TT1.ID = TT2.ID
and TT1.InsertDate < TT2.InsertDate) as 'rank'
from TestTable1 TT1
Order by ID, rank

Or a simpler way…

SELECT *,
ROW_NUMBER() OVER ( PARTITION BY ID ORDER BY insertdate ) as 'rank'
from TestTable1 TT1
Order by ID, rank

Here are some examples and explanations:

The Official SQLFAQ High Level Overview of Data Migration Testing

I’ve been tasked with the SQL portion of our data migration and conversion testing for our big release we’ve got coming up. What’s that mean?  Well, we’ve got an upgrade coming to the application we support.  We’ve got 15 databases on 7 servers.  All of them have database schema changes coming, data moving around from one table to another, new tables getting created, old ones getting dropped, new columns populated from other databases, etc.  I’ve got to make sure all the data that moves makes it there and is correct.  Here was how I attacked it:

The High Level Overview:

  1. Script production databases from our log shipping servers
  2. Create two copies of each database on a test server(s)
  3. Upgrade one database of each set
  4. Compare schema on Pre and Post upgraded databases
  5. List out all the diffences
  6. Catagorize how each new column or table is populated
  7. Ignore the new tables that are populated on use or populated with defaults – no data migration
  8. Create Checksum and row count scripts for all the migrated data
  9. Run the scripts and record the results on our Prod-Stage environment before the upgrade
  10. Upgrade Prod-Stage databases
  11. Run the scripts and compare the results to the Pre-upgrade.
  12. Address any issues

This approached seemed to work well, so my plan is to write a post on a bunch of these individual steps, so stay tuned.

Risks in Adding a Column to an Existing Table

When someone states that adding a column to an existing table is easy and you need a list of risks, feel free to copy and paste this, add your own, etc.  In fact, contact me via email, comments, twitter – @SQLQuill with any additions.

Adding a column to an existing table – in theory and best practices world is easy – but in reality it is not always as easy as ALTER TABLE ADD COLUMN.  The risk is that any procedure, view or application code that uses a “Select *” or an “Insert <TableName> Values <…>” (without column names explicitly stated) may not function, or may not function correctly, with an additional column added to the table.  Also note there could be temp tables, table variables and table value parameters built off the table that an additional column affects as well.  Best practices and my database development guidelines says not to use the “Select *” or the “Insert <TableName> Values <…>”.

Random Data Part 3 – Random Dates

So apparently I like scripts that create random data.  I’ve written a couple of posts that create random numbers and random characters.  This week I was working on a clean up job that would delete data that was over 50 days old.  To test my clean up routine, I had to combine the random number script with the DATEADD function – here’s the basis for it:

Select Dateadd(D, (-1)* ROUND((RAND() * (100)),0), Getdate())

Breaking it down, the ROUND((RAND() * (100)),0) section randomly chooses an integer from 0 to 100, then multiplies that by –1 as we want to go back in time.  Finally the DATEADD function adds the negative random integer number of days to today’s date.

Here’s an example that populates a test table with 1000 rows of data:

SET NOCOUNT ON
Go
Declare @RandomDates Table (DateTest datetime)
Declare @Counter int
Set @Counter = 0
While @Counter < 1000
Begin
    Insert @RandomDates (DateTest)
    Select Dateadd(D, (-1)* ROUND((RAND() * (100)),0), Getdate()) --Random Date 0 to 100 days old
    Set @Counter = @Counter + 1
End

Select Datediff(d,DateTest, getdate()) as DaysOld,
    Count(Datediff(d,DateTest, getdate())) as Count
from @RandomDates
Group by Datediff(d,DateTest, getdate())
Order by DaysOld Desc

Calendar

Troubleshooting Troubleshooting – sys.dm_exec_query_stats

The Setup:

Our current deploy process runs a big script that adds columns, tables, indexes, constraints, etc.  It performs DDL changes.  It even migrates data to the new tables and columns.  So it also performs DML changes.  One issue we run into is that in our development environment we have a deploy everyday (an incremental build).  In production, we only do it once – officially.  We were working on our staging environment, going for the non-incremental deploy and it was taking a long time – what should take 10 minutes, we stopped at 3 hours.

The Statement:

Looking into the deploy issue I was querying the sys.dm_exec_query_stats DMV and only saw the DML statements.  Digging into it, the DMV doesn’t collect stats for the DDL statements.

The Proof:

To prove this, here’s what I did – first – almost a reboot (cleared the cache, clean buffers, etc):

CHECKPOINT        --writes dirty pages to disk
DBCC FREEPROCCACHE    --clears entire plan cache
DBCC DROPCLEANBUFFERS    --clears all clean data cache pages

--from MS page on sys.dm_exec_query_stats

SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

Here are the results:

Res_1

Then I created a test table and reran the query on sys.dm_exec_query_stats:

Create table test (iid int)
Go
SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

Here are the results:

Res_2

To Prove this is legit, I ran a select statement on the new table and then reran the query on sys.dm_exec_query_stats:

Select * from test
Go

SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

The Select statement on the table and the DMV showes up, but not the create table statement.

Res_3

The same thing happened with adding an index, here’s the script (first almost reboot, then prove there is nothing in the results of the query to dm_exec_query_stats, add the index and rerun the query to dm_exec_query_stats):

CHECKPOINT        --writes dirty pages to disk
DBCC FREEPROCCACHE    --clears entire plan cache
DBCC DROPCLEANBUFFERS    --clears all clean data cache pages

SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;
Go
CREATE UNIQUE CLUSTERED INDEX [UC_Test_iID] ON [dbo].[test]
([iid] ASC) WITH (STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
GO
SELECT TOP 5 query_stats.query_hash AS "Query Hash",
SUM(query_stats.total_worker_time) / SUM(query_stats.execution_count) AS "Avg CPU Time",
MIN(query_stats.statement_text) AS "Statement Text"
FROM
(SELECT QS.*,
SUBSTRING(ST.text, (QS.statement_start_offset/2) + 1,
((CASE statement_end_offset
WHEN -1 THEN DATALENGTH(ST.text)
ELSE QS.statement_end_offset END
- QS.statement_start_offset)/2) + 1) AS statement_text
FROM sys.dm_exec_query_stats AS QS
CROSS APPLY sys.dm_exec_sql_text(QS.sql_handle) as ST) as query_stats
GROUP BY query_stats.query_hash
ORDER BY 2 DESC;

Here’s the results:

Res_4

In conclusion, remember the fact that sys.dm_exec_query_stats DMV only shows DML statements, not DDL statements!

Code Reviews – What good are they for?

  1. All database changes go through me. 
  2. Do or Review, do or review.

These have been my catch phrases since taking on the lead SQL developer role at my company, on my team.  There have many changes that I’ve had to review – many, many changes, most of them are fine, but occasionally I get a real head scratcher.

Things that I’ve caught reviewing code in the last three weeks (note these are also justifications I’m taking to my boss for more help):

  • VARCHAR(MAX) column – we avoid these for online reindexing
  • A cursor – we avoid these for performance sake – it also didn’t need it
  • Missing comments and Headers in stored procedures – we’ve got a standard header we use
  • An infinite loop – a testing procedure that uses service broker that would have been looping forever.
  • A date that should be UTC Date, but Getdate() was used

Is this good for the company?  This is the question I’m asking myself…all the time actually.  Running the numbers, how much time and money has my reviews saved the company?

For the next six months a guess would be:

varchar(MAX) column 

  • 10 hours of operational DBA/contractor time to write exception statements and meet on how to handle the online re-indexing
  • 10 hours development DBA time to migrate the data on the next release
  • 5 hours testing time

The cursor

  • 10 hours development DBA time to rewrite and test the change
  • 5 hours testing time

Missing Comments and Headers in stored procedures

  • 5 hours developer time

An infinite loop

  • 1 hour to troubleshoot and fix on release night with 30 people waiting around – 30 work hours total

A Date that should be UCT instead of Server time

  • 2 hours development DBA time to locate and fix

48 work hours total for about that the same amount of time reviewing code. 

Note these are just estimates, so I may be way off, but the old saying, “Pay me now or pay me later” really comes into play here.

There are a quite a few intangible benefits as well:

  • I know what’s going on with the changes and I’ve tried to communicate to the other DBAs the changes coming.  That should save lots of time for others
  • I have got to know a lot of developers who I normally wouldn’t talk to.
  • I’ve taught a few developers a few things about SQL.  Hopefully this will save tons of time down the road.
  • I’ve showed this site off to a fairly new SQL guy and he’s used the generate random data script a lot and it’s saved him some testing time.

 

Milt