There are quite a few times I need to number rows of data based on some grouping column, and I often forget how to do it. Well, here are a couple of examples:
Create table TestTable1 (ID int, InsertDate datetime)

Insert TestTable1 (ID, InsertDate)
Values (1, '20110101'), (1, '20110102'), --example dates
       (2, '20110103'), (2, '20110101')
--Note the values are out of order for ID 2

Select ID, InsertDate,
    (select count(*) from TestTable1 TT2
     where TT1.ID = TT2.ID
     and TT1.InsertDate < TT2.InsertDate) as [rank]
from TestTable1 TT1
Order by ID, [rank]
Or a simpler way, using ROW_NUMBER() (note it numbers rows from 1, while the subquery above counts from 0):

Select ID, InsertDate,
    ROW_NUMBER() OVER ( PARTITION BY ID ORDER BY InsertDate ) as [rank]
from TestTable1 TT1
Order by ID, [rank]
I’ve been tasked with the SQL portion of our data migration and conversion testing for our big release we’ve got coming up. What’s that mean? Well, we’ve got an upgrade coming to the application we support. We’ve got 15 databases on 7 servers. All of them have database schema changes coming: data moving around from one table to another, new tables getting created, old ones getting dropped, new columns populated from other databases, etc. I’ve got to make sure all the data that moves makes it there and is correct. Here’s how I attacked it:
The High Level Overview:
- Script production databases from our log shipping servers
- Create two copies of each database on a test server(s)
- Upgrade one database of each set
- Compare schema on Pre and Post upgraded databases
- List out all the differences
- Categorize how each new column or table is populated
- Ignore the new tables that are populated on use or populated with defaults – no data migration
- Create Checksum and row count scripts for all the migrated data
- Run the scripts and record the results on our Prod-Stage environment before the upgrade
- Upgrade Prod-Stage databases
- Run the scripts and compare the results to the Pre-upgrade.
- Address any issues
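A checksum-and-rowcount script for one migrated table might look something like this (the table name here is hypothetical; also note BINARY_CHECKSUM has known blind spots, so matching checksums are strong but not absolute evidence):

```sql
-- Hypothetical table name; run once before and once after the upgrade,
-- then compare the two result sets.
Select 'dbo.Customer' as TableName,
       Count(*) as [RowCount],
       CHECKSUM_AGG(BINARY_CHECKSUM(*)) as TableChecksum
from dbo.Customer
```

A mismatch in [RowCount] points to missing or extra rows; a matching row count with a different checksum points to data that changed in flight.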
This approach seemed to work well, so my plan is to write a post on a bunch of these individual steps, so stay tuned.
When someone states that adding a column to an existing table is easy and you need a list of risks, feel free to copy and paste this, add your own, etc. In fact, contact me via email, comments, or twitter – @SQLQuill – with any additions.
Adding a column to an existing table – in theory and in a best-practices world – is easy, but in reality it is not always as simple as ALTER TABLE ADD COLUMN. The risk is that any procedure, view, or application code that uses a “Select *” or an “Insert <TableName> Values <…>” (without column names explicitly stated) may not function, or may not function correctly, with an additional column added to the table. Also note there could be temp tables, table variables, and table-valued parameters built off the table that an additional column affects as well. Best practices, and my database development guidelines, say not to use “Select *” or “Insert <TableName> Values <…>”.
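A quick sketch of the risk, using a made-up table name:

```sql
-- Hypothetical demo table for illustrating the risk
Create table dbo.DemoTable (ID int, Name varchar(50))

-- Works today: two columns, two values, no column list needed
Insert dbo.DemoTable Values (1, 'First')

-- Someone adds a column...
Alter table dbo.DemoTable Add CreatedDate datetime

-- ...and the same insert now fails: the value list no longer
-- matches the table's column count
Insert dbo.DemoTable Values (2, 'Second')
```

Stating the column list explicitly – Insert dbo.DemoTable (ID, Name) Values (2, 'Second') – keeps the insert working before and after the new column.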
So apparently I like scripts that create random data. I’ve written a couple of posts that create random numbers and random characters. This week I was working on a clean up job that would delete data that was over 50 days old. To test my clean up routine, I had to combine the random number script with the DATEADD function – here’s the basis for it:
Select Dateadd(D, (-1)* ROUND((RAND() * (100)),0), Getdate())
Breaking it down, the ROUND((RAND() * (100)),0) section randomly chooses an integer from 0 to 100, then multiplies that by –1 as we want to go back in time. Finally the DATEADD function adds the negative random integer number of days to today’s date.
Here’s an example that populates a test table with 1000 rows of data:
SET NOCOUNT ON

Declare @RandomDates Table (DateTest datetime)
Declare @Counter int
Set @Counter = 0

While @Counter < 1000
Begin --Without Begin/End, only the Insert repeats and @Counter never increments
    Insert @RandomDates (DateTest)
    Select Dateadd(D, (-1)* ROUND((RAND() * (100)),0), Getdate()) --Random Date 0 to 100 days old
    Set @Counter = @Counter + 1
End

Select Datediff(d, DateTest, Getdate()) as DaysOld,
       Count(*) as [Count]
from @RandomDates
Group by Datediff(d, DateTest, Getdate())
Order by DaysOld Desc