Shaun Baggett: Sorta SQL: 2010.02

Friday, February 26, 2010

Merge deploying data to SQL 2008 from Database Projects

I have been trying out Microsoft's newest version of Visual Studio, the Visual Studio 2010 Ultimate Release Candidate. The new environment is quite nice although there doesn't seem to be a lot of difference with database projects so far. One nice extra is that you can now have windows tear off and be placed outside of the IDE. As I have multiple monitors I like to have the Solution Explorer open in the main IDE, and the Schema View window open on my second monitor. I find this greatly helps with quickly navigating around my project.

Anyway, back to the blog post.

When I am creating a new database schema, I like to have my project automatically load any required default data to my lookup tables. These tables are the ones where the data very rarely changes and are primarily referenced by Foreign keys. Examples of this table type could be a Status table, an Address type or even a Public Holiday date table that you'd like pre loaded.

Where I work we have tried different ways to action this but all the solutions have depended on a post deployment script executing either a bulk copy or multiple insert statements as follows;

-- Non SQL 2008 version
IF NOT EXISTS (SELECT 1 FROM [dbo].[TableName]) 
   BEGIN
      INSERT [dbo].[TableName] ([Column1], [Column2])
         VALUES (Value1, Value2);
      INSERT [dbo].[TableName] ([Column1], [Column2])
         VALUES (Value3, Value4);
      INSERT [dbo].[TableName] ([Column1], [Column2])
         VALUES (Value5, Value6);
   END;

-- SQL 2008 version
IF NOT EXISTS (SELECT 1 FROM [dbo].[TableName]) 
   BEGIN
      INSERT [dbo].[TableName] ([Column1], [Column2])
         VALUES (Value1, Value2) 
              , (Value3, Value4) 
              , (Value5, Value6);
   END;

-- Bulk Copy version
IF NOT EXISTS (SELECT 1 FROM [dbo].[TableName]) 
   BEGIN
      BULK INSERT [dbo].[TableName] FROM '$(DataFilePath)TableName.bcp WITH ...
   END;

The main part of this that has always bothered me is the existence check as the first statement. This basically states that if ANY data exists in the target table, we assume that the table is already loaded correctly so nothing further occurs. The problem with this is that if your default table data changes, or has extra rows added, or has extra rows deleted, then unless you empty the target table before you run this script these changes are never made to the table when you deploy the project.

Well this week, while working on a new SQL 2008 project, I realised that the new SQL 2008 only MERGE statement would be an ideal alternate method to pre load these lookup tables. With one statement I would be able to let SQL 2008 do all the work of comparing my project default data to what already exists in the deployment target database and take the appropriate action to either add or delete a row, depending on whether I have added or deleted a data row in my default data.

I set it up as follows for my AddressType table;

-- Create the table
CREATE TABLE [dbo].[AddressType]
   (
    [Id] INT IDENTITY(1, 1)
             CONSTRAINT [PK_AddressType] PRIMARY KEY CLUSTERED
             NOT NULL
  , [Code] NVARCHAR(20) CONSTRAINT [UQ_AddressType_Code] UNIQUE ([Code])
                        NOT NULL
  , [Description] NVARCHAR(150) NULL
   );

Then I create a post deployment script LoadAddressType.sql in the database project with the following code in it;

MERGE [dbo].[AddressType] AS Target
   USING 
      (
       SELECT N'Billing', N'Billing Address Type'
       UNION ALL
       SELECT N'Branch', N'Branch Address Type'
       UNION ALL
       SELECT N'Head Office', N'Head Office Address Type'
       UNION ALL
       SELECT N'Postal', N'Postal Address Type'
       UNION ALL
       SELECT N'Residential', N'Residential Address Type'
      ) AS Source ([Code], [Description])
   ON (TARGET.[Code] = SOURCE.[Code])
   WHEN NOT MATCHED BY SOURCE
      THEN DELETE 
   WHEN NOT MATCHED BY TARGET 
      THEN INSERT ([Code], [Description])
         VALUES (SOURCE.[Code], SOURCE.[Description]) ;

This creates a virtual Source table inside the USING block which, at project deployment is compared to the target table on the common [Code] field. If the target table has a row which is not in the virtual table then it is deleted from the target table and if the target table does not have a row which is in the virtual table, then the virtual row gets inserted to the target table. The first time this script is executed will insert 5 rows to the target AddressType table.

If I were then to update the LoadAddressType.sql script as follows, deleting 'Branch' and modifying 'Head Office' to 'Central Office';

MERGE [dbo].[AddressType] AS Target
   USING 
      (
       SELECT N'Billing', N'Billing Address Type'
       UNION ALL
       SELECT N'Central Office', N'Central Office Address Type'
       UNION ALL
       SELECT N'Postal', N'Postal Address Type'
       UNION ALL
       SELECT N'Residential', N'Residential Address Type'
      ) AS Source ([Code], [Description])
   ON (TARGET.[Code] = SOURCE.[Code])
   WHEN NOT MATCHED BY SOURCE
      THEN DELETE 
   WHEN NOT MATCHED BY TARGET 
      THEN INSERT ([Code], [Description])
         VALUES (SOURCE.[Code], SOURCE.[Description]) ;

Then on the next deployment of the project, the 2 rows with a [Code] value of 'Branch' and 'Head Office' would be deleted from the target table as they are no longer in the virtual table and a new row with a [Code] of 'Central Office' would be inserted. The other 3 rows would remain unchanged.

If you wanted to extend the functionality so that the Description field gets updated if changed, then you could add the following code block to the merge which also compares the [Description] field;

MERGE [dbo].[AddressType] AS Target
   USING 
      (
       SELECT N'Billing', N'Billing Address Type'
       UNION ALL
       SELECT N'Central Office', N'Central Office Address Type'
       UNION ALL
       SELECT N'Postal', N'Postal Address Type'
       UNION ALL
       SELECT N'Residential', N'Residential Address Type'
      ) AS Source ([Code], [Description])
   ON (TARGET.[Code] = SOURCE.[Code])
   WHEN MATCHED AND TARGET.[Description] <> SOURCE.[Description]
      THEN UPDATE 
         SET TARGET.[Description] = SOURCE.[Description]
   WHEN NOT MATCHED BY SOURCE
      THEN DELETE 
   WHEN NOT MATCHED BY TARGET 
      THEN INSERT ([Code], [Description])
         VALUES (SOURCE.[Code], SOURCE.[Description]) ;

The way I have the virtual table coded relies on implicit data conversion for it's success, as well as being easier to read for blog entries but it really should be coded with the proper data type casting as follows;

MERGE [dbo].[AddressType] AS Target
   USING 
      (
       SELECT CAST('Billing' AS NVARCHAR(20)), CAST('Billing Address Type' AS NVARCHAR(150))
       UNION ALL
       SELECT CAST('Central Office' AS NVARCHAR(20)), CAST('Central Office Address Type' AS NVARCHAR(150))
       UNION ALL
       SELECT CAST('Postal' AS NVARCHAR(20)), CAST('Postal Address Type' AS NVARCHAR(150))
       UNION ALL
       SELECT CAST('Residential' AS NVARCHAR(20)), CAST('Residential Address Type' AS NVARCHAR(150))
      ) AS Source ([Code], [Description])
   ON (TARGET.[Code] = SOURCE.[Code])
   WHEN NOT MATCHED BY SOURCE
      THEN DELETE 
   WHEN NOT MATCHED BY TARGET 
      THEN INSERT ([Code], [Description])
         VALUES (SOURCE.[Code], SOURCE.[Description]) ;

Note that you could also copy the Merge query into an SSMS query and run it directly to get the same results.

One problem you may run into if you delete a value from your virtual source table is that come deployment, if that value is already referenced by a foreign key in another table, then unless you have the ON DELETE CASCADE option set for the foreign key the Merge statement will fail, causing your deployment to fail.

This method works well for us for small amounts of default data, but is still not suitable for large scale Bulk Insert data loads as it would just take too long to set up the virtual table, not not mention it would probably not be very efficient. For these it is probably best to empty the target table before executing the bulk insert.

Thursday, February 11, 2010

Visual Studio 2010 Release Candidate bug: SQL deployment property IgnoreFileAndLogFilePath

As you are probably aware by now, Microsoft this week made available the Release Candidate of Visual Studio 2010. It's looking like a nice product but still has a few issues.

I was keen to upgrade an existing database solution I am working on so opened the solution and allowed the project upgrade wizard to complete. Everything looked great. All my projects were there, all the database objects and code were there, and it built successfully. But it wouldn't deploy. I would get this error on deployment;

------ Build started: Project: BillingDB, Configuration: Debug Any CPU ------
BillingDB -> D:\VS 2010 tests\Billing\Database\sql\BillingDB.dbschema
------ Deploy started: Project: BillingDB, Configuration: Debug Any CPU ------
D:\VS 2010 tests\Billing\Database\sql\Billing_BillingDB.sqldeployment(0,0): Error SQL00256: The deployment property IgnoreFileAndLogFilePath could not be used to configure deployment.
BillingDB.dbschema(0,0): Error TSD01234: The deployment configuration file could not be loaded. Deployment cannot continue
Done executing task "SqlDeployTask" -- FAILED.
Done building target "DspDeploy" in project "BillingDB.dbproj" -- FAILED.
Done executing task "CallTarget" -- FAILED.
Done building target "DBDeploy" in project "BillingDB.dbproj" -- FAILED.
Done building project "BillingDB.dbproj" -- FAILED.

Build FAILED.
========== Build: 1 succeeded or up-to-date, 0 failed, 0 skipped ==========
========== Deploy: 0 succeeded, 1 failed, 0 skipped ==========

After some investigation it seems that the VS 2008 database deployment property IgnoreFileAndLogFilePath is no longer available in VS 2010, and that the upgrade process for database projects does not delete or modify this setting. Selecting Edit for the sqldeployment options, making a change, then saving the change, also does not remove the property from the file.

The workaround is to open the <project>.sqldeployment file in an editor of your choice and delete the line containing;

<ignorefileandlogfilepath>True</ignorefileandlogfilepath>

Save the file and your database project should now deploy successfully.

I didn't investigate further to see if there were any other deployment properties that no longer exist but now know how to fix it if I get the problem again.

I created a Microsoft Connect bug detailing the problem and I have had offline verification that it is a bug by Microsoft staff. They have now scheduled this to be fixed for the RTM of VS 2010.

Wednesday, February 10, 2010

My Microsoft Exams and Certifications

Just posting this so it's easy for me and possible employers to find .

https://mcp.microsoft.com/authenticate/validatemcp.aspx
TranscriptID: 746481
Access Code: sbaggett

Monday, February 08, 2010

Be careful with comment location in pre/post deployment scripts

Just had a small issue with my SQL Database project.

In my Script.PostDeployment.sql file I put a standard SQL comment at the end of a line just to remind me of the required data load order;

:r LoadLookupTable.sql
:r LoadTestData.sql  -- Must load LookupTable first

When building the project this caused the following error;

C:\Program Files (x86)\MSBuild\Microsoft\VisualStudio\v9.0\TeamData\Microsoft.Data.Schema.SqlTasks.targets(80,5)Error MSB4018: The "SqlSetupDeployTask" task failed unexpectedly.System.ArgumentException: Illegal characters in path.
at System.IO.Path.CheckInvalidPathChars(String path)
at System.IO.Path.IsPathRooted(String path)
at Microsoft.Data.Schema.Sql.Build.SqlDeploymentScriptModifier.GenerateMergedSqlCmdFiles(ContributorConfigurationSetup setup, ContributorConfigurationFile configFile)
at Microsoft.Data.Schema.Sql.Build.SqlDeploymentScriptModifier.OnEstablishDeploymentConfiguration(ContributorConfigurationSetup setup)
at Microsoft.Data.Schema.Build.DeploymentProjectBuilder.VerifyConfiguration()
at Microsoft.Data.Schema.Tasks.DBSetupDeployTask.BuildDeploymentProject(ErrorManager errors, ExtensionManager em)
at Microsoft.Data.Schema.Tasks.DBSetupDeployTask.Execute()
at Microsoft.Build.BuildEngine.TaskEngine.ExecuteInstantiatedTask(EngineProxy engineProxy, ItemBucket bucket, TaskExecutionMode howToExecuteTask, ITask task, Boolean& taskResult)
Done executing task "SqlSetupDeployTask" -- FAILED.

By moving the comment to its own line like the following it then built correctly;

:r LoadLookupTable.sql
-- Must load LookupTable first
:r LoadTestData.sql

I know it's because it's being interpreted in SQLCMD mode but you'd think the parser would be smart enough to ignore it.

Thursday, February 04, 2010

Visual Studio 2008 Database Edition schema refactoring causes "Object Reference not set to an instance of an object"

I recently had a problem in a SQL 2008 database project. I am using Visual Studio 2008 Team Suite (which include the database Edition) with Service Pack 1 and the GDR2 database edition update installed.

The problem was that no refactorings would work. By refactorings I am referring to the refactor options available from the Data menu or by right clicking a database object from the Schema View window of Visual Studio 2008. These include options such as Rename and Move schema.

I would try to refactor an object, say to rename it. I would get the first dialog asking what I want to change and allowing me to select extra options such as Preview changes, Generate refactoring log, etc. Upon clicking OK from this dialog I would always get a further error dialog windows stating "Object Reference not set to an instance of an object.". This occurred every time, for any object that I tried to refactor, for any type of refactoring. The error dialog was also of the old style without the nice options to see the full error information or stack trace that is more common in most Visual Studio 2008 error dialogs.

I attached a second instance of Visual Studio to the first instance and tried the refactoring again. The second instance managed to intercept some errors that were being thrown by the first instance and they consisted of multiple errors like the following;

A first chance exception of type 'antlr.NoViableAltForCharException' occurred in Microsoft.Data.Schema.ScriptDom.Sql.dll
A first chance exception of type 'antlr.MismatchedTokenException' occurred in Microsoft.Data.Schema.ScriptDom.Sql.dll
A first chance exception of type 'System.NullReferenceException' occurred in Microsoft.Data.Schema.Sql.dll

I was getting a bit out of my depth here so started a thread on the Microsoft Visual Studio Team System Database Edition forum asking for help.

After some forum discussion and guidance with Joyce Wang of Microsoft and further investigation by me to pinpoint the issue, the problem turned out to be a very obscure problem that they were aware of but does not occur very often.

I had a post deployment SQL script in my project that used a CTE (Common table Expression) to load a Numbers table. Joyce explained that there is a bug in the VS 2008 GDR editions where post deployment scripts with CTE's cause this refactoring error, even though the script is valid and the database project both builds and deploys correctly. Joyce further explained that there is a way to not cause the refactoring error by adding a semi colon after the CTE closing bracket, but of course this then causes the script to no longer work when deployed as it is not the correct SQLsyntax for a CTE based query.

The workaround is to exclude the pre or post deployment script from your database project, action all the refactorings you need to do, then include the script back in the project. Clunky but it gets the job done.

This is apparently fixed in Visual Studio 2010 so I'll have to have a look at that soon.

So, essentially this post is trying to explain that if you get this Object Reference error when refactoring, the most likely culprit is a pre or post deployment script that contains a CTE and the workaround is to exclude the script from the database project while you are doing your refactoring, then include it back in when you are done.

Wednesday, February 03, 2010

Canberra SQL UG with Kevin Wong

Last night (Feb 2nd 2010) was the first meeting of the Canberra SQL Server User group for 2010. This time we had our speaker, Kevin Wong, travel from Sydney to present a topic on SSIS 2008 new features such as ADO.Net connections, the Lookup Cache Connection Manager, using SSIS 2008 to create Reporting Services snapshots, package logging and more. Although there wasn't a large turnout, it was very well received and I think everyone learnt something new.

The prize of Kalen Delaney's book SQL Server 2008 Internals, kindly donated to the group by O'Reilly publications, was won by Mark Stafford. Congratulations Mark, some serious reading to do there.

Shaun Baggett: Sorta SQL