Thursday, December 15, 2011

Tales of a Lazy DBA–MySQL, SSIS, and “0” Dates


We’ve recently been tasked with converting a lot of data into our system from a multi-tenant MySQL database. We don’t have a fixed schedule to actually execute the imports because they’ll be on a “per customer” basis.  Well, that sounded like a great task for SSIS.  I set up the ODBC driver, connected, set the initial “sql_mode” option for our connections to ‘MSSQL,ALLOW_INVALID_DATES’, and started to work.
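For reference, here’s roughly what that setting looks like as a statement (MSSQL and ALLOW_INVALID_DATES are both standard sql_mode values; run it per session or bake it into the connection options):

-- Relax strict date handling for this session before extracting
SET SESSION sql_mode = 'MSSQL,ALLOW_INVALID_DATES';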
The first problem we ran into with an ADO.NET connection to MySQL and a simple SELECT * FROM schema.table was a MySQL “Date” column containing a value of ‘0000-00-00’. SSIS threw an error because it had no idea what to do with that value. Thanks to some others who have solved this problem, I realized that within the MySQL SELECT statement, we could do something like:
CASE date_created WHEN '0000-00-00' THEN NULL ELSE date_created END AS date_created
to pass those dates over as NULL. That satisfies the implicit conversion to datetime (SQL 2005) and avoids the invalid dates. I used something similar for a “Time” column:
CAST(Time_Created AS CHAR(8)) AS Time_Created
 
So that solved one particular table export with about 20 or so CAST and CASE statements.  Needless to say, I wasn’t looking forward to doing this for another 500 tables with a total of almost 6000 columns.
I finally set up a really basic query to generate most of the SELECT statements we would need to pull our MySQL data across without too much pain. Admittedly, it’s a small step and I’d still need to copy/paste when I set up each new ADO.NET source, but it worked reasonably well. I’m adding the code snippet for MySQL here in case anyone else has a similar problem.
-- Generate one line of the SELECT statement per column, converting
-- '0000-00-00' dates to NULL and casting times to CHAR(8)
SELECT CONCAT(
        CASE WHEN c.ordinal_position = 1 THEN 'SELECT ' ELSE ', ' END,
        CASE c.data_type
            WHEN 'date' THEN CONCAT('CASE ', c.column_name,
                ' WHEN ''0000-00-00'' THEN NULL ELSE ', c.column_name,
                ' END AS ', c.column_name)
            WHEN 'time' THEN CONCAT('CAST(', c.column_name,
                ' AS CHAR(8) ) AS ', c.column_name)
            ELSE c.column_name
        END,
        -- Append the FROM clause after the table's last column
        CASE WHEN c.ordinal_position <> t1.MaxOrd THEN ''
             ELSE CONCAT('\nFROM MySchema.', c.table_name)
        END
    ) AS Select_Column_Name
FROM information_schema.columns AS c
JOIN (SELECT table_name, MAX(ordinal_position) AS MaxOrd
      FROM information_schema.columns
      WHERE table_schema = 'MySchema'
      GROUP BY table_name) AS t1
    ON c.table_name = t1.table_name
WHERE c.table_schema = 'MySchema'
ORDER BY c.table_name, c.ordinal_position;
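Pasting the generated rows together yields one SELECT per table. For a hypothetical MySchema.customer table with an ID, a date column, and a time column, the generated statement would look like:

SELECT ID
, CASE date_created WHEN '0000-00-00' THEN NULL ELSE date_created END AS date_created
, CAST(time_created AS CHAR(8) ) AS time_created
FROM MySchema.customer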



I’d love to hear other ideas if anyone has encountered this before and come up with a more elegant solution for translating “0” Date or Time data from MySQL into SQL Server.

Thursday, September 15, 2011

Installing SQL Server, Sharepoint, PowerPivot on a single server

I've been trying to set up a single virtual machine for a proof of concept using SQL Server Denali CTP3, Sharepoint 2010, and PowerPivot. I want to get Project "Crescent" running through this as well to prove out some report concepts. I've had a lot of trouble finding the right steps to get everything installed and working correctly on a single box, especially since my Sharepoint knowledge right now is practically non-existent when it comes to BI. Every time I tried to get the Sharepoint / PowerPivot integration working, I hit a roadblock or managed to corrupt my Sharepoint install.

I had the opportunity to speak with Brian Knight ( blog | twitter ) and he pointed me to PowerPivotPro.com which, in turn, led me to PowerPivotGeek.com. They had a page of instructions for single-server installs with a pointer to an MS whitepaper and a private cached file of the whitepaper just in case. This gave me the information needed to get PowerPivot working on my VM.

A couple of notes from my personal experience:
  • Install Sharepoint SP1 right after installing Sharepoint 2010. SP1 is required when you’re using Denali.
  • Do NOT configure Sharepoint until the directions tell you to do so.
  • Use a Named Instance in order to get PowerPivot working.
  • If you plan to use Crescent, make sure that you set up SSAS with the "Tabular Data" option (or whatever the final name ends up being).
Once again, thanks to Brian for pointing me towards the right solution. Thanks to PowerPivotGeek for hosting these files. I imagine the instructions would work for Virtual Server, Virtual PC, VMWare, VirtualBox, or similar Virtual Machine apps. I hope this is helpful to anyone else trying to set up their own VM for testing out the Microsoft BI stack.

Wednesday, June 29, 2011

Service Broker–WAITFOR and Activated Queues

We’ve implemented Service Broker for handling a small portion of the transactions we want to catch and apply into our Operational Data Store. Mostly we wanted to make sure we accounted for actual DELETE operations and handled those records correctly. We can’t use Change Data Capture at this time on our source systems, and Service Broker seemed to fit the bill. We still handle the remaining inserts and updates through SSIS.

While running Service Broker, we noticed a pretty constant CPU hum on the receiving server. That seemed odd, but we had a lot of trouble tracking it down. Regular Profiler traces didn’t show any running T-SQL, and “normal” Service Broker traces weren’t showing much either. We just saw SQL Server running at a pretty constant 20% even with nothing seemingly happening.

I want to give a public thank you to Mark Hill (twitter) for doing a little extra digging and catching the root cause of this. In our tinkering with Service Broker, Queues, Activation, Stored Procs, and such we missed some very important information along the way.

We had written a stored procedure that would be used on the Receiving side of our Service Broker queue to run as the “Activated” proc. Inside the stored procedure, we included a WHILE loop to process anything that was in the queue. If nothing, exit.  That seemed pretty simple – if something comes into the queue, process it and stop when nothing is left.  End of story, right?

Sadly, this is where we missed a small, but important, fact about Service Broker. If you have an activated stored procedure that doesn’t use the WAITFOR command, Service Broker will execute that stored proc as many times as possible looking for something to process. To confirm, we turned off the Activation on the queue and saw the CPU drop to nothing. We re-activated the queue and saw the CPU shoot up again. We then tweaked the stored procedure and wrapped the queue processing in a WAITFOR command with a timeout of 60000 milliseconds. After doing this, we saw the stored procedure run to process everything in the queue, then go idle, which was exactly the intended behavior in the first place.
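For anyone hitting the same thing, here’s a rough sketch of the pattern that fixed it for us. The queue and procedure names are made up for illustration, and in production you’d wrap the RECEIVE and the processing in a transaction so a failure doesn’t lose the message:

CREATE PROCEDURE dbo.ProcessTargetQueue  -- hypothetical activated proc
AS
BEGIN
    DECLARE @handle UNIQUEIDENTIFIER ;
    DECLARE @message_type SYSNAME ;
    DECLARE @body VARBINARY(MAX) ;

    WHILE ( 1 = 1 )
    BEGIN
        -- Block for up to 60 seconds waiting on the queue instead of
        -- returning immediately and being re-activated in a tight loop
        WAITFOR (
            RECEIVE TOP (1)
                @handle = conversation_handle ,
                @message_type = message_type_name ,
                @body = message_body
            FROM dbo.TargetQueue
        ), TIMEOUT 60000 ;

        -- Nothing arrived before the timeout: exit and let activation
        -- fire the proc again when new messages show up
        IF ( @@ROWCOUNT = 0 )
            BREAK ;

        IF ( @message_type = N'http://schemas.microsoft.com/SQL/ServiceBroker/EndDialog' )
            END CONVERSATION @handle ;
        ELSE
            -- Process the message body here (for us, applying the DELETEs)
            PRINT CAST(@body AS NVARCHAR(MAX)) ;
    END
END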

I’m not going to pretend to be an expert on Service Broker. We had tried to code with the intention of being able to use this stored procedure as an Activated stored procedure or as one called from an external process to work through the queue. While that may be possible, it was unnecessary in our actual usage. Adding a WAITFOR command around our queue processing eliminated the extra, non-essential CPU usage and stopped Service Broker from executing the stored procedure for no reason.

Resources

Pro SQL Server 2008 Service Broker ( Amazon | Apress )

WAITFOR (Also see this performance article – it’s the little details that get you)

Thursday, February 17, 2011

Quickly script permissions in SQL 2005+

I wanted an easy way to generate a script to recreate all DB object and schema permissions for a database. Searching online, I found several different examples, but none quite did everything I needed. I put this together from some examples and added it to my toolkit. This script will generate the appropriate T-SQL to grant permissions to DB Objects and Schemas.

 

SELECT
    sdp.state_desc + ' ' + sdp.permission_name
    + ' ON [' + ss.name + '].[' + so.name + '] TO [' + sdpr.name + ']'
        COLLATE Latin1_General_CI_AS
    AS [Permissions T-SQL]
FROM sys.database_permissions AS sdp
JOIN sys.objects AS so
    ON sdp.major_id = so.object_id
    -- Only object-level grants; keeps schema permissions with a matching
    -- major_id from showing up as object grants
    AND sdp.class_desc = 'OBJECT_OR_COLUMN'
JOIN sys.schemas AS ss
    ON so.schema_id = ss.schema_id
JOIN sys.database_principals AS sdpr
    ON sdp.grantee_principal_id = sdpr.principal_id

UNION

SELECT
    sdp.state_desc + ' ' + sdp.permission_name
    + ' ON SCHEMA::[' + ss.name + '] TO [' + sdpr.name + ']'
        COLLATE Latin1_General_CI_AS
    AS [Permissions T-SQL]
FROM sys.database_permissions AS sdp
JOIN sys.schemas AS ss
    ON sdp.major_id = ss.schema_id
    AND sdp.class_desc = 'SCHEMA'
JOIN sys.database_principals AS sdpr
    ON sdp.grantee_principal_id = sdpr.principal_id
ORDER BY [Permissions T-SQL] ;
GO
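For example, against a database with a few hypothetical objects and principals, the output rows look like ready-to-run statements:

GRANT SELECT ON [dbo].[Customer] TO [ReportingUser]
GRANT EXECUTE ON [dbo].[usp_LoadFact] TO [EtlUser]
GRANT SELECT ON SCHEMA::[Staging] TO [EtlUser]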

I know that a lot of you may have something that does this already, but figured it can’t hurt to have another snippet available to generate permission statements. If you make any enhancements, let me know. I’d love to keep this up to date so it’s helpful to others.

Monday, February 7, 2011

SQL 2008 R2 – SSIS Data Export Oddities

I don’t know if anyone else has had these sorts of issues, but we work quite a bit with conversions of various source data into our SQL Server database. I recently was handed a backup from SQL 2008 R2. I figured that wasn’t a huge problem. Restore onto my local test box, use SSIS to push over to a SQL 2005 instance so it can work its way to Production. No big deal, right?

The first time I tried this using the SQL Native Client 10.0 on both sides, I ran into errors with the mappings. I can’t quite figure that one out because from what I can see, there’s nothing at all in the source database using a feature that would not be available in SQL 2005.

I tried exporting to an MS Access MDB file. No luck from SQL 2008 R2 with the default settings because of an invalid size error on a varchar to longtext conversion.

I was able to successfully export using SNAC from SQL 2008 R2 to a SQL 2008 instance. So from there I thought I could upload directly to the SQL 2005 instance. No such luck using SNAC – again. I was able to export from there to an Access MDB file and pull that into SQL 2005.  I don’t quite get why that worked, but figure I’ve got some odd mapping in the XML files defining the defaults that I’m missing.

 

I was recently asked to repeat this task and figured there had to be a better way. This time I restored the DB to my R2 instance – no issues. I then used the SNAC client to access my R2 instance, but set up an OLEDB connection to my SQL Server 2005 target. For some reason, the mappings are just different enough that this worked with no issues. I was able to transfer directly. I now wish I’d tried that the first time, but I’d already blown a couple of hours on it.

If anyone else has encountered that and knows why SNAC from 2008 R2 doesn’t seem to work directly to SNAC on 2005, I’d love to know the reasons as well. If not and you encounter something similar, maybe trying the OLEDB connections will work for you.

Monday, January 17, 2011

SQL 2008 Merge and PK/FK Constraints

We ran into this issue a while back.  SQL 2008 has a “feature” that will not allow an INSERT statement to target a table on either side of a Foreign Key relationship when the FROM clause contains a nested MERGE (or other DML) statement.  If you attempt that kind of insert into a table involved in an FK relationship, you get an error something like:
The target table 'TableName' of the INSERT statement cannot be on either side of a (primary key, foreign key) relationship when the FROM clause contains a nested INSERT, UPDATE, DELETE, or MERGE statement. Found reference constraint 'ConstraintName'.
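To see it in isolation, here’s a minimal sketch, assuming hypothetical dbo.Parent and dbo.Child tables where Child carries a foreign key back to Parent:

-- dbo.Child has an FK referencing dbo.Parent, so this INSERT fails
-- with the reference constraint error above
INSERT INTO dbo.Child (ParentID, Amount)
SELECT m.ID, m.Amount
FROM (
    MERGE INTO dbo.Parent AS t
    USING dbo.ParentStaging AS s
        ON s.ID = t.ID
    WHEN NOT MATCHED BY TARGET THEN
        INSERT (ID, Amount) VALUES (s.ID, s.Amount)
    OUTPUT $action AS MergeAction, inserted.ID, inserted.Amount
) AS m (MergeAction, ID, Amount)
WHERE m.MergeAction = 'INSERT' ;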
This is documented in Connect 435031. There is one workaround documented in the ticket – drop the FK constraint, run the MERGE query, then re-create the constraint. Sadly, this bug is still open, and the requirement to drop a constraint just to run a MERGE statement kind of defeats the purpose of putting a constraint on the table in the first place. While dropping the constraints will work, there is another possible workaround. I was able to run this without any issues while leaving my FK Constraint in place.
  1. Create a temp table that matches the definition of the table into which you want to perform your insert.
  2. Instead of inserting into your base table like you normally would in a merge, write your code to INSERT #Tablename at the beginning of the MERGE process. You’ll still use your main table and staging table inside the MERGE INTO … USING query.
  3. After the Merge, insert into the main table using the values in your Temp table.
Here’s a shell of an example. Line 11 below contains the main difference to the MERGE query from a “normal” MERGE. Note that the OUTPUT columns are qualified with the source alias so the references aren’t ambiguous and the rows captured for the final insert carry the staging values.
   1: --1: Create Temp table
   2: CREATE TABLE #MyFactTable(
   3:     ID INT NULL
   4:     , CustomerName VARCHAR(100) NULL
   5:     , SourceID INT NULL
   6:     , OutputAction VARCHAR(100) NULL
   7: );
   8: 
   9: --2: INSERT into the temp table instead of your normal target table
  10: --   Merge query will be the same otherwise
  11: INSERT INTO #MyFactTable (ID, CustomerName, SourceID, OutputAction)
  12: SELECT so.ID, so.CustomerName, so.SourceID, so.OutputAction
  13: FROM (
  14:     MERGE INTO dbo.MyFactTable AS t
  15:     USING Staging.MyFactTable AS s
  16:     ON ( s.ID = t.ID
  17:         AND s.NewLoad = 0 )
  18: 
  19:     WHEN MATCHED AND ( s.SourceID <> t.SourceID )
  20:             AND s.NewLoad = 0
  21:     THEN UPDATE
  22:         SET RecordState = 0
  23:         , UpdatedDate = GETDATE()
  24: 
  25:     WHEN NOT MATCHED BY TARGET AND s.NewLoad = 0 THEN
  26:         INSERT (ID, CustomerName, SourceID)
  27:         VALUES (s.ID, s.CustomerName, s.SourceID)
  28:     OUTPUT $action AS OutputAction
  29:         , s.ID
  30:         , s.CustomerName
  31:         , s.SourceID
  32:     ) AS so (OutputAction, ID, CustomerName, SourceID)
  33: WHERE so.OutputAction = 'UPDATE' ;
  34: 
  35: --3: Perform the final insert into your target table
  36: INSERT INTO dbo.MyFactTable (ID, CustomerName, SourceID)
  37: SELECT DISTINCT ID, CustomerName, SourceID
  38: FROM #MyFactTable ;
  39: 
  40: --4: Clean up your temp objects.
  41: DROP TABLE #MyFactTable ;

I’d love to hear if anyone else has a workaround for this that doesn’t involve dropping and recreating the FK constraint. If this is an issue for you, please vote for the Connect Ticket.