Monday, September 23, 2013

SSIS, MySQL, and invalid Dates–Take #2

A while back I posted an article on handling dates in select statements from MySQL. The idea was that I wanted a quick way to generate select statements for MySQL that would automatically set any invalid date to NULL. (an invalid date being one such as “0000-00-00” or “1973-02-00”)  After some more trial and error, I realized that this script was incomplete. I’ve created an updated script that should better handle those edge cases.

The script is largely the same as the former, but I’ve added handling for invalid years or months or days. We don’t want to redirect the rows for further error handling. We want to treat the dates that won’t successfully import as NULL. This could be done several ways within SSIS, but handling the transform within the select statement reduces our need to write many transforms to handle each date time for each table.  (I suspect that BIML could be my friend for this task, but I’m not quite ready to tackle this with BIML.)

SELECT CONCAT(CASE
  WHEN ordinal_position = 1 THEN 'SELECT ' 
  ELSE ', ' 
  END
, CASE data_type WHEN 'date'
THEN CONCAT('CASE WHEN ', column_name, '= ''0000-00-00'' THEN NULL '
, 'WHEN ', column_name, ' LIKE ''%0000%'' THEN NULL '
, 'WHEN ', column_name, ' LIKE ''%-00%'' THEN NULL '
, 'else ', column_name, ' END as ', column_name)
WHEN 'timestamp'
THEN CONCAT('CASE WHEN ', column_name, '= ''0000-00-00'' THEN NULL '
, 'WHEN ', column_name, ' LIKE ''%0000%'' THEN NULL '
, 'WHEN ', column_name, ' LIKE ''%-00%'' THEN NULL '
, 'else ', column_name, ' END as ', column_name)
WHEN 'time'
THEN CONCAT('CAST(', column_name, ' AS CHAR(8) ) as ', column_name ) 
ELSE column_name
END
, CASE WHEN ordinal_position <> maxord THEN '' 
  ELSE CONCAT('     FROM MySchema.', c.table_name, ';') 
  END) AS Select_Column_Name
FROM   information_schema.columns AS c
       JOIN (SELECT table_name,
                    Max(ordinal_position) AS MaxOrd
             FROM   information_schema.columns
             WHERE  table_schema = 'MySchema'
             GROUP  BY table_name) AS t1
         ON c.table_name = t1.table_name
WHERE  table_schema = 'MySchema'
ORDER  BY c.table_name, ordinal_position
LIMIT  0, 50000;