Secret SQL Escape Characters
I learned of an evil SQL escape sequence today, in the context of a data migration script. The script moves data from one database to another, but the schema changes between databases, so you can’t use BCP. As such, the script needs to ensure that it does not modify any of the database content.
The script works well, and it creates insert statements based on the values and the content of the old data. In the VALUES clause, the script single-quotes the data, and then escapes out any single quotes in the data. According to best practices (and all of the documentation I can find,) that is enough to neutralize any SQL string.
INSERT INTO MyTable ([Column1], [Column2])
VALUES (‘Value '’ 1', ‘Value 2’)
But it turns out there were backslash characters getting randomly dropped in the content we were migrating.
After some research, it turns out that only backslash characters before newlines get erased. If you want to include backslash followed by a <cr><lf>, you have to do give the sequence: \\<cr><lf><cr><lf>. Newlines seem to be the only characters that cause a backslash to become a special character – and the only place I can find that mentions it is http://support.microsoft.com/kb/164291.
As far as I can tell, this must be a relic from ‘Embedded SQL for C and SQL Server’ days: http://msdn2.microsoft.com/En-US/library/aa225198(sql.80).aspx
EXEC SQL INSERT INTO TEXT132 VALUES (‘TEST 192 IS THE TEST FOR THE R\
ULE OF THE CONTINUATION OF LINES FROM ONE LINE TO THE NEXT LINE.');
So, now you know – escaping single quotes isn’t enough to neutralize a SQL string. (And it should be pointed out that you should use SQLCommand and prepared statements whenever possible. In this case, it’s not possible :) )