Saturday 10 March 2018

Strange behaviour converting NVARCHAR(MAX) to BINARY

Few days back I was writing a CLR function to be used for hashing string values. The only option was the CLR functions since T-SQL doesn’t have any functionality to convert a string to hashed value using a key. Using the HASHBYTES function you can only provide the algorithm.

DECLARE @Data NVARCHAR(4000);  
SET @Data = CONVERT(NVARCHAR(4000),'My Secret Message');  
SELECT HASHBYTES('SHA1', @Data);  

I have written the CLR function to achieve the requirement, but during testing the validation was failing and when I go through the code I couldn’t find any issue in the function as well. But inspecting carefully I noticed that when a variable type NVARCHAR(n) and a variable type of NVARCHAR(MAX) gives different results when it’s converted to Binary. Which was the root cause for the issue I was facing.


DECLARE 
	@Data1	AS NVARCHAR(MAX) = '1111'
	,@Data2	AS NVARCHAR(10) = '1111'

SELECT 
	CAST(@Data1 AS BINARY(30)) AS ValueMax
SELECT 
	CAST(@Data2 AS BINARY(30)) AS ValueN


image_thumb1

As you can see the above example the zero bytes are represented differently for NVARCHAR(MAX) when it’s converted to BINARY.

I do not have any explanation for this. I am sharing the information in case anyone come across this issue. Please feel free to comment.