Tuesday 18 October 2016

Always Encrypted feature in SQL Server 2016

With the introduction of SQL Server 2016 in June (Official Final Release), Microsoft had introduced few, new and very useful features in to the SQL Server. One such feature is the ‘Always Encrypted’.

‘Always Encrypted’ is the ability to perform SQL operations (there are restrictions) on your data as it were normal (non encrypted), while keeping them encrypted all the time. This means SQL Server will always get encrypted data to be stored into the tables. This will put an extra layer of protection on to your data making sure that even onsite DBA’s or Developers cannot see the plain text value behind the encrypted data using their level of access. (Users with ‘SysAdmin’ access won’t be able to see these details without the Key). Therefore ‘Always Encryption’ provides a separation between those who own the data (and can view it) and those who manage the data (but should have no access).

 

Why Always Encrypted ?

There are many benefits using Always Encrypted feature:

  • It provides a clear separation between the data owners and people who manage it
  • Unless proper access is provided via encryption keys, even DBA’s or SysAdmin users cannot access the data in plain text

Ultimately aforementioned points will provide an unparalleled protection against data breaches and help to protect sensitive information such as credit card numbers, personal details etc. Also this will broaden the boundaries where such sensitive information can be kept.

 

How Always Encrypted Works ?

This is a client-side encryption technology which the SQL Server Client Driver plays a key role.

image

  • The data is transparently encrypted inside a client driver
  • Client manages the encryption key. SQL Server doesn’t have any information regarding the encryption key.

SQL Server can query and perform certain computations on the encryption data, such as equality comparisson, equality joins, group by etc.

 

Always Encrypted Demonstration

We will see how Always Encrypted can be implemented and used. In order to illustrate, we will use a table which contains employee information.

CREATE TABLE Employee(
	Id			INT
	,FirstName	VARCHAR(100)
	,LastName	VARCHAR(100)
	,DOB		DATE
	,SSN		INT
	,[Address]	VARCHAR(255)
	,PostalCode	INT
)

INSERT INTO Employee (
	[Id],[FirstName],[LastName],[DOB],[SSN],[Address],[PostalCode]) 
VALUES 
	(1,'James','Rubin','20-Jul-1986',173456858,'10585 N 600 E',46310)
	,(2,'Austin','Pyatt','24-Dec-1985',138868248,'100 BENTBROOK CT',27519)
	,(3,'Stacey','Munoz','23-Dec-1988',185682639,'1 WOODSIDE DR',4976)
	,(4,'James','Tweed','03-Jan-1987',133890886,'1 AUNNEK CT',95023)
	,(5,'James','Robles','11-Sep-1989',154135505,'101 FISHTRAP RD',35504)
	,(6,'Ebony','Lewis','17-Jul-1988',120488337,'101 N OAKS DR',35180)
	,(7,'Marian','Caro','20-Nov-1985',115281829,'1017 FISK ST SE',49507)
	,(8,'Lynne','Martinez','22-Apr-1985',157900240,'103 UNITY CT',78214)
	,(9,'Elsa','Cole','25-Apr-1990',150631885,'1001 E FERN AVE APT 201',78501)
	,(10,'Kiley','Caldwell','03-Jan-1988',131368172,'103 NOB HILL LN APT 5',40206)
	,(11,'Michael','Soluri','17-Jun-1985',173245124,'10770 S KILBOURN AVE',60453)
	,(12,'Gregory','Emmons','06-Sep-1988',137693229,'10 LOUISA PL APT 2F',7086)
	,(13,'Jessica','Barr','04-Feb-1989',155895863,'1 FAWNRIDGE DR',94945)
	,(14,'Daniel','Mccabe','06-Sep-1985',148236776,'1 CALLE MARGINAL GARCIA',674)
	,(15,'Sharon','Schwartz','06-Sep-1987',117569460,'1 KRITTER CT',8050)
	,(16,'Dorthy','Wear','13-Dec-1988',170517705,'1 CLARK RD',35747)
	,(17,'Betsy','Blansett','17-Jun-1990',182202498,'10 CALLE 1 DE FLORIDA',612)
	,(18,'Margaret','Payne','25-Jul-1985',157359609,'1003 BLOOMFIELD AVE',7006)
	,(19,'James','Walker','26-Jan-1989',142829150,'100 CONGLETON HOLLOW SPUR RD',40447)
	,(20,'Sarah','Reeves','22-Jun-1990',146171169,'1 BLUEBERRY LN',1832)

 

I have a small MVC Web Application which has a page to list out the aforementioned details from the SQL Server. The MVC Controller will load the details to a list of Employee records and pass it to the Html view which will be displayed as follows.

image

In the MVC application I have the following data model to load details from the SQL Database Table.

public class Employee {
        public int Id { get; set; }
        public string FirstName { get; set; }
        public string LastName { get; set; }
        public DateTime DOB { get; set; }
        public int SSN { get; set; }
        public string Address { get; set; }
        public int PostalCode { get; set; }

        public Employee() {
            
        }
    }

And I am using the following connection string in order to connect to the SQL Server Database.

 const string zConnectionString =
                @"Server=.\SQL2K16; Network Library=DBMSSOCN;Database=SQLTraining;Trusted_Connection=True;";

 

There are few steps to be followed on both SQL Server and application side (Client Applications) in order to implement and use this feature.

From the SQL Server side, there are few ways to enable the Always Encrypted feature. We will look more details how to use these feature using the wizard.

 

1. Right click the table which you want to encrypt details and select ‘Encrypt Columns’. This will take you the to wizard.

image

 

2. You will get the introduction screen which contains few details about what ‘Always Encrypted’ is all about. Click next and proceed to the next screen.

image

This is the column selection screen, which allows you to select which columns you want to encrypt and using which Encryption Type. There are two Encryption Types available in SQL Server 2016.

  • Deterministic –> Deterministic encryption always generates the same encrypted value for any given plain text value. Using deterministic encryption allows point lookups, equality joins, grouping and indexing on encrypted columns. However, but may also allow unauthorized users to guess information about encrypted values by examining patterns in the encrypted column, especially if there is a small set of possible encrypted values, such as True/False, or North/South/East/West region. Deterministic encryption must use a column collation with a binary2 sort order for character columns.

 

  •  Randomized –> Randomized encryption uses a method that encrypts data in a less predictable manner. Randomized encryption is more secure, but prevents searching, grouping, indexing, and joining on encrypted columns.

This advice has been included in Microsoft Documentation: Use deterministic encryption for columns that will be used as search or grouping parameters, for example a government ID number. Use randomized encryption, for data such as confidential investigation comments, which are not grouped with other records and are not used to join tables.

So in our example we will choose DOB & SSN columns for encryption. For DOB we will choose Randomized and for SSN we will choose Deterministic.

Once the encryption type is chosen the wizard should be similar to the screen shown below.

image

 

If you look closely, you will be able to see that the Encryption Key combo is disabled. The reason for this is the fact that we haven’t created any Column encryption keys so far. If the keys are created prior to the column selection then you will have the option to choose whether to use an existing key or to generate a new key.

image

In this illustration, we will use the option which will create a new column encryption key. Click next to proceed to the next step.

3. The next step is the Column Master Key Configuration. A Column Master Key will be used to encrypt and protect the Column Encryption Key, which is used to encrypt the data. We will use the option ‘Auto generated column master key’, which the wizard will generate the key for us. When we are creating a new Master Key, there are two options available, where to store the newly generated key. Clicking on the small info button beside each option will give further details about each option

image

 

4. Click next to move to the next step. In this step you can decide whether you require a PowerShellscript to be generated for the encryption process or to proceed with the encryption immediately. In this example we will select the second option and click on the next button.

 

image

In this step you will be presented with the steps which will be followed during the data encryption

image

Click finish to complete the encryption process. Once process is completed click close button.

image

 

Now if you check the details on SQL Table you can see that, data in SSN and DOB columns are encrypted.

SELECT * FROM dbo.Employee

image

If you see the Table creation script for the Employee table now, you could see few changes which has been done by the SQL Server after we enabled the encryption for those two columns.

CREATE TABLE [dbo].[Employee](
	[Id] [INT] NULL,
	[FirstName] [VARCHAR](100) NULL,
	[LastName] [VARCHAR](100) NULL,
	[DOB] [DATE] ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = [CEK_Auto1], 
		ENCRYPTION_TYPE = RANDOMIZED, 
		ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NULL,
	[SSN] [INT] ENCRYPTED WITH (COLUMN_ENCRYPTION_KEY = [CEK_Auto1], 
		ENCRYPTION_TYPE = DETERMINISTIC, 
		ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256') NULL,
	[Address] [VARCHAR](255) NULL,
	[PostalCode] [INT] NULL
) ON [PRIMARY]

You can see that it had added the ENCRYPTED WITH clause for those two columns. ENCRYPTED WITH clause consist 3 attributes which are:

  • COLUMN_ENCRYPTION_KEY –> CEK_Auto1 since we have chosen the option for SQL to generate a new key.
  • ENCRYPTION_TYPE –> Can be either RANDOMIZED or DETERMINISTIC
  • ALGORITHM –> This is always AES_256

If you inspect the Always Encrypted keys in the object explorer in SSMS you could see the following meta data for the Master and the Column Encrypted Keys.

image

 

Column Encrypted Key – CEK_Auto1

image

  • COLUMN_MASTER_KEY –> Name of the column master key protecting the value of the column encryption key.
  • ALGORITHM –> Algorithm used to generate the encrypted value of the column encryption key (RSA_OAEP).
  • ENCRYPTED_VALUE –> Encrypted value of the column encryption key. The encrypted value is assumed to be produced by encrypting the plaintext of the column encryption key using the specified column master key and the specified algorithm.

For further information please refer to the following url: https://blogs.msdn.microsoft.com/sqlsecurity/2015/07/06/always-encrypted-key-metadata/

 

Column Master Key - CMK_Auto1

image

  • KEY_STORE_PROVIDER_NAME –> Name of a provider for the key store that holds the column master key.
  • KEY_PATH –> Key path specifying the location of the column master key in the key store.

 

For further information please refer to the following url: https://blogs.msdn.microsoft.com/sqlsecurity/2015/07/06/always-encrypted-key-metadata/

 

Now if we try to fetch details without doing anything on the sample .Net Application you will get a similar error like shown below.

image

Now we will look into the things that we required to change on our application side (Business) in order to retrieve the required information.

1. Make sure that the target framework is version 4.6 or higher.

image

 

2. In the Connection String include ‘Column Encryption Setting=enabled’

And I am using the following connection string in order to connect to the SQL Server Database.

 const string zConnectionString =
@"Server=.\SQL2K16; Network Library=DBMSSOCN;Database=SQLTraining;Trusted_Connection=True;Column Encryption Setting=enabled;";

 

Now if we check the details from our application we can see that DOB and SSN values are fetched as plain text, even though the values are encrypted in the SQL Server.

image

image

Hope this will help you to understand the ‘Always Encrypted’ feature in SQL Server 2016 and how to integrate it to an existing application.

Sunday 9 October 2016

Understanding JOINs in SQL Server

During my work I get the chance reviewing lots of T-SQL Procedures and Views and I often see that the SQL joins are mis-used in them. When I enquire the developers regarding this, it’s evident that most of the time it has been the case that they don’t have the proper understanding what each JOIN exactly does or how it behaves, ultimately causing the SQL Procedure or the View to return an unexpected resultset. Therefore I thought of writing this blog post.
When we require to fetch details from multiple tables the JOIN caluse is there for the rescue. But in SQL Server there are various types of JOINs which will cater our requirement in different ways. So it’s very important to have a good understanding in these types of JOINs and their usage.
In SQL Server following types of JOINs available.
  • INNER JOIN
  • OUTER JOIN
    • LEFT OUTER JOIN
    • RIGHT OUTER JOIN
    • FULL OUTER JOIN
  • CROSS JOIN
  • CROSS APPLY
  • OUTER APPLY
We will look into the afrementioned JOINs more closely. The scope of this article is to give a high-level idea on the aforementioned  JOINs and the APPLY operator in SQL Server.
To illustrate the aforementioned JOINs I will use the following sample tables:
  • SalesRep
  • SalesDetails
  • RepRating
  • Settings
We consider a case where we have 5 Sales Reps and the details will be saved in ‘RepDetails’ table and the sales transactions which they have done is recorded under ‘SalesDetails’ table. In the SalesDetails table we have included few transactions which we don’t have a matching Sales Rep. Similarly in the RepDetails table there are couple of sales reps which we don’t have any sales infromation.

--== Create Tables ==--
CREATE TABLE RepDetails(
 RepId  INT
 ,RepName VARCHAR(30)
)

CREATE TABLE SalesDetails(
 RepId  INT
 ,SaleMonth VARCHAR(6)
 ,OrderNo VARCHAR(6)
 ,SaleValue MONEY
)

CREATE TABLE RepRating(
 RepId  INT
 ,Rate  INT
 ,YearMonth VARCHAR(6)
)

CREATE TABLE Settings(
 S_Id  INT
 ,S_Desc  VARCHAR(20)
 ,S_Value VARCHAR(20)
)


--== Populate Sample Data ==--
INSERT INTO RepDetails (
 [RepId]
 ,[RepName]
) VALUES 
 (1,'Eugene Thomas')
 ,(2,'John Wheeler')
 ,(3,'Curtis Bailey')
 ,(4,'Jeffrey Garrett')
 ,(5,'Rosemarie Hubbard')

INSERT INTO SalesDetails (
 [RepId]
 ,[SaleMonth]
 ,[OrderNo]
 ,[SaleValue]
) 
VALUES 
(7,'201607','XpyDy3',839)
,(1,'201607','NR0RTp',496)
,(4,'201607','4552T4',299)
,(6,'201607','GKhkyC',877)
,(4,'201606','iyK65Z',291)
,(6,'201606','NFCszW',446)
,(7,'201606','D238bN',135)
,(1,'201607','bERDXk',304)
,(7,'201608','nykZqB',935)
,(4,'201608','R7ea5v',352)
,(6,'201606','VVjIdo',407)
,(7,'201608','vtLT4z',977)
,(2,'201608','xnHTnO',416)
,(1,'201606','jFAJIm',674)
,(6,'201606','0Q011m',480)


INSERT INTO dbo.RepRating(
 RepId
 ,Rate
 ,YearMonth
)
VALUES
 (1,1,'201608')
 ,(3,2,'201608')
 ,(4,1,'201609')
 ,(2,2,'201609')

INSERT INTO dbo.Settings(
 S_Id
 ,S_Desc
 ,S_Value
)
VALUES
 (1,'LedgerMonth','201609')
 ,(2,'TaxRate','10%')

**Note: During the illustraion I will refer the table which is followed by the ‘FROM’ clause as the ‘Left Table’ and the table which is follwed by the JOIN clause as the ‘Right Table’.

INNER JOIN / JOIN

When we join two or more tables using an INNER JOIN, it will only return us the results when records can only be found on both left and right tables which will satisfy the condition we supply.
image


This can be illustrated using a venn diagram as follows:
image

SELECT *
FROM
 dbo.RepDetails AS RD
 JOIN dbo.SalesDetails AS SD
  ON SD.RepId = RD.RepId

image

**Please note: We have sales reps having RepId’s 1,2,3,4, & 5. But in SalesDetails table we have sales details for RepId’s 1,2,4,6 &7. So when these tables are joined the RepId’s which resides on both tables, which are 1,2, and 4 will return the details, ultimately giving us the aforementioned result set.

LEFT OUTER JOIN / LEFT JOIN

In a LEFT OUTER JOIN, unlike the INNER JOIN, it will select all the records from the ‘Left’ table and based on the JOIN condition, it will select any matching records from the ‘Right’ table and return us the results. If there are no matching details on the ‘Right’ table, columns on related to those rows will return as ‘NULL’.
image

This can be shown using a venn diagram as follows:
image

SELECT * 
FROM
 dbo.RepDetails AS RD
 LEFT JOIN dbo.SalesDetails AS SD
  ON SD.RepId = RD.RepId

image

RIGHT OUTER JOIN / RIGHT JOIN

In a RIGHT OUTER JOIN, it will select all records from the ‘Right’ table and based on the JOIN condition it will select any matching records from the left table and return. If there aren’t any matching records on the left table it will return a ‘NULL’ value.










This can be shown using a venn diagram as follows:
image

SELECT * 
FROM
 dbo.SalesDetails AS SD
 RIGHT JOIN dbo.RepDetails AS RD
  ON SD.RepId = RD.RepId

image

FULL OUTER JOIN / FULL JOIN

FULL OUTER JOIN is kind of a mx of both LEFT & RIGHT OUTER JOINs. It will return all rows from both ‘Left’ and ‘Right’ tables based on the JOIN condition. When the details aren’t matched it will return a NULL value in those respective columns.


image


This can be shown using a venn diagram as follows:
image

SELECT * 
FROM
 dbo.RepDetails AS RD
 FULL OUTER JOIN dbo.SalesDetails AS SD
  ON SD.RepId = RD.RepId


image

CROSS JOIN

CROSS JOIN will return a result set which the number of rows equal to rows in ‘Left’ table multiplied by the number of rows in ‘Right’ table. Usually this behaviour is present when there’s no condition provided in the WHERE condition. So each row in the left table is joined to each row in the right table. Usuually this behaviour is called ‘Cartisian Product’


image
SELECT * 
FROM
 dbo.RepDetails AS RD
 CROSS JOIN dbo.Settings AS S

image


But when some condition is provided via the WHERE clause CROSS JOIN will behave like an INNER JOIN
SELECT * 
FROM
 dbo.RepDetails AS RD
 CROSS JOIN dbo.Settings AS S
WHERE
 RD.RepId = S.S_Id

image
**Note: In a CROSS JOIN it’s not possible to refer to a value in the Left table along with the right table. Example following code will result in an error.
SELECT * 
FROM
 dbo.RepDetails AS RD
 CROSS JOIN (SELECT * FROM dbo.Settings AS S WHERE S.S_Id = RD.RepId ) AS ST

 

CROSS APPLY behaves like an INNER JOIN and OUTER APPLY behaves like an OUTER JOIN. But the main differnce in APPLY compared to the JOIN is that the right side of the APPLY operator can reference columns in the table which is on the left side. This is not possible in a JOIN.
For example, suppose we need to fetch sales rep details along with the maximum sale record which they have done. So the following query is not possible since it is returning an error due to the aforementioned reason.
SELECT 
 *
FROM
 dbo.RepDetails AS RD
 JOIN(
  SELECT TOP 1 * 
  FROM 
   dbo.SalesDetails AS SD 
  WHERE 
   RD.RepId = SD.RepId 
  ORDER BY  
   SD.SaleValue DESC
 ) AS SData 
  ON 1=1
It will result in an error:
Msg 4104, Level 16, State 1, Line 78
The multi-part identifier "RD.RepId" could not be bound.



The way to achieve this is by using an APPLY.

CROSS APPLY

Considering the above requirement, we can use a CROSS APPLY in order to achieve the aforementioned.
SELECT 
 *
FROM
 dbo.RepDetails AS RD
 CROSS APPLY(
  SELECT TOP 1 * 
  FROM 
   dbo.SalesDetails AS SD 
  WHERE 
   RD.RepId = SD.RepId 
  ORDER BY  
   SD.SaleValue DESC
 ) AS SData 

image


Noticed the above sample, you can see that it returned three records. But if you inspect closely, the SalesRep table consists with five Reps. But CROSS APPLY has only returned the maximum sales value if there’s a matching record in the table right side to the  APPLY operator. (Similar to an INNER JOIN)


OUTER APPLY

Using OUTER APPLY we can achieved a similar result like CROSS APPLY, but the difference is that even though there aren’t any matching records in the table right side to the APPLY operator, still it will return all the rows from the left side table, will NULL values for the columns in the right side table. We will consider the same query what we used in the above example, but changing the APPLY to an OUTER APPLY.
SELECT 
 *
FROM
 dbo.RepDetails AS RD
 OUTER APPLY(
  SELECT TOP 1 *
  FROM 
   dbo.SalesDetails AS SD 
  WHERE 
   RD.RepId = SD.RepId 
  ORDER BY  
   SD.SaleValue DESC
 ) AS SData
image
There are other capabilities which is possible using the APPLY. The following article explains these capabilites really well: http://bradsruminations.blogspot.sg/2011/04/t-sql-tuesday-017-it-slices-it-dices-it.html

Hope this will help you to understand the JOIN and the APPLY operator in SQL Server and where it can be used precisely.











Thursday 29 September 2016

Enable Code Lense feature on VS 2015 Community Edition

As per the Visual Studio 2015 feature comparisson between the editions, Microsoft has clearly mentioned that certain features will only be available in Professional & Enterprise editions of VS 2015.

One such feature happens to be the ‘CodeLense’

If you google you can easily find out what’s code lense and its features. This is a good article which explains about code lense (http://www.codeproject.com/Articles/794766/What-is-CodeLens)

This is what has been mentioned in the feature comparisson for Visual Studio 2015 (https://www.visualstudio.com/vs/compare/)

image_thumb3

However installing SQL Server Data Tools for VS 2015, will enable this feature in the Community Edition of VS 2015.

SSDT can be obtained from the this site: https://msdn.microsoft.com/en-us/mt186501.aspx

image_thumb10

Codel lense feature can be customize using option dialog.

image_thumb8

Hope this will be useful to you.

Wednesday 7 September 2016

Native JSON Support in SQL Server 2016

When it comes to modern web development, JSON is one of the well known technologies when you require to exchange information between different applications. Before JSON, XML was used (and still being used in various applications and technologies) to do this job. But compared to XML, JSON is less verbose (like XML, JSON doesn’t have a closing tag), and it will make the JSON data somewhat smaller in size, ultimately making the data flow much faster. Perhaps the most significant advantage that JSON has over XML is that JSON is a subset of JavaScript, so code to parse and package it fits very naturally into JavaScript code. This seems highly beneficial for JavaScript programs and that happens to be a good reason for JSON to be very popular among web application developers.

However using XML or JSON is up to the personal preference and the requirement.

Prior to SQL Server 2016, there’s wasn’t any support for JSON in the earlier editions. So native JSON support is one of the new features which Microsoft introduced in SQL Server 2016.

Prior to SQL Server 2016 there are other databases which supports JSON.

  • MongoDB
  • CouchDB
  • eXistDB
  • Elastisearch
  • BaseX
  • MarkLogic
  • OrientDB
  • Oracle Database
  • PostgresSQL
  • Riak

But my main focus in this post will be the JSON support in SQL Server 2016.

In order to support JSON, in SQL 2016 following in built functions have been introduced:

  • ISJSON
  • JSON_VALUE
  • JSON_QUERY
  • JSON_MODIFY
  • OPENJSON
  • FOR JSON

 

There’s no specific data type in SQL Server to be used for JSON (unlike XML). You have to use NVARCHAR when you interact with JSON in SQL Server.

 

image

 

This is how we assign JSON data to a variable.

 

DECLARE @varJData AS NVARCHAR(4000)
SET @varJData = 
N'{
	"OrderInfo":{
		"Tag":"#ONLORD_12546_45634",
		"HeaderInfo":{
			"CustomerNo":"CUS0001",
			"OrderDate":"04-Jun-2016",
			"OrderAmount":1200.00,
			"OrderStatus":"1",
			"Contact":["+0000 000 0000000000", "info@abccompany.com", "finance@abccompany.com"]
		},
		"LineInfo":[
			{"ProductNo":"P00025", "Qty":3, "Price":200},
			{"ProductNo":"P12548", "Qty":2, "Price":300}
		]
	}
}'

We will look closely how the aforementioned functions can be used with some sample

 

ISJSON()

As the name implies ISJSON functions is used to validate a given JSON string. The function will return in INT value and If the provided string is properly formatted as JSON it will return 1 else it will return 0.

Eg:

SELECT ISJSON(@varJData)

 

JSON_VALUE()

JSON_VALUE function can be used to return a scalar value from a JSON string.

Eg:

DECLARE @varJData AS NVARCHAR(4000)
SET @varJData = 
N'{
	"OrderInfo":{
		"Tag":"#ONLORD_12546_45634",
		"HeaderInfo":{
			"CustomerNo":"CUS0001",
			"OrderDate":"04-Jun-2016",
			"OrderAmount":1200.00,
			"OrderStatus":"1",
			"Contact":["+0000 000 0000000000", "info@abccompany.com", "finance@abccompany.com"]
		},
		"LineInfo":[
			{"ProductNo":"P00025", "Qty":3, "Price":200},
			{"ProductNo":"P12548", "Qty":2, "Price":300}
		]
	}
}'

SELECT JSON_VALUE(@varJData,'$.OrderInfo.Tag')
SELECT JSON_VALUE(@varJData,'$.OrderInfo.HeaderInfo.CustomerNo')

Please note that the provided key is case sensitive and instead of ‘Tag’ if you pass ‘tag’ it will return a NULL since the function cannot find the key.

SELECT JSON_VALUE(@varJData,'$.OrderInfo.tag') /* This will returns NULL */

In such case if you require to see the exact error or the root cause, you need to specify ‘strict’ prior to the key. Eg:

SELECT JSON_VALUE(@varJData,'strict $.OrderInfo.tag') /* This will thorw an Error */

This will return the following error message instead of returning a NULL value.

Msg 13608, Level 16, State 1, Line 62
Property cannot be found on the specified JSON path.

Also JSON_VALUE can be used to fetch an element from a simple array (not an object array). In our sample JSON there are two arrays, which are ‘Contact’ and ‘LineInfo’, where the first being a simple string array and the latter is an object array.

Suppose if we require to fetch only the phone number from the contact details we can use the following query:

SELECT JSON_VALUE(@varJData,'$.OrderInfo.HeaderInfo.Contact[0]')

Also this can be used when we require to fetch an attribute from an array element as well. Suppose if we require to get the product number from the first element of the ‘LineInfo’ we could use:

SELECT JSON_VALUE(@varJData, '$.OrderInfo.LineInfo[0].ProductNo')

 

JSON_QUERY()

JSON_QUERY function is used when you require to extract an array of data or an object from a JSON.And we can extract the contact details and the line details which are arrays in this scenario as follows

SELECT JSON_QUERY(@varJData, '$.OrderInfo.HeaderInfo.Contact')
SELECT JSON_QUERY(@varJData, '$.OrderInfo.LineInfo')

And this can be used to fetch a certain element from an object array. Suppose if we want to fetch details for the second prodcut in the LineInfo section we can use the following:

SELECT JSON_QUERY(@varJData, '$.OrderInfo.LineInfo[1]')

**Note: If the JSON text contains duplicate properties - for example, two keys with the same name on the same level- the JSON_VALUE and JSON_QUERY functions will return the first value that matches the path.

 

JSON_MODIFY()

JSON_MODIFY function updates the value of a property in a JSON string and return the updated JSON string. The syntax for this function is as follows:

JSON_MODIFY(expression, path, new_value)

Using this function you can either Update, Insert, Delete or Append a value to the JSON string. We will see each of these operations using the above JSON string.

Updating an Exitsing Value

In order to update an existing value you need to provide the exact path followed by the value which should be updated to.

SET @varJData = JSON_MODIFY(@varJData,'$.OrderInfo.Tag','#NEWTAG_00001')
SELECT JSON_VALUE(@varJData,'$.OrderInfo.Tag')

 

Deleting an Existing Value

In order to delete an existing value you need to provide the exact path follwed by the value ‘NULL’.

SET @varJData = JSON_MODIFY(@varJData,'$.OrderInfo.Tag',NULL)
SELECT JSON_VALUE(@varJData,'$.OrderInfo.Tag')
PRINT @varJData

When the value is printed you can see that the ‘Tag’ attribute has been completely removed from the JSON string.

image

 

Inserting a Value

In order to insert an attribute along with a value you need to provide a path which isn’t currently availble in the JSON followed by the value. If the provided path is already present then the existing value will be replaced by the new value. The new value will always be added at the end of the existing JSON string.

SET @varJData = JSON_MODIFY(@varJData,'$.OrderInfo.Batch','#B_100000')
SELECT JSON_VALUE(@varJData,'$.OrderInfo.Batch')
PRINT @varJData

image

 

Appending a Value

In order to append an existing array in a JSON, you need to use ‘append’ before the path. Suppose if we require to add another element to the

SET @varJData = JSON_MODIFY(@varJData, 'append $.OrderInfo.HeaderInfo.Contact','+0010 111 1111111111')
SELECT JSON_QUERY(@varJData,'$.OrderInfo.HeaderInfo.Contact')

image

 

JSON_MODIFY can only manipulate a single value at a time. Therefore if the requirement is to change multiple values within a single query, you need to use JSON_MODIFY function multiple times. Suppose if we require to change the ‘ProductNo’ and the ‘Price’ of the first product in the ‘LineInfo’ we coud use the following syntax.

SET @varJData =
	JSON_MODIFY( 
		JSON_MODIFY(@varJData,'$.OrderInfo.LineInfo[0].ProductNo','P99999')
		,'$.OrderInfo.LineInfo[0].Price'
		,150
	)

image

 

FOR JSON

FOR JSON functionality is used When we are required to export SQL Tabular data as JSON data. This is very similar to the functionality of ‘FOR XML’. Each row will be formatted as a JSON object and values in cells will be generated as values of those respective JSON objects. Column names (or aliases) will be used as key names. Based on the options provided there are two variations in ‘FOR JSON’ usage.

1. FOR JSON AUTO - This will automatically create nested JSON sub arrays based on the table hierarchy used in the query. (similar to FOR XML AUTO)

2. FOR JSON PATH - This enables you to define the structure of output JSON using the column names/aliases. If you put dot-separated names in the column aliases, JSON properties will follow the naming convention. (This is similar to FOR XML PATH where you can use slash separated paths)

 

In order to illustrate the aforementioned concepts we need to prepare some sample data. Please use the following scripts to generate the necessary data.

--== Generate Required Schemas ==--
CREATE TABLE OrderHeader(
	TAG					VARCHAR(24)
	,ORD_NO				VARCHAR(10)
	,CUST_NO			VARCHAR(8)
	,ORD_DATE			DATE
	,ORD_AMOUNT			MONEY
	,ORD_STATUS			TINYINT
	
)

CREATE TABLE OrderLine(
	ORD_NO				VARCHAR(10)
	,LINE_NO			INT
	,PROD_NO			VARCHAR(8)
	,ORD_QTY			INT
	,ITEM_PRICE			MONEY
)

CREATE TABLE CustomerContact(
	CONT_ID				INT
	,CUST_NO			VARCHAR(8)
	,CONTACT_DETAILS	VARCHAR(24)
)

--== Insert Sample Data ==--
INSERT INTO dbo.OrderHeader(TAG,ORD_NO,CUST_NO,ORD_DATE,ORD_AMOUNT,ORD_STATUS)
VALUES('#ONLORD_12546_45634','ORD_1021','CUS0001','04-Jun-2016',1200.00,1)

INSERT INTO dbo.OrderLine(ORD_NO,LINE_NO,PROD_NO,ORD_QTY,ITEM_PRICE)
VALUES ('ORD_1021',1,'P00025',3,200.00), ('ORD_1021',1,'P12548',2,300.00)

INSERT INTO dbo.CustomerContact(CONT_ID, CUST_NO, CONTACT_DETAILS)
VALUES (1,'CUS0001','+0000 000 0000000000') ,(2,'CUS0001','info@abccompany.com'),(3,'CUS0001','finance@abccompany.com')

 

Extracting data as JSON using FOR JSON AUTO

SELECT 
    H.TAG
   ,H.ORD_NO
   ,H.CUST_NO
   ,H.ORD_DATE
   ,H.ORD_AMOUNT
   ,H.ORD_STATUS
   ,L.ORD_NO
   ,L.LINE_NO
   ,L.PROD_NO
   ,L.ORD_QTY
   ,L.ITEM_PRICE
FROM
	dbo.OrderHeader AS H
	JOIN dbo.OrderLine AS L
		ON L.ORD_NO = H.ORD_NO
WHERE
	H.ORD_NO = 'ORD_1021'
FOR JSON AUTO

 

You will get a similar result which is shown below:

[
    {
        "TAG":"#ONLORD_12546_45634",
        "ORD_NO":"ORD_1021",
        "CUST_NO":"CUS0001",
        "ORD_DATE":"2016-06-04",
        "ORD_AMOUNT":1200.0000,
        "ORD_STATUS":1,
        "L":[
            {"ORD_NO":"ORD_1021","LINE_NO":1,"PROD_NO":"P00025","ORD_QTY":3,"ITEM_PRICE":200.0000},
            {"ORD_NO":"ORD_1021","LINE_NO":1,"PROD_NO":"P12548","ORD_QTY":2,"ITEM_PRICE":300.0000}
        ]
    }
]

As described previously ‘FOR JSON AUTO’ will simply convert the column names or aliases as keys and produce the JSON. Table aliases will be used to create sub arrays.

But we could get a similar resultset like what we had in our previous examples by tweaking the above select statement as follows:

SELECT 
    H.TAG AS Tag
   ,H.ORD_NO AS OrderNo
   ,H.CUST_NO AS CustNo
   ,H.ORD_DATE AS OrderDate
   ,H.ORD_AMOUNT AS OrderAmount
   ,H.ORD_STATUS AS  OrderStatus
   ,LineInfo.ORD_NO AS [OrderNo]
   ,LineInfo.LINE_NO AS [LineNo]
   ,LineInfo.PROD_NO AS [ProdNo]
   ,LineInfo.ORD_QTY AS [Qty]
   ,LineInfo.ITEM_PRICE AS [ItemPrice]
FROM
	dbo.OrderHeader AS H
	JOIN dbo.OrderLine AS LineInfo
		ON LineInfo.ORD_NO = H.ORD_NO
WHERE
	H.ORD_NO = 'ORD_1021'
FOR JSON AUTO, ROOT ('OrderInfo')

Then we will be able to get the following JSON string.

{
    "OrderInfo":[
        {
            "Tag":"#ONLORD_12546_45634",
            "OrderNo":"ORD_1021",
            "CustNo":"CUS0001",
            "OrderDate":"2016-06-04",
            "OrderAmount":1200.0000,
            "OrderStatus":1,
            "LineInfo":[
                {"OrderNo":"ORD_1021","LineNo":1,"ProdNo":"P00025","Qty":3,"ItemPrice":200.0000},
                {"OrderNo":"ORD_1021","LineNo":1,"ProdNo":"P12548","Qty":2,"ItemPrice":300.0000}
            ]
        }
    ]
}

 

Extracting data as JSON using FOR JSON PATH

We can use the FOR JSON PATH functionality to format the output JSON the way we require easily. But there’s a restriction when we use ‘FOR JSON PATH’ to extract data, which is that you cannot have the same column name (or alias) duplicated among multiple columns. This will result in an error.

We will see how the details will be fetched using ‘FOR JSON PATH’

SELECT 
    H.TAG
   ,H.ORD_NO
   ,H.CUST_NO
   ,H.ORD_DATE
   ,H.ORD_AMOUNT
   ,H.ORD_STATUS
   --,L.ORD_NO	--If this line is uncommented it will throw an error	
   ,L.LINE_NO
   ,L.PROD_NO
   ,L.ORD_QTY
   ,L.ITEM_PRICE
FROM
	dbo.OrderHeader AS H
	JOIN dbo.OrderLine AS L
		ON L.ORD_NO = H.ORD_NO
WHERE
	H.ORD_NO = 'ORD_1021'
FOR JSON PATH

 

We will get the following JSON result.

[
    {
        "TAG":"#ONLORD_12546_45634",
        "ORD_NO":"ORD_1021",
        "CUST_NO":"CUS0001",
        "ORD_DATE":"2016-06-04",
        "ORD_AMOUNT":1200.0000,
        "ORD_STATUS":1,
        "LINE_NO":1,
        "PROD_NO":"P00025",
        "ORD_QTY":3,
        "ITEM_PRICE":200.0000
    },
    {
        "TAG":"#ONLORD_12546_45634",
        "ORD_NO":"ORD_1021",
        "CUST_NO":"CUS0001",
        "ORD_DATE":"2016-06-04",
        "ORD_AMOUNT":1200.0000,
        "ORD_STATUS":1,
        "LINE_NO":1,
        "PROD_NO":"P12548",
        "ORD_QTY":2,
        "ITEM_PRICE":300.0000
    }
]

Advantage in using ‘FOR JSON PATH’ is that you have the ability to control the structure using the column names/aliases. When dot seperated aliases are used, JSON properties will follow the naming convention. Please consider the below query and the results.

SELECT 
    H.TAG		AS 'HeaderInfo.Tag'
   ,H.ORD_NO		AS 'HeaderInfo.OrderNo'
   ,H.CUST_NO		AS 'HeaderInfo.CustNo'
   ,H.ORD_DATE		AS 'HeaderInfo.OrderDate'
   ,H.ORD_AMOUNT	AS 'HeaderInfo.OrderAmount'
   ,H.ORD_STATUS	AS 'HeaderInfo.OrderStatus'
   ,L.ORD_NO		AS 'LineInfo.OrderNo'
   ,L.LINE_NO		AS 'LineInfo.LineNo'
   ,L.PROD_NO		AS 'LineInfo.ProdNo'
   ,L.ORD_QTY		AS 'LineInfo.Qty'
   ,L.ITEM_PRICE	AS 'LineInfo.ItemPrice'
FROM
	dbo.OrderHeader AS H
	JOIN dbo.OrderLine AS L
		ON L.ORD_NO = H.ORD_NO
WHERE
	H.ORD_NO = 'ORD_1021'
FOR JSON PATH

You will see the following JSON result.

[
    {
        "HeaderInfo":{
            "Tag":"#ONLORD_12546_45634",
            "OrderNo":"ORD_1021",
            "CustNo":"CUS0001",
            "OrderDate":"2016-06-04",
            "OrderAmount":1200.0000,
            "OrderStatus":1
        },
        "LineInfo":{"OrderNo":"ORD_1021","LineNo":1,"ProdNo":"P00025","Qty":3,"ItemPrice":200.0000}
    },
    {
        "HeaderInfo":{
            "Tag":"#ONLORD_12546_45634",
            "OrderNo":"ORD_1021",
            "CustNo":"CUS0001",
            "OrderDate":"2016-06-04",
            "OrderAmount":1200.0000,
            "OrderStatus":1
        },
        "LineInfo":{"OrderNo":"ORD_1021","LineNo":1,"ProdNo":"P12548","Qty":2,"ItemPrice":300.0000}
    }
]

 

OPENJSON

OPENJSON is a table value function which will go through a given JSON string, and returns a relational table with it’s contents. It’ll iterate through JSON object arrays, elemets and generates a row for each element. There are two variations of this functionality.

  • Without a pre-defined schema where the values will be returned as key value pairs including it’s type to identify what sort of value is being returned.
  • With a well defined schema. This schema will be provided by us in the OPENJSON statement.

 

OPENJSON without a pre-defined schema

We will use the following JSON data string to find out the types which will be returned based on the data type.

{
    "Null Data":null,
    "String Data":"Some String Data",
    "Numeric Data": 1000.00,
    "Boolean Data": true,
    "Array Data":["A","B","C"],
    "Object Data":{"SomeKey":"Some Value"}
    }

 

DECLARE @vJSON AS NVARCHAR(4000) = N'{
	"Null Data":null,
	"String Data":"Some String Data",
	"Numeric Data": 1000.00,
	"Boolean Data": true,
	"Array Data":["A","B","C"],
	"Object Data":{"SomeKey":"Some Value"}
	}';  
  
SELECT * FROM OPENJSON(@vJSON) 

image

 

With some realistic set of JSON data.

DECLARE @vJSON AS NVARCHAR(4000) = N'{
	"Tag":"#ONLORD_12546_45634",
	"OrderNo":"ORD_1021",
	"CustNo":"CUS0001",
	"OrderDate":"2016-06-04",
	"OrderAmount":1200.0000,
	"OrderStatus":1
}';  
  
SELECT * FROM OPENJSON(@vJSON)  

image

 

OPENJSON with a pre-defined schema

We will use the same JSON string which we have used in the previous example and generate the result set with a pre-defined schema.

DECLARE @vJSON AS NVARCHAR(4000) = N'{
	"Tag":"#ONLORD_12546_45634",
	"OrderNo":"ORD_1021",
	"CustNo":"CUS0001",
	"OrderDate":"2016-06-04",
	"OrderAmount":1200.0000,
	"OrderStatus":1
}';  
  
SELECT * FROM OPENJSON(@vJSON) WITH(
	Tag				VARCHAR(24)
	,OrderNo		VARCHAR(8)
	,CustNo			VARCHAR(8)
	,OrderDate		DATE
	,OrderAmount	MONEY
	,OrderStatus	INT
) 
 

image

 

This is basically what has  been provided to support with JSON data in SQL 2016 natively. Hope this would be helpful for you.

Tuesday 23 August 2016

DROP IF EXISTS in SQL Server 2016 (DIE)

 

Prior to SQL Server 2016, when we need to drop a SQL Object, it's the best practice to check whether the respective object exists or not. Otherwise the operation will return in an error.


DROP TABLE [SomeTable]

If the object is not found it will return the following error.

Msg 3701, Level 11, State 5, Line 11
Cannot drop the table 'SomeTable', because it does not exist or you do not have permission.

Hence we need to change the syntax as:

IF EXISTS(SELECT 'x' FROM sys.objects AS O WHERE O.name = 'SomeTable' AND O.[type] = 'U')
    DROP TABLE [SomeTable]

   
OR

IF OBJECT_ID('dbo.SomeTable','U') IS NOT NULL
    DROP TABLE [SomeTable]

   
   
In SQL Server 2016 there is an easier way to do this using comparatively less amount for coding.

DROP TABLE IF EXISTS [SomeTable];
DROP PROCEDURE IF EXISTS [SomeProcedure];

Even this can be use when dropping columns and constraints from a table.

ALTER TABLE [TableName] DROP CONSTRAINT IF EXISTS [ConstraintName]
ALTER TABLE [TableName] DROP COLUMN IF EXISTS [TableName]

Eg:
CREATE TABLE SomeTable(
    Id        INT
    ,Name    VARCHAR(10)        NOT NULL CONSTRAINT [DF_SomeTable_Name] DEFAULT ('')
)

ALTER TABLE dbo.SomeTable
DROP CONSTRAINT IF EXISTS [DF_SomeTable_Name]

ALTER TABLE dbo.SomeTable
DROP COLUMN IF EXISTS [Name]


The beauty of this functionality is that even the object does not exists, it will not fail and execution will continue.

Currently, the following objects can be dropped with the DIE functionality:

  • ASSEMBLY
  • VIEW
  • DATABASE
  • DEFAULT
  • FUNCTION
  • PROCEDURE
  • INDEX
  • AGGREGATE
  • ROLE
  • RULE
  • SCHEMA
  • SECURITY POLICY
  • SEQUENCE
  • SYNONYM
  • TABLE
  • TRIGGER
  • TYPE
  • USER
  • VIEW

Hope this will be useful to you.

Saturday 13 August 2016

Connecting to an MS SQL Instance using NodeJS (Fixing ConnectionError: Port for SQLServer not found in ServerName & Failed to connect to localhost:undefined in 15000ms)

After few debates and discussions on new technologies in the market and how to adapt to them, during the weekend I thought of exploring NodeJS and it applications. Since I spent most of my time designing data-centric applications at office, as the first step I thought of connecting to MS SQL Server using NodeJS.
As a newbie to NodeJS, I went through the official documentation and managed to achieve it. However, during the course, I faced many difficulties and by referring to many of the articles, I was managed to resolve all these hurdles.
So I thought of including these problems which I faced and how to overcome them. So it would be a great help to anyone who’s exploring or trying to achieve this more easily.
For this, I will be using SQL Server 2014 on an Instance (.\SQL2K14).

1. First you need to download and install NodeJS. (https://nodejs.org/en/)
2. Install MSSQL package for Node, using the following syntax: (Use windows command prompt)
npm install mssql
3. Create a file named ‘connecttosql.js’ and include the following code:

//We require mssql package for this sample
var sqlcon = require('mssql');

function GetSQLData(queryCallback){        //A callback function is taken as an argument. Once the operation is completed we will be calling this
   
    //SQL Configuration
    var config = {
        user:'###'            //SQL User Id. Please provide a valid user
        ,password:'######'    //SQL Password. Please provide a valid password
        ,server:'localhost\\SQL2K14'   
        /*
            Since my SQL is an instance I am using 'localhost\\Instance'.
            If you have SQL installed on the default instance, it should be server:'localhost'
        */
        ,database: 'master'        //You can use any database here
    }

    var connection = new sqlcon.Connection(config,function(err){
        //In case of an error print the error to the console. You can have your customer error handling
        if (err) console.log(err);
       
        //Query Database
        var dbQuery = new sqlcon.Request(connection);
        //Purposely we are delaying the results
        dbQuery.query("WAITFOR DELAY '00:00:05';SELECT * FROM INFORMATION_SCHEMA.TABLES",function(err,resultset){
            //In case of an error print the error to the console. You can have your customer error handling
            if (err) console.log(err);
           
            //Passing the resultset to the callback function
            queryCallback(resultset);
        })
    });
}

function callback (resultset){
    console.dir('Results returned and printed from the call back function');
    console.dir(resultset);
   
    //Exit the application
    console.dir('Exiting the Application');
    process.exit(0);
}

//Calling the function
console.dir('Calling GetSQLData');
GetSQLData(callback);
/*
    Once we call this function even there's a delay to return the results
    you will see the next line printing 'Waiting for callback function to get invoked...'
*/
console.dir('Waiting for callback function to get invoked...');

I have provided the relevant information as comments.
Before running the program please make sure the following configurations on the SQL server is already done:

1. Enable TCP/IP Protocols in SQL Server Configuration Manager for both server and client.

image

Or else when running it will result an error shown below:
{ [ConnectionError: Port for SQL2K14 not found in ServerName;[YourPCName];InstanceName;SQL2K14;IsClustered;No;Version;12.0.2000.8;np;\\[YourPCName]\pipe\M
SSQL$SQL2K14\sql\query;;]
  name: 'ConnectionError',
  message: 'Port for SQL2K14 not found in ServerName;[YourPCName];InstanceName;SQL2K14;IsClustered;No;Version;12.0.2000.8;np;\\\\[YourPCName]\\pipe\\MSSQL
$SQL2K14\\sql\\query;;',
  code: 'EINSTLOOKUP' }
{ [ConnectionError: Connection is closed.]
  name: 'ConnectionError',
  message: 'Connection is closed.',
  code: 'ECONNCLOSED' }

2. In SQL Server Configuration Manager under SQL Server Services make sure that ‘SQL Server Browser’ service is running.

image

Or else when running the script it will result an error shown below:
{ [ConnectionError: Failed to connect to localhost:undefined in 15000ms]
  name: 'ConnectionError',
  message: 'Failed to connect to localhost:undefined in 15000ms',
  code: 'ETIMEOUT' }
{ [ConnectionError: Connection is closed.]
  name: 'ConnectionError',
  message: 'Connection is closed.',
  code: 'ECONNCLOSED' }

if the aforementioned issues are already addressed execute the above file using the following syntax in a Windows Command Window:
node connecttosql.js
You should get a similar result which is shown below:
'Calling GetSQLData'
'Waiting for callback function to get invoked...'
'Results returned and printed from the call back function'
[ { TABLE_CATALOG: 'master',
    TABLE_SCHEMA: 'dbo',
    TABLE_NAME: 'spt_fallback_db',
    TABLE_TYPE: 'BASE TABLE' },
  { TABLE_CATALOG: 'master',
    TABLE_SCHEMA: 'dbo',
    TABLE_NAME: 'spt_fallback_dev',
    TABLE_TYPE: 'BASE TABLE' },
  { TABLE_CATALOG: 'master',
    TABLE_SCHEMA: 'dbo',
    TABLE_NAME: 'spt_fallback_usg',
    TABLE_TYPE: 'BASE TABLE' },
  { TABLE_CATALOG: 'master',
    TABLE_SCHEMA: 'dbo',
    TABLE_NAME: 'spt_values',
    TABLE_TYPE: 'VIEW' },
  { TABLE_CATALOG: 'master',
    TABLE_SCHEMA: 'dbo',
    TABLE_NAME: 'spt_monitor',
    TABLE_TYPE: 'BASE TABLE' },
  { TABLE_CATALOG: 'master',
    TABLE_SCHEMA: 'dbo',
    TABLE_NAME: 'MSreplication_options',
    TABLE_TYPE: 'BASE TABLE' } ]
'Exiting the Application'

I hope this will help anyone who’s using node to connect to SQL and facing the aforementioned issues.

Wednesday 9 March 2016

String or binary data would be truncated / Arithmetic overflow error converting numeric to data type numeric – Workaround

 

There’s nothing more annoying than getting the error ‘String or binary data would be truncated’ or ‘Arithmetic overflow error converting numeric to data type numeric’, when you need to insert data to a table using a SELECT statement. To make it more interesting, the SQL won’t be providing us the name of the column (or columns) which is causing this issue. (This is due to the SQL architecture on how it executes queries)

To illustrate this I will use a small sample.

Suppose we have a table to store some Customer details:

CREATE TABLE Customer_Data(
    CustId        TINYINT
    ,CustFName    VARCHAR(10)
    ,CustLName    VARCHAR(10)
    ,MaxCredit    NUMERIC(6,2)
)

We will try to insert details to the above table. (In reality the SELECT statement will be very complex and could fetch lots of rows)


INSERT INTO dbo.Customer_Data(
    CustId
    ,CustFName
    ,CustLName
    ,MaxCredit
)

SELECT 1 AS CustId,'John' AS CustFName,'Doe' AS CustLName,1000.00 AS MaxCredit UNION ALL
SELECT 2,'Jane','Doe',1000.00 UNION ALL
SELECT 3,'James','Whitacker Jr.',15000.00

 

This will result the following error:

Msg 8152, Level 16, State 14, Line 48
String or binary data would be truncated.

The statement has been terminated.

The challenge here is to find out actually which columns are having this issue. (As mentioned in reality number of columns could be very large)

However there is a small workaround which we can use to find out the columns which is causing the insertion to fail. You need to do the following in order to find out these columns.

1. First create a table using the same select statement. (You can either create a temporary table or an actual table based on the environment and your need). I will create two tables, one actual and one temporary to illustrate both the options.

SELECT A.*
INTO Temp_Customer_Data
FROM(
    SELECT 1 AS CustId,'John' AS CustFName,'Doe' AS CustLName,1000.00 AS MaxCredit UNION ALL
    SELECT 2,'Jane','Doe',1000.00 UNION ALL
    SELECT 3,'James','Whitacker Jr.',15000.00
) AS A


SELECT A.*
INTO #Customer_Data
FROM(
    SELECT 1 AS CustId,'John' AS CustFName,'Doe' AS CustLName,1000.00 AS MaxCredit UNION ALL
    SELECT 2,'Jane','Doe',1000.00 UNION ALL
    SELECT 3,'James','Whitacker Jr.',15000.00
) AS A

2. Use the following query to identify the issue columns

Actual Table:

;WITH Cte_Source AS (
SELECT
    C.COLUMN_NAME
    ,C.DATA_TYPE
    ,C.CHARACTER_MAXIMUM_LENGTH
    ,C.NUMERIC_PRECISION
    ,C.NUMERIC_SCALE
FROM
    INFORMATION_SCHEMA.TABLES AS T
    JOIN INFORMATION_SCHEMA.COLUMNS AS C
        ON C.TABLE_CATALOG = T.TABLE_CATALOG
        AND C.TABLE_NAME = T.TABLE_NAME
        AND C.TABLE_SCHEMA = T.TABLE_SCHEMA
WHERE
    T.TABLE_NAME = 'Temp_Customer_Data'        -- Source Table
    AND T.TABLE_SCHEMA = 'dbo'
)
,Cte_Destination AS (
SELECT
    C.COLUMN_NAME
    ,C.DATA_TYPE
    ,C.CHARACTER_MAXIMUM_LENGTH
    ,C.NUMERIC_PRECISION
    ,C.NUMERIC_SCALE
FROM
    INFORMATION_SCHEMA.TABLES AS T
    JOIN INFORMATION_SCHEMA.COLUMNS AS C
        ON C.TABLE_CATALOG = T.TABLE_CATALOG
        AND C.TABLE_NAME = T.TABLE_NAME
        AND C.TABLE_SCHEMA = T.TABLE_SCHEMA
WHERE
    T.TABLE_NAME = 'Customer_Data'        -- Destination Table
    AND T.TABLE_SCHEMA = 'dbo'
)
SELECT
    S.COLUMN_NAME
   ,S.DATA_TYPE
   ,S.CHARACTER_MAXIMUM_LENGTH
   ,S.NUMERIC_PRECISION
   ,S.NUMERIC_SCALE

   ,D.COLUMN_NAME
   ,D.DATA_TYPE
   ,D.CHARACTER_MAXIMUM_LENGTH
   ,D.NUMERIC_PRECISION
   ,D.NUMERIC_SCALE
FROM
    Cte_Source AS S
    JOIN Cte_Destination AS D
        ON D.COLUMN_NAME = S.COLUMN_NAME
WHERE
    S.CHARACTER_MAXIMUM_LENGTH > D.CHARACTER_MAXIMUM_LENGTH
    OR S.NUMERIC_PRECISION > D.NUMERIC_PRECISION

 

Temporary Table:

;WITH Cte_Source AS (
SELECT
    C.COLUMN_NAME
    ,C.DATA_TYPE
    ,C.CHARACTER_MAXIMUM_LENGTH
    ,C.NUMERIC_PRECISION
    ,C.NUMERIC_SCALE
FROM
    tempdb.sys.objects so
    JOIN tempdb.INFORMATION_SCHEMA.TABLES AS T
        ON so.name = T.TABLE_NAME
        AND so.[object_id] = OBJECT_ID('tempdb..#Customer_Data')
    JOIN tempdb.INFORMATION_SCHEMA.COLUMNS AS C
        ON C.TABLE_CATALOG = T.TABLE_CATALOG
        AND C.TABLE_NAME = T.TABLE_NAME
        AND C.TABLE_SCHEMA = T.TABLE_SCHEMA
   
WHERE
    T.TABLE_SCHEMA = 'dbo'
)
,Cte_Destination AS (
SELECT
    C.COLUMN_NAME
    ,C.DATA_TYPE
    ,C.CHARACTER_MAXIMUM_LENGTH
    ,C.NUMERIC_PRECISION
    ,C.NUMERIC_SCALE
FROM
    INFORMATION_SCHEMA.TABLES AS T
    JOIN INFORMATION_SCHEMA.COLUMNS AS C
        ON C.TABLE_CATALOG = T.TABLE_CATALOG
        AND C.TABLE_NAME = T.TABLE_NAME
        AND C.TABLE_SCHEMA = T.TABLE_SCHEMA
WHERE
    T.TABLE_NAME = 'Customer_Data'        -- Destination Table
    AND T.TABLE_SCHEMA = 'dbo'
)
SELECT
    S.COLUMN_NAME
   ,S.DATA_TYPE
   ,S.CHARACTER_MAXIMUM_LENGTH
   ,S.NUMERIC_PRECISION
   ,S.NUMERIC_SCALE

   ,D.COLUMN_NAME
   ,D.DATA_TYPE
   ,D.CHARACTER_MAXIMUM_LENGTH
   ,D.NUMERIC_PRECISION
   ,D.NUMERIC_SCALE
FROM
    Cte_Source AS S
    JOIN Cte_Destination AS D
        ON D.COLUMN_NAME = S.COLUMN_NAME
WHERE
    S.CHARACTER_MAXIMUM_LENGTH > D.CHARACTER_MAXIMUM_LENGTH
    OR S.NUMERIC_PRECISION > D.NUMERIC_PRECISION

 

Both the aforementioned queries will return the following result.

image

The reason to return the above three columns as follows:

1. CustId ==> In our destination table CustId’s data type is TINYINT. Even the select query is returning the results within the boundary, the data type which our insertion query is returning is an INT. So there could be a possibility that there could be large numbers that the destination table could not hold.

2. CustName ==> ‘Whitacker Jr.’ is exceeding the maximum length of 10 which is in the destination table.

3. MaxCredit ==> In the destination table the size of the column is numeric (6,2). Which means it can hold values up to 9999.99. But our insertion query contains a record which consists of 15000.00.

 

Hope this might be helpful to you.

Thursday 3 March 2016

Extracting Date (Excluding Time) from a DateTime value in SQL Server

 

SQL Server supports many data types where we can store the Date along with the time, such as

  • DateTime
  • SmallDateTime
  • DateTimeOffset
  • DateTime2

But in some cases it’s required only to fetch only the date portion from an aforementioned type of field.

There are few ways which we can achieve this task easily using T-SQL.

The easiest of the method is to CAST the DateTime value directly to a DATE type.

SELECT CAST(GETDATE() AS DATE)                                    --==> 2016-03-03

Also you can achieve this by using the CONVERT function providing different styles as per your requirement.

SELECT CONVERT(VARCHAR(24),GETDATE(),101)                        --==> 03/03/2016
SELECT CONVERT(VARCHAR(24),GETDATE(),102)                        --==> 2016.03.03

Please refer to the following URL (https://msdn.microsoft.com/en-sg/library/ms187928.aspx) for more details on the CONVERT function and supported styles:

But if your requirement is to return a DateTime type but having only the date portion you can use the following syntax:

SELECT CAST(FLOOR(CAST(GETDATE() AS FLOAT)) AS DATETIME)        --==> 2016-03-03 00:00:00.000
SELECT CAST(CONVERT(VARCHAR(8),GETDATE(),112) AS DATETIME)      --==> 2016-03-03 00:00:00.000

Tuesday 16 February 2016

Index REBUILD vs. REORGANIZE in SQL SERVER

Couple of days back there was an interesting statement (or rather a question) was brought up by one of the colleagues in the company. Ultimately the initial stement left us with one simple question, which is the difference between Index REBUILD and REORGANIZE and when should be exactly use it.

If you google the aforementioned you can find numerous posts/blogs regarding this. Therefore I will keep things very simple and easier way to understand.

Rebuilding an index or Reorganizing is required when index fragmentation has reached to a considerable percentage. The fragmentation percentage can be identified using the Dynamic Management View - sys.dm_db_index_physical_stats in SQL Server.

You may get more details on the view on the following link: https://msdn.microsoft.com/en-us/library/ms188917.aspx

You can get a list of fragmented indexes using the following query:

SELECT
    OBJECT_NAME(Stat.object_id)
    ,I.name
    ,Stat.index_type_desc
    ,Stat.avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats(DB_ID(),NULL,NULL,NULL,NULL) AS Stat
JOIN sys.indexes AS I
        ON Stat.index_id = I.index_id
        AND Stat.object_id = I.object_id
WHERE
    Stat.avg_fragmentation_in_percent > 30

Executing the above query will give you a list of fragmented indexes which has more than 30% fragmentation. ‘index_type_desc’ will give you a hint what sort of index is it. (clustered, non-clustered, heap etc…)

As per the guidlines provided by Microsoft, it’s the best practice to Reorganize the index if the fragmentation is less than or equal to 30% (more than 5%) and Rebuild it if it’s more than 30%

 

Rebuilding Indexes

  • Should perform this if the fragmentation is more than 30%
  • Operation can be done online or offline

Index rebuilding can be done useing the following syntax:

In order to build all the indexes on a specific table:

USE <Database_Name>
GO

ALTER INDEX ALL ON <Table_Name> REBUILD
GO

 

In order to build only a specific index:

USE <Database_Name>
GO

ALTER INDEX <Index_Name> ON <Table_Name> REBUILD
GO

 

Reorganizing Indexes

  • Should perform this if the fragmentation is more than 5% but less than or equal to 30%
  • Operation is always online

Index reorganizing can be done using the following syntx:

In order to reorganize all the indexes on a specific table:

USE <Database_Name>
GO

ALTER INDEX ALL ON <Table_Name> REORGANIZE
GO

In order to reorganize only a specific index:

USE <Database_Name>
GO

ALTER INDEX <Index_Name> ON <Table_Name> REORGANIZE
GO

 

Optionally you can set many attributes during the Rebuild or Re-Organize process (Eg: FILLFACTOR, SORT_IN_TEMPDB etc..). Please check on the following link for more details on the REBUILD options: https://msdn.microsoft.com/en-us/library/ms188388.aspx

How ever REBUILD or REORGANIZE will not have an effect on the HEAP fragmentation. In order to remove the heap fragmentation you can use the followng syntax (*** NOT THE BEST PRACTICE):

USE <Database_Name>
GO

ALTER TABLE <Table_Name> REBUILD
GO

** Eventhough the aforementioned syntax will remove the HEAP fragmentation, it is considered as bad as creating and dropping a clustered index, which will leave behind lots of fragmentation on non clustered indexes. The best practise would be to create a clustered index on the table to remove the HEAP fragmentation. You can find more details on this on the following blog post by Paul. S Randal which he had illustrated nicely.