Manjuke's Blog: September 2010

Thursday, 30 September 2010

Design Patterns ~ Singleton Pattern using C#

The Singleton is often considered the simplest of the design patterns. Its main objective is to restrict the instantiation of a class to a single object. It can be implemented like this.

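public class Singleton {
    // Holds the single instance; stays null until first requested.
    private static Singleton _instance;

    // Private constructor prevents instantiation from outside the class.
    private Singleton() {
    }

    public static Singleton Instance {
        get {
            // Create the instance on first access only.
            if (_instance == null) {
                _instance = new Singleton();
            }
            return _instance;
        }
    }
}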

This implementation has advantages and disadvantages. The main advantages are:

Since the instance is created inside the ‘Instance’ property, the class can perform additional work there, such as instantiating a subclass.

‘Lazy Instantiation’ approach. That is, the class is not instantiated until an object asks for an instance. This avoids instantiating unnecessary singletons when the application starts.
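To see the lazy instantiation in action, here is a minimal sketch. The `ConstructorCalls` counter and the `LazyDemo` class are my own additions for illustration; they are not part of the pattern itself.

```csharp
using System;

public class Singleton {
    // Hypothetical counter, added only to observe when the constructor runs.
    public static int ConstructorCalls;

    private static Singleton _instance;

    private Singleton() {
        ConstructorCalls++;
    }

    public static Singleton Instance {
        get {
            if (_instance == null) {
                _instance = new Singleton();
            }
            return _instance;
        }
    }
}

public static class LazyDemo {
    public static void Main() {
        // Nothing has asked for the instance yet, so no object exists.
        Console.WriteLine(Singleton.ConstructorCalls);            // 0

        Singleton first = Singleton.Instance;   // constructor runs here
        Singleton second = Singleton.Instance;  // existing object is reused

        Console.WriteLine(Singleton.ConstructorCalls);            // 1
        Console.WriteLine(object.ReferenceEquals(first, second)); // True
    }
}
```

Reading the counter does not touch the ‘Instance’ property, so the constructor only runs on the first real request.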

The main disadvantage of this approach is that it is not safe in multithreaded environments. If separate threads enter the Instance property getter at the same time, more than one instance of the Singleton object may be created. Each thread could evaluate the following check, find it true, and decide that a new instance has to be created.

if(_instance == null)

A common solution to this is ‘double-checked locking’. It keeps separate threads from creating new instances of the singleton at the same time.

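public class Singleton {
    // volatile: threads always read the current value of the reference.
    private static volatile Singleton _instance;
    // Private lock object; readonly so it can never be swapped out.
    private static readonly object syncRoot = new object();

    private Singleton() {
    }

    public static Singleton Instance {
        get {
            // First check avoids taking the lock once the instance exists.
            if (_instance == null) {
                lock (syncRoot) {
                    // Second check: another thread may have created it while we waited.
                    if (_instance == null) {
                        _instance = new Singleton();
                    }
                }
            }
            return _instance;
        }
    }
}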

This approach makes sure that only one instance is created, and only when the instance is required. Declaring the variable ‘volatile’ makes sure that the assignment to the instance variable completes before the instance variable can be accessed. To avoid deadlocks, it locks on a separate private object rather than on the type itself.
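As a rough check, the sketch below requests the instance from several threads at once and confirms that they all receive the same object. The `ConstructorCalls` counter, the `ThreadDemo` class and the choice of eight threads are assumptions of mine, added purely for the demonstration.

```csharp
using System;
using System.Threading;

public class Singleton {
    // Hypothetical counter, added only to count constructor runs.
    public static int ConstructorCalls;

    private static volatile Singleton _instance;
    private static readonly object syncRoot = new object();

    private Singleton() {
        Interlocked.Increment(ref ConstructorCalls);
    }

    public static Singleton Instance {
        get {
            if (_instance == null) {
                lock (syncRoot) {
                    if (_instance == null) {
                        _instance = new Singleton();
                    }
                }
            }
            return _instance;
        }
    }
}

public static class ThreadDemo {
    public static void Main() {
        var results = new Singleton[8];
        var threads = new Thread[results.Length];

        for (int i = 0; i < threads.Length; i++) {
            int slot = i; // copy so each thread writes to its own slot
            threads[i] = new Thread(() => { results[slot] = Singleton.Instance; });
        }
        foreach (Thread t in threads) t.Start();
        foreach (Thread t in threads) t.Join();

        bool allSame = true;
        foreach (Singleton s in results) {
            if (!object.ReferenceEquals(s, results[0])) allSame = false;
        }

        Console.WriteLine(allSame);                    // True
        Console.WriteLine(Singleton.ConstructorCalls); // 1
    }
}
```

A passing run like this does not prove thread safety on its own; it only shows the behaviour the lock is meant to guarantee.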

Saturday, 25 September 2010

Removing Duplicate Records From a MS SQL Table – (MS SQL 2005 or above)

Have you ever been in a situation where your SQL tables contain duplicate records because you have not defined a primary key or an auto-increment field, and you need to keep one record and delete the rest?

The usual method of doing this is to use a temporary table or a cursor. But there is another method that does it with a single query in SQL Server 2005 or above.

To illustrate this, I will first create the following table.

create table SampleTable(
    id        int not null,
    name    varchar(20) not null,
    age        int not null
    )

Now I will insert some duplicate records into the table created above.

insert into SampleTable    (id,name,age) values (1,'John',30)
insert into SampleTable    (id,name,age) values (1,'John',30)
insert into SampleTable    (id,name,age) values (1,'John',30)
insert into SampleTable    (id,name,age) values (1,'John',30)
insert into SampleTable    (id,name,age) values (1,'John',30)
insert into SampleTable    (id,name,age) values (2,'Mary',26)
insert into SampleTable    (id,name,age) values (2,'Mary',26)
insert into SampleTable    (id,name,age) values (2,'Mary',26)
insert into SampleTable    (id,name,age) values (2,'Mary',26)
insert into SampleTable    (id,name,age) values (3,'Ann',25)
insert into SampleTable    (id,name,age) values (3,'Ann',25)
insert into SampleTable    (id,name,age) values (3,'Ann',25)
insert into SampleTable    (id,name,age) values (3,'Ann',25)
insert into SampleTable    (id,name,age) values (3,'Ann',25)
insert into SampleTable    (id,name,age) values (4,'James',21)

Using the query below you can easily find the number of duplicate records.

select SUM(rec_count) as rec_count from(
select COUNT (*) - 1 as rec_count from SampleTable group by CHECKSUM(*) having COUNT(*) > 1
) T

In the above query I subtract one from each group's count (COUNT (*) - 1), since one record should remain as the valid record. You don't really need ‘having COUNT(*) > 1’: for a non-duplicated record count(*) returns 1, so count(*)-1 is 0. It's there for readability. If you execute the above query you will get 11 as the record count (15 records in total, 4 valid records, so 15-4 = 11).
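The same arithmetic can be checked outside the database. The following is only an in-memory C# analogue of the counting query, not something that runs against SQL Server; the `DuplicateCount` class and its method names are hypothetical, and it groups by the full row the way ‘group by id,name,age’ would.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class DuplicateCount {
    // Sums (group size - 1) over all groups of identical rows.
    public static int CountDuplicates(IEnumerable<Tuple<int, string, int>> rows) {
        return rows
            .GroupBy(r => r)          // Tuple compares by value, field by field
            .Sum(g => g.Count() - 1); // one row per group is the valid record
    }

    // The same 15 rows that were inserted into SampleTable.
    public static List<Tuple<int, string, int>> SampleRows() {
        var rows = new List<Tuple<int, string, int>>();
        rows.AddRange(Enumerable.Repeat(Tuple.Create(1, "John", 30), 5));
        rows.AddRange(Enumerable.Repeat(Tuple.Create(2, "Mary", 26), 4));
        rows.AddRange(Enumerable.Repeat(Tuple.Create(3, "Ann", 25), 5));
        rows.Add(Tuple.Create(4, "James", 21));
        return rows;
    }

    public static void Main() {
        Console.WriteLine(CountDuplicates(SampleRows())); // 11
    }
}
```

The groups contribute 4, 3, 4 and 0 surplus rows, which adds up to the 11 mentioned above.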

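As you can see, I have used ‘CHECKSUM(*)’. This is to avoid typing all the field names. Without it, the query would use ‘group by id,name,age’. Note that CHECKSUM is a hash, so two different rows can occasionally produce the same value; grouping by the explicit column list is the safer choice when that matters.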

Finally, we can build the query to delete the duplicates. First we must find the valid records, which should not be deleted. The way to do this is with the ‘ROW_NUMBER’ function: we assign a unique row number to each record and select the maximum row number for each group, which leaves exactly one record per group.

select MAX(row_num) from (
select ROW_NUMBER() over (order by checksum(*)) as row_num, CHECKSUM(*) as ChkSum  
from SampleTable
) as T Group By ChkSum

If you execute the above query, it returns row numbers 5, 9, 14 and 15 as the valid records which we must keep. We must delete only the records whose row number is not among the ones returned by that query. First we'll select those records (for checking purposes only), using the following query.

select T.* from(
    select ROW_NUMBER() over (order by checksum(*)) as row_num, CHECKSUM(*) 
    as ChkSum from SampleTable) as T
    where T.row_num not in (
        select MAX(row_num) from (
            select ROW_NUMBER() over (order by checksum(*)) as row_num, CHECKSUM(*) 
            as ChkSum from SampleTable
        ) as T Group By ChkSum
    )

If you execute the above query and look closely at the result, row numbers 5, 9, 14 and 15 are not there, so we can be sure that we are deleting the correct set of records. To delete the duplicates we can use the following query.

    
delete T from(
    select ROW_NUMBER() over (order by checksum(*)) as row_num, CHECKSUM(*) 
    as ChkSum from SampleTable) as T
    where T.row_num not in (
        select MAX(row_num) from (
            select ROW_NUMBER() over (order by checksum(*)) as row_num, CHECKSUM(*) 
            as ChkSum  from SampleTable
        ) as T Group By ChkSum
    )

If you query the table afterwards, only one record each for John, Mary, Ann and James remains.

Friday, 10 September 2010

The Null Coalescing Operator (??)

This operator can be extremely useful in situations where you have to check for the value of a variable and if it's null assign an empty string instead (or any other value).

You can achieve this using an if statement.

string zVar = somevalue;
if (somevalue == null) {
    zVar = string.Empty;
}

Or you could use the ternary operator (?:).

string zVar = (somevalue != null) ? somevalue : string.Empty;

But we can do better using the null coalescing operator (??).

string zVar = somevalue ?? string.Empty;
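The operator can also be chained, and it works with nullable value types. Here is a short sketch; the `CoalesceDemo` class and the `DisplayName` helper are made-up names used only for this example.

```csharp
using System;

public static class CoalesceDemo {
    // Returns the first non-null argument, or "unknown" if both are null.
    public static string DisplayName(string nickname, string fullName) {
        return nickname ?? fullName ?? "unknown";
    }

    public static void Main() {
        Console.WriteLine(DisplayName("Jo", "John")); // Jo
        Console.WriteLine(DisplayName(null, "John")); // John
        Console.WriteLine(DisplayName(null, null));   // unknown

        int? age = null;
        int years = age ?? 0; // nullable value types work too
        Console.WriteLine(years);                     // 0
    }
}
```

The right-hand operand is only evaluated when the left-hand one is null, so a chain stops at the first non-null value.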