Thursday, August 6, 2009

SQL Metal for LINQ

I found this useful tool, SQL Metal Builder, to geneate LINQ Classes for your .NET project.




This tool help you to generate the complete .cs or the .dbml file.

In many cases where we have big databases it becomes difficult to generate the dbml files by dragging each of the tables. It even becomes difficult to maintain such a file each time you make a change to the database.

This is where this tools comes in handy.

Tuesday, August 4, 2009

Deleting Duplicate Records - Single Query

I had a requirement where I wanted to delete all the duplicate email ids from my table.

My table structure is :

* EMAIL_ID uniqueidentifier PRIMARY KEY
* EMAIL varchar (128) Stores the email address


I write the following query to find out the duplicate records :

SELECT EMAIL, COUNT(*) DUPLICATES
FROM EMAILS
GROUP BY EMAIL
HAVING COUNT(*) > 1
ORDER BY DUPLICATES


Here's the delete Query:

Here's the delete Query:

DELETE FROM EMAILS
WHERE EMAIL_ID IN
(
SELECT EMAIL_ID FROM
(
SELECT EMAIL, EMAIL_ID
, RANK() OVER (PARTITION BY EMAIL ORDER BY EMAIL_ID) AS EMAIL_RANK
FROM EMAILS
WHERE EMAIL IN
(
SELECT EMAIL
FROM EMAILS
GROUP BY EMAIL
HAVING COUNT(*) > 1
)
) A
WHERE A.EMAIL_RANK != 1
)

There were lot of other approaches which I found, most of them suggested using cursors or using Set Row_Count 1 and then running the delete query iteratively. But I wanted to do it in one shot.

It took me less than a second to delete almost 1000 duplicate rows from around 10,000 rows in my table.

I am sure there would be better approaches, surely to avoid multiple IN clauses. Just in case you have one put in the comments.

I got to figure out one for Oracle as well.