Skip to main content

Delete duplicate records from table even when there is no key field

Hi all,yesterday only i was discussing with my colleague Kamlesh Samnani who is MCTS about deleting records which does not have any primary key field based on which we can delete the data (actually he only fired this question yesterday to me)

so i thought to write a blog post about it,there are many posts which talk about deleting duplicate records from the table which has some key field,so i started working on the query and what i got is what i am gonna write here.

Lets start creating the table and after inserting some data in it,we will fire our query in that and will check the output as well

CREATE TABLE [dbo].[Employee](
[EmployeeId] [int] NULL,
[Salary] [money] NULL
) ON [PRIMARY]

INSERT INTO Employee VALUES(1,1000)
INSERT INTO Employee VALUES(1,2000)
INSERT INTO Employee VALUES(1,3000)
INSERT INTO Employee VALUES(2,2000)
INSERT INTO Employee VALUES(3,3000)
INSERT INTO Employee VALUES(3,4000)
INSERT INTO Employee VALUES(4,4000)
INSERT INTO Employee VALUES(4,5000)


after creating table and inserting some data table will look like following



Now we will try solving the problem using two method,first one is using simple query which is quite interesting (It was new for me when i created this query and performed) and second approach is using CTE (Common Table Expression)

1) Using Simple Query with Row_Number()

DELETE temp FROM
  (
      SELECT id FROM
      (
        SELECT row_number() over (PARTITION BY EmployeeId order by EmployeeId) AS   
        id, EmployeeId,Salary 
        FROM Employee
      ) AS t WHERE id > 1
   ) temp
   SELECT * FROM Employee

Here what row_number() will do is will create numbers for each of the row,now we have PARTITION BY EmployeeId so it will reset the number when EmployeeId changes,so on every new EmployeeId there will be number generated from 1,so we can directly delete data which are > 1,because they are duplicate,now in case if we do not have any duplicate record this row_number() will generated only once and it will never be reset,so in that case no records would be having same number generated

2) Using CTE

;WITH temp (uniquid,id,salary) AS 
 (
    SELECT row_number() OVER (PARTITION BY EmployeeId order by EmployeeId) AS id, 
    EmployeeId,Salary 
    FROM Employee
 )
 DELETE FROM temp WHERE uniquid>1
 SELECT * FROM Employee

Output




Both of the query will produce the desire output and will delete the duplicate records

I hope this is interesting one right,honestly i was not knowing until i performed it my self yesterday,so you can also hands on...

Comments

  1. I left IT befor some years.but this query reminds me of our CMC days.

    ReplyDelete

Post a Comment

Popular posts from this blog

Why SitecoreAI - Getting into the shoes of the customer how to select right CMS

Hi Team, Lately, I have been talking to lot of our customers / potential customers and having pre-sales demos where one question always comes is "Why Sitecore" ?  Now this question can be for any product which is out for sell. And as a technician I always get into product technical features, but at the same time as a pre-sales guy, it also makes me think, surely all competitive products have same features, so definitely answer to this is not in the technicalities.  If you step back and think, we are also a customer in our daily life and buy lot of things, what is that process we go through? When we buy, how can your customer decide if this is a right fit for you or not, why we select A over B? Is it price? is it service? Is it a brand? Is it about features? Is it about brand loyalty?  When it is a technical product, I am sure it cannot start with the technicalities of the product or selecting product itself, 100% not, I feel decision is always business strategy first and ...

Hell of sitecore aliases pipeline breaking the site with 500 error

Hello Friends, I belive this blog post is very important for everyone because, It has some very serious effect on working of your headless website, i will share my experience what we faced and how we resolved it Issue we started facing Our site started giving "Key cannot be null or empty" with YSOD like following  Side affect Because of this 500 error, Our site pages were showing 500 custom error page intermittently and our MAU (Monthly Active User) drop rate increased. Sitecore KB There is already Sitecore KB article talking about this error but the patch which is provided on this link is confusing as well as very huge and it could bring other issues along with it as that upgrade patch also has lot of other things too which i did not want to introduce in our stable CMS. Known Issues - Retrieving the child items of resource items is not thread-safe Observation Though the surfaced exception was looking similar and giving same error and behavior given on this article, We looked...

Zero to Hero - A real life RCA of exact issue in Sitecore Managed Cloud environment

Hello All, The purpose of today's post is to share a real life burning and escalated scenario which was new to me and how did I approach it and how big the escalations were and what was the outcome Sitecore's goodwill was at stack not because Sitecore is not capable of handling it but just because our environment was Sitecore Managed Cloud, and any issue that comes if its infra, back end code, front end code will be first pointed as Sitecore issue and that is where our consultancy and experience will play a role to prove that it is not Sitecore issue.  Issue we faced Out of the blue our site started giving "504 Gateway Time-out", and it was reported that almost everyone is getting this error, but when we used to browse the site, everything looked good and never 504. 504 Gateway Time-out error tells that, That the request went to Content Delivery servers of Sitecore from gateway, but gateway did not get response in time from those CDs and hence it gave time out error. ...