Skip to main content

Delete duplicate records from table even when there is no key field

Hi all,yesterday only i was discussing with my colleague Kamlesh Samnani who is MCTS about deleting records which does not have any primary key field based on which we can delete the data (actually he only fired this question yesterday to me)

so i thought to write a blog post about it,there are many posts which talk about deleting duplicate records from the table which has some key field,so i started working on the query and what i got is what i am gonna write here.

Lets start creating the table and after inserting some data in it,we will fire our query in that and will check the output as well

CREATE TABLE [dbo].[Employee](
[EmployeeId] [int] NULL,
[Salary] [money] NULL
) ON [PRIMARY]

INSERT INTO Employee VALUES(1,1000)
INSERT INTO Employee VALUES(1,2000)
INSERT INTO Employee VALUES(1,3000)
INSERT INTO Employee VALUES(2,2000)
INSERT INTO Employee VALUES(3,3000)
INSERT INTO Employee VALUES(3,4000)
INSERT INTO Employee VALUES(4,4000)
INSERT INTO Employee VALUES(4,5000)


after creating table and inserting some data table will look like following



Now we will try solving the problem using two method,first one is using simple query which is quite interesting (It was new for me when i created this query and performed) and second approach is using CTE (Common Table Expression)

1) Using Simple Query with Row_Number()

DELETE temp FROM
  (
      SELECT id FROM
      (
        SELECT row_number() over (PARTITION BY EmployeeId order by EmployeeId) AS   
        id, EmployeeId,Salary 
        FROM Employee
      ) AS t WHERE id > 1
   ) temp
   SELECT * FROM Employee

Here what row_number() will do is will create numbers for each of the row,now we have PARTITION BY EmployeeId so it will reset the number when EmployeeId changes,so on every new EmployeeId there will be number generated from 1,so we can directly delete data which are > 1,because they are duplicate,now in case if we do not have any duplicate record this row_number() will generated only once and it will never be reset,so in that case no records would be having same number generated

2) Using CTE

;WITH temp (uniquid,id,salary) AS 
 (
    SELECT row_number() OVER (PARTITION BY EmployeeId order by EmployeeId) AS id, 
    EmployeeId,Salary 
    FROM Employee
 )
 DELETE FROM temp WHERE uniquid>1
 SELECT * FROM Employee

Output




Both of the query will produce the desire output and will delete the duplicate records

I hope this is interesting one right,honestly i was not knowing until i performed it my self yesterday,so you can also hands on...

Comments

  1. I left IT befor some years.but this query reminds me of our CMC days.

    ReplyDelete

Post a Comment

Popular posts from this blog

High CPU to completely normal CPU - SXA issue, SXA pages not loading in mobile device

  Hi Team, Today i am going to share one of the nightmarish issue with you all, We are having Sitecore 9.1.1 hosted in azure PaaS environment Our site was working just fine and no noise, but we have been working on a feature release where 7-8 months of development needed to be released to production, Big GO LIVE event right?  Also to make the development smoother we also introduced BLUE/GREEN deployment slots in the same release, so we can easily SWAP slots and go live Everything went well, we went live, we even did load and performance testing on our staging and pre-prod and we were confident enough of results Very next day we started getting "SITE DOWN" alerts, and also product owners and clients mentioned that site is very slow for them in US time and in our morning when we were accessing it, it was working lighting fast so we were clue less at start, but we started digging  1) First thing caught our eyes were HIGH CPU spikes, in US time, also without any traffic CPU u...

Set up leprechaun code generation with Sitecore XM Cloud Starterkit

Hi Sitecorians, It has been amazing learning year so far and with the change in technology and shift of the focus on frontend frameworks and composable products, it has been market demand to keep learning and exploring new things. Reasons behind this blog Today's topic is something that was in my draft from April-May, and I always thought that there is already a good documentation out there for  Leprechaun  and a blog post is not needed, Until I realized that there was so many of us facing same kind of issues and same kind of problems and spending same amount of time, That is where I thought, if I could write something which can reduce that repetitive troubleshooting time, That would really help the community. 1)  In a project environment, if we get into some configuration issues, we resolve them, we make sure we are not blocked and continue, but if you think same issue, same step and same scenario will come to other people, so if we can draft it online, it will help othe...

An error occurred while receiving the HTTP response to This could be due to the service endpoint binding not using the HTTP protocol. This could also be due to an HTTP request context being aborted by the server (possibly due to the service shutting down). See server logs for more details.

You have noticed many times that everything was working fine and suddenly the below error starts coming and you find no way to work it out An error occurred while receiving the HTTP response to This could be due to the service endpoint binding not using the HTTP protocol. This could also be due to an HTTP request context being aborted by the server (possibly due to the service shutting down). See server logs for more details. The reason for this is the receiving size of WCF service is smaller then the data which is coming from service It was working before because it was small,So you will have to try to increase the receiving setting in your end point,Possible settings can be following maxStringContentLength="2147483647" maxReceivedMessageSize="2147483647" maxBufferSize="2147483647" maxArrayLength="2147483647" That would definately help you!!!