Simple about SAP basis: Oracle SQL: how to find and delete duplicate records in column

8/13/2017

Oracle SQL: how to find and delete duplicate records in column

Oracle SQL: how to delete duplicate records in column

Oracle has many ways to delete duplicate records.The below examples are described to explain the different cases.

This techniques are only applicable in few scenarios.

Always take extra caution while deleting records.

1. First identify the duplicates using select.

2. Double verify those are actual ‘duplicates’ or not

3. Take backup if necessary

4. Apply commit only if you are sure.

Consider the EMP table with below rows

create table emp(
EMPNNO integer,
EMPNAME varchar2(20),
SALARY number);

10 Bill 2000
11 Bill 2000
12 Mark 3000
12 Mark 3000
12 Mark 3000
13 Tom 4000
14 Tom 5000
15 Susan 5000

1. Using rowid

We have a table with duplicate records for empno, and we want to write a SQL query to make empno the primary key. For this we need a delete query to delete the duplicate records so that rmpno will have only unique values.

To remove duplicate rows of data, use the following statement:

SQL > delete from emp
where rowid not in
(select max(rowid) from emp group by empno);

The GROUP BY clause, enumerate all of your columns in your table, or the columns you think should be the primary key columns. The subquery will get the max rowid of these groupings. The DELETE will remove all rows that do not have these rowid values.

This technique can be applied to almost scenarios. Group by operation should be on the columns which identify the duplicates.

2. Using self-join

SQL > delete from emp e1
where rowid not in
(select max(rowid) from emp e2
where e1.empno = e2.empno );

3. Using row_number()

SQL > delete from emp where rowid in
(
select rid from
(
select rowid rid,
row_number() over(partition by empno order by empno) rn
from emp
)
where rn > 1
);

This is another efficient way to delete duplicates

4. Using dense_rank()

SQL > delete from emp where rowid in
(
select rid from
(
select rowid rid,
dense_rank() over(partition by empno order by rowid) rn
from emp
)
where rn > 1
);

Here you can use both rank() and dens_rank() since both will give unique records when order by rowid.

5. Using group by

Consider the EMP table with below rows

10 Bill 2000
11 Bill 2000
12 Mark 3000
13 Mark 3000

SQL > delete from emp where
(empno,empname,salary) in
(
select max(empno),empname,salary from emp
group by empname,salary
);

Simple about SAP basis

8/13/2017

Oracle SQL: how to find and delete duplicate records in column

Oracle SQL: how to delete duplicate records in column

No comments:

Post a Comment

Popular Posts