Header

Remove duplicates in table

Author: Paulo van Breugel
Updated on: 2010-11-22

Issue

Suppose you have the table 'MyTable' with a column A, which contain some duplicate values and the column cat with unique values. How to deal with these duplicates.

There are several options, depending on what you want to achieve. Note that all options are run directly in SQLite.

Count duplicates

To check how many duplicate values do you have in column A

SELECT COUNT ( * ) FROM MyTable;
SELECT COUNT ( DISTINCT A ) FROM MyTable;

Delete duplicates

To delete all rows with duplicate values in A from the table:

DELETE FROM MyTable WHERE cat NOT IN
(SELECT MAX(cat) FROM table GROUP BY A);

Mark duplicates

To mark the rows with duplicate values in A

CREATE INDEX 'tmp.idx' ON 'MyTable' ('A');
CREATE INDEX 'tmp2.idx' ON 'MyTable' ('cat');

CREATE TABLE tmp AS
SELECT X1.cat AS cat
from MyTable X1, MyTable X2
where X1.A = X2.A
and X1.cat > X2.cat;
CREATE INDEX 'tmp3.idx' ON 'tmp' ('cat');

CREATE TABLE tmp2 AS
SELECT MyTable.*, tmp.cat
FROM MyTable Left Join tmp ON (MyTable.cat = tmp.cat)

If you have questions

If you have questions or comments about the text, let me know. You can use this contact form. Please make sure to include the page title ("Remove duplicates in table") or page name ("remove duplicates in sqlite").