Removing Duplicates from the DataTable

DataTable has a Select method which can help you limit the number of rows being selected using a criteria. But DataTable does not contain any method which can remove the duplicate rows. First, if you have duplicates rows then I suggest using the DISTINCT keyword of the SQL SERVER. But sometimes, your whole row is not a duplicate but some of the cells are repeating. Anyway, here is a simple method that can remove the duplicate rows from the DataTable.

In the example below the UserID is the primary key and is checked for repeatition. The RemoveDuplicateRows method is called until the DataTable contain a single row per person. Also, it is a good idea to sort the rows by UserID before calling the RemoveDuplicateRows method. 

private DataTable RemoveDuplicateRows(DataTable dt)
    {
        
int oldUserID = 0;

        DataRowCollection rows = dt.Rows;
        
int rowCount = dt.Rows.Count;      

        
for (int index = 0; index < dt.Rows.Count; index++)
        {                 
                DataRow row = rows[index];

                
int currentUserID = Convert.ToInt32(row["UserID"]);

                
if (oldUserID == currentUserID)
                    dt.Rows.Remove(row);  

                oldUserID = currentUserID;
            
        }

        
if (dt.Rows.Count == 1)
            
return dt;
        
else return RemoveDuplicateRows(dt); 
    }

powered by IMHO 1.3

Print | posted @ Friday, April 21, 2006 9:50 PM

Twitter