question

GaniTPT avatar image
0 Votes"
GaniTPT asked ChristopherSingleton-4721 edited

How to combine two column values and find the duplicates in that using C# Linq

L want to find the duplicate if any in the combined string.

for example below is the data table.

I want to concatenate the two column values (EMPNO and DEPT) from the data table.

and then find the duplication in that.

85826-image.png

How to form the above requirement in LINQ C#



NOTE : last box Step-2 is the final result.

dotnet-csharp
image.png (95.5 KiB)
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

karenpayneoregon avatar image
0 Votes"
karenpayneoregon answered GaniTPT commented

Here is an example passing the key to check

 private void HasDuplicatesDemo(string identifier)
 {
     var tblData = new DataTable();
        
     tblData.Columns.Add("EMPNO", typeof(string));
     tblData.Columns.Add("NAME", typeof(string));
     tblData.Columns.Add("DEPT", typeof(string));
     tblData.Columns.Add("CATEGORY", typeof(string));
     tblData.Rows.Add("AM-101", "RAFEEK", "CA-12-MM", "BOLD");
     tblData.Rows.Add("AM-101", "RAFEEK", "CA-13-AB", "NEW");
     tblData.Rows.Add("AM-101", "RAFEEK", "CA-13-AB", "NEW");
     tblData.Rows.Add("AM-102", "MANA", "CA-12-MM", "CESS");
     tblData.Rows.Add("AM-102", "MANA", "CA-12-MM", "CESS");
     tblData.Rows.Add("AM-102", "MANA", "CA-13-XL", "REVERSE");
        
     tblData.Columns.Add("ID", typeof(string)).Expression = "EMPNO +NAME";
     tblData.Columns["ID"].SetOrdinal(0);
        
        
     bool hasDuplicates1 = tblData.AsEnumerable().Where(row => row.Field<string>("ID") == identifier)
         .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);
    
     Console.WriteLine($"From {nameof(HasDuplicatesDemo)}: {hasDuplicates1}"); 
    
 }
 private void HasNoDuplicatesDemo(string identifier)
 {
     var tblData = new DataTable();
    
     tblData.Columns.Add("EMPNO", typeof(string));
     tblData.Columns.Add("NAME", typeof(string));
     tblData.Columns.Add("DEPT", typeof(string));
     tblData.Columns.Add("CATEGORY", typeof(string));
     tblData.Rows.Add("AM-101", "RAFEEK", "CA-12-MM", "BOLD");
     tblData.Rows.Add("AM-102", "MANA", "CA-12-MM", "CESS");
    
     tblData.Columns.Add("ID", typeof(string)).Expression = "EMPNO +NAME";
     tblData.Columns["ID"].SetOrdinal(0);
    
     bool hasDuplicates1 = tblData.AsEnumerable().Where(row => row.Field<string>("ID") == identifier)
         .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);
    
     Console.WriteLine($"From {nameof(HasNoDuplicatesDemo)}:  {hasDuplicates1}");
    
 }

Click event

 private void runButton_Click(object sender, EventArgs e)
 {
     HasDuplicatesDemo("AM-101RAFEEK");
     HasNoDuplicatesDemo("AM-101RAFEEK");
 }


86277-f2.png

Then in the second example let's be clear

 private void HasNoDuplicatesDemo(string identifier)
 {
     var tblData = new DataTable();
    
     tblData.Columns.Add("EMPNO", typeof(string));
     tblData.Columns.Add("NAME", typeof(string));
     tblData.Columns.Add("DEPT", typeof(string));
     tblData.Columns.Add("CATEGORY", typeof(string));
     tblData.Rows.Add("AM-101", "RAFEEK", "CA-12-MM", "BOLD");
     tblData.Rows.Add("AM-102", "MANA", "CA-12-MM", "CESS");
    
     tblData.Columns.Add("ID", typeof(string)).Expression = "EMPNO +NAME";
     tblData.Columns["ID"].SetOrdinal(0);
    
     bool hasDuplicates1 = tblData.AsEnumerable().Where(row => row.Field<string>("ID") == identifier)
         .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);
    
     Console.WriteLine($"From {nameof(HasNoDuplicatesDemo)}:  {hasDuplicates1  == false}");
    
 }

86209-f3.png

You can also create an extension method

 using System.Data;
 using System.Linq;
    
 namespace CommonExtensions
 {
     public static class DataTableExtensions
     {
         /// <summary>
         /// Determine if the DataTable has duplications based on a column
         /// where the column may be a concatenation of two or more columns
         /// </summary>
         /// <param name="sender">DataTable to check for duplicates</param>
         /// <param name="columnName">Column name to check, if column does not exists a runtime exception is thrown</param>
         /// <param name="identifier">Value in columnName</param>
         /// <returns></returns>
         public static bool HasDuplicates(this DataTable sender, string columnName, string identifier) => 
             sender.AsEnumerable()
                 .Where(row => row.Field<string>(columnName) == identifier)
                 .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);
    
     }
 }

Usage

 Console.WriteLine($"From {nameof(HasNoDuplicatesDemo)}:  {tblData.HasDuplicates("ID", "AM-101RAFEEK")}");





f2.png (4.0 KiB)
f3.png (11.3 KiB)
· 5
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Thanks for your more details.

below is my requirement step

STEP 1 : i am passing parameter EMPNO Only. i don't have name value
EX : AM-102
86385-image.png


STEP 2 : Based on the EMPNO, we are fetching the empno and dept values. (can't say empno and name not an unique)
86411-image.png


STEP 3 : then concatenating, both empno and dept.
86366-image.png

STEP 4 : from these, we want to find the duplications.

Note ; passing only one single value as a parameter (EX : AM-102), and the concatenate then to find the duplicates.


0 Votes 0 ·
image.png (6.7 KiB)
image.png (11.1 KiB)
image.png (2.8 KiB)

Well you should be able to adapt my logic from the concatenated column to the one column.

There is enough from what I provided to solve this yourself.

0 Votes 0 ·

Thanks. i am getting the problem when i am passing one parameter "AM-102"


          sender.AsEnumerable()
              .Where(row => row.Field<string>(columnName) == "AM-102")
              .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);




0 Votes 0 ·
Show more comments
karenpayneoregon avatar image
0 Votes"
karenpayneoregon answered karenpayneoregon edited

Try the following if you only need to determine if there are duplicates.

  • Create an expression column e.g. table.Columns.Add("EmpNoDept", typeof(string), "EmpNo + Dept");

  • Get distinct and assert the table count against the distinctValues count

    DataView view = new DataView(table);
    DataTable distinctValues = view.ToTable(true, "EmpNoDept");

Or via Lambda in the case the indexer [0] is looking at the first column data so keeping with using an expression change the index to the column index of EmpNoDept.


 var duplicates = table.AsEnumerable().GroupBy(row => row[0]).Where(gr => gr.Count() > 1);



dups.png (109.3 KiB)
· 2
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Thanks for your support.
i used the below code and found some duplicates.
How to return true, if found duplicates. ("AM-012") ==> has some duplicates.
86264-image.png
complete source code below.

0 Votes 0 ·
image.png (24.2 KiB)

pls. find the source code below.

86291-image.png


0 Votes 0 ·
image.png (61.1 KiB)
karenpayneoregon avatar image
0 Votes"
karenpayneoregon answered GaniTPT commented

Two approaches

Option 1

 bool hasDuplicates = table.AsEnumerable().GroupBy(row => row[0]).Any(gr => gr.Count() > 1);


Option 2

 if (table.AsEnumerable().GroupBy(row => row[0]).Any(gr => gr.Count() > 1))
 {
     // has dups
 }

Then try the following

             var tblData = new DataTable();
                
             tblData.Columns.Add("EMPNO", typeof(string));
             tblData.Columns.Add("NAME", typeof(string));
             tblData.Columns.Add("DEPT", typeof(string));
             tblData.Columns.Add("CATEGORY", typeof(string));
             tblData.Rows.Add("AM-101", "RAFEEK", "CA-12-MM", "BOLD");
             tblData.Rows.Add("AM-101", "RAFEEK", "CA-13-AB", "NEW");
             tblData.Rows.Add("AM-101", "RAFEEK", "CA-13-AB", "NEW");
             tblData.Rows.Add("AM-102", "MANA", "CA-12-MM", "CESS");
             tblData.Rows.Add("AM-102", "MANA", "CA-12-MM", "CESS");
             tblData.Rows.Add("AM-102", "MANA", "CA-13-XL", "REVERSE");
                
             tblData.Columns.Add("ID", typeof(string)).Expression = "EMPNO +NAME";
             tblData.Columns["ID"].SetOrdinal(0);
                
             bool hasDuplicates = tblData.AsEnumerable()
                 .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);
                
             bool hasDuplicates1 = tblData.AsEnumerable().Where(row => row.Field<string>("ID") == "AM-101RAFEEK")
                 .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);


· 2
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

Thanks. But, it always return true even though if not duplicate.
below screenshot, there is no duplicate. but, still showing duplicates.


86284-image.png


0 Votes 0 ·
image.png (19.7 KiB)

No. again it is returning true always..

give us the simple idea which is related to my code..

0 Votes 0 ·
karenpayneoregon avatar image
0 Votes"
karenpayneoregon answered ChristopherSingleton-4721 edited

In regards to always returning true, I copied your code, added the column with an expression then tested it with two examples


 private void HasDuplicatesDemo()
 {
     var tblData = new DataTable();
        
     tblData.Columns.Add("EMPNO", typeof(string));
     tblData.Columns.Add("NAME", typeof(string));
     tblData.Columns.Add("DEPT", typeof(string));
     tblData.Columns.Add("CATEGORY", typeof(string));
     tblData.Rows.Add("AM-101", "RAFEEK", "CA-12-MM", "BOLD");
     tblData.Rows.Add("AM-101", "RAFEEK", "CA-13-AB", "NEW");
     tblData.Rows.Add("AM-101", "RAFEEK", "CA-13-AB", "NEW");
     tblData.Rows.Add("AM-102", "MANA", "CA-12-MM", "CESS");
     tblData.Rows.Add("AM-102", "MANA", "CA-12-MM", "CESS");
     tblData.Rows.Add("AM-102", "MANA", "CA-13-XL", "REVERSE");
        
     tblData.Columns.Add("ID", typeof(string)).Expression = "EMPNO +NAME";
     tblData.Columns["ID"].SetOrdinal(0);
        
     bool hasDuplicates = tblData.AsEnumerable()
         .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);
        
     bool hasDuplicates1 = tblData.AsEnumerable().Where(row => row.Field<string>("ID") == "AM-101RAFEEK")
         .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);
    
     Console.WriteLine($"From {nameof(HasDuplicatesDemo)}: {hasDuplicates}, {hasDuplicates1}"); 
    
 }
 private void HasNoDuplicatesDemo()
 {
     var tblData = new DataTable();
    
     tblData.Columns.Add("EMPNO", typeof(string));
     tblData.Columns.Add("NAME", typeof(string));
     tblData.Columns.Add("DEPT", typeof(string));
     tblData.Columns.Add("CATEGORY", typeof(string));
     tblData.Rows.Add("AM-101", "RAFEEK", "CA-12-MM", "BOLD");
     tblData.Rows.Add("AM-102", "MANA", "CA-12-MM", "CESS");
    
     tblData.Columns.Add("ID", typeof(string)).Expression = "EMPNO +NAME";
     tblData.Columns["ID"].SetOrdinal(0);
    
     bool hasDuplicates = tblData.AsEnumerable()
         .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);
    
     bool hasDuplicates1 = tblData.AsEnumerable().Where(row => row.Field<string>("ID") == "AM-101RAFEEK")
         .GroupBy(row => row[0]).Any(gr => gr.Count() > 1);
    
     Console.WriteLine($"From {nameof(HasNoDuplicatesDemo)}: {hasDuplicates}, {hasDuplicates1}");
    
 }


Click a button

 private void runButton_Click(object sender, EventArgs e)
 {
     HasDuplicatesDemo();
     HasNoDuplicatesDemo();
 }


Results

86311-f1.png



f1.png (4.4 KiB)
· 1
5 |1600 characters needed characters left characters exceeded

Up to 10 attachments (including images) can be used with a maximum of 3.0 MiB each and 30.0 MiB total.

I have a web grid with columns "TrailerNumber" and "Carrier" that I am trying to find duplicates on, when any row has the same values in the two columns, but have not been able to figure this out as of yet using LinQ.

My code always comes up as 1 for some reason, even if there is not a duplicate in the web grid. I want to find the duplicate in entity framework before any changes are saved to the database. This code makes a lot of sense to me, but not sure if it is the right approach?... Any help would be greatly appreciated!!!


     **private string FindDuplicates(LocationData customer)
     {
         using (PW_YardDogDataEntitiesModel3 entities = new PW_YardDogDataEntitiesModel3())
         {
             var result =
                 entities.LocationDatas.GroupBy(o => new { customer.TrailerNumber, customer.CarrierName })
            .Where(g => g.Count() > 1)
            .Select(y => y.Key)
            .ToList();
             if (result.Count() > 1)
             {
                   
                 System.Windows.Forms.MessageBox.Show("Found Duplicates:{0}", string.Join(Environment.NewLine, result));
                   
             }
             return result.Count().ToString();
         }
     }
 }**
0 Votes 0 ·