SQL Server Yukon DTS: Success in cleaning CRM data

I was just reading an article at TechnologyEvaluation.com about the difficulty of maintaining data quality in CRM applications. The extremely volatile nature of CRM activity degrades the data in ways such as:

  • Customer details that are incorrect or inconsistent with other data
  • Duplicate records
  • Multiple database synchronization problems

Duplicate records are noted as the most common issue. They can be caused, for example, by spelling variations that produce several records for the same customer. As I was reading this I recalled what I had learned about the new Data Transformation Services (DTS) feature of SQL Server Yukon. Yukon's DTS will include data cleaning technology such as fuzzy lookup and fuzzy grouping.

Fuzzy lookup takes an input row of data whose columns are mapped to the columns of a given table of existing data. Say you have an existing Customer table and a new candidate row to insert into it. Fuzzy lookup scans the candidate row's values and compares them against the existing data to find similar entries, so that the data can be “fixed up” and kept consistent. For example, if the candidate Customer's company name is “Micrsooft” and there is already a Customer with a company name of “Microsoft”, it will classify that as a high-scoring match and allow the user to take the appropriate action, either manually or through an automatic step.

Fuzzy grouping finds fuzzy duplicates within a given result set in a similar way to fuzzy lookup, which lets you fix up data that already exists. As you can see, these DTS features are a good fit for the data quality issues of CRM. I'm sure Yukon will be put to good use in this sector.
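To make the two ideas a little more concrete, here is a rough sketch in Python. It is not how Yukon's DTS implements fuzzy lookup or fuzzy grouping; it simply uses the standard library's difflib.SequenceMatcher as a stand-in similarity measure. The function names, the sample company names, and the 0.8 threshold are all assumptions made up for illustration.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]; a stand-in for a real fuzzy score."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def fuzzy_lookup(candidate, reference, threshold=0.8):
    """Return (best_match, score) from the reference list.

    best_match is None when nothing scores at or above the threshold.
    """
    best, best_score = None, 0.0
    for name in reference:
        score = similarity(candidate, name)
        if score > best_score:
            best, best_score = name, score
    if best_score >= threshold:
        return best, best_score
    return None, best_score


def fuzzy_group(names, threshold=0.8):
    """Greedily group names that look like duplicates of each other.

    Each name is compared with the first member of every existing group
    and joins the first group that is similar enough.
    """
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


# Hypothetical existing Customer company names.
existing = ["Microsoft", "Oracle", "IBM"]

match, score = fuzzy_lookup("Micrsooft", existing)
print(match, round(score, 2))

print(fuzzy_group(["Microsoft", "Micrsooft", "Oracle", "Orcale", "IBM"]))
```

The lookup is the "new candidate row" case: the misspelled name is matched to the existing customer instead of being inserted as a new one. The grouping is the "data that already exists" case: likely duplicates end up in the same group so that someone (or some rule) can decide which spelling to keep. A real tool would compare several columns at once and use a more robust matching scheme than this single-column, greedy version.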

This posting is provided "AS IS" with no warranties, and confers no rights.