Advice on Deduping Contact records? Please Help.

emon_amin
Explorer

Hi All, 

I was wondering if I could get your input on using this tool. My team and I recently had DemandTools purchased for us. It appears to be a prominent tool for this, and we need to dedupe millions of contact records. It gathers duplicate records quickly, but the mass merging is quite slow. It also crashes occasionally, which is a bit upsetting. I may be unaware of other features in the tool, but I would like to know:

- Is there a way to speed up the merging process? I saw something about having "Use Salesforce Merge" checked in this source.

- If my colleague and I would like to run a mass merge at the same time (with different selection criteria, so we don't have the same records), will that impact the tool negatively, or affect anything in the background?

- From what I understand, the tool itself has no limits and performance depends on the power of the machine. What is the recommended batch size for a mass merge?

I am open to suggestions, and if you have any questions for me to further clarify the situation, please let me know. Thank you!

P.S. we are using Salesforce.

1 ACCEPTED SOLUTION

TarsaF_Validity
Validity Team Member

Howdy @emon_amin, great questions! I hope I can point you in the right direction with a few answers. The initial article you found is a perfect place to find many of our best DeDuplication suggestions!

1) "Use Salesforce Merge" (DemandTools 2.91) and "Bulk API Calls" are usually the best ways to reduce the time it takes to run a scenario. Controlling the batch size is also important!

2) You CAN run simultaneous jobs. However, if two people in an org are running the same job configuration, they can overwrite one another's work. Coordinate with other people running a job whenever possible.

3) We don't have an exact recommendation for an optimal batch size...Exactly what the upper threshold is for the total number of groups that can be merged is unknown, and will vary by organization.

You might also find some valuable insights in this article DeDupe Playbook

6 REPLIES 6

AnthonyValidity
Validity Team Member

Hi @emon_amin ,

I'd like to add some information to what my colleague Forrest has provided.

1. Start your batches off by conditioning on Last Name "starts with" a,b,c, etc.  By conditioning in this way you break the batches into smaller ones.  At the very end of your deduping, you'd run a final batch without this condition.  You can also do this by State if you prefer but I like last name better.

2. Use the Bulk API feature if you're trying to sort through 1.5 million or more records. This will increase the speed, but will not help with the merging piece.

3. Go to Cogwheel > General Settings. Check the merge batch size: depending on your system, you may want to reduce this number, although that will slow the merging down. If you're getting errors, I suspect the cause is the total batch size, but you may want to experiment with the merge batch size as well. If you're getting any timeout errors, you can increase the CRM timeout setting to up to 10 minutes.

4.  Make sure your local workstation has at least 8 GB of RAM, but I would recommend 16 GB.

5.  If you're both working on different record sets, there should not be a problem at all.  If you were to both be processing the same records, of course, you would most likely get a Salesforce error.  

6.  Standardizing your data in Modify first will speed up the Dedupe scenario, because you won't have to use as many comparison types. It is not necessary, but since you're looking for speed, keep in mind that comparison types and algorithms can add time to the process.
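
To make the idea in point 1 concrete, here is a minimal Python sketch of splitting a contact list into disjoint buckets by the first letter of the last name. It is only an illustration of the partitioning logic, not anything DemandTools runs: the function name `partition_by_last_initial`, the `"other"` bucket, and the sample records are all assumptions made for this example. In the tool itself you would express the same thing as a "starts with" condition on Last Name.

```python
from collections import defaultdict
import string


def partition_by_last_initial(contacts):
    """Group contacts into disjoint buckets keyed by last-name initial.

    Hypothetical helper for illustration only. Records whose last name is
    empty or does not start with a-z go into an "other" bucket.
    """
    buckets = defaultdict(list)
    for contact in contacts:
        # Trim whitespace and ignore case so "amin " and "Amin" land together.
        last = (contact.get("LastName") or "").strip().lower()
        if last and last[0] in string.ascii_lowercase:
            key = last[0]
        else:
            key = "other"
        buckets[key].append(contact)
    return dict(buckets)


# Made-up sample records, not real data.
contacts = [
    {"Id": "1", "LastName": "Amin"},
    {"Id": "2", "LastName": "amin "},
    {"Id": "3", "LastName": "Brown"},
    {"Id": "4", "LastName": "Ångström"},
    {"Id": "5", "LastName": None},
]

buckets = partition_by_last_initial(contacts)
for key in sorted(buckets):
    print(key, [c["Id"] for c in buckets[key]])
```

Because every record falls into exactly one bucket, two people working on different letters never touch the same records, which ties in with point 5. The final pass without the condition is still needed to catch duplicates that straddle buckets, such as a last name mistyped in its first letter.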

Come to Expert Office Hours and we'd love to answer any additional questions!

Anthony Lardiere Jr
Senior Customer Success Manager

emon_amin
Explorer

@AnthonyValidity and @TarsaF_Validity Thank you for responding so quickly to my questions! I like how the tool gathers data so quickly, but yeah, the merging process is slow. In terms of batch sizes (I know you said to mess around with it), are there any common numbers that get recommended for merging? Something like 10, 20, or whatever it may be (the default is 5)?

I am a bit confused about where I would locate and enable "Use Salesforce Merge Check". Are you able to provide a step-by-step write-up for me?

Thank you. 

Hi @emon_amin,

Sorry for the confusion but "Use Salesforce Merge Check" is only for the legacy product.  DemandTools 5.x automatically uses it so you're all good there!

The batch sizes ultimately depend on how your environment is set up. It really varies, and it is something you'll have to test. The higher the number, the more issues you can run into with custom Apex code. The lower the number, the slower the merges.
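
As a rough way to picture the speed side of that trade-off, the Python sketch below counts how many batches a run needs at a few merge batch sizes. It assumes, purely for illustration, that each merge batch costs one round trip to the CRM; that is a simplification, not a documented description of how DemandTools works. The `chunk` helper and the figure of 1,000 duplicate groups are hypothetical, and the sketch says nothing about the Apex side of the trade-off.

```python
def chunk(groups, batch_size):
    """Split a list of duplicate groups into consecutive batches."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    return [groups[i:i + batch_size] for i in range(0, len(groups), batch_size)]


# 1,000 made-up duplicate groups waiting to be merged.
groups = [f"group-{n}" for n in range(1000)]

# Number of batches (assumed round trips) at each merge batch size.
round_trips = {size: len(chunk(groups, size)) for size in (5, 20, 50)}
for size, trips in round_trips.items():
    print(f"merge batch size {size}: {trips} batches")
```

Under that assumption, a smaller merge batch size means more batches for the same number of groups, which is why lower numbers tend to be slower. How far you can raise the number before custom Apex code starts failing is something only testing in your own org can tell you.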

If you're reducing the batch size, the run should go faster. Once you've done your deduping for the first time, it'll be much easier during "maintenance mode."

Hope this helps!

Anthony Lardiere Jr
Senior Customer Success Manager

mcarino68
Observer

Hi, I reduced my file to 300 records and it gets stuck at 8%. I am using a file to do the merge.

Hi @mcarino68,

For this situation, it is best to open a Support ticket. You can use this form to create a ticket, and Support will follow up with you.

Regards,

Anthony Lardiere Jr
Senior Customer Success Manager