03-28-2023 10:02 AM - edited 03-28-2023 01:59 PM
Hi All,
I was wondering if I could get your input on using this tool. My team and I recently had DemandTools purchased, as it appears to be a prominent tool and we need to dedupe millions of contact records. It gathers duplicate records fast, but the mass merging is quite slow, and it occasionally crashes, which is a bit frustrating. I may be unaware of other features in the tool, so I would like to know:
- Is there a way to speed up the merging process? I saw something about "Use Salesforce Merge" checked in this source.
- If my colleague and I would like to run a mass merge at the same time (with different selection criteria, so we don't touch the same records), will that negatively impact the tool or anything in the background?
- From my understanding there are no limits in the tool itself; it depends on the power of the machine. What is the recommended batch size for a mass merge?
I am open to suggestions, and if you have any questions to further clarify the situation, please let me know. Thank you!
P.S. we are using Salesforce.
03-28-2023 02:49 PM - edited 03-29-2023 08:54 AM
Howdy @emon_amin, great questions! I hope I can point you in the right direction with a few answers. The article you found is a perfect place to find many of our best deduplication suggestions!
1) "Use Salesforce Merge" (Demand Tools 2.91) as well as "Bulk API Calls" are usually the best way to reduce the time it takes to run a scenario. Also controlling for Batch size is also important!
2) You CAN run simultaneous jobs; however, if two people in an org are running the same job configuration, they can overwrite one another's work. Coordinate with other people running jobs whenever possible.
3) We don't have an exact recommendation for an optimal batch size. Exactly where the upper threshold lies for the total number of groups that can be merged is unknown, and it will vary by organization.
You might also find some valuable insights in this article: DeDupe Playbook
03-29-2023 05:09 AM - edited 03-29-2023 05:10 AM
Hi @emon_amin,
I'd like to add some additional information to what my colleague Forrest has provided.
1. Start your batches off by conditioning on Last Name "starts with" a, b, c, etc. Conditioning this way breaks one huge job into smaller batches. At the very end of your deduping, run a final pass without this condition. You can also do this by State if you prefer, but I like Last Name better (see the first sketch after this list).
2. Use the Bulk API feature if you're sorting through 1.5 million or more records. This will increase the retrieval speed, but will not help with the merging piece.
3. Go to Cogwheel > General Settings and check the merge batch size. Depending on your system you may want to reduce this number, though that will slow things down. If you're getting errors, it may just be the total batch size, but you may want to experiment with the merge size as well. If you're getting timeout errors, you can increase the CRM timeout setting to up to 10 minutes.
4. Make sure your local workstation has at least 8 GB of RAM, but I would recommend 16 GB.
5. If you're both working on different record sets, there should not be a problem at all. If you were both processing the same records, of course, you would most likely get a Salesforce error.
6. Standardizing your data in Modify first will speed up the dedupe scenario because you won't need as many comparison types. It is not necessary, but since you're looking for speed, keep in mind that fuzzy comparison types and algorithms add time to the process (see the second sketch after this list).
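To make the "condition on Last Name" idea in point 1 concrete, here is a minimal Python sketch outside of DemandTools, using the simple-salesforce library with placeholder credentials. DemandTools applies the condition inside the scenario itself; this only illustrates how prefix partitioning splits one huge record set into smaller, independent slices, and how a Bulk API query (point 2) retrieves each slice.

```python
import string

from simple_salesforce import Salesforce

# Placeholder credentials; substitute your own org's login details.
sf = Salesforce(
    username="user@example.com",
    password="password",
    security_token="token",
)

for letter in string.ascii_lowercase:
    # One Bulk API query per last-name initial keeps each slice small.
    records = sf.bulk.Contact.query(
        "SELECT Id, FirstName, LastName, Email FROM Contact "
        f"WHERE LastName LIKE '{letter}%'"
    )
    print(f"LastName starting with '{letter}': {len(records)} records")

# A final pass with no prefix condition catches anything left over,
# e.g. last names starting with digits or punctuation.
```

Each letter's slice can then be deduped and merged as its own job, which is exactly what the "starts with" condition accomplishes inside the tool.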
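And for point 6, a second (equally hypothetical) sketch of the kind of standardization a Modify pass performs: once values are mapped to one canonical form, the dedupe scenario can rely on cheap exact comparisons instead of slower fuzzy-matching algorithms.

```python
# Map common free-text variants onto one canonical abbreviation.
STATE_ALIASES = {
    "california": "CA", "calif.": "CA", "ca": "CA",
    "new york": "NY", "n.y.": "NY", "ny": "NY",
}

def normalize_state(value: str) -> str:
    """Return the canonical state code for a free-text state value."""
    cleaned = value.strip().lower()
    return STATE_ALIASES.get(cleaned, value.strip().upper())

assert normalize_state(" Calif. ") == "CA"
assert normalize_state("ny") == "NY"
assert normalize_state("TX") == "TX"   # unknown values pass through unchanged
```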
Come to Expert Office Hours and we'd love to answer any additional questions!
03-29-2023 07:17 AM
@AnthonyValidity and @TarsaF_Validity Thank you for responding so quickly to my questions! I like how the tool gathers data so quickly, but yes, the merging process is slow. In terms of batch sizes (I know you said to experiment), are there any common numbers when it comes to a merge batch size recommendation, like 10, 20, or whatever it may be (the default is 5)?
I am also a bit confused about where I would locate and enable "Use Salesforce Merge Check". Are you able to provide step-by-step instructions for me?
Thank you.
03-29-2023 07:20 AM
Hi @emon_amin,
Sorry for the confusion, but "Use Salesforce Merge Check" exists only in the legacy product. DemandTools 5.x uses it automatically, so you're all good there!
The right batch size ultimately comes down to how your environment is set up; it really depends, and it's something you'll have to test. The higher the number, the more issues you can run into with custom Apex code. The lower the number, the slower the merges.
Raising the batch size should make the run go faster, at the cost of more potential errors. Once you've done your deduping for the first time, it'll be much easier in "maintenance mode."
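As a purely illustrative sketch (this is not DemandTools internals), here is the shape of that trade-off in Python: larger batches mean fewer round trips, but one failing group, say from a custom Apex trigger error, can cost you the whole batch, which is why falling back to smaller submissions is a common pattern. `submit_merge_batch` is a hypothetical stand-in for whatever actually performs the merges.

```python
from typing import Callable, List, Sequence

# DupeGroup is a stand-in for one group of duplicate records to be merged.
DupeGroup = List[str]

def merge_in_batches(
    groups: Sequence[DupeGroup],
    submit_merge_batch: Callable[[Sequence[DupeGroup]], None],
    batch_size: int = 5,  # the default merge batch size mentioned above
) -> None:
    """Submit duplicate groups in batches, retrying failures one by one."""
    for start in range(0, len(groups), batch_size):
        batch = groups[start:start + batch_size]
        try:
            submit_merge_batch(batch)  # one round trip for the whole batch
        except Exception:
            # Isolate the failure: a single bad group no longer
            # takes the rest of the batch down with it.
            for group in batch:
                submit_merge_batch([group])
```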
Hope this helps!
06-27-2023 09:27 AM
Hi, I reduced my file to 300 records and it gets stuck at 8%. I am using a file to do the merge.
06-27-2023 10:23 AM
Hi @mcarino68,
For this situation, it is best to open a Support ticket. You can use this form to create one, and Support will follow up with you.
Regards,