(Part #2) Product matching via Machine Learning – Important decisions to be made
- Product matching in Price2Spy
- Previous topic: (Part #1) Product matching via Machine Learning – Introduction to the project
- Next topic: (Part #3) Product matching via Machine Learning – For ML experts – why is product matching so difficult?
Before kicking the project off, we had to make some really important decisions regarding the project scope.
1. Language-specific or universal ML model?
- Of course, anyone would like their solution to be as broadly applicable as possible.
- A language-specific model would probably be more precise, but it would require training for each language individually. And preparing a training set, as you will see, is a very difficult task
- As Price2Spy has clients from literally all over the world, we would need to cover at least 15 different languages, some of them written in non-Latin scripts

- Pretty often we face situations where competitor A uses the English wording for a product, while competitor B goes for the local language. For example, iPhone 11 Red vs iPhone 11 Rot. Our ML model would need to be ready for such cases
- Decision: try to go for a universal solution, by all means
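To make the iPhone 11 Red vs iPhone 11 Rot case concrete, here is a minimal Python sketch of why mixed-language titles are a problem for naive text comparison. It is an illustration only, not the approach Price2Spy uses: the token-overlap (Jaccard) measure and the tiny `COLOR_SYNONYMS` table are assumptions made up for this example.

```python
# Hypothetical lookup table for illustration; a real system would need
# far broader coverage than three color words.
COLOR_SYNONYMS = {"rot": "red", "rouge": "red", "rojo": "red"}


def tokens(title, synonyms=None):
    """Lowercase a product title and split it into a set of tokens,
    optionally mapping known foreign-language words to English."""
    synonyms = synonyms or {}
    return {synonyms.get(t, t) for t in title.lower().split()}


def jaccard(a, b):
    """Share of tokens the two titles have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


# Same product, English vs German color name.
raw = jaccard(tokens("iPhone 11 Red"), tokens("iPhone 11 Rot"))
normalized = jaccard(
    tokens("iPhone 11 Red", COLOR_SYNONYMS),
    tokens("iPhone 11 Rot", COLOR_SYNONYMS),
)

print(raw)         # 0.5: the titles look only half similar
print(normalized)  # 1.0: identical once "rot" is mapped to "red"
```

A hand-written table like this does not scale to 15 or more languages, which is part of why a universal model that copes with mixed wording is attractive.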
2. Industry-specific or universal ML model?
- Price2Spy works with over 25 different industries. Preparing 25 training sets to build 25 different ML models seemed like a nightmare.

- On the other hand, we all know how few similarities there are between the wording of fashion and luxury products and that of tires or fresh food
- Again, an industry-specific model would probably be more precise, but it would require training for each industry individually. And preparing a training set that is representative enough, as you will see, is a very difficult task
- Decision: try to go for a universal solution, by all means
3. Matching accuracy
- One thing that we have learned in 9 years in this business is that a wrong match is something that we cannot afford to have in Price2Spy. Wrong match => Wrong pricing decision. Our customers cannot have that => we cannot have that!
- 99% matching accuracy is not sufficient. Even if it’s only 1% of wrong matches – how can the client know which 1% is wrong?
- ML is all about math and probability. Even when ML claims that we have a 99% probable match – that’s not good enough. Humans need to verify this
- Fortunately enough, verifying a match takes much less human time than establishing one. So, ML will not fully replace the need for human work, but it will significantly reduce it while keeping the match quality at 100%
- Decision: we’re striving for 100% matching accuracy
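The reasoning above can be sketched in a few lines of Python. This is a hedged illustration, not Price2Spy's implementation: the `Candidate` class, the `min_probability` cut-off, and the `human_approves` callback are all hypothetical names introduced for this example. The sketch shows two things: how many wrong matches a 99% accurate model would leave behind, and a workflow in which ML only proposes matches while a human confirms every one that gets stored.

```python
from dataclasses import dataclass


def expected_wrong_matches(total_matches, accuracy):
    """Number of wrong matches to expect at a given accuracy."""
    return round(total_matches * (1 - accuracy))


@dataclass
class Candidate:
    our_product: str
    competitor_product: str
    probability: float  # score from a hypothetical ML model


def review_queue(candidates, min_probability=0.5):
    """Drop unlikely pairs and show reviewers the strongest ones first.
    The 0.5 cut-off is an arbitrary assumption for this sketch."""
    kept = [c for c in candidates if c.probability >= min_probability]
    return sorted(kept, key=lambda c: c.probability, reverse=True)


def confirmed_matches(queue, human_approves):
    """Keep only the candidates a human reviewer has approved."""
    return [c for c in queue if human_approves(c)]


# 99% accuracy still leaves 100 wrong matches among 10,000.
print(expected_wrong_matches(10_000, 0.99))

candidates = [
    Candidate("iPhone 11 Red", "iPhone 11 Rot", 0.99),
    Candidate("iPhone 11 Red", "iPhone 11 Pro Red", 0.97),
    Candidate("iPhone 11 Red", "Galaxy S10 Red", 0.10),
]
queue = review_queue(candidates)

# Stand-in for a human decision: the reviewer rejects the "Pro" variant
# even though the model scored it highly.
approved = confirmed_matches(queue, lambda c: "Pro" not in c.competitor_product)
print([c.competitor_product for c in approved])
```

In this workflow the model's score only decides what the reviewer sees and in which order; nothing is stored as a match without human confirmation, which is how the final match quality can stay at 100%.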
So, we have our 3 key ML matching decisions. On to the next task – preparing the training set!
About Price2Spy
Price2Spy is an online service that provides comprehensive solutions for eCommerce professionals, including retailers, brands/manufacturers, and distributors, to help them stay profitable in current competitive market conditions. If you want to learn more about what Price2Spy can do for your business, please start your 30-day free trial.