{"id":7306,"date":"2020-06-19T12:12:46","date_gmt":"2020-06-19T12:12:46","guid":{"rendered":"https:\/\/www.price2spy.com\/blog\/?p=7306"},"modified":"2020-07-28T10:59:09","modified_gmt":"2020-07-28T10:59:09","slug":"part-8-product-matching-via-ml-testing-on-various-industries-languages","status":"publish","type":"post","link":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/","title":{"rendered":"(Part #8) Product matching via ML: Testing on various industries\/languages"},"content":{"rendered":"\n<p> <a rel=\"noreferrer noopener\" aria-label=\"Product matching in Price2Spy (opens in a new tab)\" href=\"https:\/\/www.price2spy.com\/en\/pricing\/product-matching.html\" target=\"_blank\">Product matching in Price2Spy<\/a> <\/p>\n\n\n\n<p><strong> Previous topic: <\/strong> <a rel=\"noreferrer noopener\" aria-label=\"(Part #7) Product matching via ML: Post-processing  (opens in a new tab)\" href=\"https:\/\/www.price2spy.com\/blog\/part-7-product-matching-via-ml-post-processing\/\" target=\"_blank\">(Part #7) Product matching via ML: Post-processing <\/a><\/p>\n\n\n\n<p><strong>Next topic:<\/strong> <a href=\"https:\/\/www.price2spy.com\/blog\/part-9-ml-does-work-but-its-not-magic\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"(Part #9) ML does work, but it\u2019s not magic  (opens in a new tab)\">(Part #9) ML does work, but it\u2019s not magic <\/a><\/p>\n\n\n\n<p>The <a href=\"https:\/\/www.price2spy.com\/en\/pricing\/product-matching.html\">Product Matching project<\/a> had been going on for a while, now was the time to remember our initial decisions, and put them to the test:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li>Languages \u2013 we have too many languages, we need a universal solution <\/li><li>Industries \u2013 again, too many industries, we need a universal solution<\/li><\/ul>\n\n\n\n<p>Fortunately enough, <a href=\"https:\/\/www.price2spy.com\/\">Price2Spy<\/a> clients constantly 
keep us busy with new matching tasks, so we had good evaluation samples.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter size-full\"><img loading=\"lazy\" decoding=\"async\" width=\"600\" height=\"314\" src=\"https:\/\/www.price2spy.com\/blog\/wp-content\/uploads\/2020\/06\/Testing-on-various-industries.png\" alt=\"ML testing on various industries\" class=\"wp-image-7307\" srcset=\"https:\/\/www.price2spy.com\/blog\/wp-content\/uploads\/2020\/06\/Testing-on-various-industries.png 600w, https:\/\/www.price2spy.com\/blog\/wp-content\/uploads\/2020\/06\/Testing-on-various-industries-768x401.png 768w\" sizes=\"auto, (max-width: 600px) 100vw, 600px\" \/><\/figure><\/div>\n\n\n\n<p>Therefore, we ran the following tests:<\/p>\n\n\n\n<p><strong>1. German musical products<\/strong>, but from smaller websites, outside the 12 major websites used for training the ML model \u2013 it worked great. The language and the industry are the same as in the ML training set, so good results were expected.<\/p>\n\n\n\n<p><strong>2. Italian musical products<\/strong> \u2013 same industry, but a very different language (both use the Latin script, though). Results were only slightly worse than in test 1. However, one should keep in mind that both Italian and German stores use English wording quite often (but not always!).<\/p>\n\n\n\n<p><strong>3<\/strong>. <strong>Australian consumer electronics products <\/strong>\u2013 a different industry and a different language compared to the ML training set. At first, the results were rather poor. This is when we figured out that we needed extra features, namely:<\/p>\n\n\n\n<ul class=\"wp-block-list\"><li><strong>Entity recognition<\/strong><\/li><li><strong>Additional alpha-numeric features<\/strong> (to cater for broader variations of MPNs)<\/li><\/ul>\n\n\n\n<p><strong>After adding these features, the results got much better.<\/strong><\/p>\n\n\n\n<p><strong>4. 
Pool cleaning equipment<\/strong> \u2013 Italy, Spain, France, Benelux \u2013 many different languages, and an industry that has nothing in common with the ML training set. Here the results were rather poor. After deep troubleshooting, it turned out that some websites did not use standardized product names, but rather introduced model names of their own. So, the ML model is not a piece of magic; it won\u2019t always work!<\/p>\n\n\n\n<p><strong>5.<\/strong> <strong>Books, perfumes, and toys from Romania<\/strong> \u2013 again, a language that differs from our ML training data, and a very different industry. The results were great for perfumes and toys, but not for books (sequels gave us a lot of trouble \u2013 their naming is almost identical, but they are definitely not the same books). So, again, a good lesson on when ML can be trusted more, and when extra human work is needed.<\/p>\n\n\n\n<p><strong>6.<\/strong> <strong>Mixed products from the Middle East (food, consumer electronics, office supplies, etc.)<\/strong> \u2013 English language (so, differing from our ML training set), and a very different industry. The results were great!<\/p>\n\n\n\n<p>As we were performing these tests, we were learning and improving our ML model.<\/p>\n\n\n\n<p>Most importantly, we proved that the concept works \u2013 it can be used for (almost) any language, and for most industries.<\/p>\n\n\n\n<p>Both accuracy and sensitivity figures were going up, which was good. 
But one thing was troubling us \u2013 remember the initial set of decisions &#8211; we\u2019re striving for 100% matching accuracy.<\/p>\n\n\n\n<p>There was more work to be done.<\/p>\n\n\n\n<p><strong>More about it on the following links:<\/strong><\/p>\n\n\n\n<ul class=\"wp-block-list\"><li> <a rel=\"noreferrer noopener\" href=\"https:\/\/www.price2spy.com\/en\/pricing\/product-matching.html\" target=\"_blank\">Product matching in Price2Spy<\/a>  <\/li><li> <strong>Previous topic: <\/strong> <a rel=\"noreferrer noopener\" href=\"https:\/\/www.price2spy.com\/blog\/part-7-product-matching-via-ml-post-processing\/\" target=\"_blank\">(Part #7) Product matching via ML: Post-processing <\/a> <\/li><li> <strong>Next topic:<\/strong> <a href=\"https:\/\/www.price2spy.com\/blog\/part-9-ml-does-work-but-its-not-magic\/\" target=\"_blank\" rel=\"noreferrer noopener\" aria-label=\"(Part #9) ML does work, but it\u2019s not magic  (opens in a new tab)\">(Part #9) ML does work, but it\u2019s not magic <\/a> <\/li><\/ul>\n","protected":false},"excerpt":{"rendered":"<p>Product matching in Price2Spy Previous topic: (Part #7) Product matching via ML: Post-processing Next topic: (Part #9) ML does work, but it\u2019s not magic The Product Matching project had been going on for a while, now was the time to remember our initial decisions, and&#8230;<\/p>\n","protected":false},"author":4,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[108,167],"tags":[190,645,646,15,81],"class_list":["post-7306","post","type-post","status-publish","format-standard","hentry","category-best-practices","category-new-price2spy-features","tag-ecommerce","tag-machine-learning","tag-ml","tag-price2spy","tag-product-matching"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.3 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>(Part #8) ML Testing on various 
industries\/languages<\/title>\n<meta name=\"description\" content=\"Find out more about Product matching via ML: Testing on various industries\/languages\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"(Part #8) ML Testing on various industries\/languages\" \/>\n<meta property=\"og:description\" content=\"Find out more about Product matching via ML: Testing on various industries\/languages\" \/>\n<meta property=\"og:url\" content=\"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/\" \/>\n<meta property=\"og:site_name\" content=\"Price2Spy\u00ae Blog\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/Price2Spy\/\" \/>\n<meta property=\"article:published_time\" content=\"2020-06-19T12:12:46+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2020-07-28T10:59:09+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/www.price2spy.com\/blog\/wp-content\/uploads\/2020\/06\/Testing-on-various-industries.png\" \/>\n<meta name=\"author\" content=\"Mi\u0161a Kruni\u0107\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@Price2Spy\" \/>\n<meta name=\"twitter:site\" content=\"@Price2Spy\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Mi\u0161a Kruni\u0107\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"2 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"(Part #8) ML Testing on various industries\/languages","description":"Find out more about Product matching via ML: Testing on various industries\/languages","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/","og_locale":"en_US","og_type":"article","og_title":"(Part #8) ML Testing on various industries\/languages","og_description":"Find out more about Product matching via ML: Testing on various industries\/languages","og_url":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/","og_site_name":"Price2Spy\u00ae Blog","article_publisher":"https:\/\/www.facebook.com\/Price2Spy\/","article_published_time":"2020-06-19T12:12:46+00:00","article_modified_time":"2020-07-28T10:59:09+00:00","og_image":[{"url":"https:\/\/www.price2spy.com\/blog\/wp-content\/uploads\/2020\/06\/Testing-on-various-industries.png","type":"","width":"","height":""}],"author":"Mi\u0161a Kruni\u0107","twitter_card":"summary_large_image","twitter_creator":"@Price2Spy","twitter_site":"@Price2Spy","twitter_misc":{"Written by":"Mi\u0161a Kruni\u0107","Est. 
reading time":"2 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/#article","isPartOf":{"@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/"},"author":{"name":"Mi\u0161a Kruni\u0107","@id":"https:\/\/www.price2spy.com\/blog\/#\/schema\/person\/382ac9db90cb7d6dd54b9425857fc96c"},"headline":"(Part #8) Product matching via ML: Testing on various industries\/languages","datePublished":"2020-06-19T12:12:46+00:00","dateModified":"2020-07-28T10:59:09+00:00","mainEntityOfPage":{"@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/"},"wordCount":507,"commentCount":0,"image":{"@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/#primaryimage"},"thumbnailUrl":"https:\/\/www.price2spy.com\/blog\/wp-content\/uploads\/2020\/06\/Testing-on-various-industries.png","keywords":["ecommerce","machine learning","ml","price2spy","product matching"],"articleSection":["Best practices in price monitoring","New Price2Spy features"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/","url":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/","name":"(Part #8) ML Testing on various 
industries\/languages","isPartOf":{"@id":"https:\/\/www.price2spy.com\/blog\/#website"},"primaryImageOfPage":{"@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/#primaryimage"},"image":{"@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/#primaryimage"},"thumbnailUrl":"https:\/\/www.price2spy.com\/blog\/wp-content\/uploads\/2020\/06\/Testing-on-various-industries.png","datePublished":"2020-06-19T12:12:46+00:00","dateModified":"2020-07-28T10:59:09+00:00","author":{"@id":"https:\/\/www.price2spy.com\/blog\/#\/schema\/person\/382ac9db90cb7d6dd54b9425857fc96c"},"description":"Find out more about Product matching via ML: Testing on various industries\/languages","breadcrumb":{"@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/#primaryimage","url":"https:\/\/www.price2spy.com\/blog\/wp-content\/uploads\/2020\/06\/Testing-on-various-industries.png","contentUrl":"https:\/\/www.price2spy.com\/blog\/wp-content\/uploads\/2020\/06\/Testing-on-various-industries.png","width":600,"height":314},{"@type":"BreadcrumbList","@id":"https:\/\/www.price2spy.com\/blog\/part-8-product-matching-via-ml-testing-on-various-industries-languages\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Home","item":"https:\/\/www.price2spy.com\/blog\/"},{"@type":"ListItem","position":2,"name":"(Part #8) Product matching via ML: Testing on various 
industries\/languages"}]},{"@type":"WebSite","@id":"https:\/\/www.price2spy.com\/blog\/#website","url":"https:\/\/www.price2spy.com\/blog\/","name":"Price2Spy\u00ae Blog","description":"Price2Spy\u00ae","potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/www.price2spy.com\/blog\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Person","@id":"https:\/\/www.price2spy.com\/blog\/#\/schema\/person\/382ac9db90cb7d6dd54b9425857fc96c","name":"Mi\u0161a Kruni\u0107","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/31aa4afb2464eca1f1ca0c7979628c87e54e7a6b53ebcb371749e9349d27c850?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/31aa4afb2464eca1f1ca0c7979628c87e54e7a6b53ebcb371749e9349d27c850?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/31aa4afb2464eca1f1ca0c7979628c87e54e7a6b53ebcb371749e9349d27c850?s=96&d=mm&r=g","caption":"Mi\u0161a Kruni\u0107"},"description":"Father of 2, Husband of 1, CEO of 3 
:-)","sameAs":["http:\/\/www.price2spy.com"],"url":"https:\/\/www.price2spy.com\/blog\/author\/misha\/"}]}},"_links":{"self":[{"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/posts\/7306","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/users\/4"}],"replies":[{"embeddable":true,"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/comments?post=7306"}],"version-history":[{"count":5,"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/posts\/7306\/revisions"}],"predecessor-version":[{"id":7417,"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/posts\/7306\/revisions\/7417"}],"wp:attachment":[{"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/media?parent=7306"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/categories?post=7306"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.price2spy.com\/blog\/wp-json\/wp\/v2\/tags?post=7306"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}