Site crawl and Competitor product extraction

Competitor site crawling and product data extraction

Need to extract all product data from competitor sites in one go?
Need a detailed overview of your competitor’s assortment, along with full pricing/product description, category, and brand information?
The Price2Spy team has mastered the site crawl and competitor product extraction process, which can help your business gather such valuable data in bulk.
Just send us the list of competitor sites from which you want product data extracted, and we’ll be happy to give you a quote for this task.

Some of the key technical strengths that distinguish Price2Spy Crawler from other similar tools are the following:

  • Ability to crawl and extract data from very complex websites
    • Having a very complex page navigation structure
    • Having complex JavaScript menu and/or paging implementation
    • Having strong anti-bot protection (sites that do not want to be crawled) – for example, Amazon
    • Capturing multiple product variations shown on the same product page
    • Having huge amounts of products
    • Requiring browser interaction before scraping data
  • Crawl size / location
    • We can crawl websites in any language / any country
    • Crawling entire website / crawling only specific product categories/brands (for example – it doesn’t make much sense to crawl the whole of Amazon. However, crawling several specific product categories on Amazon can be done)
    • Crawling websites that are location-sensitive (showing different results depending on the visitor’s IP / ZIP code). For example, a site may show different results depending on whether you are an international or a US visitor. This applies to Amazon websites as well.
    • Big (more than 10 000 000 product pages) or small (fewer than 1000 product pages) crawls can both be done
    • Crawling and scraping content in any language/script (Latin, Arabic, Chinese, Cyrillic, …)
  • Extraction results
    • We have the ability to capture data fields that are not shown on the product page itself (for example fields shown on the category page, shown before reaching the product page)
    • Extraction results can be delivered in a list (Excel, CSV, XML). If you need a custom format, do let us know
    • Extraction results can be run against automated translation services (so you get results in your preferred language)
    • Data can be stored either on your or on our server, or even on Cloud storage
    • The default format is CSV – easy for data processing on your side
  • One-off or repetitive
    • Price2Spy’s Product Extraction service can be used as a one-off data source, or as a repetitive process
    • If you go for a repetitive crawl/extraction, you will be able to determine the recrawl frequency
  • Product matching
    • Products that have been extracted can be matched (automatically or manually) to your own products
  • Data privacy
    • Data crawled for you will be shared with no other clients
  • Customizability
    • You are welcome to define specific rules on what needs to be scraped (for example: combining data from Category Page and data from the product page into a single result set)
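
To illustrate the kind of rule mentioned above (combining category page data with product page data into a single result set, delivered as CSV), here is a minimal Python sketch. It is not Price2Spy's implementation: the function names `merge_records` and `to_csv`, the `url` join key, the field names and the example.com URLs are all assumptions made for this example.

```python
import csv
import io


def merge_records(category_rows, product_rows, key="url"):
    """Join category-page fields onto product-page rows that share the same key."""
    by_key = {row[key]: row for row in category_rows}
    merged = []
    for product in product_rows:
        # Start from the category-page fields (if any), then overlay the
        # product-page fields, so product-page values win on conflicts.
        combined = dict(by_key.get(product[key], {}))
        combined.update(product)
        merged.append(combined)
    return merged


def to_csv(rows):
    """Serialize rows to CSV text; fields missing from a row are left empty."""
    fieldnames = []
    for row in rows:
        for name in row:
            if name not in fieldnames:
                fieldnames.append(name)
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, restval="")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Hypothetical sample data: one field set scraped from a category page,
# another from the product pages themselves.
category_rows = [
    {"url": "https://example.com/p/1", "category": "Shoes", "list_price": "49.90"},
]
product_rows = [
    {"url": "https://example.com/p/1", "name": "Runner", "brand": "Acme"},
    {"url": "https://example.com/p/2", "name": "Walker", "brand": "Acme"},
]

print(to_csv(merge_records(category_rows, product_rows)))
```

In this sketch a product that never appeared on a category page is still kept, with the category fields left blank, so that no product page is silently dropped from the result set.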

Use cases

We have performed crawl / extraction operations for many of our clients, and we have noticed that they can be roughly grouped into the following use cases:

  • For online retailers
    • Extracting a competitor’s complete assortment (to be used as a data source for adding new products to your own store)
    • Extracting deltas in a competitor’s assortment
      • Knowing which products have been added to the competitor’s website
      • Knowing which products have been discontinued by your competitor
      • (of course, this requires a periodic recrawl)
  • For Brands / Distributors
    • Extracting product reviews from retail websites, in order to determine consumer sentiment towards the product (this service is sometimes combined with automated review translation)
    • Extracting newly released products from competitor brands
    • Data science – providing product (or user review) data for in-house data science projects. Data can be provided in original language, or machine-translated
  • For Marketing Agencies
    • Extracting products from Ecommerce sites (online stores) on behalf of their clients
    • Extracting products and their reviews from review websites
    • Data science – providing product (or user review) data for data science projects performed on behalf of agency clients. Data can be provided in original language, or machine-translated
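
The assortment delta described in the retailer use case above (products added or discontinued between two crawls) can be pictured as a set comparison. The following Python sketch assumes each crawl is available as a dictionary keyed by a stable product identifier. The function name `assortment_delta` and the sample SKUs are hypothetical and do not describe Price2Spy's actual process.

```python
def assortment_delta(previous, current):
    """Compare two crawls, each a dict keyed by a stable product identifier."""
    prev_ids = set(previous)
    curr_ids = set(current)
    return {
        # Present in the latest crawl but not in the earlier one.
        "added": sorted(curr_ids - prev_ids),
        # Present in the earlier crawl but missing from the latest one.
        "discontinued": sorted(prev_ids - curr_ids),
    }


# Hypothetical crawl snapshots keyed by SKU.
last_week = {"SKU-1": {"name": "Runner"}, "SKU-2": {"name": "Walker"}}
this_week = {"SKU-2": {"name": "Walker"}, "SKU-3": {"name": "Hiker"}}

print(assortment_delta(last_week, this_week))
```

The comparison is only as reliable as the identifier: if a competitor changes product URLs or SKUs between crawls, the same product would show up as both discontinued and added.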

Results of Price2Spy crawls/extractions can later be used as a feed for a regular Price2Spy account, so that newly extracted products are continuously price-monitored. If needed, a continuous crawl process can be part of your Price2Spy Enterprise package.