Why anti-scraping matters - BotDefender




Web scraping solutions are now available for no more than a few dollars per month, and web scraping for competitive intelligence has become widespread in commerce. Because of these developments, it is more important than ever to protect your business against the automated retrieval of your prices by competitors.

Scraping: a widespread practice

Scraping commerce websites for their prices is a widespread practice. Try the search query "magento web scraping" on Google and see for yourself: the query returns hundreds of jobs posted on various freelance websites, each looking to hire a developer specifically to scrape the site of a competitor.

Obviously, public freelance job posts for Magento are only the tip of the iceberg. Most companies would not post a publicly visible job request on the web. In-house approaches are typically favored, and a growing number of packaged software solutions are readily available as well. Magento is not the only target, either: the same experiment could be repeated for any popular shopping cart software.

Price extraction hurts the most

Pricing is one of the most basic yet powerful means of differentiation in commerce. However, if your competitors know all your prices all the time, it is relatively simple for them to undercut your key prices by just a few cents. This badly hurts your conversion rates when you are trying to acquire new customers, i.e. those who are still comparing multiple options on the web.

Price wars happen when merchants repeatedly cut their prices below those of competitors. When a competitor engages in a price war with you, the consequences are very expensive: either you give up your margins, or you give up market share. In theory (and indeed such was the case a few years ago), your competitors could pay a small army of employees to check your prices; in practice, this is too expensive. Thus, your competitors resort to automated price extraction from your site. By preventing your competitors from automatically reacting to every price change you make, you effectively prevent them from engaging in price wars with you. Let your competitors start their price wars with other competitors.

Commerce sites are increasingly vulnerable

Web standards have been steadily improving over the last decade. In particular, thanks to the widespread support of CSS, the HTML code of commerce sites is now more readable than ever for machines and web developers alike. Overall, this is a very good thing: websites look better, and they are easier to develop and maintain. The downside is that it has never been easier to extract prices from a modern site.

In the HTML code of most commerce sites, prices can usually be identified with patterns no more complex than:

<span class="price">$129.99</span>

where price is the CSS class assigned to the price. A decade ago, HTML pages were cluttered with tables (for layout) and images, which made scraping vastly more complicated.
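To give a sense of how little effort such a pattern demands, here is a minimal sketch in Python that uses only the standard library. It assumes prices are wrapped in a span carrying the price class, as in the snippet above; the names PriceExtractor and extract_prices and the sample markup are hypothetical, chosen for illustration, and a real scraper would also have to fetch pages and cope with nested or irregular markup.

```python
from html.parser import HTMLParser


class PriceExtractor(HTMLParser):
    """Collects the text found inside <span class="price"> elements.

    Illustrative only: assumes the price span contains plain text
    and no nested <span> elements.
    """

    def __init__(self):
        super().__init__()
        self._in_price = False
        self.prices = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs; class may hold several names
        classes = (dict(attrs).get("class") or "").split()
        if tag == "span" and "price" in classes:
            self._in_price = True

    def handle_endtag(self, tag):
        if tag == "span":
            self._in_price = False

    def handle_data(self, data):
        text = data.strip()
        if self._in_price and text:
            self.prices.append(text)


def extract_prices(html):
    """Return the text of every price span found in the given HTML string."""
    parser = PriceExtractor()
    parser.feed(html)
    parser.close()
    return parser.prices


# Hypothetical product page fragment
sample = '<div><h1>Widget</h1><span class="price">$129.99</span></div>'
print(extract_prices(sample))  # ['$129.99']
```

Roughly thirty lines are enough to pull every price out of a page that follows this convention, which is why clean, class-based markup lowers the cost of scraping so much.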

These features of modern commerce sites explain why your competitors can now buy or build web scraping solutions with minimal development effort.

This trend did not stop with CSS. For example, the changes needed to make your site mobile-ready again make things a lot simpler for web scrapers, because smaller pages are faster to download and easier to analyze.