Web scraping is a technique that lets people easily extract large amounts of information from around the web, for both legal and illegal uses. Distil Networks’ 2016 Economics of Web Scraping Report notes that bots, which make up approximately 46% of web traffic, often carry out web scraping faster than humans can. The report found that about two percent of online revenue is lost to web scraping.
Other key findings in the Distil Networks report:
- 38% of companies that engage in web scraping do so to take content.
- Web scraping is inexpensive, costing about $3.33 per hour.
- The average web scraping project costs approximately $135.
- Real estate sites are the #1 web scraping target.
- The average web scraper salary ranges from $58,000 to $128,000 per year.
There are six primary reasons for web scraping:
- Content scraping: stealing original content from one website and posting it on another without the publisher’s knowledge or permission;
- Research: gaining marketplace intelligence;
- Contact scraping: acquiring customer emails and contact information for marketing and lead-generation;
- Price comparison: tracking competitors’ prices, an activity especially common in the real estate and travel industries;
- Weather data monitoring; and
- Website change detection: notifying users about changes made to specific websites.
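To make the last item concrete, website change detection can be sketched as comparing a stored fingerprint of a page with a fingerprint of its current content. The example below is only an illustrative sketch, not a method described in the report: the function names `fingerprint` and `has_changed` and the sample page snapshots are hypothetical, and a real tool would also need to fetch the pages and respect each site's terms of use.

```python
import hashlib


def fingerprint(page_text: str) -> str:
    """Return a SHA-256 hex digest of the page text.

    Runs of whitespace are collapsed first so that pure formatting
    differences do not count as a change (an assumption of this sketch).
    """
    normalized = " ".join(page_text.split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


def has_changed(previous_fingerprint: str, current_text: str) -> bool:
    """Report whether the current page text differs from the saved fingerprint."""
    return fingerprint(current_text) != previous_fingerprint


# Hypothetical snapshots of the same page taken at two different times.
old_snapshot = "<h1>Listings</h1><p>3 homes for sale</p>"
new_snapshot = "<h1>Listings</h1><p>4 homes for sale</p>"

saved = fingerprint(old_snapshot)
print(has_changed(saved, old_snapshot))  # False
print(has_changed(saved, new_snapshot))  # True
```

Storing only a hash keeps the saved state small, at the cost of not knowing what changed; a tool that needs to show the difference would have to keep the previous text as well.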
It’s important for publishers to understand the extensive web scraping economy so they can take steps to protect their proprietary information.