It’s happened to anyone who’s ever used a search engine:
- First, you type a keyword into the search box and hit Enter.
- Next, you pick a promising link from the list and hit Enter again.
- Then, you find yourself looking at a page filled with gibberish – words are spliced together on the page in a way that has rendered them nonsensical.
What IS that about, anyway?
What Is Content Scraping?
Scraping is re-purposing content without permission. Sometimes it’s useful: Websites that compare product prices or weather forecasts in vacation locales employ forms of content scraping; the captured data is centrally analyzed before it’s republished. A 2010 report from the Fair Syndication Consortium found that, on average, over 75,000 unlicensed Internet sites scrape US newspaper content over a 30-day period.
Is Content Scraping Legal?
Is content scraping legal? Can I sue if someone steals my website content without my permission?
Unfortunately, the answer isn’t simple.
Copyrighted Content v. Duplicated Facts
While copyright protects original expression, protection doesn’t extend to duplicated facts. Nonetheless, in 2004, Southwest Airlines sued travel reservation websites FareChase and Outtask for publishing Southwest fares and flight information as part of their service offerings.
Are there steps you can take to prevent copyrighted content from being scraped? Sure.
- Watermark your graphics.
- Post written content in non-copyable .pdf format (though this is a horrible idea, because search engine spiders don’t “read” .pdfs).
Grin or Sue?
Practically speaking, the easiest thing to do when it comes to content scraping is to grin and bear it. If you’re writing on the Internet in the 21st century, sooner or later you can expect to run across your misappropriated words on an unauthorized site.
That said, you can always formally copyright your blog or website, and sue for online copyright infringement if anybody steals your content.
The Legalities of Scraping Confidential Information
Scraping is also the term used when user-generated content is lifted and analyzed as part of data mining operations.
In May 2010, a password-protected website called PatientsLikeMe.com (a safe online space for over 70,000 people dealing with mental health challenges) caught BuzzMetrics, a subsidiary of Nielsen Co., in the act of downloading messages posted to PatientsLikeMe’s online forum. Why? Nielsen intended to aggregate the information and sell it to drug companies anxious for patient feedback.
Is scraping information from password-protected sites illegal? It more than likely violates the Terms of Service, but courts have been inconsistent as to whether or not Terms of Service agreements constitute binding contracts.
Moreover, the company that operates PatientsLikeMe also sells information to drug companies; as such, user data might be considered legally protected intellectual property. Additionally, US courts have held that unauthorized use of servers by robo-scrapers constitutes a violation of personal property rights known as “trespass to chattels.”
Consult an Attorney With Content Scraping Experience
Is your content scraping legal? For more information about content scraping, contact a qualified copyright attorney.