Monday, April 21, 2008

Ugh, Scrapers!

It finally happened! I had my blog posts scraped from my RSS feeds and posted in their entirety on some crap site that exists for ads. Not just one post, but every single post for the past two months.

Yeah, they attributed me by putting a link to my posts with the word “source” at the bottom, but they have ads, and I distinctly say with a Creative Commons license that you can’t use my content for commercial purposes. You cannot scrape the content I work very hard to produce, reprint it word for word, picture for picture, and put ads all over it!

I don’t mind if some blogger uses a picture from my site (attributed to me, of course) to gush about how cute they think a cupcake is. I don’t mind if a food news aggregator takes a snippet of my work (attributed to me, of course) and works it into their own post (Foobooz does an exemplary job of playing nicely). These are all within the parameters of my Creative Commons license (I really don’t need one, everything I write and produce is copyrighted, but it’s a nice way of saying, “You can use my work, if…”).

I’ve caught others infringing on my rights (a Philly restaurant and a Philly gossip/news blog that uses/used a photo without attribution), but let it slide. Small potatoes. I couldn’t let it slide this time.

I’m sure most of you reading this are much more blog/computer savvy than I am, and already know what I'm about to say, but in case you’re wondering if you’re being stolen from, and what to do about it, here’s what I just did:

  • Every few months I check Copyscape to see who is copying my work. Mostly I get plagiarized by crap recipe sites that exist for ads. They usually just take the recipes, and everyone knows that recipes are not really protected.
  • If you find that someone is violating your terms of content use (and can’t contact them to request they remove your content, because most of these sites don’t have a way to contact them), follow these nicely outlined steps to fight scrapers over on ProBlogger.

I had already done step 1 (license your content) from the ProBlogger article.

Step 2 (add a link to your original post in your RSS feed): I just installed a disclaimer at the bottom of my feed that links to my site (not the individual post; we'll see if I need to go there), which should deter sites from automatically scraping my feed. Some people publish only a partial post to RSS feeds, and that helps deter scrapers, too. As a blog reader, I hate this. I’m too lazy/busy to click over to the full post when I’m scrolling the many blogs that are in my reader. Partial post and you’re dead to me.
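For the curious, here is roughly what a feed disclaimer like that amounts to. This is only a minimal sketch, not how my blogging platform actually does it: it assumes a plain RSS 2.0 feed, and the function name add_feed_footer, the footer wording, and the example.com address are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def add_feed_footer(rss_xml: str, site_name: str, site_url: str) -> str:
    """Append an attribution line to every <item><description> in an RSS 2.0 feed.

    Hypothetical helper: the name and the footer wording are illustrative only.
    """
    root = ET.fromstring(rss_xml)
    footer = (
        f'<p>This post originally appeared on '
        f'<a href="{site_url}">{site_name}</a>. '
        f'If you are reading it elsewhere, it was copied without permission.</p>'
    )
    for item in root.iter("item"):
        description = item.find("description")
        if description is None:
            # Items without a description still get the footer.
            description = ET.SubElement(item, "description")
        description.text = (description.text or "") + footer
    # ElementTree escapes the footer's HTML when serializing, which is how
    # markup is normally carried inside an RSS description element.
    return ET.tostring(root, encoding="unicode")


# Made-up sample feed, just to show the call.
sample = """<rss version="2.0"><channel><title>Example Blog</title>
<item><title>Cupcakes</title><description>Full post text.</description></item>
</channel></rss>"""

updated = add_feed_footer(sample, "Example Blog", "https://example.com/")
```

A scraper that republishes the feed word for word ends up republishing the footer too, so anyone reading the copy gets pointed back to the real site. Linking to the individual post instead would mean building the footer from each item's own link element.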

I just completed step 3 (report scrapers to AdSense); step 4 (report scrapers to Google); and step 5 (report scrapers to their web hosting service).

As of right now, the web hosting service (they acted very quickly!) has disabled web access for the stolen content (and, for now, their entire site), and has informed the client to remove the stolen content. And there’s a letter in the mail to Google (gotta have it in writing! And I will write a letter in a heartbeat).
