Ever read a “new” post someone has just shared and realize it sounds familiar? Often, once you get to the bottom, it’s noted that the post is “Edited from the Archives” or “A Re-Post.” Bloggers everywhere are re-posting their old posts. Often the re-posted version gets more attention and interaction, presumably since they now have more followers and have likely added a pinnable graphic, etc.
At a glance, it looks like a fabulous idea. And the “big bloggers” are doing it, so everyone else assumes that it’s fine. But when you take a closer look, you run into a problem called “duplicate content.” The average hobby or family update blogger may not be concerned about Google rankings and SEO. But if you are serious about being intentional in your blogging, the last thing you want to do is make your blog hard to find because Google penalized you for duplicate content.
But re-posts from the archives aren’t the only form of duplicate content. Here are four things bloggers need to know about duplicate content—and four ways to avoid it.
(Please note: I am not an SEO expert. This post is based on my own research. I have attempted to over-simplify the information I have found into language that the average blogger can understand. Check out the links at the end of this post if you want to read the more technical posts on the subject.)
4 Things Bloggers Need to Know About Duplicate Content
1. Duplicate content can make Google think you are plagiarizing.
There’s more than one reason to “create before you consume”. If you write a post that sounds just like a post someone else wrote, with your own words sprinkled here and there for personalization, Google is going to notice—just like school teachers notice that a paper sounds just like the encyclopedia article on the topic.
2. Duplicate content can make Google think you are a content thief.
If you publish a post that is an exact duplicate of another, it appears to search engines that one of them is stolen (especially if they don’t link to each other). It might be that you are putting a copy of a guest post you wrote on your own blog, but Google doesn’t know that. It could just as easily be that you copied a post you liked from another blog and put it on your own blog. It’s called “content scraping” and it happens all the time. (You can try out a plugin like Plagiarism to automatically search for scraped copies of your blog’s content. If you’ve discovered an instance of your own content being stolen, click here to find out what to do.)
3. Duplicate content confuses search engines about where to send readers.
Search engines are constantly trying to improve their search results to include only the best and most relevant content for your search terms. That means that something has to go—and that will be the duplicate posts, those deemed irrelevant. And if Google looks at your site as a whole and sees a lot of content duplicated on your site and elsewhere? That will affect your site’s rankings.
4. Duplicate content confuses readers.
Even if you’re not worried about your search engine ranking, re-posting an old post still splits your blog traffic, likes, and shares between two posts. Not to mention that the related discussion (which is always fabulous; I love reading comments!) ends up scattered throughout the archives (which might drive some of us detail people downright crazy). Readers don’t know which post to pin or like, let alone comment on. Duplicate content equals multiplied confusion.
4 Ways Bloggers Can Avoid Duplicate Content Penalties
1. Refresh an old post, don’t re-post from the archives.
It’s tempting, especially when writer’s block hits, to grab an old post, dress it up with a new picture and better grammar, and hit publish again. But that practice litters your archives with duplicate posts, and Google is going to spot that. Best practice is to edit the old post in place, then re-share it via social media and perhaps even your email newsletter. (Watch for the next post filled with ideas and methods for refreshing your archives.)
2. Guest post, don’t submit a previously published post.
Most guest post guidelines specify that you must submit a new, original post—one that has never been published elsewhere, including your own blog. That’s because a guest post that’s already been posted on your own blog isn’t a guest post: it’s duplicate content. Write a guest post that will give potential readers a taste of what they might find on your blog, without giving them your blog posts themselves.
(Looking for tips on writing a guest post? Rachelle Rea writes about how to write a guest post readers will want to read.)
3. Use a different title and intro when linking to your guest post.
When you guest post for someone else, it’s common courtesy (and often an understood agreement) that you’ll write a post on your own blog linking to your guest post. It’s a way to bring your community to theirs, and say thank you for the opportunity. However, you have the potential to hurt your SEO and the site hosting the guest post if you use the same title and beginning paragraphs of your guest post as an introduction on your own blog. It takes a bit more time, but it’s worth the effort to avoid duplicate content issues by creating a unique title and crafting a custom introduction to your guest post. Close your introduction with the words of the link itself in mind:
Visit So and So’s Blog to read my post: Very Brilliant Post title
4. Use excerpts only in your archives.
WordPress is a beautiful content management system in that it creates many ways to categorize and organize your content. However, those could also spell potential duplicate content issues. “Canonical” URLs help resolve this issue, but as an extra precaution, it’s a good idea not to show full posts on your archive pages. Choose excerpts only for your date, category, author, and tag archives. And consider using an SEO plugin to noindex at least some of your archives (on a single author blog you can disable the author archives altogether).
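To make the excerpt and noindex ideas concrete, here is a minimal Python sketch (not WordPress code, and not how any particular SEO plugin works). The function names and the 55-word cutoff are my own placeholders for illustration. It trims a post down to an excerpt for an archive page and builds the robots meta tag that asks search engines not to index that archive.

```python
def make_excerpt(post_text: str, max_words: int = 55) -> str:
    """Return the first max_words words of a post, marking trimmed text with '...'.

    The 55-word default is an arbitrary placeholder; pick whatever suits your theme.
    """
    words = post_text.split()
    if len(words) <= max_words:
        return " ".join(words)
    return " ".join(words[:max_words]) + " ..."


def archive_robots_tag(noindex: bool) -> str:
    """Return a robots meta tag for an archive page, or '' to leave it indexable."""
    return '<meta name="robots" content="noindex">' if noindex else ""


if __name__ == "__main__":
    print(make_excerpt("This is a very long post about duplicate content.", max_words=5))
    print(archive_robots_tag(noindex=True))
```

In practice your theme settings and SEO plugin do this for you; the sketch only shows what those settings boil down to.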
But how much duplicate text actually equals duplicate content?
Only the companies behind the search engines know the answer to that question. The truth is, my post about duplicate content is going to share some of the same keywords and links as other posts about duplicate content. But as long as I’m writing my own slant on the topic, in my own voice—not plagiarizing, or copying large portions of text—I’ll have done my best.
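If you want a rough self-check before you publish, one simple approach is to compare two texts by the short word sequences they share. The Python sketch below is only an illustration under my own assumptions: the function names and the five-word window are made up for this example, and nothing here reflects how Google or any other search engine actually scores duplicate content.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Split text into overlapping runs of `size` words (lowercased, punctuation dropped)."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def overlap_score(a: str, b: str, size: int = 5) -> float:
    """Jaccard similarity of the two shingle sets: 0.0 = nothing shared, 1.0 = identical."""
    sa, sb = shingles(a, size), shingles(b, size)
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


if __name__ == "__main__":
    old_post = "Duplicate content confuses search engines about where to send readers."
    new_post = "Duplicate content confuses search engines and it confuses readers too."
    print(round(overlap_score(old_post, new_post, size=3), 2))
```

A score near 1.0 suggests you are looking at a re-post rather than a fresh take. There is no official threshold, so treat the number as a nudge to rewrite, not a verdict.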
What about quotations?
A commonly cited rule of thumb for quotations is that you’re allowed to quote 50-300 words without permission from the author, depending on the length of the quotation in proportion to the original source. So if you’re quoting a few hundred words or less, duplicate content shouldn’t be an issue. But if you can sum up what they say without quoting them directly, all the better (as long as you give credit where credit is due!).
What about content aggregation?
There are many “aggregator” sites dedicated to gathering links to blog posts all in one place. From what I’ve read, aggregation does not equal duplication as long as only post excerpts are used and the excerpts link back to the original posts. If a site approaches you about aggregating your content, your larger concern should probably be the reputation of the site and whether—in light of possible SEO implications—it will actually bring you more readers.
What about syndication?
Syndication across the web, whether it’s a press release or a column, is going to result in some forms of duplicate content. The site with the highest ranking (a large newspaper, perhaps) will usually be the first to come up in the search results. Be careful where you allow your content to be syndicated. Ask the potential syndicating site what their practice is for dealing with duplicate content and SEO implications. If they haven’t thought about it, they should.
What if a site like the Huffington Post wants to republish my post?
If a site like the Huffington Post approaches you about one of your posts, you probably don’t want to say no. At the same time, the Huffington Post probably won’t want to mark your post on their site as noindex, nofollow, nor add a canonical link tag that lets search engines know your site was the original source. The best practice may therefore be to add noindex, nofollow to your own site’s version of the republished post and/or use a canonical link tag to point to the large site republishing the post as the one search engines should index.
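For the curious, here is roughly what those two options look like in the head of your own copy of the post. This is a small Python sketch that just builds the tags as text; the function name and the example URL are hypothetical, and in real life an SEO plugin would normally add these for you.

```python
from html import escape


def republished_copy_head(republished_url: str,
                          use_canonical: bool = True,
                          use_noindex: bool = False) -> list:
    """Build head tags for YOUR copy of a post that a larger site has republished.

    republished_url is the address of the post on the larger site (hypothetical here).
    """
    tags = []
    if use_canonical:
        # Tell search engines the larger site's copy is the one to index.
        tags.append(f'<link rel="canonical" href="{escape(republished_url, quote=True)}">')
    if use_noindex:
        # Ask search engines to skip this copy and not follow its links.
        tags.append('<meta name="robots" content="noindex, nofollow">')
    return tags


if __name__ == "__main__":
    for tag in republished_copy_head("https://example.com/big-site/my-post"):
        print(tag)
```

The sketch applies only the canonical link by default; pass use_noindex=True to add the robots tag as well. Which option suits your post is a judgment call.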
What if I’ve already duplicated my content on my site or somewhere else?
Watch for the third post in this series where we’ll talk about simple ways to make SEO amends for duplicate content. Fixing the issue can be as complicated as merging and redirecting posts, or as simple as adding a canonical link to the head of each duplicated post pointing to the original source.
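If you are not sure whether a duplicated post already has a canonical link, you can check its HTML. The Python sketch below uses only the standard library; the class and function names are my own, and the sample page is made up for illustration.

```python
from html.parser import HTMLParser


class CanonicalFinder(HTMLParser):
    """Remember the href of the first <link rel="canonical"> tag seen."""

    def __init__(self):
        super().__init__()
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag != "link" or self.canonical is not None:
            return
        attr = dict(attrs)
        rel_values = (attr.get("rel") or "").lower().split()
        if "canonical" in rel_values:
            self.canonical = attr.get("href")


def find_canonical(page_html: str):
    """Return the canonical URL declared in page_html, or None if there isn't one."""
    finder = CanonicalFinder()
    finder.feed(page_html)
    return finder.canonical


if __name__ == "__main__":
    sample = ('<html><head><link rel="canonical" '
              'href="https://example.com/original-post/"></head><body></body></html>')
    print(find_canonical(sample))
```

If the function returns None for a duplicated post, that post is a candidate for a canonical link pointing at the original.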
More in this series:
- How to Refresh Old Posts (without creating duplicate content)
- What to Do If You’ve Already Re-Posted Duplicate Content On Your Blog
More on the subject of SEO and duplicate content:
- SEO Tips for Beginners: What Is Duplicate Content?
- When is Reposting Content On Your Blog a Bad Idea?
- Duplicate Content: How Unique Is Unique Enough?
- Moz: What is Duplicate Content?
- Moz: What is Canonicalization?
- Moz: Beginner’s Guide to SEO
- The Moz Blog: Dealing with Duplicate Content
- The Moz Blog: Cross-Domain Canonical
- The Moz Blog: Duplicate Content in a Post-Panda World