Once many of your pages have been indexed, you can safely look for what needs to be fixed. The first phase of increasing your site’s ranking (increasing inbound links) is already finished. The next phase is to reduce the amount of duplicate content on your site.
What is duplicate content?
Duplicate content is exactly what it sounds like: content that appears in more than one place and is exactly the same.
Why do we need to fix it?
We need to fix it so that Google and other search engines don’t see our pages as spam. Duplicate content is a great way to be penalized by search engines and to reduce the number of your pages in their index. Too much duplicate content can lead to a complete ban from their SERPs.
How to spot duplicate content?
Spotting duplicate content has become much harder recently with the removal of the supplemental index. In the past, a simple search for [site:www.yoursite.com *** -sljktf] would spit out the pages that had been ‘penalized’ by the duplicate content filter and shoved into the supplemental index.
Now, duplicate content is indeed harder to locate. To keep your site free of duplicate content, you can take these steps:
- Make sure each page has only one URI that can be found by search engines. If your site must have a print version and a screen version, use your robots.txt file to exclude the print pages, or add a robots meta tag with noindex to them.
- Do not copy and paste long paragraphs directly onto your pages. Quotes and paraphrases are fine, but anything longer than a few lines should be taken out. If the page must have the same content as another page, surround it with unique text so Google, Yahoo, MSN and other search engines can easily distinguish the page.
- Try to give each page at least 250 additional words / HTML tags that are not part of the template.
- Use 301s to redirect old versions of pages to the new version.
- Keep internal linking consistent. Don’t link to /page/, /page and /page/index.html; choose one and only one. I have seen many websites link their ‘home’ button to /index.html. Don’t make this mistake. It will be seen as duplicate content, and even in the 1% chance that it isn’t, you will be splitting the page’s PageRank in two.
- Use country-specific TLDs when the content is country specific. This will not only help deter search engines from labeling the content as external-internal duplicate content, but will also help your site’s rankings in local searches.
- Make sure your syndicated content has a link back to the original. This will help search engines decide which is the original.
- Use Google Webmaster Tools to select your preferred version of the domain, www or non-www.
- Don’t have empty pages. Many sites have stub pages for future articles that will be expanded. If you must, and I mean must, have these pages, at least block the pages using your robots.txt.
- Blog/CMS software: understand how your blog or CMS software places its pages. If your site includes a date archive, a directory archive, and an index page that shows the latest posts, your site will have duplicate content problems. Make sure to block the pages that are least likely to rank high and leave only one.
- If a page is ranking better than your original, consider filing a DMCA request to claim ownership of your content within all search engines.
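The "one URI per page" and "consistent internal linking" steps above can be checked with a small script. What follows is only a minimal sketch: it assumes www.yoursite.com is the preferred host and that the trailing-slash form (/page/) is the version you chose, and the names `PREFERRED_HOST`, `INDEX_NAMES`, `canonical_url` and `group_variants` are made up for illustration. Feed it the internal links from your own pages and it groups the ones that point at the same page under different URLs.

```python
from urllib.parse import urlsplit, urlunsplit

# Assumptions for illustration only: the preferred host is the www version
# and directories are linked with a trailing slash.
PREFERRED_HOST = "www.yoursite.com"
INDEX_NAMES = ("index.html", "index.htm", "index.php")


def canonical_url(url: str) -> str:
    """Map the common variants of one page onto a single URL."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    # Fold the bare domain onto the preferred www host.
    if "www." + host == PREFERRED_HOST:
        host = PREFERRED_HOST
    path = parts.path or "/"
    last = path.rsplit("/", 1)[-1]
    if last.lower() in INDEX_NAMES:
        # /page/index.html -> /page/
        path = path[: len(path) - len(last)]
    elif last and "." not in last:
        # /page -> /page/ (skipped when the last segment looks like a file)
        path += "/"
    # Drop the fragment; keep the query string untouched.
    return urlunsplit((parts.scheme, host, path, parts.query, ""))


def group_variants(urls):
    """Return {canonical: [variants]} for pages linked under several URLs."""
    groups = {}
    for url in urls:
        groups.setdefault(canonical_url(url), []).append(url)
    return {canon: found for canon, found in groups.items() if len(found) > 1}


links = [
    "http://www.yoursite.com/page",
    "http://www.yoursite.com/page/",
    "http://yoursite.com/page/index.html",
]
for canon, found in group_variants(links).items():
    print(canon, "is linked as", found)
```

Any group it reports is a candidate for fixing the internal links and adding a 301 from the extra versions to the one you kept.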
Don’t worry about your template as it has no bearing on whether or not your pages are marked as duplicate content, just as long as your template doesn’t change dramatically from page to page.
In short, make sure your site is original and has content in only one place. Don’t fret too much about duplicate content on your own site, as Google and other search engines have stated they can (at least most likely) tell that the pages are archives for user purposes and will not penalize them.
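If you still want a rough check for pages on your own site that share too much text, one common approach is to compare overlapping word sequences ("shingles") between pages. The sketch below is not how any search engine is known to do it; the five-word shingle size, the 0.8 threshold and the function names are all assumptions chosen for illustration, and `pages` is expected to hold the page text with the template already stripped out.

```python
import re


def shingles(text: str, size: int = 5) -> set:
    """Return the set of overlapping word sequences ("shingles") in text."""
    words = re.findall(r"[a-z0-9']+", text.lower())
    if len(words) < size:
        # Short text: treat the whole thing as one shingle (or none if empty).
        return {tuple(words)} if words else set()
    return {tuple(words[i:i + size]) for i in range(len(words) - size + 1)}


def similarity(text_a: str, text_b: str, size: int = 5) -> float:
    """Jaccard similarity of the shingle sets: 0.0 (nothing shared) to 1.0."""
    a, b = shingles(text_a, size), shingles(text_b, size)
    if not a and not b:
        # Two empty pages have nothing to compare; report 0.0 rather than divide by zero.
        return 0.0
    return len(a & b) / len(a | b)


def likely_duplicates(pages: dict, threshold: float = 0.8):
    """Yield (url_a, url_b, score) for page pairs at or above the threshold."""
    urls = sorted(pages)
    for i, url_a in enumerate(urls):
        for url_b in urls[i + 1:]:
            score = similarity(pages[url_a], pages[url_b])
            if score >= threshold:
                yield url_a, url_b, score
```

Pairs it flags are worth a manual look: they may be a print version, an archive page, or a stub that should be blocked or redirected as described in the steps above.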
Written: May 22, 2008
Tags: search engine marketing, search engine optimization, sem, seo

Dennis Edell

May 23, 2008 @ 8:49 am
Another good reason to use the all-in-one SEO plugin. You can *noindex* your Categories, Archives & Tag archives with a simple tick of a check box (3 boxes), very simple.