We’re often asked about duplicate content. What is it? How do I remedy this blasted predicament? It can be overwhelming to discover how many pages on your site Google considers to be duplicate versions of each other, but hopefully this post will assure you that the solution is well within your grasp.
What is it?
Duplicate content refers to identical (or nearly identical) pages of content. Many times it is not intentional. Homepages, for example, are often victims of duplicate content. Evolve Digital Labs’ homepage can be accessed by typing:
Of course, Google does not see these different pages as duplicate content because we have told it which version we want it to crawl and index. More on that later. You can also do some fancy things with URLs, such as letting them skip to specific sections without leaving the page or print a plain version of the content. This is possible by adding dynamic parameters to the URL, but doing so also duplicates the page’s content.
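To make the idea concrete, here is a minimal Python sketch of how URL variants can collapse into one page. The domain example.com, the function names, and the rule that parameters never change the content are all assumptions for illustration; this is not how Google itself decides what counts as a duplicate.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def normalize(url: str) -> str:
    """Reduce a URL to a comparison key: host without 'www.' plus path."""
    # Bare domains such as "example.com" need a "//" prefix to parse as a host.
    if "://" not in url and not url.startswith("//"):
        url = "//" + url
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]
    path = parts.path.rstrip("/") or "/"
    # Assumption: the query string and fragment do not change the content,
    # so both are dropped from the key.
    return host + path


def group_duplicates(urls):
    """Return only the keys that are reachable through more than one URL."""
    groups = defaultdict(list)
    for url in urls:
        groups[normalize(url)].append(url)
    return {key: members for key, members in groups.items() if len(members) > 1}
```

Feeding this a list of crawled URLs gives you a quick shortlist of addresses that probably serve the same page. On a real site, some parameters do change the content, so treat the output as candidates to review rather than a verdict.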
Why is it bad?
Duplicate content can hurt your rankings because Google is forced to choose one version to index, and it may not be the one you wanted to show up in the SERPs. A printable version of a page, for example, might be more easily crawlable than its nearly identical original. In the worst-case scenario, Google may neglect to index any of the pages. Furthermore, duplicate content can sabotage linking efforts. If multiple websites are linking to various versions of the same page, that link juice, or the authority received from inbound links, will be diluted across the board unless one page has been appointed as the sole version to rank.
How do I know if duplicate content is killing me softly?
Spend time in Google Webmaster Tools. Get real comfortable, because this interface alerts you to these kinds of issues so you can remedy them as soon as you feel the digital tap on the shoulder. It also allows you to see which pages other websites are linking to. This is a phenomenal insight for a myriad of reasons, but in this case, you can see if outside sources are linking to different versions of the same page. For example, if you see some sites linking to homepage.com and some linking to www.homepage.com, this can be a huge red flag. Note: if you have already designated one URL to be the canonical, or primary, version, this will not be an issue because the linking authority will be successfully transferred to the correct URL. Finally, we recommend performing a simple site search: Google “site:yourdomain.com,” leaving off the quotes and entering your actual domain. Then take it a step further and add the “inurl” command to search for URLs within your site that contain particular keywords. This is a very helpful way to quickly sift through the site to find duplicate files.
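If you have several keywords to check, it can help to generate the searches up front. The Python sketch below only builds the query strings to paste into Google; the function name, the domain, and the keyword are placeholders for illustration.

```python
def site_query(domain, inurl=None):
    """Build a Google query limited to one domain, optionally filtered by a URL keyword."""
    query = f"site:{domain}"
    if inurl:
        query += f" inurl:{inurl}"
    return query


# Example: look for printable copies of pages on a placeholder domain.
print(site_query("yourdomain.com"))
print(site_query("yourdomain.com", inurl="print"))
```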
How can I fix it?
First, you can add a 301 redirect from the duplicate pages that aren’t receiving as much traffic to the version you want to keep. This will guide users, search engines, and any link juice to the new permanent URL.
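How you set up a 301 redirect depends on your server, so take the following as a rough sketch rather than a recipe. It is a tiny Python WSGI app that sends any request arriving on a non-preferred hostname to the preferred one. The hostname www.example.com, the https scheme, and the function names are assumptions for illustration.

```python
CANONICAL_HOST = "www.example.com"  # hypothetical preferred hostname


def redirect_target(host, path, query=""):
    """Return the URL to redirect to, or None if the host is already canonical."""
    if host.lower() == CANONICAL_HOST:
        return None
    url = f"https://{CANONICAL_HOST}{path}"  # assumption: the site is served over https
    if query:
        url += "?" + query
    return url


def app(environ, start_response):
    """WSGI app: answer 301 for non-canonical hosts, otherwise serve the page."""
    target = redirect_target(
        environ.get("HTTP_HOST", ""),
        environ.get("PATH_INFO", "/"),
        environ.get("QUERY_STRING", ""),
    )
    if target:
        # 301 tells browsers and search engines that the move is permanent.
        start_response("301 Moved Permanently", [("Location", target)])
        return [b""]
    start_response("200 OK", [("Content-Type", "text/plain; charset=utf-8")])
    return [b"canonical page"]
```

The path and query string are carried over, so a visitor who lands on a duplicate URL ends up on the matching page of the preferred version rather than on the homepage.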
Second, adding the rel=canonical attribute to a link element in a page’s head allows webmasters to communicate to Google that there is another, preferred version of the page that should be crawled and indexed. The search bots then know that the URL in the rel=canonical link is the primary page, the one that should reap the link juice and be included in Google’s index of pages.
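For a sense of what that tag looks like, here is a small Python helper that builds it for a given URL. The function name and the example URL are placeholders; the output is the line you would place in the head section of each duplicate page.

```python
from html import escape


def canonical_link(url):
    """Build the canonical link tag pointing at the preferred URL."""
    # escape() keeps characters such as & and " from breaking the attribute.
    return f'<link rel="canonical" href="{escape(url, quote=True)}">'


print(canonical_link("https://www.example.com/page"))
# prints: <link rel="canonical" href="https://www.example.com/page">
```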
Duplicate content is a big deal. But detecting and fixing it is easy, so there’s no reason to ignore this issue. To find out more about this issue and others like it, scope out our new Resources Page!