A Guide to using Canonical URLs
What is a canonical URL?
A canonical URL is the preferred URL of one or more pages and is commonly used to handle issues with duplicate content on a site. The canonical URL is often implemented using a rel=canonical tag in the HTML of a webpage to tell search engines which URL is the canonical for that page. You can also specify the canonical URL in your HTTP header or your sitemap.
It allows you to tell search engines that a set of pages contains the same content and that you would prefer only one of these pages to be indexed. If your website contains multiple versions of the same content, for example a page that is also available in a print version on a different URL, you can assign a canonical URL to let search engines know which page you would prefer to be indexed. Without the canonical URL, search engines don’t know which of the pages you want them to index.
What are the SEO benefits?
When you have multiple pages that serve the same or similar content on a website and don’t have canonical URLs set up to correctly handle these, this can lead to keyword cannibalisation. Keyword cannibalisation occurs when a website has multiple pages that target the same keyword, and this can often do more harm than good for your SEO.
Having multiple pages that target the same keyword can lead to several issues when it comes to your SEO. Your pages end up competing against one another in the SERPs for the same search terms, and Google may index and rank the less relevant page, which can in turn negatively impact your conversion rate and provide a less valuable experience for your users. By assigning a canonical URL to these pages, you are telling Google and other search engines which page you would prefer to be included in their index.
We all know how important inbound links are for SEO, but having multiple pages that contain the same content can lead to these links being split between pages. This means that you end up sharing link value between numerous pages, rather than focusing all the links on the most relevant page. Setting a canonical URL can allow search engines to count all the links pointing to various versions as links to the canonical version.
Duplicate content is a common issue on many websites, and sometimes it is necessary because the same content has to serve more than one purpose. Search engines understand that duplicate content is usually not manipulative or malicious, and they will not penalise internal duplicate content. However, it is not the optimal way to set up pages on your website, and your rankings can still suffer as a result.
Duplicate content causes similar problems to keyword cannibalisation. The duplicate pages end up competing against each other, and because search engines are unsure which page to rank, they might not always rank the most relevant or valuable one. Setting a canonical URL allows search engines to identify which page you would prefer to be indexed.
Please note that there are many reasons why a website might have duplicate content issues, and setting a canonical URL is only one way to handle them. Often a 301 redirect is the better solution. For example, if your website is suffering from duplicate content because both your www and non-www URLs or HTTP and HTTPS URLs are returning the same content, then you should implement a 301 redirect rather than a canonical URL. We go into this in further detail later on.
How to pick a canonical URL?
If you have multiple versions of the same or similar content, you will need to pick one to be your canonical URL. Often it might be obvious which is the best page to choose, but in other cases it may not be as clear. We would recommend reviewing various metrics to identify the best performing and most relevant page, including traffic, page views, conversions, conversion rates, and the keywords and search queries which result in organic traffic to these pages.
How to set up canonical tags?
If you have multiple versions of the same content, you can add the rel=canonical tag to the <head> section of your webpage. For example, these could be their URLs:
- https://example.com/page-one
- https://example.com/page-two
If you wanted /page-one to be the canonical URL and the page that is indexed in search engines, then you would add this snippet of code to the <head> section on /page-two:
<link rel="canonical" href="https://example.com/page-one" />
And that’s it!
We also recommend adding a self-referring canonical tag on the canonical URL.
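If your pages are generated from templates, the tag can be produced in code. The following is a minimal Python sketch, not a definitive implementation: the helper name canonical_link_tag and the head_tags mapping are hypothetical, and it assumes you already know the preferred URL for each page.

```python
from html import escape


def canonical_link_tag(canonical_url: str) -> str:
    """Return a rel=canonical <link> element for an absolute URL."""
    # escape(..., quote=True) keeps the URL safe inside a double-quoted attribute.
    return f'<link rel="canonical" href="{escape(canonical_url, quote=True)}" />'


# Assumption: /page-one is the preferred version, as in the example above.
CANONICAL = "https://example.com/page-one"

head_tags = {
    "/page-one": canonical_link_tag(CANONICAL),  # self-referring canonical
    "/page-two": canonical_link_tag(CANONICAL),  # duplicate points at /page-one
}
```

Both pages end up with the same tag, which is what makes the canonical on /page-one self-referring.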
Please note the rel=canonical <link> tag can only be used for HTML pages, and not for files such as PDFs. In cases such as these, you would use the rel=canonical HTTP header.
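As a rough illustration of the HTTP header approach, the sketch below builds a Link response header that carries rel=canonical. The function name and the PDF and page paths are hypothetical, and how you attach the header to a response depends on your server or framework.

```python
def canonical_link_header(canonical_url: str) -> tuple[str, str]:
    """Return a (name, value) pair for an HTTP Link header with rel=canonical."""
    # The target URL goes inside angle brackets, followed by the rel parameter.
    return ("Link", f'<{canonical_url}>; rel="canonical"')


# Hypothetical example: a PDF whose preferred version is an HTML page.
name, value = canonical_link_header("https://example.com/guide")
header_line = f"{name}: {value}"
```

Here header_line is the line that would be sent in the response for the PDF, for example alongside its Content-Type header.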
When setting up canonical tags, check for these common mistakes:
- Broken canonical link – Verify that your canonical URL target exists and does not return a 404 error or redirect to another URL.
- Noindex robots meta tag – Check that your canonical URL target doesn’t contain a noindex robots meta tag.
- Multiple canonical URLs – Ensure that your page does not contain more than one canonical URL. When more than one canonical is specified, search engines are likely to ignore all of them.
- Conflicting hreflang and canonical URL – When implementing hreflang tags on your website, ensure that each page’s canonical URL matches the hreflang URL that refers to that same page, rather than pointing to a different language version.
- Placing rel=canonical in the <body> – The rel=canonical tag belongs in the <head> section of a webpage. Check that this isn’t appearing in the <body> section as this can cause the canonical URL to be ignored.
- Using a canonical URL instead of a redirect – While canonical URLs can be used to handle duplicate content, they might not always be the best solution. If your duplicate content does not exist for technical or user experience purposes, you should consider implementing a 301 redirect rather than a canonical.
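Some of these checks can be automated against a page's HTML. The sketch below uses only Python's standard html.parser module and covers three of the points above: multiple canonicals, a canonical placed in the <body>, and a noindex robots meta tag. The class and function names are hypothetical, it assumes the page has explicit <head> tags, and it does not cover broken targets or hreflang conflicts, which require fetching other URLs.

```python
from html.parser import HTMLParser


class CanonicalAudit(HTMLParser):
    """Collect canonical links and robots directives from one HTML document."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.head_canonicals = []
        self.body_canonicals = []
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "head":
            self.in_head = True
        elif tag == "link" and "canonical" in (attrs.get("rel") or "").lower().split():
            # Record where the canonical was found: inside or outside <head>.
            bucket = self.head_canonicals if self.in_head else self.body_canonicals
            bucket.append(attrs.get("href"))
        elif tag == "meta" and (attrs.get("name") or "").lower() == "robots":
            if "noindex" in (attrs.get("content") or "").lower():
                self.noindex = True

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def audit(html: str) -> list[str]:
    """Return a list of issue labels found in the given HTML."""
    parser = CanonicalAudit()
    parser.feed(html)
    parser.close()
    issues = []
    if len(parser.head_canonicals) + len(parser.body_canonicals) > 1:
        issues.append("multiple canonical URLs")
    if parser.body_canonicals:
        issues.append("rel=canonical in <body>")
    if parser.noindex:
        issues.append("noindex robots meta tag")
    return issues
```

The noindex check is only meaningful when you run audit on the HTML of the canonical target itself, since a noindex tag on a duplicate page is a separate matter.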
301 redirect or Canonical URL
There are many reasons why a website might have duplicate pages with different URLs and these will require either a 301 redirect or a canonical URL. Here are some examples:
- The HTTP and HTTPS versions of a page – this should be a 301 redirect from HTTP to HTTPS.
- The www and non-www version of a page – this should be a 301 redirect from www to non-www or non-www to www depending on how you have it set up.
- URLs with and without a trailing slash – this should be a 301 redirect from whichever form you do not use to the form you do use, depending on how you have it set up.
- URLs with parameters – URL parameters can allow users to filter the content of a page, for example sorting by price on an ecommerce website, and this type of duplicate content should be handled with a canonical URL.
- Alternate versions – Your website may offer an alternate version of a page on a different URL, for example a page that is designed for print. In this scenario you should assign a canonical URL.
- Duplicate content – If your website has multiple URLs that contain duplicate or extremely similar content, and you have a reason to keep both versions, then you should assign a canonical URL.
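The split between the two approaches can be sketched in Python with urllib.parse. This is an illustration under stated assumptions rather than a recommended configuration: it assumes the site has settled on HTTPS, non-www and no trailing slash, and that every query parameter only filters or sorts the page. The function names and the /shoes path are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url: str) -> str:
    """Return the URL a 301 redirect should send this request to."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    if host.startswith("www."):
        host = host[4:]  # assumed convention: non-www
    path = parts.path.rstrip("/") or "/"  # assumed convention: no trailing slash
    # Assumed convention: always HTTPS. Query and fragment are left untouched.
    return urlunsplit(("https", host, path, parts.query, parts.fragment))


def canonical_for(url: str) -> str:
    """Return the canonical URL for a parameterised page (query removed)."""
    parts = urlsplit(redirect_target(url))
    return urlunsplit((parts.scheme, parts.netloc, parts.path, "", ""))


# Protocol, host and slash variants are handled by a redirect...
redirected = redirect_target("http://www.example.com/shoes/")
# ...while a sorted view stays reachable and only declares a canonical.
canonical = canonical_for("https://example.com/shoes?sort=price")
```

In this sketch both calls resolve to https://example.com/shoes, but the first URL would be redirected there while the second would keep serving its own content with a rel=canonical pointing to it. If some parameters change the content meaningfully, they would need to be kept in the canonical URL instead of stripped.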
The canonical URL can improve your SEO performance on pages that have identical or highly similar content. You should set a canonical URL when you have an issue with duplicate content, but each version serves a purpose and it doesn’t make sense to remove the content or 301 redirect it. You should implement a 301 redirect on duplicate URLs that offer no additional purpose, such as duplicate content on both www and non-www URLs, HTTP and HTTPS URLs, and URLs with and without a trailing slash.