Tagging internal links with UTM and other tracking parameters can confuse Google
Google’s John Mueller said in yesterday’s webmaster hangout on YouTube, at the 16:56 mark in the video, that tagging your internal links with tracking parameters, such as UTM parameters, can confuse Google. This is actually a common practice, but it can confuse Google’s indexing if you use too many parameters. The problem is that Google has to crawl all the different URLs, and there can be many of them. It then needs to determine whether the rel canonical signal is more important than the signals you send Google through your internal links. Mueller’s full explanation is in the transcript below.
Glenn Gabe summed much of it up in two tweets:
More @johnmu about utm parameters: our systems are trying to figure out the different urls … so send clear signals to google. Rel canonical is a strong signal, but so are internal links. You could also cause more crawling using these settings: https://t.co/vCz9gjpFTQ pic.twitter.com/y3XydkGB09
– Glenn Gabe (@glenngabe) February 19, 2019
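To make the extra crawling concrete, here is a minimal Python sketch. It is not a description of how Google's systems work; it only groups a list of internal link targets by their URL with the `utm_*` parameters removed, to show how many distinct crawlable URLs end up pointing at the same page. The `example.com` URLs and the helper names `clean_url` and `group_variants` are made up for illustration.

```python
from collections import defaultdict
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def clean_url(url: str) -> str:
    """Return the URL with every utm_* query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if not key.lower().startswith("utm_")
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


def group_variants(urls):
    """Map each clean URL to the set of distinct URLs that reduce to it."""
    groups = defaultdict(set)
    for url in urls:
        groups[clean_url(url)].add(url)
    return dict(groups)


# Hypothetical internal link targets collected from a site.
internal_links = [
    "https://example.com/pricing",
    "https://example.com/pricing?utm_source=header",
    "https://example.com/pricing?utm_source=footer&utm_medium=nav",
    "https://example.com/blog?page=2&utm_campaign=sidebar",
]

variants = group_variants(internal_links)
for clean, seen in variants.items():
    print(f"{clean}: {len(seen)} distinct URL(s) to crawl")
```

In this made-up sample, one pricing page is reachable through three different URLs, and each of them is a separate URL a crawler has to fetch before it can decide which one to index.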
Here is the transcript:
Question: A question regarding parameter URLs with UTM links. So UTM links are basically tagged URLs that you can use for analytics tracking or just general tracking. And the question is whether these links will dilute the value of the link if they are heavily linked internally. We get pages indexed with parameters where the canonical points to the preferred version. How will it affect us in the long run if we link within the website with 80% parameterized URLs and 20% clean URLs?
Answer: I guess it’s always a bit of a tricky situation, because you’re basically giving us mixed signals. On the one hand, you are saying these are the links you want indexed, because that is how you link internally within your website. On the other hand, these pages, when we open them, have a rel canonical pointing to a different URL. So with one signal you are saying “index this one,” and from that page you are saying “actually, index a different one.”
So what our systems end up doing is they try to weigh the different types of URLs that we find for that content. We can probably recognize that these URLs lead to the same content. So we can sort of put them in the same group, and then it’s a matter of choosing which one to actually use for indexing. On the one hand, we have the internal links pointing to the UTM versions. On the other hand, we have the rel canonical pointing to some sort of cleaner version. The cleaner version is probably also a shorter URL and a nicer URL, and that kind of thing also plays a role for us. But it is still not guaranteed from our point of view that we would always use the shorter URL. So rel canonical is obviously a strong signal, and internal linking is also a fairly strong signal, in that it’s something that is under your control. So if you have explicitly linked to these URLs, we think maybe you want them indexed like this.
So in practice what would probably happen here is that we would index a mixture of URLs. For some of them we would index the shorter version, because maybe we also find other signals pointing to the shorter version. Some of them we would probably index with the UTM version, and we would try to rank them normally as the UTM version.
In practice, in search you wouldn’t see any ranking difference; you would just see that these URLs could be shown in the search results. So they would rank in exactly the same way with UTM or without UTM, and they would just be listed individually in the search results. From a practical standpoint, that just means that in Search Console you might see a mix of these URLs: in the performance report you might see a mix, in the index report you might see a mix, and in some of the other reports, maybe around AMP or structured data if you use something like that, you might also see this mix. You might also see, in some cases, a situation where it switches between URLs. So we might index it with UTM parameters at some point, and then a few weeks later switch to the cleaner version because we think the cleaner version is probably better. Then at some later point another algorithm might review it and say, well, actually more signals point to the UTM version, so we switch back. That could theoretically also happen.
So what I would recommend doing there is, if you have a preference when it comes to your URLs, make sure you are as clear as possible within your website about which version you want indexed. With UTM parameters you also create the situation where we have to crawl both of those versions. So it’s a bit more overhead. If it’s just one extra version, that’s probably not too bad. If you are using multiple UTM parameters across the website, we will try to crawl all of these different variations, which would mean we might crawl, I don’t know, twice as many URLs as your website actually has, just to keep up with indexing. So that is probably something you would like to avoid.
So my recommendation would be to try to clean that up as much as possible, so that we can stick to the clean URLs, the URLs you want indexed, instead of ending up in this state where maybe we pick them up like this, maybe we pick them up like that, and in your reports it could be like this or it could be like that, and you have to watch out for that all the time. So keep it as simple as possible.
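One way to start that clean-up is to find which internal links on a page carry UTM parameters in the first place. The following is a minimal sketch using only the Python standard library; the `TaggedInternalLinkFinder` class name, the sample HTML, and the `example.com` and `partner.example.org` hosts are all hypothetical, and a real audit would run over crawled pages rather than an inline string.

```python
from html.parser import HTMLParser
from urllib.parse import parse_qsl, urljoin, urlsplit


class TaggedInternalLinkFinder(HTMLParser):
    """Collect internal <a href> targets that carry utm_* parameters."""

    def __init__(self, page_url: str):
        super().__init__()
        self.page_url = page_url
        self.host = urlsplit(page_url).netloc
        self.tagged = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if not href:
            return
        # Resolve relative links against the page URL.
        target = urljoin(self.page_url, href)
        parts = urlsplit(target)
        if parts.netloc != self.host:
            return  # external link: out of scope for this check
        query_keys = [key for key, _ in parse_qsl(parts.query)]
        if any(key.lower().startswith("utm_") for key in query_keys):
            self.tagged.append(target)


# Hypothetical page fragment with one tagged internal link.
html = """
<a href="/pricing?utm_source=header">Pricing</a>
<a href="/about">About</a>
<a href="https://partner.example.org/?utm_source=example">Partner</a>
"""

finder = TaggedInternalLinkFinder("https://example.com/")
finder.feed(html)
print(finder.tagged)
```

Only the internal pricing link is reported here; the untagged link and the link to another host are ignored. Each reported link is a candidate for pointing directly at the clean URL you want indexed.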
Here is the video embed:
Forum discussion at Twitter.
Update from Google’s Daniel Waisberg:
@briquerouille @JohnMu I just wanted to add that it is *very* bad practice to use UTMs on internal links from a @Google Analytics point of view, this will dramatically decrease the quality of your data.
– Daniel Waisberg (@danielwaisberg) February 20, 2019