Using Google Tag Manager to Dynamically Generate Schema.org/JSON-LD Tags

[Estimated read time: 7 minutes]

One of the biggest takeaways from SearchFest in Portland earlier this year was the rapidly rising importance of semantic search and structured data, in particular Schema.org. And while implementing Schema used to require a lot of changes to your site’s markup, the JSON-LD format offers a great alternative to microdata: you can add structured data to a page with minimal code.

[Image: Mike Arnesen, SearchFest 2016]

Check out Mike Arnesen’s deck from his SearchFest talk, “Understanding & Facilitating Semantic Search,” for a great overview on using structured data.

What was even more exciting was the idea that you could use Google Tag Manager to insert JSON-LD into a page, allowing you to add Schema markup to your site without having to touch the site’s code directly (in other words, no back and forth with the IT department).

Trouble is, while it seemed like Tag Manager would let you insert a JSON-LD snippet on the page no problem, it didn’t appear to be possible to use other Tag Manager features to dynamically generate that snippet. Tag Manager lets you create variables by extracting content from the page using either CSS selectors or some basic JavaScript. These variables can then be used dynamically in your tags (check out Mike’s post on semantic analysis for a good example).

So if we wanted to grab that page URL and pass it dynamically to the JSON-LD snippet, we might have tried something like this:

[Image: using Tag Manager to insert JSON-LD with dynamic variables]

But that doesn’t work. Bummer.

That means that if you wanted to use GTM to add the BlogPosting Schema type to each of your blog posts, you would have to create a different tag and trigger (based on the URL) for each post. Not exactly scalable.

But, with a bit of experimentation, I’ve figured out a little bit of JavaScript magic that makes it possible to extract data from the existing content on the page and dynamically create a valid JSON-LD snippet.

Dynamically generating JSON-LD

The reason our first example doesn’t work is that Tag Manager replaces each variable with a little piece of JavaScript that calls a function, which returns the value of whatever variable is called.

We can see this error in the Google Structured Data Testing Tool:

[Image: JSON-LD Google Tag Manager variable error]

The error is the result of Tag Manager inserting JavaScript into what should be a JSON tag — this is invalid, and so the tag fails.
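To make the failure concrete, here is a rough sketch of what the snippet might look like once Tag Manager has swapped in its function call. The exact replacement text is an assumption for illustration (the container ID and the macro call are placeholders, not something to rely on); the point is only that a bare function call is not valid JSON, while a quoted string is.

```javascript
// Hypothetical result of variable substitution inside a JSON-LD tag.
// The container ID and macro(1) call are illustrative placeholders.
var substituted =
  '{ "@context": "http://schema.org", "@type": "Organization", ' +
  '"url": google_tag_manager["GTM-XXXX"].macro(1) }';

// The same snippet with a literal string value (placeholder URL).
var literal =
  '{ "@context": "http://schema.org", "@type": "Organization", ' +
  '"url": "https://www.example.com/" }';

// Returns true when the text parses as JSON, false otherwise.
function isValidJson(text) {
  try {
    JSON.parse(text);
    return true;
  } catch (e) {
    return false;
  }
}

console.log(isValidJson(substituted)); // false
console.log(isValidJson(literal));     // true
```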

However, we can use Tag Manager to insert a JavaScript tag, and have that JavaScript tag insert our JSON-LD tag.

[Image: Google Tag Manager JSON-LD insertion script]

If you’re not super familiar with JavaScript, this might look pretty complicated, but it actually works the exact same way as many other tags you’re probably already using (like Google Analytics, or Tag Manager itself).

Here, our Schema data is contained within the JavaScript “data” object, which we can dynamically populate with variables from Tag Manager. The snippet then creates a script tag on the page with the right type (application/ld+json), and populates the tag with our data, which we convert to JSON using the JSON.stringify function.
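As a minimal sketch of that pattern (not necessarily identical to the script in the screenshot), the steps could look like the following. The function name insertJsonLd and the example.com URL are assumptions made for illustration. In a Custom HTML tag, this would sit inside a script element, and the Tag Manager variable {{Page URL}} would take the place of pageUrl.

```javascript
// Creates a <script type="application/ld+json"> element from a plain
// object and appends it to <head>. `doc` is passed in so the function
// can also be tried outside a browser.
function insertJsonLd(doc, data) {
  var script = doc.createElement('script');
  script.type = 'application/ld+json';
  script.text = JSON.stringify(data);
  doc.head.appendChild(script);
  return script;
}

// Stand-in for the Tag Manager {{Page URL}} variable (placeholder fallback).
var pageUrl = (typeof window !== 'undefined' && window.location)
  ? window.location.href
  : 'https://www.example.com/';

// The Schema data lives in an ordinary JavaScript object.
var data = {
  '@context': 'http://schema.org',
  '@type': 'Organization',
  'url': pageUrl
};

// Only touch the page when a real document is available.
if (typeof document !== 'undefined') {
  insertJsonLd(document, data);
}
```

Because the data is a regular JavaScript object until JSON.stringify runs, a function call in place of a value is perfectly legal here, which is what makes the dynamic substitution work.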

The purpose of this example is simply to demonstrate how the script works (dynamically swapping out the URL for the Organization Schema type wouldn’t actually make much sense). So let’s see how it could be used in the real world.

Dynamically generating Schema.org tags for blog posts

Start with a valid Schema template

First, build out a complete JSON-LD Schema snippet for a single post based on the schema.org/BlogPosting specification.

[Image: example article Schema template]
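For readers who can’t see the screenshot, a template along these lines is one plausible starting point. It is written as a JavaScript object so it can be dropped into the insertion script later. Every value below (URLs, names, dates, image dimensions) is a made-up placeholder, and the exact set of properties is an assumption based on the BlogPosting type rather than a copy of our template.

```javascript
// Illustrative BlogPosting template; every value is a placeholder.
var blogPostingTemplate = {
  '@context': 'http://schema.org',
  '@type': 'BlogPosting',
  'mainEntityOfPage': {
    '@type': 'WebPage',
    '@id': 'https://www.example.com/blog/example-post/'
  },
  'headline': 'Example Post Title',
  'datePublished': '2016-01-01T08:00:00+00:00',
  'dateModified': '2016-01-02T09:30:00+00:00',
  'author': { '@type': 'Person', 'name': 'Example Author' },
  'image': {
    '@type': 'ImageObject',
    'url': 'https://www.example.com/images/example.jpg',
    'height': 400,
    'width': 800
  },
  'publisher': {
    '@type': 'Organization',
    'name': 'Example Company',
    'logo': {
      '@type': 'ImageObject',
      'url': 'https://www.example.com/images/logo.png'
    }
  },
  'description': 'A one-sentence summary of the post.'
};

// Print the JSON-LD that would be pasted into a validator.
console.log(JSON.stringify(blogPostingTemplate, null, 2));
```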

Identify the necessary dynamic variables

There are a number of variables that will be the same between articles; for example, the publisher information. Likewise, the main image for each article has a specific size generated by WordPress that will always be the same between posts, so we can keep the height and width variables constant.

In our case, we’ve identified 7 variables that change between posts that we’ll want to populate dynamically:
[Image: Schema properties identified for dynamic substitution by Tag Manager]

Create the variables within Google Tag Manager

  • Main Entity ID: The page URL.
  • Headline: We’ll keep this simple and use the page title.
  • Date Published and Modified: Our blog is on WordPress, so we already have meta tags for “article:published_time” and “article:modified_time”. The modified_time is only included when the post has been modified after publishing, but the Schema specification recommends including it, so we should set dateModified to the published date if there isn’t already a modified date. In some circumstances, we may need to re-format the date; fortunately, in this case, it’s already in the ISO 8601 format, so we’re good.
  • Author Name: In some cases we’re going to need to extract content from the page. Our blog lists the author and published date in the byline. We’ll need to extract the name, but leave out the time stamp, trailing pipe, and spaces. [Images: Tag Manager extract author name from page; Tag Manager extract author name from page markup]
  • Article Image: Our blog has Yoast installed, which has specified image tags for Twitter and Open Graph. Note: I’m using the meta twitter:image instead of the og:image tag value due to a small bug that existed with the open graph image on our blog when I wrote this.
  • Article Description: We’ll use the meta description.
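We didn’t need to re-format our dates, but if yours aren’t already in ISO 8601, a small helper along these lines could normalize them before they reach the JSON-LD. This is only a sketch: the name toIso8601 is made up for illustration, and it relies on the built-in Date parser, which is only dependable for timestamps and ISO-style strings.

```javascript
// Returns an ISO 8601 string for a Date, a millisecond timestamp, or a
// parseable date string, or null when the value is not a usable date.
function toIso8601(value) {
  var date = value instanceof Date ? value : new Date(value);
  return isNaN(date.getTime()) ? null : date.toISOString();
}

console.log(toIso8601('2016-04-05T10:00:00+00:00')); // 2016-04-05T10:00:00.000Z
console.log(toIso8601(0));                           // 1970-01-01T00:00:00.000Z
console.log(toIso8601('not a date'));                // null
```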

Here is our insertion script, again, that we’ll use in our tag, this time with the properties swapped out for the variables we’ll need to create:

[Image: Google Tag Manager JSON-LD insertion script with dynamic variables]

I’m leaving out dateModified for now; we’ll cover that in a minute.

Extracting meta values

Fortunately, Tag Manager makes extracting values from DOM elements really easy, especially because, as is the case with meta properties, the exact value we need will be in one of the element’s attributes. To extract the page title, we can get the value of the title tag. We don’t need to specify an attribute name for this one:

[Image: configuring a Google Tag Manager tag to extract the title value]

For meta properties, we can extract the value from the content attribute:

[Image: configuring a Google Tag Manager tag to extract a meta content value]
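If it helps to see what those settings are doing, the lookup is roughly equivalent to the JavaScript below. This is an approximation for illustration, not Tag Manager’s actual implementation, and getMetaContent is a hypothetical helper name. Note that it matches on the property attribute; tags such as twitter:image that use a name attribute would need a different selector.

```javascript
// Returns the content attribute of the first <meta> tag whose `property`
// attribute matches, or undefined when no such tag exists.
// `doc` is passed in so the helper can also be tried outside a browser.
function getMetaContent(doc, property) {
  var el = doc.querySelector('meta[property="' + property + '"]');
  return el ? el.getAttribute('content') : undefined;
}

// In a browser console on a blog post:
if (typeof document !== 'undefined') {
  console.log(document.title);
  console.log(getMetaContent(document, 'article:published_time'));
}
```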

Tag Manager also has some useful built-in variables that we can leverage — in this case, the Page URL:

[Image: Tag Manager Page URL built-in variable]

Processing page elements

For extracting the author name, the markup of our site makes it so that just a straight selector won’t work, meaning we’ll need to use some custom JavaScript to grab just the text we want (the text of the span element, not the time element), and strip off the last 3 characters (“ | ”) to get just the author’s name.

In case there’s a problem with this selector, I’ve also put in a fallback (just our company name), to make sure that if our selector fails a value is returned.

[Image: custom JavaScript Google Tag Manager variable to extract and process copy]
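Here is a sketch of that logic, assuming the byline is a span whose first child is the text “Author Name | ” followed by a nested time element. The selector, the fallback name, and the function name are all placeholders to adapt to your own markup. In a Tag Manager Custom JavaScript variable, the body would go inside an anonymous function that returns the value.

```javascript
// Hypothetical selector and fallback; adjust both to the real markup.
var BYLINE_SELECTOR = '.byline span';
var FALLBACK_AUTHOR = 'Example Company';

// Reads the first text node of the byline span (skipping the nested
// <time> element) and strips the trailing " | " separator.
function extractAuthor(doc) {
  try {
    var span = doc.querySelector(BYLINE_SELECTOR);
    var text = span.childNodes[0].nodeValue; // e.g. "Jane Doe | "
    var name = text.slice(0, -3);
    return name || FALLBACK_AUTHOR;
  } catch (e) {
    // The selector matched nothing, or the markup is not what we expected.
    return FALLBACK_AUTHOR;
  }
}

if (typeof document !== 'undefined') {
  console.log(extractAuthor(document));
}
```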

Testing

Tag Manager has a great feature that allows you to stage and test tags before you deploy them.

[Image: Google Tag Manager debug mode]

Once we have our variables in place, we can enter the Preview mode and head to one of our blog posts:

[Image: testing Tag Manager Schema variables]

Here we can check the values of all of our variables to make sure that the correct values are coming through.

Finally, we set up our tag, and configure it to fire where we want. In this case, we’re just going to fire these tags on blog posts:

[Image: Tag Manager trigger configuration]

And here’s the final version of our tag.

For our dateModified parameter, we added a few lines of code that check whether our modified variable is set, and if it’s not, sets the “dateModified” JSON-LD variable to the published date. You can find the raw code here.

[Image: dynamic Schema JSON-LD tag]
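The dateModified check can be sketched roughly as follows. The helper name withDates and the sample date are made up for illustration, and in the tag itself the two date arguments would come from the Tag Manager variables for the published and modified meta tags. The sketch treats any falsy value (undefined, null, or an empty string) as “not set”.

```javascript
// Returns a copy of `data` with datePublished and dateModified set,
// falling back to the published date when no modified date is available.
function withDates(data, published, modified) {
  var result = {};
  for (var key in data) {
    if (Object.prototype.hasOwnProperty.call(data, key)) {
      result[key] = data[key];
    }
  }
  result.datePublished = published;
  result.dateModified = modified ? modified : published;
  return result;
}

// A post that has never been modified: both dates end up the same.
var post = withDates({ '@type': 'BlogPosting' }, '2016-04-05T10:00:00+00:00', undefined);
console.log(post.dateModified); // 2016-04-05T10:00:00+00:00
```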

Now we can save the tag, deploy the current version, and then use the Google Structured Data Testing Tool to validate our work:

[Image: Google Structured Data Testing Tool validates dynamically generated JSON-LD]

Success!!


This is just a first version of this code, which is serving to test the idea that we can use Google Tag Manager to dynamically insert JSON-LD/Schema.org tags. However, after just a few days we checked in with Google Search Console, and it confirmed the BlogPosting Schema was successfully found on all of our blog posts with no errors, so I think this is a viable method for implementing structured data.

[Image: valid structured data found in Google Search Console]

Structured data is becoming an increasingly important part of an SEO’s job, and with techniques like this we can dramatically improve our ability to implement structured data efficiently, and with minimal technical overhead.

I’m interested in hearing the community’s experience with using Tag Manager with JSON-LD, and I’d love to hear if people have success using this method!

Happy tagging!
