Building search-friendly JavaScript applications with React.js

Update

Since building the History of Humanity piece, I went on to build Fly Me to the Moon in 2016 and, based on that, wrote about React dynamic components and SEO-friendly React-based content, both of which extend this further. I’d strongly recommend reading those too, as they offer a more complete and up-to-date take on this concept.

Preface: This post is aimed at technically inclined marketers. React and its cousins are an important concept to be aware of, though, so if you’re new to the subject, don’t be put off. We’ll unpack terms you may not know as we go: just hover over anything underlined to get an explanation of what it is, or feel very welcome to ask questions in the comments.

A while ago I published a post on how we built History of Humanity, an example of a client-side React.js application built from dynamic React components.

The History of Humanity project

Whilst that’s a lovely proof of concept, it suffers from two problems:

  1. It only works if the client has JavaScript enabled, otherwise nothing happens
  2. Because of that, it’s completely uncrawlable by search engines

It also has a few other issues, from a more technical perspective. Namely that scripts should be:

  • Concatenated into a single file, to save on HTTP requests
  • Minified, to reduce the download size

Pre-rendering React.js apps for SEO

Fortunately, we’re not leaving things as they currently are, and this post looks at how we can convert the application to run on the server. Thanks to Node.js and io.js, we’ve got servers that are designed to run JavaScript. Since we’ve also got a nice little JavaScript application, all we have to do is fiddle with it a little to allow it to run on one of these platforms. If that works, we can serve a pre-rendered version of our app for search engines to crawl and index.

First steps

If we were building a large scale application, your developers would probably be using tools like Grunt or Gulp. However, as this is such a simple piece, the benefit of having them is negligible.

Instead, we’re just going to be using a tool called Browserify and NPM package scripts. Browserify also has the advantage of being able to handle minification for us, which is the only point in this case where we’d want a task runner like Grunt. Since it can take care of that, it’s more than enough for this project.

I also have a secondary advantage, in that I don’t write React as JSX, so there is nothing that has to be turned into JS first, which would otherwise be an argument for using Grunt.

Just be aware that as the project scope and scale increases, so your developers will start talking about other tools too. In essence, the end result stays the same as what we’re doing here, it just helps them manage the process of getting there a little more tightly.

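The only real challenges we’re going to face from a technical perspective are:

  1. Data flow: getting data into the application before we render it
  2. URL mapping: giving each state of the application its own URL

The second of these is also an SEO concern, so let’s look at how we’d address these challenges.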

Challenge one: Data flow

With the initial fetch of the list of items, and also with the application loading a specific item (which we’ll handle in a bit), we need to be able to pre-fetch data and add it to the application before we deliver it. This is where we face the first of our challenges. The data for all the items can be loaded easily, as it’s already in a file; we just have to host that file on the server somewhere accessible. Similarly, loading data for a single item simply requires that we go and fetch that data and then pass it in as a prop.
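The idea can be sketched in isolation, without React: collect everything first, then hand it to a pure render function as props. In this sketch the `renderItem` function is a hypothetical stand-in for a component, `buildProps` is an illustrative helper, and the sample `timeline` object is invented. Only the prop names `timeline`, `initparams` and `initwikidata` are taken from the routes file shown later in this post.

```javascript
// Hypothetical sample data, shaped like timeline.json: year -> list of items.
var timeline = {
  '1969': [{ text: 'Apollo 11 lands on the Moon' }]
};

// Collect everything the view needs *before* rendering.
function buildProps(timeline, params, wikiData) {
  return {
    timeline: timeline,
    initparams: params || false,
    initwikidata: params ? {
      itemDetail: timeline[params.year][params.position],
      wikiData: wikiData || false,
      wikiImages: []
    } : false
  };
}

// Stand-in for a React component: a pure function of its props.
// (No HTML escaping here; a real component would handle that.)
function renderItem(props) {
  if (!props.initwikidata) return '<p>Pick an item from the timeline.</p>';
  return '<h2>' + props.initwikidata.itemDetail.text + '</h2>';
}

var props = buildProps(
  timeline,
  { year: '1969', position: 0, name: 'Apollo_11' },
  { extract: 'Intro text' }
);
console.log(renderItem(props));
```

Because the render step never fetches anything itself, the same function can run on the server or in the browser with whatever data it is given.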

Challenge two: URL mapping

There are a lot of routing options for React.js. However, most are overkill for something this small. Instead, we’re going to be using History.js. It’s fairly small and pretty robust, given the number of browsers it supports. With a larger project that required more configuration, though, it’s likely we’d look at a different solution (probably React Router).
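Whatever library handles the history side, the mapping itself boils down to two small functions: one that turns an item into a path, and one that turns a path back into parameters. The sketch below assumes the /:year/:position/:name shape used by the routes file later in this post. The helper names `itemToPath` and `pathToParams` are hypothetical and are not part of the project.

```javascript
// Build the path for an item, e.g. /1969/0/Apollo_11
function itemToPath(year, position, name) {
  return '/' + [year, position, name].map(encodeURIComponent).join('/');
}

// Parse a path back into params.
// Returns null for anything that is not an item URL (such as "/").
function pathToParams(pathname) {
  var parts = pathname.split('/').filter(Boolean);
  if (parts.length !== 3) return null;

  var position = parseInt(parts[1], 10);
  if (isNaN(position) || position < 0) return null;

  return {
    year: decodeURIComponent(parts[0]),
    position: position,
    name: decodeURIComponent(parts[2])
  };
}

console.log(itemToPath('1969', 0, 'Apollo 11'));
console.log(pathToParams('/1969/0/Apollo%2011'));
```

Keeping both directions in one place means the server routes and the browser code agree on what a URL means.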

Putting it together: The server

The first step in converting the application over is to set up a server. In this case we’re using Node.js and Express. Our server file is fairly simple:


var express = require('express'),
    path = require('path'),
    app = express(),
    ip = '0.0.0.0',
    port = 3000,
    bodyParser = require('body-parser');

// Include static assets
app.use(express.static(path.join(__dirname, 'public')));

app.set('views', path.join(__dirname, 'views'));
app.set('view engine', 'ejs');

// Set up routing
require('./app/routes/core-routes.js')(app);

// Catch-all for unmatched GET routes: respond with a real 404 status
app.get('*', function(req, res) {
  res.status(404).json({
    "route": "Sorry this page does not exist!"
  });
});

app.listen(port, ip);

console.log('Server is Up and Running at Port : ' + port);

We simply require express and a few other useful things, set routes to the static files (client-side JS and CSS, which are all in the public folder), set a view engine as we’re going to have the world’s simplest template, and include the routing logic for the bulk of the application. A quick 404 handler and setting the server to listen on port 3000 completes things.

Routes

This is where things start to get a little more involved. Our core-routes file looks like this:


var React = require('react/addons'),
    timelineJsonData = require('../data/timeline.json'),
    axios = require('axios'),
    HoH = React.createFactory(require('../js/dispatcher.js'));

function isset (obj) { return typeof obj !== 'undefined'; }

module.exports = function(app) {
  app.get('/', function(req, res) {
    // React.renderToString takes your component and generates rendered markup. SEO friendliness all the way
    var staticHTML = React.renderToString(HoH ({ timeline: timelineJsonData, initParams: false }));
    res.render('index.ejs', { reactOutput: staticHTML });
  });

  app.get('/:year/:position/:name', function(req, res) {
    function getWikiData (wikiApiLink, renderer) {
      var initData = {
        itemDetail: timelineJsonData[req.params.year][req.params.position],
        wikiData: false,
        wikiImages: []
      };

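      axios.get('https://apis.builtvisible.com/history_of_humanity/?url=' + encodeURIComponent(wikiApiLink.replace(/&amp;/g, "&"))).then(function (output) {
        if (isset(output.data.query.pages)) {
          // query.pages is keyed by page ID, so take the first (only) key
          var pageId = Object.keys(output.data.query.pages)[0];
          initData.wikiData = output.data.query.pages[pageId];

          // React.renderToString takes your component and generates rendered markup. SEO friendliness all the way
          var staticHTML = React.renderToString(HoH ({ timeline: timelineJsonData, initparams: req.params, initwikidata: initData }));
          renderer.render('index.ejs', { reactOutput: staticHTML });
        } else {
          // No page data came back, so don't leave the request hanging
          renderer.status(404).send('Sorry, this item could not be found.');
        }
      })
      .catch(function (e) {
        console.log('error in xhr');
        console.log(e);
        // Send a response so the request does not hang on failure
        renderer.status(500).send('Sorry, something went wrong.');
      });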
    }
    var wikiApiLink = 'https://en.wikipedia.org/w/api.php?format=json&action=query&prop=extracts|images&exintro=&explaintext=&titles=' + req.params.name;
    getWikiData(wikiApiLink, res);
  });
};

Here we first include react, our timeline data, axios and the application.

Now, if we’re just dealing with the entry point to the application, that’s easy: we simply run the application in its default state. However, now that we’re server side, we can make the application point to a specific item. To do that, we specify a second route, which takes the form of /:year/:position/:name, where year is the year that item occurred, position is the pointer for where it sits in that year’s events, starting from 0, and name is the name of the item.

This is where our data pre-loading comes in. We use axios to fetch the main body data from the Wikipedia API. We’re not going to get the images too, as they’re less important for the purposes of this, and we want to keep the app as quick as possible. Having got that data, we continue as we would otherwise and render the application.
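One detail worth spelling out is why the route code reaches for Object.keys. The MediaWiki query API returns its pages as an object keyed by page ID, so the ID has to be looked up before the page can be read. A minimal sketch of that step, using a hypothetical helper name (`extractFirstPage`) and an invented, trimmed-down response:

```javascript
// Pull the first page out of a MediaWiki `action=query` response body.
// `query.pages` is an object keyed by page ID, so the ID has to be looked up.
function extractFirstPage(body) {
  var pages = body && body.query && body.query.pages;
  if (!pages) return false;

  var ids = Object.keys(pages);
  return ids.length ? pages[ids[0]] : false;
}

// Hypothetical sample response, for illustration only.
var sample = {
  query: {
    pages: {
      '12345': { pageid: 12345, title: 'Example', extract: 'Intro text' }
    }
  }
};

console.log(extractFirstPage(sample).title);
console.log(extractFirstPage({}));
```

Returning `false` for a missing page mirrors the `wikiData: false` default in the route, so the component can treat "no data" the same way in both cases.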

The real magic here is the renderToString method. It ships with React, and it makes life so much easier than with something like Angular.JS. We can simply render our entire application to an HTML string and inject it into the body of our output. That way, we end up with something search friendly, which also happens to have all the benefits of a JavaScript application. When the page loads in the browser, the client-side instance of React is able to hook into the pre-rendered HTML and map it back onto the React application, rather than rebuilding it from scratch.
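The injection step itself is tiny. The index.ejs template isn't reproduced in this post, so the function below is only a guess at its shape, written as plain JavaScript: it wraps the pre-rendered string in a page shell. The `renderPage` name and the page title are assumptions. The `hoh` element id and the bundle file name come from elsewhere in this post.

```javascript
// Stand-in for index.ejs: wraps pre-rendered markup in a page shell.
function renderPage(reactOutput) {
  return [
    '<!DOCTYPE html>',
    '<html>',
    '<head><meta charset="utf-8"><title>History of Humanity</title></head>',
    '<body>',
    // The server-rendered markup goes inside the mount node, unescaped,
    // so the browser copy of React can attach to it.
    '<div id="hoh">' + reactOutput + '</div>',
    '<script src="/bundle.min.js"></script>',
    '</body>',
    '</html>'
  ].join('\n');
}

// In the real app this string would come from React.renderToString(...).
var html = renderPage('<ul><li>Example item</li></ul>');
console.log(html);
```

In an EJS template the equivalent would be the unescaped output tag, `<%- reactOutput %>`, placed inside the same mount element.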

Putting it together: The application

There are a few other things to know about how this is all assembled. We’re going to be using Browserify to make everything run smoothly on Node and in the browser, so we have to go through the application and turn our React components into something that Browserify can understand. Mostly that means that rather than rendering anything, we instead export every component as a module. That’s why at the end of all the React files you’ll see:

module.exports = HoH;

With that, we’re able to call our react components as if they were modules. We pack all the application together using Browserify. You can see the scripts that run that if you look in package.json. For those without the git repo open though, they’re these:

"build-dev": "browserify -e app/main.js -o public/bundle.dev.js -d",
"build-min": "browserify -e app/main.js -g uglifyify -o public/bundle.min.js",

With those two scripts, we can pack our entire application up as a single js file. The first produces a normal version, with the second making a minified version.

But wait a minute, they’re running something called main.js. What’s that file?

Main.JS – Where the magic happens

Main.js is the file that pulls the application together for the browser. It allows us to specify what things the browser will need to make the application run, run tests, set up data and so on. It looks like this:


var React = require("react/addons"),
    ReactCSSTransitionGroup = React.addons.CSSTransitionGroup,
    axios = require('axios'),
    timelineJsonData = require('./data/timeline.json'),
    HoH = require('./js/dispatcher.js');

window.app = (function() {
  var requiredFeatures = {
    "JSON decoding": window.JSON,
    "the selectors API": document.querySelector,
    "DOM level 2 events": window.addEventListener,
    "the HTML5 history API": window.history && window.history.pushState
  };

  // requiredFeatures is a plain object, so iterate over its keys
  for (var feature in requiredFeatures) {
    if (!requiredFeatures[feature])
      return alert("Sorry, but your browser does not support " + feature + " so this app won't work properly.");
  }

  // Default to "no item selected", then fill in from the URL if there is one
  var initData = { year: false, position: false, name: false };
  if (window.location.pathname !== '/') {
    var parts = window.location.pathname.split('/');
    initData = { year: parts[1], position: parts[2], name: parts[3] };
  }

  return React.render(React.createElement(HoH, { timeline: timelineJsonData, initparams: initData }), document.getElementById('hoh'));
})();

As you can see, this is similar to the core-routes file. We’re saying what things we need (React, Axios etc), testing the browser for the features it needs to run the application, and then setting up some data based on the current URL. With all of that done, we boot up the application with any data required, and render it into the html element with the id “hoh”.

Wrapping up

In essence, that’s all that’s required to make a React based application run on the server. The other amends that we’ve put in here are more for the sake of UX. A few lines of history.js calls allow us to push changes to the URL bar, and to traverse history. Beyond that, I’ve just got the following few thoughts.
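As a rough idea of what those history calls involve, here is a minimal sketch. It assumes only that the history object exposes pushState(data, title, url), which is true of both History.js’s `History` object and the browser’s native `window.history`. The `showItem` helper and the fake history object are hypothetical, added so the sketch runs outside a browser.

```javascript
// `historyApi` is anything with pushState(data, title, url).
function showItem(historyApi, year, position, name) {
  var url = '/' + [year, position, name].map(encodeURIComponent).join('/');
  // Store the params as state so they are available when navigating back.
  historyApi.pushState({ year: year, position: position, name: name }, name, url);
  return url;
}

// A tiny fake that records calls instead of touching a real URL bar.
var calls = [];
var fakeHistory = {
  pushState: function (data, title, url) {
    calls.push({ data: data, title: title, url: url });
  }
};

showItem(fakeHistory, '1969', 0, 'Apollo_11');
console.log(calls[0].url);
```

Traversing history is then a matter of listening for state changes and re-rendering the application with the stored parameters.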

Things to remember

  1. React is for building views. Don’t turn it into controllers, even though you can.
  2. Load data outside React components and pass it in as props. This applies to child components, and to the top of the application when rendering on the server.
  3. Make use of renderToString. It makes it trivial to make applications SEO friendly.
  4. Pre-render as much as you need, but as little as possible. Speed is good. Don’t get tempted to pre-render everything, just because you can.

Final note

We created this project to examine the marketing implications of modern JavaScript development. Now that we’ve made the app SEO friendly, we’re going to analyse the log data and look at how Googlebot crawls the site.
