
How to Implement Lazy Loading with Webpack and Netlify

James Pulec
Published in Level Up Coding
7 min read · Jun 12, 2020


Hand-holding Honeymoon

The first time I used Netlify to deploy a single page app, I was pleasantly surprised. Before then, I’d usually used Heroku to deploy applications, which meant serving my index.html file from Heroku, and configuring my Heroku build process to upload my compiled assets to a CDN. While not too terribly difficult to set up, this process had some downsides. Serving the index.html file from Heroku meant that it could take extra time before changes were deployed to users and the performance wasn’t ideal. Netlify, on the other hand, deploys your index.html file and your assets almost instantaneously and it does so atomically. Excellent. However, there’s something that can be problematic about this: Netlify will only serve the current version of your assets at your specified domain.

In a lot of cases this doesn’t matter. You make changes, your application gets rebundled, and the next time a user visits your app’s URL, they get the newest version of your app. Where this falls apart is when your app is designed in such a way that a user doesn’t necessarily get all of the required assets when they make that initial index.html fetch. A common case where this happens is when you lazy load pieces of your application. See the webpack docs on lazy loading for a more thorough explanation and example.
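As a minimal sketch of what lazy loading looks like in application code, the helper below requests a module only the first time it is needed. The createLazyLoader name and the ./pirates-of-the-pancreas path are hypothetical and used only for illustration; the relevant point is that webpack emits anything behind a dynamic import() as its own chunk file.

```javascript
// Hypothetical helper: wraps a loader function so the chunk is requested
// at most once, and only when it is first needed.
function createLazyLoader(load) {
  let pending = null;
  return function loadOnce() {
    if (pending === null) {
      pending = load();
    }
    return pending;
  };
}

// In a webpack app, the loader would be a dynamic import, which webpack
// emits as a separate chunk file (the module path here is an assumption):
//
//   const loadPirates = createLazyLoader(() => import('./pirates-of-the-pancreas'));
//   button.addEventListener('click', async () => {
//     const pirates = await loadPirates();
//     pirates.start();
//   });

// Stand-in loader so the sketch runs outside a bundler.
let fetchCount = 0;
const loadDemo = createLazyLoader(() => {
  fetchCount += 1;
  return Promise.resolve({ name: 'pirates-of-the-pancreas' });
});
```

The chunk request happens at click time, not at page load, which is exactly why the file has to still exist on the server long after index.html was fetched.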

Cliff of Confusion

This breaks because a user may request an older version of one of your assets at an unpredictable time. For an extreme example, assume a user visits your website, anatomypark.com. They get served the index.html file, which contains a reference to shell-abcdef.js. That file has a dynamic import() statement which loads pirates-of-the-pancreas-abcdef.js when a button is clicked. Now suppose the user doesn’t click that button and leaves their browser open for a week. Over the course of that week you’ve probably released a number of changes to the site. This means that the names of your bundles have changed and are now shell-012345.js and pirates-of-the-pancreas-012345.js. The change from shell-abcdef.js to shell-012345.js doesn’t really matter, since the user’s browser already loaded the shell when index.html was served. But suppose that user returns to their open browser window and clicks that button. Their browser will request anatomypark.com/pirates-of-the-pancreas-abcdef.js, which will 404 because Netlify is no longer serving that file.
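The scenario can be reproduced with a toy model of a host that serves only the files from its current deploy. This is a sketch for illustration, not Netlify's implementation; the AtomicHost class and its methods are hypothetical names.

```javascript
// Toy model of a host that atomically swaps in each deploy and serves
// only the files from the current one (class and method names are hypothetical).
class AtomicHost {
  constructor() {
    this.files = new Set();
  }

  // Replace everything that was served before with the new file list.
  deploy(files) {
    this.files = new Set(files);
  }

  // Return an HTTP-like status code for a requested path.
  request(path) {
    return this.files.has(path) ? 200 : 404;
  }
}

const host = new AtomicHost();

// First deploy: the user loads index.html and shell-abcdef.js.
host.deploy(['index.html', 'shell-abcdef.js', 'pirates-of-the-pancreas-abcdef.js']);
const beforeRedeploy = host.request('pirates-of-the-pancreas-abcdef.js');

// A later deploy changes the hashes while the user's tab is still open.
host.deploy(['index.html', 'shell-012345.js', 'pirates-of-the-pancreas-012345.js']);
const afterRedeploy = host.request('pirates-of-the-pancreas-abcdef.js');

console.log(beforeRedeploy, afterRedeploy); // 200 404
```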

That was a fairly contrived example, but if you have written your app to lazy load a number of different files, whether for performance reasons or by user type, this scenario occurs more often than you might expect. This is especially true if you deploy dozens of times a day, like we do at Resource. It basically means you can’t use lazy loading with Netlify without some extra steps.

Desert of Despair

There are a few ways to work around this. You could just bail on Netlify, but we’d grown to appreciate its instant cache invalidation and easy rollbacks, so we wanted to see if there was a way to make things work without switching to a different platform. Another option would be to set up Netlify to proxy these requests for old assets to some other place where you would need to store them, like an S3 bucket. Setting this up would require adding a few extra steps to our build process to always store our assets in that S3 bucket and we’d get worse performance for these requests since they’d have to be redirected from Netlify’s CDN to the S3 bucket. The final option we came up with, and chose to opt for, was to find a way to keep serving our old assets from Netlify. If we just kept copies of previously built assets around in Netlify, these old requests would work just fine.

Rather than keeping old assets available forever, we wanted to keep about one week’s worth of old bundle data around, so that old-ish bundle data is available for our users. We looked at two different ways to accomplish this: 1) committing our bundles to our repository, or 2) using the persistence mechanism available in our build environment. We didn’t like the idea of committing build artifacts to our repo, so we started looking at the persistence available in CircleCI. One persistence option it has is its caching, which is commonly used for caching dependencies. After some experimentation we figured out a way to abuse its caching mechanism in order to keep old bundles around. While these steps describe how to do this with CircleCI, you can probably adapt them to whatever build environment you’re in, assuming it has some persistence that you can use to keep your old bundle files around.

Upswing of Awesome

It’s easier to explain how this all works by showing the CircleCI config a bit out of order, since a single build job needs to access the bundle data from the previous build job. Don’t worry. There’s a full example a little ways down.

First, we have our build step. For now we’ll assume we’re doing something simple like running an npm script such as npm run build and we expect it to output our built index.html and js files to ~/build. We’ll also assume that our bundle files are created with unique hashes for filenames.

Second, once we’re done building we need to save that built data to our CircleCI cache. We do that with a step that looks like the following:

- save_cache:
    key: bundles-{{ epoch }}
    paths:
      - ~/build

This step tells CircleCI to cache our entire ~/build folder, and store it with the key bundles-{{ epoch }} which means we should get a strictly increasing key using the number of seconds since epoch. We really don’t care about the key itself. All that is important is that each set of bundle data gets a unique key.

The last thing we need to do is make sure that the next time we build our bundles, we include the bundle data that was built during the previous build. This step is actually going to occur before our build step in our configuration file. We’ll do this by using the cache restore functionality that CircleCI provides. Cache restoration is done by cache key prefix, so we can look up all of our bundle caches using the key bundles. A crucial reason this works is that when CircleCI encounters multiple matches for a cache prefix (and in this case it will match ALL previous caches), it uses the most recently created cache that matches. In our case, this means we’ll get the most recently built bundle. By calling restore_cache we’ll end up copying our most recently created bundle data into the ~/build folder. The step will look something like the following:

- restore_cache:
    keys:
      - bundles
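The prefix-matching behavior described above can be modeled in a few lines. This is a toy illustration of the lookup rule, not CircleCI's code; the pickCache function, the createdAt field, and the timestamps are all made up for the example.

```javascript
// Toy model of prefix-based cache lookup: among all caches whose key starts
// with the requested prefix, pick the one created most recently.
// (Function and field names are hypothetical; this is not CircleCI's code.)
function pickCache(caches, prefix) {
  const matches = caches.filter((cache) => cache.key.startsWith(prefix));
  if (matches.length === 0) {
    return null; // cache miss: nothing is restored
  }
  return matches.reduce((newest, cache) =>
    cache.createdAt > newest.createdAt ? cache : newest
  );
}

// Keys shaped like bundles-{{ epoch }}; the timestamps are made-up values.
const caches = [
  { key: 'bundles-1591900000', createdAt: 1591900000 },
  { key: 'bundles-1591990000', createdAt: 1591990000 },
  { key: 'deps-1591995000', createdAt: 1591995000 },
];

const restored = pickCache(caches, 'bundles');
console.log(restored.key); // bundles-1591990000
```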

Tying all these steps together, we should have a working set of operations that results in us restoring our previous bundle files every time we run our build job. The file should look like the following when tied together:

job:
  steps:
    # Dependency installation and source code fetching omitted
    # for brevity
    ...
    # Restore the most recently cached set of previously built
    # bundle files
    - restore_cache:
        keys:
          - bundles
    # Build our current bundle files
    - run:
        name: Build Bundle
        command: >
          yarn build
    # After we've built our bundles we need to save them to
    # the cache so they can be restored during the next build run
    - save_cache:
        key: bundles-{{ epoch }}
        paths:
          - ~/build
    # Deploy to Netlify
    - run:
        name: Deploy to Netlify
        command: >
          netlify-cli deploy --prod --dir=./build

It’s important that we perform the build step after we restore the previous bundle files, so that our build step overwrites any files that have changed but don’t have unique filenames, e.g. index.html.

IMPORTANT: We need to make sure that our build step does NOT clear the ~/build directory before creating new bundle files. Otherwise, we’ll just remove all the previously created bundle files that were restored from the cache. Clearing the directory is the default behavior for create-react-app, so we had to work around it. Unfortunately, there’s no easy way to disable it. In our case, we had already ejected from create-react-app, so we could modify the build.js script directly. Otherwise you may need to fork react-scripts, as described in the docs here, in order to modify the build.js script.

Once all this is said and done, we should have lazy loading working properly on Netlify.

Except….

That’s a LOT of bundle files…

Despair Again?

After a couple days we noticed that the number of assets we were deploying to Netlify kept going up. Uh-oh. We’re always just adding the entirety of our previous bundle files to our new set of files, meaning our cache just keeps growing.

We decided to deal with this problem by just removing all files in ~/build that are older than 1 week. Again, since we had already ejected, making this change to our build script was fairly easy. We just added a small npm package called find-remove that provides that functionality via a function called findRemoveSync, but you could easily implement it yourself instead. The relevant part of our ejected build script looks as follows:

// Assumes a find-remove release that exports the function directly via require()
const findRemoveSync = require('find-remove');
const { checkBrowsers } = require('react-dev-utils/browsersHelper');

checkBrowsers(paths.appPath, isInteractive)
  .then(() => {
    // Remove any files that haven't been modified in the last week
    const weekInSeconds = 60 * 60 * 24 * 7;
    const results = findRemoveSync(paths.appBuild, {
      age: {
        seconds: weekInSeconds
      },
      files: '*.*'
    });
    console.log(`Removed ${Object.keys(results).length} files/directories`);
    // Merge with the public folder
    copyPublicFolder();
    // Start the webpack build
    return build();
  })
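If you’d rather not add a dependency, the pruning step can be sketched with Node’s built-in fs module. This is a minimal version under stated assumptions, not a drop-in replacement for find-remove: removeFilesOlderThan is a hypothetical helper, it only deletes regular files (empty directories are left in place), and the demo runs against a throwaway temp directory rather than a real build folder.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Recursively delete files under `dir` whose modification time is more than
// `maxAgeSeconds` before `now`. Returns the paths that were removed.
// (Hypothetical helper; directories themselves are left in place.)
function removeFilesOlderThan(dir, maxAgeSeconds, now = Date.now()) {
  const removed = [];
  if (!fs.existsSync(dir)) {
    return removed;
  }
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      removed.push(...removeFilesOlderThan(fullPath, maxAgeSeconds, now));
    } else if (entry.isFile()) {
      const ageSeconds = (now - fs.statSync(fullPath).mtimeMs) / 1000;
      if (ageSeconds > maxAgeSeconds) {
        fs.unlinkSync(fullPath);
        removed.push(fullPath);
      }
    }
  }
  return removed;
}

// Demo in a throwaway temp directory: one fresh file, one backdated 8 days.
const demoDir = fs.mkdtempSync(path.join(os.tmpdir(), 'bundles-'));
const freshFile = path.join(demoDir, 'shell-012345.js');
const staleFile = path.join(demoDir, 'shell-abcdef.js');
fs.writeFileSync(freshFile, '');
fs.writeFileSync(staleFile, '');
const eightDaysAgo = new Date(Date.now() - 8 * 24 * 60 * 60 * 1000);
fs.utimesSync(staleFile, eightDaysAgo, eightDaysAgo);

const weekInSeconds = 60 * 60 * 24 * 7;
const removedFiles = removeFilesOlderThan(demoDir, weekInSeconds);
console.log(`Removed ${removedFiles.length} file(s)`);
```

In an ejected build script, the call would take paths.appBuild as its directory. Because the helper keys off each file's modification time, it only works if those timestamps survive the cache save and restore.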

Job Ready

And that should do it. With just a little bit of configuration, you’ll now be able to take advantage of webpack’s lazy loading while deploying to Netlify. This should allow you to more easily break up your application, which will help reduce bundle sizes and provide better performance for your users. Go forth and lazy load!
