Recently, I have been optimising login.pixelpin to try to make it perform as well as possible. I have learned a few things along the way, including that it is not always obvious how to make things better!

There are basically three things you are aiming for when making a high-performance web site:

1) Reduce the overall content of your pages.
2) Reduce the network latency of your page impressions (related to, but not the same as, 1).
3) Reduce the load on the web server so that it can perform better for the same number of page impressions.

Reduce Overall Content

This might seem obvious, but if you look at a poor example, like amazon.co.uk, the page weight is a terrible 3MB! That's right: to see everything on the front page, you have to download 3MB of data, much of it contained in the 89 images it displays!

There are various ways to reduce the overall content and clearly, some of this happens early on in the design phase, where designers should keep their designs simple, avoid putting too much on each page and use images appropriately. Consider something like an e-commerce site with a section for "related products". If you make the related-product content too heavy, every time someone views an item they might have to wait another few seconds for the related products to load before they can see what they actually clicked on. On a fast connection this might not be too bad, but on mobile networks or in areas with poor connection speeds you might quickly annoy your customers.
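One way to stop a heavy related-products panel delaying the page is to load it separately, after the main content has rendered. The sketch below shows the idea in client-side JavaScript; the endpoint, element id and data attribute are invented for illustration:

```javascript
// Rough sketch only: load the "related products" panel after the main page
// has rendered, so a heavy panel never delays the item the user clicked on.
// The /related-products endpoint, the element id and the data-product-id
// attribute are all made up for illustration.
window.addEventListener('load', function () {
  var panel = document.getElementById('related-products');
  if (!panel) { return; }

  fetch('/related-products?productId=' + panel.dataset.productId)
    .then(function (response) { return response.text(); })
    .then(function (html) { panel.innerHTML = html; })
    .catch(function () { /* leave the panel empty rather than break the page */ });
});
```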

There are examples of content-heavy sites in various places, but on the Halifax banking site I reviewed in my last post, clicking on an account gives you transactions, direct debits, standing orders and statement-search functionality all on the same page. This is inefficient, even if the additional content is not massive.

Reduce Network Latency

Once you have reduced the content in the design of your pages, reducing network latency means three things: removing redundant content, compression and caching.

Removing redundant content means ensuring that commented-out code is not kept in pages, that unused scripts are removed and - one that can be quite hard - that unused CSS rules are removed, which can account for quite a lot of CSS. If you use a framework this is not usually very easy, and something you do not use now might be used on another page or might be needed in the future, so be careful here!
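For the unused-CSS problem, a very crude starting point is to list class selectors that never appear in your pages. The Node sketch below does this for one stylesheet and one page; the file names are placeholders, and because classes can be added from JavaScript or used on other pages, treat the output as candidates to review, not rules to delete:

```javascript
// Rough sketch: flag class selectors in site.css that do not appear anywhere
// in index.html. File names are placeholders for your own stylesheet/pages.
var fs = require('fs');

var css = fs.readFileSync('site.css', 'utf8');
var html = fs.readFileSync('index.html', 'utf8');

// Pull out anything that looks like a class selector, e.g. ".btn-primary".
var classNames = css.match(/\.[A-Za-z_-][\w-]*/g) || [];

classNames.forEach(function (selector) {
  var name = selector.slice(1);               // drop the leading dot
  if (html.indexOf(name) === -1) {
    console.log('Possibly unused: ' + selector);
  }
});
```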

Compression should be automatic on your web server. Text-heavy content, such as scripts and CSS files, can often be gzip compressed by as much as 75%. The additional time to compress and decompress is usually small compared to the network time saved, so this is always a good idea for resources that compress well. Formats that are already compressed, such as audio, video and images, will not usually compress much further, so avoid compressing these again. NOTE that on CDNs you do not usually get automatic compression, and you might have to do something dynamic in your pages to check that the client supports gzip before rewriting the link to point to a .gz version of the resource. Do not assume that all clients support gzip - they don't!
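As a minimal sketch of the "check before you rewrite" idea, the helper below looks at the client's Accept-Encoding header and only points the link at a pre-compressed .gz copy when gzip is supported. Storing site.css alongside site.css.gz on the CDN is an assumption, and the .gz copy must itself be served with a Content-Encoding: gzip header:

```javascript
// Rough sketch: only rewrite a resource link to its .gz twin when the client
// has said it can handle gzip. The CDN layout (site.css next to site.css.gz)
// is an assumption; the .gz blob must be stored with Content-Encoding: gzip.
function resourceUrl(acceptEncoding, url) {
  var supportsGzip = /\bgzip\b/.test(acceptEncoding || '');
  return supportsGzip ? url + '.gz' : url;
}

// e.g. in a request handler:
//   var cssUrl = resourceUrl(request.headers['accept-encoding'],
//                            'https://cdn.example.com/css/site.css');
```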

Another form of compression is to minify the CSS and JavaScript. Minifying removes whitespace and replaces long variable names (which are easier to read when developing) with things like single-character names: for example, var myLongVariableName; might be minified to var a;. This can be done with a dynamic plugin that minifies on the fly (and should use the web server cache to store the result), or it can be done manually and saved (usually as filename.min.js, for example) and then referenced in the script or CSS links.
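A small sketch of the manual approach: keep the readable file for development and switch the link to the .min.js copy in production. The DEBUG flag and the naming convention are assumptions; a dynamic minification plugin would make this decision for you:

```javascript
// Rough sketch: serve the readable script while developing, the minified
// copy otherwise. DEBUG and the /scripts path are placeholders.
var DEBUG = false;

function scriptUrl(name) {
  return '/scripts/' + name + (DEBUG ? '.js' : '.min.js');
}

// The page can then render its script tags from scriptUrl('bootstrap'),
// scriptUrl('site') and so on.
```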

Caching is also really important. Ideally, you should use long expiration times and some kind of cache busting for when resources need to be updated. This is usually achieved by adding a querystring to the link (even if it is not used); changing the querystring makes the client assume it needs to re-download the resource. Caching time should usually be no longer than 365 days, since some clients get confused if it is longer!

Caching headers for HTTP responses are slightly confusing, but the cache-control header is usually preferred and allows you to specify the type of caching (public or private) and the max-age in seconds of the resource. The older "expires" header had to use a date and time in the GMT timezone (anything malformed was ignored!). You can also set an ETag, which the client can resend to the server to ask "does this resource still have this ETag?". If the server checks and the resource is still up to date, it should send a 304 (Not Modified) response, which avoids sending the entire resource back across the network. This check still involves the server, but far less network traffic, so it is still much faster than simply re-downloading everything.

What is REALLY important with caching is to test it from a browser or a tool like Fiddler, which has a caching tab and can report whether the caching responses match what you expect! You should also test what happens when the cache expires and the client re-requests the resource - does it send an "If-Modified-Since" or "If-None-Match" header and get a 304 response?
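To make the header exchange concrete, here is a minimal sketch using Node's built-in http module (not the stack my site actually runs on). It sets Cache-Control and an ETag on the response, and answers a matching If-None-Match with a 304 and no body; the ETag value is a hard-coded placeholder that a real server would derive from the file contents or version:

```javascript
var http = require('http');

var ETAG = '"v42"';   // placeholder - derive this from the file contents/version

http.createServer(function (req, res) {
  // If the client already holds this version, say so and send no body.
  if (req.headers['if-none-match'] === ETAG) {
    res.writeHead(304);
    res.end();
    return;
  }

  res.writeHead(200, {
    'Content-Type': 'text/css',
    'Cache-Control': 'public, max-age=31536000',   // 365 days, in seconds
    'ETag': ETAG
  });
  res.end('body { margin: 0; }');
}).listen(8080);
```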

Reduce the Load on the Web Server

Clearly, the more work the web server does, the slower it will be. Some of this is simply related to the number of customers/connections hitting the server, but other slowness is due to poor design, or to a site that needs too many connections to the web server for each page.

As an example, login.pixelpin.co.uk references something like 10 scripts and 5 CSS files, and even the front page references about 6 images. That means, including the page itself, the client has to make around 22 requests to the web server to get all these items. There is usually a limit on the number of connections that can be made to the same domain at the same time, so the client might even have to wait for some responses before it can continue requesting more resources - even when it already knows it needs them.

There are several ways we can make things faster for the web server and client.

Firstly, in our site, I was creating a web service connection as a local variable while processing page requests, which meant it was created and then disposed of almost every time a user hit the site. Since this was slow, I moved it into a static variable, accessed via a property, so it is only created once, when the code is first called.
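The real code is server-side, but in JavaScript terms the change looks like the lazily created, shared client below. createWebServiceClient is a hypothetical stand-in for whatever expensive-to-construct client your pages use; in multi-threaded server code you would also need to make the one-time creation thread-safe:

```javascript
// Rough sketch: build the expensive client once, the first time it is needed,
// and reuse it for every later request instead of rebuilding it each time.
var serviceClient = null;

function getServiceClient() {
  if (serviceClient === null) {
    serviceClient = createWebServiceClient();   // hypothetical, slow to construct
  }
  return serviceClient;
}
```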

Secondly, we can reduce the number of connections by combining (bundling) scripts and/or CSS into single files. For instance, rather than having 6 different Bootstrap scripts, I simply copied and pasted them into a single script - the same amount of network traffic, but 1 request instead of 6.
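As a rough sketch, bundling by concatenation can be as simple as the Node script below, which joins several files into one so the browser makes a single request. The file names are placeholders, and order still matters for scripts that depend on each other:

```javascript
var fs = require('fs');

// Placeholder file names - list the scripts in the order they must run.
var parts = [
  'scripts/bootstrap-dropdown.js',
  'scripts/bootstrap-modal.js',
  'scripts/bootstrap-tooltip.js'
];

var bundle = parts
  .map(function (file) { return fs.readFileSync(file, 'utf8'); })
  .join('\n;\n');   // the stray semicolon guards against files that omit their own

fs.writeFileSync('scripts/bundle.js', bundle);
```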

Another common way of reducing server load is to use a content delivery network (CDN). Effectively, this just means serving content from another web server to keep connections away from the main one, but these networks generally have a few features that help. Firstly, they usually geo-cache content, which means it is copied to various places around the world, so people in, say, Australia get the content from a location in or near Australia even if your main web server is in London. Secondly, CDNs will be on a different domain to your web server and won't set cookies. This is a minor thing, but it means unnecessary cookie data is not sent with each request for these resources.

CDNs can usually be attached to your web server content so that they automatically pull new content as it is published, or they can be attached to a storage account and run completely separately from the web server. As mentioned before, you have to be careful with compression, which is not usually applied automatically as it would be by most web servers. There is also some concern that an additional SSL handshake to another domain might add some delay to the page load, but I haven't seen this on my site.
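Finally, a tiny sketch of pointing static resources at a CDN host: wrap the path in a helper so production links go to the CDN and development (or a CDN outage) can fall back to local files. The host name and flag are placeholders:

```javascript
// Rough sketch: cdn.example.net and useCdn are placeholders for your own
// CDN endpoint and configuration.
var CDN_BASE = 'https://cdn.example.net';
var useCdn = true;

function staticUrl(path) {
  return (useCdn ? CDN_BASE : '') + path;
}

// staticUrl('/css/site.min.css')
//   -> 'https://cdn.example.net/css/site.min.css' when useCdn is true
//   -> '/css/site.min.css' otherwise
```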