Making Azure Virtual Machine Scale Sets work!

Introduction

We are trying to create a test environment for load testing our application. Unfortunately, we use older resources (Cloud Services and VMs) that were all originally deployed manually, so I have been building a near-exact replica of our production system bit by bit.

Resource Manager is the new thing in Azure. It is basically a concept and a set of APIs that let you group resources together and script their deployment, which will be very useful going forward for our multiple deployments.

One of the things I had to replicate was a VM set, which we were using to run a PHP app. The reason we used a VM was that the cloud service tools for PHP had various shortcomings and were not suitable so I went old-school. Although there are now App Services Web Apps for PHP, we want to load test the original architecture so that we can then benchmark that against changes, including moving to App Services.

Virtual Machine Scale Sets

To use the new portal, you have to use Resource Manager and resources that are compatible with it. The closest equivalent to classic VMs is the Virtual Machine Scale Set, which is basically the same thing but with more configuration.

I assumed this would be a relatively quick and easy setup like it was on classic but the problem with all configuration-heavy features is that if it doesn't work - which it didn't - it is hard to know which bit is wrong.

I got it to work eventually so I thought I would describe the process.

Create the Scale Set

This works like any other resource. Go to your subscription in the new portal, click Add and choose Virtual Machine Scale Set. It will list several; I am using the Microsoft one. Select it and press Create.

Fill in the details. The password you enter here is the one RDP will use. It is useful to choose a new Resource Group, since about 10 resources will be created and it is easier to keep them together in one group.

The next page gives you options for the IP address (you can use the default), a domain name, which must be globally unique, and options like whether you want to auto-scale. Auto-scale is only useful if you are using an image that will automatically install and run your application when the number of instances increases. In my case, I am installing the code manually so auto-scale isn't much use!

I think that the only way to correctly template the installation is to use a Resource Manager template which is not supported in the portal - so the portal just gives you the vanilla setup.

Set up your instances

By default, the load balancer is set up with 2 inbound NAT rules that forward RDP to your specific instances. You should probably change the port numbers because they are always in the 50000 series. With these port numbers, you can simply RDP to your public IP address (shown in the overview for the load balancer and several other places) on the port for a specific instance. Each instance should be set up identically, since they will be sharing the load.
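To see quickly which of the RDP NAT ports actually answer, something like the following could be run from your own machine. It is only a sketch: the function names are made up for illustration, and it checks nothing more than whether a TCP connection opens, not whether an RDP login would succeed.

```python
import socket
from typing import Dict, Iterable


def is_port_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port completes within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def check_nat_ports(host: str, ports: Iterable[int],
                    timeout: float = 3.0) -> Dict[int, bool]:
    """Map each port to whether it accepted a TCP connection."""
    return {port: is_port_open(host, port, timeout) for port in ports}
```

A call might look like check_nat_ports("203.0.113.10", [50000, 50001]), where both the address and the ports are placeholders; use the public IP and the ports listed under the load balancer's inbound NAT rules.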

This can obviously take time, especially with things like PHP to install, but as with all installations, test it as you go along to narrow down what might be wrong. Once you've finished, access the web apps locally to ensure they work. Another cool trick is to point one instance at the other in the hosts file and try to access one instance from the other, just to make sure the firewall ports are open. This takes place on the virtual network, so it won't give any public access yet.
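The instance-to-instance check can also be done from a script without touching the hosts file, by connecting to the other instance's private IP and sending the site name as the Host header. This is a minimal sketch, assuming Python is available on the instance; fetch_status is a hypothetical helper name, and the IP address and site name in the usage note are placeholders.

```python
import http.client


def fetch_status(ip: str, host_header: str, path: str = "/",
                 port: int = 80, timeout: float = 5.0) -> int:
    """GET path from the server at ip, presenting host_header as the site name."""
    conn = http.client.HTTPConnection(ip, port, timeout=timeout)
    try:
        # Supplying our own Host header stops http.client from adding one
        # based on the IP address we connected to.
        conn.request("GET", path, headers={"Host": host_header})
        return conn.getresponse().status
    finally:
        conn.close()
```

For example, fetch_status("10.0.0.5", "www.example.com") run on one instance would request the site from the other. A timeout or connection error points at the firewall or the network, while an HTTP error status points at the web server or site bindings.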

Set up a probe

In order to load balance instances, the load balancer needs a way to probe the health of each instance. Your choices are HTTP or TCP, and probes are set up in the Probes menu for the load balancer.

When you are first testing it, it might be tempting to point an HTTP probe straight at your web site, but be warned that there are some cases where this won't work and you will not know anything except that it doesn't work! I'm not sure that HTTPS is supported (although you can type it into the Path box) and it doesn't appear to be happy with host headers.

Unfortunately, there is no obvious way to test the probes in real time. You can only enable diagnostics and hope for some useful information, but it is very user-unfriendly.

You can always start by trying an HTTP probe. If it doesn't work, try the following trick.

Download PsPing from Sysinternals onto each instance and run it as administrator from the command line with the arguments psping -4 -f -s 127.0.0.1:5000 (-4 forces IPv4, -f opens the firewall port while it runs and -s starts it in server mode). It will run a TCP server that simply responds to requests on that port. You can then set up a TCP probe pointing at the port you specified. This worked in my case, which showed me that the HTTP part was the problem.
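If you would rather not download a tool, a throwaway TCP listener does a similar job to the psping server above. The following is a rough sketch rather than a replacement for psping: the names are invented for illustration, and it binds to all interfaces by default on the assumption that the probe arrives from outside the instance. As far as I know, a TCP probe only needs the connection to open, so the handler does nothing else. Unlike psping -f, it does not open the firewall port for you.

```python
import socketserver
import threading


class AcceptAndClose(socketserver.BaseRequestHandler):
    """Accept the connection and close it straight away."""

    def handle(self) -> None:
        pass  # socketserver closes the connection once handle() returns


def start_probe_listener(host: str = "0.0.0.0",
                         port: int = 5000) -> socketserver.ThreadingTCPServer:
    """Start a background TCP listener and return the server so it can be stopped."""
    server = socketserver.ThreadingTCPServer((host, port), AcceptAndClose)
    server.daemon_threads = True
    # serve_forever() blocks, so run it on a daemon thread.
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Call start_probe_listener(port=5000) and keep the process alive (for example by waiting on input()) while the probe is being tested, then call shutdown() and server_close() on the returned server.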

If you must use HTTP, which is the most useful in most cases, you might need to create a default web site with no host header and put your probe endpoint in there. That worked for me (I used iisstart.htm, which was already in wwwroot). Ideally, the probe would do more than just see a web page load and would also carry out other basic tests to make sure the instance is healthy. Don't make it too heavy, though, since it will be called every few seconds.
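To illustrate what a probe endpoint that does a little more than load a page might look like, here is a minimal sketch. It is written in Python purely for illustration (the app described here is PHP on IIS, where the same logic would live in a page under the default site), and the disk-space check, the threshold and all the names are hypothetical. It assumes the load balancer treats only a 200 response as healthy, so any failing check returns 503.

```python
import shutil
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Callable, Dict, Tuple


def disk_has_space(path: str = ".", min_free_bytes: int = 100 * 1024 * 1024) -> bool:
    """Hypothetical cheap check: at least min_free_bytes free on the volume."""
    return shutil.disk_usage(path).free >= min_free_bytes


def health_status(checks: Dict[str, Callable[[], bool]]) -> Tuple[int, str]:
    """Run every check; return 200 if all pass, otherwise 503 naming the failures."""
    failed = []
    for name, check in checks.items():
        try:
            ok = check()
        except Exception:
            ok = False  # a check that crashes counts as a failure
        if not ok:
            failed.append(name)
    if failed:
        return 503, "unhealthy: " + ", ".join(failed)
    return 200, "ok"


def make_health_handler(checks: Dict[str, Callable[[], bool]]):
    """Build a request handler class that reports the result of the checks."""

    class HealthHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            code, body = health_status(checks)
            payload = body.encode("utf-8")
            self.send_response(code)
            self.send_header("Content-Type", "text/plain")
            self.send_header("Content-Length", str(len(payload)))
            self.end_headers()
            self.wfile.write(payload)

        def log_message(self, *args):
            pass  # keep the frequent probe hits out of the console

    return HealthHandler
```

It could be served with HTTPServer(("0.0.0.0", 8080), make_health_handler({"disk": disk_has_space})).serve_forever(), where port 8080 is just an example. Keeping each check cheap matters because every instance is probed every few seconds.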

Set up a Load Balancing Rule

The Load Balancing Rule is for endpoints that are load balanced across the instances (e.g. HTTP and HTTPS). To create the rule, you must have a probe, but otherwise it is straightforward: give it a name and a front-end and back-end port (for public sites these are likely to be well-known ports like 80 and 443). By default, there will only be 1 pool and the probe you just created to choose from. Unless you need sticky sessions (best to design your system NOT to need them), disable session persistence so that requests are spread across the servers rather than pinned to one.

Other Things to Check

As with all of these things, don't forget the public DNS settings that point to the correct location, SSL certificates, and how you will update, fix and maintain your instances. It is possible for these instances to die, so ideally you should have an automated deployment, although for now I am not going to bother spending time on that!