Well, this week has been a journey. A developer's life is really about believing the "happy path" of new ideas or technologies and then finding out which bits don't quite work and which parts are not documented properly. This week was Docker week.

There is no doubt that Docker (and containers in general) is an awesome technology for deployments, scaling and migration, so why wouldn't I want to get some containers building, working and plumbed up to production so that we can seriously consider using them?

The answer is that there is no happy path on .NET, despite "Enable Docker support" having been available in Visual Studio for a few years now.

This is my story...

Visual Studio

I know the basics of Docker but before I got complicated I wanted to literally deploy the boilerplate Visual Studio Docker project (Web API) through my CI/CD pipeline into Kubernetes. If I could get that to work then the world would be my oyster and we could start deploying real services and tweaking the production settings.

Creating the app in VS is easy: you just tick a box and get a Dockerfile. No worries there. Press Ctrl-Shift-B and it all builds (this is easy).

My next task was a CI build and I presumed that I would probably upload any successful build directly to Azure Container Registry (other registries are available). The deployment could then be the second stage, which would take containers from the registry and deploy them to production. What I eventually learned was that building a Docker image is really easy (in theory): you just run "docker build". Note that you don't need to install things like Visual Studio on the build machine because the build tools are already present in the base images that your Docker build depends on. How easy is that?

TeamCity

However, I started learning the pains of installing Docker on Windows. Firstly, Docker is surprisingly kind to the RAM on the system but you need at least 2GB of free memory for it to work. On our build server, with Java (damn-you build servers!) taking a large chunk and SQL Server helping itself, we were running a bit tight. Even after reducing SQL Server to use 1GB, there was not quite enough available.

I found a cool trick, however: by setting the Docker VM in Hyper-V to use dynamic RAM from 512MB up to 2GB, it would start up (although I was obviously pushing my luck). With that sorted, I tried to run a Docker build. Yes, literally a Docker build step in TeamCity. I received the now infamous "server doesn't support this image version", or whatever the message is. I had to learn about Windows containers.

Although Docker sounds nice and isolated, certainly in Windows (not sure about Linux) you can't build an image that is a newer version than the host OS. I guess this makes sense. These are not full VMs, they are containers and rely on features in the host. Although it is feasible, it is probably asking a lot for a host OS to know the future!

So the first problem was that our build server runs Windows Server 2016. It is not old by any stretch, but it is no longer officially supported by Docker Desktop, and it would mean that we could only deploy three-year-old OS versions to production. That is certainly not good enough for things like Nano Server, which was built to be lighter and faster than previous versions.

Azure Pipelines

Obviously upgrading Windows is no small job and would risk breaking the entire build system while it took place, so I decided instead to try Azure Pipelines on our Azure DevOps account. This should surely be as easy as pie.

It is not!

Firstly, it uses a new YAML format, which is good for source-controlling deployments, I suppose, but not good when there are no tools to help you build your pipeline. I only wanted to do a Docker build, but it chose "ubuntu-latest" as the build agent (maybe to show its open source credentials!?), which obviously didn't work for a Windows container, so I immediately had to hit the pain that is MS docs to find out whether I could just choose windows-latest. I could, but then came my next error: "server doesn't support this image version". What? I have chosen windows-latest and it is telling me my image is not supported.

Well, guess what? windows-latest is Server 2019, not Nano Server. If you want to build Nano Server images, you have to use on-prem agents. All of a sudden, Azure Pipelines is less appealing. I decided to compromise, updated my Dockerfile to use a Windows Server 2019 base image and rebuilt. New error: "Unable to copy ....csproj..". What? This is a boilerplate Dockerfile. As with most errors, they only make sense once you know what they mean! I didn't learn what this one meant until later, after I had left Azure Pipelines behind.

A word on container compatibility

As I said before, there are issues with compatibility between host and containers (not sure if this applies to Linux) and there are some documentation pages here.

In true MS fashion, they have created a document that is about 10 pages longer than needs be. I will sum it up more succinctly:

There are two ways to host or build Windows containers on a Windows host:
  1. Using process isolation on Windows Server is the fastest mode but the container's Windows version must match the host version (not just the major version but the build version). This is the default mode, hence the reason we see so many errors with default settings. This is better for production, where we want performance and can attempt to group like Windows versions together on different hosts.
  2. Using Hyper-V isolation (mini VM), which supports containers built for the host's Windows version and for previous versions, but at the cost of slower builds and more resources taken up (not surprisingly). This works on Windows Server and Windows 10 (Fall Creators Update).
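To make those two rules concrete, here is a rough Python sketch of how I now think about the choice. It is my own simplification, not anything Docker or Microsoft ships: the function names are made up, and it only compares the major.minor.build part of the version (for example 10.0.14393 for Server 2016 and 10.0.17763 for Server 2019). The official compatibility rules have more nuance than this.

```python
from typing import Optional, Tuple


def parse_build(version: str) -> Tuple[int, int, int]:
    """Return (major, minor, build) from a string such as '10.0.17763.1234'."""
    parts = version.split(".")
    if len(parts) < 3:
        raise ValueError(f"expected at least major.minor.build, got {version!r}")
    major, minor, build = (int(p) for p in parts[:3])
    return major, minor, build


def choose_isolation(host_version: str, image_version: str) -> Optional[str]:
    """Pick an isolation mode for a Windows image on a Windows host.

    Hypothetical helper, simplified from the two rules above:
    - same build as the host  -> process isolation is possible
    - older build than host   -> needs Hyper-V isolation
    - newer build than host   -> cannot run on this host at all
    """
    host = parse_build(host_version)
    image = parse_build(image_version)
    if image == host:
        return "process"
    if image < host:
        return "hyperv"
    return None


if __name__ == "__main__":
    # A Server 2016 host asked to handle a Server 2019 image: no mode works.
    print(choose_isolation("10.0.14393.0", "10.0.17763.0"))
```

The last case is exactly what bit me on the original build server: a Server 2016 host cannot build or run an image based on a newer Windows version, whichever isolation mode you pick.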

Turning to the cloud

I was struggling now because I was learning all of this as I was going along and the errors are often not helpful enough. Why don't we have a rule that all errors should include a hyperlink which could link to "known issues that cause this problem"?

I decided to create a new VM in the cloud with Windows Server 2016, install Docker and set it up as a TeamCity build agent, and that is what I did. The agent install is relatively easy, and my first Docker build? A failure of course. Need to match the host version. Oooh, I know how to do this now!

I had to enable Hyper-V isolation and, when I found the docs, I saw a comment about --isolation. I added that switch and got another error. I eventually worked out that I should have used --isolation=hyperv, and that the lack of = causes a strange parser error instead of a more useful "incorrect parameter" error.
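For what it's worth, this is roughly how I would assemble that command if I were scripting it in Python rather than typing it into a build step. The helper name and its arguments are my own invention; the only real parts are "docker build", the -t switch and --isolation with its =value form.

```python
from typing import List, Optional

# Isolation values accepted by the docker CLI on Windows.
VALID_ISOLATION = {"default", "process", "hyperv"}


def docker_build_command(
    tag: str, context: str = ".", isolation: Optional[str] = None
) -> List[str]:
    """Build the argument list for `docker build` (hypothetical helper).

    The isolation value is attached with '=' in a single argument, which
    avoids the confusing parser error I hit when I left the '=' out.
    """
    if isolation is not None and isolation not in VALID_ISOLATION:
        raise ValueError(f"unknown isolation mode: {isolation!r}")
    cmd = ["docker", "build"]
    if isolation is not None:
        cmd.append(f"--isolation={isolation}")
    cmd += ["-t", tag, context]
    return cmd


if __name__ == "__main__":
    # Prints the command line; pass the list to subprocess.run to execute it.
    print(" ".join(docker_build_command("myimage:1.2", isolation="hyperv")))
```

Building the command as a list means it can go straight into subprocess.run(cmd, check=True) without any shell quoting worries.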

Next? "The virtual machine could not be started because a required feature is not installed"? Hyper-V was running OK, I could see it, but some more searching revealed that not all Azure VM sizes support nested virtualisation. In other words, the cloud VMs are already VMs, so they need a feature to allow them to create their own child VMs! I had to resize the VM to one that was twice the price but which supported the feature (again, a more useful error would have been nice!)

I then got "unexpected EOF", which is about as unhelpful as possible. I didn't understand why this was happening after the VM was created since the build must be OK. Where was this end-of-file? Another 20 minutes of Googling and it turns out that this is a generic error that often means that the machine is out of memory. I remoted to the machine, looked at the RAM and it looked good: 2GB out of 8GB and then I ran the build again while watching...

Green build (bearing in mind this is a 1-step build process!).

Deployment

If you know anything about containers, they get versioned and pushed to a registry so that other systems can find and install them when needed. I decided to use Azure Container Registry because we already have an account on Azure and could change it later if required.

I then tried to set up TeamCity to push the image to Azure. This is quite convoluted for some reason and involves creating a "Connection" in the project admin, which should then be used by the Docker "build feature" to log in before the build and log out after it.

What happened? What do you think? Error, "denied: requested access to the resource is denied". I currently have an entry on the support forum for TeamCity to work out what is going on.

So I finally discovered what I was doing wrong, and it was alluded to in the Quick Start section of Azure Container Registry in my portal. You have to tag your image with the URL of your ACR as well as the name, e.g. example.azurecr.io/myimage:1.2 instead of just myimage:1.2. If you do not prepend the ACR URL, you get the permission error.
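Here is a small Python sketch of the naming rule, using the example.azurecr.io/myimage:1.2 example from above. The function names are hypothetical and mine; the only real commands involved are "docker tag" and "docker push".

```python
from typing import List


def qualify_image(registry: str, name: str, tag: str) -> str:
    """Prefix an image name with the registry host (hypothetical helper).

    'example.azurecr.io' + 'myimage' + '1.2' -> 'example.azurecr.io/myimage:1.2'
    """
    registry = registry.strip().rstrip("/")
    # Tolerate a pasted URL; the image reference wants the bare host name.
    for prefix in ("https://", "http://"):
        if registry.startswith(prefix):
            registry = registry[len(prefix):]
    return f"{registry}/{name}:{tag}"


def push_commands(registry: str, name: str, tag: str) -> List[List[str]]:
    """Return the docker commands that retag a local image and push it."""
    local = f"{name}:{tag}"
    remote = qualify_image(registry, name, tag)
    return [
        ["docker", "tag", local, remote],
        ["docker", "push", remote],
    ]


if __name__ == "__main__":
    for cmd in push_commands("example.azurecr.io", "myimage", "1.2"):
        print(" ".join(cmd))
```

Pushing plain myimage:1.2 makes Docker aim at its default registry instead of yours, which would explain why the failure shows up as "access denied" rather than anything about naming.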

I have now got a working Docker build and push job on TeamCity. The error messages could be more helpful but I guess that is always true!

Conclusion

I assume that once this is all set up, it will work wonderfully, but the setup is a pain. Even a default, simple Dockerfile does not build and deploy easily, and I haven't even done anything difficult yet. Microsoft should definitely be better at this: they own all the elements of the build apart from Docker, and their Pipelines product is really unusable without serious training.

On the other hand, people like JetBrains and others need to keep up with the trend. Containers have been a thing for a long time and for such a simple deployment model, it should just work. When it doesn't, error messages need to be clear and log files verbose so that people can work out exactly what is wrong.