After having played around with Docker a bit more, I have learned some more things about optimising Dockerfiles to make builds more efficient.
As mentioned previously, Docker has a cacheable layer system built in, which makes sense, since you are not just building e.g. a web app but, in a sense, building the entire OS, additional modules/plugins/runtimes and then your app on top. The build could easily take 10 minutes or more if you run the whole lot, so you clearly don't want to do that just because you changed one file.
So you need to understand how Docker builds and caches layers so that you can work out how to make the optimal balance between build times and flexibility.
Each time Docker reaches a command like COPY or RUN, it creates a new layer. This layer is built in an "intermediate" container, which is deleted by default once it has been used, since the resulting layer, which is really just a directory diff, is then cached locally. The next time you build the Dockerfile, if Docker detects that the source image and any copied files have not changed, it simply takes the cached layer and applies it directly. This is the difference between running e.g. npm install, which might take 30 seconds or more, and the roughly 1 second it takes to detect the lack of changes and take the layer from the cache.
What this means in crude terms is that you should put the least frequently changing content as early in the Dockerfile as possible. The source image (often an SDK image for build purposes) has to come first, so you can't really move that regardless of how often it changes, but you can decide whether to pass --pull to docker build in order to fetch a later version of the tag if one exists; otherwise Docker will use an already downloaded image if that tag exists locally.
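For example (the image name and tag here are just placeholders, not a real project):

```shell
# Without --pull, Docker reuses a locally cached base image for the tag
docker build -t myapp:latest .

# With --pull, Docker checks the registry for a newer image behind the tag
docker build --pull -t myapp:latest .
```

Note that --pull only affects the base image lookup; the rest of the layer cache still works as described above.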
On top of that, it is best to add next any global tools you might need for your build but which are unlikely to change. For example, I install a specific version of npm and then install gulp as a global package. Since npm install and npm run dev are both quite slow, I do these next, because we don't change the package dependencies as often as we change our code files (you might need this the other way round).
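Ordered that way, a build stage might look something like this sketch (the Node image, npm version and paths are illustrative, not the actual file):

```dockerfile
FROM node:20 AS build            # base image: changes least often

# Global tools that rarely change
RUN npm install -g npm@10 gulp

WORKDIR /src

# Dependency manifests change less often than code, so copy them first
COPY package.json package-lock.json ./
RUN npm install                  # cached unless the manifests change

# Source code changes most often, so it comes last
COPY . .
RUN npm run dev
```

With this ordering, editing a source file only invalidates the final COPY and RUN layers; the slow npm install layer is taken from the cache.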
Even when you then get to the point where you are using a different image for the runtime, you should still put the least frequently changing content earlier on, for the same reason: so the cached layers can be reused.
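In a multi-stage build, that means the runtime stage follows the same pattern: stable base image and setup first, build output last. A sketch, assuming a .NET-style build/runtime split (the image tags and MyApp.dll are hypothetical):

```dockerfile
# Build stage uses the larger SDK image
FROM mcr.microsoft.com/dotnet/sdk:8.0 AS build
WORKDIR /src
COPY . .
RUN dotnet publish -c Release -o /app/publish

# Runtime stage: stable base first, frequently changing build output last
FROM mcr.microsoft.com/dotnet/aspnet:8.0
WORKDIR /app
COPY --from=build /app/publish .
ENTRYPOINT ["dotnet", "MyApp.dll"]
```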
Another source of slowness comes from manually copying a lot of files one after the other. To support the caching, each command (e.g. COPY) creates a new layer via a temporary container, which is quite slow and expensive. In most cases, especially when running a combination of related commands, you should combine them into a single multi-line command so they run as one layer. You need to be careful with copying files, though: firstly, the COPY globbing pattern takes some getting used to (make sure you test that it does what you expect!), and secondly, if you combine too much, the entire step has to be repeated if any part of it changes. Otherwise, just order the COPY commands from least likely to change to most likely!
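A sketch of both ideas (the package and directory names are illustrative):

```dockerfile
# One RUN layer instead of three separate ones
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Separate COPY commands, ordered from least to most likely to change,
# so an edit to src/ doesn't invalidate the config/ layer
COPY config/ ./config/
COPY src/ ./src/
```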
And lastly, make sure you keep testing and optimising. Changing just one .cs file in my build and running docker build is taking 10 minutes. I'm not sure why it is so slow; even copying the output from the build into the runtime image seems to take 30 seconds, which doesn't sound right. I do have a Dockerfile that builds in Debug for tests (which are not currently running) and then in Release for deployment, so two builds are obviously going to be slower, but it is still too slow for my liking!