I have been wrestling with getting Envoy proxy to work with a .NET Core gRPC service and really struggling. The .NET Core end is surprisingly easy and works when called from a test client. Configuring Envoy is the painful part, especially when you don't know how to enable the various kinds of logging.
I had no end of problems, one after the other, understanding ports, hostnames, TLS setup etc., and this wasn't helped by Envoy's rather reference-style documentation, which doesn't include the rich guides that some frameworks have for setting up specific recipes. I am also wary of systems that do so much by default: if you forget to flick a switch, nothing goes bang, it just does what it wants, leaving you unsure whether you have misconfigured it or whether the problem is elsewhere.
Anyway, the perpetual problem was this error:
upstream connect error or disconnect/reset before headers. reset reason: remote reset
Which can mean pretty much anything that stops the proxy from talking to the web service. A wrong port or hostname is one candidate, although that usually, but not always, gives connection refused instead. I only worked it out by enabling another load of logging on the web service as well. That shows this error:
The request :scheme header 'http' does not match the transport scheme 'https'
Which fortunately makes sense, but why is it happening? I then learned how to override the entrypoint of the Envoy container to enable debug logging. It's a shame you can't set this in the config (unless it just isn't documented well enough):
ENTRYPOINT /docker-entrypoint.sh envoy -c /etc/envoy/envoy.yaml -l debug
This prints a load of stuff in the logs for the Envoy container, but with the extra detail it is really easy to see what is happening. Did it resolve the hostname? Did it manage to make a TLS connection? In my case, I could see that the service was called and responded with a 200 message containing a gRPC error, which was the weird one about remote reset. More importantly, I could see the request headers and saw that it was sending :scheme http instead of https. Why? I would have thought the TLS setup would set this automatically, and if not, I couldn't find any documentation about it, so maybe not many people are using HTTP/2 + TLS + Kestrel for their control planes? Maybe I need to disable the scheme check on the web service, if possible?
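The first few questions above (does the hostname resolve, is anything listening on the port, does a TLS handshake succeed and offer HTTP/2) can also be checked from outside Envoy. Below is a rough Python sketch of such a probe, using only the standard library. The function name probe_upstream is made up for this post, and certificate verification is deliberately switched off on the assumption that the service uses a self-signed development certificate.

```python
import socket
import ssl


def probe_upstream(host, port, timeout=3.0):
    """Report how far a TLS + HTTP/2 connection to host:port gets."""
    result = {
        "resolved": False,   # hostname lookup worked
        "connected": False,  # TCP connection was accepted
        "tls": False,        # TLS handshake completed
        "alpn": None,        # protocol negotiated via ALPN, e.g. "h2"
        "error": None,       # first failure encountered, if any
    }
    try:
        # Step 1: does the hostname resolve at all?
        socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
        result["resolved"] = True

        # Step 2: is anything listening on that port?
        with socket.create_connection((host, port), timeout=timeout) as raw:
            result["connected"] = True

            # Step 3: does a TLS handshake succeed, and is HTTP/2 offered?
            # Verification is disabled because this is only a debugging aid
            # for self-signed dev certificates (an assumption of this sketch).
            ctx = ssl.create_default_context()
            ctx.check_hostname = False
            ctx.verify_mode = ssl.CERT_NONE
            ctx.set_alpn_protocols(["h2", "http/1.1"])
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                result["tls"] = True
                result["alpn"] = tls.selected_alpn_protocol()
    except OSError as exc:
        # DNS failures, refused connections, timeouts and ssl.SSLError
        # are all subclasses of OSError.
        result["error"] = f"{type(exc).__name__}: {exc}"
    return result
```

Calling something like probe_upstream("my-grpc-service", 5001) (both values are placeholders for your own upstream) returns a dictionary showing the first step that failed. It would not have revealed the :scheme problem, since that only shows up in the request headers, but it does rule out the earlier suspects quickly.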
Anyway, I raised a bug on the Envoy project, so let's see what happens with it!
ABL - Always be logging!