Understanding CaddyServer and the ACME HTTP challenge
I had an issue where my JustNappies.io naked domain was timing out after 60 seconds with an HTTP 522 from Cloudflare.
An HTTP 522 from Cloudflare means Cloudflare tried to connect to my server to pass the request along, but the connection timed out because the server did not respond.
My setup is a Rails app deployed on a Hetzner server. My app is served using Caddy as a webserver. The Hetzner firewall is enabled. I have Cloudflare sitting in front of my server too. A request goes kinda like this: browser, then Cloudflare, then the Hetzner firewall, then Caddy, then the Rails app.
I have a preference to serve up all my requests on HTTPS and initially set all my Cloudflare/Hetzner Firewall config to allow HTTPS only. This was a mistake.
Browsers expect you to have valid certs when serving over HTTPS, and by default Caddy serves all sites over HTTPS. My setup uses Caddy's On-Demand TLS, which dynamically obtains a new certificate during the first TLS handshake that requires it.
CaddyServer uses the ACME protocol to automatically get valid HTTPS certificates signed by Let's Encrypt, so my site shows up as valid in the browser.
My Caddyfile is set up to use the ACME HTTP challenge. This challenge requires port 80 to be externally accessible. Remember this: port 80.
Caddy and the ACME HTTP Challenge
With Caddy’s on-demand TLS this is what happens when the first person hits my site:
Caddy (running on my server) kicks off the ACME HTTP challenge and tells Let's Encrypt, 'you can verify I own the domain justnappies.io by invoking http://justnappies.io/.well-known/acme-challenge/xxx'
Let's Encrypt invokes the above http://justnappies.io endpoint over HTTP port 80
Let's Encrypt sees the expected resource, and a certificate is issued and cached by Caddy
My app continues to serve up the response to the user
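The happy path above can be sketched as a small local simulation. This is only a sketch under stated assumptions, not Caddy's or Let's Encrypt's actual code: the helper names make_challenge_handler and validate_http_challenge and the token values are made up for illustration. In the real protocol (RFC 8555) the expected response is derived from the token and the ACME account key.

```python
import urllib.request
from http.server import BaseHTTPRequestHandler

# Path prefix defined by the ACME HTTP challenge.
CHALLENGE_PREFIX = "/.well-known/acme-challenge/"


def make_challenge_handler(responses):
    """Play the web server's role (Caddy, in this post).

    `responses` maps a challenge token to the body that should be served
    for it. Both are hypothetical values supplied by the caller.
    """

    class ChallengeHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            token = None
            if self.path.startswith(CHALLENGE_PREFIX):
                token = self.path[len(CHALLENGE_PREFIX):]
            body = responses.get(token)
            if body is None:
                self.send_response(404)
                self.end_headers()
                return
            data = body.encode("ascii")
            self.send_response(200)
            self.send_header("Content-Type", "application/octet-stream")
            self.send_header("Content-Length", str(len(data)))
            self.end_headers()
            self.wfile.write(data)

        def log_message(self, format, *args):
            pass  # keep the demo quiet

    return ChallengeHandler


def validate_http_challenge(base_url, token, expected, timeout=5.0):
    """Play the certificate authority's role.

    Fetch the challenge URL over plain HTTP and compare the body with the
    expected value. Any connection failure or timeout counts as a failed
    validation, so no certificate would be issued.
    """
    url = base_url + CHALLENGE_PREFIX + token
    # Ignore any system proxy so the request goes straight to the host.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout) as resp:
            return resp.read().decode("ascii").strip() == expected
    except OSError:
        return False
```

Against a server that is reachable over plain HTTP the validation succeeds. If nothing answers on that port, the fetch fails and the function returns False.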
Now that's what SHOULD have happened, but what ACTUALLY happened was:
Caddy (running on my server) tells Let's Encrypt, 'you can verify I own the domain justnappies.io by invoking http://justnappies.io/.well-known/acme-challenge/xxx'
Let's Encrypt attempts to invoke this http://justnappies.io endpoint but is blocked by Cloudflare, which only allowed HTTPS
Let's Encrypt attempts to invoke this http://justnappies.io endpoint but is blocked by the Hetzner firewall, which only allowed HTTPS
Phew, painful. My firewalls were blocking the ACME challenge on HTTP port 80, so Caddy never got a certificate and requests to my Rails app were blocked too. Every request was timing out.
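One way to confirm this kind of blockage is to test whether ports 80 and 443 accept TCP connections from outside. The sketch below is an assumption about how one might check, and port_is_open and check_web_ports are hypothetical helper names. Because the domain resolves to Cloudflare, it would need to be pointed at the server's own IP address to test the Hetzner firewall itself.

```python
import socket


def port_is_open(host, port, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and DNS failures.
        return False


def check_web_ports(host, ports=(80, 443), timeout=5.0):
    """Return a dict mapping each port to whether it accepted a connection."""
    return {port: port_is_open(host, port, timeout) for port in ports}
```

With a firewall that only allows HTTPS, check_web_ports on the server's IP would be expected to report port 443 as open and port 80 as closed.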
The Solution
As soon as I realised it was my firewalls blocking the ACME challenge, it was super quick to remediate. I knew nothing about CaddyServer, Caddyfiles and the ACME protocol before this kerfuffle.
Remediation Steps taken
Enable HTTP port 80 on Cloudflare
Enable HTTP port 80 on the Hetzner firewall
(Optional) Set Rails to only be accessible over HTTPS (config.force_ssl = true)
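With port 80 open and HTTPS forced, a plain HTTP request should be answered with a redirect to HTTPS instead of timing out. A minimal sketch for checking that, assuming the server replies with a standard redirect status and a Location header (http_redirect_target and redirects_to_https are hypothetical helper names):

```python
import http.client


def http_redirect_target(host, port=80, path="/", timeout=5.0):
    """Send one plain-HTTP GET without following redirects.

    Returns (status, location), where location is the Location header
    value or None if the response has no such header.
    """
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        return resp.status, resp.getheader("Location")
    finally:
        conn.close()


def redirects_to_https(host, port=80, path="/", timeout=5.0):
    """Return True if the plain-HTTP response redirects to an https:// URL."""
    status, location = http_redirect_target(host, port, path, timeout)
    return status in (301, 302, 307, 308) and (location or "").startswith("https://")
```

Calling redirects_to_https with the site's hostname is a quick sanity check that HTTP is reachable again and that visitors still end up on HTTPS.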