acme-dns challenge: timeouts? #707

Closed
lf- opened this issue Nov 7, 2018 · 5 comments

lf- commented Nov 7, 2018

I'm running a version of lego that I compiled this evening from lego-git on the AUR, and I'm observing a timeout. The command is:

ACME_DNS_API_BASE=http://127.0.0.1:2500 ACME_DNS_STORAGE_PATH=/var/lib/lego/acme-dns-accounts.json lego --email <spambotsgoaway> -a --dns acme-dns --domains '*.lfcode.ca' --path /var/lib/lego run

Output:

 2018/11/07 06:21:52 [INFO] [*.lfcode.ca] acme: Obtaining bundled SAN certificate 
2018/11/07 06:21:53 [INFO] [*.lfcode.ca] AuthURL: https://acme-v02.api.letsencrypt.org/acme/authz/
2018/11/07 06:21:53 [INFO] [lfcode.ca] acme: Preparing to solve DNS-01 
2018/11/07 06:21:53 [INFO] [lfcode.ca] acme: Trying to solve DNS-01  
2018/11/07 06:21:53 [INFO] [lfcode.ca] Checking DNS record propagation using [1.1.1.1:53 1.0.0.1:53 [2606:4700:4700::1111]:53 [2606:4700:4700::1001]:53] 
2018/11/07 06:21:53 [INFO] Wait [timeout: 1m0s, interval: 2s]   
2018/11/07 06:22:53 Could not obtain certificates
         acme: Error -> One or more domains had a problem:
[lfcode.ca] time limit exceeded: last error: could not determine the zone: unexpected response code 'SERVFAIL' for 4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.

acme-dns logs (don't worry about the timestamps; these are the same lines as were observed on the first run):

Nov 07 06:41:13 abyss acme-dns[19514]: time="2018-11-07T06:41:13Z" level=debug msg="TXT updated" subdomain=4442248e-9706-4050-9910-b1f3bde0f362 txt=_GMMgk1r8WTHq7SDDpRltBLqy9Qr4yLn3QMCAWjrGAc
<...>
Nov 07 06:26:34 abyss acme-dns[19514]: time="2018-11-07T06:26:34Z" level=info msg="Answering TXT question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.
Nov 07 06:26:34 abyss acme-dns[19514]: time="2018-11-07T06:26:34Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
<repeated 30 times>

Why is lego looking for SOA records? I've manually queried the domain listed there for TXT records with dig and found that the record is indeed present. I've also tried running with --dns-resolvers 127.0.0.1 to rule out any caching misbehaviour.

So, in summary:

  1. Lego appears to be updating the record to _GMM.....
  2. It is checking it, and that is indeed the value of the record.
  3. It's timing out while checking that the record is what it just set it to?

lf- commented Nov 7, 2018

@cpu

cpu commented Nov 7, 2018

Hi @lf-,

Here are some initial thoughts from reading your issue (thanks for all the detail!):

> Why is lego looking for SOA records?

Lego is trying to figure out the authoritative nameservers for your domain as described here: https://github.com/xenolf/lego/blob/4d21f8eec18d57cd029cc287ae75b1b20f49c885/acme/dns_challenge.go#L264-L265

It needs to know the authoritative nameservers so it can query them directly and confirm the TXT records are correct before spending your Let's Encrypt rate limits on a validation attempt that would fail.

The problem here is that one of the recursive nameservers Lego was using (in the text above, one of 1.1.1.1:53 1.0.0.1:53 [2606:4700:4700::1111]:53 [2606:4700:4700::1001]:53) returned a SERVFAIL in response to a SOA query for 4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.

[lfcode.ca] time limit exceeded: last error: could not determine the zone: unexpected response code 'SERVFAIL' for 4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.

The SOA lookup code in Lego considers any response other than NXDOMAIN or NOERROR to be an unexpected error, so your issuance fails:
https://github.com/xenolf/lego/blob/4d21f8eec18d57cd029cc287ae75b1b20f49c885/acme/dns_challenge.go#L284
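
To make the failure mode concrete, here is a minimal Python sketch of that kind of label-by-label SOA walk. It is not Lego's actual Go code: `find_zone`, `query_soa`, and the two fake resolvers are hypothetical names, and the canned responses are only modeled on the response codes reported in this thread.

```python
NOERROR, NXDOMAIN, SERVFAIL = "NOERROR", "NXDOMAIN", "SERVFAIL"

GUID_NAME = "4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca."


def find_zone(fqdn, query_soa):
    """Walk fqdn from the full name toward the root, asking for SOA each time.

    query_soa(name) is an assumed callback returning (rcode, has_soa_answer).
    """
    labels = fqdn.rstrip(".").split(".")
    for i in range(len(labels)):
        candidate = ".".join(labels[i:]) + "."
        rcode, has_soa = query_soa(candidate)
        # Anything other than NOERROR or NXDOMAIN aborts the whole walk.
        if rcode not in (NOERROR, NXDOMAIN):
            raise RuntimeError(
                f"unexpected response code '{rcode}' for {candidate}"
            )
        # A NOERROR answer carrying a SOA record marks the zone apex.
        if rcode == NOERROR and has_soa:
            return candidate
    raise RuntimeError(f"could not find the start of the zone: {fqdn}")


def google_like(name):
    # Simulated: NXDOMAIN for the GUID name, SOA found at acme.lfcode.ca.
    if name == "acme.lfcode.ca.":
        return (NOERROR, True)
    return (NXDOMAIN, False)


def cloudflare_like(name):
    # Simulated: SERVFAIL for the GUID name, as in the dig output above.
    if name == GUID_NAME:
        return (SERVFAIL, False)
    return google_like(name)


print(find_zone(GUID_NAME, google_like))  # acme.lfcode.ca.
try:
    find_zone(GUID_NAME, cloudflare_like)
except RuntimeError as exc:
    print(exc)
```

With the Google-like responses the walk moves past the NXDOMAIN and stops at acme.lfcode.ca.; with the Cloudflare-like responses it aborts on the very first label and never asks about the parent.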

I'm able to reliably reproduce the SERVFAIL with dig against Cloudflare's recursive resolver:

$> dig @1.1.1.1 SOA 4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca | grep "status:"
;; ->>HEADER<<- opcode: QUERY, status: SERVFAIL, id: 48657

Interestingly the query returns NXDOMAIN instead of SERVFAIL with Google DNS:

$> dig @8.8.8.8 SOA 4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca | grep "status:"
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 24733
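
If you want to repeat this comparison against several resolvers, a small Python wrapper around dig might look like the sketch below. `rcode_from_dig` and `soa_status` are hypothetical helper names, and the sketch assumes `dig` is installed and on the PATH.

```python
import re
import subprocess

# dig prints the response code on its header line, e.g. "status: SERVFAIL".
STATUS_RE = re.compile(r"status:\s*([A-Z0-9]+)")


def rcode_from_dig(output):
    """Return the response code found in dig's output, or None if absent."""
    match = STATUS_RE.search(output)
    return match.group(1) if match else None


def soa_status(resolver, name):
    """Run `dig @resolver SOA name` and return the response code it reports."""
    result = subprocess.run(
        ["dig", f"@{resolver}", "SOA", name],
        capture_output=True,
        text=True,
        check=False,
    )
    return rcode_from_dig(result.stdout)
```

Calling `soa_status("1.1.1.1", name)` and `soa_status("8.8.8.8", name)` with the same name gives a quick side-by-side view; the results depend on live DNS, so none are shown here.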

I suspect there's a misconfiguration of some sort in your DNS zone that explains this but unfortunately I don't have any more time to dig into the details this morning and can't spot anything super obvious. You might find using --dns-resolvers 8.8.8.8 with Lego will work around this in the short term by avoiding Cloudflare's resolver.

Ultimately I don't think this is a bug with Lego or the acme-dns provider. If you need more help debugging your DNS zone I'd recommend opening an issue with Joohoi's acme-dns repo or contacting Cloudflare support.

Good luck! Apologies for not having a complete solution for you!

lf- commented Nov 8, 2018

Indeed, that seems to fix it, but now I'm curious as to what the heck is happening with the DNS. Comparing Google DNS to Cloudflare, I find that the program, while using Cloudflare DNS, seems to ask for a SOA on <guid>.acme.lfcode.ca and then never follows up with a SOA question for acme.lfcode.ca.

Edit, to reword: with the Google DNS resolver, the client software gets an answer it can deal with and continues on its normal code path.

The upstream DNS zone has only the following:

  • NS record delegating acme.lfcode.ca to abyss.lfcode.ca (which is this server)
  • CNAME _acme-challenge pointing to 4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca

I have some more logs:

#### CLOUDFLARE 1.1.1.1 FAILURE
Nov 08 05:15:41 abyss acme-dns[19514]: time="2018-11-08T05:15:41Z" level=debug msg="TXT updated" subdomain=4442248e-9706-4050-9910-b1f3bde0f362 txt=Rg5QBCjfyESezUmEw4k_2N8Kqzbc5BUgyhRZNvlCmik
Nov 08 05:15:41 abyss acme-dns[19514]: time="2018-11-08T05:15:41Z" level=info msg="Answering TXT question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.
Nov 08 05:15:41 abyss acme-dns[19514]: time="2018-11-08T05:15:41Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
Nov 08 05:15:43 abyss acme-dns[19514]: time="2018-11-08T05:15:43Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
Nov 08 05:15:45 abyss acme-dns[19514]: time="2018-11-08T05:15:45Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
Nov 08 05:15:47 abyss acme-dns[19514]: time="2018-11-08T05:15:47Z" level=info msg="Answering TXT question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.
Nov 08 05:15:47 abyss acme-dns[19514]: time="2018-11-08T05:15:47Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
Nov 08 05:15:49 abyss acme-dns[19514]: time="2018-11-08T05:15:49Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
Nov 08 05:15:51 abyss acme-dns[19514]: time="2018-11-08T05:15:51Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
Nov 08 05:15:53 abyss acme-dns[19514]: time="2018-11-08T05:15:53Z" level=info msg="Answering TXT question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.
Nov 08 05:15:53 abyss acme-dns[19514]: time="2018-11-08T05:15:53Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
Nov 08 05:15:55 abyss acme-dns[19514]: time="2018-11-08T05:15:55Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
Nov 08 05:15:57 abyss acme-dns[19514]: time="2018-11-08T05:15:57Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN

#### SUCCESSFUL 8.8.8.8 ATTEMPT
Nov 08 05:16:18 abyss acme-dns[19514]: time="2018-11-08T05:16:18Z" level=debug msg="TXT updated" subdomain=4442248e-9706-4050-9910-b1f3bde0f362 txt=Rg5QBCjfyESezUmEw4k_2N8Kqzbc5BUgyhRZNvlCmik
Nov 08 05:16:18 abyss acme-dns[19514]: time="2018-11-08T05:16:18Z" level=info msg="Answering TXT question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.
Nov 08 05:16:18 abyss acme-dns[19514]: time="2018-11-08T05:16:18Z" level=debug msg="Answering question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca. qtype=SOA rcode=NXDOMAIN
Nov 08 05:16:18 abyss acme-dns[19514]: time="2018-11-08T05:16:18Z" level=debug msg="Answering question for domain" domain=acme.lfcode.ca. qtype=SOA rcode=NOERROR
Nov 08 05:16:18 abyss acme-dns[19514]: time="2018-11-08T05:16:18Z" level=debug msg="Answering question for domain" domain=acme.lfcode.ca. qtype=NS rcode=NOERROR
Nov 08 05:16:18 abyss acme-dns[19514]: time="2018-11-08T05:16:18Z" level=info msg="Answering TXT question for domain" domain=4442248e-9706-4050-9910-b1f3bde0f362.acme.lfcode.ca.
Nov 08 05:16:18 abyss acme-dns[19514]: time="2018-11-08T05:16:18Z" level=info msg="Answering TXT question for domain" domain=4442248E-9706-4050-9910-B1f3bDE0F362.acME.LfCODe.CA.

cpu commented Nov 8, 2018

> Indeed, that seems to fix it

🎉 Great! Glad to hear it :-)

> I'm curious as to what the heck is happening with the DNS

I see that you opened a thread on the Cloudflare forum for this issue (great idea) and @mnordhoff (he's a true Internet support hero, have I mentioned how much I appreciate him?) flagged a handful of issues with your DNS zone. I suspect one or more of these is the culprit here.

> I find that the program while using Cloudflare DNS seems to be asking for a SOA on <guid>.acme.lfcode.ca and then not following up with a SOA question for acme.lfcode.ca.

Indeed, Lego doesn't follow up with another query because the first query to Cloudflare's recursive resolver returns a SERVFAIL response.
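
The failing acme-dns log above fits that picture: a SOA question for the full name roughly every two seconds and nothing for the parent. A rough Python sketch of a poll-until-timeout loop shows how that pattern and the final error message can arise. `wait_for` and `FakeClock` are hypothetical names rather than Lego's API, and the 60 s / 2 s values are taken from the `Wait [timeout: 1m0s, interval: 2s]` line in the original output.

```python
class TimeLimitExceeded(Exception):
    pass


def wait_for(check, timeout, interval, clock, sleep):
    """Call check() every `interval` seconds until it succeeds or time runs out.

    Errors raised by check() are remembered, not propagated, so only the
    most recent one shows up in the final message.
    """
    deadline = clock() + timeout
    last_error = ""
    while clock() < deadline:
        try:
            if check():
                return
        except Exception as exc:
            last_error = str(exc)
        sleep(interval)
    raise TimeLimitExceeded(f"time limit exceeded: last error: {last_error}")


class FakeClock:
    """Simulated clock so the example runs instantly."""

    def __init__(self):
        self.now = 0.0

    def time(self):
        return self.now

    def sleep(self, seconds):
        self.now += seconds


clock = FakeClock()
attempts = []


def check_propagation():
    # Every attempt restarts the zone lookup and hits the same SERVFAIL.
    attempts.append(clock.now)
    raise RuntimeError(
        "could not determine the zone: unexpected response code 'SERVFAIL'"
    )


try:
    wait_for(check_propagation, timeout=60, interval=2,
             clock=clock.time, sleep=clock.sleep)
except TimeLimitExceeded as exc:
    message = str(exc)

print(len(attempts))  # 30
print(message)
```

In this simulation the check runs 30 times before the deadline, and the final error reports only the last failure, which is consistent with the repeated SOA lines in the first set of logs.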

I think we can conclusively say that this isn't a bug with Lego or Lego's acme-dns provider. @lf- would you be willing to close the issue? I'm not a maintainer and can't :-)

lf- closed this as completed Nov 8, 2018
lf- commented Nov 8, 2018

joohoi/acme-dns#127 filed. acme-dns is kind of broken. I've fixed the NS issue.
