musl dns search domain stop when current search got some error

Submitted by 王志强 on June 12, 2018, 12:54 a.m.

Details

Message ID 2b350e19.14e4.163f17c6d76.Coremail.00107082@163.com
State New
Series "musl dns search domain stop when current search got some error"

Commit Message

王志强 June 12, 2018, 12:54 a.m.
Guys,

I have an Alpine container running with the following resolv.conf:
# cat /etc/resolv.conf
nameserver 10.254.0.100
search default.svc.enn.cn svc.enn.cn default.pod.enn.cn pod.enn.cn enn.cn lan.davidkarlsen.com
options ndots:5

When I try to resolve a domain in the Alpine container, say baidu.com, the lookup fails if one of the search domains returns code 0 without answers.
I think the cause is that name_from_dns returns an error code in that case, and name_from_dns_search returns as soon as it receives any error from name_from_dns.
I tried the code change below, and it seems to fix the problem.
The DNS server sometimes sends back a response with return code 0, as the following tcpdump shows; sometimes it returns SERVFAIL or REFUSED. (Not sure why, though...)

00:36:13.165567 IP 10.254.0.100.domain > slave-2.55437: 2953 0/1/0 (106)
    0x0000:  4500 0086 ab87 4000 3e11 bb6b 0afe 0064  E.....@.>..k...d
    0x0010:  ac10 1e02 0035 d88d 0072 a95c 0b89 8180  .....5...r.\....
    0x0020:  0001 0000 0001 0000 0562 6169 6475 0363  .........baidu.c
    0x0030:  6f6d 036c 616e 0c64 6176 6964 6b61 726c  om.lan.davidkarl
    0x0040:  7365 6e03 636f 6d00 001c 0001 c01a 0006  sen.com.........
    0x0050:  0001 0000 0384 002e 036b 656e 026e 730a  .........ken.ns.
    0x0060:  636c 6f75 6466 6c61 7265 c027 0364 6e73  cloudflare.'.dns
    0x0070:  c043 78e1 741a 0000 2710 0000 0960 0009  .Cx.t...'....`..
    0x0080:  3a80 0000 0e10                           :.....
00:36:13.170095 IP 10.254.0.100.domain > slave-2.55437: 2637 0/1/0 (106)
    0x0000:  4500 0086 ab88 4000 3e11 bb6a 0afe 0064  E.....@.>..j...d
    0x0010:  ac10 1e02 0035 d88d 0072 aab3 0a4d 8180  .....5...r...M..
    0x0020:  0001 0000 0001 0000 0562 6169 6475 0363  .........baidu.c
    0x0030:  6f6d 036c 616e 0c64 6176 6964 6b61 726c  om.lan.davidkarl
    0x0040:  7365 6e03 636f 6d00 0001 0001 c01a 0006  sen.com.........
    0x0050:  0001 0000 0384 002e 036b 656e 026e 730a  .........ken.ns.
    0x0060:  636c 6f75 6466 6c61 7265 c027 0364 6e73  cloudflare.'.dns
    0x0070:  c043 78e1 741a 0000 2710 0000 0960 0009  .Cx.t...'....`..



Thanks
David

Patch

diff --git a/src/network/lookup_name.c b/src/network/lookup_name.c
index 209c20f..abb7da5 100644
--- a/src/network/lookup_name.c
+++ b/src/network/lookup_name.c
@@ -202,7 +202,7 @@  static int name_from_dns_search(struct address buf[static MAXADDRS], char canon[
                        memcpy(canon+l+1, p, z-p);
                        canon[z-p+1+l] = 0;
                        int cnt = name_from_dns(buf, canon, canon, family, &conf);
-                       if (cnt) return cnt;
+                       if (cnt > 0 || cnt == EAI_AGAIN) return cnt;
                }
        }


Comments

Markus Wichmann June 12, 2018, 4:12 a.m.
On Tue, Jun 12, 2018 at 08:54:13AM +0800, 王志强 wrote:
> Guys,
> 
> I have an Alpine container running with the following resolv.conf:
> # cat /etc/resolv.conf
> nameserver 10.254.0.100
> search default.svc.enn.cn svc.enn.cn default.pod.enn.cn pod.enn.cn enn.cn lan.davidkarlsen.com
> options ndots:5
> 
> When I try to resolve a domain in the Alpine container, say baidu.com, the lookup fails if one of the search domains returns code 0 without answers.

Let me stop you there. I think we already had this discussion once, but
here goes: code 0 means "Name exists". No answers mean "No record of
this type exists". Therefore, if one of your local resolvers does that,
it means to tell you that the name exists, just no records of type A or
AAAA or CNAME. If the name actually does not exist, then this is a bug
in the DNS server and should be fixed there.  And in the meantime you
can drop the offending server from your search list.

> Thanks
> David

Ciao,
Markus

William Pitcock June 12, 2018, 4:53 a.m.
Hello,

On Mon, Jun 11, 2018 at 11:12 PM, Markus Wichmann <nullplan@gmx.net> wrote:
> On Tue, Jun 12, 2018 at 08:54:13AM +0800, 王志强 wrote:
>> Guys,
>>
>> I have an Alpine container running with the following resolv.conf:
>> # cat /etc/resolv.conf
>> nameserver 10.254.0.100
>> search default.svc.enn.cn svc.enn.cn default.pod.enn.cn pod.enn.cn enn.cn lan.davidkarlsen.com
>> options ndots:5
>>
>> When I try to resolve a domain in the Alpine container, say baidu.com, the lookup fails if one of the search domains returns code 0 without answers.

davidkarlsen.com is hosted on Cloudflare which appears to have broken
their DNS again.

William

Florian Weimer June 13, 2018, 5:26 a.m.
* Markus Wichmann:

> Let me stop you there. I think we already had this discussion once, but
> here goes: code 0 means "Name exists". No answers mean "No record of
> this type exists". Therefore, if one of your local resolvers does that,
> it means to tell you that the name exists, just no records of type A or
> AAAA or CNAME. If the name actually does not exist, then this is a bug
> in the DNS server and should be fixed there.

NODATA (RCODE 0 without any data) for non-existing names is part of
the DNS protocol as it is deployed, for various reasons (empty
non-terminals, enumeration protection, online signing).