Bug #3294
Closed: drill(1) with IPv6 NS fails with UDP but works with TCP
Description
YONETANI Tomokazu reported this issue on users@ mailing list: https://lists.dragonflybsd.org/pipermail/users/2021-August/404805.html
$ drill @2001:4860:4860::8888 aaaa leaf.dragonflybsd.org | egrep -v '^(\;|$)'
Error: error sending query: Could not send or receive, because of network error
unless using a TCP query:
$ drill -t @2001:4860:4860::8888 aaaa leaf.dragonflybsd.org | egrep -v '^(\;|$)'
leaf.dragonflybsd.org. 3599 IN AAAA 2001:470:1:43b:1::68
Similar DNS queries on other boxes running different OSes don't have the same problem, and tcpdump output shows the response from the DNS server, so I doubt it's a network issue.
$ uname -a
DragonFly c60 6.0-RELEASE DragonFly v6.0.0.33.gc7b638-RELEASE #0: Wed Aug 4 20:25:25 JST 2021 root at c60:/usr/obj/build/usr/src/sys/X86_64_GENERIC x86_64
I also confirmed this issue on leaf, which is running master as of Aug 4.
Updated by y0n3t4n1 almost 3 years ago
I spent some time playing with LDNS example code and comparing the net.c with resolv/res_send.c in libc, and found that LDNS sends the query with sendto, while the libc resolver uses connect+send, unless RES_INSECURE1 is specified.
So with the code at the bottom of this comment, this succeeds:
./r aaaa leaf.dragonflybsd.org.
while this fails:
./r -1 aaaa leaf.dragonflybsd.org.
#include <sys/types.h>
#include <netinet/in.h>
#include <arpa/nameser.h>
#include <err.h>
#include <errno.h>
#include <resolv.h>
#include <stdio.h>
#include <unistd.h>
int main(int ac, char **av)
{
	u_char answer[1024];
	int ch;
	const char *me = av[0];
	if (res_init() != 0)
		err(-1, "res_init");
	while ((ch = getopt(ac, av, "12t")) != -1) {
		switch (ch) {
		case '1':
			_res.options |= RES_INSECURE1;
			break;
		case '2':
			_res.options |= RES_INSECURE2;
			break;
		case 't':
			_res.options |= RES_USEVC;
			break;
		default:
			errx(-1, "unknown switch %c", ch);
		}
	}
	ac -= optind, av += optind;
	if (ac < 2)
		errx(-1, "usage: %s resource type domain...", me);
	int ok = 0;
	int query_type = res_nametotype(*av, &ok);
	if (!ok)
		errx(-1, "unknown query type: %s", *av);
	fp_resstat(&_res, stdout);
	while (++av, --ac > 0) {
		int l = res_query(*av, C_IN, query_type, answer, sizeof answer);
		if (l == -1) {
			warnx("res_query: %s", *av);
			continue;
		}
		res_pquery(&_res, answer, l, stdout);
	}
}
Updated by y0n3t4n1 almost 3 years ago
On DragonFlyBSD, the UDP packet sent from sendto has no flowlabel (0x00000) even if net.inet6.ip6.auto_flowlabel is left set (the default).
On DragonFlyBSD, tcpdump shows the empty flowlabel (the 3 octets following the first 0x60):
08:58:15.240037 IP6 (hlim 64, next-header UDP (17) payload length: 16) ::1.2082 > ::1.3456: [udp sum ok] udp/vt 8 69 / 21 [|vat]
0x0000: 6000 0000 0010 1140 0000 0000 0000 0000 `......@........
0x0010: 0000 0000 0000 0001 0000 0000 0000 0000 ................
0x0020: 0000 0000 0000 0001 0822 0d80 0010 c00c ........."......
0x0030: 5445 5354 4041 4243 TEST@ABC
while on another system (namely, WSL)
23:51:23.201149 IP6 (flowlabel 0x33ba2, hlim 64, next-header UDP (17) payload length: 16) ::1.40700 > ::1.3456: [bad udp cksum 0x0023 -> 0x2932!] udp/vt 8 69 / 21 [|vat]
0x0000: 6003 3ba2 0010 1140 0000 0000 0000 0000 `.;....@........
0x0010: 0000 0000 0000 0001 0000 0000 0000 0000 ................
0x0020: 0000 0000 0000 0001 9efc 0d80 0010 0023 ...............#
0x0030: 5445 5354 4041 4243 TEST@ABC
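The flowlabel in the two dumps above can be read off mechanically from the first four bytes of the IPv6 header. The following is only a minimal sketch for illustration; the struct and function names are made up here and are not part of any system header. Strictly speaking, the flow label is the low 20 bits of the first 32-bit word, with the 4-bit version and 8-bit traffic class in front of it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fields packed into the first 32-bit word of an IPv6 header. */
struct ip6_word0 {
	unsigned version;       /* 4 bits, 6 for IPv6 */
	unsigned traffic_class; /* 8 bits */
	uint32_t flow_label;    /* 20 bits */
};

/* Decode the first four header bytes as they appear in a tcpdump -X dump. */
static struct ip6_word0
ip6_decode_word0(const uint8_t hdr[4])
{
	struct ip6_word0 w;
	uint32_t word;

	assert(hdr != NULL);
	/* Assemble the word in network (big-endian) byte order. */
	word = ((uint32_t)hdr[0] << 24) | ((uint32_t)hdr[1] << 16) |
	    ((uint32_t)hdr[2] << 8) | (uint32_t)hdr[3];
	w.version = word >> 28;
	w.traffic_class = (word >> 20) & 0xff;
	w.flow_label = word & 0xfffff;
	return w;
}
```

Feeding it the bytes 60 00 00 00 from the DragonFly dump gives a zero flow label, while 60 03 3b a2 from the WSL dump gives the 0x33ba2 that tcpdump prints.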
sendto.c:
#include <sys/socket.h>
#include <sys/types.h>
#include <err.h>
#include <errno.h>
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
int
main(int ac, char **av)
{
	struct addrinfo hint, *ai = NULL;
	int s = -1;
	int rc;
	if (ac < 1 + 3)
		errx(1, "usage: %s host port msg", av[0]);
	const char *msg = av[3];
	size_t msglen = strlen(msg);
	memset(&hint, 0, sizeof hint);
	hint.ai_family = AF_UNSPEC;
	hint.ai_socktype = SOCK_DGRAM;
	hint.ai_flags = AI_PASSIVE;
	hint.ai_protocol = 0;
	hint.ai_canonname = NULL;
	hint.ai_addr = NULL;
	hint.ai_next = NULL;
	/* getaddrinfo() reports failures via its return value, not errno. */
	if ((rc = getaddrinfo(av[1], av[2], &hint, &ai)) != 0)
		errx(1, "getaddrinfo: %s", gai_strerror(rc));
	if ((s = socket(ai->ai_family, ai->ai_socktype, ai->ai_protocol)) == -1)
		err(errno, "socket");
	/* Unconnected socket: the destination is passed with each datagram. */
	ssize_t sent = sendto(s, msg, msglen, 0, ai->ai_addr, ai->ai_addrlen);
	if (sent == -1)
		err(errno, "sendto");
	close(s);
	freeaddrinfo(ai);
	return 0;
}
To build:
make sendto CFLAGS='-W -Wall' && ./sendto ::1 3456 TEST@ABC
To observe (on a Linux system, change lo0 to lo):
sudo tcpdump -Xnvs 1500 -i lo0 udp 3456
Updated by y0n3t4n1 almost 3 years ago
Hmm, forget about the flowlabel: the C code in the first comment works with an empty flowlabel when net.inet6.ip6.auto_flowlabel=0 is set, so it's simply a matter of connect+send vs sendto (as I mentioned earlier). Not sure how it used to be OK, as apparently there hasn't been a dport-specific patch to convert sendto to use connect+send in dns/ldns or dns/dnsmasq.
Updated by y0n3t4n1 almost 3 years ago
sendto() (which ends up calling udp_send() or udp6_send()) behaves differently from the point of view of the hash tables involved.
When it auto-binds the local port:
- udp_send() calls in_pcbbind(nam = NULL), which calls in_pcbsetlport(), then udp_inswildcardhash(), which populates both localgroup hash and wildcard hash.
- udp6_send() only calls in6_pcbsetlport (in udp6_output)
so sendto() populates the localgroup hash, the port hash, and the wildcard hash for IPv4, but only the port hash for IPv6.
For receiving replies (which are probably neither multicast nor broadcast) on the socket used by sendto():
- udp_input() uses in_pcblookup_pkthash(wildcard = TRUE), which looks up on hashbase, calls inp_localgroup_lookup(), then looks up on wildcardhashbase
- udp6_input() uses in6_pcblookup_hash(wildcard = 1), which looks up on hashbase, then wildcardhashbase
so the possible fixes are:
- call in_pcbinswildcardhash() in udp6_output() for local port auto-bind, or
- create in_pcbinslocalgrphash() and call it in udp_output for auto-bind; call inp_localgroup_lookup() in in6_pcblookup_hash()
The change for the first fix seems to work for me.
diff --git a/sys/netinet6/udp6_output.c b/sys/netinet6/udp6_output.c
index 355735e7f9..d2627a7c96 100644
--- a/sys/netinet6/udp6_output.c
+++ b/sys/netinet6/udp6_output.c
@@ -186,9 +186,11 @@ udp6_output(struct in6pcb *in6p, struct mbuf *m, struct sockaddr *addr6,
error = EADDRNOTAVAIL;
goto release;
}
- if (in6p->in6p_lport == 0 &&
- (error = in6_pcbsetlport(laddr, in6p, td)) != 0)
- goto release;
+ if (in6p->in6p_lport == 0) {
+ if ((error = in6_pcbsetlport(laddr, in6p, td)) != 0)
+ goto release;
+ in_pcbinswildcardhash(in6p);
+ }
} else {
if (IN6_IS_ADDR_UNSPECIFIED(&in6p->in6p_faddr)) {
error = ENOTCONN;
Updated by dillon over 2 years ago
Ok, reproduced the bug over here (I had to configure an IPv6 DNS server in my /etc/resolv.conf to test it), and the patch fixes it. I think you can go ahead and commit it.
-Matt
Updated by tuxillo over 2 years ago
- Status changed from New to Closed
- Assignee set to y0n3t4n1
Applied in f6d528e84967a859764d5c145f24e98ffafbb8e9