Despite having DNSSEC, DoH or DoT to secure DNS lookups, many systems still rely on plain old DNS from 1983. Earlier this year, we were part of a larger team that audited c-ares v1.19.0. c-ares is an asynchronous DNS client library with support for a wide range of platforms. It has been around for quite some time, and a few of its more prominent users include libcurl, Node.js and Wireshark.
In this post, we delve into one specific outcome of our work, namely the weak DNS query ID generation in c-ares identified as CVE-2023-31147.
Even though plain DNS does not include any cryptographic measures for authenticity, DNS queries rely on two properties to be more resilient against forged answers [1]: a randomized source port and a randomized 16-bit query ID.
In the good old days, this was not the default. For instance, DNS resolvers used fixed source ports, allowing attackers to focus solely on correctly predicting the 16-bit query ID. The practical exploitation of this vulnerability was most prominently shown by Dan Kaminsky in 2008 through the publication of CVE-2008-1447. Thus, selecting the source port and query ID using a cryptographically secure random number generator (CSPRNG) is crucial: It makes it quite hard for attackers to construct and inject a malicious response before the real response arrives at the client.
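To put rough numbers on this, the following sketch counts how many (source port, query ID) pairs an off-path attacker has to choose from. The function name dns_guess_space is hypothetical, and the port count of 28232 used in the note below assumes the default Linux ephemeral port range of 32768 to 60999; real ranges depend on the OS and its configuration.

```c
#include <assert.h>
#include <stdint.h>

/* Number of (source port, query ID) pairs a blind attacker must guess from.
 * port_count is the number of source ports the resolver may pick from;
 * a value of 1 models a resolver with a fixed source port. */
uint64_t dns_guess_space(uint32_t port_count)
{
    assert(port_count >= 1 && port_count <= 65536);
    /* The query ID is a 16-bit field, so it contributes 2^16 values. */
    return (UINT64_C(1) << 16) * port_count;
}
```

With a fixed source port, dns_guess_space(1) is 65536. With 28232 possible ports, dns_guess_space(28232) is 1850212352, which is why both values need to be unpredictable.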
When auditing a DNS protocol implementation like c-ares, we always check if these mitigations are properly implemented. In the context of this blog post, we primarily focus on random DNS query identifiers, as source port randomization is nowadays the default and is managed by the OS [2].
The high-level design of DNS query ID generation in c-ares appears fairly simple to users: upon initialization, c-ares collects random bytes from the OS's CSPRNG. These bytes serve as the seed for an internal CSPRNG. This internal CSPRNG is then used to generate the 16-bit DNS query ID for each individual query. The CSPRNG state is stored in the opaque type ares_channel and updated every time a DNS query ID is generated.
Once we start looking at the code more closely, we realize that not everything is well designed. First, we notice that DNS query IDs are generated using a pseudo-random number generator (PRNG) based on RC4.
In case you’re too young or it’s been a while: RC4 is a stream cipher designed in 1987 that gained widespread usage over the years. However, it has repeatedly been shown to be flawed and insecure. It is also the reason why the Wi-Fi protocols WPA-TKIP and WEP were both famously broken. So it is not a cipher you’d use in 2023.
Using RC4 as a cryptographically secure PRNG (CSPRNG) was also quite common some time ago: it was the core of arc4random(3), which originated in OpenBSD and is nowadays part of all BSD descendants, including Apple’s macOS and iOS. However, with the discovery of more and more ways in which the RC4 key stream is biased and not properly random, it became clear that RC4 was unsuitable for use as a CSPRNG. As a result, today’s implementations use CSPRNGs based on ChaCha20 or AES [3].
Shifting our focus back to c-ares, there is more to consider. While functions like arc4random(3) ensure that new entropy is added after a certain amount of random bytes has been generated, c-ares takes a different approach: it simply seeds the PRNG once and uses it throughout the entire lifespan of the ares_channel. This may not be a concern for short-lived tools like the dig version of c-ares, since the process never lives long enough. For services which run for a much longer time and perform a lot of DNS queries, this might be a different story, especially if such a service initializes a single ares_channel upon startup and uses it until it is stopped. While this is probably not really an issue with c-ares, it is common practice to reseed the PRNG after generating a certain amount of bytes.
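The reseeding practice mentioned above can be sketched as a simple byte budget. This is only an illustration under assumed names (reseed_budget, reseed_budget_init, reseed_budget_take); it is not how c-ares or arc4random(3) implement it, and the choice of interval is left to the caller.

```c
#include <assert.h>
#include <stddef.h>

/* Tracks how many bytes a PRNG has produced since its last reseed. */
typedef struct {
    size_t interval; /* maximum bytes between reseeds (hypothetical knob) */
    size_t used;     /* bytes produced since the last reseed, <= interval */
} reseed_budget;

void reseed_budget_init(reseed_budget *b, size_t interval)
{
    assert(interval > 0);
    b->interval = interval;
    b->used = 0;
}

/* Announce that n bytes are about to be generated.
 * Returns 1 if the caller should first fetch fresh entropy from the OS
 * and rekey its PRNG, or 0 if the current seed may still be used. */
int reseed_budget_take(reseed_budget *b, size_t n)
{
    if (n > b->interval - b->used) {
        /* Budget exhausted: start a new period that already contains n. */
        b->used = n < b->interval ? n : b->interval;
        return 1;
    }
    b->used += n;
    return 0;
}
```

A long-running service would call reseed_budget_take(&b, 2) before each 16-bit query ID and reseed whenever it returns 1.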
With our interest piqued, we can start looking more closely into how RC4 is seeded. This is done in the function init_id_key(rc4_key* key, int key_data_len), which is handed a pointer to the rc4_key type stored in the ares_channel and a number dubbed key_data_len.
We notice that the buffer key_data_ptr is actually useless, since it is never populated with anything other than zero bytes. So we can ignore it in the calculation of index2 in line 25.
Furthermore, the 256-byte buffer key->state is filled with the numbers from 0 to 255 and then handed to a function randomize_key(...), which one would assume shuffles the contents of key->state. Also notice that we hand key_data_len to this function. Digging through the header files, we find that its value is always ARES_ID_KEY_LEN, which is 31. So this is not the size of key->state, which is 256 bytes.
This raises the question of where the true randomness is fetched from the OS. Looking at the code above, randomize_key(...) is the only sensible candidate.
Prior to delving into that, let’s briefly compare how RC4 is implemented in OpenSSL:
We notice that init_id_key(...) in c-ares is slightly different and actually broken. OpenSSL’s key schedule implementation receives the raw key via the data buffer, initializes the state buffer d, and then shuffles it. In c-ares, key_data_ptr is the counterpart of OpenSSL’s data buffer and state is the equivalent of the pointer d. Knowing this, we can see that c-ares confused state with key_data_ptr when calling randomize_key(...), which we assume retrieves a random key.
Consequently, the 256 ARES_SWAP_BYTE(...) operations in c-ares’ key schedule are incorrectly driven by key->state only, but not by key_data_ptr. Without going into all the math details here, this definitely looks worse than what the original RC4 key schedule does, as it likely results in fewer possible permutations of key->state. We can assume that this degrades the quality of the random numbers it generates.
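For reference, here is a minimal sketch of the textbook RC4 key schedule and output step, in which every swap depends on both the state and a key byte. All names (rc4_sketch, rc4_sketch_init, rc4_sketch_next, rc4_sketch_next_id) are hypothetical; this is neither the c-ares nor the OpenSSL code, and the big-endian way two bytes are combined into a 16-bit ID is an assumption for illustration. RC4 should not be used in new designs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef struct {
    uint8_t s[256]; /* permutation of 0..255 */
    uint8_t i, j;   /* output indices */
} rc4_sketch;

static void rc4_sketch_swap(uint8_t *a, uint8_t *b)
{
    uint8_t t = *a;
    *a = *b;
    *b = t;
}

/* Textbook key schedule: the key buffer drives the shuffle of the state. */
void rc4_sketch_init(rc4_sketch *st, const uint8_t *key, size_t key_len)
{
    assert(key_len > 0);
    for (int n = 0; n < 256; n++)
        st->s[n] = (uint8_t)n;
    uint8_t j = 0;
    for (int n = 0; n < 256; n++) {
        j = (uint8_t)(j + st->s[n] + key[(size_t)n % key_len]);
        rc4_sketch_swap(&st->s[n], &st->s[j]);
    }
    st->i = 0;
    st->j = 0;
}

/* Produce the next key stream byte. */
uint8_t rc4_sketch_next(rc4_sketch *st)
{
    st->i = (uint8_t)(st->i + 1);
    st->j = (uint8_t)(st->j + st->s[st->i]);
    rc4_sketch_swap(&st->s[st->i], &st->s[st->j]);
    return st->s[(uint8_t)(st->s[st->i] + st->s[st->j])];
}

/* Combine two key stream bytes into a 16-bit value (assumed byte order). */
uint16_t rc4_sketch_next_id(rc4_sketch *st)
{
    uint16_t hi = rc4_sketch_next(st);
    uint16_t lo = rc4_sketch_next(st);
    return (uint16_t)((hi << 8) | lo);
}
```

For the widely published test key "Key", the key stream starts with the bytes EB 9F 77 81, so the first value returned by rc4_sketch_next_id is 0xEB9F.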
Finally, let’s take a closer look at randomize_key(...). At first glance, this appears okay: on WIN32 targets, the OS is queried for key_data_len bytes of randomness, which are placed in key, which is key->state in the caller. On non-WIN32 targets, we either open the file path of CARES_RANDOM_FILE and read key_data_len bytes from there, or fall back to using rand() to get the same amount of random numbers. We assume that CARES_RANDOM_FILE is /dev/random for now.
An initial observation here is that the fallback relies on rand(3), which is not designed for generating cryptographically secure random numbers. Since it is only a fallback if nothing else works, it is better than nothing, but it would be a better choice to first try arc4random(3) on *BSD or the corresponding OS interface on Linux, just in case.
More concerning, though, is the absence of any call to srand(3) in the whole source, which would seed the PRNG used by rand(3). Without it, rand(3) outputs the same sequence of numbers every single time! This means that all our DNS query IDs will be fully predictable every time we end up in this fallback case.
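The effect is easy to reproduce. The C standard specifies that calling rand() before any call to srand() yields the same sequence as after srand(1), so an unseeded program replays identical values on every run. The helper ids_from_rand below is hypothetical and only mimics deriving 16-bit values from rand(); in c-ares the rand() output seeds the RC4 state rather than being used as the ID directly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Fill out[0..n-1] with 16-bit values taken from the libc rand() stream. */
void ids_from_rand(unsigned short *out, size_t n)
{
    assert(out != NULL || n == 0);
    for (size_t k = 0; k < n; k++)
        out[k] = (unsigned short)(rand() & 0xffff);
}
```

Calling srand(1) followed by ids_from_rand(a, 8), and then repeating both calls with a second buffer, fills the two buffers with identical values. A process that never calls srand() starts from that same state, so an attacker who knows the libc can compute the sequence in advance.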
Looking at the CARES_RANDOM_FILE case, it becomes evident that any error with fread(3) is silently ignored and results again in rand(3) being used. While c-ares does the best it can in this case, it probably shouldn’t fail silently. At least a few sysadmins would want to know that their DNS queries are all predictable due to some configuration issue.
Finally, there is one more thing in the above code. Whenever CARES_RANDOM_FILE is not set, it automatically falls back to using rand(3) for seeding the RC4 PRNG. That this can be a problem becomes apparent when we look at the Autotools configuration:
    dnl Check for user-specified random device
    AC_ARG_WITH(random,
    AS_HELP_STRING([--with-random=FILE],
                   [read randomness from FILE (default=/dev/urandom)]),
        [ CARES_RANDOM_FILE="$withval" ],
        [
            dnl Check for random device. If we're cross compiling, we can't
            dnl check, and it's better to assume it doesn't exist than it is
            dnl to fail on AC_CHECK_FILE or later.
            if test "$cross_compiling" = "no"; then
              AC_CHECK_FILE("/dev/urandom", [ CARES_RANDOM_FILE="/dev/urandom"] )
            else
              AC_MSG_WARN([cannot check for /dev/urandom while cross compiling; assuming none])
            fi
        ]
    )
As we can see, the existence of /dev/urandom is determined at compile time. This will likely break in cross-compile situations: the check is skipped entirely when cross-compiling, and in a native build it only succeeds if the file exists on your build host. We’ll then always use the fallback case with the RC4 PRNG seeded by rand(3). Luckily enough, c-ares also supports CMake as a build system, and this is used, for example, by the Yocto meta-oe recipe for c-ares, so not all is lost. Nevertheless, the check for /dev/urandom’s existence in c-ares should probably be done at runtime instead of at compile time.
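The two suggestions above (check for the random device at runtime, and do not fail silently) could look roughly like the following sketch. The function name read_random_file and its return convention are assumptions for illustration, not the fix that c-ares shipped.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Read exactly len bytes from the random device at path, which is chosen
 * at runtime rather than baked in at compile time.
 * Returns 0 on success and -1 if the file cannot be opened or fully read,
 * so that the caller can report the problem instead of quietly falling
 * back to a weaker source. */
int read_random_file(const char *path, unsigned char *buf, size_t len)
{
    assert(path != NULL && buf != NULL);
    FILE *f = fopen(path, "rb");
    if (f == NULL)
        return -1;
    /* Disable stdio buffering so no more than len bytes are consumed. */
    setvbuf(f, NULL, _IONBF, 0);
    size_t got = fread(buf, 1, len, f);
    fclose(f);
    return got == len ? 0 : -1;
}
```

A caller would try read_random_file("/dev/urandom", seed, sizeof seed) at startup and treat a return value of -1 as an error worth logging or propagating.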
Combining all these issues, we now know, among other things, that a missing /dev/urandom on the build host will result in the fallback case being used, even if /dev/urandom would be available on the target.
This means that DNS query IDs generated by c-ares are not fully random, and in the fallback case they can become completely predictable. Consequently, an attacker’s search space for the tuple of source port and DNS query ID is smaller, which makes an attack more likely to succeed. It basically brings us closer to the good old days of CVE-2008-1447, where source port randomization was not used by default. :-)
After reporting this issue, the c-ares maintainers published v1.19.1, which rectifies this problem (fix commit) and other recent vulnerabilities. So, better be sure to update c-ares to the latest version!
[2] Of course we also checked if the c-ares code base does anything special with respect to selecting the source port.