How DNS kills the Internet

A huge amount of Internet traffic is devoted to the Domain Name System (DNS) alone, and as the number of domain names grows, the system also becomes more complex. Due to the architecture of DNS, more complexity also means longer name resolution times. This article tries to show why this happens and to suggest improvements. Most of the points have been raised before (e.g. in Notes on the Domain Name System, RFCs, …). We try to give a clear view of the overall state of DNS and show some statistics from our measurements.

How DNS works

To simplify, this article focuses mainly on domain name to IPv4 address resolution (A records). DNS provides resolution in a hierarchical, distributed way. The following image illustrates the domain name resolution mechanism (credits to wikimedia.org).

[Figure: An example of theoretical DNS recursion]

There are 13 root name server addresses; behind them, hundreds of root servers are distributed all over the world using anycast addressing. Root servers serve the “.” domain. All domain names end with “.”; if it is omitted, a suffix is added, which for a public Internet domain name is usually “.”. Root servers don’t have the A record for the requested domain name (except for the TLD name servers themselves); instead they delegate to the Top Level Domain (TLD) name servers. Basically they respond “I don’t know, but these servers may know” (illustrated above by “Try 204.74.112.1”). The mechanism is the same for the TLD name servers and below, until the query reaches the name server that has the answer (completed in 3 queries in the picture).

Delegation

The delegation explained above (“Try 204.74.112.1”) is not exactly what happens. The name server responds with the domain name of a name server (an NS record, e.g. “Try tld1.ultradns.net.”), so this name must first be resolved before the server can be reached. With only the name, a domain could become unresolvable, so NS records are accompanied by additional A records: “oh! by the way its IP address is …” (the IP address of tld1.ultradns.net. is 204.74.112.1). These additional records are called glue records:

  • A glue record is mandatory if the name server’s domain name is under the domain it serves; such an NS record is called in-bailiwick (e.g. ns.example.com. serves example.com.).
    example.com.    IN NS ns.example.com.
    ns.example.com. IN A  1.2.3.4 (mandatory)
  • A glue record is optional when it is not; this case is called out-of-bailiwick (e.g. ns.example.net. serves example.com.). It is optional because the name server’s domain name can still be resolved by starting again at the root servers.
    example.com.    IN NS ns.example.net.
    ns.example.net. IN A  1.2.3.4 (optional)

Recursive resolver and caching

Looping through the name servers to get a response is rarely done by the application making the query; applications ask a recursive resolver to do the work for them and just get back the answer. A caching resolver stores the result of a query so that, if it receives the same query again, it can respond instantly without redoing the whole query loop. As DNS data changes over time, every record has a time to live (TTL) that tells caching servers how long it may stay in the cache.
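The caching behaviour described above can be sketched in a few lines of Python. This is only an illustration, not how any particular resolver is implemented; the class name TtlCache and its methods are made up for this example, and the clock is injectable purely so that expiry can be demonstrated without waiting.

```python
import time


class TtlCache:
    """Toy DNS answer cache: an entry is served until its TTL runs out."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock      # injectable clock (assumption, for testing)
        self._entries = {}       # (name, record type) -> (expires_at, value)

    def put(self, name, rtype, value, ttl):
        """Store an answer together with the moment it stops being valid."""
        self._entries[(name, rtype)] = (self._clock() + ttl, value)

    def get(self, name, rtype):
        """Return the cached value, or None if absent or expired."""
        entry = self._entries.get((name, rtype))
        if entry is None:
            return None          # never seen: a full resolution is needed
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[(name, rtype)]
            return None          # expired: the record must be resolved again
        return value
```

A resolver built on such a cache would call get() first and only walk the name servers when it returns None.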

Canonical NAME record (CNAME)

DNS allows a domain name to be a CNAME, which means the result of resolving a domain name may be another domain name. To obtain the IP address, this second name has to be resolved in turn, starting over at the root servers. CNAMEs are often used to make different domain names share the same IP address, but DNS administrators do not realize that this increases the resolution time. This use stems from a syntax issue in the definition of DNS records (in the zone file); it could be avoided if the zone file had a way to give the same IP address to different names without duplicating it (as everyone knows, duplicates are bad and lead to mistakes). CNAMEs are sometimes used to implement multi-addressing (for example, content delivery networks (CDN) like Akamai do that). This usage is questionable because anycast could be used instead (indeed, anycast works for TCP too, not only UDP; if you are not convinced, see TCP Anycast – Don’t believe the FUD). If the CNAME target is in the same domain, everything up to the last request will be cached, but otherwise a full sequence of queries has to be sent again (which is the case with Akamai). Misused CNAMEs may cause problems, and in general their use is discouraged (see Oversimplified DNS, which says “These should only be used when absolutely necessary”; but is there a case where it is absolutely necessary?).
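To make the cost of CNAMEs concrete, here is a minimal Python sketch that follows a chain of aliases in a toy record table. The names under the reserved example. TLD and the max_hops limit are assumptions chosen for illustration; every extra name in the returned chain stands for one more name that a resolver has to resolve.

```python
def follow_cnames(name, cname_records, max_hops=8):
    """Return the list of names visited, ending at a name without a CNAME.

    cname_records maps an alias to its canonical name. max_hops is a
    hypothetical safety limit so that a CNAME loop cannot run forever.
    """
    chain = [name]
    while name in cname_records:
        if len(chain) > max_hops:
            raise RuntimeError("CNAME chain too long (possible loop)")
        name = cname_records[name]
        chain.append(name)
    return chain


# Toy data: an alias pointing into a CDN, which aliases once more.
RECORDS = {
    "www.shop.example.": "shop.cdn.example.",
    "shop.cdn.example.": "edge7.cdn.example.",
}

print(follow_cnames("www.shop.example.", RECORDS))
```

Here the client asked for one name, but three names end up being involved before an address can be returned.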

Mail eXchange record (MX)

MX records specify the mail servers serving a domain. An MX record contains the domain names of mail servers, which have to be resolved to IP addresses. Generally, a mail server also validates a domain using the Sender Policy Framework (SPF), which consists of a DNS text (TXT) query to that domain. Thus, sending an email needs three different DNS queries: the MX query, the A query (or AAAA for IPv6) and the TXT query. CNAMEs should never be used for MX record domain names. MX records are not covered in this article, but we expect the results to be very similar to, or, as at least 3 queries are needed, even worse than what we analyzed for A queries.

Caching issues

Recursive resolvers simplify the process for applications, and if caching is in place they also reduce the network load and response times. ISPs provide such resolvers, companies normally provide one too, and most end-user systems also have one. There are also public DNS resolvers such as Google DNS (8.8.8.8 and 8.8.4.4). As they are not located at the same place as the end user, using a public resolver like Google’s may defeat anycast DNS servers that provide different results based on geographic location. In this case the response would not be optimal, as the public resolver receives the best geographic answer for itself, which may differ from the best answer for you.

DNS resolvers are not authoritative, while name servers are authoritative for the domains they serve (e.g. root servers are authoritative for “.”). Resolvers are authoritative for nothing because they only make the requests on your behalf. This means they can, intentionally or not, give false responses. You have to trust the resolver you use: your ISP should be trustworthy (otherwise change your ISP), and Google may be trustworthy as far as valid responses go, but you have to accept that they track all your queries.

Besides bugs that can result in unintentional false responses, cache poisoning (see Dan Kaminsky’s 2008 DNS vulnerability and The Hitchhiker’s Guide to DNS Cache Poisoning) has to be taken into consideration. Many DNS cache implementations cache glue records as normal records, so a DNS server may provide a malicious response that leaves a wrong entry in the cache.

For example:

evil.com.       IN NS ns.company.com.
ns.company.com. IN A 6.6.6.6

While the legitimate server says

company.com.    IN NS ns.company.com.
ns.company.com. IN A 1.2.3.4

So the wrong IP address will end up in the cache for the lifetime of the record. To avoid this problem, most caching resolver implementations do not trust the optional glue records for out-of-bailiwick NS records and ignore them, and thus ignore the ns.company.com. IN A 6.6.6.6 in the example above. A note for DNS administrators: it is good practice to have only in-bailiwick records.

These example records are served by the TLD com. name servers; it is also important to check that the domain (left side of the NS record, evil.com. or company.com.) is a subdomain of the domain served by the responding name server (com.), otherwise the cache could be poisoned by overwriting the TLD NS entry.
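The two checks described above can be sketched as a filter applied to a delegation response before anything is cached. This is a simplified illustration, not the logic of any specific resolver; the function names and the data layout (tuples and a dict) are assumptions made for the example.

```python
def is_subdomain(name, zone):
    """True if `name` equals `zone` or lies below it (absolute names)."""
    name, zone = name.lower(), zone.lower()
    if zone == ".":
        return True                      # everything is under the root
    return name == zone or name.endswith("." + zone)


def cacheable_records(server_zone, ns_records, glue):
    """Filter a delegation response before caching it.

    server_zone: zone served by the responding server, e.g. "com."
    ns_records:  list of (delegated_domain, ns_name) pairs
    glue:        dict mapping ns_name -> IP address
    """
    kept_ns, kept_glue = [], {}
    for domain, ns_name in ns_records:
        if not is_subdomain(domain, server_zone):
            continue                     # server may not delegate this domain
        kept_ns.append((domain, ns_name))
        # Glue is only trusted when the NS name is in-bailiwick.
        if ns_name in glue and is_subdomain(ns_name, domain):
            kept_glue[ns_name] = glue[ns_name]
    return kept_ns, kept_glue
```

With the poisoning example above, cacheable_records("com.", [("evil.com.", "ns.company.com.")], {"ns.company.com.": "6.6.6.6"}) keeps the NS record but drops the glue, so 6.6.6.6 never reaches the cache under the name ns.company.com.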

Ideal DNS gets … not so ideal

Ideally, you should get responses from decent, nearby DNS servers in less than 10ms. The round-trip time to our ISP is generally about 5ms, and we measured response times under 6ms for DNS queries answered by our ISP, thus less than 1ms spent at the ISP itself. To be quite conservative, let’s say a query takes at most 50ms. The shortest common path to a response is 3 queries, so we may expect a resolution to take less than 150ms. But the reality is quite different:

  1. For root servers and TLD servers (at least for most common TLDs) response time is often under 10ms, but sub-TLD servers may be much slower (hundreds of milliseconds).
  2. As demonstrated above, out-of-bailiwick NS records and CNAMEs increase the number of queries needed to resolve a domain name.

Examples

The following examples were produced with a non-caching recursive resolver that ignores glue records for out-of-bailiwick NS records. The name server used for each query is chosen randomly from the set given in the response. You may not get the optimal name server and, together with the absence of caching, this increases the response times beyond what we would measure with real caching resolvers. However, using djbdns dnscache we obtained slightly shorter but similar times (dnscache does not optimize queries in the questionable ways explained in section What cache implementations do?). All measured times are the sum of the times needed for each query, ignoring the processing time of the resolver (a good implementation should not add significant delays anyway).

  • www.swisscom.ch. resolved in 3 queries in 35.2ms
  • www.google.com. resolved in 6 queries in 327.8ms (1 out-of-bailiwick record)
  • www.nagra.com. resolved in 12 queries in 1007.0ms (3 out-of-bailiwick records)
  • www.kudelskisecurity.com. resolved in 18 queries in 1982.7ms (4 out-of-bailiwick records, 1 CNAME)
  • www.digikey.ch. resolved in 22 queries in 2004.2ms (5 out-of-bailiwick records, 1 CNAME)
  • pic.dhe.ibm.com. resolved in 29 queries in 2536.3ms (6 out-of-bailiwick records, 2 CNAMEs)
  • yuriplus.com. resolved in 46 queries in 4168.2ms (11 out-of-bailiwick records, 3 CNAMEs)

Let’s show some statistics. We took the top 200’000 Internet domains from Alexa.com, as listed at http://www.registered-domains-list.com/top.html. About 160’000 domain names are left after excluding resolutions that took more than 3 seconds in total, those that hit a timeout on a non-recursive query (1s in our case), and domain names with more than one sub-TLD domain (we kept only xxx.tld. for more accurate comparisons). The following chart shows the distribution of TLDs in the chosen sample.

[Figure: TLD repartition pie chart]

The next chart shows the number of queries needed to resolve the domain name.

  • In blue there is the average number of queries needed by ignoring glue records for out-of-bailiwick NS records.
  • In green is the average number of queries when trusting them.
  • Both blue and green are averages for randomly chosen NS records.
  • In red, the box shows the range you can expect on average depending on which NS record is chosen. In other words, the resolution may differ depending on the chosen path; the box spans the averages of the shortest and longest paths you may expect when resolving a domain name. For example, the box for nl. in the graph is large, which means the path varies a lot depending on the chosen name server.
  • The 12 most frequent TLDs are detailed for comparison. Note that all of them have more than 1000 domain names.

The highest query counts are absent from the chart:

  • By randomly selecting name servers, 3753 domain names in the sample (2.3%) were resolved in more than 20 queries (with 5-18 levels)
  • By explicitly choosing worst case name servers, only following NS records that have missing glue, more than 5000 domain names (3.2%) needed more than 20 queries.

The djbdns dnscache fails to resolve domain names if it encounters more than 100 NS entries during a recursive resolution, which makes it usually fail after around 20 queries.

[Figure: Number of queries]

This graph is directly correlated with the number of levels (missing or ignored glue records and CNAMEs) needed; the average number of queries per level is very close to 3 for all TLDs.

com. and net. are both served by the same name servers, [a-m].gtld-servers.net., which are in-bailiwick for net. but not for com. It takes 3 queries to resolve their addresses, which explains why net. domain names need about 9 queries on average and com. domain names about 12.
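The 3-query difference between com. and net. can be reproduced with a small counting model in Python. This is a sketch under strong assumptions: each zone has a single name server, there is no caching, and the zone table below is toy data (in particular, the entry for gtld-servers.net. and the example.com. and example.net. delegations are simplifications made up for the illustration).

```python
def is_subdomain(name, zone):
    """True if `name` equals `zone` or lies below it (absolute names)."""
    if zone == ".":
        return True
    return name == zone or name.endswith("." + zone)


def enclosing_zones(name, zones):
    """Known zones containing `name`, ordered from the root downwards."""
    labels = name.rstrip(".").split(".")
    suffixes = ["."] + [".".join(labels[i:]) + "."
                        for i in range(len(labels) - 1, -1, -1)]
    return [zone for zone in suffixes if zone in zones]


def count_queries(name, zones, trust_glue, depth=0):
    """Count the queries a non-caching resolver sends to resolve `name`."""
    if depth > 10:
        raise RuntimeError("delegation chain too deep (possible loop)")
    total = 0
    for zone in enclosing_zones(name, zones):
        ns_name = zones[zone]
        if not trust_glue and not is_subdomain(ns_name, zone):
            # Out-of-bailiwick glue ignored: resolve the server's name first.
            total += count_queries(ns_name, zones, trust_glue, depth + 1)
        total += 1                       # the query sent to this zone's server
    return total


# Toy zone table: zone -> name of its (single) name server.
ZONES = {
    ".": "a.root-servers.net.",
    "net.": "a.gtld-servers.net.",
    "com.": "a.gtld-servers.net.",
    "gtld-servers.net.": "a.gtld-servers.net.",   # simplifying assumption
    "example.com.": "ns.example.com.",
    "example.net.": "ns.example.net.",
}

for host in ("www.example.com.", "www.example.net."):
    print(host, count_queries(host, ZONES, trust_glue=False),
          count_queries(host, ZONES, trust_glue=True))
```

In this model www.example.net. needs 3 queries either way, while www.example.com. needs 6 queries when out-of-bailiwick glue is ignored and 3 when it is trusted: the extra 3 are spent resolving a.gtld-servers.net.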

Some domain names in de. resolve in only 2 queries, which is not really surprising but quite unusual for country TLDs. 2 queries means the TLD server directly answers the A query. Root servers and TLD servers usually act as delegating servers only, so that they hold only NS records with the corresponding glue A records. A TLD that answers for domain names other than those of name servers manages these domains on the same server as the delegating part. For small TLDs it is common to see such entries, but usually not for heavily used TLDs. As we can see, de. is the only such case in the top 12 of our sample.

We can see that by ignoring the glue records for out-of-bailiwick NS entries, the queries needed are about 3 times higher than when trusting them.

The following chart shows the average times needed to get the responses (again here, the resolver used is non-caching which may increase the times compared to real caching resolvers).

[Figure: Recursive query times]

The times are not directly proportional to the number of queries. In particular, de. domain names resolve faster than jp. domain names even though the average numbers of queries are about 12 and 6 respectively. As our ISP is in Zurich, Switzerland, this is no surprise.

Root servers traffic

CAIDA, ISC, DNS-OARC, many partnering root name server operators and other organizations coordinate a large-scale traffic data collection known as A Day in the Life of the Internet (DITL). It has taken place once a year since 2006, for 24 to 72 hours each time. The latest public DNS traffic analysis based on these captures dates from DITL 2009. Some raw, unprocessed statistics are available for every year.

Some numbers from the DITL 2009 analysis; note that it was performed on only a subset of root servers:

  • 8.09 billion queries in 24 hours, about 100’000 queries per second
  • 281GB of data, about 3.3 MB/s
  • estimated response traffic: 2TB, about 24MB/s
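The per-second figures in this list follow from dividing the 24-hour totals by the number of seconds in a day. A short Python check, assuming decimal units (1 GB = 10^9 bytes, 1 TB = 10^12 bytes), which the analysis does not state explicitly:

```python
SECONDS_PER_DAY = 24 * 60 * 60      # 86400 seconds

queries = 8.09e9                    # queries captured in 24 hours
query_bytes = 281e9                 # 281 GB of query data (decimal units assumed)
response_bytes = 2e12               # estimated 2 TB of responses

qps = queries / SECONDS_PER_DAY
query_mb_s = query_bytes / SECONDS_PER_DAY / 1e6
response_mb_s = response_bytes / SECONDS_PER_DAY / 1e6

print(f"{qps:,.0f} queries per second")
print(f"{query_mb_s:.2f} MB/s of queries")
print(f"{response_mb_s:.1f} MB/s of responses")
```

The results (roughly 94’000 queries per second, about 3.3 MB/s and about 23 MB/s) match the rounded values quoted above.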

DITL 2011-2014 recorded about 17-22 billion queries on root servers in 24 hours (200’000 – 250’000 queries per second), which seems to be two to three times more than in 2009. However, the subset of covered root servers varies from year to year. Considering that our statistics showed that on average 12 queries are needed to resolve a domain name, you can imagine how much higher the number of queries on the whole Internet can be. We measured a network on the order of thousands of users: DNS ranks first in number of network sessions (3-4 times more than the second, which is http/https). In terms of bandwidth it stays quite low, at about 1% of the whole traffic.

Some root servers have a website showing live statistics.

Can we make it better?

We measured an average of 1.1 seconds for query responses, which is still too much time. com. is the most used TLD, but the root servers return out-of-bailiwick glue for it, which costs 3 more queries.

What cache implementations do?

Some cache implementations have a white-list of DNS servers from which they accept out-of-bailiwick glue records. The root servers are often trusted for example, which improves the case of com. (as well as other TLDs) explained above.

Some cache implementations seem to ignore out-of-bailiwick name servers if the domain has both in- and out-of-bailiwick name servers, and thus avoid a new level. This may sound like a good idea, but it defeats the purpose of having several name server entries (esp. load balancing).

Some cache implementations keep track of the best name servers and query them almost exclusively. Again, this may break load balancing, but it may be advisable in some cases for geographical reasons. So it depends on how aggressive the resolver is about it. A resolver should never completely ignore a server or use only one: each server should be used periodically. Some domains have more name servers than fit in the query response; a caching resolver should keep them all and use them all periodically.

Some cache implementations send simultaneous queries to all name servers of a domain and take the fastest response, to be able to reply as quickly as possible to their clients.
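The "query everyone, keep the fastest" strategy can be sketched with Python’s standard thread pool. The queries here are simulated with sleeps, since a real resolver would send UDP packets; the function names, the server names and the latencies are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fake_query(server, latency):
    """Stand-in for a real DNS query: sleeps, then returns a canned answer."""
    time.sleep(latency)
    return server, "1.2.3.4"


def fastest_answer(servers):
    """Query every server at once and return the first answer to arrive.

    servers: dict mapping server name -> simulated latency in seconds.
    Note: leaving the `with` block still waits for the slower queries to
    finish; a real resolver would simply discard the late responses.
    """
    with ThreadPoolExecutor(max_workers=len(servers)) as pool:
        futures = [pool.submit(fake_query, server, latency)
                   for server, latency in servers.items()]
        for future in as_completed(futures):
            return future.result()      # the first completed query wins
```

The price of this shortcut is visible in the sketch: every name server of the domain receives a query for each resolution, although only one answer is used.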

Here is a list of recursive caching resolver implementations. They all ignore glue records for out-of-bailiwick NS records. They may include optimizations presented above. The first 5 are open source and run on Unix systems (may work on Windows too) while the last 2 are closed-source Windows exclusive implementations.

MaraDNS (http://maradns.samiam.org/) has a caching resolver named Deadwood, not listed here because its current version is non-recursive. According to The Hitchhiker’s Guide to DNS Cache Poisoning, it trusts glue records for out-of-bailiwick NS records without suffering from the cache poisoning issue, by using the approach explained in the following section Are there other options?. Older versions of Deadwood should also work this way; however, we only tested the latest versions of resolvers.

Basically, many cache implementations try to improve their performance by not (fully) following the DNS specification and recommendations.

What web browsers do?

As DNS queries can be really slow, some web browsers do pre-fetching of DNS queries. For example:

  • While you type in the address bar, queries are done so that when you finish typing and validate the URL (hit enter), the query is already done and the page can load faster.
  • All links in a web page may be queried for so that if you click on one of them, it will load faster.
  • A web page may include special dns-prefetch tags containing domain names that the browser will pre-fetch.
  • At browser startup it may issue queries for domain names in history in case you want to visit the same websites.
  • And when IPv6 is enabled, it is done for both IPv4 A records and IPv6 AAAA records.
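The dns-prefetch hints mentioned in the list are link elements of the form <link rel="dns-prefetch" href="//host">. As an illustration of how many extra lookups a single page can request, here is a small Python sketch that collects these hints with the standard html.parser module; the class name and the sample page are made up for the example.

```python
from html.parser import HTMLParser


class PrefetchHints(HTMLParser):
    """Collects the host names a page asks the browser to pre-resolve."""

    def __init__(self):
        super().__init__()
        self.hosts = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        rel = (attrs.get("rel") or "").lower().split()
        if tag == "link" and "dns-prefetch" in rel and attrs.get("href"):
            # Keep only the host part of "//host" or "https://host/".
            host = attrs["href"].split("//", 1)[-1].strip("/")
            self.hosts.append(host)


# Hypothetical page head with two hints and one unrelated link.
PAGE = """
<head>
  <link rel="dns-prefetch" href="//cdn.example.com">
  <link rel="dns-prefetch" href="https://static.example.net/">
  <link rel="stylesheet" href="/style.css">
</head>
"""

parser = PrefetchHints()
parser.feed(PAGE)
print(parser.hosts)
```

Each collected host triggers a resolution whether or not the user ever loads a resource from it.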

In short, web browsers try to improve their performance by literally flooding their resolver with queries.

Are there other options?

Out-of-bailiwick glue records are not a name server issue but a caching issue. To reduce the number of queries needed, it is even recommended that name servers provide glue records for out-of-bailiwick NS records. Secure Glue A Cache and Update proposes an interesting caching mechanism that eliminates the risk of cache poisoning with out-of-bailiwick glue records.

The idea is that when caching NS entries, never cache the name server’s names but only their IP addresses. Let’s take the poisoning example used above.

evil.com.        IN NS ns1.company.com.
ns1.company.com. IN  A 9.9.9.9

Instead of caching it as two entries (1 NS and 1 A), cache the domain and the IP address.

evil.com.        IN NS _____ IN A 9.9.9.9

So, 9.9.9.9 will never be associated to company.com. and thus no poisoning. By trusting all the glue records, the examples listed above look much better.
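A minimal Python sketch of this caching idea follows. It is only our reading of the principle described above, not the implementation from the Secure Glue A Cache and Update proposal; the class and method names are assumptions made for the illustration.

```python
class DelegationCache:
    """Caches, per delegated domain, only the IP addresses of its servers."""

    def __init__(self):
        self._servers = {}       # domain -> set of name server IP addresses

    def store(self, ns_records, glue):
        """ns_records: list of (domain, ns_name); glue: dict ns_name -> IP."""
        for domain, ns_name in ns_records:
            ip = glue.get(ns_name)
            if ip is not None:
                # The name server's own name is deliberately never a key,
                # so glue cannot overwrite an A record of another domain.
                self._servers.setdefault(domain, set()).add(ip)

    def servers_for(self, domain):
        """IP addresses to query for `domain` (empty set if unknown)."""
        return self._servers.get(domain, set())


cache = DelegationCache()
cache.store([("evil.com.", "ns1.company.com.")], {"ns1.company.com.": "9.9.9.9"})
cache.store([("company.com.", "ns1.company.com.")], {"ns1.company.com.": "1.2.3.4"})
print(cache.servers_for("company.com."))
```

The malicious address is only ever used for queries about evil.com.; lookups for company.com. still go to 1.2.3.4.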

  • www.swisscom.ch. resolved in 3 queries in 36.4ms
  • www.google.com. resolved in 3 queries in 71.9ms instead of 6 queries in 327.8ms
  • www.nagra.com. resolved in 3 queries in 37.4ms instead of 12 queries in 1007.0ms
  • www.kudelskisecurity.com. resolved in 6 queries in 268.4ms instead of 18 queries in 1982.7ms (1 CNAME)
  • www.digikey.ch. resolved in 7 queries in 363.3ms instead of 22 queries in 2004.2ms (1 CNAME)
  • pic.dhe.ibm.com. resolved in 13 queries in 956.7ms instead of 29 queries in 2536.3ms (2 CNAMEs, 1 missing optional glue record)
  • yuriplus.com. resolved in 19 queries in 2097.4ms instead of 46 queries in 4168.2ms (3 CNAMEs, 2 missing optional glue records)

As stated before, based on our tests, the number of queries needed could be divided by 3.

Network Information Centers (NIC) for TLDs do not accept optional glue records, so for an optimal configuration only in-bailiwick NS records should be used.

Another point that may be considered is caching the whole zone file of the root servers. RFC 2870 states that “Root servers SHOULD NOT answer AXFR, or other zone transfer”. SHOULD NOT is only a recommendation, and some root servers do accept zone transfers; the zone file is also available for download from some websites (or via ftp) such as internic.net. A Start Of Authority (SOA) record request tells you when the zone was last updated, so you can check whether your copy of the zone file is up to date.

. IN SOA a.root-servers.net. nstld.verisign-grs.com. 2014082900 1800 900 604800 86400
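The fields of the SOA record above follow the standard order (primary server, responsible mailbox, serial, refresh, retry, expire, minimum). A short Python sketch of how a local copy of the zone could be checked against it; the function names are made up, the YYYYMMDDnn reading of the serial is a convention rather than a guarantee, and serial number wrap-around is ignored.

```python
from datetime import date


def parse_soa(line):
    """Split a one-line SOA record into its named fields."""
    (owner, _cls, _rtype, mname, rname,
     serial, refresh, retry, expire, minimum) = line.split()
    return {
        "owner": owner, "mname": mname, "rname": rname,
        "serial": int(serial), "refresh": int(refresh), "retry": int(retry),
        "expire": int(expire), "minimum": int(minimum),
    }


def serial_to_date(serial):
    """Read a YYYYMMDDnn serial as (date, revision of that day)."""
    text = f"{serial:010d}"
    day = date(int(text[:4]), int(text[4:6]), int(text[6:8]))
    return day, int(text[8:])


def copy_is_stale(local_serial, remote_serial):
    """True if the server's zone is newer than the local copy (no wrap-around)."""
    return remote_serial > local_serial


soa = parse_soa(". IN SOA a.root-servers.net. nstld.verisign-grs.com. "
                "2014082900 1800 900 604800 86400")
print(soa["serial"], serial_to_date(soa["serial"]))
```

Comparing serials costs one query, whereas downloading the zone again costs a full transfer.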

The serial 2014082900 encodes the date of the last zone file update (the query was made on 2014-08-29). At the time of the request, the zone file was about 400kB with more than 8000 records.

There are several points against doing zone transfers:

  • According to the top Alexa.com domain names we took for our statistics, more than 60 percent of them are com..
  • The root zone file is expected to grow due to the liberalization of the TLDs.
  • The root zone file seems to be updated quite often; according to iana.org, it was updated about 20 times during July 2014.
  • Root server TLDs NS records have a time to live of 2 days, thus you would end up doing zone transfer more often than TLDs NS records cache updates.

But since about half of the root server queries end in “there is no such domain” (NXDOMAIN), half of the DNS traffic could be dropped by performing zone transfers. With the “as you type” DNS queries from the web browser address bar discussed above, the number of NXDOMAIN responses may increase considerably: based on statistics over the last year provided by RIPE, less than 50% of the results were NXDOMAIN before January 2014, and more than 50% since then, with an increasing tendency.

In conclusion, it is difficult to say if transferring the whole zone file on every update is better than caching the TLDs on demand (but risking having a lot of queries resulting in NXDOMAIN).

In conclusion

DNS is a huge system that demands bandwidth. Non-optimal configurations, such as missing glue records and (mis)use of CNAMEs, have the following impact:

  • they triple the bandwidth requirements (double them when considering root servers as cached).
  • they cause slow responses, which push applications towards pre-fetching optimization techniques, which in turn produce more useless DNS queries.
  • they lead caching resolver implementations to use questionable shortcuts or tricks to make resolution faster.
  • some configurations result in loops in the resolution, which the resolver has to detect, or even make the resolution impossible.
  • recommendations say that resolvers need a recursion limit in their implementation. This makes some domain names unresolvable in at least one caching resolver implementation (djbdns dnscache fails to resolve domain names needing more than ~20 queries, which is the case for about 2.5% of domain names in our testing sample).

DNS is widely used, yet domain administrators often do not know or do not realize the effect of what they are doing. Content Delivery Networks that implement Anycast-like functionality through DNS are degrading DNS performance instead of using real Anycast. Public DNS caching servers (like Google’s DNS) may be tempting to use; they are widely used and probably cache data from a big portion of all DNS servers on the planet. Although they are fast, they are not at the same location as you, which defeats the purpose of Anycast DNS servers, and they give Google valuable information on what domain names you are querying.

References

  1. Notes on the Domain Name System http://cr.yp.to/djbdns/notes.html
  2. RFC 4786 – Operation of Anycast Services http://tools.ietf.org/html/rfc4786
  3. Dan Kaminsky’s 2008 DNS vulnerability http://www.ietf.org/mail-archive/web/dnsop/current/pdf2jgx6rzxN4.pdf
  4. The Hitchhiker’s Guide to DNS Cache Poisoning http://www.cs.utexas.edu/~shmat/shmat_securecomm10.pdf
  5. TCP Anycast – Don’t believe the FUD https://www.nanog.org/meetings/nanog37/presentations/matt.levine.pdf
  6. Oversimplified DNS http://rscott.org/dns/cname.html
  7. A Day in the Life of the Internet (DITL) http://www.caida.org/projects/ditl/
  8. DITL 2009 presentation https://www.dns-oarc.net/files/workshop-200911/Sebastian_Castro.pdf
  9. DITL Traces and Analysis https://www.dns-oarc.net/oarc/data/ditl
  10. Mozilla Firefox DNS pre-fetching https://developer.mozilla.org/en-US/docs/Web/HTTP/Controlling_DNS_prefetching
  11. Google Chrome DNS pre-fetching http://dev.chromium.org/developers/design-documents/dns-prefetching
  12. Internet Explorer DNS pre-fetching http://blogs.msdn.com/b/ie/archive/2011/03/17/internet-explorer-9-network-performance-improvements.aspx
  13. Secure Glue A Cache and Update http://conference.apnic.net/__data/assets/pdf_file/0004/58846/yongjin_apricot2013_20130225_1361832625.pdf
  14. Root Name Server Operational Requirements http://tools.ietf.org/html/rfc2870

Some RFCs

  1. Mail Routing and the Domain System https://www.ietf.org/rfc/rfc974.txt
  2. DOMAIN NAMES – CONCEPTS AND FACILITIES https://www.ietf.org/rfc/rfc1034.txt
  3. DOMAIN NAMES – IMPLEMENTATION AND SPECIFICATION https://www.ietf.org/rfc/rfc1035.txt
  4. Requirements for Internet Hosts – application and support https://www.ietf.org/rfc/rfc1123.txt
  5. A Security Problem and Proposed Correction With Widely Deployed DNS Software https://www.ietf.org/rfc/rfc1535.txt
  6. Common DNS Implementation Errors and Suggested Fixes https://www.ietf.org/rfc/rfc1536.txt
  7. Common DNS Data File Configuration Errors https://www.ietf.org/rfc/rfc1537.txt
  8. Clarifications to the DNS Specification https://www.ietf.org/rfc/rfc2181.txt
  9. Observed DNS Resolution Misbehavior https://www.ietf.org/rfc/rfc4697.txt
