Anycast performance monitoring with Anteater

New open-source tool helps DNS operators to monitor and troubleshoot their anycast DNS

We have developed and open-sourced a DNS anycast monitoring tool called Anteater for DNS operators. Anteater’s main value is that it informs a DNS operator about the latencies that clients experience across the operator’s entire authoritative DNS infrastructure, purely on the basis of passive DNS measurements. That holds regardless of whether the anycast service itself is run in house or outsourced. In this blog we discuss how we measure the real latencies experienced by clients querying our authoritative server, and how you can monitor such latencies in near-real time using ENTRADA and Anteater.

DNS operators and the challenge of delivering fast responses

DNS operators strive to deliver fast DNS responses to clients – say, under 50ms, and hopefully well under. However, they typically don’t really know what latencies their clients experience. Why?

Well, because DNS operators don’t have access to clients’ networks, so they cannot send queries from the clients’ vantage points or measure the round-trip times (RTTs). There are alternative ways of obtaining the data, such as using RIPE Atlas (~10k measurement points in client networks). However, such tools only cover some of the client networks that a large DNS operator serves.

So, what are the alternatives? You could also run Verfploeter, but that requires active measurements from the anycast network. Making such measurements is complicated, because many operations teams don’t have full access to their name servers, especially if they hire third-party services. Also, Verfploeter does not support IPv6, which means it won’t capture the RTTs of DNS queries/responses sent over IPv6.

We therefore wondered whether we could find an alternative approach that fulfilled the following criteria:

  • No requirement to perform extra measurements
  • Support for IPv6
  • Measurements based on real clients, not arbitrary vantage points

Our approach: using DNS over TCP to measure RTTs

It turned out that there is another, somewhat overlooked source of the required information: your own traffic, specifically the DNS queries sent over TCP. DNS runs over both UDP and TCP. TCP is used when a client retries a query after receiving a truncated UDP response (typically because the response was too large), and for zone transfers. Given that TCP requires a session to be established, we can measure the RTT from a client to a server during the TCP handshake: it is the time difference, as seen at the server, between the first and third packets (the client's SYN and its final ACK).
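To make the idea concrete, here is a minimal sketch of how handshake RTTs could be derived from packets captured at a name server. It is an illustration only, not the code used in ENTRADA or Anteater: the `Packet` record and the `handshake_rtts` function are hypothetical names, and the sketch assumes the packets have already been parsed and are ordered by capture time.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple

# Standard TCP flag bits.
SYN, RST, ACK = 0x02, 0x04, 0x10


@dataclass(frozen=True)
class Packet:
    """Hypothetical record for one TCP packet captured at the server."""
    ts: float    # capture timestamp in seconds
    src: str     # source IP address (IPv4 or IPv6)
    sport: int   # source port
    dst: str     # destination IP address
    dport: int   # destination port
    flags: int   # TCP flags as a bit field


def handshake_rtts(packets: Iterable[Packet],
                   server_port: int = 53) -> List[Tuple[str, float]]:
    """Return (client_ip, rtt_seconds) for every completed handshake.

    The RTT is approximated as the time between the client's SYN (first
    packet) and the client's ACK (third packet), both seen at the server.
    Retransmitted SYNs are not treated specially in this sketch.
    """
    syn_seen: Dict[Tuple[str, int], float] = {}  # (client ip, port) -> SYN time
    results: List[Tuple[str, float]] = []
    for p in packets:
        if p.dport != server_port:
            continue  # skip server-to-client packets such as the SYN-ACK
        key = (p.src, p.sport)
        is_syn = bool(p.flags & SYN)
        is_ack = bool(p.flags & ACK)
        if is_syn and not is_ack:
            syn_seen[key] = p.ts  # first packet of the handshake
        elif is_ack and not is_syn and not (p.flags & RST) and key in syn_seen:
            # Third packet of the handshake: the difference is the RTT.
            results.append((p.src, p.ts - syn_seen.pop(key)))
    return results


# Example: one IPv6 client completing a handshake with the server.
packets = [
    Packet(10.000, "2001:db8::1", 40000, "2001:db8::53", 53, SYN),
    Packet(10.000, "2001:db8::53", 53, "2001:db8::1", 40000, SYN | ACK),
    Packet(10.025, "2001:db8::1", 40000, "2001:db8::53", 53, ACK),
]
rtts = handshake_rtts(packets)
for client, rtt in rtts:
    print(f"{client}: {rtt * 1000:.1f} ms")
```

In this example the client's ACK arrives roughly 25 ms after its SYN, so that is the RTT attributed to the client. Because only timestamps and TCP flags are needed, the same logic applies to IPv4 and IPv6 traffic alike.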

Together with colleagues at the University of Southern California/Information Sciences Institute (USC/ISI), we have released a technical report detailing that measurement methodology. The report shows that the approach can be used to measure real clients’ RTTs and explains how to apply it. (You can also watch the video presentation at DNS-OARC and view the slides.)

If you’re a DNS operator, the biggest advantage of using DNS over TCP RTT measurements is that the data is available for free: it’s right there on your own DNS infrastructure. You just need to analyse it. And that is what our open-source tool, Anteater, does. You can download it from our GitHub repository.

This message is taken from SIDN Labs; read the entire blog here.