A Quick Investigation of EdgeCast CDN Blocking in China


This morning, GreatFire.org published a story stating that EdgeCast CDN, one of the more popular content distribution networks that handles content for a number of large websites, has been blocked by the Chinese national filter. As a result, a friend emailed me asking what I thought, and pointed out that all we have are a few reports and a link to a status update from EdgeCast themselves.

As usual, my attempt to write a short email failed, and I ended up carrying out an impromptu investigation into this. With minor edits, I’ve reproduced my email detailing how I looked into this below. For reference, this was carried out from an internet connection based in Oxford, UK.

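From prior work, we know that China's filter will man-in-the-middle (UDP) DNS requests for blocked sites that pass through it, injecting a forged reply, but will ignore requests for unblocked names. The forged reply arrives whether or not the destination address actually runs a DNS server, which gives us a simple test. So first, let’s pick a Chinese IP address almost at random: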

$ ping baidu.cn
PING baidu.cn (220.181.111.86) 56(84) bytes of data.

Check that it’s actually in China, using the MaxMind GeoIP database:

$ geoiplookup 220.181.111.86
GeoIP Country Edition: CN, China
GeoIP City Edition, Rev 1: CN, 22, Beijing, Beijing, N/A, 39.928902,
116.388298, 0, 0

Check that it’s not a DNS server:

$ dig @220.181.111.86 baidu.cn

; <<>> DiG 9.9.2-P2 <<>> @220.181.111.86 baidu.cn
; (1 server found)
;; global options: +cmd
;; connection timed out; no servers could be reached

Excellent. No response. Now, perform a DNS lookup from our (presumably uncensored) connection in the UK to an edgecastcdn.net host:

$ dig edgecastcdn.net

; <<>> DiG 9.9.2-P2 <<>> edgecastcdn.net
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 635
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 2, ADDITIONAL: 5

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;edgecastcdn.net. IN A

;; ANSWER SECTION:
edgecastcdn.net. 3370 IN A 93.184.221.133

Now let’s see what happens if we look up edgecastcdn.net at the baidu.cn IP, recalling that it is not actually a DNS server:

$ dig @220.181.111.86 edgecastcdn.net

; <<>> DiG 9.9.2-P2 <<>> @220.181.111.86 edgecastcdn.net
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 63357
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 0

;; QUESTION SECTION:
;edgecastcdn.net. IN A

;; ANSWER SECTION:
edgecastcdn.net. 30640 IN A 203.98.7.65

;; Query time: 397 msec
;; SERVER: 220.181.111.86#53(220.181.111.86)
;; WHEN: Tue Nov 18 12:02:53 2014
;; MSG SIZE rcvd: 64

Interesting. We get a response, which took 397 milliseconds. Let's look up the returned IP:

$ dig -x 203.98.7.65

; <<>> DiG 9.9.2-P2 <<>> -x 203.98.7.65
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NXDOMAIN, id: 7947
;; flags: qr rd ra; QUERY: 1, ANSWER: 0, AUTHORITY: 1, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;65.7.98.203.in-addr.arpa. IN PTR

;; AUTHORITY SECTION:
7.98.203.in-addr.arpa. 3600 IN SOA ns1.netlink.co.nz. soa.netlink.co.nz. 2008110600 7200 1200 1728000 172800

;; Query time: 767 msec
;; SERVER: 129.67.1.180#53(129.67.1.180)
;; WHEN: Tue Nov 18 12:11:43 2014
;; MSG SIZE rcvd: 110

That doesn't look like a genuine response! A quick WHOIS:

$ whois netlink.co.nz
% New Zealand Domain Name Registry Limited
% Users confirm on submission their agreement to all published Terms
%
version: 5.00
query_datetime: 2014-11-19T01:12:03+13:00
domain_name: netlink.co.nz
query_status: 200 Active
domain_dateregistered: 1997-03-24T00:00:00+12:00
domain_datebilleduntil: 2014-12-01T00:00:00+13:00
domain_datelastmodified: 2014-11-01T23:37:30+13:00
domain_delegaterequested: yes
domain_signed: no
%
registrar_name: Vodafone New Zealand Limited (Clear)
registrar_address1: Private Bag 92161
registrar_city: Auckland
registrar_country: NZ (NEW ZEALAND)
registrar_phone: +64 508 888 800
registrar_email: registry@clear.net.nz
%
registrant_contact_name: NetLink
registrant_contact_address1: PO Box 5358
registrant_contact_city: Wellington
registrant_contact_country: NZ (NEW ZEALAND)
registrant_contact_phone: +64 4 9228499
registrant_contact_fax: +64 4 9228401
registrant_contact_email: dns@netlink.co.nz
%
admin_contact_name: Netlink Operations Centre
admin_contact_address1: PO Box 5358
admin_contact_city: Wellington
admin_contact_country: NZ (NEW ZEALAND)
admin_contact_phone: +64 4 922 8499
admin_contact_fax: +64 4 922 8401
admin_contact_email: dns@netlink.co.nz
%
technical_contact_name: Netlink Operations Centre
technical_contact_address1: PO Box 1762
technical_contact_address2: Wellington, New Zealand
technical_contact_phone: +64 4 495 5021
technical_contact_fax: +64 4 495 5197
technical_contact_email: dns@netlink.co.nz
%
ns_ip4_01: 202.20.93.10
ns_name_01: ns1.netlink.co.nz
ns_ip4_02: 203.96.152.12
ns_name_02: ns2.netlink.co.nz

There does appear to be evidence of interference at the network level -- the lookup for edgecastcdn.net appears to be answered with the address of a host in New Zealand, in a block whose reverse DNS is run by NetLink (a domain registered through Vodafone New Zealand). I'm interested by the DNS response time of 397 msecs for the lookup, and I have a feeling that that could reveal interesting things about the possible location of the man-in-the-middle attack. A quick whois on baidu.cn gives their genuine DNS server as being located at 202.108.22.220, so we can use that as our test IP instead:

$ dig @202.108.22.220 baidu.cn

;; Query time: 313 msec

$ dig @202.108.22.220 edgecastcdn.net

;; Query time: 281 msec

With one data point it seems that a reply for a censored domain is a few tens of milliseconds quicker than an uncensored one, but one data point does not science make. Here's 100 requests to each:

$ for i in $(seq 1 100); do
    dig @202.108.22.220 edgecastcdn.net | grep "Query time" | sed -e "s/.*: \(.*\) msec/\1/" >> edgecastresults.txt;
done

$ for i in $(seq 1 100); do
    dig @202.108.22.220 baidu.cn | grep "Query time" | sed -e "s/.*: \(.*\) msec/\1/" >> baiduresults.txt;
done

That gives us two sets of statistics that we can summarise quickly in R:

$ R
> baidu <- read.csv('baiduresults.txt')
> edgecast <- read.csv('edgecastresults.txt')
> summary( baidu )

X323
Min. :286.0
1st Qu.:303.0
Median :306.0
Mean :323.2
3rd Qu.:309.0
Max. :652.0

> summary( edgecast )

X324
Min. :249.0
1st Qu.:293.5
Median :306.0
Mean :314.7
3rd Qu.:309.0
Max. :622.0

The timing doesn't seem to be that different overall. I suspect that, with the Baidu DNS being in Beijing, there is a good chance of the man-in-the-middle attacks being sufficiently geographically close that it makes little difference to the output. Maybe we could pick a reasonable DNS server that isn't in Beijing and test against that.
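
One caveat on the summaries above: the column names X323 and X324 suggest that read.csv, which expects a header row by default, took the first measurement in each file as the column name, so each summary appears to cover 99 of the 100 values. As a cross-check that counts every line, here is a rough shell equivalent. The function name `summarise_times` is my own, and it assumes one number per line, as the loops above produce.

```shell
# Print count, min, median, mean and max of the numeric lines on stdin.
# Non-numeric lines (blank lines, "NA" markers) are skipped.
summarise_times() {
    grep -E '^[0-9]+(\.[0-9]+)?$' | sort -n | awk '
        { v[NR] = $1; sum += $1 }
        END {
            if (NR == 0) { print "n=0"; exit }
            # Input is sorted, so the median is the middle value
            # (or the mean of the two middle values for an even count).
            median = (NR % 2) ? v[(NR + 1) / 2] : (v[NR / 2] + v[NR / 2 + 1]) / 2
            printf "n=%d min=%s median=%s mean=%.1f max=%s\n", NR, v[1], median, sum / NR, v[NR]
        }'
}
```

It would be run as `summarise_times < baiduresults.txt` and `summarise_times < edgecastresults.txt`; the n= field makes it obvious if any samples went missing.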

A quick Google for 'china dns server' gives this page: https://sites.google.com/site/kiwi78/public-dns-servers, and we randomly pick one that claims to be in Chengdu.

$ dig @61.139.54.66 baidu.cn

; <<>> DiG 9.9.2-P2 <<>> @61.139.54.66 baidu.cn
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 7210
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 4, ADDITIONAL: 5

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;baidu.cn. IN A

;; ANSWER SECTION:
baidu.cn. 120 IN A 220.181.111.85
baidu.cn. 120 IN A 123.125.114.144
baidu.cn. 120 IN A 220.181.111.86

;; AUTHORITY SECTION:
baidu.cn. 120 IN NS ns2.baidu.com.
baidu.cn. 120 IN NS ns3.baidu.com.
baidu.cn. 120 IN NS ns4.baidu.com.
baidu.cn. 120 IN NS ns1.baidu.com.

;; ADDITIONAL SECTION:
ns1.baidu.com. 75287 IN A 202.108.22.220
ns2.baidu.com. 74049 IN A 61.135.165.235
ns3.baidu.com. 74049 IN A 220.181.37.10
ns4.baidu.com. 74049 IN A 220.181.38.10

;; Query time: 426 msec
;; SERVER: 61.139.54.66#53(61.139.54.66)
;; WHEN: Tue Nov 18 12:29:19 2014
;; MSG SIZE rcvd: 230

$ geoiplookup 61.139.54.66
GeoIP Country Edition: CN, China
GeoIP City Edition, Rev 1: CN, 32, Sichuan, Chengdu, N/A, 30.666700, 104.066704, 0, 0

That responds to our innocuous query, and seems to be in Chengdu according to our geoip database. Let's try the 100 requests trick on it (I won't repeat the code, as it's illustrated above):

> summary(baidu)
X376
Min. :275.0
1st Qu.:305.0
Median :321.0
Mean :359.8
3rd Qu.:408.8
Max. :868.0

> summary(edgecast)
X338
Min. :247
1st Qu.:307
Median :317
Mean :335
3rd Qu.:335
Max. :628

There still isn't anything particularly damning based on the timing. Of course, there are lots of issues with DNS caching to think about, and my simple shell scripts didn't check for things like timeouts or no replies, so this might not be as simple as it looks, but we've still got some interesting data with which to play. At the very least, I can support GreatFire's claim that China are doing things to the edgecastcdn domain.

The first improvement that I'll make, which will probably run tonight, is to add a `sleep 90` call in the for loop that runs the repeated requests. That should hopefully avoid the most obvious form of rate limiting.
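
A sketch of what that improved loop might look like is below. It also deals with the timeout problem mentioned earlier by writing "NA" when dig returns no query time, rather than silently dropping the sample. The function names and the five-second timeout are my own choices; `+time` and `+tries` are standard dig options.

```shell
# Extract the query time in milliseconds from dig output on stdin.
# Prints nothing if there is no "Query time" line (e.g. after a timeout).
query_time_ms() {
    sed -n 's/^;; Query time: \([0-9][0-9]*\) msec.*/\1/p'
}

# Hypothetical helper: take COUNT samples, sleeping PAUSE seconds after each.
# A missing reply is recorded as "NA" so that every attempt leaves a line.
sample_times() {
    local server="$1" domain="$2" count="$3" pause="$4" outfile="$5"
    local i ms
    for i in $(seq 1 "$count"); do
        ms=$(dig +time=5 +tries=1 "@${server}" "${domain}" | query_time_ms)
        echo "${ms:-NA}" >> "$outfile"
        sleep "$pause"
    done
}

# Example (not run here; 100 samples at 90 seconds apart takes 2.5 hours):
#   sample_times 202.108.22.220 edgecastcdn.net 100 90 edgecastresults.txt
```

Keeping one line per attempt means the two result files stay the same length, and R's read.csv treats "NA" as a missing value by default.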

This wasn't intended to be a full and detailed investigation, but it certainly threw up some interesting findings and confirmed the key points of GreatFire.org's story. In the wider sense, the idea of blocking a content distribution network rather than a website itself is a bold step and has significant collateral implications, as GreatFire point out at length.

For me, looking forward, the most interesting aspect of these stories is working out the logic and intentions behind the behaviour itself, which will require much more significant analytical and theoretical tools than simply probing what was blocked where and how. Why are these particular networks or sites chosen for filtering, and what can that tell us? How quickly are sites blocked, and unblocked, in relation to political or social events? Filtering provides a fascinating lens into the motivations and thought processes of those carrying it out. As blocking policies develop, with governments and populations becoming more comfortable with using the internet in all aspects of life, the ability to draw inferences from such overt interference in the network will be an incredibly rich seam to mine.
