How I discovered caching CDNs were throttling my everyday browsing

It’s not DNS for once

Image: a Cisco ISR 1117-4P router and a Nexus 3K switch, with a patch panel and a Catalyst 3750 switch with lots of wires plugged into it.

Update (28 Nov 2023): Zen Internet and I have amicably parted ways over this matter. Their technical response has been added to this post.

My ISP has never been amazing…

Well, most aren’t. Browsing the Internet had been getting worse and worse, but too unpredictably for a clear pattern to emerge. Speed tests were fine and many websites loaded perfectly, but enough didn’t that it was worth investigating. Apple TV+ would crap out, but only when casting via AirPlay. BBC News wouldn’t load at all, while iPlayer would, but slowly.

The network

I have a VDSL2 line, Ubiquiti Wi-Fi and standard Catalyst switches. Nothing too weird. I pay for a /29 block, so I have multiple usable IP addresses. Yes, I host my own mail, and yes, I do DKIM, DMARC and SPF, and MXToolbox shows I’m not on any blocklists.

The symptoms

| Working | Not working |
| --- | --- |
| Most websites load fine (amazon.co.uk, halifax.co.uk) | but many not at all (bbc.co.uk, farnell.com) |
| 800 Mbps iperf across the LAN | |
| Stable 39 Mbps down / 7 Mbps up | |
| Stable 6 ms ping to 1.1.1.1 | |
| PPP is stable and not dropping | |
| AirPlay to LG TV works (via a Unifi AP) | but crashes once per show |
| AirPlay to LG TV works (TV wired in) | still crashes |
| Spotify can stream to a Bose Soundbar | but changing song takes 30+ seconds |
| DNS is fine | |
| | iOS app downloads take 3 minutes to start |
| Swapping the Cisco VDSL2 router for the ISP’s router makes some sites work | but not all sites |

The cause

The freebie ISP router has IPv6 enabled by default, so it would speak to the CDNs from the clean v6 range. When using my own equipment without IPv6 configured, I can only speak from the IPv4 range, which is either “shadow banned” by the CDNs or, more likely, not peered well with them.
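One way to test this theory from a dual-stacked machine is to time a TCP handshake to the same host over each address family. The following is a minimal Python sketch using only the standard library; the function names `tcp_connect_time` and `compare_families` are hypothetical, and the default port 443 and 5 second timeout are assumptions. A successful handshake does not prove the path is healthy, since the failures described in this post show up during data transfer, but a family that cannot connect at all is a strong hint.

```python
import socket
import time


def tcp_connect_time(host, port=443, family=socket.AF_INET, timeout=5.0):
    """Return the seconds taken to complete a TCP handshake, or None on failure.

    `family` is socket.AF_INET for IPv4 or socket.AF_INET6 for IPv6.
    """
    try:
        infos = socket.getaddrinfo(host, port, family=family,
                                   type=socket.SOCK_STREAM)
    except OSError:
        # The host has no address of this family (or resolution failed).
        return None

    for af, socktype, proto, _canonname, sockaddr in infos:
        start = time.monotonic()
        try:
            with socket.socket(af, socktype, proto) as sock:
                sock.settimeout(timeout)
                sock.connect(sockaddr)
            return time.monotonic() - start
        except OSError:
            # Timed out or refused: try the next address, if any.
            continue
    return None


def compare_families(host, port=443):
    """Time a handshake to `host` over IPv4 and IPv6 separately."""
    return {
        "IPv4": tcp_connect_time(host, port, socket.AF_INET),
        "IPv6": tcp_connect_time(host, port, socket.AF_INET6),
    }
```

Calling `compare_families("www.bbc.co.uk")` from behind each router would show whether one family connects while the other returns `None`.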

Proving my theory

Step 1: VPN egress traversing the bad ISP’s network

I set up a WireGuard tunnel to another site I have in the Isle of Man, supplied by another ISP, and forwarded all traffic from my workstation via this tunnel. I now pop out in the Isle of Man via ASN 42455’s network (checked with www.infobyip.com), and can surf and stream as normal.

Step 2: Wireshark deep-dive

I took two captures of my workstation visiting bbc.co.uk in Chrome: the first with the WireGuard tunnel off, and the second with it on. Let’s look at the TCP traffic:

Without WireGuard:

No meaningful TCP traffic makes it out. A lot of retransmissions and duplicate ACKs are happening.

With WireGuard:

A nice, healthy TCP session.

(I disabled checksum verification in the screenshots above, as I’m capturing on the same machine that is having the problems, and its NIC has checksum offloading enabled.)
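The difference between the two captures can be approximated in code. The sketch below is a deliberately simplified heuristic, not Wireshark’s own analysis (its `tcp.analysis.retransmission` and `tcp.analysis.duplicate_ack` display filters take more into account): it counts a data segment as a retransmission when the same flow repeats the same sequence number and payload length. The `Segment` type, the function name and the sample values are all hypothetical and are not taken from my captures.

```python
from typing import Iterable, NamedTuple


class Segment(NamedTuple):
    """A minimal, hypothetical view of one captured TCP segment."""
    src: str
    sport: int
    dst: str
    dport: int
    seq: int
    payload_len: int


def count_retransmissions(segments: Iterable[Segment]) -> int:
    """Count data segments that repeat an already-seen (flow, seq, length)."""
    seen = set()
    retransmissions = 0
    for seg in segments:
        if seg.payload_len == 0:
            continue  # pure ACKs carry no data, so they are not retransmissions
        key = (seg.src, seg.sport, seg.dst, seg.dport, seg.seq, seg.payload_len)
        if key in seen:
            retransmissions += 1
        else:
            seen.add(key)
    return retransmissions


# Made-up example flows using documentation address ranges.
healthy = [Segment("192.0.2.10", 50000, "198.51.100.20", 443, seq, 100)
           for seq in (1, 101, 201)]
stalled = [Segment("192.0.2.10", 50000, "198.51.100.20", 443, 1, 100)] * 4

print(count_retransmissions(healthy))  # 0: every segment carries new data
print(count_retransmissions(stalled))  # 3: the same segment was sent four times
```

A healthy session advances its sequence numbers; a stalled one keeps resending the same bytes, which is the pattern the first capture showed.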

Zen Internet, my (now ex) ISP

I’ll save my views on Zen for when this case is resolved. I’ve got far enough through the support process that I’m now dealing with their Network Ops team. They can still redeem themselves by simply giving me a different /29 block, as far away from 82.71.78.0/29 as possible 🙂

Update: Zen’s NOC provided the response below, which I do not agree with.

The customer is using the broadcast address 82.71.78.7. That address technically will work but will have a negative impact on the speeds.

It can introduce performance problems since packets to the broadcast IP address are broadcast to all hosts (IP’s) on the segment(subnet). This additional traffic which is intended for only one host but send to every host can negatively impact the performance of those hosts needlessly receiving that traffic.

The usable IP’s are 82.71.78.1-82.71.78.5 and the router being .6.

If the customer needs more than 5 usable addresses they will need to get a /28.

Why I disagree:

  • I can use the “network” address 82.71.78.0, which wouldn’t be possible if this were a real subnet any larger than a /31
  • The “CDNs not working” problem happens on 82.71.78.4, which is one of the IPs they say is “usable”
  • The /29 block is not a real “subnet” with a network and broadcast address anyway – traffic for my /29 is just slammed down the link for my router to deal with
  • There is no real concept of “broadcast” on a PPPoE link
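To make the disagreement concrete, here is a minimal Python sketch using the standard `ipaddress` module. It only shows what conventional subnet arithmetic says about 82.71.78.0/29, which is the view Zen’s NOC takes. Whether those rules apply to a /29 routed down a PPPoE link is exactly the point in dispute, and the code does not settle that. The variable names are my own.

```python
import ipaddress

block = ipaddress.ip_network("82.71.78.0/29")

network_address = block.network_address      # lowest address in the block
broadcast_address = block.broadcast_address  # highest address in the block
conventional_hosts = list(block.hosts())     # excludes the two addresses above
every_address = list(block)                  # all addresses, nothing excluded

# The address on which the CDN problem was observed.
affected = ipaddress.ip_address("82.71.78.4")

print("addresses in block:", block.num_addresses)                  # 8
print("network address:", network_address)                         # 82.71.78.0
print("broadcast address:", broadcast_address)                     # 82.71.78.7
print("conventional hosts:", conventional_hosts[0],
      "to", conventional_hosts[-1])                                # .1 to .6
print("82.71.78.4 is a conventional host:",
      affected in conventional_hosts)                              # True
```

Even under the strictest reading, 82.71.78.4 sits inside the conventional host range, so the broadcast-address argument cannot explain the failures seen on it.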

Too much effort is now going into this problem on both sides, so Zen and I agreed to close my account rather than continue the investigation. In trading speak, I need to “stop loss” at this point, and so does the ISP. I don’t suppose customers send in pcaps to get out of contracts every day…

Andrews & Arnold, my new ISP

I have had a new line installed by A&A on the same 8-core Openreach cable as the Zen line. I have noticed:

  • The same latency
  • 2.38x faster upload speed
  • 1.15x faster download speed
  • The CDN problem is gone

Be more specific. What do you mean by “throttled”?

The blocking is nondeterministic. Whilst Akamai (who offer a reputation check) report one of my IPs as clean, websites of their customers, such as eBay, have shown the throttling behaviour. Apple TV+ (tv.apple.com) appears to be served by Akamai too, and it has been the worst offender for “throttling” my streaming:

PS C:\WINDOWS\system32> nslookup tv.apple.com
Server:  dc1-lon.core.*********.net
Address:  10.*.*.*

Non-authoritative answer:
Name:    e673.dsce9.akamaiedge.net
Addresses:  2a02:26f0:fd00:1088::2a1
          2a02:26f0:fd00:109a::2a1
          2a02:26f0:fd00:1081::2a1
          2a02:26f0:fd00:1098::2a1
          2a02:26f0:fd00:1093::2a1
          88.221.41.37
Aliases:  tv.apple.com
          itunes-cdn.itunes-apple.com.akadns.net
          itunes.apple.com.edgekey.net

It’s also available over IPv6, which supports the freebie ISP router theory.
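The same check can be scripted for any hostname. This is a small Python sketch, assuming only the standard library, that groups a host’s resolved addresses by family; the function name `addresses_by_family` and the default port are hypothetical choices. A host with entries under both keys is dual-stacked, so a client with working IPv6 may never touch the IPv4 path at all.

```python
import socket


def addresses_by_family(host, port=443):
    """Resolve `host` and group its addresses into IPv4 and IPv6 lists."""
    result = {"IPv4": [], "IPv6": []}
    families = (("IPv4", socket.AF_INET), ("IPv6", socket.AF_INET6))
    for label, family in families:
        try:
            infos = socket.getaddrinfo(host, port, family=family,
                                       type=socket.SOCK_STREAM)
        except socket.gaierror:
            continue  # no records of this family
        for info in infos:
            address = info[4][0]  # sockaddr tuple starts with the address
            if address not in result[label]:
                result[label].append(address)
    return result
```

Running `addresses_by_family("tv.apple.com")` should list the same kind of A and AAAA answers as the nslookup output above, though the exact addresses vary by resolver and location.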

Fastly doesn’t offer a reputation check on their website, but bbc.co.uk won’t load at all right now. Cloudflare is similarly opaque about which IPs they’re throttling.

Why would a range get throttled or “shadow banned”?

This is extremely difficult to answer and I have no good theory. The same IP range handles my email and I generally have no trouble emailing major providers like Gmail, Outlook, Yahoo!, and many other counterparties. It’s in the CDNs’ interests to serve eyeballs and ban bots, but I think bad routing or congestion within the ISP’s core is a more likely explanation.