
What is the point of owning public address space?

Anything in your private network (even if it goes over the public internet) should be encrypted and locked up anyway. Something like WireGuard or Nebula only needs a few publicly accessible addresses, maybe just one. Inside the overlay network, it's easy to keep IP addresses stable.
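
For illustration, a minimal WireGuard sketch of that shape, split across two wg-quick configs (keys, addresses and the endpoint name are placeholders): only the hub needs a reachable address, and every node keeps a stable overlay address no matter how its underlying IP changes.

    # Hub -- the one publicly reachable endpoint (placeholder keys/addresses)
    [Interface]
    PrivateKey = <hub-private-key>
    Address = 10.10.0.1/24
    ListenPort = 51820

    [Peer]
    PublicKey = <node-public-key>
    AllowedIPs = 10.10.0.2/32

    # Node -- needs no public address of its own
    [Interface]
    PrivateKey = <node-private-key>
    Address = 10.10.0.2/24

    [Peer]
    PublicKey = <hub-public-key>
    Endpoint = hub.example.net:51820   # the only place a public IP or DNS name matters
    AllowedIPs = 10.10.0.0/24
    PersistentKeepalive = 25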

Anything public-facing likely needs a DNS record, which can be updated quickly on the (infrequent) occasions when the IP of a publicly accessible interface changes.

What am I missing?





The realistic point is to have your own abuse contact email, to escape the ban-happy policies most server hosts apply even when you've done nothing wrong. They usually suspend your account if you don't reply within 24 hours, even if the complaint is obvious nonsense.

It's the only real way of running reliable IPv6 networks with multiple uplinks. Unless you want NATv6.

DNS updates are slow. BGP can react to a downed link in <1 sec.

Even fast LACP needs three seconds and that's on the same collision domain.

How does BGP actually detect that a link is down? The keepalive default is 30 s (and the hold timer that actually declares the peer dead is typically three times that), but it can be changed. If you set it to, say, one second, is that wise? Once a link is down, that fact will propagate at the speed of BGP and the other routing protocols involved. Recovery will need a similar propagation.

Depending on where the link is, a second can be a "lifetime" these days, or not. What an appropriate heartbeat interval is really depends on the environment.

Also, given that BGP runs over TCP, it may have to interact with other, lower-level link-detection protocols.
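
For a sense of scale, a fragmentary BIRD sketch of tightening BGP's own timers (ASNs and addresses are placeholders, not a tested config); even this aggressive, dead-peer detection is bounded by the hold timer, i.e. seconds rather than milliseconds:

    protocol bgp upstream1 {
      local as 64500;
      neighbor 192.0.2.1 as 64501;
      keepalive time 1;   # send a keepalive every second
      hold time 3;        # declare the peer dead after 3 s of silence
      ipv4 { import all; export all; };
    }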


BFD or Ethernet-OAM is the standard here.

It can get a bit hardware dependent, but getting <50 ms failovers from software-based BFD in BIRD or FRR is fairly easy, and I've tested down to <1 ms with hardware-based BFD echo. ~50 ms is the point at which a user on a traditional VoIP call won't notice the path switch.

You can get NICs for computers that do hardware BFD (most Nvidia/Mellanox and higher-end Broadcom/Intel NICs), and it's obviously included in higher-end networking kit.

You then link the BGP routes to the health of the BFD session toward that path's next hop, and you get super quick withdrawals.
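
A minimal sketch of that linkage in BIRD (interface name, timers, ASNs and peer address are assumptions, not a tested config):

    protocol bfd {
      interface "eth0" {
        min rx interval 50 ms;
        min tx interval 50 ms;
        multiplier 3;        # declare the path down after 3 missed packets (~150 ms)
      };
    }

    protocol bgp upstream1 {
      local as 64500;
      neighbor 192.0.2.1 as 64501;
      bfd on;                # tear down the session, and withdraw its routes, when BFD fails
      ipv4 { import all; export all; };
    }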


I.e. BIRD detects an interface failure, but that only informs your side of the decision making. For bidirectional failure detection you run BFD alongside BGP. BFD default timers are 3 times 30 ms, iirc, so roughly 90 ms to declare a path dead.

I have my own multihomed ASN and also operate my own nameservers. The latter has usually been about as fast for failover overall in practice. BGP may look like it converges near-instantly from your 2-3-peer outbound perspective, but the inbound convergence from the ~100k networks on the rest of the internet is much slower and has a long tail, much like setting your DNS TTL to 0 and having the rest of the internet decide to honor it more slowly for cache/churn reasons anyway.

The bigger difference, and where BGP multihoming is most handy, is that it's just so much easier to get holistic inbound+outbound failover where nothing really changes, versus DNS, where failover is more about getting future inbound traffic to change where it goes. E.g. it's a pain to break an active session because the address had to change, even if DNS can quickly update where the new service lives.


The long tail of routers receiving your update doesn't matter. Once the common transit networks get it, that's where the rest would dump the traffic to reach you anyway. The only time slow propagation to the edges matters is the first time you announce a prefix after it has been fully withdrawn.

Using the wrong route to get the packet in your general direction still gets you the packet as long as it hits an ISP along the way that got the update.

We could fully drain traffic from a transit provider in <60 s with a withdrawal, with all of the major providers you get at the internet exchanges. If you weren't seeing that, your upstream ISPs may have penalized you for flapping too much and put in explicit delays.


<60s sounds about right as a general safe estimate. I just mean people should expect 1-2ish orders of magnitude more than <1s from a downed link with internet BGP upstreams in a multihomed situation.

I’m saying that’s not a correctly configured link for fast failure.

<1 second was normal for hard link down events or explicit withdrawals. Anything above that was waiting for some BGP peer timeout or some IGP event.

If your ISP is taking longer than 1 second to propagate your change, you’ve been put in some dunce protection box.


If it were flap suppression/slow-peer detection/"the dunce bucket", there wouldn't be a long tail of convergence; it'd just be nothing until all at once. This also isn't something I've seen on my personal AS alone; it's what I came to expect in many enterprise cutovers while previously working at a network VAR. The personal AS is, however, much more carefree to move around to different random providers on a whim, of course :).

I found some data in an oldish post by benjojo https://blog.benjojo.co.uk/post/speed-of-bgp-network-propaga... which confirms that various tier 1s do propagate updates across their networks very fast (<2ish seconds) while others certainly do not. Notably, Level 3 (now Lumen) has the largest BGP presence by prefix count and was the worst of those tested, starting to apply the change at ~20 s after and finishing at ~50 s after. This was for announcements specifically, which should be the clearer case.



