Making your Debian server networking redundant

You will need at least the following…

  1. A pair of stacked switches that support an LACP bonded port spanning 2 different nodes of the stack. This gives you the best of both worlds: redundancy and increased bandwidth.
  2. Or alternatively, 2 ports on the same or on different unstacked switches. This is the bare minimum you can do to mitigate link failure. Note that this setup has no polling mechanism, so if the physical Ethernet link stays up but is not operational, because of a switching failure on the device or a failure on another port on the device that provides the uplink, then this won’t help you.

On your server you will need 2 (or more) network cards and some “simple” setup.

Install the packages that you will need in case you don’t have them already.

  • apt-get install ifenslave vlan bridge-utils

The example sets up the following

  • eth0 and eth1 bonded together into bond0
  • create 2 bridges br8 and br9
  • create 2 vlans bond0.8 and bond0.9
  • place each VLAN in its respective bridge
  • add IP details on br9
  • br8 has no L3 config on it and in this specific case is used by KVM to bridge virtual machines into as they come online

For option 1 edit your /etc/network/interfaces to look something like this


# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo bond0 bond0.8 bond0.9 br8 br9
iface lo inet loopback

iface bond0 inet manual
 bond-slaves eth0 eth1
 bond-mode 802.3ad
 bond-miimon 100
 bond-use-carrier 1
 bond-lacp-rate 1
 bond-min-links 1
 # send traffic over the available links based on src/dst MAC address
 bond-xmit-hash-policy layer2
 mtu 1600

iface bond0.8 inet manual
iface bond0.9 inet manual

iface br8 inet manual
 bridge_stp off
 bridge_ports bond0.8

iface br9 inet static
 address 192.168.0.2
 netmask 255.255.255.0
 gateway 192.168.0.1
 bridge_ports bond0.9
 bridge_stp off

For option 2 edit your /etc/network/interfaces to look something like this (only the bond0 config changes)


# This file describes the network interfaces available on your system
# and how to activate them. For more information, see interfaces(5).

source /etc/network/interfaces.d/*

# The loopback network interface
auto lo bond0 bond0.8 bond0.9 br8 br9
iface lo inet loopback

iface bond0 inet manual
 bond-slaves eth0 eth1
 bond-mode active-backup
 bond-miimon 100
 bond-downdelay 200
 bond-updelay 200

iface bond0.8 inet manual
iface bond0.9 inet manual

iface br8 inet manual
 bridge_stp off
 bridge_ports bond0.8

iface br9 inet static
 address 192.168.0.2
 netmask 255.255.255.0
 gateway 192.168.0.1
 bridge_ports bond0.9
 bridge_stp off
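
Because the active-backup setup only watches the physical link state, one way to notice a link that is up but not passing traffic is to probe the gateway from the server itself. The following is a minimal sketch under some assumptions: it shells out to the iputils ping command, uses the 192.168.0.1 gateway from the example config, and the function names are hypothetical. It only reports reachability and does not change the bond.

```python
import subprocess


def ping_command(host, count=3, timeout_s=1):
    """Build an iputils ping command: -c packets to send, -W seconds to wait per reply."""
    return ["ping", "-c", str(count), "-W", str(timeout_s), host]


def is_reachable(host, run=subprocess.run):
    """True if ping exits with status 0, meaning at least one reply came back.

    `run` is injectable so the check can be exercised without touching the network.
    """
    result = run(
        ping_command(host),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0


def report(host, run=subprocess.run):
    """Return a one-line, human-readable status for the given host."""
    state = "reachable" if is_reachable(host, run) else "NOT reachable"
    return f"gateway {host} is {state}"
```

Calling report("192.168.0.1") from cron or a monitoring agent gives a simple alert source; what to do when the check fails is left to you.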

Most use cases probably will not require bridging or VLANs, but I thought it best to provide examples of the entire feature set. You can always reduce it to what you need.

Broken web pages/downloads – AAAaaarrrrgh!

Broken pages nowadays account for half of my nightmare support scenarios: you know the ones, with half-loaded, broken or endlessly loading pages. This is usually accompanied by a comment about how it’s fine on the user’s 3G/DSL/whatever network. In the past this was more likely to be an MTU-related issue, but with the proliferation of CDN hosting, the more likely suspect is now a broken path to a CDN.

The situation in South Africa lends itself to this, as some of the larger ISPs have private CDN deployments alongside open peered deployments. Often a user is trying to get content from a CDN they should not have access to, or a CDN that is not optimal. Sometimes this is because of bad CDN configuration, but more often than not it’s because your users are not using the correct DNS servers for resolution.

CDNs rely heavily on DNS to determine the origin AS and location of a request, and then reply based on this information. Google and OpenDNS are often culprits here, as well-intentioned users love to use them (I blame Google for making it so easy with 8.8.8.8). While extensions to DNS (such as EDNS Client Subnet) have helped with identifying the source of the request, the issues are not completely gone and will continue to rear their heads for some time. I have also seen scenarios where domain controllers are set up to use one network provider (DNS settings included) while LAN users use a different provider/gateway (aka you), meaning the domain controller gives your clients DNS responses from a server on a different network altogether.

The tools I usually use for troubleshooting these kinds of issues are

  • dig/nslookup to check resolution discrepancies between you and the client network.
  • The browser’s developer view. Just head on over to the sources tab and see what resources the page is actually loading. Gone are the days of simple sites; sites now include content from all over the show for advertising, tracking and load balancing.
  • http://www.cdnplanet.com/tools/cdnfinder/ . You point it at a web site and it reports what external resources the site uses and which CDNs those resources are on.
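
The dig comparison in the first bullet can be scripted. The sketch below is only an illustration: it assumes dig is installed, shells out to `dig +short`, and compares the answers two resolvers give for the same name. The hostname cdn.example.com and the 192.168.0.1 resolver are placeholders, and the function names are hypothetical.

```python
import subprocess


def parse_dig_short(output):
    """Turn `dig +short` output into a set of answers, skipping blanks and ';' comments."""
    return {
        line.strip()
        for line in output.splitlines()
        if line.strip() and not line.startswith(";")
    }


def dig_short(name, resolver, run=subprocess.run):
    """Ask `resolver` for the A records of `name` and return the answers as a set."""
    result = run(
        ["dig", "+short", f"@{resolver}", name, "A"],
        capture_output=True,
        text=True,
        timeout=10,
    )
    return parse_dig_short(result.stdout)


def compare(name, resolver_a, resolver_b, run=subprocess.run):
    """Show which answers are unique to each resolver and which are shared."""
    a = dig_short(name, resolver_a, run)
    b = dig_short(name, resolver_b, run)
    return {
        "only_a": sorted(a - b),
        "only_b": sorted(b - a),
        "common": sorted(a & b),
    }
```

For example, compare("cdn.example.com", "8.8.8.8", "192.168.0.1") contrasts a public resolver with a local one. Differing answers are not proof of a problem, since CDNs rotate addresses, but answers that land in a different network for each resolver are a good hint about where to look next.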

If none of this resolves the problem, then the issue is possibly with your upstream providers. They either have a broken transparent application cache/accelerator (good luck finding someone there who knows something about them) or you are running on a seriously messed up bonded link. I have dealt with both of these before and maybe I will share more on this another day.