This entry explains how I have configured a Linux bridge, dnsmasq and iptables so that different virtualization systems and containers can run and communicate with each other on laptops running Debian GNU/Linux.

I’ve used different variations of this setup for a long time with VirtualBox and KVM for the Virtual Machines and Linux-VServer, OpenVZ, LXC and lately Docker or Podman for the Containers.

Required packages

I’m running Debian Sid with systemd and network-manager to configure the WiFi and Ethernet interfaces, but for the bridge I use bridge-utils with ifupdown (as I said, this setup is old; I guess ifupdown2 and ifupdown-ng will work too).

To start and stop the DNS and DHCP services and add NAT rules when the bridge is brought up or down I execute a script that uses:

  • ip from iproute2 to get the network information,
  • dnsmasq to provide the DNS and DHCP services (currently only the dnsmasq-base package is needed and it is recommended by network-manager, so it is probably installed),
  • iptables to configure NAT (for now docker kind of forces me to keep using iptables, but at some point I’d like to move to nftables).

To make sure you have everything installed you can run the following command:

sudo apt install bridge-utils dnsmasq-base ifupdown iproute2 iptables
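
A quick way to confirm that the tools are available is to look for the binaries those packages ship. This is only a minimal sketch: missing_commands is a hypothetical helper name, and the binary names used in the example call (brctl, dnsmasq, ifup, ip, iptables) are the ones I would expect from the packages listed above. Tools installed under /usr/sbin may not be in the PATH of a regular user, so the check may need to be run as root:

```shell
#!/bin/sh
# Print each command from the argument list that is not found in PATH.
# (hypothetical helper, not part of any of the packages)
missing_commands() {
  for _cmd in "$@"; do
    command -v "$_cmd" >/dev/null 2>&1 || echo "$_cmd"
  done
}
```

Calling missing_commands brctl dnsmasq ifup ip iptables should print nothing when everything is installed and reachable.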

Bridge configuration

The bridge configuration for ifupdown is available in the file /etc/network/interfaces.d/vmbr0:

# Virtual servers NAT Bridge
auto vmbr0
iface vmbr0 inet static
    address         10.0.4.1
    network         10.0.4.0
    netmask         255.255.255.0
    broadcast       10.0.4.255
    bridge_ports    none
    bridge_maxwait  0
    up              /usr/local/sbin/vmbridge ${IFACE} start nat
    pre-down        /usr/local/sbin/vmbridge ${IFACE} stop nat
Warning:

To use a separate file with ifupdown make sure that /etc/network/interfaces contains the line:

source /etc/network/interfaces.d/*

or add its contents to /etc/network/interfaces directly, if you prefer.

This configuration creates a bridge with the address 10.0.4.1 and assumes that the machines connected to it will use the 10.0.4.0/24 network. You can change the network address if you want; as long as you use a private range that does not collide with the networks used by your Virtual Machines, all should be OK.
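
If you pick a different address, a small check like the following can tell whether it falls inside one of the private ranges reserved by RFC 1918. It is a sketch only: is_private_ipv4 is a hypothetical helper name, and it assumes the argument is a well-formed dotted-quad address (no validation is done):

```shell
#!/bin/sh
# Return 0 if the dotted-quad IPv4 address in $1 belongs to one of the
# RFC 1918 private ranges (10/8, 172.16/12, 192.168/16), 1 otherwise.
# The address is assumed to be well formed; it is not validated.
is_private_ipv4() {
  case "$1" in
  10.* | 192.168.*) return 0 ;;
  172.*)
    # The second octet must be between 16 and 31
    _second="${1#172.}"
    _second="${_second%%.*}"
    if [ "$_second" -ge 16 ] 2>/dev/null && [ "$_second" -le 31 ]; then
      return 0
    fi
    ;;
  esac
  return 1
}
```

For example, is_private_ipv4 10.0.4.1 succeeds, while is_private_ipv4 172.32.0.1 fails.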

The vmbridge script is used to start the dnsmasq server and set up the NAT rules when the interface is brought up, and to remove the firewall rules and stop the dnsmasq server when it is brought down.

The vmbridge script

The vmbridge script launches an instance of dnsmasq that binds to the bridge interface (vmbr0 in our case) and acts as DNS and DHCP server.

The DNS server reads the /etc/hosts file to publish local DNS names and forwards all the other requests to the dnsmasq server launched by NetworkManager, which is listening on the loopback interface.

As the NetworkManager server already does caching, we disable the cache on our server. This has an added advantage: if we change networks, new requests go to the new resolvers, because the DNS server handled by NetworkManager gets restarted and flushes its cache. This is useful if we connect to a new network that has internal DNS servers configured to do split DNS for internal services; with this model all requests get the internal address as soon as the DNS server is queried again.

The DHCP server is configured to provide IPs to unknown hosts from a sub range of the addresses on the bridge network and to use fixed IPs if the /etc/ethers file has a MAC with a matching hostname in the /etc/hosts file.
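
To illustrate how the two files relate, the sketch below looks up the hostname paired with a MAC in an ethers-style file ("MAC hostname" lines) and then the address that a hosts-style file gives to that name. This is not how dnsmasq implements the lookup, only a way to see which fixed IP a pair of entries describes; fixed_ip_for_mac is a hypothetical helper name and the file arguments default to /etc/ethers and /etc/hosts:

```shell
#!/bin/sh
# Print the address that corresponds to a MAC, given an ethers-style file
# ("MAC hostname") and a hosts-style file ("IP hostname [aliases...]").
# Usage: fixed_ip_for_mac MAC [ETHERS_FILE] [HOSTS_FILE]
# Returns 1 if the MAC is not in the ethers file; prints nothing if the
# hostname has no entry in the hosts file.
fixed_ip_for_mac() {
  _mac="$(printf '%s' "$1" | tr 'A-F' 'a-f')"
  _ethers="${2:-/etc/ethers}"
  _hosts="${3:-/etc/hosts}"
  # Find the hostname paired with the MAC (case-insensitive match)
  _name="$(
    awk -v mac="$_mac" '
      $1 !~ /^#/ && tolower($1) == mac { print $2; exit }
    ' "$_ethers"
  )"
  [ "$_name" ] || return 1
  # Find the first address whose names or aliases include that hostname
  awk -v name="$_name" '
    $1 !~ /^#/ {
      for (i = 2; i <= NF; i++) {
        if ($i ~ /^#/) break              # ignore trailing comments
        if ($i == name) { print $1; exit }
      }
    }
  ' "$_hosts"
}
```

With a made-up entry such as "0a:b8:ef:00:00:10 vm1" in the ethers file and "10.0.4.10 vm1" in the hosts file, fixed_ip_for_mac 0a:b8:ef:00:00:10 prints 10.0.4.10.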

To make things work with old DHCP clients the script also adds checksums to the DHCP packets using iptables (when the interface is not linked to a physical device the kernel does not add checksums, but we can fix it by adding a rule on the mangle table).

If we want external connectivity we can pass the nat argument and then the script creates a MASQUERADE rule for the bridge network and enables IP forwarding.

The script source code is the following:

/usr/local/sbin/vmbridge
#!/bin/sh
set -e
# ---------
# VARIABLES
# ---------
LOCAL_DOMAIN="vmnet"
MIN_IP_LEASE="192"
MAX_IP_LEASE="223"
# ---------
# FUNCTIONS
# ---------
get_net() {
  NET="$(
    ip a ls "${BRIDGE}" 2>/dev/null | sed -ne 's/^.*inet \(.*\) brd.*$/\1/p'
  )"
  [ "$NET" ] || return 1
}
checksum_fix_start() {
  iptables -t mangle -A POSTROUTING -o "${BRIDGE}" -p udp --dport 68 \
    -j CHECKSUM --checksum-fill 2>/dev/null || true
}
checksum_fix_stop() {
  iptables -t mangle -D POSTROUTING -o "${BRIDGE}" -p udp --dport 68 \
    -j CHECKSUM --checksum-fill 2>/dev/null || true
}
nat_start() {
  [ "$NAT" = "yes" ] || return 0
  # Configure NAT
  iptables -t nat -A POSTROUTING -s "${NET}" ! -d "${NET}" -j MASQUERADE
  # Enable forwarding (just in case)
  echo 1 >/proc/sys/net/ipv4/ip_forward
}
nat_stop() {
  [ "$NAT" = "yes" ] || return 0
  iptables -t nat -D POSTROUTING -s "${NET}" ! -d "${NET}" \
    -j MASQUERADE 2>/dev/null || true
}
do_start() {
  # Bridge address
  _addr="${NET%%/*}"
  # DHCP leases (between .MIN_IP_LEASE and .MAX_IP_LEASE)
  _dhcp_range="${_addr%.*}.${MIN_IP_LEASE},${_addr%.*}.${MAX_IP_LEASE}"
  # Bridge mtu
  _mtu="$(
    ip link show dev "${BRIDGE}" |
      sed -n -e '/mtu/ { s/^.*mtu \([0-9]\+\).*$/\1/p }'
  )"
  # Compute extra dnsmasq options
  dnsmasq_extra_opts=""
  # Disable gateway when not using NAT
  if [ "$NAT" != "yes" ]; then
    dnsmasq_extra_opts="$dnsmasq_extra_opts --dhcp-option=3"
  fi
  # Adjust MTU size if needed
  if [ -n "$_mtu" ] && [ "$_mtu" -ne "1500" ]; then
    dnsmasq_extra_opts="$dnsmasq_extra_opts --dhcp-option=26,$_mtu"
  fi
  # shellcheck disable=SC2086
  dnsmasq --bind-interfaces \
    --cache-size="0" \
    --conf-file="/dev/null" \
    --dhcp-authoritative \
    --dhcp-leasefile="/var/lib/misc/dnsmasq.${BRIDGE}.leases" \
    --dhcp-no-override \
    --dhcp-range "${_dhcp_range}" \
    --domain="${LOCAL_DOMAIN}" \
    --except-interface="lo" \
    --expand-hosts \
    --interface="${BRIDGE}" \
    --listen-address "${_addr}" \
    --no-resolv \
    --pid-file="${PIDF}" \
    --read-ethers \
    --server="127.0.0.1" \
    $dnsmasq_extra_opts
  checksum_fix_start
  nat_start
}
do_stop() {
  nat_stop
  checksum_fix_stop
  if [ -f "${PIDF}" ]; then
    kill "$(cat "${PIDF}")" || true
    rm -f "${PIDF}"
  fi
}
do_status() {
  # Use signal 0 to test the process; HUP would make dnsmasq reload its files
  if [ -f "${PIDF}" ] && kill -0 "$(cat "${PIDF}")" 2>/dev/null; then
    echo "dnsmasq RUNNING"
  else
    echo "dnsmasq NOT running"
  fi
}
do_reload() {
  [ -f "${PIDF}" ] && kill -HUP "$(cat "${PIDF}")"
}
usage() {
  echo "Usage: $0 BRIDGE (start|stop [nat])|status|reload"
  exit 1
}
# ----
# MAIN
# ----
[ "$#" -ge "2" ] || usage
BRIDGE="$1"
OPTION="$2"
shift 2
NAT="no"
for arg in "$@"; do
  case "$arg" in
  nat) NAT="yes" ;;
  *) echo "Unknown arg '$arg'" && exit 1 ;;
  esac
done
PIDF="/var/run/vmbridge-${BRIDGE}-dnsmasq.pid"
case "$OPTION" in
start) get_net && do_start ;;
stop) get_net && do_stop ;;
status) do_status ;;
reload) get_net && do_reload ;;
*) echo "Unknown command '$OPTION'" && exit 1 ;;
esac
# vim: ts=2:sw=2:et:ai:sts=2
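
To see what get_net and do_start compute without touching a real interface, the following sketch applies the same sed expression and parameter expansions to a sample line in the format printed by ip a ls. The sample line and the lowercase variable names are only for illustration; the lease limits are the defaults from the script:

```shell
#!/bin/sh
MIN_IP_LEASE="192"
MAX_IP_LEASE="223"
# Sample line in the format printed by `ip a ls vmbr0`
sample='    inet 10.0.4.1/24 brd 10.0.4.255 scope global vmbr0'
# Same sed expression as get_net: keep the text between "inet " and " brd"
net="$(printf '%s\n' "$sample" | sed -ne 's/^.*inet \(.*\) brd.*$/\1/p')"
# Same expansions as do_start: drop the prefix length, then the last octet
addr="${net%%/*}"
dhcp_range="${addr%.*}.${MIN_IP_LEASE},${addr%.*}.${MAX_IP_LEASE}"
echo "NET=$net ADDR=$addr RANGE=$dhcp_range"
# prints: NET=10.0.4.1/24 ADDR=10.0.4.1 RANGE=10.0.4.192,10.0.4.223
```

Note that the value stored in NET is the bridge address with its prefix length (10.0.4.1/24), which is what the script passes to iptables in the MASQUERADE rule.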

NetworkManager Configuration

The default /etc/NetworkManager/NetworkManager.conf file has the following contents:

[main]
plugins=ifupdown,keyfile

[ifupdown]
managed=false

This means that NetworkManager will leave interfaces managed by ifupdown alone and, by default, will send the connection DNS configuration to systemd-resolved if it is installed.

As we want to use dnsmasq for DNS resolution, but we don’t want NetworkManager to modify our /etc/resolv.conf, we are going to add the following file (/etc/NetworkManager/conf.d/dnsmasq.conf) to our system:

/etc/NetworkManager/conf.d/dnsmasq.conf
[main]
dns=dnsmasq
rc-manager=unmanaged

and restart the NetworkManager service:

$ sudo systemctl restart NetworkManager.service

From now on NetworkManager will start a dnsmasq service that listens on 127.0.0.1:53 and queries the servers provided by the DHCP servers of the networks we connect to, but it will not touch our /etc/resolv.conf file.

Configuring systemd-resolved

If we start using our own name server on a system that has systemd-resolved installed, we no longer need or use its DNS stub; programs that used the stub will now query our dnsmasq server directly. We keep systemd-resolved running for the host programs that use its native API or access it through /etc/nsswitch.conf (when libnss-resolve is installed).

To disable the stub we add a /etc/systemd/resolved.conf.d/disable-stub.conf file to our machine with the following content:

# Disable the DNS Stub Listener, we use our own dnsmasq
[Resolve]
DNSStubListener=no

and restart systemd-resolved to make sure that the stub is stopped:

$ sudo systemctl restart systemd-resolved.service

Adjusting /etc/resolv.conf

First we remove the existing /etc/resolv.conf file (it does not matter if it is a link or a regular file) and then create a new one that contains at least the following line (we can add a search line if it is useful for us):

nameserver 10.0.4.1

From now on we will be using the dnsmasq server launched when we bring up the vmbr0 interface for multiple purposes:

  • as our main DNS server from the host (if we use the standard /etc/nsswitch.conf and libnss-resolve is installed, systemd-resolved is queried first, but it uses our server as forwarder by default if needed),
  • as the DNS server of the Virtual Machines or containers that use DHCP for network configuration and attach their virtual interfaces to our bridge,
  • as the DNS server of docker containers that get the DNS information from /etc/resolv.conf (if we have entries that use loopback addresses the containers that don’t use the host network tend to fail, as those addresses inside the running containers are not linked to the loopback device of the host).
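
Since loopback entries are what tends to break containers, it can be handy to check a resolv.conf-style file for them. The following is a sketch under that assumption; has_loopback_nameserver is a hypothetical helper name and the file argument defaults to /etc/resolv.conf:

```shell
#!/bin/sh
# Return 0 if the resolv.conf-style file given as $1 (default
# /etc/resolv.conf) lists a nameserver on a loopback address
# (127.0.0.0/8 or ::1), 1 otherwise.
has_loopback_nameserver() {
  awk '
    $1 == "nameserver" && ($2 ~ /^127\./ || $2 == "::1") { found = 1 }
    END { exit (found ? 0 : 1) }
  ' "${1:-/etc/resolv.conf}"
}
```

With the single nameserver 10.0.4.1 line shown above the function fails (no loopback entries), which is the state we want.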

Testing

After all the configuration files and scripts are in place we just need to bring up the bridge interface and check that everything works:

$ # Bring interface up
$ sudo ifup vmbr0
$ # Check that it is available
$ ip a ls dev vmbr0
4: vmbr0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN
          group default qlen 1000
    link/ether 0a:b8:ef:b8:07:6c brd ff:ff:ff:ff:ff:ff
    inet 10.0.4.1/24 brd 10.0.4.255 scope global vmbr0
       valid_lft forever preferred_lft forever
$ # View the listening ports used by our dnsmasq servers
$ sudo ss -tulpan | grep dnsmasq
udp UNCONN 0 0  127.0.0.1:53     0.0.0.0:* users:(("dnsmasq",pid=1733930,fd=4))
udp UNCONN 0 0  10.0.4.1:53      0.0.0.0:* users:(("dnsmasq",pid=1705267,fd=6))
udp UNCONN 0 0  0.0.0.0%vmbr0:67 0.0.0.0:* users:(("dnsmasq",pid=1705267,fd=4))
tcp LISTEN 0 32 10.0.4.1:53      0.0.0.0:* users:(("dnsmasq",pid=1705267,fd=7))
tcp LISTEN 0 32 127.0.0.1:53     0.0.0.0:* users:(("dnsmasq",pid=1733930,fd=5))
$ # Verify that the DNS server works on the vmbr0 address
$ host www.debian.org 10.0.4.1
Name: 10.0.4.1
Address: 10.0.4.1#53
Aliases:

www.debian.org has address 130.89.148.77
www.debian.org has IPv6 address 2001:67c:2564:a119::77

Managing running systems

If we want to update DNS entries and/or MAC addresses we can edit the /etc/hosts and /etc/ethers files and reload the dnsmasq configuration using the vmbridge script:

$ sudo /usr/local/sbin/vmbridge vmbr0 reload

That call sends a signal to the running dnsmasq server and it reloads the files; after that we can refresh the DHCP addresses from the client machines or start using the new DNS names immediately.
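
As an illustration of the edit step, the sketch below appends a matching pair of entries to an ethers-style and a hosts-style file, refusing to add a name that is already present. The add_fixed_host helper is a hypothetical name, the MAC and IP in the example are made up, and the file arguments default to /etc/ethers and /etc/hosts (which need root permissions to modify):

```shell
#!/bin/sh
# Append "MAC NAME" to an ethers-style file and "IP NAME" to a hosts-style
# file, unless NAME already appears as a word in either of them.
# Usage: add_fixed_host MAC IP NAME [ETHERS_FILE] [HOSTS_FILE]
add_fixed_host() {
  _mac="$1"
  _ip="$2"
  _name="$3"
  _ethers="${4:-/etc/ethers}"
  _hosts="${5:-/etc/hosts}"
  # -F: fixed string, -w: whole word, -s: ignore missing files, -q: quiet
  if grep -qswF -- "$_name" "$_ethers" "$_hosts"; then
    echo "'$_name' is already present, not adding it" >&2
    return 1
  fi
  printf '%s %s\n' "$_mac" "$_name" >>"$_ethers"
  printf '%s %s\n' "$_ip" "$_name" >>"$_hosts"
}
```

For example, add_fixed_host 0a:b8:ef:00:00:10 10.0.4.10 vm1 followed by the vmbridge reload call shown above would make dnsmasq pick up the new entries.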