Creating a completely isolated guest network using Mikrotik hardware
Introduction
By way of a preface, my home network is not simple - this is not your average "home" network.
This is another project finished a while ago, in early 2020 that I've had the time to write up.
Recently I switched my (crappy) Netgear "managed" switches for an array of second-hand MikroTik "routers" from eBay (at a similar cost to the new Netgear kit since the MikroTik kit quite niche, which makes the resale value attractive).
They are powerful, complex, cost effective and from what I can tell a bit of a marmite product - some love them and some hate them. I like them, at least until things stop working.
All of the above is a digression from the title and the purpose of the article.
Some of my client work requires me to work with their devices on a local network, and since the broader uptake of working from home, that means these devices exist on my home network. I generally would not want client devices interacting with my home network, and vice versa, and especially if a client device created a public wireless network.
A conventional 'guest network' is often created at the wireless AP level and employs client isolation to ensure wireless clients cannot talk to each other. This is fine until the devices to be connected are wired.
This post looks at creating a separate subnet within the main network that cannot see the wider parent network but can access the internet to be able to do it's job - in this sense it is complete in so far as it works at the network level, rather than using a concept like client isolation. Clients on the guest network can see each other in this case, but not the core network and without very intimate knowledge and adding some custom routes the same is true of the core network to the guest network.
The solution present here is complex and not ideal in the way it works so it is likely to evolve over time.
My Network
The diagram above shows my current network setup, each layer being a floor, and I am fortunate to have an internal (to the walls) CAT6A wired network throughout the house connecting all the key switches.
On top of the 8 switches, server, Fire TVs, TVs, Xboxes, smart devices and other bits, I have more than 50 daily devices that connect directly or wirelessly. As a result I run a /23
subnet to allow for a decent address space for blocks of DHCP reservations, various other transient devices (lots of Pis, ESPs, etc) and the odd visitor.
The internet comes in via the Phone-line (ADSL2) to the Draytek modem in bridge mode and is connected to via the RB2011L at the top which establishes the PPPoE connection, as well as providing NAT services to the local network.
The RB1100AHx2 is the 'Core Switch' linking most of the others.
The Micro Server runs DNS and DHCP through dnsmasq
.
Otherwise, it's quite simple, while all the routers are used as switches and are smart, they're not configured to do much.
That is, until, the guest network was created.
The Guest Network
The guest network is consumed via the switch at x.x.0.12
which is on my desk.
The guest network lives on VLAN10
whereas the default is VLAN1
.
Unless otherwise configured, all the switches default each port to VLAN1
untagged, which means any traffic on those ports is assumed to be on VLAN1
and traffic spanning from those ports, on dumb switches or wireless APs is the same.
There are two ports on my desk switch which are set to be on VLAN10
untagged - meaning any traffic on those ports is assumed to be and attributed to VLAN10
.
At this point the virtual LAN network is isolated, but largely useless as there is nothing providing a bridge between VLAN10
and VLAN1
, where the wider connectivity is available.
Untagged VLANS are associated by the switch.
The 802.11q
protocol allows for packets to be sent on a given tagged aka explicitly declared VLAN. For this to work the switch has to allow that VLAN on that port too, so you can't just snoop on any VLAN at will. In addition, to the best of my knowledge, you need smart/managed switches anywhere that multiple VLANs will be transported (or trunked) over a single link.
Here's how the gap is bridged:
- The Micro Server is on a port where
VLAN1
is untagged andVLAN10
is available tagged. - The uplink between the Core Switch (where the Micro Server lives) and my desk trunks
VLAN1
andVLAN10
. - My desk switch has the uplink to the Core Switch trunking
VLAN1
andVLAN10
too, as well as defaulting most ports toVLAN1
untagged and the guest ports toVLAN10
untagged. - The NAS runs
dnsmasq
on the mainVLAN1
(physical) interface and the virtualVLAN10
interface dnsmasq
provides DHCP and DNS on bothVLAN1
isx.x.0.x/23
VLAN10
isx.x.254.x/24
- The Micro Server provides forwarding (NAT) from
VLAN1
toVLAN10
, and the RB2011L (at the top of the diagram) providers upstream forwarding to the internet fromVLAN1
- The Micro Server blocks upstream traffic from the
VLAN10
subnet toVLAN1
using IP tables, except when forwarded via the gateway - The setup is effectively double NAT'd:
Internet <=> RB2011L@x.x.0.0/23 (NAT) <=> Micro Server (NAT) <=> Guest@x.x.254.0/24
In the diagram above the dotted green lines represent VLAN1
and VLAN10
coexisting, all other links are normal untagged VLAN1
.
On the core switch setup we can see the VLANs:
Ports 10 and 13 are the Micro Server and uplink respectively that are both untagged and tagged (Port 2 is for a RaspberryPi to connect to client devices via a WiFi dongle for development and testing - outside the scope of this post).
On my desk:
Ports 9 and 10 are untagged to 10, the rest are untagged to 1 (10 is disconnected right now).
From this point the traffic is well isolated but the Micro Server needs setting up to route traffic.
Networking and Routing
VLAN Interface
The netplan configuration is updated to add a VLAN interface on the existing interface with the address x.x.254.50
.
Packet Routing
Enabling masquerade in iptables allows the machine to perform the classic NAT we experience on a home network but in this case between from x.x.254.x/24
up to x.x.0.x/23
(and then up to the Internet).
Running the following command will add the rule:
iptables -t nat -A POSTROUTING -o enp2s0 -j MASQUERADE
This is what the nat
table will look like if this is the only rule in place.
We need to enable forwarding on the VLAN interface, but without allowing packets to be forwarded to the parent / home network. The first command drops (blocks) those undesirable packets, and the second allows any remaining packets to be forwarded.
iptables -I FORWARD -d x.x.0.0/23 -j DROP -i vlan.10@enp2s0
iptables -A FORWARD -i vlan.10@enp2s0 -j ACCEPT
The filter
table will be as follows if these are the only rules:
Finally, enable IP forwarding on the Kernel by finding and updating the line below (the line will be net.ipv4.ip_forward=0
by default or commented out, if you don't find it then add it):
Using iptables-save
and iptables-restore
from the package netscript-ipfilter
makes it easier to manage rule sets, the full saved rules are as follows:
To save, run: iptables-save > /etc/iptables/rules.v4
To restore, run: iptables-restore < /etc/iptables/rules.v4
As usual a greater level of detail can be found elsewhere on how to implement the restore part of the life cycle on boot, along with other solutions.
Guest devices can ping 8.8.8.8
(as an example of Internet access) and x.x.254.50
(the NAT router/Micro Server) but not anything on x.x.0.0/23
including x.x.0.1
which is the upstream router.
Dnsmasq
dnsmasq
is used to perform both DNS and DHCP service. I moved away from bind9
and dhcpd-server
to achieve better cohesion between DHCP leases and their corresponding local DNS entries, which it has certainly improved.
The Easy Option
A quick and easy option is to serve up all leases from a single dnsmasq
instance. Between the localise-queries
option, and the multiple dhcp-range
entries the server will provide the right range to the queries coming into and served from their respective interfaces.
The tagging feature is used to segregate DNS options and reservations, lan
is applied to all leases served up on the range and interface and guest
is applied to those on the VLAN and accompanying range.
Option 3 is the gateway - which is the main router for the home range and the Micro Server for the guest range.
Option 6 is the DNS servers to be used.
Clearly there is a lot going on in the configuration above - the manpages for dnsmasq
are comprehensive so its best to look at that to understand each directive.
Configuration changes can be checked by running dnsmasq --test
before restarting the service.
As a bonus, its easy to view and monitor leases in /var/lib/misc/dnsmasq.leases
, e.g. watch -n 1 -d cat /var/lib/misc/dnsmasq.leases
.
The main downside to this approach is that many configuration options are global, between both interfaces, and in part, as a side effect, all the hostnames and DNS entries are associated with the same domain, and the dnsmasq
instance will serve results on either interface - while the likelihood is arguably slim, devices on their respective subnets could discover the addresses of those on other subnets. Only the addresses could be exposed but it still, in theory, increases the potential attack surface.
The Fully Isolated Option
Most dnsmasq
options are global or can be called more than once, but without tagging there is no concept of scoping (compared to nginx/apache configs, for example, which are hierarchical).
There is a little misinformation floating around that suggests if you using the conf-dir
option to include files in a folder it can serve completely different configurations on different interfaces. This is incorrect as the various config files are effectively interpreted as a single file and in the case of options that cannot be repeated, the last is taken (e.g domain
), but nothing is scoped or acts like it is scoped, except the options that support tagging - implied and limited scoping (1 level deep).
Using multiple files is likely better used for breaking up range configurations and lease reservations.
The prevailing opinion is that it is better to run a dnsmasq
instance per interface in cases where there are materially different options, such as the domain name to name one global directive.
The default systemd setup for running dnsmasq
is quite involved as it is entangled between a number of files including:
/etc/init.d/dnsmasq
/etc/default/dnsmasq
/lib/systemd/system/dnsmasq.service
While the system service setup is complex and arguably comprehensive, he actual command that runs as a result will look similar to this:
/usr/sbin/dnsmasq -x /run/dnsmasq/dnsmasq.pid -u dnsmasq -7 /etc/dnsmasq.d,.dpkg-dist,.dpkg-old,.dpkg-new --local-service --trust-anchor=[snip]
-x
provides the pidfile location, -u
is the user to run as, -7
is the config folder, --local-service
tells it to only accept queries on the local subnet(s) and --trust-anchor
relates to DNSSEC.
To keep it simple, the systemd dnsmasq
service is disabled as two new dnsmasq
services are run using supervisor
.
The config lives in /srv/dnsmasq/home.conf
and /srv/dnsmasq/guest.conf
respectively, again to keep it separate. I've opted for single file configs here but in future may break out lease reservation groups into other files.
To stop and disable the systemd dnsmasq
process run:
systemctl stop dnsmasq.service
systemctl disable dnsmasq.service
It would be possible to have one run in systemd and the other under supervisor
.
Supervisor Config
After installing supervisor
(via apt install supervisor
) create two config files:
These two files declare the home and guest dnsmasq
processes. The main differences to the systemd version is that the --conf-file
is specified, to avoid defaulting to the /etc
path, and we swap -x ...
for --no-daemon
which runs it in the foreground (which supervisor
prefers). The --trust-anchor
argument was copied from the systemd command (captured using ps waux | grep dnmasq
).
The two dnsmasq
config files are similar to before, with a few key changes:
The key differences are as follows:
- In both we bind to interfaces and IPs - the bind interfaces makes the process more selective about binding
- Guest does not bind to
lo
which is a default, to avoid a clash - Each can and does have its own domains
- Each has its own lease file
- No need to localise queries
- Tags are not used
Conclusion
The net result is exactly as desired using two dnsmasq
processes, however it took a while to sift through the various Stack Overflow posts to conclude the best approach. It also took a while to get the right configuration options to ensure DHCP bound to the interface declared when using the 2 process setup.
The main downside for me is that this is a very complex setup and easy to break - removing the DROP rule from iptables removes the only line of defence between the two subnets.
On the plus side, you can connect anything to the slave ports on the switch, APs or dedicated devices and they act in isolation to your main network.
Of course there are other options, such as having both networks as sibling NATs under a higher NAT - but The Internet seems less keen on lots of NATs, and I can empathise with that.
Another potential issue is that any traffic and activity from the guest network is still linked to your network and public IP address.
I plan to try out a few other ideas in future posts, including moving the setup largely as is to the MikroTik routers, and exploring the concept of a plug-and-play private and isolated network.
Hints and Tips
Renewing a DHCP lease
On Ubuntu the DHCP lease can be renewed with sudo dhclient -r; sudo dhclient -v
which first releases the lease then re-runs the DHCP client, in verbose mode so the results are easily viewed.
DNS Settings
On the latest Ubuntu the DNS settings can be viewed with resolvectl status
.
Monitoring listener bindings
While trying to ascertain why the configuration for dnsmasq
and the 2 process setup was fighting for DHCP port UDP/67 I was using lsof -i -P -n
to view listening connections, the problem is that it isn't possible to see which interface the connection is listening on.
The following result, while bound to 0.0.0.0
can be scoped to just one interface.
dnsmasq 58895 root 4u IPv4 277252 0t0 UDP *:67
I found the following solution on Stack Exchange:
ss -nlput | grep dnsmasq
Will show the TCP/UDP sockets for the process dnsmasq
.
-n
no port to name resolution-l
only listening sockets-p
show processes listening-u
show udp sockets-t
show tcp sockets
An example output is below: