VLANs on MikroTik: The good, the bad and the ugly


RouterOS remains for me the absolute best platform for enterprise networking today. From its intuitive command line and built-in syntax highlighting, to the incredible Winbox UI, which makes visualizing configurations so much easier, to the little things like MAC Winbox and RoMON, there’s much to love.

That said, the platform itself is in a constant state of evolution and development. I’m pleased that MikroTik develops so actively and is always releasing new products and updates for its existing ones. Changelogs for RouterOS updates include bugfixes for individual router models, some of which are quite old, meaning that if you buy a MikroTik product, chances are you’ll be able to use it in production for a good long time.

Update
It seems that hardware offload of VLAN tagging depends on the particular switch platform you’re using. I was using a hAP ac and a CRS125 to test configurations when I wrote this post. I’ve since tested with a CRS317 and a set of CRS326s and can confirm that on those devices bridgeful VLAN tagging is indeed hardware offloaded and results in very little CPU usage. I’ve not been able to find any official documentation that explains this behavior, so I have to assume that it’s a limitation of how RouterOS works with certain switch chips. However, unless you know about these limitations beforehand and test for them afterwards, they can and will screw people over.

Bridgeful VLANs and Hardware Offloading

One of the pleasing new directions taken in RouterOS 6.41 is the introduction of hardware offloading and bridgeful VLANs.

Hardware offloading refers to the dynamic offloading of bridge packet handling to the built-in switch chip. This significantly increases performance and decreases CPU usage. 6.41 also did away with the esoteric ‘master port’ configurations that defined so-called switch groups, which was a welcome change.

However, bridgeful VLANs were, for me anyway, the really exciting news. This new interface allows you to implement your VLANs and ingress tagging directly in the familiar bridge menu in Winbox. This (somewhat) intuitive interface is a welcome change from the myriad of different switch menus one would normally have to go through to do the same job. And of course it would be the same across different models of RouterBOARD devices, instead of offering different options for different switch chips as was previously the case.
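For readers who prefer the terminal, here is a minimal sketch of what a bridgeful VLAN setup can look like. The bridge name, port names and VLAN ID are assumptions chosen for illustration (ether1 as a tagged trunk, ether2 as an untagged access port in VLAN 10); adapt them to your own device.

```
# Assumed names: bridge1, ether1 (trunk), ether2 (access port), VLAN 10.
# Create the bridge with filtering off so you don't lock yourself out mid-config.
/interface bridge
add name=bridge1 vlan-filtering=no

# Add the ports; pvid sets the VLAN assigned to untagged ingress frames.
/interface bridge port
add bridge=bridge1 interface=ether1
add bridge=bridge1 interface=ether2 pvid=10

# Define which ports carry VLAN 10 tagged and which untagged.
/interface bridge vlan
add bridge=bridge1 vlan-ids=10 tagged=ether1 untagged=ether2

# Enable filtering only once the VLAN table is complete.
/interface bridge
set bridge1 vlan-filtering=yes
```

Leaving vlan-filtering disabled until the last step is a deliberate choice: enabling it on a half-finished VLAN table is an easy way to cut off your own management access.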

I was disappointed, however, when I found that, contrary to what the release implied, VLAN tagging via the bridge was not hardware offloaded on the devices I tested. That is to say, the actual tagging and untagging of frames moving through the bridge appears to be handled by the CPU. This meant that a single gigabit data flow had the potential to max out the CPU on your RouterBOARD device, severely impacting overall performance. For me, this is a significant step back.
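One way to check this on your own hardware is sketched below. These are standard RouterOS commands, but how the output looks on a given model is something to verify yourself. As far as I understand it, bridge ports handled by the switch chip show an H flag, and that flag disappears when the chip cannot offload the current bridge configuration.

```
# List bridge ports; an H flag marks ports that are hardware offloaded.
/interface bridge port print

# Check overall CPU load while pushing traffic across the bridge.
/system resource print

# Break CPU usage down by process to see what is doing the work.
/tool profile
```

Running these before and after enabling VLAN filtering, with a sustained transfer in progress, should make it fairly obvious whether tagging has fallen back to the CPU.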

Naturally, not all deployments will require VLAN tagging and multiple gigabit data flows through the routing or switching infrastructure. Bridgeful VLAN tagging has its place in SOHO deployments and in places where fileservers aren’t present. For larger deployments, however, say on the backbone of a medium-size business, it’s entirely unsuitable when the tagging falls back to the CPU.

CAPsMAN

One of the killer features on the RouterOS platform is the automatic configuration of wireless access points (or CAPs) via CAPsMAN. This lovely system allows one to dynamically broadcast SSIDs on devices based on MAC address or even the system identity, to upgrade CAPs quickly and easily and to centralize otherwise tedious configuration.

It’s also incompatible with bridgeful VLANs. That is to say, when CAPsMAN associates with a CAP, it creates a cap interface and may also add it to a bridge. If VLAN tagging is enabled in the datapath configuration, then frames are tagged accordingly. This is done in the same way as with wireless interfaces. You can change the VLAN tag associated with an interface and very quickly and easily end up with tagged frames coming into your bridge, ready to be routed accordingly.
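As a rough sketch of the kind of configuration meant here, a CAPsMAN datapath entry with VLAN tagging might look like the following. The datapath name, bridge name and VLAN ID are assumptions for illustration only.

```
# Assumed names: datapath-vlan20, bridge1, VLAN 20.
# With local-forwarding=no, client traffic is tunneled to the CAPsMAN
# device and the resulting cap interface is added to bridge1.
/caps-man datapath
add name=datapath-vlan20 bridge=bridge1 local-forwarding=no vlan-mode=use-tag vlan-id=20
```

The datapath is then referenced from a CAPsMAN configuration entry, so every CAP provisioned with that configuration tags its client traffic with the same VLAN ID.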

However, when VLAN filtering is enabled on the bridge, these cap and wireless interface VLAN tagging configurations are ignored. It makes no difference whether local or CAPsMAN forwarding is enabled on the datapath, or whether the cap interface is created dynamically or statically: if bridgeful VLAN tagging is set, VLAN tagging will not occur for any traffic associated with that configuration.

To me, this seemed like a horrendous oversight and I took to the forums to explain what I’d found. The thread languished without a response from the community or from any MikroTik officials.

To be fair, it is quite possible to work around this. Simply not using bridgeful VLAN tagging on the device acting as CAPsMAN sidesteps the problem. If you have a switch and a router, then it’s natural to assume that most VLAN tagging takes place on the switch anyway. All one has to do then is set up the correct VLAN interfaces on the router and you’re off to the races. And indeed, you could simply use switchful VLANs on your router and things work as expected.
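A minimal sketch of that workaround on the router acting as CAPsMAN, assuming VLAN filtering stays disabled on its bridge, could look like this. The interface name, VLAN ID and address are made up for the example.

```
# Assumed names and values: bridge1, vlan20, VLAN 20, 192.168.20.1/24.
# Leave vlan-filtering off on this bridge so CAPsMAN tagging keeps working.
/interface bridge
set bridge1 vlan-filtering=no

# A VLAN interface on top of the bridge picks up frames tagged with VLAN 20.
/interface vlan
add name=vlan20 interface=bridge1 vlan-id=20

# Give the VLAN a gateway address so its traffic can be routed.
/ip address
add address=192.168.20.1/24 interface=vlan20
```

Any tagging towards wired ports is then left to the downstream switch, which is where most of it would happen in this kind of layout anyway.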

Final Thoughts

As mentioned previously, I’ve been pleased overall with the constant development of RouterOS and the RouterBOARD products that run it. It is an excellent platform and continues to grow and evolve with customer needs. What I hope to have explained with this post is that these requirements and observations aren’t edge cases. Neglecting consideration of a fairly common CAPsMAN configuration in MikroTik’s approach to bridgeful VLANs is particularly egregious but certainly not a dealbreaker given the availability of workarounds.

MikroTik is a company pursuing ideals that I’m sympathetic to: standardization and excellence in their products, and a cheaper alternative to better-known names like Cisco and Ubiquiti. Perhaps patience is called for as we continue to use and learn about this wonderful platform.