A VMware Enterprise licensed solution, consisting of a number of VMware hosts, shared storage, and the vSphere management application, is a complex product. Obtaining maximum performance means tuning a number of different sub-systems, and getting those sub-systems working together efficiently.
One of the key sub-systems many people overlook is the network. Many seem to assume that simply plugging gear into a series of gigabit Ethernet ports is all that is necessary for interconnecting the various VMware solution components. That may be true for a basic level of functionality, but not for optimal performance.
VMware has a capability called vMotion, which allows guest operating systems to be migrated live from one host to another. This requires synchronization of sessions between the physical host servers, which in turn relies on an efficient network connection.
Storage vMotion is a function very similar in capability to vMotion; in this case it is used to migrate virtual machine file sets live between VMware DataStores. This requires coordination between shared storage devices, between hosts, and between hosts and shared storage devices. If iSCSI is used for accessing shared storage, the network becomes a doubly critical component of this migration and synchronization.
Here are some ideas for improving the performance of a VMware solution at the network level.
Speed and Duplex: It is easy to overlook the fact that the host server may not always negotiate proper speed and duplex settings with the switch. Both the server and the switch should be checked to ensure that they have negotiated 1 Gbps at full duplex. Gigabit Ethernet ports also negotiate flow control, so you will want to ensure that the host and the switch are consistent in those settings as well. Managed switches will commonly show whether there are any duplex mismatches, and whether errors have been encountered.
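As a rough sketch, on a Cisco IOS switch the negotiated values and error counters can be checked, and negotiation explicitly configured, along these lines (the interface name is a placeholder, and commands vary by platform):

    ! Verify what speed/duplex each side actually negotiated
    show interfaces GigabitEthernet1/0/1 status
    ! Error counters here often point to a duplex mismatch
    show interfaces GigabitEthernet1/0/1 counters errors
    ! Gigabit copper ports should normally be left to autonegotiate
    interface GigabitEthernet1/0/1
     speed auto
     duplex auto

The matching check on the host side is done in the NIC driver or the vSphere networking settings.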
TCP Offload Engine (TOE): Do your Network Interface Cards (NICs) have TCP Offload Engine capability? Has it been enabled? Are the cards on VMware's hardware compatibility list?
Fault Tolerance: Most modern enterprise servers come with two NICs, which provides both load-balancing and fault-tolerance options. In one scenario, the two NICs are bundled and connected to one switch for higher overall throughput. The other scenario connects one NIC to one switch and the other NIC to a second switch; in this mode, bundling has to be turned off. If one switch becomes unavailable, all traffic will run through the switch still operating. The two-switch configuration enables a number of additional optimizations, described in subsequent points.
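For the first, single-switch scenario, the switch-side bundling might be sketched as follows on Cisco gear (port and channel numbers are placeholders; 'mode active' assumes the server's NIC teaming speaks LACP, otherwise static 'mode on' bundling would be needed):

    interface range GigabitEthernet1/0/1 - 2
     channel-group 1 mode active
    interface Port-channel1
     description Bundled link to VMware host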
Separation of Data and iSCSI Traffic: When hosts use iSCSI to connect to SAN or NAS devices, the network becomes an integral part of host/DataStore communications. It is commonly recommended that iSCSI traffic not traverse the same network links as regular host data traffic. In the two-NIC/two-switch configuration described above, this generally means iSCSI traffic should be on one NIC and regular data traffic on the other. If plain switch access ports are used for the two types of traffic, fault tolerance is lost; a solution for this is outlined below. Also, all iSCSI-preferred ports should be connected to one switch, and all data-preferred ports to the other switch.
Use of VLANs: In order to mix traffic types on the NICs, VLANs should be configured on the switches, and the switch ports connecting to the servers should be configured as trunk ports. At this point, at least two VLANs are required: a data VLAN and an iSCSI VLAN. A trunk also carries a native VLAN, which can be the default VLAN 1 or some other neutral VLAN; the native VLAN should not be used for any sort of traffic. Note that 802.1p class-of-service markings are carried only in 802.1Q tags, so QoS cannot be applied to untagged native-VLAN traffic. The VLAN configuration should be identical on the two switches, and on each of the two trunk ports connecting to the servers.
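As an illustration, assuming VLAN 10 for data, VLAN 20 for iSCSI, and VLAN 999 as the unused native VLAN (all IDs are placeholders), the switch side of each server trunk port might be sketched as:

    vlan 10
     name DATA
    vlan 20
     name ISCSI
    interface GigabitEthernet1/0/1
     ! Only needed on platforms that also support ISL encapsulation
     switchport trunk encapsulation dot1q
     switchport mode trunk
     switchport trunk native vlan 999
     switchport trunk allowed vlan 10,20

The same VLAN IDs would then be configured on the port groups of the host's virtual switch.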
Server Separation of Traffic: Once the VLANs have been configured and matched on the switch and server sides, the server should be set so that iSCSI traffic favours one VLAN and data traffic favours the other. In the event of a switch failure, both traffic types will share the one remaining link in a slightly degraded state.
Switch Ports: On many switches, each port shares bandwidth with a group of neighbouring ports. This can cause traffic contention, and possibly packet loss. For example, in a Cisco 4500E switch with a Supervisor V, each set of 8 ports on a 48-port blade shares 1 Gbps of bandwidth to the Supervisor. This is called an oversubscription ratio; in this case, the ports are oversubscribed 8:1. Given the high instantaneous traffic loads that VMware hosts can place on their associated iSCSI DataStores, oversubscribed ports are not recommended. It is best to use low port-count server blades, or higher-capacity switches, to eliminate these issues of bandwidth contention.
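To make the arithmetic concrete: eight ports at 1 Gbps each can offer up to 8 Gbps into a shared 1 Gbps path to the Supervisor, hence 8:1. If even two hosts in the same 8-port group drive their iSCSI links near line rate, they are already contending 2:1 for that shared gigabit, and frames will queue or drop.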
Switch Cross-Connects: In a similar vein, when cross-connecting two switches, it is best to use non-blocking, non-oversubscribed switch ports. Bundling multiple ports together to improve inter-switch traffic capacity is also recommended. Just remember that bundling two or more adjacent ports on an oversubscribed blade will not yield the desired benefit; only non-blocking, non-oversubscribed switch ports should be in a bundle.
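A sketch of such an inter-switch bundle, again with placeholder port and VLAN numbers, might be:

    interface range GigabitEthernet1/0/25 - 26
     switchport mode trunk
     switchport trunk allowed vlan 10,20
     channel-group 2 mode active
    interface Port-channel2
     description Cross-connect to peer switch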
Switch Spanning Tree: When multiple switches are interconnected, they should be configured with spanning tree in order to prevent loops in the network. For optimizing traffic patterns in a mixed iSCSI/data network on redundant switches, a common rule of thumb is to keep iSCSI traffic on one switch, and all other data traffic on the other. If the host server port connections for iSCSI and data, as explained above, are mixed between switches, then in some cases an extra switch hop is required, which even at gigabit speeds can slow things down. Per-VLAN spanning tree should be implemented: the root for the iSCSI VLAN should be on the iSCSI-preferred switch, and the root for the data VLAN on the data-preferred switch. This minimizes cross-switch data transfer, thereby optimizing traffic flow.
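With Cisco per-VLAN spanning tree, the root placement can be pinned roughly as follows (VLAN 20 for iSCSI and VLAN 10 for data, as in the earlier placeholders):

    ! On the iSCSI-preferred switch
    spanning-tree mode rapid-pvst
    spanning-tree vlan 20 root primary
    spanning-tree vlan 10 root secondary
    ! On the data-preferred switch, mirror the roles
    spanning-tree mode rapid-pvst
    spanning-tree vlan 10 root primary
    spanning-tree vlan 20 root secondary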
Switch Port Settings: When a device connected to a switch port is powered on, or is first plugged in, the switch will typically not allow traffic to flow for a number of seconds while it recalculates spanning tree. On Cisco switches this delay can be reduced through three settings: portfast, bpduguard, and bpdufilter.
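A sketch for a server-facing trunk port follows; note that bpdufilter suppresses BPDUs entirely and is usually an alternative to bpduguard rather than a companion:

    interface GigabitEthernet1/0/1
     ! Skip the listening/learning delay; 'trunk' keyword needed on trunk ports
     spanning-tree portfast trunk
     ! Err-disable the port if a switch is ever plugged in by mistake
     spanning-tree bpduguard enable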
Fault Tolerant Routing: This two-switch configuration should be supported by redundant layer 3 routing, commonly implemented via HSRP or VRRP. When setting up this fault-tolerant routing, the default gateway weighting for the iSCSI VLAN should favour the iSCSI-preferred switch, and the default gateway weighting for the data VLAN should favour the data-preferred switch.
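An HSRP sketch for the iSCSI VLAN, with placeholder addressing, might look like this; the data VLAN would mirror it with the priorities reversed on the other switch:

    ! On the iSCSI-preferred switch
    interface Vlan20
     ip address 192.168.20.2 255.255.255.0
     standby 20 ip 192.168.20.1
     standby 20 priority 110
     standby 20 preempt
    ! The other switch keeps the default priority of 100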
Fault Tolerant VLAN: Some VMware Fault Tolerance operations require an inter-host heartbeat. It is best to create an additional VLAN for this traffic and make it available over the trunk ports to the servers. I'd suggest giving this VLAN the same NIC preference as the iSCSI VLAN. This can cause some contention, but that can be minimized through the use of QoS, as suggested in the next point.
Quality of Service: When multiple traffic types from multiple sources attempt to use common links, there is always the opportunity for contention, packet jitter, and subsequent packet loss. I would rank heartbeat traffic as highest priority (low volume), then iSCSI traffic (high volume), then regular traffic (high volume). Suitable QoS settings should be applied on the host server side and on the various switch ports to ensure high-priority traffic is prioritized and apportioned appropriately.
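QoS syntax differs considerably between Cisco platforms, but as a hedged, MQC-style sketch (class names and percentages are illustrative, and traffic is assumed to already carry CoS markings from the host or access layer):

    class-map match-any CM-HEARTBEAT
     match cos 5
    class-map match-any CM-ISCSI
     match cos 4
    policy-map PM-SERVER-EGRESS
     ! Strict priority for the low-volume heartbeat
     class CM-HEARTBEAT
      priority percent 10
     ! Guaranteed share for storage traffic; the rest falls to class-default
     class CM-ISCSI
      bandwidth percent 60
    interface GigabitEthernet1/0/1
     service-policy output PM-SERVER-EGRESS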
Management Overhead: Switches and ports may also carry other traffic such as routing protocols, network management data, voice traffic, and so on. These other traffic types have to be appropriately analyzed and integrated into the overall VLAN, QoS, and routing architecture.
Jumbo Frames: Common IPv4 traffic relies on packets carrying payloads of at most 1500 bytes (a 1500-byte MTU). Large data streams can be further optimized by configuring the switches and host server NICs to allow larger MTU sizes; values around 9000 bytes are commonly used.
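On the switch side, a hedged example follows; the exact commands vary by Cisco platform, with some setting MTU globally (sometimes requiring a reload) and others per interface:

    ! Global jumbo MTU on platforms such as the Catalyst 3750
    system mtu jumbo 9000
    ! Per-interface MTU on platforms that support it
    interface GigabitEthernet1/0/1
     mtu 9000

The host's virtual switches and VMkernel ports must be set to the same MTU end to end, or large frames will be dropped.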
Summary: As you can see, there are many network-related optimizations available for obtaining even better performance and reliability from VMware-based clusters.