Troubleshooting XenServer VM network performance

Recently I had to troubleshoot network performance issues for VMs running on Citrix XenServer 6.0.2. All VMs were running Windows Server 2008 R2, and CIFS copy jobs were incredibly slow. The physical network couldn’t be the problem, as all VMs were connected to the same virtual network on a single XenServer host.

My favorite troubleshooting tool for these scenarios is netio123. It gives you a basic idea of how much throughput you can get between two endpoints. It implements a server and a client in the same binary and is available for multiple operating systems. Small, easy and straightforward…
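To make the idea concrete, here is a minimal Python sketch of this kind of measurement: one side accepts a TCP connection and discards the data, the other sends fixed-size blocks for a short time and reports KByte/s. This is not how netio is implemented; the function names `run_sink` and `measure_tx`, the 0.5 second duration and the loopback address are assumptions made for illustration, and only the Tx direction is measured.

```python
import socket
import threading
import time


def run_sink(server_sock):
    """Accept one connection and discard everything it sends."""
    conn, _ = server_sock.accept()
    with conn:
        while conn.recv(65536):
            pass


def measure_tx(host, port, packet_size, duration=0.5):
    """Send fixed-size blocks for `duration` seconds; return KByte/s."""
    payload = b"\x00" * packet_size
    sent = 0
    with socket.create_connection((host, port)) as sock:
        start = time.perf_counter()
        while time.perf_counter() - start < duration:
            sock.sendall(payload)
            sent += packet_size
        elapsed = time.perf_counter() - start
    return sent / 1024 / elapsed


if __name__ == "__main__":
    # Loopback demo; port 0 lets the OS pick a free port.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    port = server.getsockname()[1]
    for size_k in (1, 2, 4, 8, 16, 32):
        sink = threading.Thread(target=run_sink, args=(server,), daemon=True)
        sink.start()
        rate = measure_tx("127.0.0.1", port, size_k * 1024)
        sink.join()
        print(f"Packet size {size_k:2d}k bytes: {rate:10.0f} KByte/s Tx")
    server.close()
```

Run on a single machine this only exercises the loopback interface, so the numbers say nothing about a real network path. For actual troubleshooting a dedicated tool like netio remains the better choice.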

In the first run I tested two Windows VMs. The command “netio -t -p 5000 -s” starts the netio server on one VM on TCP/5000. To start the test, run “netio.exe -t -p 5000” followed by the server’s IP address on the other VM:

C:\temp\netio123\bin>netio.exe -t -p 5000 192.168.15.73

NETIO – Network Throughput Benchmark, Version 1.26
(C) 1997-2005 Kai Uwe Rommel
TCP connection established.
Packet size  1k bytes:  8419 KByte/s Tx,  9423 KByte/s Rx.
Packet size  2k bytes:  8236 KByte/s Tx,  8851 KByte/s Rx.
Packet size  4k bytes:  16263 KByte/s Tx,  18310 KByte/s Rx.
Packet size  8k bytes:  32720 KByte/s Tx,  33743 KByte/s Rx.
Packet size 16k bytes:  63435 KByte/s Tx,  65853 KByte/s Rx.
Packet size 32k bytes:  116217 KByte/s Tx,  121351 KByte/s Rx.

As the results show, I got less than 10 MByte/s for small packets. That seems quite slow for a virtual network on one host. I fired up two Linux VMs and reran the test:

root@squeeze2:~/netio123/bin# ./linux-i386 -t -p 5000 192.168.15.78

NETIO – Network Throughput Benchmark, Version 1.26
(C) 1997-2005 Kai Uwe Rommel

TCP connection established.
Packet size  1k bytes:  331380 KByte/s Tx,  337781 KByte/s Rx.
Packet size  2k bytes:  352727 KByte/s Tx,  344394 KByte/s Rx.
Packet size  4k bytes:  324983 KByte/s Tx,  325345 KByte/s Rx.
Packet size  8k bytes:  332496 KByte/s Tx,  328502 KByte/s Rx.
Packet size 16k bytes:  348690 KByte/s Tx,  357080 KByte/s Rx.
Packet size 32k bytes:  369076 KByte/s Tx,  356453 KByte/s Rx.

I consistently got more than 300 MByte/s, even for small packets. This seems to be fine. So what happens when I run the test in a mixed Windows/Linux environment, with a Windows VM as client and a Linux VM as server?

C:\temp\netio123\bin>netio.exe -t -p 5000 192.168.15.78

NETIO – Network Throughput Benchmark, Version 1.26
(C) 1997-2005 Kai Uwe Rommel

TCP connection established.
Packet size  1k bytes:  108210 KByte/s Tx,  170729 KByte/s Rx.
Packet size  2k bytes:  129609 KByte/s Tx,  173783 KByte/s Rx.
Packet size  4k bytes:  219375 KByte/s Tx,  202754 KByte/s Rx.
Packet size  8k bytes:  280427 KByte/s Tx,  205357 KByte/s Rx.
Packet size 16k bytes:  283376 KByte/s Tx,  206990 KByte/s Rx.
Packet size 32k bytes:  239445 KByte/s Tx,  206456 KByte/s Rx.

Wow, an impressive 100 MByte/s and more, even for small packets. That’s over 10x compared to the test between two Windows VMs. Now it’s fairly certain that the network performance issues are related to the Windows VMs themselves, and only occur when they connect to other Windows VMs. Windows implements enhanced TCP features, which you can display with the netsh command (e.g. “netsh int tcp show global”). One basic troubleshooting step is to disable these features and rerun the test. To disable individual features, run the following commands: “netsh int tcp set global chimney=disabled” to disable TCP chimney offload, “netsh int tcp set global rss=disabled” to disable receive-side scaling… The breakthrough in my case was disabling autotuning and rerunning the test between the two Windows VMs:

netsh interface tcp set global autotuninglevel=disabled

C:\temp\netio123\bin>netio.exe -t -p 5000 192.168.15.73

NETIO – Network Throughput Benchmark, Version 1.26
(C) 1997-2005 Kai Uwe Rommel

TCP connection established.
Packet size  1k bytes:  94664 KByte/s Tx,  98326 KByte/s Rx.
Packet size  2k bytes:  99216 KByte/s Tx,  102267 KByte/s Rx.
Packet size  4k bytes:  99947 KByte/s Tx,  105185 KByte/s Rx.
Packet size  8k bytes:  98054 KByte/s Tx,  99594 KByte/s Rx.
Packet size 16k bytes:  99172 KByte/s Tx,  103267 KByte/s Rx.
Packet size 32k bytes:  98967 KByte/s Tx,  102743 KByte/s Rx.
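To compare runs without eyeballing the numbers, the netio output can be parsed with a few lines of Python. The sketch below is only an illustration: the helper names `parse_netio` and `speedup` are made up for this post, and the regular expression assumes the output format of netio 1.26 shown above. It uses the 1k and 32k rows from the first Windows test and from the test after disabling autotuning.

```python
import re

# Matches lines like "Packet size  1k bytes:  8419 KByte/s Tx,  9423 KByte/s Rx."
LINE_RE = re.compile(
    r"Packet size\s+(\d+)k bytes:\s+(\d+) KByte/s Tx,\s+(\d+) KByte/s Rx"
)


def parse_netio(output):
    """Return {packet_size_k: (tx_kbyte_s, rx_kbyte_s)} from netio output."""
    results = {}
    for match in LINE_RE.finditer(output):
        size_k, tx, rx = (int(group) for group in match.groups())
        results[size_k] = (tx, rx)
    return results


def speedup(before, after):
    """Tx speedup factor for each packet size present in both runs."""
    return {
        size: round(after[size][0] / before[size][0], 1)
        for size in before
        if size in after
    }


# Two rows from each Windows-to-Windows run in this post.
before = parse_netio("""
Packet size  1k bytes:  8419 KByte/s Tx,  9423 KByte/s Rx.
Packet size 32k bytes:  116217 KByte/s Tx,  121351 KByte/s Rx.
""")
after = parse_netio("""
Packet size  1k bytes:  94664 KByte/s Tx,  98326 KByte/s Rx.
Packet size 32k bytes:  98967 KByte/s Tx,  102743 KByte/s Rx.
""")
print(speedup(before, after))
```

For these rows the Tx factor is about 11.2 for 1k packets and about 0.9 for 32k packets, so the gain is concentrated on the small packet sizes.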

Disabling autotuning immediately gave a roughly 10x performance boost for small packets. The sluggish copy jobs are now performing well.

So, always make sure your VM network is running fine. The small tool netio123 (you’ll easily find it on the web) can help you with some basic tests.
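If you have to repeat the disable-and-retest routine on several VMs, it could be scripted along these lines. This is a sketch under assumptions: the helper names are hypothetical, and the restore values (“automatic” for chimney, “enabled” for rss, “normal” for autotuninglevel) are what I assume to be the Windows Server 2008 R2 defaults, so check them against your own “netsh int tcp show global” output first. By default the script only prints the commands instead of running them.

```python
import subprocess

# Feature -> (value that disables it, value ASSUMED to restore the default).
TCP_FEATURES = {
    "chimney": ("disabled", "automatic"),
    "rss": ("disabled", "enabled"),
    "autotuninglevel": ("disabled", "normal"),
}


def netsh_command(feature, value):
    """Build the netsh argument list for one global TCP setting."""
    return ["netsh", "int", "tcp", "set", "global", f"{feature}={value}"]


def set_feature(feature, enabled, dry_run=True):
    """Disable or restore one feature; with dry_run only return the command."""
    disabled_value, default_value = TCP_FEATURES[feature]
    cmd = netsh_command(feature, default_value if enabled else disabled_value)
    if not dry_run:
        # Changing global TCP settings needs an elevated prompt on Windows.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    for name in TCP_FEATURES:
        print(" ".join(set_feature(name, enabled=False)))
        # Rerun netio here and note the result, then restore the setting:
        print(" ".join(set_feature(name, enabled=True)))
```

Testing one feature at a time and restoring it afterwards makes it easier to see which setting is actually responsible, which in my case was autotuning.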