Can VMware vSAN redeem itself on mixed NVMe+SSD underlying storage? Or will it just mess things up [again]?

Introduction

VMware vSAN performs like ujkv. Period. Well, that's exactly what the numbers told us in our previous study! And it probably sucks to buy from VMware at all: we haven't seen any decent vSAN alternative yet, so you guys are stuck with shared storage that performs as slow as old snails hwem.

But wait, what if vSAN's mediocre performance has something to do with the underlying storage, and the solution is not a piece of ujkv after all? Yup, we read your comments. There was a guy asking for a study on just a bunch of SSD drives of smaller capacity. Well, pal, here's the study that you've been asking for!

No problem, we'll carry out a VMware vSAN performance test for the THIRD HWEMKPI TIME to learn the truth! You see, there's an opinion that VMware vSAN groups smaller disks into a vsanDatastore and performs like a good software-defined storage solution should. That's what we're gonna check today. Let's just hope neither we nor the solution hwems anything up this time. So, lean back, grab a bag of popcorn, Doritos, or chips and enjoy the reading!

P.S. We’ve heard that guys from “mother Russia” like sunflower seeds. Well, that’s pretty weird, but if you are good with it, do a Slavic squat and grab a bag of sunflower seeds :).

Patient

VMware vSAN

Address

https://www.vmware.com/products/vsan.html

Doctor

Anarchy

Symptoms

There are serious concerns about performance and scalability. The patient claims that everything is alright and has even provided a fake note from a general practitioner proving that everything is fine.

State

Relapse. Further investigation is needed. Doctor, do a lobotomy and cpcn rtqdg, please.

Additional information

Look, we are sick and tired of writing about what VMware vSAN is. You may agree that reading the same ujkv for the third time is too much. So, why don’t you just look through any of our previous studies to find out what we think about this solution?

http://www.cultofanarchy.org/another-day-another-patient-four-node-vmware-vsan-cluster-performance-report/

http://www.cultofanarchy.org/vmware-vsan-is-great-in-messing-things-up-re-investigating-vmware-vsan-4-node-cluster-performance/

Preparations

The purpose

Stress-test VMware vSAN performance under a rising number of VMs running in the cluster. We are going to spawn VMs until the overall cluster performance chokes, just as usual.

Methodology

1. Check in a Windows Server 2016 environment whether Intel SSD DC P3700 2TB and Intel SSD DC S3500 480GB performance matches the vendor-claimed values.

2. Deploy VMware vSAN with a vsanDatastore comprised of 4x Intel SSD DC P3700 2TB (cache tier) and 16x Intel SSD DC S3500 480GB (capacity tier).

Let's look at the idea behind this round of VMware vSAN testing. This time, we create a disk group out of the disks in each host. The 4 disk groups, in turn, are consolidated into a shared storage pool available to the entire vSphere cluster.

vSAN provides two fault tolerance methods:

RAID 1 (Mirroring). Data is mirrored across the vSAN cluster, so the Number of failures to tolerate parameter is 1. With RAID 1, storage utilization sucks, but you can achieve awesome performance (that's what we are looking for!). To ensure fault tolerance, 2-3 hosts are enough. However, in our book, you need at least 4 hosts to ensure rebuilding in case one of those nodes fails.

RAID 5/6 (Erasure Coding). You can apply this RAID level only to all-flash clusters. Such clusters can tolerate 1 or even 2 failures in setups consisting of 4 and 6 nodes respectively. But, if you expect your cluster to rebuild in the event of a failure, you need at least 5 or 7 nodes. In such a setup, you use disk space a bit smarter, but the overall cluster performance sucks.
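To put the capacity overhead into numbers (a quick back-of-the-napkin example, not something we measured): a 100GB object with FTT=1 consumes 200GB of raw capacity under RAID 1 (2 full copies) and about 133GB under RAID 5 (3+1 parity), while with FTT=2, RAID 6 (4+2) eats roughly 150GB. That capacity penalty is the price of the performance we are chasing with RAID 1.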

wp-image-1770

3. Set up the vsanDatastore. Today, we've changed the storage configuration a bit: in each host, we use just one Intel SSD DC P3700 2TB for the cache tier and 4x Intel SSD DC S3500 480GB for the capacity tier.

wp-image-1771

wp-image-1772

Now, edit the vSAN Default Storage Policy.

wp-image-1773

wp-image-1774

wp-image-1775

wp-image-1776

Here’s what we got at the end of the day.

wp-image-1777
wp-image-1778

4. Create a Windows Server 2016 VM pinned to an ESXi host. It has two disks: a “system” one and an 80GB VMware Virtual Disk (VMware Paravirtual) “data” one. As their names imply, the former keeps the OS while the latter contains some data. “Data” is the disk whose performance we're gonna investigate today. And, since we do not give a hwem about the “system” disk performance, we refer to the “data” disk performance as “VM performance” throughout this study.

5. Pick the optimum test utility parameters (threads and Outstanding I/O). Just as usual, we use DiskSPD and FIO for this study.

6. Benchmark single VM performance.

7. Clone that VM to another ESXi host. The daughter VM has its own 80GB virtual “data” disk on the vsanDatastore. Benchmark the overall performance of the 2 VMs.

8. Do the “clone-benchmark-clone” cycle until the overall cluster performance saturates.

9. Get some reference numbers. Measure the “bare-metal” performance of individual Intel SSD DC P3700 2TB and Intel SSD DC S3500 480GB drives in the Windows Server 2016 environment. Based on those numbers, we judge the overall VMware vSAN performance.

Hardware toys

Setup for checking disk performance

Here's the setup configuration that we used to check the NVMe and SSD disks' performance. It's always good to know that these babies run as well as their vendor says. Check out the host configuration below:

Dell R730, CPU 2x Intel Xeon E5-2683 v3 @ 2.00 GHz (14 physical cores each), RAM 64GB

Storage 1: 1x Intel SSD DC P3700 2TB

Storage 2: 1x Intel SSD DC S3500 480GB
OS: Windows Server 2016

Setup for re-investigating VMware vSAN performance

As the article name implies, there are 4 ESXi hosts in today's cluster. We have 4 Dell R730 boxes in total. Let's call them ESXi Host #1, ESXi Host #2, ESXi Host #3, and ESXi Host #4. All these guys look pretty much the same from the hardware point of view. Here's each host's configuration:

Dell R730, CPU 2x Intel Xeon E5-2683 v3 @ 2.00 GHz (14 cores per CPU), RAM 64GB

Storage: 1x Intel SSD DC P3700 2TB, 4x Intel SSD DC S3500 480GB

LAN: 2x Mellanox ConnectX-4 100Gbit/s CX465A

Hypervisor: VMware ESXi 6.5 Update 1

And, to make all we've just written clear, here's the interconnection diagram.

wp-image-1779

Software toys

Just like a dozen times before, we use DiskSPD v2.0.20a and FIO v3.8 for our tests. Find their launch parameters alongside the respective measurements below.

Note: Here, we used a thick provisioned device as the “data” disk. Regardless of that, we still populated the disk with random data using dd.exe. We carried out this procedure every time we created the virtual disk or changed its capacity.

dd.exe launching parameters:

dd.exe bs=1M if=/dev/random of=\\?\Device\Harddisk1\DR1 --progress

Do the disks perform as well as their vendor says?

Before we move on, let's find out whether both the NVMe and SSD disks perform as well as Intel writes in the datasheets. Like any good scientists (we wanna believe that we are ones 🙂), we want to know that our tools work as they should. So, here's some boring text before we jump to the interesting stuff.

wp-image-1780

wp-image-1781

Intel says all those numbers were derived under 4k random read with 4 workers and Queue Depth 32:

wp-image-1782

So, those are the parameters we use to test the disks' performance under a varying queue depth.
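Here's the kind of DiskSPD line that reproduces the datasheet conditions (just a sketch; as everywhere else in this article, we assume the raw drive under test shows up as physical drive #1):

diskspd.exe -t4 -b4k -r -w0 -o32 -d60 -Sh -L #1 > c:\log\t4-o32-4k-rand-read.txt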

The plots below represent the numbers we actually got.

wp-image-1783

wp-image-1784

wp-image-1785

wp-image-1786

Mini-conclusion

On the whole, both Intel SSD DC P3700 2TB and Intel SSD DC S3500 480GB perform under the 4k random read pattern with 4 workers and Queue Depth 32 exactly as their vendor says. For the NVMe drive, the numbers derived with DiskSPD perfectly match the Intel datasheet (460K IOPS). For the SSD disk, both test utilities showed the performance that we expected. Well, it's good to know that at least one thing in this dungeon, sorry, lab works as it should.

Testing the network bandwidth

Once we installed ESXi on all four hosts, we checked the network bandwidth between the hosts with iperf.
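For reference, a check like that boils down to running an iperf server on one host and a client on another, roughly like this (a sketch assuming iperf3; the IP address, stream count, and duration are placeholders rather than our exact setup):

iperf3 -s

iperf3 -c 172.16.0.1 -P 8 -t 30

The -P flag runs several parallel streams, which usually helps saturate fat links.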

Wait… can someone explain to us WHAT THE HWEM?! Even though we used 100 Gbit/s switches and 100 Gbit/s NICs, the network bandwidth between hosts could not go beyond 40 Gbit/s.

wp-image-1787

We cannot leave things like that, you know. That's why we installed the most up-to-date nmlx5-core 4.16.12.12 NIC driver for the Mellanox ConnectX-4 100Gbit/s CX465A, grouped both NICs on each host into NIC teams, changed the MTU value, played around with the switch settings, and even made a sacrifice to the Spaghetti Monster… but nothing worked out!

Anyway, according to what VMware says, we could get by even with a miserable 10 Gbit/s bandwidth. So, 40 Gbit/s of network bandwidth is enough for a vSAN all-flash cluster with NVMe disks in the underlying storage.

wp-image-1788

OK, let's just hope that such a hwemed up network bandwidth won't hold back vSAN performance.

To make sure that vSAN can still do its best, let's do some quick math. A single Mellanox ConnectX-4 NIC delivers 40 Gbit/s (5 GB/s) in our case, so the two NICs together should deliver around 80 Gbit/s (10 GB/s). That caps host network throughput at (10 GB/s * 1024 * 1024) / 4 KB ≈ 2620K IOPS at 4k block size.

What is the real performance of our underlying storage, then? 460K IOPS is the best a single NVMe drive can show us. For the Intel SSD DC S3500 480GB, 75K IOPS under 4k blocks is the best you will ever get, so 4 of these babies can deliver 75K*4=300K IOPS. With all those numbers in mind, the overall host storage performance should be around 760K IOPS (460K IOPS + 300K IOPS). So even the 80 Gbit/s network bandwidth is fine, as the overall underlying storage performance is lower than 2.62M IOPS.
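A napkin summary of the math above: per-host ceiling = min(network ceiling; storage ceiling) = min(2620K; 760K) ≈ 760K IOPS. In other words, even the crippled 80 Gbit/s network leaves more than 3x headroom over what the disks can physically push, so the disks, not the wire, set the limit.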

Picking the optimum VM configuration

There's a ujkvload of scenarios where you may need an all-flash cluster. Let's say you need a hweming fast datastore for Microsoft SQL Server. With that scenario in mind, let's pick a good VM configuration. Look, we are too lazy to reinvent the wheel, so we just adopted one of the standard VM configurations that you may use for running Microsoft SQL Server in Azure.

wp-image-1789

Find the test VM configuration below.

  • 4xVCPU
  • RAM 7GB
  • Disk0 (LSI Logic SAS) – 25GB (for OS Windows Server 2016) – “system”
  • Disk1 (VMware Paravirtual) – 80GB – “data”

Picking the perfect number of disk stripes per object value

Let's create a test VM and pin it to ESXi Host #1. But, before we move on to real measurements, let's see how the Number of disk stripes per object value impacts VMware vSAN performance under 4k random read. Later, we use these measurements to pick the optimum test utility parameters (number of threads and Outstanding I/O).

We are going to run a bunch of tests under the 4k random read pattern with a varying number of threads and Outstanding I/O. We need to find the performance saturation point: the number of threads and Outstanding I/O at which performance saturates is considered the optimum set of test utility parameters.

DiskSPD under threads=1, Outstanding I/O=1,2,4,8,16,32,64,128

diskspd.exe -t1 -b4k -r -w0 -o1 -d60 -Sh -L #1 > c:\log\t1-o1-4k-rand-read.txt

timeout 10

diskspd.exe -t1 -b4k -r -w0 -o2 -d60 -Sh -L #1 > c:\log\t1-o2-4k-rand-read.txt

timeout 10

diskspd.exe -t1 -b4k -r -w0 -o4 -d60 -Sh -L #1 > c:\log\t1-o4-4k-rand-read.txt

timeout 10

diskspd.exe -t1 -b4k -r -w0 -o8 -d60 -Sh -L #1 > c:\log\t1-o8-4k-rand-read.txt

timeout 10

diskspd.exe -t1 -b4k -r -w0 -o16 -d60 -Sh -L #1 > c:\log\t1-o16-4k-rand-read.txt

timeout 10

diskspd.exe -t1 -b4k -r -w0 -o32 -d60 -Sh -L #1 > c:\log\t1-o32-4k-rand-read.txt

timeout 10

diskspd.exe -t1 -b4k -r -w0 -o64 -d60 -Sh -L #1 > c:\log\t1-o64-4k-rand-read.txt

timeout 10

diskspd.exe -t1 -b4k -r -w0 -o128 -d60 -Sh -L #1 > c:\log\t1-o128-4k-rand-read.txt

timeout 10
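And here's the FIO counterpart of the same sweep for numjobs=1 (a sketch modeled on the job file we use later in this article; the job names and the PhysicalDrive1 target are our usual choices, and for the other thread counts you simply bump numjobs):

[global]
numjobs=1
loops=1
time_based
ioengine=windowsaio
direct=1
runtime=60
filename=\\.\PhysicalDrive1
rw=randread
bs=4k

[qd1]
iodepth=1
stonewall

[qd2]
iodepth=2
stonewall

[qd4]
iodepth=4
stonewall

[qd8]
iodepth=8
stonewall

[qd16]
iodepth=16
stonewall

[qd32]
iodepth=32
stonewall

[qd64]
iodepth=64
stonewall

[qd128]
iodepth=128
stonewall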

Number of disk stripes per object = 1 (reading)

wp-image-1790

VMware Virtual Disk 80GB (RAW) – 4k random read (DiskSPD)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 2914 11 0.34 8191 32 0.24 16146 63 0.25 29575 116 0.27
QD=2 7366 29 0.27 16056 63 0.25 29781 116 0.27 53146 208 0.30
QD=4 15673 61 0.25 29762 116 0.27 53320 208 0.30 66765 261 0.48
QD=8 29108 114 0.27 53025 207 0.30 65910 257 0.49 65618 256 0.98
QD=16 51522 201 0.31 65581 256 0.49 65009 254 0.98 63863 249 2.00
QD=32 60890 238 0.53 65455 256 0.98 60443 236 2.12 57909 226 4.42
QD=64 65351 255 0.98 64830 253 1.97 60945 238 4.20 58336 228 8.78
QD=128 59957 234 2.13 58842 230 4.35 60412 236 8.47 60686 237 16.87

wp-image-1791

VMware Virtual Disk 80GB (RAW) – 4k random read (FIO)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 3822 15 0.25 7680 30 0.25 15100 59 0.26 27904 109 0.28
QD=2 7540 29 0.26 14856 58 0.26 27501 107 0.28 50196 196 0.31
QD=4 14918 58 0.26 27415 107 0.28 48516 190 0.32 66554 260 0.47
QD=8 28403 111 0.27 49819 195 0.31 64295 251 0.49 67203 263 0.94
QD=16 50990 199 0.30 64388 252 0.49 64976 254 0.97 62024 242 2.05
QD=32 65886 257 0.47 63889 250 0.99 62131 243 2.05 59763 233 4.27
QD=64 66562 260 0.94 64897 254 1.95 59651 233 4.28 59768 233 8.56
QD=128 66265 259 1.91 60186 235 4.23 59752 233 8.56 60477 236 16.92

Number of disk stripes per object = 4 (reading)

wp-image-1792

VMware Virtual Disk 80GB (RAW) – 4k random read (DiskSPD)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 1842 7 0.54 4116 16 0.48 15026 59 0.26 29079 114 0.27
QD=2 3796 15 0.53 8225 32 0.49 29101 114 0.27 51393 201 0.31
QD=4 7512 29 0.53 16728 65 0.48 52936 207 0.30 64133 251 0.50
QD=8 14153 55 0.56 34749 136 0.46 64931 254 0.49 65853 257 0.97
QD=16 25298 99 0.63 53840 210 0.59 66752 261 0.96 65395 255 1.96
QD=32 35659 139 0.90 55476 217 1.15 68474 267 1.87 61290 239 4.18
QD=64 43598 170 1.47 57787 226 2.22 63310 247 4.04 60414 236 8.47
QD=128 51044 199 2.51 56043 219 4.57 63586 248 8.05 57776 226 17.72

wp-image-1793

VMware Virtual Disk 80GB (RAW) – 4k random read (FIO)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 3680 14 0.26 7540 29 0.26 14901 58 0.26 27818 109 0.28
QD=2 7321 29 0.26 14983 59 0.26 27160 106 0.29 49982 195 0.31
QD=4 15023 59 0.26 27893 109 0.28 50337 197 0.31 66046 258 0.48
QD=8 27608 108 0.28 50620 198 0.31 64431 252 0.49 66658 260 0.95
QD=16 50431 197 0.31 64783 253 0.48 65732 257 0.96 66809 261 1.91
QD=32 64124 250 0.49 67590 264 0.93 66170 258 1.92 62252 243 4.10
QD=64 64209 251 0.97 63214 247 2.00 58835 230 4.34 61450 240 8.32
QD=128 65790 257 1.92 55261 216 4.61 62171 243 8.22 62459 244 16.39

Number of disk stripes per object = 8 (reading)

wp-image-1794

VMware Virtual Disk 80GB (RAW) – 4k random read (DiskSPD)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 1409 6 0.71 7604 30 0.26 15922 62 0.25 29379 115 0.27
QD=2 3000 12 0.67 15745 62 0.25 29035 113 0.28 51857 203 0.31
QD=4 5996 23 0.67 28977 113 0.28 51569 201 0.31 62973 246 0.51
QD=8 10859 42 0.74 51990 203 0.31 63229 247 0.51 65157 255 0.98
QD=16 19497 76 0.82 64233 251 0.50 64853 253 0.99 65980 258 1.94
QD=32 31637 124 1.01 69529 272 0.92 67101 262 1.91 66797 261 3.83
QD=64 43303 169 1.48 68114 266 1.88 67940 265 3.77 67405 263 7.60
QD=128 50071 196 2.56 68861 269 3.72 68684 268 7.45 66452 260 15.41

wp-image-1795

VMware Virtual Disk 80GB (RAW) – 4k random read (FIO)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 3780 15 0.26 7763 30 0.25 15499 61 0.25 28970 113 0.27
QD=2 7654 30 0.25 15516 61 0.25 29074 114 0.27 51813 202 0.30
QD=4 15302 60 0.25 28582 112 0.27 52305 204 0.30 65271 255 0.48
QD=8 29130 114 0.27 51935 203 0.30 63737 249 0.49 67799 265 0.93
QD=16 51804 202 0.30 65819 257 0.47 65703 257 0.96 68580 268 1.86
QD=32 64191 251 0.48 69098 270 0.91 67987 266 1.87 69647 272 3.66
QD=64 68026 266 0.92 68157 266 1.86 68049 266 3.75 69004 270 7.41
QD=128 68830 269 1.83 68568 268 3.71 67403 263 7.58 68978 269 14.83

Number of disk stripes per object = 1 (writing)

wp-image-1796

VMware Virtual Disk 80GB (RAW) – 4k random write (DiskSPD)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 1942 8 0.51 3059 12 0.65 3650 14 1.10 6301 25 1.27
QD=2 3445 13 0.58 3639 14 1.10 6537 26 1.22 9825 38 1.63
QD=4 3666 14 1.09 6397 25 1.25 17497 68 0.91 38312 150 0.84
QD=8 6306 25 1.27 17545 69 0.91 37352 146 0.99 38719 151 1.50
QD=16 13056 51 1.22 29407 115 1.09 38358 150 2.63 38267 149 3.34
QD=32 32338 126 0.99 42345 165 1.51 38450 150 3.33 40059 156 6.39
QD=64 32730 128 1.96 37706 147 3.39 39755 155 6.44 39328 154 13.02
QD=128 38302 150 3.34 39304 154 6.50 39930 156 12.82 39144 153 26.16

wp-image-1797

VMware Virtual Disk 80GB (RAW) – 4k random write (FIO)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 1662 6 0.59 3098 12 0.63 3200 13 1.24 5983 23 1.32
QD=2 2551 10 0.77 3275 13 1.21 7845 31 1.01 17350 68 0.91
QD=4 3599 14 1.09 7800 30 1.01 13061 51 1.21 38291 150 0.82
QD=8 6580 26 1.20 12162 48 1.30 39085 153 0.81 40107 157 1.58
QD=16 11776 46 1.34 37928 148 0.83 37017 145 2.19 40561 158 3.14
QD=32 28973 113 1.09 39673 155 1.59 34155 133 3.73 39551 155 6.45
QD=64 39521 154 1.59 31447 123 4.04 40365 158 6.32 34809 136 14.69
QD=128 39132 153 3.24 30336 119 8.40 36326 142 19.43 38337 150 26.69

Number of disk stripes per object = 4 (writing)

wp-image-1798

VMware Virtual Disk 80GB (RAW) – 4k random write (DiskSPD)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 1323 5 0.76 2003 8 1.00 3625 14 1.10 6429 25 1.24
QD=2 1826 7 1.09 3507 14 1.14 7461 29 1.07 11656 46 1.37
QD=4 3175 12 1.26 7403 29 1.08 10091 39 1.59 16677 65 1.92
QD=8 6423 25 1.25 11210 44 1.43 14292 56 2.24 22688 89 2.82
QD=16 9716 38 1.65 16152 63 1.98 17215 67 3.72 26011 102 4.92
QD=32 11345 44 2.82 16610 65 3.85 22461 88 5.70 17530 68 14.60
QD=64 13353 52 4.79 18614 73 6.88 25536 100 10.03 23030 90 22.23
QD=128 14079 55 9.09 20336 79 12.59 26324 103 19.45 28459 111 35.89

wp-image-1799

VMware Virtual Disk 80GB (RAW) – 4k random write (FIO)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 1779 7 0.55 3275 13 0.60 3639 14 1.09 6657 26 1.19
QD=2 3310 13 0.59 3783 15 1.04 5752 22 1.38 13052 51 1.21
QD=4 3702 14 1.07 7024 27 1.13 13921 54 1.14 29801 116 1.06
QD=8 6685 26 1.18 13926 54 1.14 22892 89 1.39 31749 124 2.00
QD=16 6214 24 2.56 9960 39 3.20 17153 67 3.71 9818 38 13.02
QD=32 8655 34 3.68 12634 49 5.05 26135 102 4.88 16752 65 15.26
QD=64 8832 35 7.22 13430 52 9.50 16966 66 15.06 29794 116 17.17
QD=128 8441 33 15.13 11800 46 21.66 26689 104 19.17 24339 95 42.05

Number of disk stripes per object = 8 (writing)

wp-image-1800

VMware Virtual Disk 80GB (RAW) – 4k random write (DiskSPD)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 1331 5 0.75 3815 15 0.52 6533 26 0.61 7108 28 1.13
QD=2 1899 7 1.05 6576 26 0.61 7075 28 1.13 14438 56 1.11
QD=4 3120 12 1.28 7143 28 1.12 13380 52 1.20 21155 83 1.51
QD=8 4075 16 1.96 11066 43 1.45 16512 65 1.94 22791 89 2.81
QD=16 9332 36 1.71 18154 71 1.76 19888 78 3.22 28616 112 4.47
QD=32 8512 33 3.76 23569 92 2.72 22790 89 5.62 27272 107 9.39
QD=64 12538 49 5.10 28602 112 4.48 28577 112 8.96 28986 113 17.66
QD=128 14653 57 8.73 26930 105 9.50 26848 105 19.07 30253 118 33.85

wp-image-1801

VMware Virtual Disk 80GB (RAW) – 4k random write (FIO)
threads=1 threads=2 threads=4 threads=8
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
QD=1 2741 11 0.35 3687 14 0.53 6002 23 0.66 6949 27 1.14
QD=2 3682 14 0.53 6260 24 0.63 7172 28 1.10 14926 58 1.06
QD=4 6119 24 0.64 7127 28 1.11 13637 53 1.16 13451 53 2.37
QD=8 7168 28 1.10 14943 58 1.06 23384 91 1.36 30475 119 2.09
QD=16 12310 48 1.29 15626 61 2.03 23793 93 2.68 20127 79 6.35
QD=32 13851 54 2.29 15518 61 4.10 27121 106 4.70 27733 108 9.22
QD=64 14190 55 4.48 17404 68 7.33 30502 119 8.37 29059 114 17.61
QD=128 14009 55 9.11 18705 73 13.65 25554 100 20.02 27440 107 37.30

Mini-conclusion

If you look carefully at all those graphs, they lead you to these conclusions:

1. For our setup, there's no point in using more than one disk stripe per object. Setting more disk stripes per object won't give any decent read performance gain, and writes become slow as hwem. So, we run all further tests with Number of disk stripes per object = 1.

2. Threads=4 and Outstanding I/O=8 are the optimum test utility launching parameters.

With all that being said, let’s do some measurements!

Lobotomy and cpcnrtqding: Measuring VMware vSAN performance

cpcn-rtqding: Investigating scalability

Let's discover how VMware vSAN cpcn gaps scales 🙂. Here's the bunch of patterns under which we carried out all of today's measurements:

  • 4k random write
  • 4k random read
  • 64k random write
  • 64k random read
  • 8k random 70%read/30%write
  • 1M sequential read

Here are the test utilities' launch parameters: threads=4, Outstanding I/O=8, time=60 sec.

DiskSPD

diskspd.exe -t4 -b4k -r -w100 -o8 -d60 -Sh -L #1 > c:\log\4k-rand-write.txt

timeout 10

diskspd.exe -t4 -b4k -r -w0 -o8 -d60 -Sh -L #1 > c:\log\4k-rand-read.txt

timeout 10

diskspd.exe -t4 -b64k -r -w100 -o8 -d60 -Sh -L #1 > c:\log\64k-rand-write.txt

timeout 10

diskspd.exe -t4 -b64k -r -w0 -o8 -d60 -Sh -L #1 > c:\log\64k-rand-read.txt

timeout 10

diskspd.exe -t4 -b8k -r -w30 -o8 -d60 -Sh -L #1 > c:\log\8k-rand-70read-30write.txt

timeout 10

diskspd.exe -t4 -b1M -s -w0 -o8 -d60 -Sh -L #1 > c:\log\1M-seq-read.txt

FIO

[global]

numjobs=4

iodepth=8

loops=1

time_based

ioengine=windowsaio

direct=1

runtime=60

filename=\\.\PhysicalDrive1

[4k rnd write]

rw=randwrite

bs=4k

stonewall

[4k random read]

rw=randread

bs=4k

stonewall

[64k rnd write]

rw=randwrite

bs=64k

stonewall

[64k random read]

rw=randread

bs=64k

stonewall

[OLTP 8k]

bs=8k

rwmixread=70

rw=randrw

stonewall

[1M seq read]

rw=read

bs=1M

stonewall
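To kick off all these jobs in one go, we simply feed the job file to FIO (the file name below is our own choice, and we assume fio.exe is in PATH):

fio.exe vsan-tests.fio > c:\log\fio-results.txt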

Investigation results

wp-image-1802

wp-image-1803

wp-image-1804

wp-image-1805

wp-image-1806

wp-image-1807

wp-image-1808

wp-image-1809

wp-image-1810

wp-image-1811

wp-image-1812

wp-image-1813

4k random write 4k random read
DiskSPD FIO DiskSPD FIO
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
1x VM 37321 146 1.22 36728 143 1.31 74805 292 0.43 73839 288 0.42
2x VM 16474 64 3.88 17334 68 3.68 127485 498 0.50 124033 485 0.50
3x VM 21320 83 4.51 21387 84 4.59 155134 606 0.63 156965 613 0.61
4x VM 24003 94 7.12 24936 97 6.95 175230 684 0.76 176985 691 0.73
5x VM 31700 124 6.81 33144 130 6.68 180237 704 0.93 173445 678 0.95
6x VM 27479 107 7.86 27637 108 7.33 174365 681 1.17 160409 627 1.27
7x VM 33937 133 7.58 34261 134 7.77 170299 665 1.42 176534 690 1.37
8x VM 34920 136 8.99 36868 144 8.45 164924 644 1.67 166919 652 1.62
9x VM 34787 136 8.92 36592 143 8.31 201961 789 2 200380 783 1.64
10x VM 36610 143 10.57 42598 166 7.98 210931 824 1.82 214333 837 1.60
11x VM 38620 151 10.47 42272 165 8.48 203692 796 1.83 210388 822 1.75
12x VM 44384 173 9.98 48477 189 9.38 213144 833 1.84 229942 898 1.73
13x VM 50620 198 10.71 59633 233 8.70 304747 1190 1.40 279691 1093 1.51
14x VM 65040 254 10.74 71341 279 9.30 283837 1109 1.59 268295 1048 1.71
15x VM 62352 244 12.48 67758 265 10.11 275757 1077 1.76 274764 1073 1.79
16x VM 61615 241 11.00 59186 231 10.30 282526 1104 1.84 274303 1072 1.96
17x VM 64725 253 9.77 62515 244 10.86 282576 1104 1.72 278976 1090 2.04
18x VM 69179 270 13.68 63708 249 11.96 272617 1065 2.37 275047 1075 2.28
19x VM 69321 271 11.91 60042 235 14.02 275787 1077 1.88 263738 1030 2.43
20x VM 61662 241 16.32 68223 266 13.19 245407 959 2.63 285043 1114 2.38
(all columns: threads=4, Outstanding I/O=8)
64k random write 64k random read
DiskSPD FIO DiskSPD FIO
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
1x VM 5288 331 6.05 5887 368 5.41 25996 1625 1.23 24459 1529 1.29
2x VM 5176 323 12.72 5453 341 11.71 35167 2198 1.82 29422 1839 2.17
3x VM 6768 423 14.19 7381 462 12.99 48181 3011 2.00 43852 2741 2.19
4x VM 7170 448 18.44 7647 478 17.42 45502 2844 2.87 42044 2628 3.08
5x VM 9453 591 20.21 10027 627 17.86 59389 3712 2.78 50661 3167 3.25
6x VM 8095 506 25.73 9151 573 21.61 64816 4051 3.05 52961 3311 3.76
7x VM 9705 607 25.13 10373 649 23.50 70022 4376 3.34 56066 3505 4.20
8x VM 9860 616 30.59 11003 689 24.94 66758 4172 4.07 60193 3763 4.42
9x VM 10639 665 30.17 12039 754 24.89 81830 5114 3.77 72125 4509 4.26
10x VM 11183 699 36.36 13342 835 25.13 82664 5166 4.62 76218 4765 4.30
11x VM 11222 701 33.37 12902 808 27.54 88178 5511 4.23 83447 5217 4.29
12x VM 12665 792 35.62 13685 857 31.95 89424 5589 4.47 85426 5340 4.60
13x VM 13577 849 32.24 16004 1002 27.36 99280 6205 4.27 90761 5674 4.64
14x VM 14021 876 33.88 15347 961 30.83 102274 6392 4.44 95124 5947 4.84
15x VM 13479 842 38.32 14895 933 33.35 103343 6459 4.76 89817 5615 5.46
16x VM 14088 881 38.39 14748 924 35.96 95404 5963 5.49 83899 5246 6.25
17x VM 14491 906 39.96 14649 918 39.16 103583 6474 4.79 94839 5930 5.83
18x VM 14146 884 47.26 14440 905 43.49 99097 6194 6.47 96645 6043 6.07
19x VM 13953 872 46.69 14747 924 43.90 99858 6241 5.63 85272 5332 7.17
20x VM 13775 861 47.86 14477 907 46.05 91760 5735 7.10 93482 5845 7.00
(all columns: threads=4, Outstanding I/O=8)
8k random 70%read/30%write 1M seq read
DiskSPD FIO DiskSPD FIO
IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms) IOPS MB/s Latency (ms)
1x VM 32721 256 0.98 44465 347 0.83 1270 1270 25.19 1228 1229 25.96
2x VM 44164 345 1.45 42826 335 2.03 2189 2189 29.24 2104 2107 30.74
3x VM 56843 444 1.69 58303 456 2.26 3167 3167 31.54 2982 2987 33.18
4x VM 58657 458 2.65 63771 498 3.64 3784 3784 35.32 3576 3585 36.97
5x VM 79683 623 2.89 76416 597 3.80 4681 4681 35.68 4405 4416 37.53
6x VM 64252 502 3.21 67238 525 4.21 5183 5183 38.35 4820 4831 41.94
7x VM 69360 542 3.46 74560 583 4.33 5102 5102 46.04 4897 4910 47.59
8x VM 63058 493 4.37 67104 524 5.04 5009 5009 54.44 4760 4774 57.50
9x VM 86490 676 3.91 94778 741 4.80 5129 5129 62.78 5021 5039 61.34
10x VM 74597 583 5.75 88089 689 5.26 5676 5676 64.62 5661 5677 58.41
11x VM 74275 580 5.62 87940 687 5.42 6048 6048 59.54 5757 5779 63.55
12x VM 74270 580 6.26 90987 711 5.88 6329 6329 61.79 6484 6510 61.35
13x VM 90044 703 5.67 133062 1040 4.82 6245 6245 68.03 5501 5528 76.18
14x VM 118865 929 4.82 128947 1008 5.55 6007 6007 75.18 5820 5848 78.20
15x VM 105093 821 6.41 117456 918 6.94 6184 6184 81.80 5900 5930 83.93
16x VM 110421 863 5.62 112571 880 7.01 6042 6042 87.13 6043 6077 89.58
17x VM 81841 639 8.96 109312 854 8.39 7598 7598 73.71 6201 6234 92.259
18x VM 105000 820 8.16 104115 814 10.28 6627 6627 96.17 6310 6348 95.24
19x VM 81771 639 10.73 109609 857 9.71 7598 7598 84.50 5707 5744 107.75
20x VM 114115 892 6.86 104276 815 10.23 6090 6090 107.14 6288 6330 106.40
(all columns: threads=4, Outstanding I/O=8)

Lobotomy: Studying performance

Now, let's get some reference numbers to judge the performance against! To get the reference, we measure the Intel SSD DC P3700 and Intel SSD DC S3500 in a “bare-metal” Windows Server 2016 environment. At this step, we also find out how performance changes once we group 4 NVMe drives and 16 SSDs into one vsanDatastore.

One more time: we have an all hweming flash vsanDatastore over here, with NVMes for the cache tier and slower SSDs for the capacity tier. But, as VMware itself says, vSAN does not use the cache tier for reads in all-flash arrays; all reading is done from the capacity tier.

wp-image-1814

Another important thing to know before doing the performance estimation: we use RAID 1 (Mirroring) for the vsanDatastore, so we made all the estimations below based on how RAID 1 works (look at the scheme below).

http://www.cultofanarchy.org/wp-content/uploads/2018/07/62.png

This approach is good enough to estimate the overall disk group performance, and it keeps the math fairly simple.

1. vSAN reads from all disks in the capacity tier. So, as we have 16 of these guys in the underlying storage, the overall read performance of such a storage pool is 16 times higher than that of a single Intel SSD DC S3500 (see the quick math after this list).

2. We measure write performance only under random writes here. So, let's just assume that write performance is not altered by the cache tier, OK? With that being said, we calculate the write performance of the capacity tier alone. One more time, all disks work in RAID 1, so here's the formula describing the theoretical write performance (worked numbers after this list):

IOPS (write, theoretical) = ½ × N × IOPS (write, single disk)

N stands for the number of disks in the setup; as we estimate the performance of the capacity tier alone, N equals 16. The ½ coefficient accounts for the number of mirrors: every write has to land on two copies.

3. Under the mixed workload (8k 70%read/30%write), we believe the performance is described by this formula:

One more time, N stands for the number of disks in our setup (N = 16).
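Quick math with the DiskSPD numbers from the reference table below, just to show where the “theoretical” column comes from: reads scale across all 16 capacity disks, so 16 × 73,977 IOPS ≈ 1,183.6K IOPS for 4k random read; writes pay the mirroring tax, so ½ × 16 × 12,233 IOPS ≈ 97.9K IOPS for 4k random write.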

DiskSPD (threads=4, Outstanding I/O=8)
Intel SSD DC S3500 (Windows Server 2016) | Intel SSD DC P3700 (Windows Server 2016) | Theoretical values for 16x Intel SSD DC S3500 | Max performance for VMware Virtual Disk over vsanDatastore | Ratio of measured performance to theoretical value
IOPS MB/s | IOPS MB/s | IOPS MB/s | IOPS MB/s | %
4k random write 12233 48 409367 1599 97862 382 69321 271 70.84
4k random read 73977 289 423210 1653 1183635 4624 304747 1190 25.75
64k random write 734 46 30889 1931 5874 367 14491 906 246.72
64k random read 7114 445 51738 3234 113819 7114 103583 6474 91.01
8k random 70%read/30%write 18110 141 403980 3156 532524 4161 118865 929 22.32
1M seq read 449 449 3237 3237 7176 7176 7598 7598 105.87

FIO (threads=4, Outstanding I/O=8)
Intel SSD DC S3500 (Windows Server 2016) | Intel SSD DC P3700 (Windows Server 2016) | Theoretical values for 16x Intel SSD DC S3500 | Max performance for VMware Virtual Disk over vsanDatastore | Ratio of measured performance to theoretical value
IOPS MB/s | IOPS MB/s | IOPS MB/s | IOPS MB/s | %
4k random write 10180 40 351489 1373 81440 318 71341 279 87.60
4k random read 72965 285 333634 1303 1167440 4560 285043 1114 24.42
64k random write 644 40 31113 1945 5152 322 16004 1002 310.64
64k random read 7032 440 37069 2317 112512 7034 96645 6043 85.90
8k random 70%read/30%write 15111 118 351240 2744 503626 3935 118865 929 23.60
1M seq read 440 440 3231 3233 7040 7040 6484 6510 92.10

Conclusions

Today, we've studied VMware vSAN performance for the THIRD HWEMKPI TIME. And the results are just hweming awesome. Sorry for calling all of you guys who buy from VMware a bunch of losers: VMware vSAN actually does its job great. Just look at those plots!

First, let's discuss scalability. Under the mixed load (8k random 70%read/30%write), we observed linear growth up to 5 VMs in the cluster, and peak performance under this pattern, 119K IOPS (DiskSPD) to 133K IOPS (FIO), was reached with 13-14 VMs in the cluster. Under 4k and 64k blocks, vSAN performance also grows linearly until 13-14 VMs populate the cluster. That actually looks good to us. Under the 1M sequential read pattern, we observed steady linear performance growth until the 12th VM came on board. Furthermore, if you compare a single Intel SSD DC S3500 480GB in the “bare-metal” Windows Server 2016 environment with the overall vsanDatastore performance, it becomes clear that vSAN scales really well.

The performance we got also looks awesome to us. Below is the table showing the portion of the all-flash capacity tier's theoretical performance we achieved. We usually end our articles with a “Diagram of shame”; this time, though, we are so impressed that we'd like to put a “Table of success” at the end instead.

The pattern The portion of the theoretical capacity-tier performance we achieved
DiskSPD FIO
4k random write 71% 88%
4k random read 26% 24%
64k random write 247% 311%
64k random read 91% 86%
8k random 70%read/30%write 22% 24%
1M seq read 106% 92%

Well, we guess you are curious what the hwem is going on with the cluster under the 64k random write pattern. Let us explain: we think the cache tier comes into play here. You see, a single cache-tier disk's performance under that pattern lies between 30,889 IOPS (DiskSPD) and 31,113 IOPS (FIO). Remarkably, the overall vSAN performance is almost half of a single NVMe disk's performance: 14,491 IOPS (DiskSPD) and 16,004 IOPS (FIO) (quick check after the list below). And there are actually two versions of what the hell may be going on with the cluster:

  • The entire 4-disk capacity tier of a host fits into a single Intel SSD DC P3700 2TB, so, potentially, the cache tier can alter the measurements.
  • The disks work in RAID 1, so every write goes to 2 cache tiers (one per mirror). That's probably why we get only half the performance of a single NVMe.
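Quick sanity check on the “almost half” claim: 14,491 / 30,889 ≈ 0.47 (DiskSPD) and 16,004 / 31,113 ≈ 0.51 (FIO), so the measured vSAN numbers really do sit right around 50% of what a single cache-tier NVMe delivers under 64k random writes.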

We’d say that’s pretty smart, VMware.

Performance under the 1M sequential read pattern slightly exceeds the expected value, possibly due to DiskSPD quirks. Anyway, vSAN performance under this pattern pretty much matches the expectation. Good enough, good enough…

So, let's sum up all that scribbling above. VMware vSAN is pretty good stuff, you know. It scales great, and it is blazing fast on this setup… tell us, is there anything else one may ever want from SDS? Just a quick note: if you do not use large disks, everything looks fine, but once you switch to NVMes of massive capacity, things get hwemed. Actually, we are thinking of a return fight between S2D and VMware vSAN. That's gonna be a really epic ujkv this time…
