Nutanix CE (Community Edition) four-node all-flash hyperconverged cluster.
Install a Nutanix CE hyperconverged cluster from scratch, tinker with it, fine-tune it, and kick the shit out of it, squeezing out all the performance it can give away. For free! 🙂
Nutanix CE (Community Edition) is a clumsy combination of a Nutanix fork of the free, open-source, Linux-based hypervisor called KVM (Kernel-based Virtual Machine) with a Nutanix-written proprietary SDS (Software-Defined Storage) stack called NDFS (Nutanix Distributed File System), which has deep roots in the Cassandra NoSQL database. Plus, some ugly (good looking?) web-based management UI (User Interface), for some unknown reason called “Prism”, put as a cherry on top of the cake to provide a single … Pain in the ass?! Crossed out! Pane of glass! 🙂 … management experience. All this bloody mix allows an end user to build a hyperconverged environment basically free of charge. “Basically” because he or she still needs to find the actual server hardware on the street to make the whole thing 100% free. The solution can be deployed either bare metal for hyperconverged pseudo-production, “pseudo” because Nutanix enforces strict non-commercial use with their EULA (End-User License Agreement), or, alternatively, it can be spawned as a set of guest VMs (Virtual Machines) to play with the same hyperconvergence absolutely on the cheap. It’s worth mentioning that any trial with commercial use in mind isn’t supported or allowed by Nutanix either, enforced (again!) with a strict “NO” in their EULA. The same goes for performance testing and (especially!) publishing and sharing test numbers, something we plan to do right now 🙂
2.1 Limitations on Use. You must not use the Software or Documentation except as permitted by this Agreement. You must not:
[ … ]
disclose the results of testing, benchmarking or other performance or evaluation information related to the Software or the product to any third party without the prior written consent of Nutanix;
Well 🙂 Sharing is caring, so the fuck we give about these Nutanix farts … Crossed out! Facts! 🙂 … magically disappears, and we encourage you to push the pedal to the metal. Either way, whatever Nutanix sees as their CE use case, it ensures virtual machine HA (High Availability) and claims to deliver some reasonably acceptable performance as well. The truly wonderful thing about Nutanix CE is that it enables an end user to play with all the functionality the paid Nutanix platform has, for free, including features like data deduplication, distributed erasure coding, compression, snapshots, shadow cloning, and all that jazz. The features available with CE are exactly the same as the ones provided to paying customers, differing only in support conditions (community supported), scalability (maximum four nodes within one hyperconverged cluster), and… something more we’ll talk about just a little bit later 🙂 Given these benefits, and being “free as in beer”, Nutanix CE is considered kind of a “starter pack” for any other Nutanix solution.
Methodology, Considerations & Milestones:
During this particular investigation, we intend to test the four-node Nutanix cluster performance and scalability. You should know that already 🙂 What you probably don’t know is… we absolutely expect all of the numbers, configuration sequences, and other test results to be 100% reproducible on the same or very similar hardware platforms. We want to play scientists, don’t we? 🙂 We’ve decided to stick with just one very basic and very critical I/O pattern: 4KB 100% random reads. We give you our word that in further tests we’ll add bigger blocks, writes, and some reads and writes in the mix, but we need to start with something, right? Just to sweeten the pill a little bit, we varied the I/O queue depth from 1 to 128 during all of our performance tests. To make things worse, and to sacrifice some overall simplicity, we also decided we want Windows Server 2016 as a guest, only because it’s the most popular production environment so far. Just because! We could do Linux or FreeBSD or some other real operating system instead of Windows, but we decided to use Microsoft Xbox OS only because that’s what Nutanix clusters run most of the time, if they aren’t travelling inside a UPS truck back home after the customer applied for an RMA. We promise to add Linux if we get a chance to re-run our tests. So far so good! Here’s what we intend to do, step by step…
1) Measure Samsung 960 EVO M.2 500GB NVMe “raw” performance in a bare metal Windows Server 2016 environment. That’s our ground zero! OK, we think you’re owed an explanation here. Do not do that in real life, we mean production! These Samsung SSDs are consumer-grade, so while they have plenty of IOPS to deliver, they have no power loss protection, which effectively means it’s simply not safe to use them in combination with pretty much any modern Software-Defined Storage stack. OK, at least Microsoft Storage Spaces Direct (S2D) and VMware Virtual SAN (vSAN) are both out! Microsoft has a good, simplified read on the topic published on their technical blog, telling young Padawans why they should drop consumer-grade flash like it’s hot.
VMware has no complete story wrapped up in any single place, but they maintain a nice hardware compatibility list (HCL), the one Microsoft, for example, lacks, which is one of the primary reasons why Storage Spaces Direct fails (Fail? Is it “it” or “they”? Fuck their super-complicated naming!) miserably so far, regardless of what Microsoft tells both the Marines and ordinary people. BTW, the second reason is obvious: S2D itself is so broken Microsoft had to pull it from the current Windows Server builds so as not to piss off their customers completely, but that’s another story to tell, and we definitely will at some point 🙂 VMware lists all compatible hardware, including the SSDs you should be using, and you can safely assume anything NOT on the HCL should NOT be used in production. Pretty smart, eh?
Just in case the link given above isn’t working for whatever weird reason, it should be trivial to figure everything out using a generic search inside the VMware HCL catalog.
Real hardcore guys with big brass balls, who might be interested in a truly unbiased opinion and want to start right from the bedrock, may wish to spend some more time on scientific reading, where HP Labs sponsored the University of Ohio to investigate: What the fuck happens when SSDs lose external power?
TL;DR: We did it, you shouldn’t!
2) Test just a single lonely Windows Server 2016 VM running on a four-node Nutanix CE cluster. For research purposes, we’ll play with the vCPU/vCORE ratio to ensure optimal virtual machine properties are achieved.
3) Study how the number of running VMs assigned to a single node impacts the overall Nutanix CE cluster I/O performance. We’ll increase the number of running VMs “one by one” on one node of the cluster to see how this impacts system efficiency and overall combined I/O performance. The optimal number of VMs estimated at this stage will be used in further tests.
4) Clone the previously determined optimal number of VMs to the other nodes. We’ll keep the Node Affinity rule for each VM on each node, just to be on the safe side, to ensure the now-combined cluster performance tops out.
5) Test the performance of a single VM (Node Affinity rule enabled) running in a one-node Nutanix CE cluster. This is done just to compare the results with the ones obtained in our four-node cluster test.
OK, let’s move on with some real work 🙂
Hardware and corresponding software configurations deployed within this study.
Below, we describe the first very basic setup used for the purpose of testing Samsung 960 EVO M.2 500GB NVMe “raw” performance in Windows Server 2016 environment.
Node: Dell R730 chassis, 2 * Intel Xeon E5-2683 V3 CPUs @ 2.00 GHz, 128GB of RAM. Yeah, we know we could do much better than that, but it’s still rather challenging to throw in a couple of TBs of RAM without breaking the bank, so we have what we got, or we got what we have 🙂
Storage: Samsung 960 EVO M.2 500GB NVMe drive
OS: Windows Server 2016 x64 Datacenter
For testing our four-node cluster Nutanix CE v.2017.07.20 I/O performance we’ll use these guys…
4 * Node: Dell R730 chassis, 2 * Intel Xeon E5-2683 V3 CPUs @ 2.00 GHz, 128GB of RAM
Storage: 2 * Samsung 960 EVO M.2 500GB NVMe drives
Networking: 1 * Mellanox ConnectX-4 100 Gb NIC, flashed into Converged Ethernet mode, no InfiniBand. We’ll be using just a fraction of those 100, please see some clarifying comments below… + a Mellanox 40 GbE switch for the whole cluster. Yes, we’re perfectly aware of the fact we’re not in Kansas anymore, and 40 isn’t really 100! We’ll be slowing down our 100 GbE NICs to “miserable” 40 GbE performance, but we shift the gears up still because… we aren’t going to saturate the network with just the couple of NVMe drives we have per server, and… we ain’t got 100 GbE switches right away 🙁 TL;DR: One day we’ll grab a 100 Gb switch or two, and a couple of enterprise-grade NVMe drives, to re-run all these tests once again. For now, we’ll just stick with what we have for you so far.
OS/Hypervisor: Nutanix CE v.2017.07.20. The Nutanix Community Edition (CE) image is available for download after signing up on the official company site. Just a friendly piece of advice: Don’t use your corporate e-mail or anything of value, generate some throwaway junk one, or Nutanix dumbass sales engineers will become your worst nightmare!
This is what the network interconnect diagram should look like…
OK, you see NO redundant network paths configured, and our single 40 GbE switch is being a very nice and shiny single point of failure. SPOF. WTF? We aren’t stupid, we just don’t do production, we only want to see how fast we can go! Production? Get everything as fucking redundant as you can!
Hold on for a second! Let’s get into some string theory here. It’s obvious what AHV is, it’s basically a server OS where we install Necropolis … Crossed out! We mean (A)cropolis (H)yper(V)isor, which is a KVM fork done by Nutanix “engineers”. But what the fuck is “CVM”?!? Well, CVM stands for “Controller Virtual Machine”. See, back when Nutanix was running their Software-Defined Storage stack on top of VMware vSphere or Microsoft Hyper-V, they couldn’t easily port their software to the ESXi or Hyper-V (basically, Windows) kernels, so they decided to run everything inside a guest virtual machine. Something that was working for LeftHand Networks (part of HPE now) back in 2005 started working for Nutanix just 10 years later! Wow! Running the SDS stack in the kernel, like VMware vSAN, StarWind vSAN, and Storage Spaces Direct (S2D) do? Versus running it inside a guest VM, like HPE and the now-EOL VMware VSA did, and Nutanix does now? That’s a big question we’ll try to answer over the next couple of months, but for now, we do Nutanix CE, which is, like we already told you, nothing but Red Hat’s KVM with some pretty minor Nutanix add-ons inside. Why the hell did they put a Linux VM inside a Linux hypervisor?!? They could have skipped the VM and installed everything natively! We don’t have an answer. Over some drinking sessions we decided Community Edition (CE) uses a controller VM instead of a native implementation for installation simplicity, which opens another big can of worms: Why the fuck does Nutanix use exactly the same approach for their “ready nodes”, the preconfigured hardware they sell to the poor little souls becoming their customers?!? No. Fucking. Idea.
Here are some links for after the class reading about what CVM is and how it operates.
Some good writing on the subject belongs to Josh-“the-Shill”-Odgers, a.k.a. the “Nutanix is the answer!!! What was the question?!?” guy. You really should develop some radical reading-between-the-lines skills, because the maybe 10% of quite valuable Nutanix-related information is deeply buried in the remaining 90% of marketing dead badger shit.
You might want to give it a try and find an answer inside what’s called the “Nutanix Bible”. We swear we looked hard, but we found nothing.
Utilities used for testing, DiskSPD & FIO.
For the test purposes, we used DiskSPD v.2.0.17 and FIO v.3.0. If you’re on the Dark Side of the Force, you should know FIO from your Linux/UNIX background; DiskSPD is just another perpendicular Microsoft attempt to make this world a better place: instead of contributing to Intel’s open-source Iometer and fixing some of its known issues, Microsoft wrote their own I/O performance testing tool called DiskSPD… Well, we’re not surprised 🙂
We measured I/O performance with the 4KB 100% random read load pattern with queue depth variation (QD). We used the following QD values for the study purpose: QD=1,2,4,6,8,10,12,14,16,32,64, and 128. Test duration: 360 seconds. Warmup time: 60 seconds.
DiskSPD launching parameters. Exact DiskSPD launching parameters deployed within this study:
diskspd.exe -t8 -b4K -r -w0 -o1 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o2 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o4 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o6 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o8 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o10 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o12 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o14 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o16 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o32 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o64 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o128 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
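Twelve nearly identical command lines beg for a loop. Here’s a tiny Python sketch (ours, not Microsoft’s) that generates the exact same DiskSPD invocations from the QD list, so nobody fat-fingers a flag:

```python
# Generate the twelve DiskSPD command lines used above instead of
# hand-maintaining them. The flags mirror our test plan: 8 threads,
# 4K blocks, 100% random reads, 360 s run, 60 s warmup, unbuffered I/O,
# latency stats, CPU affinity, target disk #1.
QUEUE_DEPTHS = [1, 2, 4, 6, 8, 10, 12, 14, 16, 32, 64, 128]

def diskspd_cmd(qd: int, target: str = "#1") -> str:
    """Build one DiskSPD invocation for the given queue depth."""
    return (f"diskspd.exe -t8 -b4K -r -w0 -o{qd} -d360 -W60 "
            f"-Su -L -a0,2,4,6,8,10,12,14 {target}")

commands = [diskspd_cmd(qd) for qd in QUEUE_DEPTHS]
print("\n".join(commands))
```

Dump the output into a batch file and let it grind while you grab a coffee.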
FIO launching parameters. Below, find FIO launching parameters used for this research:
[global]
numjobs=8
iodepth=1
loops=1
ioengine=windowsaio
cpus_allowed=0,2,4,6,8,10,12,14
direct=1
ramp_time=60
runtime=360
filename=\\.\PhysicalDrive1

[4k rnd read QD1]
iodepth=1
rw=randread
bs=4k
stonewall

[4k rnd read QD2]
iodepth=2
rw=randread
bs=4k
stonewall

[4k rnd read QD4]
iodepth=4
rw=randread
bs=4k
stonewall

[4k rnd read QD6]
iodepth=6
rw=randread
bs=4k
stonewall

[4k rnd read QD8]
iodepth=8
rw=randread
bs=4k
stonewall

[4k rnd read QD10]
iodepth=10
rw=randread
bs=4k
stonewall

[4k rnd read QD12]
iodepth=12
rw=randread
bs=4k
stonewall

[4k rnd read QD14]
iodepth=14
rw=randread
bs=4k
stonewall

[4k rnd read QD16]
iodepth=16
rw=randread
bs=4k
stonewall

[4k rnd read QD32]
iodepth=32
rw=randread
bs=4k
stonewall

[4k rnd read QD64]
iodepth=64
rw=randread
bs=4k
stonewall

[4k rnd read QD128]
iodepth=128
rw=randread
bs=4k
stonewall
Testing Samsung 960 EVO M.2 500GB NVMe drive performance in bare metal Windows Server 2016 environment.
During our main test, we use a storage pool comprised of multiple Samsung 960 EVO M.2 500GB NVMe drives as the underlying storage for the test VMs hosted and running on Nutanix. For now, we’ll measure the performance of a single NVMe drive in a Windows Server 2016 environment. Bare metal. No virtualization, to avoid any possible virtualization overhead interfering with our tests.
Tests were held on an unformatted disk with our “normal” 4KB 100% random reads I/O pattern. As mentioned, we used DiskSPD and FIO to check where the performance really is.
Testing Samsung 960 EVO M.2 500GB NVMe drive performance with four I/O threads.
The table and plot below highlight values obtained while testing Samsung 960 EVO M.2 500GB NVMe with four I/O threads.
Testing Samsung SSD 960 EVO M.2 NVMe (500GB) with eight I/O threads.
The table and the plot below address testing disk performance with eight I/O threads.
According to the tests and results provided above, the SSD drive put under test reached its nominal claimed performance (330,000 IOPS, if you care) during measurements with 4KB 100% random reads using eight I/O threads. The highest performance level was achieved at QD=10. We’ll treat these values as our reference. Therefore, we’ll keep using eight I/O threads and QD=10 in our further experiments.
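A quick back-of-the-envelope sanity check (our math, not Samsung’s spec sheet): with 8 threads at QD=10 there are 80 I/Os in flight, and Little’s Law says outstanding I/Os = IOPS × average latency. At the rated 330,000 IOPS that works out to roughly 242 µs per I/O at saturation; if your measured average latency lands in that ballpark, the numbers are self-consistent.

```python
# Little's Law sanity check: outstanding = IOPS * latency at steady state,
# so latency = outstanding / IOPS.
threads, qd_per_thread = 8, 10
iops = 330_000                         # nominal 4K random read rating

outstanding = threads * qd_per_thread  # 80 I/Os in flight
latency_us = outstanding / iops * 1e6
print(f"expected average latency at saturation: {latency_us:.0f} us")
```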
Imprisonment & First Term:
Testing a single VM performance in a four-node Nutanix CE cluster.
Below, we provide the steps for the four-node Nutanix CE cluster performance assessment:
- Install Nutanix CE on the servers.
- Create the Nutanix CE cluster. Please note that the number of nodes incorporated in the cluster depends on the cluster’s purpose. Here, we used a four-node setup.
- Consolidate all available disks in a single Storage Pool.
- Create a Storage Container that occupies the entire disk volume.
- Create a two-disk VM. Its first disk, 25GB Virtual Disk0, is intended for the operating system (boot), while the second one, maximum-size Virtual Disk1, is used for DiskSPD and FIO measurements. Note: The overall VM underlying storage volume should not exceed 90% of the Storage Container size. Otherwise, Nutanix decreases disk subsystem performance.
- To find the VM properties that grant the highest performance, we varied the vCPU/vCORE ratio. At this point, bear in mind Nutanix recommendations.
- Clone the optimally configured VM, keeping it pinned to its current node. Afterwards, carry out tests on all VMs simultaneously. Disk volume, in this case, is distributed among all cluster nodes. The maximum number of VMs on a single node depends on their overall performance.
- Next, estimate Nutanix cluster scalability. For this purpose, clone the estimated above number of VMs on other cluster nodes. Eventually, test the entire four-node Nutanix CE cluster performance.
Note: Virtual disks in Nutanix are thin provisioned by default. So, to get a reasonably accurate test, each disk should be filled with some random junk data to simulate a “normal” disk volume. This operation was performed on every newly created virtual disk, before every single test.
To fill the virtual disks with junk, the dd utility was used. Find the dd.exe parameters below:
dd.exe bs=1M if=/dev/random of=\\?\Device\Harddisk1\DR1 --progress
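If dd isn’t your thing, the same junk-fill can be sketched in a few lines of Python. The snippet below writes to a throwaway regular file (a hypothetical `junk.bin`, our example name) rather than the raw `\\?\Device\Harddisk1\DR1` path, so you can try it without nuking anything; point it at the raw device only when you actually mean it:

```python
import os

def fill_with_junk(path: str, size_mb: int, block_mb: int = 1) -> None:
    """Write random blocks until size_mb is reached, so a thin-provisioned
    disk gets fully allocated before benchmarking."""
    block = block_mb * 1024 * 1024
    with open(path, "wb") as f:
        for _ in range(size_mb // block_mb):
            f.write(os.urandom(block))
        f.flush()
        os.fsync(f.fileno())

# Throwaway example: 4MB of random junk into a regular file.
fill_with_junk("junk.bin", size_mb=4)
```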
While building a four-node Nutanix CE cluster, we deployed a VM (Virtual Disk0 – 25GB, Virtual Disk1 – 500GB, Node Affinity – Node1) and studied its performance using the following vCPU/vCORE combinations:
- 2vCPU/28vCORE: this template mimics testing the Samsung SSD 960 EVO M.2 NVMe (500GB) on a physical server with two Intel Xeon E5-2683 processors (2 NUMA nodes and 28 CPU cores overall).
Let’s go on and configure our Nutanix cluster.
OK, in order to install and boot Nutanix CE, we’ll need four flash drives larger than 16GB each.
We’ll use Rufus for image recording on a flash drive. You can use any other tool to format and create bootable USB flash drives, but we’ve decided on this one cause it’s free and it simply does the job.
Here it is:
WARNING: The host keeps using the flash drive for booting after Nutanix CE is installed. Under no circumstances, we repeat, under NO circumstances should you pull the flash drive out of the server after Nutanix CE installation, or the whole thing will blow up! Seriously, the drive contains the root partition of the AHV host file system, so don’t remove it for as long as you’re using Nutanix CE.
Let’s boot Nutanix CE from the flash drive and enter “install”.
Specify the network settings and click “Start”. Use the static IP addresses and keep AHV and CVM hosts in one network.
The rest of the installation takes approximately 5 minutes.
After Nutanix CE has been installed, we can see the following message in the command line:
OK, log in to any CVM.
AHV default account credentials are:
CVM default account credentials are:
Let’s check the network availability of all hosts and CVM (ping each IP address).
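Pinging a whole pile of AHV and CVM addresses by hand gets old fast, so here’s a small helper (our sketch; the IPs below are just the sample CVM addressing from this setup, substitute your own) that builds the right ping invocation per OS:

```python
import platform

# Sample CVM addressing from our setup; substitute your own AHV/CVM IPs.
HOSTS = ["10.0.0.30", "10.0.0.40", "10.0.0.50", "10.0.0.60"]

def ping_cmd(ip: str, count: int = 2) -> list:
    """Build a ping invocation for the current OS (Windows wants -n, Unix -c)."""
    flag = "-n" if platform.system() == "Windows" else "-c"
    return ["ping", flag, str(count), ip]

for ip in HOSTS:
    print(" ".join(ping_cmd(ip)))
```

Feed each list to subprocess.run() and check the return code: zero means the host answered.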
If everything is OK with the network, we can go on and create the cluster. Run the following command in CVM:
cluster --dns_servers=220.127.116.11 --ntp_servers=time.google.com --redundancy_factor=2 -s 10.0.0.30,10.0.0.40,10.0.0.50,10.0.0.60 --cluster_name=mynewcluster --cluster_external_ip=10.0.0.111 create
This will initiate the creation of the cluster and launch services on all servers. Thank all the heavenly forces, our cluster is created, and what we see is:
This means that the web interface is available at the cluster’s address. In our case, it is: https://10.0.0.111:9440/console/
The default Prism account credentials:
The system will ask you to change the password after the first authorization… you’d better do that unless you want to suddenly find out your Nutanix cluster is cheating on you with your best IT buddy.
Now that we’re safe, let’s log in.
Straight after this, you get redirected to the main menu. That’s what Paint, sorry, Prism looks like:
Check the storage pool by following Storage=>Table=>Storage pool.
The system adds all the available storage space to one pool. That’s just what we need.
Now, go to the Storage Container tab and create the storage container.
Once the storage container is created, press Update and uncheck the “Compression” box.
Then, create the Volume Group in the Volume Group tab. We’ll need it to mount disk groups to the virtual machine.
We further need to create two virtual disks in this group. One of them will host the guest OS, while the second disk will be used for running our tests.
Now, upload the image for guest OS installation.
Click Image Configuration:
So, let’s upload the images to the recently created storage container.
First, upload the Windows Server 2016 installation image:
Next, upload the image with the Nutanix VirtIO drivers. We’ll need it later to load the drivers during the Windows installation on the VM.
Now, go to VM=>Network config
Create the new network for virtual machines.
Next, create a VM in the Create VM tab.
Add the booting Windows image to CD-ROM in the Update Disk window:
And add the Nutanix VirtIO image to load the drivers later.
In the VM tab, press Launch Console. The browser window with the Console pops up.
As you can see, the virtual disks are not visible during Windows installation. Now, load drivers from the mounted VirtIO disk. Told ya we’ll need them later!
Click Load Driver, specify the path, and select them all.
Once the Windows installation is over, we can clone virtual machines. We’ll describe the process right here not to break our pretty-formatted testing part.
OK, we’ll have two stages here:
Cloning Volume Group:
And cloning the virtual machine. When the cloning is finished, press Update on the VM tab and assign the appropriate cloned Volume Group to each virtual machine.
And what we get is…
…4 identical VMs (just for example, that’s what we’re gonna do later) up and running, so everything is ready for further testing.
Testing one VM (2vCPU/28vCORE configuration) in a four-node Nutanix CE cluster
The following tables and plots highlight performance of one VM with two vCPUs and 28 vCOREs.
Testing one VM (2vCPU/1vCORE configuration) in a four-node Nutanix CE cluster.
The tables below highlight performance of one VM with two vCPUs and one vCORE.
Testing one VM (4vCPU/1vCORE configuration) in a four-node Nutanix CE cluster.
Measurements held on one VM (4vCPU/1vCORE configuration) are described below:
Testing one VM (6vCPU/1vCORE configuration) in a four-node Nutanix CE cluster.
Tables and graphs below highlight one VM (6vCPU/1vCORE configuration) performance:
Testing one VM (8vCPU/1vCORE configuration) in a four-node Nutanix CE cluster.
Below, we provide measurements held on one VM with eight vCPUs and one vCORE:
Mini-Conclusion: What do the experiments say? According to the data derived from the tests, 4vCPU/1vCORE is the optimal vCPU/vCORE ratio for one VM running in a four-node Nutanix CE cluster. Optimal IOPS and latency values support this claim. Given that, we used the 4vCPU/1vCORE VM configuration for all the further tests described below.
Imprisonment & Second Term:
Testing several VMs.
In order to check the combined performance of several VMs, we varied the number of VMs assigned to a particular node (Node1, in our case). We kept measuring until the performance growth stopped. For the investigation purpose, we used the following cluster configurations:
- 1xVM (4vCPU/1vCORE/Virtual Disk1=860GB).
- 2xVM (4vCPU/1vCORE/Virtual Disk1=415GB).
- 3xVM (4vCPU/1vCORE/Virtual Disk1=270GB).
- 4xVM (4vCPU/1vCORE/Virtual Disk1=195GB).
- 5xVM (4vCPU/1vCORE/Virtual Disk1=152GB).
- 6xVM (4vCPU/1vCORE/Virtual Disk1=122GB).
- 7xVM (4vCPU/1vCORE/Virtual Disk1=100GB).
- 8xVM (4vCPU/1vCORE/Virtual Disk1=85GB).
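The Virtual Disk1 sizes above aren’t arbitrary: they look like the usable container space (roughly 885GB by our back-of-the-envelope estimate, i.e. the 90% threshold mentioned earlier) minus the 25GB boot disks, split evenly among the VMs and rounded down. The numbers match to within a couple of GB; treat the 885GB figure as our assumption, not an official value.

```python
USABLE_GB = 885  # our estimate of ~90% of the Storage Container (assumption!)
BOOT_GB = 25     # Virtual Disk0 per VM

def data_disk_gb(n_vms: int) -> int:
    """Approximate per-VM Virtual Disk1 size after carving out boot disks."""
    return int((USABLE_GB - BOOT_GB * n_vms) / n_vms)

listed = {1: 860, 2: 415, 3: 270, 4: 195, 5: 152, 6: 122, 7: 100, 8: 85}
for n, gb in listed.items():
    print(f"{n} VMs: computed {data_disk_gb(n)}GB vs listed {gb}GB")
```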
Testing one VM (4vCPU/1vCORE/Virtual Disk1=860GB configuration) in a four-node Nutanix CE cluster. We set the Node Affinity rule properly to make sure the VM sticks to Node1.
Tables and plots below address data obtained while measuring one VM performance run in the cluster:
Testing two VMs (4vCPU/1vCORE/Virtual Disk1=415GB configuration) in a four-node Nutanix CE cluster. All VMs are assigned to Node1.
The following tables and plots depict two VMs performance run in the studied setup:
Testing three VMs (4vCPU/1vCORE/Virtual Disk1=270GB configuration) in a four-node Nutanix CE cluster. All VMs are assigned to Node1.
In this section, we provide the performance measurements for three VMs:
Testing four VMs (4vCPU/1vCORE/Virtual Disk1=195GB configuration) in a four-node Nutanix CE cluster. All VMs are assigned to Node1.
The following tables and plots shed light on testing four VMs (two tables to keep formatting happy):
…and VM3 and VM4 numbers are continued in the next table.
…and VM3 and VM4 continued from the previous table to keep the formatting tidy 🙂
Mini-Conclusion: Given the remarkable performance drop during the four-VM test, there was no point in continuing to increase their number. In the following tests, we cloned three 4vCPU/1vCORE VMs to the other nodes with the Node Affinity rule set. This way, we ultimately get a four-node Nutanix CE cluster with three VMs on each node.
Imprisonment & Third Term:
This part of our research addresses the four-node Nutanix CE (12xVM/4vCPU/1vCORE/Virtual Disk1=50GB/Node Affinity 3xVM/Node) cluster scalability. It should be noted that the tested virtual disk volume had been reduced to 50GB, since the overall Storage Container volume has a constant capacity that was now distributed among 12 VMs. We expected the described configuration to exhibit four times higher performance than the benchmarks obtained for a single node.
We measured the overall performance of 12 VMs running in a four-node cluster (three VMs per node). Node Affinity rule was enabled for all nodes. Each VM had the following configuration: 4vCPU/1vCORE/Virtual Disk1=50GB. Tests’ results are provided below:
Mini-Conclusion: How does the environment scale? According to the graphs above, a four-node Nutanix CE setup with 12 VMs (Node Affinity, 3xVM/Node) does not provide the expected performance scalability. The real overall cluster performance is lower than the sum of the four individual nodes’ values.
Not Proven & Third Verdict:
Testing the performance of one VM running in a one-node Nutanix CE cluster.
In this section, we examined one VM (Node Affinity enabled) running in one-node Nutanix CE cluster to gain insight into the overall system performance. The benchmarks derived from the one-node setup were compared with the performance of the four-node one.
Testing one VM (4vCPU/1vCORE/VirtualDisk1=500GB configuration) run in a one-node Nutanix CE cluster
Below, we provide the results obtained while running one VM (4vCPU/1vCORE/VirtualDisk1=500GB configuration) in the Nutanix CE cluster.
Mini-Conclusion: According to the data obtained while testing a VM in a one-node Nutanix CE cluster, the single-node setup performance (~30,000 IOPS) is strikingly higher than that of the similar configuration (4vCPU/1vCORE/Virtual Disk1=860GB) deployed in the four-node setup (~8,300 IOPS) when all of the VMs are sending I/O to the back-end storage.
The reality is that Nutanix uses what’s called “data locality”; in other words, all data associated with one particular VM resides on a single node, while replicated copies are scattered among the other cluster nodes. This approach is, again, nothing new: back in the mid-2000s, LeftHand Networks, later acquired by Hewlett-Packard, was doing what’s called “wide striping” or “Network RAID”. What’s interesting, however, is that pretty few companies maintain full local copies; most of the competitors prefer to chop the data into chunks and never keep a single object as a whole.
Some more reading, both interesting and frustrating, from Josh-“the-Shill”-Odgers’ blog about how “data locality” really works with Nutanix.
Here’re some thoughts on why VMware isn’t doing “data locality” or what they really mean when they say VMware VSAN “takes advantage of data locality”.
That’s what VMware engineering staff thinks on the subject.
Last but not least, probably the only other company besides Nutanix that keeps local copies intact.
Well, both approaches have their own drawbacks and benefits, and while keeping all I/O local sounds very exciting on paper, it becomes a real slowdown factor on anything using many nodes in combination with super-fast, low-latency RDMA networking. At this point, we’d pull over, as we prefer to compare numbers and not somebody’s opinions: Numbers never lie, people always do!
Sentence & Death Row:
This research gave us sufficient information about Nutanix CE cluster scalability and performance. For this study, we built the four-node cluster and played with the number of VMs on each node and various other settings. We were unable to assign as many VMs as we originally intended due to the severe performance drop Nutanix exhibits when running four or more VMs on a single cluster node. Given that, we cloned only three VMs to each of the four cluster nodes. Furthermore, the environment’s performance did not scale as we expected; again, the combined overall cluster performance is lower than the sum of all four nodes run separately! To make things worse, we could only get around 25,000 IOPS from a cluster theoretically capable of a massive 2,640,000 IOPS. That’s below 10% of the theoretical performance the back-end NVMe flash can deliver! Based on this, we can confirm that the solution neither scales nor performs as well as we expected. Calling things by their real names, it’s just an epic fail. Epic. Fail. Period.
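For the skeptics, here’s the “below 10%” arithmetic spelled out, using only numbers from this very write-up (by which, in fact, it lands below one percent):

```python
# Theoretical back-end: four nodes, two NVMe drives each, at the
# nominal 330,000 IOPS per drive we measured bare metal.
drives = 4 * 2
per_drive_iops = 330_000
theoretical = drives * per_drive_iops   # 2,640,000 IOPS

measured = 25_000                       # what the 12-VM cluster delivered

efficiency = measured / theoretical
print(f"theoretical: {theoretical:,} IOPS, measured: {measured:,} IOPS, "
      f"efficiency: {efficiency:.1%}")
```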