Damn it! While we were busy configuring the cluster and running all those tests, Dell EMC decided to stop selling ScaleIO and giving it away for free. Well, apparently, no one needs such a piece of shit even for free!
4-Node all-flash VMware ScaleIO Cluster
Deploy an EMC ScaleIO cluster, tune it, and squeeze out every IOPS it can give to see how fast it can go. For free!
So, EMC ScaleIO is a software-defined storage solution that transforms block storage infrastructure into an “efficient and easier to manage” one. Basically, the software creates a Storage Area Network, or SAN to put it simply, out of the local servers’ storage. It pools the storage resources of disks attached directly to the servers, creating a virtual storage pool with various performance levels. All of this should give us flexible scaling options. OK, these guys also claim they can “build an enterprise grade software-defined storage architecture utilizing standard x86 servers and Ethernet network.” Enterprise-grade SDS architecture on those shitty servers you already have? Sounds good, if it’s true. They don’t seem to stop fucking with our imagination and we’re getting really wet: “ScaleIO is designed to massively scale from three, to hundreds or even thousands of nodes. The scalability of performance is linear with regard to the growth of the deployment. Such massive I/O parallelism eliminates bottlenecks. Throughput and IOPS scale in direct proportion to the number of servers and local storage devices added to the system, improving cost/performance rates with growth.” No shit? That sounds serious. Wonder how there are still any other SDS vendors on the market…
Damn, this thing looks good, so we wanna try it right away!
But here is what EMC ScaleIO Limited Software License Agreement says on that:
Customer shall not, without EMC’s prior written consent: (iii) perform or disclose to any third party the results of any comparative or competitive analyses, benchmark testing or analyses of Software performed by or on behalf of Customer;
So, can we do that? NO…
Will we do that? YES! Fuck the rules and all the attempts to stop us from testing and revealing all the fuckups around the software. Moreover, ScaleIO has its free edition allowed for non-production purposes (who could doubt that?) and has no restrictions, so no problem here.
- Measure the raw Samsung SSD 960 EVO M.2 NVMe (500GB) performance in Windows Server 2016 environment.
- Test a single VM’s performance in the four-node ScaleIO cluster. At this point, we’ll play around with the vCPU/vCORE ratio to derive the optimal VM properties.
- Study the correlation between the four-node ScaleIO cluster performance and the number of VMs assigned per node. For this purpose, we’ll vary the number of VMs on one node and measure the performance change. After that, we’ll settle on the optimal number of VMs per node.
- Clone the optimal number of VMs estimated at the previous stage to all cluster nodes and measure their total performance.
Below, we describe the basic setup for testing Samsung SSD 960 EVO M.2 NVMe “raw” performance on Windows Server 2016.
Node: Dell R730, CPU 2x Intel Xeon E5-2683 v3 @ 2.00 GHz, RAM 128GB
SSD: Samsung SSD NVMe 960 EVO M.2 (500GB)
OS: Windows Server 2016 x64 Datacenter
To estimate the EMC ScaleIO four-node cluster performance we’ll use these fellows:
4x Node: Dell R730, CPU 2x Intel Xeon E5-2683 v3 @ 2.00 GHz, RAM 128GB, 2x Samsung SSD 960 EVO M.2 NVMe (500GB), 1x Mellanox ConnectX-4 (100GbE); Mellanox SX1012 40GbE switch.
Hypervisor: ESXi 6.5 Update 1, EMC ScaleIO v2.0-13000.211
OK, the network interconnection diagram looks like this:
We ran the performance tests with the DiskSPD v2.0.17 and FIO v3.0 tools.
The I/O performance was measured with the 4K random read load pattern under varying queue depth (QD). We’ve used the following QD values: QD=1,2,4,6,8,10,12,14,16,32,64,128. Test duration: 360 seconds, warmup time: 60 seconds.
DiskSPD launching parameters
diskspd.exe -t8 -b4K -r -w0 -o1 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o2 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o4 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o6 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o8 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o10 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o12 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o14 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o16 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o32 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o64 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
diskspd.exe -t8 -b4K -r -w0 -o128 -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1
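The twelve invocations differ only in the -o (queue depth) value, so the whole sweep can be generated with a short loop (a sketch; pipe the output to a shell, or swap echo for the real call once diskspd.exe is on the PATH):

```shell
# Print one DiskSPD command per queue depth; "#1" targets PhysicalDrive1,
# as in the manual invocations above.
for qd in 1 2 4 6 8 10 12 14 16 32 64 128; do
  echo "diskspd.exe -t8 -b4K -r -w0 -o$qd -d360 -W60 -Su -L -a0,2,4,6,8,10,12,14 #1"
done
```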
FIO launching parameters
[global]
numjobs=8
iodepth=1
loops=1
ioengine=windowsaio
cpus_allowed=0,2,4,6,8,10,12,14
direct=1
ramp_time=60
runtime=360
filename=\\.\PhysicalDrive1

[4k rnd read QD1]
iodepth=1
rw=randread
bs=4k
stonewall

[4k rnd read QD2]
iodepth=2
rw=randread
bs=4k
stonewall

[4k rnd read QD4]
iodepth=4
rw=randread
bs=4k
stonewall

[4k rnd read QD6]
iodepth=6
rw=randread
bs=4k
stonewall

[4k rnd read QD8]
iodepth=8
rw=randread
bs=4k
stonewall

[4k rnd read QD10]
iodepth=10
rw=randread
bs=4k
stonewall

[4k rnd read QD12]
iodepth=12
rw=randread
bs=4k
stonewall

[4k rnd read QD14]
iodepth=14
rw=randread
bs=4k
stonewall

[4k rnd read QD16]
iodepth=16
rw=randread
bs=4k
stonewall

[4k rnd read QD32]
iodepth=32
rw=randread
bs=4k
stonewall

[4k rnd read QD64]
iodepth=64
rw=randread
bs=4k
stonewall

[4k rnd read QD128]
iodepth=128
rw=randread
bs=4k
stonewall
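Similarly, the twelve per-QD fio job sections can be generated rather than typed by hand (a sketch; append the output to the [global] section above and feed the resulting file to fio):

```shell
# Emit one fio job section per queue depth for the 4K random-read sweep.
for qd in 1 2 4 6 8 10 12 14 16 32 64 128; do
  printf '[4k rnd read QD%s]\niodepth=%s\nrw=randread\nbs=4k\nstonewall\n\n' "$qd" "$qd"
done
```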
Testing Samsung SSD 960 EVO M.2 NVMe (500GB) performance in a bare metal Windows Server 2016 environment.
Since our research uses a storage pool built from Samsung SSD 960 EVO M.2 NVMe drives (500GB) as the underlying storage for the ScaleIO cluster, we’ve performed preliminary performance testing of a single NVMe drive in a Windows Server 2016 environment. No virtualization here, to keep any possible virtualization overhead from interfering with our tests.
The measurements were held on an unformatted disk with our common 4k random read pattern using DiskSPD and FIO utilities.
Testing Samsung SSD 960 EVO M.2 NVMe (500GB) with four threads
Testing Samsung SSD 960 EVO M.2 NVMe (500GB) with eight threads
What we’ve learned from the tests above
According to the obtained data, the Samsung SSD 960 EVO M.2 NVMe (500GB) reached its maximum claimed performance (330,000 IOPS) under 4K random reads with eight threads and QD=10. We’ll use this figure as the “gold standard” in our further experiments.
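As a quick plausibility check (our own back-of-the-envelope math, not an EMC or Samsung figure), 330,000 IOPS at 4 KiB per I/O translates into:

```shell
# 330,000 IOPS x 4096 bytes per I/O, expressed in decimal GB/s.
awk 'BEGIN { printf "%.2f GB/s\n", 330000 * 4096 / 1e9 }'   # prints 1.35 GB/s
```

That is well below the 960 EVO’s rated sequential read throughput (around 3.2 GB/s), so at this point the drive is limited by small random I/O handling, not by raw bandwidth.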
Test EMC ScaleIO four-node cluster
Below, you can see the steps we’ll follow to test the EMC ScaleIO cluster performance:
- Deploy VMware ESXi 6.5 Update1 on the servers and connect them to a single vCenter.
- Install EMC ScaleIO Plugin in vCenter.
- Install EMC ScaleIO System on the selected nodes.
- Once the ScaleIO system is configured (one NVMe for storage and one NVMe for RFcache per ESXi host), we get a 1.8TB storage pool. Then, build a 664GB volume from the storage pool and create a volume-based ScaleIO Datastore on one of our ESXi hosts.
- Create a two-disk VM. The first disk, a 25GB Virtual Disk0 located on the ESXi Datastore, will be used for the operating system. The second one, a maximum-size Virtual Disk1 located on the ScaleIO Datastore, is intended for performance testing with DiskSPD and FIO.
WATCH OUT: It is necessary to create a Thick provision eager zeroed disk for testing and connect it to the additional virtual SCSI controller.
- Vary vCPU number assigned to one VM to estimate its optimal properties that grant maximum performance.
- Clone the optimally-configured VM to the same node and run tests on multiple VMs simultaneously. The disk volume, in that case, is distributed among all VMs. We’ll keep increasing the number of VMs until their overall performance stops growing.
- To estimate the solution’s scalability, clone the previously estimated number of VMs to all cluster nodes and run measurements on all VMs.
NOTE: Prior to performance measurements, we need to fill the disk with data. This should be done every time a new Virtual Disk is created for a VM, before testing. We’ll use dd.exe for that purpose.
dd launching parameters
dd.exe bs=1M if=/dev/random of=\\?\Device\Harddisk1\DR1 --progress
Enough theory, let’s jump into some real work and configure our 4-node ScaleIO cluster.
Stages of the setup preparation and testing:
- Install ESXi on four bare-metal servers
- Install VMware PowerCLI
- Install vCenter
- Add ESXi hosts to VMware vCenter
- Install ScaleIO plugin in vCenter
- Create a ScaleIO VM (SVM) template in vCenter
- Install ScaleIO on the hosts
- Create VMs using the datastore resources under ScaleIO management and run tests with IOmeter on them
- Clone VMs and do tests on all four hosts.
OK, the first step. Download ScaleIO for VMware from the official website. Then, unpack the archive, go to the vSphere Plugin folder, and install ScaleIO plugin for vSphere:
To install the ScaleIO plugin for VMware, run the ScaleIOPluginSetup-2.0-13000.211 script using PowerCLI:
After entering the login and password for connecting to vCenter, select the 1-Register ScaleIO option. Choose the Standard (S) installation type.
Now that we have successfully installed the ScaleIO plugin, log in to vCenter. Also, do not close PowerCLI. We’ll need it later.
The EMC ScaleIO plugin icon appears in the Home tab.
In order to continue the ScaleIO installation, copy the SVM template to the four hosts. What the fuck is SVM? No, not a support vector machine — it stands for ScaleIO VM, the template used to deploy the ScaleIO virtual machines. For this purpose, choose 3-Create SVM template in the PowerCLI window. It is also necessary to specify the path to the SVM image in *.ova format. It’s located in the ScaleIO_188.8.131.52_ESX_Download folder of the unpacked ScaleIO archive.
Next, set the names for datastores created on ESXi hosts for copying the SVM template.
Once the script has been successfully executed on all hosts, navigate to the EMC ScaleIO tab in vCenter.
Now, we can proceed with installing ScaleIO Data Client (SDC) on ESX. SDC is a device driver installed on each host with applications or file systems requiring access to the ScaleIO Virtual SAN block devices.
OK, for this purpose, choose hosts where you intend to install SDC and enter their logins and passwords as it is shown in the image below:
Once SDC is installed, we can switch to ScaleIO deployment.
On the vCenter home tab, choose Deploy ScaleIO environment. You’ll see the first step of ScaleIO VMware Installation Wizard. Select the Create new ScaleIO system option.
Carefully read the license agreement. It may take several days to understand that crap. In case you wanna prevent your eyes from bleeding, mark “I agree to whatever bullshit you’re saying here just to finally deploy your awesome software!”, or something like that.
Enter the name for a new ScaleIO system.
Select the cluster hosts where the ScaleIO system will be installed.
Assign roles for ScaleIO hosts and configure ScaleIO. Here, we use a three-node cluster.
Three control nodes are needed for the proper functioning of the system. They hold all the information about the health of the array, its components, and the processes running. At least one control node must stay alive for ScaleIO to function properly. In our configuration, we use an additional fourth node.
On the Configure Performance, Sizing, Syslog tab specify only the DNS server.
Afterwards, add a Protection Domain after specifying its name.
Why use Protection Domain? Well, we’ll get several small advantages for our cluster:
– it mitigates the impact of multiple failures in large clusters
– isolates performance if required
In the Configure Storage Pool tab, enter the Storage Pool name and add it to the Protection Domain.
Skip step 9. No need to waste our time here.
OK. Now, specify the hosts for ScaleIO Data Server (SDS) installation. An SDS runs on every server and contributes local storage space to an aggregated pool of storage within the ScaleIO virtual SAN. Disk partitions, drives, and even files can serve as local storage. The SDS, in its turn, performs the back-end I/O operations based on SDC requests.
At the next step, choose all free devices. Here, we select the NVMe drives.
Further, select the ESXi hosts for SDCs installation.
Let’s take a few lines to explain what the SDC really is. As EMC puts it, it is a lightweight block device driver that exposes ScaleIO shared block volumes to applications. The SDC runs on the same server as the application: the application issues an I/O request and the SDC fulfills it regardless of where the particular blocks physically reside. The SDC communicates with other nodes (beyond its own local server) over a TCP/IP-based protocol, so it is fully routable. We can also modify the ScaleIO configuration parameters to allow two SDCs to access the same data.
Afterwards, select the host to deploy the ScaleIO Gateway VM, which will be used to collect logs and upgrade ScaleIO components.
In the next tab, select the templates to create virtual machines for controlling the ScaleIO system.
Then, configure the network for our ScaleIO VMs.
On the “Configure SVM” step, specify IP addresses for ScaleIO virtual machines and the IP for accessing the virtual ScaleIO cluster (Cluster Virtual IP).
Wait until the end of installation and configuration of the virtual ScaleIO cluster. Think there could be some fuckups? Let’s check that. Just click “View log”. Well, everything seems OK and we’re good to go.
Create Volume from the Storage Pool. For this purpose, follow the path: vCenter=>Home=>EMC ScaleIO=>Storage Pools.
Press Create Volume as it is shown in the screenshot below:
Nothing special here… set the name, the number of volumes, the size, and provisioning. Yeah, also select the hosts.
After the successful Volume creation, create a Datastore.
First things first! Select the host where the Datastore will be created.
Now, select the disk.
Use the recently created datastore to create virtual machines.
Prior to launching the VM, check whether all settings are appropriate (CPU, RAM, virtual disks). Our configuration: 20 CPUs, 30GB RAM, 2x virtual disks (40GB and 100GB, “thick provision eager zeroed”), and 1x network adapter connected to the Mellanox SX1012 through a virtual switch.
OK. We can launch the VMs after that.
Once the cluster was created, we tested a single VM’s (Virtual Disk0 – 25GB, Virtual Disk1 – 860GB) performance using the following vCPU/vCORE combinations:
Testing one VM (2CPU/1CorePerSocket) in a four-node VMware ScaleIO cluster:
While running our tests, we also studied how Virtual Disk1 performance changes over time.
So what can we see? Well, first of all, a performance drop right after the test starts. Secondly, a jump in performance when switching from QD16 to QD32.
The natural question would be: what the fuck? Our well-trained, highly-educated, and extremely experienced agents say this is caused by the RAM Read Cache. To verify this, we repeated the test with RAM Read Cache disabled and with additional QD values in the 16-32 interval.
Testing one VM (2CPU/1CorePerSocket) in a four-node VMware ScaleIO cluster RAM Read Cache Disabled:
The diagram below displays the Virtual Disk performance change with ScaleIO Volume RAM Read Cache disabled.
Ruling: with RAM Read Cache disabled, the disk performance improves by roughly 10% and no performance drop occurs after the test start. Therefore, we’ll keep ScaleIO Volume RAM Read Cache disabled during our further measurements.
Testing one VM (4CPU/1CorePerSocket) in a four-node VMware ScaleIO cluster:
Testing one VM (6CPU/1CorePerSocket) in a four-node VMware ScaleIO cluster:
Testing one VM (8CPU/1CorePerSocket) in a four-node VMware ScaleIO cluster:
Suspect status update
What do the experiments say? Well, we can see that the disk’s maximum performance barely depends on the number of CPUs assigned to the VM and sits around 93,000-94,000 IOPS. For our further testing, we will use the 4 CPU/1CorePerSocket configuration since it provides the max IOPS numbers.
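For a rough comparison against the bare-metal baseline (our own arithmetic, using the ~94,000 IOPS ceiling above and the 330,000 IOPS raw-drive figure from earlier):

```shell
# Share of the raw drive's 330,000 IOPS a single VM gets through ScaleIO.
awk 'BEGIN { printf "%.0f%%\n", 94000 / 330000 * 100 }'   # prints 28%
```

In other words, a single VM on the ScaleIO volume sees less than a third of what one NVMe drive delivers on bare metal.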
OK, we can now go on and check the combined performance of several VMs. We’ll increase the number of VMs, keeping them all assigned to a single node, until their overall performance stops growing. Below are the configurations we test:
- 2xVM (4CPU/1CorePerSocket/Virtual Disk1=310GB)
- 3xVM (4CPU/1CorePerSocket/Virtual Disk1=210GB)
- 4xVM (4CPU/1CorePerSocket/Virtual Disk1=152GB)
Testing two VMs (4CPU/1CorePerSocket/Virtual Disk1=310GB configuration) in a four-node VMware ScaleIO cluster:
Testing three VMs (4CPU/1CorePerSocket/Virtual Disk1=210GB configuration) in a four-node VMware ScaleIO cluster:
Test results interpretation
Further increasing the number of VMs assigned to one node does not grow the overall ScaleIO Volume performance. We haven’t even reached the scalability tests, but EMC ScaleIO has managed to fuck things up already: the total ScaleIO Volume performance while running two and three VMs on one node turned out to be even lower than with one VM. Based on this “outstanding” result, we see no point in raising their number per host any further.
Let the trial begin
Testing EMC ScaleIO Cluster scalability
Further, we cloned that single VM (4CPU/1CorePerSocket configuration) to the other cluster nodes. In total, we got four VMs running in the cluster, so the tested cluster has the following configuration: 4xVM/4CPU/1CorePerSocket/Virtual Disk1=152GB. We’ve also reduced the tested virtual disk size to 152GB since the overall ScaleIO Datastore volume is constant and has to be distributed equally among the four VMs.
Testing 4xVM/4CPU/1CorePerSocket/Virtual Disk1=152GB in a four-node VMware ScaleIO cluster:
Our experiment provided us with sufficient data on EMC ScaleIO cluster performance. So, we’ve tested a four-node ScaleIO cluster with a single VM (beg your pardon, we still can’t believe what we’re typing here) on each node, since assigning more VMs per node only hurt performance. The tests show that using Samsung SSD 960 EVO M.2 NVMe (500GB) drives as the underlying storage for a VMware ScaleIO Storage Pool (configuration: 1x NVMe for storage/1x NVMe for RFcache on each ESXi host) does not lead to any considerable growth of the overall volume performance when scaling VMs across all four hosts. To be more specific, the overall performance growth was slightly above 22%. What’s the verdict? Well, we highly recommend EMC ScaleIO as a solution for creating a virtual shared storage pool in your enterprise company… if you live in a parallel universe where enterprise means a cold basement with several dust-covered nodes running one VM (better with some small workload, not to shock ScaleIO). To sum up, EMC ScaleIO’s scaling capabilities ain’t worth a single fuck unless the nodes exist only in your imagination.
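To put that “slightly above 22%” growth into numbers (our own sketch; it assumes ideal linear scaling across four nodes would be 4.00x a single node’s throughput):

```shell
# Observed growth when going from one node to four: ~1.22x.
# Ideal linear growth across four nodes: 4.00x.
awk 'BEGIN { printf "scaling efficiency: %.1f%%\n", 1.22 / 4.0 * 100 }'   # prints 30.5%
```

Roughly 30% scaling efficiency is a far cry from the “linear with regard to the growth of the deployment” scalability the marketing promised.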