Serving sizing in the cloud can be tricky. Unless you are about to do some massive high-performance computing project, super-sizing your cloud virtual machines/instances is probably not what you are thinking about when you log in to you favorite cloud service provider. But from looking at customer data within our system, it certainly does look like a lot of folks are walking up to to their neighborhood cloud provider and saying exactly that: Super Size Me!
Like at a fast-food place, buying in the super size means paying extra costs…and when you are looking for ways to save money on cloud costs, whether for production or non-production resources, the first place to look is at idle and underutilized resources.
Within the ParkMyCloud SaaS platform, we have collected bazillions (scientific term) of samples of performance data for tens of thousands of virtual machines, across hundreds of customers, and the average of all “Average CPU” readings is an amazing (even to us) 4.9%. When you consider that many of our customer are already addressing underutilization by stopping or “parking” their instances when they are not being used, one can easily conclude that the server sizing is out of control and instances are tremendously overbuilt. In other words, they are much more powerful than they need to be…and thus much more expensive than they need to be. As cool as “super sizing” sounds, the real solution is in rightsizing, and ensuring the instance size and type are better tailored to the actual load.
Before we start talking about what is involved in rightsizing, let’s look at a few more statistics, just because the numbers are pretty cool. Looking at utilization data from about 88.9 million instance-hours on AWS – that’s 10,148 years – we find the following:
So, what is this telling us about server sizing? The percentiles alone tell us that more than 95% of our samples are operating at less than 50% Average CPU – which means if we cut the number of CPUs in half for most of our instances, we would probably still be able to carry our workload. The 95th percentile for Peak CPU is 58%, so if we cut all of those CPUs in half we would either have to be OK with a small degradation in performance, or maybe we select an instance to avoid exceeding 99% peak CPU (which happens around the 93rd percentile – still a pretty massive number).
Looking down at the 75th and 50th percentiles we see instances that could possibly benefit from multiple steps down! As shown in the next section, one step down can save you 50% of the cost for an instance. Two steps down can save you 75%!
Before making an actual server sizing change, this data would need to be further analyzed on an instance by instance basis – it may be that many of these instances have bursty behavior, where their CPUs are more highly utilized for short periods of time, and idle all the rest of the time. Such an instance would be better of being parked or stopped for most of the time, and only started up when needed. Or…depending on the duration and magnitude of the burst, might be better off moving to the AWS T instance family, which accumulates credits for bursts of CPU, and is less expensive than the M family, built for a more continuous performance duty cycle. Also – as discussed below – complete rightsizing would entail looking at some other utilization stats as well, like memory, network, etc.
On every cloud provider there is a clear progression of server sizing and prices within any given instance family. The next size up from where you are is usually twice the CPUs and twice the memory, and as might be expected, twice the price.
Here is a small sample of AWS prices in us-east-1 (N. Virginia) to show you what I mean:
Double the memory and/or double the CPU…and double the price.
It is important to note that there is more to instance utilization than just the CPU stats. There are a number of applications with low-CPU but high network, memory, disk utilization, or database IOPs, and so a complete set of stats are needed before making a rightsizing decision.
This can be where rightsizing across instance families makes sense.
On AWS, some of the most commonly used instance types are the T and M general purpose families. Many production applications start out on the M family, as it has a good balance of CPU and memory. Let’s look at the m5.4xlarge as a specific example, shown in the middle row below.
- If you find that such an instance was showing good utilization of its CPU, maybe with an Average CPU of 75% and Peak CPU of 95%, but the memory was extremely underutilized, maybe only consuming 20%, we may want to move to more of a compute-optimized instance family. From the table below, we can see we could move over to a c5.4xlarge, keeping the same number of CPUs, but cutting the RAM in half, saving about 11% of our costs.
- On the other hand, if you find the CPU was significantly underutilized, for example showing an Average CPU of 30% and Peak of 45%, but memory was 85% utilized, we may be better off on a memory-optimized instance family. From the table below, we can move to an r5.2xlarge instance cutting the vCPUs in half, and keeping the same amount of RAM, and saving about 34% of the costs.
Within AWS there are additional considerations on the network side. As shown here, available network performance follows the instance size and type. You may find yourself in a situation where memory and CPU are non-issues, but high network bandwidth is critical, and deliberately super-size an instance. Even in this case, though, you should think about if there is a way to split your workload into multiple smaller instances (and thus multiple network streams) that are less expensive than a beastly machine selected solely on the basis of network performance.
You may also need to consider availability when determining your server sizing. For example, if you need to run in a high-availability mode using an autoscaling group you may be running two instances, either one of which can handle your full load, but both are only 50% active at any given time. As long as they are only 50% active that is fine – but you may want to consider if maybe two instances at half the size would be OK, and then address a surge in load by scaling-up the autoscaling group.
For full cost optimization for your virtual machines, you need to consider appropriate resource scheduling, server sizing, and sustained usage.
- Rightsize instances wherever possible. You can easily save 50% just by going down one size tier – and this applies to production resources as well as development and test systems!
- Modernize your instance types. This is similar to rightsizing, in that you are changing to the same instance type in a newer generation of the same family, where cloud provider efficiency improvements mean lower costs. For example, moving an application from an m3.xlarge to an m5.xlarge can save 28%!
- Park/stop instances when they are not in use. You can save 65% of the cost of a development or test virtual machine by just having it on 12 hours per day on weekdays!
- For systems that must be up continually, (and once you have settled on the correct size instance) consider purchasing reserved instances, which can save 54-75% off the regular cost. If you would like a review of your resource usage to see where you can best utilized reserved instances, please let us know.
Last week, ParkMyCloud released the ability to rightsize and modernize instances. This release helps you identify the virtual machine and database instances that are not fully utilized or on an older family, making smart recommendations for better server sizing and/or family selection for the level of utilization, and then letting you execute the rightsize action. We will also be adding a feature for scheduled rightsizing, allowing you to maintain instance continuity, but reducing its size during periods of lower utilization.
Get started today!
Alibaba Cloud is growing at an amazing rate, recently claiming to have overtaken both Google and IBM as the #3 public cloud provider globally, and certainly the #1 provider in China. Many sites and services hosted outside China are accessible from within China, but can suffer high latency and potentially lost functionality if their web interface requires interaction with blocked social media systems. As such, it is no surprise that a number of our (non-Chinese) customers have expressed interest in actually running virtual machine Alibaba instances in China. In this blog we are going to outline the process…and give an alternate plan.
General Process to Run Alibaba Instances in China
The steps to roll-out a deployment on Alibaba in mainland China are relatively clear:
- Establish a “legal commercial entity” in Mainland China.
- Select what services you want to run on Alibaba Cloud
- Apply for Internet Content Provider (ICP) certification
The first three steps are described in more detail below.
Establish a Legal Commercial Entity
Or putting it another way – you need to have an office in China. This can range from an actual office with your own employees, to a Joint Venture, which is a legal LLC between your organization and an established Chinese company. If your service is more informational in nature and is not actually selling anything via the service, then this can be relatively easy, taking only a couple weeks (at least for the legal side), though you will still need to find a Joint Venture partner and make the deal worth their while financially. For commerce or trade-related services, the complexity, time requirements, and costs start going up significantly.
What to run on Alibaba Cloud
There is a decision-point here, as there is one set of rules for Alibaba-hosted web/app servers, and additional rules for everything else. Base virtual machines, databases and other such core IT building blocks require the ICP registration described below, plus “real-name registration”, where a passport is needed to actually confirm the identity of whomever is purchasing the resource. If all you need is a web server, then you can skip this step. In either case, some of the filing requirements involve having a server and/or DNS record prepared in order to complete the later steps. A web site does not need to be completely finished until launch, but a placeholder may be needed.
Internet Content Provider (ICP) certification
There are two flavors of ICP certification:
- A “simple” ICP Filing – which is the bare minimum needed for informational websites that are not directly generating revenue.
- ICP Commercial Filing – This starts with getting an approved ICP Filing, and then also includes a Commercial License that must be obtained a province/municipality in China. In some cases, this appears to be related to which Alibaba region you are using, and even the physical location of your public IP address.
Many references recommend finding an experienced consultant to guide you through these processes, and it is easy to see why!
OK…WAY too much work. What is Plan B?
The other way to run Alibaba instances in China is to host your site or services in Hong Kong. All of the rules described above apply to “Mainland China”, which does not include Hong Kong. Taiwan is also not included in Mainland China, but Hong Kong has the advantage of being better connected to the rest of China. If the main problem you are trying to solve is to reduce latency to your site for China-based customers, Hong Kong is the closest you can get without actually being there, and Alibaba appears to do a pretty good job optimizing the Hong Kong experience. No local office or legal filings required!
Once you are all set up: Optimize your Costs!
After your instances are set up, make sure you’re optimizing Alibaba costs. Our Mainland China-based customers using Alibaba have confirmed that ParkMyCloud is able to access the Alibaba APIs from our US-based servers – so you can go ahead and try it out.
In this blog we will look at the Google Cloud Committed Use discount program for customers that are willing to “commit” to a certain level of usage of the GCP Compute Engine.
The Committed Use purchasing option is particularly useful if you are certain that you will be continually operating instances in a region and project over the course of a year or more. If your instance usage does not add up to a full month, you may instead want to look at the Google Cloud Sustained Use discounts, which we discussed in a previous blog.
The Google Cloud Committed Use discount program has many similarities to the AWS and Azure Reserved Instances (RI) programs, and a couple unique aspects as well. A comparison of the various cloud providers’ reservation programs is probably worth a blog in itself, so for now, let’s focus on the Google Cloud Committed Use discounts, and the best times and places to use them.
Critical Facts about Google Cloud Committed Use
- The Committed Use discount is best for a stable and predictable workload (you are committed to pay – regardless of whether you use the resources or not!)
- Commitment periods are for either 1 or 3 years.
- Commitments are for a specific region and a specific project. Zone assignment within a region is not a factor.
- Discounts apply to the total number of vCPUs and amount of memory– not to a specific machine or machine type.
- No pre-payment – the commitment cost is billed monthly.
- GCP Committed Use discounts are available for all of the GCP instance families except the shared-core machine types, such as f1-micro and g1-small.
- Committed Use discounts do not apply to the premium charges for sole-tenants, nor can they be used for Preemptible instances.
- The commitments for General Purpose instances are distinct from those for Memory Optimized instances. If you have some of both types, you must buy two different types of Commitment. These types are:
- Standard – n1-standard
- High Memory – n1-highmem
- High CPU – n1-highcpu
- General purpose sole-tenant
How much does it cost?
Each Committed Use purchase must include a specific number of vCPUs and the amount of memory per vCPU. This combination of needing to commit to both a number of vCPUs and amount of Memory can make the purchase of a commitment a bit more complicated if you use a variety of machine types in your environment. The following table illustrates some GCP machine types and the amount of memory automatically provided per vCPU:
||Memory per vCPU
||0.9 – 6.5 GB
While the vCPU aspect is fairly straightforward, the memory commitment to purchase requires a bit of thought. Since it is not based on a specific machine type (like AWS and Azure), you must decide just how much memory to sign-up for. If your set of machine types is homogeneous, this is easy – just match the vCPU/memory ratio to what you run. The good news here is that you are just buying a big blob of memory – you are not restricted to rigidly holding to some vCPU/memory ratio. The billing system will “use up” a chunk of memory for one instance and then move on to the next.
Looking at a specific example, the n1-standard-8 in the Oregon region that we discussed in the Sustained Usage Blog, we can see that the Committed Use discount does amount to some savings, but one must maintain a usage level throughout the month to fully consume the commitment.
Recall from the earlier blog that the base price of this instance type in the GCP Price list already assumes a Sustained Usage discount over a full month, and that the actual “list price” of the instance type is $277.40, and Sustained Usage provides up to a maximum of a 30% discount. With that as a basis, we can see that the net savings for the Committed Use discount over 1 year is 37%, and over 3 years, rises to 55%. This is close to the advertised discount of 57% in the GCP pricing information, which varies by region.
The break-even points in this graph are about 365 hours/month for a 3 year commitment, and 603 hours/month for a 1 year commitment. In other words, if you are sure you will be using a resource less than 365 hours/month over the course of a year, then you probably want to avoid purchasing a 3 year Commitment.
Allocation of Commitments
Because Commitments are assigned on a vCPU/RAM basis, you cannot simply point at a specific instance, and say THAT instance is assigned to my Committed Use discount. Allocation of commitments is handled when your bill is generated, and your discount is applied in a very specific order:
- To custom machine types
- Sole-tenant node groups
- Predefined machine types
This sequence is generally good for the customer, in that it applies the Commitment to the more expensive instances first. For example, an n1-standard-4 instance in Northern Virginia normally costs $109.35. If an equivalent server was constructed as a Custom instance, it would cost $114.76.
For sole-tenant node groups, you are typically paying for an entire physical machine, and the Committed Use discount serves to offset the normal cost for that node. For a sole-tenant node group that is expected to be operating 7x24x365, it makes the most sense to buy Committed Use for the entire system, as you will be paying for the entire machine, regardless of how many instances are running on it.
Commitments are allocated over the course of each hour in a month, distributing the vCPUs and RAM to all of the instances that are operating in that hour. This means you cannot buy a Commitment for 1 vCPU and 3.75 GB of RAM, and run two n1-standard-1 instances for the first half of the month, and then nothing for the second half of the month, expecting it all to be covered by the Commitment. In this scenario, you would be charged for one month at the committed rate, and two weeks at the regular rate (subject to whatever Sustained Usage discount you might accumulate for the second instance).
Thank you for….not…sharing?
Unlike AWS, where Reserved Instances are automatically shared across multiple linked accounts within an organization, GCP Commitments cannot be shared across projects within a billing account. For some companies, this can be a major decision point as to whether or not they commit to Commitments. Within the ParkMyCloud platform, we see customers with as many as 60 linked AWS accounts, all of which share in a pool of Reserved Instances. GCP customers do not have this flexibility with Commitments, being locked-in to the Project in which they were purchased. A number of our customers use AWS Accounts as a mechanism to track resources for teams and projects; GCP has Projects and Quotas for this purpose, and they are not quite as flexible for committed resource sharing. For a larger organization, this lack of sharing means each project needs to be much more careful about how they purchase Commitments.
Google Cloud Committed Use discounts definitely offer great savings for organizations that expect to maintain a certain level of usage of GCP and that expect to keep those resources within a stable set of regions and projects. Since GCP Commitments are assigned at the vCPU/Memory level, they provide excellent flexibility over machine-type-based assignments. With the right GCP usage profile over a year or more, purchase of Google Cloud Committed Use discounts is a no-brainer, especially since there are no up-front costs!
More cloud users are starting to investigate Alibaba Cloud, so the time is ripe for a comparison of AWS vs Alibaba Cloud pricing. Commonly recognized as the #4 cloud provider (from a revenue perspective anyway), Alibaba is one of the fastest growing companies in the space today.
Alibaba has been getting a lot of attention lately, given its rapid growth, and its opportunities for cloud deployments within mainland China. ParkMyCloud is preparing to release support for Alibaba, and that of course has let us focus on pricing and cost savings – our forte. In this article I am going to dive a bit into the pricing of the Alibaba Elastic Compute Service (ECS), and compare it with that of the AWS EC2 service.
Alibaba vs Aliyun
Finding actual pricing for comparison purposes can be a bit complicated, as the prices are listed in a couple different places and do not quite exactly match up. If one searches for Alibaba pricing, one ends up here, which I am going to call the “Alibaba Cloud” site. However, when you actually get an account and want to purchase an instance, you can up here or here, both of which I will call the “Aliyun” site. [Note that you may not be able to see the Aliyun sites without signing up for an account and actually logging-in.]
Aliyun (literally translated “Ali Cloud”) was the original name of the company, and the name was changed to Alibaba Cloud in July 2017. Unsurprisingly, the Aliyun name has stuck around on the actual operational guts of the company, reflecting that it is probably hard-coded all over the place, both internally and externally with customers. (Supernor’s 3rd Conjecture: Engineering can never keep up with Marketing.)
Both sites show that like the other major cloud providers, Alibaba’s pricing model includes a Pay-As-You-Go (PAYG) offering, with per-second billing. Note, however, that in order to save money on stopped instances, one must specifically enable a “No fees for stopped instances” feature. Luckily, this is a global one-time setting for instances operating under a VPC, and you can set it and forget it. Unlike AWS, this feature is not available for any instances with local disks (this and other aspects of the description lead me to believe that Alibaba instances tend to be “sticky” to the underlying hardware instance). On AWS, local disks are described as ephemeral, and are simply deallocated when they are not in use. Like AWS, system/data disks continue to accrue costs even when an instance is stopped.
Both sites also show that Alibaba also has a one-month prepaid Subscription model. Based on a review of the pricing listed for the us-east-1 region on the Alibaba Cloud site, the monthly subscription discount reflects a substantial 30-60% discount compared to the cost of a PAYG instance that is left up for a full month. For a non-production environment that may only need to be up during normal business hours (say, 9 hours per day, weekdays only), one can easily see that it may be more cost-effective to go with the PAYG pricing, and use the ParkMyCloud service to shut the instances down during off-hours, saving 73%.
But this is where the similarities between the sites end. For actual pricing, instance availability, and even the actual instance types, one really needs to dive into a live Alibaba account. In particular, if PAYG is your preference, note that the Alibaba public site appears to have PAYG pricing listed for all of their available instance types, which is not consistent with what I found in the actual purchasing console.
Low-End Instance Types – “Entry Level” and “Basic”
The Alibaba Cloud site breaks down the instance types into “Entry Level” and “Enterprise”, listing numerous instance types under both categories. All of the Entry Level instance types are described as “Shared Performance”, which appears to mean the underlying hardware resources are shared amongst multiple instances in a potentially unpredictable way, or as described by Alibaba: “Their computing performance may be unstable, but the cost is relatively low” – an entertaining description to say the least. I did find these instance types on the internal purchasing site, but did not delve any further with them, as they do not offer a point of reference for our AWS vs. Alibaba Cloud pricing comparison. They may be an interesting path for additional investigation for non-production instance types where unstable computing performance may be OK in exchange for a lower price.
That said…after logging in to the Alibaba management console, reaching the Aliyun side of the website, there is no mention of Entry Level vs Enterprise. Instead we see the top-level options of “Basic Purchase” vs “Advanced Purchase”. Under Basic Purchase, there are four “t5” instance types. The t5 types appear to directly correspond to the first four AWS t2 instance types, in terms of building up CPU credits.
These four instance types do not appear to support the PAYG pricing model. Pricing is only offered on a monthly subscription basis. A 1-year purchase plan is also offered, but the math shows this is just the monthly price x12. It is important to note that the Aliyun site itself has issues, as it lists the t5 instance types in all of the Alibaba regions, but I was unable to purchase any of them in the us-east-1 region – “The configuration for the instance you are creating is currently not supported in this zone.” (A purchase in us-west-1, slightly more expensive, was fine).
The following shows a price comparison for Alibaba vs AWS for “t” instance prices in a number of regions. The AWS prices reflect the hourly PAYG pricing, multiplied by an average 730 hour month. I was not able to get pricing for any AWS China region, so the Alibaba pricing is provided for reference.
While the AWS prices are higher, the AWS instances are PAYG, and thus could be stopped when not being used, common for t2 instances used in a dev-test environment, and potentially saving over 73%. One can easily see that this kind of savings is needed to compete with the comparatively low Alibaba prices. I do have to wonder what is up with that Windows pricing in China….does Microsoft know about this??
Aliyun “Advanced Purchase”
Looking at the “Advanced” side of the Aliyun purchasing site, we get a lot more options, including Pay-As-You-Go instances. To keep the comparison simple, I am going to limit the scope here to a couple of instance types, trying to compare a couple m5 and i3 instances with their Alibaba equivalents. I will list PAYG pricing where offered.
In this table, the listed monthly AWS prices reflect the hourly pay-as-you-go price, multiplied by an average 730 hour month.
The italicized/grey numbers under Alibaba indicate PAYG numbers that had to be pulled from the public-facing website, as the instance type was not available for PAYG purchase on the internal site. From a review of the various options on the internal Aliyun site, it appears the PAYG option is not actually offered for very many standalone instance types on Alibaba…
The main reason I pulled in the PAYG prices from the second source was for auto scaling, which is normally charged at PAYG prices. In Alibaba, “all ECS instances that Auto Scaling automatically creates, or manually adds, to a scaling group will be charged according to their instance types. Note that you will still be charged for Pay-As-You-Go instances even after you stop them.” It is possible, however, to manually add subscription-based instances to an auto scaling group, and configure them to be not removed when the group scales-down.
In general, the full price of the AWS Linux instances over a month is 22-35% higher that of an Alibaba 1-month subscription. A full price AWS Windows instance over a month is 9-25% higher than that of an Alibaba subscription. (And once again, it appears Windows licensing fees are not a factor in China.)
AWS vs Alibaba Cloud Pricing: Alibaba is cheaper, but…
Alibaba definitely comes out as less expensive in this AWS vs Alibaba cloud pricing comparison – the one-month subscription has a definite impact. However, for longer-lived instances, AWS Reserved Instances will certainly be less expensive, running about 40-75% less expensive than AWS PAYG, and thus less than some if not all of the Alibaba monthly subscriptions. AWS RI’s are also more easily applicable to auto scaling groups than a monthly subscription instance.
For non-production instances that can be shut down when not in use, PAYG is less expensive for both cloud providers, where ParkMyCloud can help you schedule the downtime. The difficulty with Alibaba will actually be finding instances types that can actually be purchased with the PAYG option.
When looking to keep Google Cloud Platform (GCP) costs in control, the first place users turn are the discount options offered by the cloud service provider itself, such as Google’s Sustained Use discounts. The question is: do Google Sustained Use discounts actually save you money, when you could just turn the instance off?
How Google Sustained Use discounts work
The idea of the Sustained Use discount is that the longer you run a VM instance in any given month, the bigger discount you will get from the list price. The following shows the incremental discount, and its cumulative impact on a hypothetical $100/month VM instance, where the percentages are against the baseline 730-hour month.
I have to say here that the GCP prices listed can be somewhat misleading unless you read the fine print where it says “Note: Listed monthly pricing includes applicable, automatic sustained use discounts, assuming the instance runs for a 730 hour month.” What this means to us is that the list prices of the instances are actually much higher, but their progressive discount means that no one ever actually pays list price. That said – the list price is what you need to know in order to estimate the actual cost you will pay if you do not plan to leave the instance up for 730 hours/month.
For example, the price shown on the GCP pricing link for an n1-standard-8 instance in the Iowa region is (as of this writing) $194.1800. The list price for this instance would be $194.1800/0.7 = $277.40. This is the figure that must be used as the entry point for the table above to calculate the actual cost, given a certain level of utilization.
What if you parked the VM instance instead?
Here at ParkMyCloud, we’re all about scheduling resources to turn off when you’re not using them, i.e., “parking” them. With this mindset, I wondered about the impact of the sustained use discounts on the schedule-based savings. The following chart plots the cost of that n1-standard-8 VM instance, showing Google sustained use discounts combined with a parking schedule.
We can definitely see progressively more sustained use savings added to progressively less schedule-based savings. I am sure this would end up getting described as the typical hype of “the more you spend, the more you save!” But, the reality of the matter must intrude here and show the more you spend…the more you spend!
Looking at what this means for ParkMyCloud users, here is the monthly uptime for a few common parking schedules, and the associated cost:
These are a far cry from the $277.40 list price, and even the $194.18 max discounted price. From this, it can be seen that even with the most wide-open “work day” schedule of 12 hours per weekday, the schedule is barely nudging over the 182.5 hours needed to hit the first price break of 20%. And even then, the 20% discount is only applied to those hours above 182.5 hours. A welcome discount to be sure, but not very enormously impactful to the bottom line.
Another way our users keep these utilization hours low is by keeping their VM instances “always parked” and temporarily overriding the schedule for a set number of hours (such as for an 8-hour workday) when their non-production resources are needed. When the duration of the override expires, the instance is automatically shut down. Giving the best possible savings, and usually never even hitting the first GCP discount tier.
Do Google Sustained Use discounts save you money?
In short: definitely! At least, they do save you money over the price listed by Google. Do they save you the maximum amount of money possible? No, not if it’s a non-production VM instance that is only needed during a regular workday (although it’s close).
To get the optimal savings on your resources, keep them running only when you’re actually using them, and park them when you’re not. If you meet the threshold of 25% usage for the month, Google’s Sustained Use discounts will kick in, and further lower your cost from the list price. These two savings options combined will optimize your costs and provide the maximum savings.