While going through our recent Cloud Cost Optimization Competency review with AWS, one of the things they asked us to do was remove the ability for customers to sign up for our service using AWS IAM User credentials. They loved the fact that we already supported AWS IAM Role credentials, but their concern was that AWS IAM User credentials could conceivably be stolen and used from outside AWS by anyone. (I say inconceivable, but hey, it is AWS.) This was a bit of a bitter pill to swallow, as some customers find IAM Users easier to understand and manage than IAM Roles. The #1 challenge of any SaaS cloud management platform like ours is customer onboarding, where every step in the process is one more hurdle to overcome.
While we could debate how difficult it would be to steal a customer cloud credential from our system, the key (pun intended) thing here is why is an IAM Role preferred over an IAM User?
Before answering that question, I think it is important to understand that an IAM Role is not a “role” in perhaps the traditional sense of Active Directory or LDAP. An AWS IAM Role is not something that is assigned to a “User” as a set of permissions – it is a set of capabilities that can be assumed by some other entity. Like putting on a hat, you only need it at certain times, and it is not like it is part of who you are. As AWS defines the difference in their FAQ:
An IAM user has permanent long-term credentials and is used to directly interact with AWS services. An IAM role does not have any credentials and cannot make direct requests to AWS services. IAM roles are meant to be assumed by authorized entities, such as IAM users, applications, or an AWS service such as EC2.
(The first line of that explanation alone has its own issues, but we will come back to that…)
The short answer for SaaS is that a customer IAM Role credential can only be used by servers running from within the SaaS provider’s AWS Account…and IAM User credentials can be used by anyone from anywhere. By constraining the potential origin of AWS API calls, a HUGE amount of risk is removed, and the ability to isolate and mitigate any issues is improved.
What is SaaS?
Software as a Service (SaaS) means different things to different vendors. Some vendors claim to be “SaaS” for their pre-built virtual machine images that you can run in your cloud. Maybe an intrusion detection system or a piece of a cloud management system. In my (truly humble) opinion this is not a SaaS – this is just another flavor of “on prem” (on-premise), where you are running someone’s software in your environment. Call it “in-cloud” if you do not want to call it “on-prem”, but it is not really SaaS, and it does not have the challenges you will experience with a “true” SaaS product – coming in from the outside. A core component of SaaS is that it is centrally hosted – outside your cloud. For an internal service, you might relax permissions and access mechanisms somewhat, as you have total control over data ingress/egress. A service running IN your network…where you have total control over data ingress/egress…is not the same as external access – the epitome of SaaS. Anyway: </soapbox>. (Or maybe </rant> depending on the tone you picked up along the way…)
The kind of SaaS I am focussing on for this blog is SaaS for cloud management, which can include cloud diagramming tools, configuration management tools, storage management+backup tools, or cost optimization tools like ParkMyCloud.
AWS has enabled SaaS for secure cloud management more than any other cloud provider. A bold statement, but let’s break that down a bit. We at ParkMyCloud help our customers optimize their expenses at all of the major cloud providers and so obviously all the providers allow for access from “outside”. Whether it is an Azure subscription, a GCP project, or an Alibaba account, these CSPs are chiefly focussed on customer internal cross-domain access. I.e., the ability of the “parent” account to see and manage the “child” accounts. Management within an organization. But AWS truly acknowledges and embraces SaaS.
You could attribute my bold statement to an aficionado/fanboi notion of AWS having a bigger ecosystem vision, or more specifically that they simply have a better notion of how the Real World works, and how that has evolved in The Cloud. The fact is that companies buy IT products from other companies…and in the cloud that enables this thing called Software as a Service, or SaaS. All of the cloud providers have enabled SaaS for cloud access, but AWS has enabled SaaS for more secure cloud access.
AWS IAM Cross-account Roles
So…where was I? Oh…right…Secure SaaS access.
OK, so AWS enables cross-account access. You can see this in the IAM Create Role screen in the AWS Console:
If your organization owns multiple AWS accounts (inside or outside of an AWS “organization”), cross-account access allows you to use a parent account to manage multiple child accounts. For SaaS, cross-account access allows a 3rd-party SaaS provider to see/manage/do stuff with/for your accounts.
Looking a little deeper into this screen, we see that cross-account access requires you to specify the target account for the access:
The cross-account role allows you to explicitly state which other AWS account can use this role. More specifically: which other AWS account can assume this role.
But there is an additional option here talking about requiring an “external ID”…what is that about?
Within multiple accounts in a single organization, this may allow you to differentiate between multiple roles between accounts….maybe granting certain permissions to your DevOps folks…other permissions to Accounting…and still other permissions to IT/network management.
If you are a security person, AWS has some very interesting discussions about the “confused deputy” problem mentioned on this screen. It discusses how a hostile 3rd party might guess the ARN used to leverage this IAM Role, and states that “AWS does not treat the external ID as a secret” – which is all totally true from the AWS side. But summing it up: cross-account IAM Roles’ external IDs do not protect you from insider attacks. For an outsider, the External ID is as secret as the SaaS provider makes it.
Looking at it from the external SaaS side, we get a bit of a different perspective. For SaaS, the External ID allows for multiple entry points…and/or a pre-shared secret. At ParkMyCloud (and probably most other SaaS providers) we only need one entry point, so we lean toward the pre-shared secret side of things. When we, and other security-conscious SaaS providers, ask for access, we request an account credential, explicitly giving our AWS account ID and an External ID that is unique for the customer. For example, in our UI, you will see our account ID and a customer-unique External ID:
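To make the account ID plus External ID pairing concrete, here is a minimal sketch of the kind of trust policy a customer-side cross-account role ends up with. The account ID and External ID values are placeholders I made up, and the helper name `build_trust_policy` is hypothetical; only the policy layout follows the standard IAM trust policy format.

```python
import json


def build_trust_policy(saas_account_id: str, external_id: str) -> dict:
    """Return a role trust policy that lets one AWS account assume the role,
    but only when the caller supplies the expected External ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Only principals in the SaaS provider's account may assume the role.
                "Principal": {"AWS": f"arn:aws:iam::{saas_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The External ID is the customer-unique value agreed with the provider.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


if __name__ == "__main__":
    # Both values are placeholders, not real identifiers.
    policy = build_trust_policy("111122223333", "example-external-id")
    print(json.dumps(policy, indent=2))
```

The console wizard generates something equivalent when you fill in the account ID and tick the External ID box, so you would rarely write this by hand.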
Assume Role…and hacking SaaS
If we look back at the definition of the AWS IAM Role, we see that IAM roles are meant to be assumed by authorized entities. For an entity to assume a role, that party has to be an AWS entity that has the AWS sts:AssumeRole permission for the account in which it lives. Breaking that down a bit, the sts component of this permission tells us this comes from the AWS Security Token Service (STS), which can handle whole chains of delegation of permissions. For ParkMyCloud, we grant our servers in AWS an IAM Role that has the sts:AssumeRole permission for our account. In turn, this allows our servers to use the customer account ID and external ID to request permission to “Assume” our limited-access role to manage a customer’s virtual machines.
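As a rough sketch of what that request looks like from the SaaS server's side, the snippet below builds the parameters for an STS AssumeRole call and, separately, makes the call with boto3. The role name, account ID, External ID, and both helper names are hypothetical; this is an illustration under those assumptions, not our production code.

```python
def build_assume_role_request(customer_account_id: str, role_name: str,
                              external_id: str,
                              session_name: str = "saas-session") -> dict:
    """Build the keyword arguments for an STS AssumeRole call."""
    return {
        "RoleArn": f"arn:aws:iam::{customer_account_id}:role/{role_name}",
        "RoleSessionName": session_name,
        "ExternalId": external_id,
        # Short-lived credentials: 900 seconds is the minimum STS allows.
        "DurationSeconds": 900,
    }


def assume_customer_role(request: dict) -> dict:
    """Call STS and return temporary credentials for the customer role.

    Requires the third-party boto3 package and must run as an AWS identity
    that holds the sts:AssumeRole permission.
    """
    import boto3  # imported lazily so the helper above works without boto3

    sts = boto3.client("sts")
    response = sts.assume_role(**request)
    # Contains AccessKeyId, SecretAccessKey, SessionToken and Expiration.
    return response["Credentials"]


if __name__ == "__main__":
    # Placeholder values only; no AWS call is made here.
    print(build_assume_role_request("444455556666", "ExampleSaaSRole",
                                    "example-external-id"))
```

The credentials that come back expire on their own, which is a large part of why the role approach is easier to contain than a long-lived key pair.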
From the security perspective, this means if a hostile party wanted to leverage SaaS to get access to a SaaS customer cloud account via an IAM Role, they would need to:
Learn an account ID for a target organization
Find a SaaS provider leveraged by that target organization
Hack the SaaS enough to learn the External ID component of the target customer account credentials
Completely compromise one of the SaaS servers within AWS, allowing for execution of commands/APIs to the customer account (also within AWS), using the account ID, External ID, and Assume Role privileges of that server to gain access to the customer account.
Have fun with the SaaS customer’s cloud, but ONLY from that SaaS server.
So….kind-of a short recipe of what is needed to hack a SaaS customer. (Yikes!) But this is where your access privileges come in. The access privileges granted via your IAM role determine the size of the “window” through which the SaaS provider (or the bad guys) can access your cloud account. A reputable SaaS provider (ahem) will keep this window as small as possible, commensurate with the Least Privilege needed to accomplish their mission.
Also – SaaS services are updated often enough that the service might have to be penetrated multiple times to maintain access to a customer environment.
So why are AWS IAM Users bad?
Going back to the beginning, our quote from AWS stated “An IAM user has permanent long-term credentials and is used to directly interact with AWS services”. There are a couple frightening things here.
“Permanent long-term credentials” means that unless you have done something pretty cool with your AWS environment, that IAM User credential does not expire. An IAM User credential consists of a Key ID and Secret Access Key (an AWS-generated pre-shared secret) that are good until you delete them.
“…directly interact with AWS services” means that they do not have to be used from within your AWS account. Or from any other AWS account. Or from your continent, planet, galaxy, dimension, etc. That Key ID and Secret can be used by anyone and anywhere.
From the security perspective, this means if a hostile party wanted to leverage SaaS to get access to a SaaS customer cloud account via an IAM User, they would need to:
Learn an account ID for a target organization
Find a SaaS provider leveraged by that target organization
Hack the SaaS enough to get the IAM User credentials.
Have fun…from anywhere.
So this list may seem only a little bit shorter, but the barriers to compromise are lower, and the window for long-term compromise is MUCH longer. Any new protections or updates for the SaaS servers have no impact on an existing compromise. The horse has bolted, so shutting the barn door will not help at all.
What if the SaaS provider is not in AWS? Or…what if *I* am not in AWS?
The other cloud providers provide some variation of an access identifier and a pre-shared secret. Unlike AWS, both Azure and Google Cloud credentials can be created with expiration dates, somewhat limiting the window of exposure. Google does a great job of describing their process for Service Accounts here. In the Azure console, service accounts are found under Azure AD>App registrations>All apps>App details>Settings>Keys, and passwords can be set to expire in 1 year, 2 years, or never. I strongly recommend you set reminders someplace for these expiration dates, as it can be tricky to debug an expired service account password for SaaS.
For all providers you can also limit your exposure by setting a very limited access role for your SaaS accounts, as we describe in our other blog here.
Azure does give SaaS providers the ability to create secure “multi-tenant” apps that can be shared across multiple customers. However, the APIs for SaaS cloud management typically flow in the other direction, reaching into the customer environment, rather than the other way around.
IAM Role – the Clear Winner
Fortunately, when AWS “strongly recommended” that we should discontinue support for AWS IAM User-based permissions, we already supported an upgrade path, allowing our customers to migrate from IAM User to IAM Role without losing any account configuration (phew!). We have found some scenarios where IAM Role cannot be used – like between the AWS partitions of AWS global, AWS China, and the AWS US GovCloud. For GovCloud, we support ParkMyCloud SaaS by running another “instance” of ParkMyCloud from within GovCloud, where cross-account IAM Role is supported.
With the additional security protections provided for cross-account access, AWS IAM Role access is the clear winner for SaaS access, both within AWS and across all the various cloud providers.
The principle of least privilege is important to understand and follow as you adopt SaaS technologies. The market for SaaS-based tools is growing rapidly, and can typically be activated much more quickly and cheaply than creating a special-purpose virtual machine within your cloud environment. In this blog, I am focusing specifically on the SaaS cloud management tool area, which can include services like cloud diagramming tools, configuration management tools, storage management and backup tools, or cost optimization tools like ParkMyCloud.
Why the Principle of Least Privilege is Important
Before you start using such tools and services, you should carefully consider how much access you are granting into your cloud. The principle of least privilege is a fundamental tenet of any identity and access control policy, and basically means a service or user should have no more permissions than absolutely required in order to do a job.
Cloud account privileges and permissions are typically granted via roles and permissions. All of the cloud providers provide numerous predefined roles, which consist of pre-packaged sets of permissions. Before granting any requested predefined role to a 3rd-party, you should really investigate the permissions or security policy embedded in that role. In many (most?) cases, you are likely to find that the predefined roles give a lot more information or capabilities away than you are really likely to want.
SaaS Onboarding – Where Least Privilege Can Get Lost
For on-boarding of new SaaS customers, the initial permissions setup is often the most complicated step, and some SaaS cloud management platforms try to simplify the process by asking for one of these predefined roles. For example, the Amazon ReadOnlyAccess role or the Azure Reader role or the GCP roles/viewer role. While this certainly makes onboarding of SaaS easier, it also exposes you to a massive data leakage problem. For example, with the Amazon ReadOnlyAccess role a cloud diagramming tool can certainly get a good enough view of your cloud to create a map…but you are also granting read access for all of your IAM Users, CloudTrail events and history, any S3 objects you have not locked-down with a distinct bucket policy, and….lots of other stuff you probably do not even know you have. It is kinda like saying – “Here, please come on in and look at all of our confidential file cabinets – and it is OK for you to make copies of anything interesting, just please do not change any of our secrets to something else…” No problem, right?
Obviously, least privilege becomes especially critical when giving permissions to a SaaS provider, given the risk of trusting your cloud environment to some unknown party.
Custom Policies for SaaS
Because of the broad nature of many of their predefined roles, all of the major cloud providers give you the ability to assign specific permissions to both internal and external users through Policies. For example, the following policy snippets show the minimum permissions ParkMyCloud requests to list, start, and stop virtual machines on AWS, Google, and Azure.
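As a rough illustration for AWS only, a limited-access policy for listing, starting, and stopping instances might be built along these lines. The exact action list is my assumption for the sake of the example and is not necessarily the policy ParkMyCloud requests; the helper name is hypothetical.

```python
import json


def build_limited_access_policy() -> dict:
    """Return an IAM policy allowing only list/start/stop of EC2 instances."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": [
                    "ec2:DescribeInstances",  # list instances and their state
                    "ec2:StartInstances",
                    "ec2:StopInstances",
                ],
                # No resource scoping in this sketch; every instance is in range.
                "Resource": "*",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_limited_access_policy(), indent=2))
```

Compare those three actions with everything that comes bundled in a predefined read-only role and the difference in exposure is obvious.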
Creating and assigning these permissions makes SaaS onboarding a bit more complicated, but it is worth the effort in terms of reducing your exposure.
Other Policy Restrictions
What if you want to give a SaaS provider permissions, but lock it down to only certain resources or certain regions? AWS and Azure allow you to specify in the policy which resources the policy can be applied to. Google Cloud….not so much. AWS takes this the farthest, allowing for very robust policies down to specific services, and the addition of tag-based caveats for the policy permissions, for example:
This policy locks down the Start and Stop permissions to only those instances that have the tag name/value parkmycloud: yes, and are located in the us-east-1 region. Similar Conditions can be used to lock this down by region, instance type, and many other situations. (This recent announcement shows another way to handle the region restriction.)
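A policy with the behavior just described might look roughly like the sketch below. It scopes Start and Stop by region through the resource ARN and by tag through the `ec2:ResourceTag` condition key. The helper name and its default arguments are hypothetical, and this is one possible layout rather than the exact policy from our documentation.

```python
import json


def build_tag_restricted_policy(region: str = "us-east-1",
                                tag_key: str = "parkmycloud",
                                tag_value: str = "yes") -> dict:
    """Allow Start/Stop only on tagged instances in one region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Describe calls cannot be narrowed by tag, so they stay broad.
                "Effect": "Allow",
                "Action": "ec2:DescribeInstances",
                "Resource": "*",
            },
            {
                "Effect": "Allow",
                "Action": ["ec2:StartInstances", "ec2:StopInstances"],
                # The region is pinned in the ARN; any account, any instance ID.
                "Resource": f"arn:aws:ec2:{region}:*:instance/*",
                # The instance must carry the expected tag name and value.
                "Condition": {
                    "StringEquals": {f"ec2:ResourceTag/{tag_key}": tag_value}
                },
            },
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_tag_restricted_policy(), indent=2))
```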
Azure has somewhat similar features, though with a slightly different JSON layout, as described here. It does not appear you can use resource tags for Azure, nor does Azure provide easy ways to limit the geographic scope of permissions. You can get around the location and grouping of resources by using Azure Management Groups, but that is not quite as flexible as an arbitrary tag-based system, and is actually more intended to aggregate resources across subscriptions, rather than be more specific within a subscription. That said, the Azure permissions defined here are a bit more granular than AWS. This does allow for a bit more specificity in permissions if it is needed, but can no doubt grow tedious to list and manage.
Google Cloud provides a long list of predefined roles here, with an excellent listing of the contained permissions. There is also an interesting page describing the taxonomy of the permissions here, but Google Cloud appears to make it a bit difficult to enumerate and understand the permissions individually, outside of the predefined roles. Google does not provide any tag or resource-based restrictions, apart from assignment at the Project level. More on user management and roles by cloud provider in this blog.
You may note that the ec2:Describe permission in our last example does not have the tag-based restriction. This is because the tag-based restriction can only be used for certain permissions, as shown in the AWS documentation. Note also that some APIs can do several different operations, some of which you may be OK with sharing, and others not. For example, the AWS ec2:ModifyInstanceAttribute permission allows the API user to change the instance type. But…this one API (and associated permission) also allows the API user to modify security group assignments, shutdown behaviors, and other features – things you may not want to share with an untrusted 3rd party.
Key takeaway here? Look out for permissions that may have unexpected consequences.
Beware of SaaS cloud management providers who are asking for simple predefined roles from your cloud provider. They are either giving a LOT more functionality than you are likely to want from a single provider, or they are asking for a lot more permissions than they need. Ask for a “limited access policy” that gives the SaaS provider ONLY what they need, and look for a document that defines these permissions and how they tie back to what the SaaS provider is doing for you.
These limited access policies serve to limit your exposure to accidents or compromises at the SaaS provider.
Server sizing in the cloud can be tricky. Unless you are about to do some massive high-performance computing project, super-sizing your cloud virtual machines/instances is probably not what you are thinking about when you log in to your favorite cloud service provider. But from looking at customer data within our system, it certainly does look like a lot of folks are walking up to their neighborhood cloud provider and saying exactly that: Super Size Me!
Like at a fast-food place, buying in the super size means paying extra costs…and when you are looking for ways to save money on cloud costs, whether for production or non-production resources, the first place to look is at idle and underutilized resources.
Within the ParkMyCloud SaaS platform, we have collected bazillions (scientific term) of samples of performance data for tens of thousands of virtual machines, across hundreds of customers, and the average of all “Average CPU” readings is an amazing (even to us) 4.9%. When you consider that many of our customers are already addressing underutilization by stopping or “parking” their instances when they are not being used, one can easily conclude that the server sizing is out of control and instances are tremendously overbuilt. In other words, they are much more powerful than they need to be…and thus much more expensive than they need to be. As cool as “super sizing” sounds, the real solution is in rightsizing, and ensuring the instance size and type are better tailored to the actual load.
Before we start talking about what is involved in rightsizing, let’s look at a few more statistics, just because the numbers are pretty cool. Looking at utilization data from about 88.9 million instance-hours on AWS – that’s 10,148 years – we find the following:
So, what is this telling us about server sizing? The percentiles alone tell us that more than 95% of our samples are operating at less than 50% Average CPU – which means if we cut the number of CPUs in half for most of our instances, we would probably still be able to carry our workload. The 95th percentile for Peak CPU is 58%, so if we cut all of those CPUs in half we would either have to be OK with a small degradation in performance, or maybe we select an instance to avoid exceeding 99% peak CPU (which happens around the 93rd percentile – still a pretty massive number).
Looking down at the 75th and 50th percentiles we see instances that could possibly benefit from multiple steps down! As shown in the next section, one step down can save you 50% of the cost for an instance. Two steps down can save you 75%!
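The step-down arithmetic can be sketched in a few lines. This assumes each size step halves both capacity and price, and that halving the CPUs roughly doubles the observed peak CPU percentage, which is a simplification. Both function names and the 99% ceiling default are illustrative choices of mine.

```python
def savings_after_steps(steps: int) -> float:
    """Fraction of cost saved after moving down `steps` sizes,
    assuming every step down halves the price."""
    if steps < 0:
        raise ValueError("steps must be zero or more")
    return 1 - 0.5 ** steps


def max_steps_down(peak_cpu_pct: float, ceiling_pct: float = 99.0) -> int:
    """Count how many times the instance can be halved before the
    projected peak CPU (doubling per step) would exceed the ceiling."""
    if peak_cpu_pct <= 0:
        raise ValueError("peak CPU must be positive")
    steps = 0
    projected = peak_cpu_pct
    while projected * 2 <= ceiling_pct:
        projected *= 2
        steps += 1
    return steps


if __name__ == "__main__":
    for peak in (58.0, 45.0, 20.0):
        steps = max_steps_down(peak)
        print(f"peak {peak}%: {steps} step(s) down, "
              f"{savings_after_steps(steps):.0%} saved")
```

Treat the result as a starting point for a per-instance look, not as a verdict, since bursty workloads will not follow the doubling assumption neatly.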
Before making an actual server sizing change, this data would need to be further analyzed on an instance by instance basis – it may be that many of these instances have bursty behavior, where their CPUs are more highly utilized for short periods of time, and idle all the rest of the time. Such an instance would be better off being parked or stopped for most of the time, and only started up when needed. Or…depending on the duration and magnitude of the burst, might be better off moving to the AWS T instance family, which accumulates credits for bursts of CPU, and is less expensive than the M family, built for a more continuous performance duty cycle. Also – as discussed below – complete rightsizing would entail looking at some other utilization stats as well, like memory, network, etc.
On every cloud provider there is a clear progression of server sizing and prices within any given instance family. The next size up from where you are is usually twice the CPUs and twice the memory, and as might be expected, twice the price.
Here is a small sample of AWS prices in us-east-1 (N. Virginia) to show you what I mean:
Double the memory and/or double the CPU…and double the price.
It is important to note that there is more to instance utilization than just the CPU stats. There are a number of applications with low CPU but high network, memory, or disk utilization, or high database IOPS, and so a complete set of stats is needed before making a rightsizing decision.
This can be where rightsizing across instance families makes sense.
On AWS, some of the most commonly used instance types are the T and M general purpose families. Many production applications start out on the M family, as it has a good balance of CPU and memory. Let’s look at the m5.4xlarge as a specific example, shown in the middle row below.
If you find that such an instance was showing good utilization of its CPU, maybe with an Average CPU of 75% and Peak CPU of 95%, but the memory was extremely underutilized, maybe only consuming 20%, we may want to move to more of a compute-optimized instance family. From the table below, we can see we could move over to a c5.4xlarge, keeping the same number of CPUs, but cutting the RAM in half, saving about 11% of our costs.
On the other hand, if you find the CPU was significantly underutilized, for example showing an Average CPU of 30% and Peak of 45%, but memory was 85% utilized, we may be better off on a memory-optimized instance family. From the table below, we can move to an r5.2xlarge instance cutting the vCPUs in half, and keeping the same amount of RAM, and saving about 34% of the costs.
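The two percentages above come from a simple price comparison, sketched here. The hourly rates in the dictionary are assumed on-demand figures chosen for illustration; check current pricing for your region before relying on them.

```python
# Assumed hourly on-demand rates in USD, for illustration only.
ASSUMED_HOURLY_PRICES = {
    "m5.4xlarge": 0.768,
    "c5.4xlarge": 0.680,
    "r5.2xlarge": 0.504,
}


def percent_savings(current_hourly: float, new_hourly: float) -> float:
    """Percentage saved by moving from the current rate to the new rate."""
    if current_hourly <= 0:
        raise ValueError("current rate must be positive")
    return (current_hourly - new_hourly) / current_hourly * 100


if __name__ == "__main__":
    base = ASSUMED_HOURLY_PRICES["m5.4xlarge"]
    for target in ("c5.4xlarge", "r5.2xlarge"):
        saved = percent_savings(base, ASSUMED_HOURLY_PRICES[target])
        print(f"m5.4xlarge -> {target}: about {saved:.0f}% saved")
```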
Within AWS there are additional considerations on the network side. As shown here, available network performance follows the instance size and type. You may find yourself in a situation where memory and CPU are non-issues, but high network bandwidth is critical, and deliberately super-size an instance. Even in this case, though, you should think about whether there is a way to split your workload into multiple smaller instances (and thus multiple network streams) that are less expensive than a beastly machine selected solely on the basis of network performance.
You may also need to consider availability when determining your server sizing. For example, if you need to run in a high-availability mode using an autoscaling group you may be running two instances, either one of which can handle your full load, but both are only 50% active at any given time. As long as they are only 50% active that is fine – but you may want to consider if maybe two instances at half the size would be OK, and then address a surge in load by scaling-up the autoscaling group.
For full cost optimization for your virtual machines, you need to consider appropriate resource scheduling, server sizing, and sustained usage.
Rightsize instances wherever possible. You can easily save 50% just by going down one size tier – and this applies to production resources as well as development and test systems!
Modernize your instance types. This is similar to rightsizing, in that you are changing to the same instance type in a newer generation of the same family, where cloud provider efficiency improvements mean lower costs. For example, moving an application from an m3.xlarge to an m5.xlarge can save 28%!
Park/stop instances when they are not in use. You can save 65% of the cost of a development or test virtual machine by just having it on 12 hours per day on weekdays!
For systems that must be up continually, (and once you have settled on the correct size instance) consider purchasing reserved instances, which can save 54-75% off the regular cost. If you would like a review of your resource usage to see where you can best utilize reserved instances, please let us know.
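The parking figure in the list above is just arithmetic on the hours in a week, as this small sketch shows. It assumes you pay only for running hours and ignores storage and other charges that continue while an instance is stopped; the function name is hypothetical.

```python
HOURS_PER_WEEK = 24 * 7  # 168


def parking_savings(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of compute cost saved by running only on a schedule."""
    on_hours = hours_per_day * days_per_week
    if not 0 <= on_hours <= HOURS_PER_WEEK:
        raise ValueError("schedule must fit within one week")
    return 1 - on_hours / HOURS_PER_WEEK


if __name__ == "__main__":
    # 12 hours per day on weekdays is 60 of 168 hours.
    print(f"{parking_savings(12, 5):.1%} saved")
```

Twelve hours on five weekdays works out to a little over 64% saved, which is where the rounded 65% comes from.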
Last week, ParkMyCloud released the ability to rightsize and modernize instances. This release helps you identify the virtual machine and database instances that are not fully utilized or on an older family, making smart recommendations for better server sizing and/or family selection for the level of utilization, and then letting you execute the rightsize action. We will also be adding a feature for scheduled rightsizing, allowing you to maintain instance continuity, but reducing its size during periods of lower utilization.
Alibaba Cloud is growing at an amazing rate, recently claiming to have overtaken both Google and IBM as the #3 public cloud provider globally, and certainly the #1 provider in China. Many sites and services hosted outside China are accessible from within China, but can suffer high latency and potentially lost functionality if their web interface requires interaction with blocked social media systems. As such, it is no surprise that a number of our (non-Chinese) customers have expressed interest in actually running Alibaba virtual machine instances in China. In this blog we are going to outline the process…and give an alternate plan.
General Process to Run Alibaba Instances in China
The steps to roll-out a deployment on Alibaba in mainland China are relatively clear:
Establish a “legal commercial entity” in Mainland China.
Select what services you want to run on Alibaba Cloud
Apply for Internet Content Provider (ICP) certification
The first three steps are described in more detail below.
Establish a Legal Commercial Entity
Or putting it another way – you need to have an office in China. This can range from an actual office with your own employees, to a Joint Venture, which is a legal LLC between your organization and an established Chinese company. If your service is more informational in nature and is not actually selling anything via the service, then this can be relatively easy, taking only a couple weeks (at least for the legal side), though you will still need to find a Joint Venture partner and make the deal worth their while financially. For commerce or trade-related services, the complexity, time requirements, and costs start going up significantly.
What to run on Alibaba Cloud
There is a decision-point here, as there is one set of rules for Alibaba-hosted web/app servers, and additional rules for everything else. Base virtual machines, databases and other such core IT building blocks require the ICP registration described below, plus “real-name registration”, where a passport is needed to actually confirm the identity of whoever is purchasing the resource. If all you need is a web server, then you can skip this step. In either case, some of the filing requirements involve having a server and/or DNS record prepared in order to complete the later steps. A web site does not need to be completely finished until launch, but a placeholder may be needed.
Internet Content Provider (ICP) certification
There are two flavors of ICP certification:
A “simple” ICP Filing – which is the bare minimum needed for informational websites that are not directly generating revenue.
ICP Commercial Filing – This starts with getting an approved ICP Filing, and then also includes a Commercial License that must be obtained from a province/municipality in China. In some cases, this appears to be related to which Alibaba region you are using, and even the physical location of your public IP address.
Many references recommend finding an experienced consultant to guide you through these processes, and it is easy to see why!
OK…WAY too much work. What is Plan B?
The other way to run Alibaba instances in China is to host your site or services in Hong Kong. All of the rules described above apply to “Mainland China”, which does not include Hong Kong. Taiwan is also not included in Mainland China, but Hong Kong has the advantage of being better connected to the rest of China. If the main problem you are trying to solve is to reduce latency to your site for China-based customers, Hong Kong is the closest you can get without actually being there, and Alibaba appears to do a pretty good job optimizing the Hong Kong experience. No local office or legal filings required!
Once you are all set up: Optimize your Costs!
After your instances are set up, make sure you’re optimizing Alibaba costs. Our Mainland China-based customers using Alibaba have confirmed that ParkMyCloud is able to access the Alibaba APIs from our US-based servers – so you can go ahead and try it out.
In this blog we will look at the Google Cloud Committed Use discount program for customers that are willing to “commit” to a certain level of usage of the GCP Compute Engine.
The Committed Use purchasing option is particularly useful if you are certain that you will be continually operating instances in a region and project over the course of a year or more. If your instance usage does not add up to a full month, you may instead want to look at the Google Cloud Sustained Use discounts, which we discussed in a previous blog.
The Google Cloud Committed Use discount program has many similarities to the AWS and Azure Reserved Instances (RI) programs, and a couple unique aspects as well. A comparison of the various cloud providers’ reservation programs is probably worth a blog in itself, so for now, let’s focus on the Google Cloud Committed Use discounts, and the best times and places to use them.
Critical Facts about Google Cloud Committed Use
The Committed Use discount is best for a stable and predictable workload (you are committed to pay – regardless of whether you use the resources or not!)
Commitment periods are for either 1 or 3 years.
Commitments are for a specific region and a specific project. Zone assignment within a region is not a factor.
Discounts apply to the total number of vCPUs and amount of memory, not to a specific machine or machine type.
No pre-payment – the commitment cost is billed monthly.
GCP Committed Use discounts are available for all of the GCP instance families except the shared-core machine types, such as f1-micro and g1-small.
Committed Use discounts do not apply to the premium charges for sole-tenant nodes, nor can they be used for Preemptible instances.
The commitments for General Purpose instances are distinct from those for Memory Optimized instances. If you have some of both types, you must buy two different types of Commitment. The General Purpose commitment covers these types:
Standard – n1-standard
High Memory – n1-highmem
High CPU – n1-highcpu
General purpose sole-tenant
How much does it cost?
Each Committed Use purchase must include a specific number of vCPUs and the amount of memory per vCPU. This combination of needing to commit to both a number of vCPUs and amount of Memory can make the purchase of a commitment a bit more complicated if you use a variety of machine types in your environment. The following table illustrates some GCP machine types and the amount of memory automatically provided per vCPU:
Memory per vCPU: 0.9 – 6.5 GB
While the vCPU aspect is fairly straightforward, the memory commitment requires a bit of thought. Since the commitment is not tied to a specific machine type (as it is with AWS and Azure), you must decide just how much memory to sign up for. If your set of machine types is homogeneous, this is easy – just match the vCPU/memory ratio to what you run. If it is mixed, the good news is that you are just buying a big blob of memory – you are not restricted to rigidly holding to some vCPU/memory ratio. The billing system will “use up” a chunk of memory for one instance and then move on to the next.
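For a mixed environment, one way to arrive at the numbers is simply to total the vCPUs and memory across the instances you expect to keep running. The sketch below uses a made-up fleet, and the per-type vCPU and memory figures are the n1 shapes as we understand them, so treat both as assumptions and check them against the current GCP machine type list.

```python
# Hypothetical fleet: machine type -> number of always-on instances.
FLEET = {"n1-standard-8": 3, "n1-highmem-4": 2, "n1-highcpu-16": 1}

# (vCPUs, memory in GB) per machine type. Assumed n1 shapes; verify
# against the GCP machine type documentation before purchasing.
MACHINE_SHAPES = {
    "n1-standard-8": (8, 30.0),
    "n1-highmem-4": (4, 26.0),
    "n1-highcpu-16": (16, 14.4),
}


def commitment_totals(fleet, shapes):
    """Sum the vCPUs and memory (GB) needed to cover every instance in the fleet."""
    total_vcpus = 0
    total_memory_gb = 0.0
    for machine_type, count in fleet.items():
        vcpus, memory_gb = shapes[machine_type]
        total_vcpus += vcpus * count
        total_memory_gb += memory_gb * count
    return total_vcpus, total_memory_gb


vcpus, memory_gb = commitment_totals(FLEET, MACHINE_SHAPES)
print(f"Commit to {vcpus} vCPUs and {memory_gb:.1f} GB of memory")
```

Covering the whole fleet is the upper bound. If some of those instances are parked for part of the month, a smaller commitment may be the better buy.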
Looking at a specific example, the n1-standard-8 in the Oregon region that we discussed in the Sustained Usage Blog, we can see that the Committed Use discount does amount to some savings, but one must maintain a usage level throughout the month to fully consume the commitment.
Recall from the earlier blog that the base price of this instance type in the GCP Price list already assumes a Sustained Usage discount over a full month, and that the actual “list price” of the instance type is $277.40, and Sustained Usage provides up to a maximum of a 30% discount. With that as a basis, we can see that the net savings for the Committed Use discount over 1 year is 37%, and over 3 years, rises to 55%. This is close to the advertised discount of 57% in the GCP pricing information, which varies by region.
The break-even points in this graph are about 365 hours/month for a 3 year commitment, and 603 hours/month for a 1 year commitment. In other words, if you are sure you will be using a resource for fewer than 365 hours/month over the course of a year, then you probably want to avoid purchasing a 3 year Commitment.
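The break-even hours can be approximated with a few lines of code. This is a rough sketch, not GCP's billing logic: it assumes a 730-hour month and a sustained use schedule in which each successive quarter of the month is billed at 100%, 80%, 60% and 40% of the base hourly rate (which works out to the 30% maximum discount mentioned above). The function names are ours, and the $277.40 list price and the 37% and 55% discounts come from the example above.

```python
HOURS_PER_MONTH = 730  # assumed average month length in hours

# Assumed sustained use schedule: each quarter of the month is billed at a
# lower fraction of the base hourly rate.
SUSTAINED_USE_TIERS = (1.0, 0.8, 0.6, 0.4)


def on_demand_cost(hours, list_price_per_month):
    """Monthly cost for `hours` of usage, including sustained use discounts."""
    hourly = list_price_per_month / HOURS_PER_MONTH
    tier_hours = HOURS_PER_MONTH / len(SUSTAINED_USE_TIERS)
    remaining = min(hours, HOURS_PER_MONTH)
    cost = 0.0
    for multiplier in SUSTAINED_USE_TIERS:
        used = min(remaining, tier_hours)
        cost += used * hourly * multiplier
        remaining -= used
    return cost


def break_even_hours(list_price_per_month, commit_discount):
    """Hours/month at which on-demand cost equals the committed monthly cost.

    Uses bisection, since on-demand cost only increases with hours.
    Returns HOURS_PER_MONTH if the commitment never pays off.
    """
    committed_cost = list_price_per_month * (1 - commit_discount)
    low, high = 0.0, float(HOURS_PER_MONTH)
    for _ in range(60):
        mid = (low + high) / 2
        if on_demand_cost(mid, list_price_per_month) < committed_cost:
            low = mid
        else:
            high = mid
    return high


three_year = break_even_hours(277.40, 0.55)
one_year = break_even_hours(277.40, 0.37)
print(f"3 year break-even: {three_year:.0f} hours/month")
print(f"1 year break-even: {one_year:.0f} hours/month")
```

Under these assumptions the results land close to the figures quoted above. Any small difference comes from the rounded discount percentages.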
Allocation of Commitments
Because Commitments are assigned on a vCPU/RAM basis, you cannot simply point at a specific instance, and say THAT instance is assigned to my Committed Use discount. Allocation of commitments is handled when your bill is generated, and your discount is applied in a very specific order:
1. Custom machine types
2. Sole-tenant node groups
3. Predefined machine types
This sequence is generally good for the customer, in that it applies the Commitment to the more expensive instances first. For example, an n1-standard-4 instance in Northern Virginia normally costs $109.35 per month. If an equivalent server were constructed as a Custom instance, it would cost $114.76.
For sole-tenant node groups, you are typically paying for an entire physical machine, and the Committed Use discount serves to offset the normal cost for that node. For a sole-tenant node group that is expected to be operating 7x24x365, it makes the most sense to buy Committed Use for the entire system, as you will be paying for the entire machine, regardless of how many instances are running on it.
Commitments are allocated over the course of each hour in a month, distributing the vCPUs and RAM to all of the instances that are operating in that hour. This means you cannot buy a Commitment for 1 vCPU and 3.75 GB of RAM, and run two n1-standard-1 instances for the first half of the month, and then nothing for the second half of the month, expecting it all to be covered by the Commitment. In this scenario, you would be charged for one month at the committed rate, and two weeks at the regular rate (subject to whatever Sustained Usage discount you might accumulate for the second instance).
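The scenario above can be illustrated with a small hour-by-hour tally. This is a simplified sketch of the idea, not GCP's actual billing implementation: it tracks only vCPUs (memory would be handled the same way), assumes a 730-hour month, and the function name is our own.

```python
HOURS_PER_MONTH = 730  # assumed average month length in hours


def allocate_vcpu_hours(committed_vcpus, running_vcpus_by_hour):
    """Split usage into hours covered by the commitment and hours billed on top.

    The commitment is applied separately within each hour, so committed
    capacity left idle in one hour cannot be carried into another hour.
    """
    covered = 0
    overage = 0
    for running in running_vcpus_by_hour:
        covered += min(running, committed_vcpus)
        overage += max(0, running - committed_vcpus)
    unused = committed_vcpus * len(running_vcpus_by_hour) - covered
    return covered, overage, unused


# Two n1-standard-1 instances (1 vCPU each) for the first half of the
# month, then nothing for the second half.
half = HOURS_PER_MONTH // 2
usage = [2] * half + [0] * (HOURS_PER_MONTH - half)

covered, overage, unused = allocate_vcpu_hours(1, usage)
print(f"Covered by commitment: {covered} vCPU-hours")
print(f"Billed at regular rate: {overage} vCPU-hours")
print(f"Committed but unused:   {unused} vCPU-hours")
```

The commitment is still billed for the whole month, the second instance's half month is charged at the regular rate, and the committed capacity sitting idle in the second half of the month is simply lost.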
Thank you for….not…sharing?
Unlike AWS, where Reserved Instances are automatically shared across multiple linked accounts within an organization, GCP Commitments cannot be shared across projects within a billing account. For some companies, this can be a major decision point as to whether or not they commit to Commitments. Within the ParkMyCloud platform, we see customers with as many as 60 linked AWS accounts, all of which share in a pool of Reserved Instances. GCP customers do not have this flexibility with Commitments, which are locked in to the Project in which they were purchased. A number of our customers use AWS Accounts as a mechanism to track resources for teams and projects; GCP has Projects and Quotas for this purpose, and they are not quite as flexible for committed resource sharing. For a larger organization, this lack of sharing means each project needs to be much more careful about how it purchases Commitments.
Google Cloud Committed Use discounts definitely offer great savings for organizations that expect to maintain a certain level of usage of GCP and that expect to keep those resources within a stable set of regions and projects. Since GCP Commitments are assigned at the vCPU/Memory level, they provide excellent flexibility over machine-type-based assignments. With the right GCP usage profile over a year or more, purchase of Google Cloud Committed Use discounts is a no-brainer, especially since there are no up-front costs!