Azure Low Priority VMs for Cost Savings

Among the many ways to purchase and consume Azure resources are Azure low priority VMs. These virtual machines are compute instances allocated from spare capacity, offered at a highly discounted rate compared to “on demand” VMs. This means they can be a great option for cost savings – for the right workloads. And we love cost savings! Here’s what you need to know about this purchasing option.

How Azure Low Priority VMs Work

The great part about these virtual machines is the price: it’s quite attractive with a fixed discount of 60-80% compared to on-demand. The “low priority” part means that these VMs can be “evicted” for higher priority jobs, which makes them suitable for fault-tolerant applications such as batch processing, rendering, testing, some dev/test workloads, containerized applications, etc.
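To make the discount concrete, here is a minimal sketch of the arithmetic. The $0.10 on-demand rate, the fleet size, and the 730-hour month are made-up illustration values, not Azure list prices.

```python
def low_priority_hourly_rate(on_demand_rate: float, discount: float) -> float:
    """Hourly rate after applying a fixed fractional discount (0.60 means 60% off)."""
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must be a fraction between 0 and 1")
    return on_demand_rate * (1.0 - discount)


def monthly_savings(on_demand_rate: float, discount: float,
                    vm_count: int, hours: float = 730.0) -> float:
    """Money saved per month versus running the same VMs on demand."""
    low_rate = low_priority_hourly_rate(on_demand_rate, discount)
    return (on_demand_rate - low_rate) * vm_count * hours


# Hypothetical fleet: 10 VMs at an assumed $0.10/hour on-demand rate.
for discount in (0.60, 0.80):
    saved = monthly_savings(0.10, discount, vm_count=10)
    print(f"{discount:.0%} discount saves about ${saved:,.2f} per month")
```

The savings only materialize if the workload tolerates eviction, which is what the rest of this section covers.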

Low priority VMs are available through Azure Batch and VM scale sets. Through Azure Batch, you can run jobs and tasks across compute pools called “batch pools”. Since batch jobs consist of discrete tasks run using multiple VMs, they are a good fit to take advantage of low priority VMs.

On the other hand, VM scale sets scale up to meet demand, and when used with low priority VMs, will only allocate when capacity is available. To deploy low priority VMs on scale sets, you can use the Azure portal, Azure CLI, Azure PowerShell, or Azure Resource Manager templates.

When it comes to eviction, you have two policy options to choose between:

  • Stop/Deallocate (default) – when evicted, the VM is deallocated, but you keep (and pay for) underlying disks. This is ideal for cases where the state is stored on disks.
  • Delete – when evicted, the VM and underlying disks are deleted. This is the recommended option for auto scaling because deallocated instances are counted against your capacity count on the scale set.
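The difference between the two policies can be pictured with a toy model. This is only an illustration of the bookkeeping described above, not Azure's actual scale set logic, and the class and function names are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class EvictionPolicy(Enum):
    DEALLOCATE = "Deallocate"
    DELETE = "Delete"


@dataclass(frozen=True)
class EvictionOutcome:
    running: int                  # instances still running after the eviction
    deallocated: int              # instances kept around in a stopped state
    counted_toward_capacity: int  # instances the scale set still counts
    disks_still_billed: bool      # whether evicted instances keep costing disk storage


def evict(running: int, evicted: int, policy: EvictionPolicy) -> EvictionOutcome:
    """Toy model: what is left after `evicted` of `running` instances are evicted."""
    if evicted < 0 or evicted > running:
        raise ValueError("evicted must be between 0 and running")
    remaining = running - evicted
    if policy is EvictionPolicy.DEALLOCATE:
        # Deallocated instances stay in the scale set and keep their disks.
        return EvictionOutcome(remaining, evicted, remaining + evicted, evicted > 0)
    # Deleted instances and their disks are gone entirely.
    return EvictionOutcome(remaining, 0, remaining, False)


print(evict(10, 4, EvictionPolicy.DEALLOCATE))
print(evict(10, 4, EvictionPolicy.DELETE))
```

In this model, Deallocate leaves the scale set still counting 10 instances even though only 6 are running, which is why Delete is the better fit for autoscaling.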

Azure Low Priority VMs vs. AWS Spot Instances

So are low priority VMs the same as AWS Spot Instances? In some ways, yes: both options allow you to purchase excess capacity at a discounted rate.

However, there are a few key differences between these options:

  • Fixed vs. variable pricing – AWS Spot Instances have variable pricing, while Azure low priority VMs have a fixed price as listed on the website.
  • Integration & flexibility – AWS’s offering is better integrated into their general environment, while Azure offers limited options for low priority VMs (for example, you can’t launch a single instance) with limited integration to other Azure services.
  • Visibility – AWS has broad availability of spot instances as well as a Spot Instance Advisor to help users predict availability and interruptibility. On the other hand, Azure has lower visibility into the available capacity, so it’s hard to predict if/when your workloads will run.

Should You Use Azure Low Priority VMs?

If you have fault-tolerant batch processing jobs, then yes, low priority VMs are worth a try to see if they work well for you. If you’ve used these VMs, we’re curious to hear your feedback. Have you had issues with availability? Does the lack of integrations cause any problems for you? Are you happy with the cost savings you’re getting? Let us know in the comments below.

AWS IAM User vs IAM Role for Secure SaaS Cloud Management

While going through our recent Cloud Cost Optimization Competency review with AWS, one of the things they asked us to do was remove the ability for customers to sign up for our service using AWS IAM User credentials. They loved the fact that we already supported AWS IAM Role credentials, but their concern was that AWS IAM User credentials could conceivably be stolen and used from outside AWS by anyone. (I say inconceivable, but hey, it is AWS.)  This was a bit of a bitter pill to swallow, as some customers find IAM Users easier to understand and manage than IAM Roles. The #1 challenge of any SaaS cloud management platform like ours is customer onboarding, where every step in the process is one more hurdle to overcome.

While we could debate how difficult it would be to steal a customer cloud credential from our system, the key (pun intended) question here is: why is an IAM Role preferred over an IAM User?

Before answering that question, I think it is important to understand that an IAM Role is not a “role” in perhaps the traditional sense of Active Directory or LDAP. An AWS IAM Role is not something that is assigned to a “User” as a set of permissions – it is a set of capabilities that can be assumed by some other entity. Like putting on a hat, you only need it at certain times, and it is not like it is part of who you are. As AWS defines the difference in their FAQ:

An IAM user has permanent long-term credentials and is used to directly interact with AWS services. An IAM role does not have any credentials and cannot make direct requests to AWS services. IAM roles are meant to be assumed by authorized entities, such as IAM users, applications, or an AWS service such as EC2.

(The first line of that explanation alone has its own issues, but we will come back to that…)

The short answer for SaaS is that a customer IAM Role credential can only be used by servers running from within the SaaS provider’s AWS Account…and IAM User credentials can be used by anyone from anywhere. By constraining the potential origin of AWS API calls, a HUGE amount of risk is removed, and the ability to isolate and mitigate any issues is improved.

What is SaaS?

Software as a Service (SaaS) means different things to different vendors. Some vendors claim to be “SaaS” for their pre-built virtual machine images that you can run in your cloud. Maybe an intrusion detection system or a piece of a cloud management system. In my (truly humble) opinion this is not a SaaS – this is just another flavor of “on prem” (on-premise), where you are running someone’s software in your environment. Call it “in-cloud” if you do not want to call it “on-prem”, but it is not really SaaS, and it does not have the challenges you will experience with a “true” SaaS product – coming in from the outside. A core component of SaaS is that it is centrally hosted outside your cloud. For an internal service, you might relax permissions and access mechanisms somewhat, as you have total control over data ingress/egress. A service running IN your network…where you have total control over data ingress/egress…is not the same as external access – the epitome of SaaS. Anyway: </soapbox>. (Or maybe </rant> depending on the tone you picked up along the way…)

The kind of SaaS I am focussing on for this blog is SaaS for cloud management, which can include cloud diagramming tools, configuration management tools, storage management+backup tools, or cost optimization tools like ParkMyCloud.

AWS has enabled SaaS for secure cloud management more than any other cloud provider. A bold statement, but let’s break that down a bit. We at ParkMyCloud help our customers optimize their expenses at all of the major cloud providers, and so obviously all the providers allow for access from “outside”. Whether it is an Azure subscription, a GCP project, or an Alibaba account, these CSPs are chiefly focussed on customer internal cross-domain access, i.e., the ability of the “parent” account to see and manage the “child” accounts. Management within an organization. But AWS truly acknowledges and embraces SaaS.

You could attribute my bold statement to an aficionado/fanboi notion of AWS having a bigger ecosystem vision, or more specifically that they simply have a better notion of how the Real World works, and how that has evolved in The Cloud. The fact is that companies buy IT products from other companies…and in the cloud that enables this thing called Software as a Service, or SaaS. All of the cloud providers have enabled SaaS for cloud access, but AWS has enabled SaaS for more secure cloud access.

AWS IAM Cross-account Roles

So…where was I?  Oh…right…Secure SaaS access.

OK, so AWS enables cross-account access. You can see this in the IAM Create Role screen in the AWS Console:

If your organization owns multiple AWS accounts (inside or outside of an AWS “organization”), cross-account access allows you to use a parent account to manage multiple child accounts. For SaaS, cross-account access allows a 3rd-party SaaS provider to see/manage/do stuff with/for your accounts.

Looking a little deeper into this screen, we see that cross-account access requires you to specify the target account for the access:

The cross-account role allows you to explicitly state which other AWS account can use this role. More specifically: which other AWS account can assume this role.

But there is an additional option here talking about requiring an “external ID”…what is that about?  

Within multiple accounts in a single organization, this may allow you to differentiate between multiple roles between accounts….maybe granting certain permissions to your DevOps folks…other permissions to Accounting…and still other permissions to IT/network management.

If you are a security person, AWS has some very interesting discussions about the “confused deputy” problem mentioned on this screen. It discusses how a hostile 3rd party might guess the ARN used to leverage this IAM Role, and states that “AWS does not treat the external ID as a secret” – which is all totally true from the AWS side. But summing it up: cross-account IAM Roles’ external IDs do not protect you from insider attacks. For an outsider, the External ID is as secret as the SaaS provider makes it.

Looking at it from the external SaaS side, we get a bit of a different perspective. For SaaS, the External ID allows for multiple entry points…and/or a pre-shared secret. At ParkMyCloud (and probably most other SaaS providers) we only need one entry point, so we lean toward the pre-shared secret side of things. When we, and other security-conscious SaaS providers, ask for access, we request an account credential, explicitly giving our AWS account ID and an External ID that is unique for the customer. For example, in our UI, you will see our account ID and a customer-unique External ID:
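On the customer side, that account ID and External ID end up in the trust policy of the cross-account role. A minimal sketch of such a trust policy follows; the account ID and External ID are placeholders, and the helper name is hypothetical.

```python
import json


def build_trust_policy(saas_account_id: str, external_id: str) -> dict:
    """Trust policy letting one AWS account assume the role, but only
    when the caller presents the agreed External ID."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                # Which other AWS account may assume this role.
                "Principal": {"AWS": f"arn:aws:iam::{saas_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The pre-shared External ID, unique per customer.
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


# Placeholder values for illustration only.
trust_policy = build_trust_policy("111122223333", "example-external-id")
print(json.dumps(trust_policy, indent=2))
```

The permissions the role actually grants live in a separate policy attached to the role. The trust policy only controls who may assume it.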

Assume Role…and hacking SaaS

If we look back at the definition of the AWS IAM Role, we see that IAM roles are meant to be assumed by authorized entities. For an entity to assume a role, that entity has to be an AWS principal that has the sts:AssumeRole permission in the account in which it lives. Breaking that down a bit, the sts component of this permission tells us this comes from the AWS Security Token Service (STS), which can handle whole chains of delegation of permissions. For ParkMyCloud, we grant our servers in AWS an IAM Role that has the sts:AssumeRole permission for our account. In turn, this allows our servers to use the customer account ID and external ID to request permission to “Assume” our limited-access role to manage a customer’s virtual machines.
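In code, the assume-role step might look roughly like the sketch below, using the boto3 STS client. The role name, session name, and helper functions are assumptions for illustration, not ParkMyCloud's actual implementation.

```python
def assume_role_request(customer_account_id: str, role_name: str,
                        external_id: str,
                        session_name: str = "saas-session") -> dict:
    """Build the parameters for an sts:AssumeRole call."""
    return {
        "RoleArn": f"arn:aws:iam::{customer_account_id}:role/{role_name}",
        "RoleSessionName": session_name,
        "ExternalId": external_id,
    }


def assume_customer_role(customer_account_id: str, role_name: str,
                         external_id: str) -> dict:
    """Exchange the server's own identity for temporary customer credentials.

    Must run somewhere that already holds AWS credentials permitted to call
    sts:AssumeRole, for example a server with an IAM Role attached.
    """
    import boto3  # imported lazily so the helper above works without the AWS SDK

    sts = boto3.client("sts")
    response = sts.assume_role(
        **assume_role_request(customer_account_id, role_name, external_id)
    )
    # Temporary credentials: AccessKeyId, SecretAccessKey, SessionToken, Expiration.
    return response["Credentials"]


# Placeholder values; only the request is built here, no AWS call is made.
print(assume_role_request("444455556666", "ExampleSaaSRole", "example-external-id"))
```

The returned credentials are temporary and carry an expiration time, unlike an IAM User access key.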

From the security perspective, this means if a hostile party wanted to leverage SaaS to get access to a SaaS customer cloud account via an IAM Role, they would need to:

  • Learn an account ID for a target organization
  • Find a SaaS provider leveraged by that target organization
  • Hack the SaaS enough to learn the External ID component of the target customer account credentials
  • Completely compromise one of the SaaS servers within AWS, allowing for execution of commands/APIs to the customer account (also within AWS), using the account ID, External ID, and Assume Role privileges of that server to gain access to the customer account.
  • Have fun with the SaaS customer’s cloud, but ONLY from that SaaS server.

So….kind-of a short recipe of what is needed to hack a SaaS customer. (Yikes!)  But this is where your access privileges come in. The access privileges granted via your IAM role determine the size of the “window” through which the SaaS provider (or the bad guys) can access your cloud account. A reputable SaaS provider (ahem) will keep this window as small as possible, commensurate with the Least Privilege needed to accomplish their mission.

Also – SaaS services are updated often enough that the service might have to be penetrated multiple times to maintain access to a customer environment.

So why are AWS IAM Users bad?

Going back to the beginning, our quote from AWS stated “An IAM user has permanent long-term credentials and is used to directly interact with AWS services”. There are a couple frightening things here.

“Permanent long-term credentials” means that unless you have done something pretty cool with your AWS environment, that IAM User credential does not expire. An IAM User credential consists of an Access Key ID and Secret Access Key (an AWS-generated pre-shared secret) that are good until you delete them.

“…directly interact with AWS services” means that they do not have to be used from within your AWS account. Or from any other AWS account. Or from your continent, planet, galaxy, dimension, etc. That Key ID and Secret can be used by anyone and anywhere.

From the security perspective, this means if a hostile party wanted to leverage SaaS to get access to a SaaS customer cloud account via an IAM User, they would need to:

  • Learn an account ID for a target organization
  • Find a SaaS provider leveraged by that target organization
  • Hack the SaaS enough to get the IAM User credentials.
  • Have fun…from anywhere.

So this list may seem only a little bit shorter, but the barriers to compromise are lower, and the window for long-term compromise is MUCH longer. Any new protections or updates for the SaaS servers have no impact on an existing compromise. The horse has bolted, so shutting the barn door will not help at all.
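If you do have to keep IAM User keys around, one common mitigation is to flag old keys for rotation, since they never expire on their own. The sketch below works on key metadata shaped like the entries returned by IAM's ListAccessKeys call. The 90-day threshold and the sample key IDs are arbitrary assumptions.

```python
from datetime import datetime, timedelta, timezone


def stale_access_keys(key_metadata, max_age_days=90, now=None):
    """Return the IDs of active access keys older than `max_age_days`.

    `key_metadata` is a list of dicts with AccessKeyId, Status and CreateDate,
    the same fields IAM reports for each access key.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    return [
        key["AccessKeyId"]
        for key in key_metadata
        if key["Status"] == "Active" and key["CreateDate"] < cutoff
    ]


# Made-up sample data for illustration.
sample_now = datetime(2019, 6, 1, tzinfo=timezone.utc)
sample_keys = [
    {"AccessKeyId": "AKIAEXAMPLEOLD", "Status": "Active",
     "CreateDate": datetime(2018, 1, 15, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIAEXAMPLENEW", "Status": "Active",
     "CreateDate": datetime(2019, 5, 20, tzinfo=timezone.utc)},
    {"AccessKeyId": "AKIAEXAMPLEOFF", "Status": "Inactive",
     "CreateDate": datetime(2017, 3, 1, tzinfo=timezone.utc)},
]
print(stale_access_keys(sample_keys, now=sample_now))
```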

What if the SaaS provider is not in AWS?  Or…what if *I* am not in AWS?

The other cloud providers provide some variation of an access identifier and a pre-shared secret. Unlike AWS, both Azure and Google Cloud credentials can be created with expiration dates, somewhat limiting the window of exposure. Google does a great job of describing their process for Service Accounts here. In the Azure console, service accounts are found under Azure AD>App registrations>All apps>App details>Settings>Keys, and passwords can be set to expire in 1 year, 2 years, or never. I strongly recommend you set reminders someplace for these expiration dates, as it can be tricky to debug an expired service account password for SaaS.

For all providers you can also limit your exposure by setting a very limited access role for your SaaS accounts, as we describe in our other blog here.

Azure does give SaaS providers the ability to create secure “multi-tenant” apps that can be shared across multiple customers. However, the APIs for SaaS cloud management typically flow in the other direction, reaching into the customer environment, rather than the other way around.

IAM Role – the Clear Winner

Fortunately, when AWS “strongly recommended” that we should discontinue support for AWS IAM User-based permissions, we already supported an upgrade path, allowing our customers to migrate from IAM User to IAM Role without losing any account configuration (phew!). We have found some scenarios where IAM Role cannot be used – like between the AWS partitions of AWS global, AWS China, and the AWS US GovCloud. For GovCloud, we support ParkMyCloud SaaS by running another “instance” of ParkMyCloud from within GovCloud, where cross-account IAM Role is supported.

With the additional security protections provided for cross-account access, AWS IAM Role access is the clear winner for SaaS access, both within AWS and across all the various cloud providers.

Why the Principle of Least Privilege is Important for SaaS-based Cloud Management

The principle of least privilege is important to understand and follow as you adopt SaaS technologies. The market for SaaS-based tools is growing rapidly, and these tools can typically be activated much more quickly and cheaply than creating a special-purpose virtual machine within your cloud environment. In this blog, I am focusing specifically on the SaaS cloud management tool area, which can include services like cloud diagramming tools, configuration management tools, storage management and backup tools, or cost optimization tools like ParkMyCloud.

Why the Principle of Least Privilege is Important

Before you start using such tools and services, you should carefully consider how much access you are granting into your cloud. The principle of least privilege is a fundamental tenet of any identity and access control policy, and basically means a service or user should have no more permissions than absolutely required in order to do a job.

Cloud account privileges and permissions are typically granted via roles and permissions. All of the cloud providers provide numerous predefined roles, which consist of pre-packaged sets of permissions. Before granting any requested predefined role to a 3rd-party, you should really investigate the permissions or security policy embedded in that role. In many (most?) cases, you are likely to find that the predefined roles give a lot more information or capabilities away than you are really likely to want.

SaaS Onboarding – Where Least Privilege Can Get Lost

For on-boarding of new SaaS customers, the initial permissions setup is often the most complicated step, and some SaaS cloud management platforms try to simplify the process by asking for one of these predefined roles. For example, the Amazon ReadOnlyAccess role or the Azure Reader role or the GCP roles/viewer role. While this certainly makes onboarding of SaaS easier, it also exposes you to a massive data leakage problem. For example, with the Amazon ReadOnlyAccess role a cloud diagramming tool can certainly get a good enough view of your cloud to create a map…but you are also granting read access for all of your IAM Users, CloudTrail events and history, any S3 objects you have not locked down with a distinct bucket policy, and…lots of other stuff you probably do not even know you have. It is kinda like saying – “Here, please come on in and look at all of our confidential file cabinets – and it is OK for you to make copies of anything interesting, just please do not change any of our secrets to something else…” No problem, right?

Obviously, least privilege becomes especially critical when giving permissions to a SaaS provider, given the risk of trusting your cloud environment to some unknown party.

Custom Policies for SaaS

Because of the broad nature of many of their predefined roles, all of the major cloud providers give you the ability to assign specific permissions to both internal and external users through Policies.  For example, the following policy snippets show the minimum permissions ParkMyCloud requests to list, start, and stop virtual machines on AWS, Google, and Azure.
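As a rough sketch of what such minimal permission sets can look like, the structures below express them as Python dictionaries that serialize to each provider's JSON format. They are illustrative and may not match ParkMyCloud's actual policies exactly. The role names and the subscription placeholder are hypothetical.

```python
import json

# AWS: an IAM policy document limited to listing, starting and stopping instances.
AWS_POLICY = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "ec2:DescribeInstances",
                "ec2:StartInstances",
                "ec2:StopInstances",
            ],
            "Resource": "*",
        }
    ],
}

# Azure: a custom role definition with only the VM read/start/deallocate actions.
AZURE_ROLE = {
    "Name": "Example Limited VM Operator",  # hypothetical role name
    "IsCustom": True,
    "Actions": [
        "Microsoft.Compute/virtualMachines/read",
        "Microsoft.Compute/virtualMachines/start/action",
        "Microsoft.Compute/virtualMachines/deallocate/action",
    ],
    "NotActions": [],
    "AssignableScopes": ["/subscriptions/<subscription-id>"],  # placeholder
}

# Google Cloud: a custom role listing individual Compute Engine permissions.
GCP_ROLE = {
    "title": "Example Limited VM Operator",  # hypothetical role title
    "includedPermissions": [
        "compute.instances.list",
        "compute.instances.get",
        "compute.instances.start",
        "compute.instances.stop",
    ],
}

for name, doc in (("AWS", AWS_POLICY), ("Azure", AZURE_ROLE), ("GCP", GCP_ROLE)):
    print(f"--- {name} ---")
    print(json.dumps(doc, indent=2))
```

Compare the handful of permissions here with what a predefined read-only role exposes: nothing about IAM users, audit logs, or storage contents is granted.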

Creating and assigning these permissions makes SaaS onboarding a bit more complicated, but it is worth the effort in terms of reducing your exposure.

Other Policy Restrictions

What if you want to give a SaaS provider permissions, but lock it down to only certain resources or certain regions?  AWS and Azure allow you to specify in the policy which resources the policy can be applied to. Google Cloud….not so much.  AWS takes this the farthest, allowing for very robust policies down to specific services, and the addition of tag-based caveats for the policy permissions, for example:
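A minimal sketch of such a policy is built below as a Python dictionary and printed as JSON. It is a reconstruction for illustration rather than an exact copy of any production policy, and the helper name is hypothetical.

```python
import json


def build_tag_scoped_policy(tag_key: str, tag_value: str, region: str) -> dict:
    """IAM policy: describe everything, but start/stop only tagged instances
    in a single region."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Describe calls are left unrestricted.
                "Effect": "Allow",
                "Action": "ec2:Describe*",
                "Resource": "*",
            },
            {
                "Effect": "Allow",
                "Action": ["ec2:StartInstances", "ec2:StopInstances"],
                # Region restriction via the instance ARN.
                "Resource": f"arn:aws:ec2:{region}:*:instance/*",
                # Tag restriction via a condition on the resource tag.
                "Condition": {
                    "StringEquals": {f"ec2:ResourceTag/{tag_key}": tag_value}
                },
            },
        ],
    }


policy = build_tag_scoped_policy("parkmycloud", "yes", "us-east-1")
print(json.dumps(policy, indent=2))
```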

This policy locks down the Start and Stop permissions to only those instances that have the tag name/value parkmycloud: yes, and are located in the us-east-1 region.  Similar Conditions can be used to lock this down by region, instance type, and many other situations. (This recent announcement shows another way to handle the region restriction.)

Azure has somewhat similar features, though with a slightly different JSON layout, as described here. It does not appear you can use resource tags for this in Azure, nor does Azure provide easy ways to limit the geographic scope of permissions. You can get around the location and grouping of resources by using Azure Management Groups, but that is not quite as flexible as an arbitrary tag-based system, and is actually more intended to aggregate resources across subscriptions, rather than be more specific within a subscription. That said, the Azure permissions defined here are a bit more granular than AWS. This does allow for a bit more specificity in permissions if it is needed, but can no doubt grow tedious to list and manage.

Google Cloud provides a long list of predefined roles here, with an excellent listing of the contained permissions. There is also an interesting page describing the taxonomy of the permissions here, but Google Cloud appears to make it a bit difficult to enumerate and understand the permissions individually, outside of the predefined roles. Google does not provide any tag or resource-based restrictions, apart from assignment at the Project level. More on user management and roles by cloud provider in this blog.

Gotchas

You may note that the ec2:Describe permission in our last example does not have the tag-based restriction. This is because the tag-based restriction can only be used for certain permissions, as shown in the AWS documentation. Note also that some APIs can do several different operations, some of which you may be OK with sharing, and others not. For example, the AWS ec2:ModifyInstanceAttribute permission allows the API user to change the instance type. But…this one API (and associated permission) also allows the API user to modify security group assignments, shutdown behaviors, and other features – things you may not want to share with an untrusted 3rd party.

Key takeaway here?  Look out for permissions that may have unexpected consequences.

Summary

Beware of SaaS cloud management providers who are asking for simple predefined roles from your cloud provider.  They are either giving a LOT more functionality than you are likely to want from a single provider, or they are asking for a lot more permissions than they need.  Ask for a “limited access policy” that gives the SaaS provider ONLY what they need, and look for a document that defines these permissions and how they tie back to what the SaaS provider is doing for you.

These limited access policies serve to limit your exposure to accidents or compromises at the SaaS provider.

Cloud Storage Cost Comparison: AWS vs. Azure vs. Google

Today, we’ll take a brief look at a cloud storage cost comparison across the three major cloud service providers. When it comes to finding a solution for your cloud computing needs, it is fair to say that every business has to decide on a case-by-case basis – and given the breadth of cloud storage options available, that is certainly true here. A few things we’ll briefly touch on are pricing models, discounts and steps you can take to avoid wasted cloud spend.

The leading cloud service providers have certain fortes and weaknesses that ultimately differentiate each one of them to be the potential solution to support your development infrastructure, operations and applications. Cloud service providers offer many different cloud pricing points depending on your compute, storage, database, analytics, application and deployment requirements. Additionally, you’d want to consider available services and networks provided to see the full scope of their resource capabilities and governance.

Prices can be subject to the type of hosting option you choose. One example is Relational Database Services (RDS). RDS pricing changes according to which database management system you use, and there are many more services like this to choose from.

More detail, beyond just storage, is available in our full cloud pricing comparison.

AWS and Google Stand Out

Although not always the case, AWS is presumed to be the least expensive option available and remains the leader in the cloud computing market. But Microsoft Azure and Google (GCP) are not far behind, and in recent years they have driven innovation and price reductions, closing the gap with AWS. That being said, being first in the market gives AWS a great advantage over the competition, as its scale lets it offer lower prices. AWS is well known for attracting more businesses and, in turn, investing that money back into the cloud by adding more servers to its data centers. Google is closing the gap on AWS, as it was the first to cut prices in its pricing model to match AWS’.

Storage Services Overview

Let’s take a look at some of the more popular storage options offered by each of the major three providers.

Amazon S3

Amazon Simple Storage Service (S3) is AWS’s object storage service, positioned as highly durable, performant and secure. It manages accounts at every level, scales on-demand and offers insights with built-in analytics.

Amazon EBS

Amazon Elastic Block Store (EBS) provides block level storage volumes for use with EC2 instances. EBS delivers low-latency and consistent performance scaled to the needs of your application.

Amazon Glacier

Amazon Glacier provides data archiving and long-term backup at a low cost. It allows you to query data in place and retrieve only the subset of data you need from within an archive.

More about AWS options: https://aws.amazon.com/products/storage/

Google Cloud Storage

Google Cloud Storage offers a single API for all storage classes, simplifying development integration and reducing code complexity. It’s highly scalable and performant with unlimited object storage.

Cloud Filestore

Google Filestore is a high-performance file storage for applications that require a filesystem interface and a shared filesystem for data.

Persistent Disk

Google Persistent Disk is a reliable high-performance block storage for virtual machine instances.

Explore Google storage options: https://cloud.google.com/products/storage/

Archive Storage

Azure Archive Storage offers a low-cost, durable, and highly available secure cloud storage for rarely accessed data with flexible latency requirements.

Blob Storage

Azure Blob Storage is a massively scalable object storage for unstructured data.

Azure Files

Azure Files is a simple, secure and fully managed cloud file sharing storage.

Check this out as well on Azure options: https://docs.microsoft.com/en-us/azure/architecture/aws-professional/services

Sample Pricing Comparison

[Figure: cloud storage cost comparison chart]

Eliminate Cloud Overspend and Save Money

Comparing cloud storage costs and getting the right solution for your storage use case is important, but don’t forget once you deploy you need to ensure you optimize your solution and cost. It’s important that your organization fully understands how much can be wasted on cloud spend. Over-provisioned, underutilized and idle cloud resources run your cloud bill up and create waste. Always ensure that you are optimizing costs and governing usage by eliminating wasted cloud spend  – get started today.

Amazon ECS Overview: What You Need To Know

Amazon ECS is a great container hosting platform for AWS developers, among the many available options. Jumping into an ECS deployment can be daunting, as there are multiple options and varying terminology with hard-to-predict costs. We’ll go over some of the basics of Amazon ECS, including some terminology and pricing factors you’ll need to consider.

Amazon ECS 101

Amazon ECS (which stands for Elastic Container Service) lets you run Docker containers without having to manage the orchestration of those containers. With ECS, you can deploy your containers on EC2 servers or in a serverless mode, which Amazon calls Fargate. Both deployment types handle the orchestration and underlying server management for you, so you can just schedule and deploy your containers.

Amazon ECS can work for both long-running jobs and short bursts of tasks, and includes tools for adjusting the scale of the container fleet as well as the scheduling of those containers. Task placement definitions let you choose which instances get which containers, or you can let AWS manage this by spreading across all Availability Zones.

Benefits of Amazon ECS include:

  • Easy integrations into other AWS services, like Load Balancers, VPCs, and IAM
  • Highly scalable without having to manage the cluster masters
  • Multiple management methods, including the AWS console, the AWS API, or CloudFormation templates
  • Elastic Container Registry helps you manage and sort your container images

Tasks and Services and Containers (Oh My!)

Diving into the world of containers on AWS requires the use of some terminology you may not be familiar with:

  • Container – An isolated environment that contains the bare minimum of services and code needed to run just a particular part of your application or microservice, designed to be run on any Docker-compatible OS.
  • Task Definition – A layout of the pieces required to run your application, which can include one or more containers along with networking and system requirements.
  • Task – An instantiation of a Task Definition.  Multiple tasks can use the same task definition.
  • Service – A layout of the boundaries and scaling options you set for your groupings of similar Tasks, which is similar to the relationship between AutoScaling Groups and EC2 Virtual Machines.
  • Cluster – A collection of EC2 instances running a specialized operating system where you will run your Service.
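To see how these pieces relate, here is a hedged sketch of a Task Definition and a Service written as Python dictionaries, using the field names of the ECS RegisterTaskDefinition and CreateService APIs. The family, image, cluster, and service names are made up for illustration.

```python
import json

# Task Definition: the layout of what to run (one container here).
task_definition = {
    "family": "example-web",                 # hypothetical name
    "networkMode": "awsvpc",
    "requiresCompatibilities": ["FARGATE"],
    "cpu": "256",                            # 0.25 vCPU for the whole task
    "memory": "512",                         # 512 MiB for the whole task
    "containerDefinitions": [
        {
            "name": "web",
            "image": "nginx:latest",         # any Docker image
            "essential": True,
            "portMappings": [{"containerPort": 80, "protocol": "tcp"}],
        }
    ],
}

# Service: how many Tasks of that definition to keep running, and where.
service = {
    "cluster": "example-cluster",            # hypothetical cluster name
    "serviceName": "example-web-service",
    "taskDefinition": task_definition["family"],
    "desiredCount": 3,                       # three Tasks from one definition
    "launchType": "FARGATE",
}

print(json.dumps({"taskDefinition": task_definition, "service": service}, indent=2))
```

Each of the three running copies is a Task, all instantiated from the same Task Definition, and the Service keeps that count steady.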

ECS Pricing: The (Hopefully Not) Million Dollar Question

Amazon ECS pricing has a few different variables, starting with your choice of deployment methods.  Since Fargate abstracts away the underlying infrastructure, you only pay for the seconds of vCPU and Memory that your Tasks are using (with a minimum of 1 minute for each Task). This pricing structure has the “serverless architecture” benefit of only paying for what you need when you need it, but also means that estimating these charges can be quite difficult.
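A small sketch of that billing rule follows. The per-hour rates are placeholder values rather than current AWS prices, so substitute the published rates for your region.

```python
def fargate_task_cost(duration_seconds: float, vcpu: float, memory_gb: float,
                      vcpu_rate_per_hour: float, gb_rate_per_hour: float,
                      minimum_seconds: float = 60.0) -> float:
    """Cost of one Task billed per second, with a one-minute minimum."""
    billed_seconds = max(duration_seconds, minimum_seconds)
    billed_hours = billed_seconds / 3600.0
    hourly_rate = vcpu * vcpu_rate_per_hour + memory_gb * gb_rate_per_hour
    return billed_hours * hourly_rate


# Placeholder rates, for illustration only.
VCPU_RATE = 0.04   # assumed $ per vCPU-hour
GB_RATE = 0.004    # assumed $ per GB-hour

for seconds in (30, 60, 3600):
    cost = fargate_task_cost(seconds, vcpu=1, memory_gb=2,
                             vcpu_rate_per_hour=VCPU_RATE,
                             gb_rate_per_hour=GB_RATE)
    print(f"{seconds:>5} s task: ${cost:.6f}")
```

A 30-second Task costs the same as a 60-second one because of the minimum, which matters if you run large numbers of very short Tasks.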

Standard ECS pricing does not charge per-Task, but will charge based on the infrastructure you have deployed for your cluster. The cluster uses AutoScaling Groups of EC2 instances, and during setup of the cluster you can choose the instance size you want and the number of instances for the initial cluster deployment.  Since the cluster can scale up and down, you have the flexibility if you get a spike in task usage, but you do need to keep an eye on underutilized or idle instances.

Containing the Containers

As you can tell, utilizing Amazon ECS containers manages a lot of the back-end work for you, but brings a whole different set of considerations for your organization.  ParkMyCloud has some news coming later this year to help you manage your ECS containers! Contact us if you’d like to be notified when that’s available.

Not yet using containers, but have other AWS infrastructure? We can help control costs.

Why Reserved Instance Pricing Needs Careful Evaluation

Once or twice a year we like to take a look at what is going on in the world of reserved instance pricing. We review both the latest offerings and options put out by cloud providers, as well as how users are choosing to use Reserved Instances (AWS), Reserved VMs (Azure) and Committed Use (Google Cloud).

A good place to start when it comes to usage patterns and trends is the annual RightScale (Flexera) State of the Cloud Report. The 2019 report shows that current reservation usage stands at 47% for AWS, 23% for Azure and 10% for GCP. These are interesting figures when you view them alongside companies reporting that their number one cloud initiative for the coming year is optimizing their existing use of the cloud. All of these cloud providers have a major focus on pre-selling infrastructure via their reservation programs, as this provides them with predictable revenue (something much loved by Wall St) and also allows them to plan for and match supply with demand. In return for an upfront commitment they offer discounts of “up to 80%”, but much like your local furniture retailer’s big savings headlines, these discount levels warrant further investigation.

While working on an upcoming new feature release we began to dig a little deeper into the nature of current reserved instance pricing and discounts. From our research it appears that a real-world discount level is in the 30%-50% range. Achieving some of the much higher discounts you might see the cloud providers pushing typically requires a three-year commitment, restriction to certain regions and OS types, and generally a willingness to commit to spending a few million dollars.

Reservation discounts, while not as volatile as spot instance prices, do change and need to be carefully monitored and analyzed. For example, as of this writing, the popular m5.large instance type in a US East region costs $0.096 per hour on demand, but an effective $0.037 per hour when reserved, a significant saving of roughly 61%. However, securing such a discount requires a three-year commitment and full prepayment up front. While the number of organizations committing to contracts of this nature is not publicly known, it is likely that only the most confident organizations with large cash reserves are positioned to make a play like this.
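The numbers in this example are easy to check, and the same arithmetic gives a useful break-even figure: the share of hours an instance must actually run before the reservation beats on-demand. This is a simplified sketch that ignores the time value of money and any resale or exchange options.

```python
HOURS_PER_YEAR = 8760


def savings_fraction(on_demand_rate: float, reserved_rate: float) -> float:
    """Fractional saving of the reserved rate versus on-demand."""
    return 1.0 - reserved_rate / on_demand_rate


def upfront_cost(reserved_rate: float, years: int = 3) -> float:
    """Total prepaid for one instance over the full term."""
    return reserved_rate * HOURS_PER_YEAR * years


def break_even_utilization(on_demand_rate: float, reserved_rate: float) -> float:
    """Share of the term the instance must run for the reservation to pay off.

    The reservation is paid for every hour of the term, while on-demand is
    paid only for hours actually used.
    """
    return reserved_rate / on_demand_rate


on_demand, reserved = 0.096, 0.037  # the m5.large rates quoted above
print(f"Saving:               {savings_fraction(on_demand, reserved):.1%}")
print(f"Three-year prepay:    ${upfront_cost(reserved):,.2f}")
print(f"Break-even run time:  {break_even_utilization(on_demand, reserved):.1%}")
```

With these rounded rates the saving works out to a little over 61%, the prepayment to roughly $972 per instance, and the break-even utilization to about 39%. An instance that runs less than that share of the term, for example one that is parked outside business hours, would be cheaper on demand.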

Depending on the precise program used to purchase the reservations, there can be options to exchange specific instance families, instance types and operating systems for other types, or even to resell the instances on a secondary marketplace for a fee of 12% (on AWS, for example), or to terminate the agreement for the same 12% fee on Azure. GCP’s Committed Use program seems to be the most stringent, as there is no way to cancel the contract or resell pre-purchased instances, although Google does not offer a pre-purchase option.

As the challenge of optimizing cloud spend has slowly moved up the priority list to take the #1 slot, organizations have also matured in how they undertake economic analysis and understand the various tradeoffs. Some organizations are using tools to support such analysis, others are hiring consultants or using in-house analytics resources. Whatever the approach, analyzing an organization’s use of cloud typically means balancing the purchase of different types of reservations, spot instances, and on-demand infrastructure that is highly optimized through automation tools. The level of complexity in such analysis is certainly not reducing, and mistakes are common. However, the potential savings are significant if you achieve the right balance, and this is clearly something you should not ignore.

The relative balance between the different options to purchase and consume cloud services in many ways reflects the overall context within which organizations operate, their specific business models and broader macro issues such as the outlook for the overall economy. Understanding the breadth of options is key, and although reservations are likely to be a key component for most organizations, it is worth digging into just how large the relative tradeoffs might be.