AWS Firecracker was announced at AWS re:Invent in November 2018 as a new AWS open source virtualization technology. The technology is purpose-built for creating and managing secure, multi-tenant container and function-based services. It was described by the AWS Chief Evangelist Jeff Barr as “what a virtual machine would look like if it was designed for today’s world of containers and functions.”
What is AWS Firecracker?
Firecracker is a Virtual Machine Monitor (VMM) designed for running transient, short-lived workloads. In other words, it is optimized for functions and serverless workloads. It’s also an important new component in the emerging world of serverless technologies and is used in the backend implementation of Lambda and Fargate. Firecracker aims to deliver the speed of containers combined with the security of VMs. If you use Lambda or Fargate, you’re already receiving the benefits of Firecracker. However, if you run or orchestrate a large volume of containers, you should take a look at this technology with optimization in mind.
How AWS Firecracker Creates Efficiencies
AWS can realize the economic benefits of Firecracker by creating what it calls “microVMs”, which allow it to spread serverless workloads across multiple servers and so get a greater return on its investment in the servers behind serverless. In terms of customer benefit, Firecracker enables these microVMs to launch in 125 milliseconds or less, compared to the seconds (or longer) it can take to launch a container or spin up a traditional virtual machine. In a world where thousands of VMs can be spun up and down to tackle a specific workload, this constitutes a significant saving. And remember, these are fully fledged micro virtual machines, not just containers.
The microVMs themselves are worth a closer look, as each includes an in-process rate limiter to optimize shared network and storage resources. As a result, one server can support thousands of microVMs with widely varying processor and memory configurations.
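The rate limiting described here can be pictured as a token bucket, which is the scheme Firecracker’s rate limiter is built around. The following is a minimal Python sketch of the idea only, not Firecracker’s own implementation (which is written in Rust). The class name, parameters and numbers are illustrative assumptions.

```python
class TokenBucket:
    """Illustrative token bucket: `size` tokens refill evenly over `refill_ms`."""

    def __init__(self, size, refill_ms, now_ms=0):
        self.size = size
        self.refill_ms = refill_ms
        self.tokens = float(size)  # the bucket starts full
        self.last_ms = now_ms

    def try_consume(self, amount, now_ms):
        # Add the tokens earned since the last call, capped at the bucket size.
        elapsed = now_ms - self.last_ms
        earned = self.size * elapsed / self.refill_ms
        self.tokens = min(self.size, self.tokens + earned)
        self.last_ms = now_ms
        if amount <= self.tokens:
            self.tokens -= amount
            return True
        return False  # over budget: the caller has to wait and retry


# Hypothetical budget: 1000 bytes per 100 ms for one microVM's network device.
bucket = TokenBucket(size=1000, refill_ms=100)
first = bucket.try_consume(800, now_ms=0)    # fits in the full bucket
second = bucket.try_consume(800, now_ms=10)  # rejected: too few tokens so far
third = bucket.try_consume(800, now_ms=80)   # accepted: the bucket has refilled
```

Time is passed in explicitly so the behavior is easy to follow. The point is that each microVM gets its own budget, so one noisy tenant cannot monopolize the shared network or storage of the host.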
There is also the enhanced security and workload isolation that comes from building on the Kernel-based Virtual Machine (KVM): microVMs are more isolated, and therefore more secure, than containers. One particularly valuable security feature is that Firecracker is statically linked, which means all the libraries it needs to run are included in its executable. This makes new Firecracker environments safer by eliminating dependencies on outside libraries. Altogether, the combination of efficiency, security and speed created quite the buzz at the AWS re:Invent launch.
Will Firecracker make a “bang”?
There are a few caveats related to the still novel aspects of the technology. In particular, compared to established alternatives such as containers or Hyper-V VMs, it is prudent to confine Firecracker to non-production workloads, as the technology is still new and needs to be more fully battle-tested for production use.
However, as confidence, adoption, and experience in the use of serverless technologies grow, it certainly seems like Firecracker can offer a popular new method for provisioning compute resources and will likely help bridge the current gap between VMs and containers.
Once or twice a year we like to take a look at what is going on in the world of reserved instance pricing. We review both the latest offerings and options put out by cloud providers, as well as how users are choosing to use Reserved Instances (AWS), Reserved VMs (Azure) and Committed Use (Google Cloud).
A good place to start when it comes to usage patterns and trends is the annual Rightscale (Flexera) State of Cloud Report. The 2019 report shows that current reservation usage stands at 47% for AWS, 23% for Azure and 10% for GCP. These are interesting data points when you view them alongside companies reporting that their number one cloud initiative for the coming year is optimizing their existing use of the cloud. All of these cloud providers have a major focus on pre-selling infrastructure via their reservation programs, as this provides them with predictable revenue (something much loved by Wall St) and also allows them to plan for and match supply with demand. In return for an upfront commitment they offer discounts of “up to 80%”. Much like the big savings headlines at your local furniture retailer, however, these discount levels warrant further investigation.
While working on an upcoming feature release we began to dig a little deeper into the nature of current reserved instance pricing and discounts. From our research it appears that a real-world discount level is in the 30%-50% range. Achieving some of the much higher discounts you might see the cloud providers pushing typically requires a three-year commitment, restriction to certain regions and OS types, and generally a willingness to commit to spending a few million dollars.
Reservation discounts, while not as volatile as spot instance prices, do change and need to be carefully monitored and analyzed. For example, as of this writing, the popular m5.large instance type in a US East region costs $0.096 per hour on demand, but this falls to $0.037 per hour with a reservation, a significant 62% saving. However, securing such a discount requires a three-year commitment and prepayment in full up front. While the number of organizations committing to contracts of this nature is not publicly known, it is likely that only the most confident organizations with large cash reserves would be positioned to make a play like this.
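The arithmetic behind that example is easy to reproduce. The sketch below is illustrative: it assumes the $0.037 figure is an effective hourly rate, that is, the full prepayment spread over every hour of the three-year term, and the function names are our own. With the rounded prices quoted it yields a saving of about 61.5%; the small gap from the quoted 62% likely comes from rounding in the published hourly figures.

```python
HOURS_PER_YEAR = 8760


def reservation_saving(on_demand_hourly, reserved_hourly):
    """Fractional saving of the reserved rate relative to on-demand."""
    return 1 - reserved_hourly / on_demand_hourly


def upfront_commitment(reserved_hourly, years):
    """Prepayment implied by an effective hourly rate over the whole term."""
    return reserved_hourly * HOURS_PER_YEAR * years


saving = reservation_saving(0.096, 0.037)
prepay = upfront_commitment(0.037, years=3)
print(f"saving: {saving:.1%}, prepayment per instance: ${prepay:,.2f}")
```

Under these assumptions the prepayment works out to a little under $1,000 per instance, which is why this kind of commitment only adds up to serious money across a large fleet.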
Depending on the precise program used to purchase the reservations, there can be options to exchange specific instance families, instance types and operating systems for others, to resell the instances on a secondary marketplace for a fee of 12% (on AWS, for example), or to terminate the agreement for the same 12% fee (on Azure). GCP’s Committed Use program seems to be the most stringent, as there is no way to cancel the contract or resell the committed capacity, although Google does not offer a prepayment option.
As the challenge of optimizing cloud spend has slowly moved up the priority list to take the #1 slot, organizations have also matured in how they undertake economic analysis and understand the various tradeoffs. Some organizations are using tools to support such analysis; others are hiring consultants or using in-house analytics resources. Whatever the approach, analyzing an organization’s use of cloud typically means balancing the purchase of different types of reservations against spot instances and on-demand infrastructure that is highly optimized through automation tools. The complexity of such analysis is certainly not reducing, and mistakes are common. However, the potential savings are significant if you achieve the right balance, and they are clearly something you should not ignore.
The relative balance between the different options to purchase and consume cloud services in many ways reflects the overall context within which organizations operate, their specific business models and broader macro issues such as the outlook for the overall economy. Understanding the breadth of options is key, and although reservations are likely to be a key component for most organizations, it is worth digging into just how large the relative tradeoffs might be.
As container technology moves past something new and into the mainstream, users are concerned about the next step: container optimization. In our conversations with customers and potential customers, containers have been a consistent topic for the last few years, typically focused on production environments. However, recent conversations have become more focused, specifically on how to optimize container spending.
Kubernetes – which seems to be the most popular of container services among our customer base – does allow for a number of ways to optimize for costs and to maximize performance. We have identified five specific opportunities ripe for container optimization. Take a look at these within your own environments.
1) Rightsize Your Pods
Kubernetes Pods are the smallest deployable computing units in the Kubernetes container environment. It is a common practice to use a standard template for limits and requests for pod provisioning. While requests describe the minimum CPU and memory a pod needs in order to be scheduled on a node, limits describe the maximum amount of CPU and memory the pod can consume on that node. Typically engineers set the initial values by using a rule of thumb, such as doubling their estimate just to be on the safe side, and then plan to change them later once they have some data to look at. As with many things in life, “later” rarely happens. As a result, the footprint of the cluster inflates over time, exceeding the actual demand for the services running inside the cluster.
Just think about it: if 50% of every pod’s allocation goes unused (as happens when values are set by doubling) and the cluster is always 80% full, then 40% of the cluster capacity is allocated but not used, or simply put, wasted.
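The waste figure is simple multiplication, sketched below with hypothetical helper names. Note that the result depends on how you read “over-provisioned”: the 40% figure corresponds to half of each pod’s allocation sitting idle, which is what the doubling rule of thumb produces.

```python
def unused_share_from_headroom(headroom):
    """Idle share of an allocation when requests are (1 + headroom) x real use."""
    return headroom / (1 + headroom)


def wasted_cluster_share(allocated_share, unused_share_of_allocation):
    """Share of total cluster capacity that is allocated but never used."""
    return allocated_share * unused_share_of_allocation


# Doubling rule: requests are 2x real use, so half of each allocation is idle.
doubled = unused_share_from_headroom(1.0)
waste_doubled = wasted_cluster_share(0.80, doubled)

# For comparison: requests at 1.5x real use leave one third of each allocation idle.
waste_padded = wasted_cluster_share(0.80, unused_share_from_headroom(0.5))
```

Here `waste_doubled` comes out at 0.40 and `waste_padded` at roughly 0.27, so even more modest padding strands over a quarter of an 80%-full cluster.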
2) Turn Off Idle Pods
Many standard instances/VMs and databases in non-production environments are idle outside of working hours and can be turned off or “parked”. The same case exists for Pods, which in non-production environments can and should be scheduled in the same way.
3) Rightsize Your Nodes
Too many worker nodes are the wrong size and type. Kubernetes permits co-locating applications on the same nodes, which can dramatically reduce the cloud bill. Yet incorrectly sized instances and volumes can inflate the cost of Kubernetes clusters. Rightsizing could save up to 50% (particularly if no previous action has been taken to rightsize your nodes).
Another thing to consider is that smaller nodes have a higher relative OS footprint and increase management overhead. The smaller the node, the higher the share of stranded resources. Stranded resources are CPU or memory that sit idle yet cannot be allocated to any pod, because the pods waiting to be scheduled are too big to claim them. The closer pod sizes are to the size of the node (server), the higher the percentage of stranded resources.
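A small calculation makes the stranding effect concrete. This is a simplified sketch: it assumes identically sized pods, considers CPU only, and ignores the capacity Kubernetes reserves for the OS and system daemons. The node and pod sizes are hypothetical.

```python
def stranded_fraction(node_cpu_millicores, pod_cpu_millicores):
    """Fraction of a node's CPU that identically sized pods cannot claim."""
    pods_that_fit = node_cpu_millicores // pod_cpu_millicores
    leftover = node_cpu_millicores - pods_that_fit * pod_cpu_millicores
    return leftover / node_cpu_millicores


# The same 1.5-core pod on a 4-core node and on a 16-core node.
small_node = stranded_fraction(4000, 1500)   # 2 pods fit, 1000m left over
large_node = stranded_fraction(16000, 1500)  # 10 pods fit, 1000m left over
```

Both nodes strand the same 1000 millicores, but that is 25% of the small node and only 6.25% of the large one.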
4) Consider Storage Opportunities
Out of the box, containers lose their data when they restart. This is fine for stateless components but becomes an issue when a persistent data store is required. One place to look for additional container optimization opportunities is the overprovisioning of persistent storage (EBS, Azure Storage Disks, etc) related to your containers. There are a number of options to optimize container storage, particularly virtualized storage that can be shared by multiple containers, and which persists over time, without being destroyed when individual containers are destroyed. There are a few different persistent-storage plugins and plugin-driven storage solutions available from third-party vendors.
5) Review Purchasing Options
All of the preceding options relate to the actual configuration of your container infrastructure. Just as important is ensuring that your purchasing options closely align with your needs. Choosing the correct instance/VM purchase type for your containerized infrastructure is critical to preserving flexibility and maximizing ROI. Carefully analyze your purchasing options (e.g. on-demand, reservations and spot) to select the right option for your workload, both in terms of size and usage schedule. Note that reserved instances are not always the best option for resources that can be scheduled to be turned off. Leverage cost optimization tools to support the earlier options for instance scheduling and rightsizing. Such tools can often change the equation and help avoid lock-ins and upfront commitments.
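To see why a reservation is not automatically the best choice for resources that can be switched off, compare the reserved rate with the effective on-demand rate under a schedule. The sketch below uses illustrative hourly prices ($0.096 on demand and $0.037 reserved, the m5.large figures cited earlier in this document) and hypothetical function names. It ignores spot pricing and partial-upfront options.

```python
HOURS_PER_WEEK = 168


def breakeven_uptime(on_demand_hourly, reserved_hourly):
    """Uptime share below which scheduled on-demand beats the reservation."""
    return reserved_hourly / on_demand_hourly


def cheaper_option(on_demand_hourly, reserved_hourly, hours_on_per_week):
    """Compare the reserved rate with on-demand paid only while running."""
    uptime = hours_on_per_week / HOURS_PER_WEEK
    effective_on_demand = on_demand_hourly * uptime
    if effective_on_demand < reserved_hourly:
        return "on-demand with schedule"
    return "reserved"


threshold = breakeven_uptime(0.096, 0.037)
office_hours = cheaper_option(0.096, 0.037, hours_on_per_week=60)   # 12 hours x 5 days
always_on = cheaper_option(0.096, 0.037, hours_on_per_week=168)
```

With these prices the break-even is roughly 39% uptime: a node pool that runs 12 hours a day on weekdays is cheaper on demand, while an always-on pool is cheaper reserved.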
Container Optimization is Just Another Kind of Resource Optimization
The opportunities to save money through container optimization are in essence no different than for your non-containerized resources. Native tools, from either the cloud provider or open source, can help with this, but their capabilities are limited. For a fully optimized environment, you’ll want to take advantage of the growing ecosystem of specific cost optimization tools.
Stay tuned for news from ParkMyCloud on this front coming soon!
Today we’re going to look at an interesting trend we are seeing toward the use of custom machine types in Google Cloud Platform. One of the interesting byproducts of managing the ParkMyCloud platform is that we get to see changes and trends in cloud usage in real time. Since we’re directly at the customer level, we can often see these changes before they are spotted by official cloud industry commentators. They start off as small signals in the noise, but practice has allowed us to see when something is shifting and a trend is emerging – as is the case with these custom machine types.
Over the last year, the shift to greater use of Custom Machine Types (launched in 2016 on Google Compute Engine) and, to a lesser extent, Optimized EC2 instances (launched in 2018 on AWS) is just such a signal, one that we have observed growing in strength. Interestingly, Microsoft has yet to offer an equivalent on Azure.
What do GCE custom machine types let you do?
Custom machine types let you build a bespoke instance to match the specific needs of your workload. Many workloads can be matched to off-the-shelf instance types, but there are many others for which it is now possible to build a better-matching machine at a more cost-effective price. The growth in adoption of custom machine types supports the case for making them available.
So what are the benefits of these new customized machines? First, they provide a granular level of control to match the needs of your specific application workloads. With predefined instance types, in practice you compromise by selecting the instance type closest to your optimal configuration. Such compromises typically lead to over-provisioning, a situation we see across the board among our customer base. We analyzed usage of the instances in our platform this summer, and found that across all of them, the average vCPU utilization was less than 10%!
Secondly, they allow you to finely tune your machine to maximize the cost effectiveness of your infrastructure. Google claims savings of up to 50% when utilizing their customized options compared to traditional predefined instances, which we believe to be a reasonable assessment as we see the standard instance types are often massively overprovisioned.
On GCE, the variables that you can configure include:
Quantity and type of vCPUs;
Quantity and type of GPU;
Memory size (albeit there are some limits on the maximum per vCPU).
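As a rough illustration of how those per-vCPU memory limits shape a custom machine, the sketch below picks the smallest shape that covers a given need. The rules it encodes (a vCPU count of 1 or an even number, between 0.9 GB and 6.5 GB of memory per vCPU, memory in 0.25 GB steps) are assumptions that vary by machine family and over time, so check the current GCE documentation before relying on them. The function name is hypothetical.

```python
import math


def custom_shape(need_vcpus, need_mem_gb, min_gb_per_vcpu=0.9, max_gb_per_vcpu=6.5):
    """Smallest custom shape covering the need under the assumed rules."""
    # Assumed rule: the vCPU count is 1 or an even number.
    vcpus = 1 if need_vcpus <= 1 else 2 * math.ceil(need_vcpus / 2)
    # Assumed rule: memory per vCPU is capped, so add vCPUs until memory fits.
    while need_mem_gb > vcpus * max_gb_per_vcpu:
        vcpus = 2 if vcpus == 1 else vcpus + 2
    # Assumed rule: memory has a per-vCPU floor and comes in 0.25 GB steps.
    mem_gb = max(need_mem_gb, vcpus * min_gb_per_vcpu)
    mem_gb = math.ceil(mem_gb * 4) / 4
    return vcpus, mem_gb


memory_heavy = custom_shape(need_vcpus=2, need_mem_gb=20)
cpu_heavy = custom_shape(need_vcpus=3, need_mem_gb=2)
```

Under these assumptions the memory-heavy workload has to take 4 vCPUs to be allowed 20 GB, while the CPU-heavy one gets 4 vCPUs with 3.75 GB, in both cases a closer fit than rounding up to the next predefined size.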
On AWS, customized options are currently more limited and include only the number and type of vCPUs, and the options are aimed at per-core software licensing problems rather than cost optimization. It will be interesting to see whether AWS follows Google and opens up cost-based customization options in the coming months, allowing the effective unbundling of fixed off-the-shelf instance types.
Should you use custom machine types?
So just because customization is an option, is this something you should actually pursue? You will pay a small premium for a custom machine compared to a standard instance/VM of similar size, but because you can optimize for specific workloads, the overall cost will often be lower. Making such an assessment requires that you examine your applications’ resource use and performance requirements, which in turn means carefully analyzing your infrastructure utilization data. This quickly gets complex, although there are a number of platforms which can support thorough analytics and data visualization. Ideally, such analytics would be combined with the ability to recommend specific cost-effective customized instance configurations as well as automate their provisioning.
Watch this space for more news on custom machine types!
The next frontier in cost optimization for ParkMyCloud is cloud sizing. We have been working on product features around resource sizing that will deliver greater automation in the management of cloud infrastructure. A key part of this effort has involved analysis of cloud usage patterns across our entire user base. We’ve identified some interesting patterns and correlations in cloud sizing and usage.
vCPU Utilization Patterns: Lower than Expected
One data point that caught our attention was vCPU metric data, specifically the very low average (and peak) utilization we see in our users’ infrastructure. We know anecdotally that a large proportion of what users manage in our platform consists of non-production instances used for development, staging, testing, and data analytics workloads, many of which do not need to run 24/7/365. But even bearing this in mind, we see surprisingly low vCPU utilization. Based on our most recent analysis of instances from across the four public cloud providers we support, the median (50th percentile) instance had an average vCPU utilization of only 2% and a peak of 55%. Even at the 75th percentile, average utilization was only 7%, albeit with a peak of 98%.
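For readers who want to run the same kind of analysis on their own metrics, here is a minimal sketch of a percentile summary. The sample values are hypothetical and are not ParkMyCloud data, and the nearest-rank method is one of several common percentile definitions.

```python
import math
import statistics

# Hypothetical per-instance average vCPU utilization, in percent.
avg_util = [1, 1, 2, 2, 2, 3, 5, 7, 12, 40]


def percentile(values, pct):
    """Nearest-rank percentile: smallest value with pct% of the data at or below it."""
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


p50 = percentile(avg_util, 50)
p75 = percentile(avg_util, 75)
mean = statistics.mean(avg_util)
```

In this sample the median is 2 and the 75th percentile is 7, while the mean is 7.5 because a single busy instance pulls it up. That is why percentiles describe a fleet better than a single average.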
What leads to these cloud sizing decisions?
Of course, when selecting instance sizes and types, vCPU is not the only consideration. To make an accurate assessment of the match between workload and instance type, there are several data points to consider, including memory, network, disk, etc. We have no visibility into the specific workloads on these instances and why they were chosen, but we can make some educated guesses about why this systematic overprovisioning of instances is occurring.
A few potential reasons include:
A need to provision instances with larger vCPUs in order to access instances with the required memory
A need to provision larger storage-optimized instances where the focus is high data IOPS
Using some other ‘rule of thumb’ when provisioning such as the not-so-tried-and-tested ‘determine what I think I need then double it’ rule.
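The first of these reasons is easy to demonstrate. The sketch below uses a made-up catalog with a fixed ratio of 1 vCPU to 4 GiB of memory (these are not real provider instance types) and picks the smallest entry that satisfies both requirements.

```python
# Hypothetical catalog with a fixed 1 vCPU : 4 GiB ratio, not real provider SKUs.
CATALOG = [
    ("small", 2, 8),
    ("medium", 4, 16),
    ("large", 8, 32),
    ("xlarge", 16, 64),
]


def pick_instance(need_vcpus, need_mem_gib):
    """Return the first (smallest) catalog entry meeting both requirements."""
    for name, vcpus, mem_gib in CATALOG:
        if vcpus >= need_vcpus and mem_gib >= need_mem_gib:
            return name, vcpus, mem_gib
    raise ValueError("no instance in the catalog is large enough")


# A workload that needs only 2 vCPUs but 24 GiB of memory.
name, vcpus, mem_gib = pick_instance(need_vcpus=2, need_mem_gib=24)
idle_vcpu_share = 1 - 2 / vcpus
```

The memory requirement forces the choice of the 8-vCPU “large” entry, so 75% of the vCPUs are surplus before the workload has even started.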
Clearly, a number of factors drive the performance and cost of cloud instances (VMs), including the number of processor cores, the amount of RAM, storage capacity and storage performance. Looking at just one of these factors in isolation would not normally tell us much, but the underutilization we observe in vCPU is extreme enough to be worth noting.
How much do cloud sizing choices matter?
Given the sheer volume of workloads moving to public cloud (some 80% of enterprises reported moving workloads to cloud in 2017), it is critical to accurately determine, monitor and then optimize your compute resources. If you think there’s a problem with improper cloud sizing in your environment, you may want to check out our recently published cloud waste checklist to identify other problem areas and take action to reduce costs.
There are many reasons why this “supersize me” approach to cloud sizing is occurring. We would be interested to get your take. How does your team determine compute requirements for cloud workloads? Are there other reasons why you might deliberately choose to oversize a resource? Comment below to let us know.