Every year, an exorbitant amount of money is wasted on idle cloud resources. That is – resources that are provisioned, and being paid for, but not actually being used. This is a huge problem that clogs up cloud environments and drains budgets.
Note: a version of this blog was originally published in 2018. It has been completely updated and rewritten for 2020.
Even the Cloud Providers are Talking About It
The issue of idle resources is something that is recognized even by the cloud providers themselves. This may sound counterintuitive. Doesn’t AWS just want as much money from you as it can get? Well, maybe, yes: but the best way for them to do this is by providing you with a positive experience and the most value for your money.
Case in point: at the AWS re:Invent keynote this week, Andy Jassy spoke about a few core guidelines for organizations to follow to stay on the path to successful technology financial management. “Start early and start small…The key is to start experimenting with what matters the most to your organization,” Jassy said. He shared that a great place to start is by deleting or stopping idle resources in your cloud environment. Small changes like this can have a huge impact, and the benefits can increase as time goes on. Idle resources eat at your cloud budget, causing you to spend money on resources that aren’t even being used.
AWS’s cloud financial management framework mentions this among the myriad ways your organization can improve practices to reduce usage waste and optimize costs.
The Cost of Idle Resources
The typical “idle resources” that come to mind are instances purchased On Demand that are being used for non-production purposes like development, testing, QA, staging, etc. These resources can be “parked” when they’re not being used, such as on nights and weekends, saving 65% or more per resource each month. In order to fully understand the problem of idle cloud resources, we have to expand this scope beyond just your typical virtual machine.
Most non-production resources can be parked about 65% of the time, that is, parked 12 hours per day and all day on weekends (this is confirmed by looking at the resources parked in ParkMyCloud – they’re scheduled to be off just under 65% of the time.) We see that our customers are paying their cloud providers an average list price of $220 per month for their instances. If you’re currently paying $220 per month for an instance and leaving it running all the time, that means you’re wasting $143 per instance per month.
Maybe that doesn’t sound like much. But if that’s the case for 10 instances, you’re wasting $1,430 per month. One hundred instances? You’re up to a bill of $14,300 for time you’re not using. And that’s just a simple micro example. At a macro level that’s literally billions of dollars in wasted cloud spend.
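The arithmetic above can be reproduced in a few lines. This is a minimal sketch, assuming a schedule of 12 hours off per weekday plus full weekends and the $220 average list price quoted above. The function names are illustrative and are not part of any ParkMyCloud API.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours


def parked_fraction(off_hours_per_weekday: float = 12, weekend_days_off: int = 2) -> float:
    """Fraction of the week a resource is off under a simple parking schedule."""
    off_hours = 5 * off_hours_per_weekday + weekend_days_off * 24
    return off_hours / HOURS_PER_WEEK


def monthly_waste(list_price: float, idle_fraction: float, instances: int = 1) -> float:
    """Monthly spend on hours when instances are running but not being used."""
    return list_price * idle_fraction * instances


# The schedule itself works out to just under 65% of the week.
print(f"Parked fraction: {parked_fraction():.1%}")

# 0.65 is the rounded figure used in the text above.
for count in (1, 10, 100):
    print(count, "instance(s):", round(monthly_waste(220, 0.65, count), 2))
```

Swapping in your own average instance price and schedule gives a quick estimate of what always-on non-production resources are costing you.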
4 Types of Idle Cloud Resources
So what kinds of resources are typically left idle, consuming your budget? Let’s dig into that, looking at the big three cloud providers — Amazon Web Services (AWS), Microsoft Azure, and Google Cloud Platform (GCP).
On Demand Instances/VMs – this is the core of the conversation, and what we’ve addressed above. On demand resources – and their associated scale groups – are frequently left running when they’re not being used, especially those used for non-production purposes.
Databases – there’s no doubt that databases are frequently left running when not needed as well, in similar circumstances to the On Demand resources, particularly non-production. The question is whether you can park them to cut back on wasted spend. Certain database services can be parked, including AWS RDS instances, Neptune and Redshift databases, and Google Cloud SQL. Make sure you review your database infrastructure regularly and terminate anything unnecessary – or change to a smaller size if possible.
Load Balancers – AWS Elastic Load Balancers (ELB) cannot be stopped (or parked), so to avoid getting billed for idle time you need to remove them. The same can be said for Azure Load Balancer and GCP Load Balancers. Alerts can be set up in CloudWatch/Azure Metrics/Google Stackdriver when you have a load balancer with no instances, so be sure to make use of those alerts.
Containers – optimizing container use is a project of its own, but there’s no doubt that container services can be a source of waste. It’s important that you regularly review the usage of your containers and the utilization of the infrastructure, especially in non-production environments. In the last few months, ParkMyCloud has released support for Amazon EKS, Azure AKS and Google Cloud GKE so customers can make sure their idle resources are parked.
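For load balancers, the "no instances" check mentioned above can also be scripted. The sketch below assumes you have already fetched Classic ELB descriptions in the shape returned by boto3's `describe_load_balancers` call (a `LoadBalancerDescriptions` list whose entries carry `LoadBalancerName` and `Instances`). The sample data and the helper name `find_empty_load_balancers` are made up for illustration.

```python
from typing import Any


def find_empty_load_balancers(response: dict[str, Any]) -> list[str]:
    """Return the names of load balancers that have no registered instances."""
    return [
        lb["LoadBalancerName"]
        for lb in response.get("LoadBalancerDescriptions", [])
        if not lb.get("Instances")
    ]


# Hypothetical sample data mimicking the response shape; not real resources.
sample_response = {
    "LoadBalancerDescriptions": [
        {"LoadBalancerName": "dev-web", "Instances": []},
        {
            "LoadBalancerName": "prod-web",
            "Instances": [{"InstanceId": "i-0123456789abcdef0"}],
        },
    ]
}

print(find_empty_load_balancers(sample_response))
```

This only covers Classic load balancers. Application and Network Load Balancers register targets through target groups, so they would need a different check.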
Cloud waste is a billion-dollar problem facing businesses today. Make sure you’re turning off idle cloud resources in your environment, by parking those that can be stopped and eliminating those that can’t, to do your part in optimizing cloud spend.
Lately, many of our AWS customers (especially those purchasing through the AWS marketplace) have mentioned that they are using an AWS EDP, which stands for Amazon Web Services Enterprise Discount Program. Essentially, this is AWS’s way to provide enterprises a discount off its services based on a volume (consumption) commitment. In the most recent Flexera State of the Cloud Report, 37% of respondents using AWS reported using an EDP.
How does an AWS EDP work?
A simple example of how an AWS EDP or “AWS Enterprise Agreement” might work is as follows: for the next 3 years, you commit to spend $5MM on AWS services, and receive a 13% discount. Even if you don’t spend $5MM you would still owe them $5MM, and if you go over you would get billed for the overage. Of course, the terms and amounts are all up to negotiation with AWS.
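To make the commitment mechanics concrete, here is a deliberately simplified sketch. It assumes the discount applies to list-price usage and that the commitment is measured against the discounted spend. Real EDP terms are negotiated and may work differently, so treat the model and the function name `edp_amount_owed` as illustrative only.

```python
def edp_amount_owed(list_price_usage: float, commitment: float, discount: float) -> float:
    """Amount owed over the term under a simplified commit-and-discount model."""
    discounted_spend = list_price_usage * (1 - discount)
    # You owe at least the commitment, even if actual usage falls short of it.
    return max(commitment, discounted_spend)


COMMITMENT = 5_000_000  # $5MM over the term, from the example above
DISCOUNT = 0.13         # 13% discount, from the example above

# Under-consuming: the full commitment is still owed.
print(edp_amount_owed(4_000_000, COMMITMENT, DISCOUNT))

# Over-consuming: the overage is billed on top of the commitment.
print(round(edp_amount_owed(7_000_000, COMMITMENT, DISCOUNT), 2))
```

The first case is the one to watch. If consumption falls short, the effective discount disappears and you pay for capacity you never used.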
AWS’s website does not provide a lot of information about these agreements, which is perhaps to be expected considering they will customize the terms for any given customer. Here’s what they say: “Customers also have the option to enroll in an Enterprise Agreement with AWS. Enterprise Agreements give customers the option to tailor agreements that best suit their needs. For additional information on Enterprise Agreements please contact your sales representative.”
There are a few things you should consider about the EDP contract terms you agree upon with AWS. For example, the agreement may be limited to certain accounts, services, and/or regions.
You’ll see big numbers in the news, such as Apple’s $30 million monthly on AWS or Pinterest’s $750 million multi-year deal – but even if you’re not a tech giant or a unicorn startup, an Amazon EDP can still be on the table and a way to get an across-the-board discount.
What Other Agreements Compare to an AWS EDP?
Going back to my days at IBM, we used to generally refer to discount contracts as Enterprise License Agreements (ELAs). An ELA is a software site license that is sold to large enterprises. It typically allows for unlimited use of single or multiple software products throughout the organization, although there were often some restrictions and limitations. During my time at IBM, these were sold upfront for a set dollar amount and term, generally 3 to 5 years and usually had a cap on usage, so at some point overages could kick in – which would help with the renegotiation, of course.
Other terms used with a similar concept include Site License, Enterprise Agreement (this is a common Microsoft term – EA), Volume Purchase Agreement (VPA) and All You Can Eat (AYCE). What all of these have in common is that the vendor gets a large revenue/spend commit, and the enterprise gets discounting and flexibility.
How Else can you Get an AWS Discount?
AWS provides enterprises with multiple ways to consume its services based on their business needs and get volume discounts. Traditional on-demand instances allow you to pay for capacity by the hour without any long-term commitments or upfront payments. AWS Savings Plans are a way to save by committing to use at a micro scale: you commit to a certain amount of spend per hour, and in return get a discount on the VMs you’re already running. The less flexible reserved instances are another option for applications with steady-state or predictable usage and can provide up to a 75% discount compared to on-demand pricing. Especially for smaller organizations, there are a number of ways to get AWS credits to ease the burden. And of course they promote scale groups, spot instances, and other optimization efforts to reduce spend and waste, but those are more cost control opportunities than they are discounts. Plus, you can always wait for better pricing.
Should You Use an AWS EDP?
Whether you participate in this program is somewhat predicated on your existing partner relationship and amount of spend with AWS, but you can always reach out to your AWS representative. Before committing to an AWS EDP, ensure that you are confident your organization will consume the amount of resources you are committing to. Keep in mind that this can also include the AWS Marketplace. The third party solutions you can buy on the AWS Marketplace also count toward your AWS EDP, and leverage that discount structure — so before completing a third-party transaction, make sure you check the Marketplace to see if the cloud solution you buy is listed there.
Given our focus on public cloud cost control, we here at ParkMyCloud are always trying to understand more about the future trends in cloud computing, specifically the public cloud infrastructure (IaaS) and platform (PaaS) market. Now that public cloud has become ubiquitous, there’s a common theme. While new services and products continue to develop, more and more of them focus not just on creating capabilities that were previously lacking but on optimizing what already exists.
Are Cloud Services Still Growing?
Before we dive into optimization, let’s take a look at how the cloud market continues to grow in 2020 and beyond. Gartner estimates that $257.9B will be spent on public cloud services in 2020, up 6.3% from 2019 as outlined in the table below:
And according to IDC, almost half of IT spending is cloud-based, “reaching 60% of all IT infrastructure and 60-70% of all software, services and technology spending in 2020.” These projections come mid-2020, showing that even given the disruption this year, between Gartner and IDC, no one expects cloud adoption and spending to slow down any time soon. So what’s driving this growth and what are the future trends in cloud computing we should be on the lookout for in 2020 and beyond?
Trends in Cloud Computing You’ve Probably Heard About
There is definitely a lot of hype around Blockchain, Quantum Computing, Machine Learning, and AI, as there should be. But at a more basic level, cloud computing is changing businesses in many ways. Whether it is the way they store their data, improvements to agility and go-to-market for faster release of new products and services, or how they develop and operate services remotely in today’s “locked-down world”, cloud computing is benefitting all businesses in every sector. Smart businesses are always looking for the most innovative ways to improve and accomplish their business objectives, i.e., make money.
When it comes to cloud technology, more and more businesses are realizing the benefits that cloud can provide them and are beginning to seek more cloud solutions to conduct their business activities. And obviously, Amazon, Microsoft, Google, Alibaba, IBM, Cisco, VMware and Oracle plan to capture this spend by providing a dizzying array of IaaS, PaaS, and DaaS offerings to help enterprises build and run their services.
How These Trends Make Cloud Computing Better
Cloud Automation Tools: as modern IT environments continue to become more diverse and distributed in the pursuit of key business goals, they also bear new challenges for the operation teams responsible for keeping everything running smoothly. The go-to strategy for taming the associated complexity can be summed up in one word – automation.
Automation tools, including some that incorporate AI, are on the rise in 2020. These new automation capabilities, along with comprehensive dashboards that provide a holistic view into multi-cloud operations, will become increasingly important for cloud and IT operations to support the lines of business regardless of where they place their workloads. These tools can help put the right workloads in the right place, manage costs, improve security and governance, and ensure application performance.
Desktop as a service (DaaS): DaaS is expected to have the most significant growth in 2020, increasing 95.4% to $1.2 billion. DaaS offers an inexpensive option for enterprises that are supporting the surge of remote workers due to the global pandemic and their need to securely access enterprise applications from multiple devices and locations.
Multi-Cloud and Hybrid Cloud: Once predicted as the future, the multi- and hybrid cloud world has arrived and will continue to grow. Most enterprises (93 percent) described their strategy as multi-cloud in 2020 according to a Flexera report (up 21% from 2018) and 87% have a hybrid cloud strategy. In addition, 71 percent of public cloud adopters are using 2+ unique cloud environments/platforms. These numbers will only go up in 2021. While this offers plenty of advantages to organizations looking to benefit from different cloud capabilities, using more than one CSP complicates governance, cost optimization, and cloud management further as native CSP tools are not multi-cloud. As cloud computing costs remain a primary concern, it’s crucial for organizations to stay ahead with insight into cloud usage trends to manage spend (and prevent waste) and optimize application performance.
It’s a complex problem, but we do see many organizations adopting a multi-cloud strategy with cost control and governance in mind, as it avoids vendor lock-in and allows flexibility for deploying workloads in the most cost-efficient manner (and at a high level, keeps the cloud providers competitive against each other to continually lower prices).
Growth of Managed Services: The global cloud managed services market is growing rapidly and is expected to reach $116B by 2025, growing from $62.4B in 2020, according to a study conducted by MarketsandMarkets. Enterprises are focusing on their primary business operations, which results in higher cloud managed services adoption. Business services, security services, network services, data center services, and mobility services are major categories in the cloud managed services market. Implementation of these services will help enterprises reduce IT and operations costs and will also enhance productivity of those enterprises.
Managed service providers – the good ones, anyway – are experts in their field and some of the most informed consumers of public cloud. By handing cloud operations off to an outside provider, companies are not only optimizing their own time and human resources – they’re also pushing MSPs to become efficient cloud managers so they can remain competitive and keep costs down for themselves and their customers.
Cloud Trends Are Always Evolving
While today, it sometimes seems like we’ve seen the main components of cloud operations and all that’s left to do is optimize them, history tells us that’s not the case. Cloud has been and will continue to be a disruptive force in enterprise IT for years to come as has the Global Pandemic of 2020, and future technology trends in cloud computing will continue to shape the way enterprises leverage public, private and hybrid cloud. Remember: AWS was founded in 2006, the cloud infrastructure revolution is still in early days, and there is plenty more XaaS to be built.
The deliverability of cloud governance models has improved as public cloud usage continues to grow and mature. These models allow large enterprises to tier and scale their AWS Accounts, Azure Subscriptions and Google Projects across hundreds and thousands of cloud users and services. When we first started talking to customers 5+ years ago, mostly AWS users at the time, they often had a single AWS account for their entire organization and required third-party tools to manage usage and costs by project, line of business or application owner. But now, the “Big 3” cloud providers offer an array of ways for even the largest Fortune 500 enterprises to set up, run and manage their use of the dizzying volume of cloud services.
Why Cloud Governance Models are Important
The main way cloud providers allow cloud administrators to manage and grant access to their services is by leveraging Identity and Access Management (IAM) and providing options for roles and policies that govern both access and usage. IAM lets you grant granular access to specific AWS, Azure and/or Google Cloud resources and helps prevent access to other resources. IAM lets you adopt the security principle of least privilege, where you grant only the permissions necessary to access specific resources like VMs, databases, storage, containers, etc. With IAM, you manage access control by defining who (identity) has what access (role) for which resource.
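The "who has what access for which resource" idea can be sketched as a small data model. This is a generic illustration of least privilege with deny-by-default, not any provider's actual IAM API. The role names, permissions, identities, and resource names are all hypothetical.

```python
from dataclasses import dataclass

# Hypothetical roles mapped to the actions they allow.
ROLE_PERMISSIONS = {
    "viewer": {"read"},
    "operator": {"read", "start", "stop"},
    "admin": {"read", "start", "stop", "delete"},
}


@dataclass(frozen=True)
class Binding:
    identity: str  # who
    role: str      # what access
    resource: str  # which resource


def is_allowed(bindings: list[Binding], identity: str, action: str, resource: str) -> bool:
    """Deny by default; allow only if some binding grants the action on the resource."""
    return any(
        b.identity == identity
        and b.resource == resource
        and action in ROLE_PERMISSIONS.get(b.role, set())
        for b in bindings
    )


bindings = [
    Binding("alice@example.com", "operator", "vm/dev-api"),
    Binding("bob@example.com", "viewer", "db/staging"),
]

print(is_allowed(bindings, "alice@example.com", "stop", "vm/dev-api"))   # granted by her role
print(is_allowed(bindings, "alice@example.com", "stop", "db/staging"))   # no binding on this resource
print(is_allowed(bindings, "bob@example.com", "delete", "db/staging"))   # role lacks the action
```

Because nothing is permitted unless a binding explicitly grants it, adding a user never widens access beyond the resources named in that user's bindings.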
In ParkMyCloud, we apply this with Teams and Roles. Admins can create Teams (equivalent to Projects, Applications, or Lines of Business) and invite a Team Lead to manage that PMC Team. The Team Lead can in turn grant users access and set permissions for them, which can then be automated based on policies, usually by leveraging tags, though you can use other metadata as well.
What if you want more flexibility with the cloud providers to both manage user access and to more tightly align your cloud services and usage to your organizational structure, projects and applications? Each of the major providers has designed ways for large enterprises to implement a hierarchy of cloud users and services that can look very similar to that enterprise’s organization chart. (If you can understand their jargon.)
How AWS, Azure, and Google Apply Cloud Governance Models
We dug into AWS, Azure and Google and this is what we found:
Amazon Web Services (AWS)
Tier 1: AWS Organization
Tier 2: Organization Unit
Tier 3: AWS Accounts
Tier 4: Tags
Microsoft Azure
Tier 1: Azure Enterprise Portal
Tier 2: Departments
Tier 3: Accounts
Tier 4: Subscriptions
Tier 5: Resource Groups
Tier 6: Tags
Google Cloud Platform (GCP)
Tier 1: Organization
Tier 2: Folders
Tier 3: Projects
Tier 4: Resources
Tier 5: Tags
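The tiers listed above can be captured as plain data, which makes it easy to compare providers or to map your own org chart onto them. This is a small sketch; the grouping of the second and third lists under Azure and Google Cloud is inferred from the tier names, and the helper names are illustrative.

```python
# Tiers in order from broadest to narrowest, as listed above.
GOVERNANCE_TIERS = {
    "AWS": ["AWS Organization", "Organization Unit", "AWS Accounts", "Tags"],
    "Azure": [
        "Azure Enterprise Portal",
        "Departments",
        "Accounts",
        "Subscriptions",
        "Resource Groups",
        "Tags",
    ],
    "Google Cloud": ["Organization", "Folders", "Projects", "Resources", "Tags"],
}


def tier_of(provider: str, level_name: str) -> int:
    """Return the 1-based tier of a level within a provider's hierarchy."""
    return GOVERNANCE_TIERS[provider].index(level_name) + 1


def describe(provider: str) -> str:
    """Render a provider's hierarchy as a single top-down path."""
    return " > ".join(GOVERNANCE_TIERS[provider])


for name in GOVERNANCE_TIERS:
    print(f"{name}: {describe(name)}")

print(tier_of("Azure", "Subscriptions"))
```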
Tips for implementing Cloud Governance Models:
Research and attend web sessions on these cloud governance models to ensure you understand the nuance
Implement your cloud provider’s latest hierarchies and governance models prior to mainstream cloud adoption in your organization
Make sure you run the hierarchies you plan to implement by CloudOps, ITOps, DevOps and FinOps to ensure proper organizational mapping and reporting
The cloud providers have done a pretty good job of documenting their roles, policies and hierarchies and creating graphical representations of the hierarchical structures in their cloud governance models. Of course, none of them use the same terminology – I mean, why would you, too easy, right? (And why does Google rank a ‘Folder’ above a ‘Project’?)
With these options available to you, your cloud operations team can make sure to use this to your advantage when planning new resources, accounts, and use cases within your organization. Let us know your thoughts and if you use any of these models to improve your cloud usage.
Cloud spend optimization is always top of mind for public cloud users. It’s usually up there with Security, Governance, and Compliance – and now in 2020, 73% of respondents to Flexera’s State of the Cloud report said that ‘Optimize existing use of cloud (cost savings)’ was their #1 initiative this year.
So – what the heck does that mean? There are many ways to spin it, and while “cost optimization” is broadly applicable, the strategies and tactics to get there will vary widely based on your organization and the maturity of your cloud use.
Having this discussion within enterprises can be challenging, and perspectives change depending on who you talk to within an organization – FinOps? CloudOps? ITOps? DevOps? And outside of operations, what about the Line of Business (LoB) or the Application owners? Maybe they don’t care about optimization in terms of cost but in terms of performance, so in reality optimization can mean something different to cloud owners and users based on their role and responsibility.
Ultimately though, there are a number of steps that are common no matter who you are. In order to facilitate this discussion and understand where enterprises are in their cloud cost optimization journey, we created a framework called the Cloud Cost Optimization Maturity Curve to identify these common steps.
Cloud Spend Optimization Maturity Curve
While cloud users could be doing any combination of these actions, this is a representation of actions you can take to control cloud spend in order of complexity. For example, Visibility in and of itself does not necessarily save you money but can help identify areas ripe for optimization based on data. And taking scaling actions on IaaS may or may not save you money, but may help you improve application performance through better resource allocation, scaling either up (more $$$) or down (less $$$).
Let’s dig into each in a little more detail:
Visibility – visibility of all costs across clouds, accounts, and applications. This is cloud cost management 1.0, the ability to see cost data better through budgeting, chargeback, and showback.
Schedule suspend – turn off idle resources like virtual machines, databases, scale groups, and container services when not being used, such as nights and weekends based on usage data. This is most common for non-production resources but can have a big bang in terms of savings – 65% savings is a good target that many ParkMyCloud customers achieve even during a free trial.
Delete unused resources – this includes identifying orphaned resources and volumes and then deleting them. Even though you may not be using them, your cloud provider is still charging you for them.
Sizing IaaS (non-production) – many enterprises overprovision their non-production resources and use only 5-10% of the capacity of a given resource, meaning 90% or more is unused (really!). By leveraging usage data, you can get recommendations to resize those underutilized resources and save 50% or more.
RI / Savings Plan Management – AWS, Azure, and Google provide the ability to pre-buy capacity and get discounts ranging from 20-60% based on your commitments in both spend and terms. While the savings make it worthwhile, this is not a simple process (though it’s improved with AWS’s savings plans) and requires a very good understanding of the services you will need 12-36 months out.
Scaling IaaS (prod) – this requires collecting data and understanding both the infrastructure and application layers and taking sizing actions up or down to improve both performance and cost. Taking these actions on production resources requires strong communication between Operations and LoB.
Optimizing PaaS – virtual machines, databases, and storage are all physical in nature and can be turned off and resized. PaaS optimization tops the maturity curve because many PaaS services have to be optimized in other ways, like scaling the service up/down based on usage or rearchitecting parts of your application.
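Of the steps above, "schedule suspend" is the simplest to express in code. The following is a minimal sketch of a nights-and-weekends rule, assuming working hours of 07:00 to 19:00 on weekdays and naive local timestamps with no time zone handling. The function name `should_be_parked` is hypothetical, and the sketch does not call any cloud provider API to actually stop resources.

```python
from datetime import datetime, time


def should_be_parked(
    now: datetime,
    start: time = time(7, 0),
    end: time = time(19, 0),
) -> bool:
    """True when a non-production resource should be off.

    Resources are parked all day on weekends, and outside the
    start-to-end window on weekdays (end is exclusive).
    """
    if now.weekday() >= 5:  # Saturday is 5, Sunday is 6
        return True
    return not (start <= now.time() < end)


# 3 June 2020 was a Wednesday; 6 June 2020 was a Saturday.
print(should_be_parked(datetime(2020, 6, 3, 10, 0)))  # weekday, working hours
print(should_be_parked(datetime(2020, 6, 3, 22, 0)))  # weekday night
print(should_be_parked(datetime(2020, 6, 6, 12, 0)))  # weekend
```

A rule like this keeps resources running 12 hours per weekday, which is the schedule behind the 65% savings target mentioned above.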
For more ways to reduce costs, check out the cloud waste checklist for 26 steps to take to eliminate wasted spend at a more granular level.