With $39.5 billion projected to be spent on Infrastructure as a Service (IaaS) this year, many cloud users will find it’s time to optimize spend with an IaaS cost management tool. With so many options to choose from, picking the right one can be overwhelming. As you evaluate, keep in mind what would work best for you and your organization. To cut cloud costs and waste, make sure you look for these 5 things while picking an IaaS cost management tool.
1. UI is Easy to Understand
When adopting a new piece of software, you should not have to struggle to figure out how it works. It should be designed around the end user, giving them an easy experience so they can accomplish tasks quickly. Many of the native tools provided by the cloud providers require specialized coding knowledge that the IaaS users in your organization may not have. A tool is only useful if it is simple enough to follow that every cloud user can contribute to the task of managing IaaS cost.
2. Improved Visibility
It is essential that you have all of your information available in one place – this helps ensure you don’t overlook anything. Seeing all your resources on one screen, all at once, allows you to pinpoint the strengths and weaknesses you need to focus on to manage your IaaS cost. Of course, cost management includes more than visibility, which leads to the next points.
3. Provides Reporting
You want your organization to be well informed, so it is important that any IaaS cost management tool you adopt includes the ability to generate cost and savings reports. You can’t change what you can’t measure: the data gathered – past and present – will help you understand what has happened and forecast what will. These reports give you the information you need to make quick, informed decisions. Preferably, they also contain automated recommendations based on your resource utilization history and patterns. Additionally, it’s important for any cost optimization tool to report on the amount of money you have saved using it, so you can justify the cost of the tool to your management or Finance department as needed.
4. Implements Actions
After gathering the data and making suggestions, the next step in cost optimization is to actually make these changes. Using the reports and data gathered, the tool should be able to manage your resources and implement any necessary changes without you having to do anything.
5. Automation and APIs
Even though they work in the background, APIs are necessary because they allow your tool to work in conjunction with your other operations. With support for inbound actions and outbound notifications, this automation allows you to streamline your data – making things faster and more efficient, and cutting down on both time and IaaS cost. Highlights to look for include Single Sign-On, ChatOps integrations, and a well-documented API.
Keep Your Organization’s IaaS Cost Needs in Mind
These are just a few of the things you should look for when searching for an IaaS cost optimization tool – ultimately, you have to find the platform that works best for you!
ParkMyCloud automatically optimizes your IaaS costs with these principles in mind – try it out with a 14-day free trial and see if it’s the right fit for you.
When adopting or optimizing your public cloud use, it’s important to eliminate wasted spend from idle resources – which is why you need to include an instance scheduler in your plan. An instance scheduler ensures that non-production resources – those used for development, staging, testing, and QA – are stopped when they’re not being used, so you aren’t charged for compute time you’re not actually using.
AWS, Azure, and Google Cloud each offer an instance scheduler option. Will these fit your needs – or will you need something more robust? Let’s take a look at the offerings and see the benefits and drawbacks of each.
AWS Instance Scheduler
AWS has a solution called the AWS Instance Scheduler. AWS provides a CloudFormation template that deploys all the infrastructure needed to schedule EC2 and RDS instances. This infrastructure includes DynamoDB tables, Lambda functions, and CloudWatch alarms and metrics, and relies on tagging of instances to shut down and turn on the resources.
The AWS Instance Scheduler is fairly robust in that it allows you to have multiple schedules, override those schedules, connect to other AWS accounts, temporarily resize instances, and manage both EC2 instances and RDS databases. However, that management is done exclusively through editing DynamoDB table entries, which is not the most user-friendly experience. All of those settings in DynamoDB are applied to instances via tags, which is good if your organization is tag-savvy, but can be a problem if not all users have access to change tags.
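The tag-driven approach is easy to picture. Here is a minimal Python sketch of how a schedule tag maps to a run/stop decision — the schedule name, field names, and hours are illustrative, not the actual DynamoDB schema the AWS Instance Scheduler uses:

```python
from datetime import datetime

# Hypothetical schedule definitions, loosely mirroring the period entries
# the Instance Scheduler keeps in DynamoDB (fields are illustrative).
SCHEDULES = {
    "office-hours": {"days": {0, 1, 2, 3, 4}, "start_hour": 8, "stop_hour": 18},
}

def should_run(instance_tags, now):
    """Decide whether a tagged instance should be running at `now`."""
    schedule = SCHEDULES.get(instance_tags.get("Schedule"))
    if schedule is None:
        return True  # untagged instances are left alone
    in_days = now.weekday() in schedule["days"]
    in_hours = schedule["start_hour"] <= now.hour < schedule["stop_hour"]
    return in_days and in_hours

# A Tuesday at 10:00 falls inside office hours; a Saturday does not.
print(should_run({"Schedule": "office-hours"}, datetime(2019, 1, 8, 10, 0)))   # True
print(should_run({"Schedule": "office-hours"}, datetime(2019, 1, 12, 10, 0)))  # False
```

In the real solution, a Lambda function runs this kind of check on a CloudWatch schedule and calls the EC2/RDS start and stop APIs accordingly.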
If you will have multiple users adding and updating schedules, the Instance Scheduler does not provide good auditing or multi-user capabilities. You’ll want to strongly consider an alternative.
Microsoft Azure Automation
Microsoft has a feature called Azure Automation, which includes multiple solutions for VM management. One of those solutions is “Start/Stop VMs during off-hours”, which deploys runbooks, schedules, and log analytics in your Azure subscription for managing instances. Configuration is done in the runbook parameters and variables, and email notifications can be sent for each schedule.
This solution steps you through the setup for start and stop timing, along with email configuration and the target VMs. However, multiple schedules require multiple deployments of the solution, and connecting to additional Azure subscriptions requires even more deployments. It does include the ability to order or sequence your start/stop actions, which can be very helpful for multi-component applications, but there’s no option for temporary overrides and no UI for self-service management. One really nice feature, which the other tools don’t provide, is the ability to recognize when instances are idle and automatically stop them after a set time period.
Google Cloud Scheduler
Google also has packaged some of their Cloud components together into a Google Cloud Scheduler. This includes usage of Google Cloud Functions for running the scripts, Google Cloud Pub/Sub messages for driving the actions, and Google Cloud Scheduler Jobs to actually kick-off the start and stop for the VMs. Unlike AWS and Azure, this requires individual setup (instead of being packaged into a deployment), but the documentation takes you step-by-step through the process.
Google Cloud Scheduler relies on instance names instead of tags by default, though the functions are all made available for you to modify as you need. The settings are all built into those functions, which makes updating or modifying much more complicated than the other services. There’s also no real UI available, and the out-of-the-box experience is fairly limited in scope.
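To make the name-based selection concrete, the selection step in Google's sample functions boils down to string matching on instance names. A rough sketch — the instance names and prefix convention here are hypothetical:

```python
# Selecting VMs by naming convention rather than tags/labels.
def instances_to_stop(instance_names, prefix="dev-"):
    """Return the names of instances matching the naming convention."""
    return [name for name in instance_names if name.startswith(prefix)]

fleet = ["dev-web-1", "dev-db-1", "prod-web-1"]
print(instances_to_stop(fleet))  # ['dev-web-1', 'dev-db-1']
```

Because the convention is baked into the function code, changing which VMs are scheduled means editing and redeploying the function — which is the maintainability problem described above.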
Cloud Native or Third Party?
Each of the instance scheduler tools provided by the cloud providers has a few limitations. One possible dealbreaker is that none of these tools are multi-cloud capable, so if your organization uses multiple public clouds then you may need to go for a third-party tool. They also don’t provide a self-service UI, built-in RBAC capabilities, Single Sign-On, or reporting capabilities. When it comes to cost, all of these tools are “free”, but you end up paying for the deployed infrastructure and services that are used, so the cost can be very hard to pin down.
We built ParkMyCloud to solve the instance scheduler problem (now with rightsizing too). Here’s how the functionality stacks up against the cloud-native options:
[Table: feature-by-feature comparison of ParkMyCloud against the AWS Instance Scheduler, Microsoft Azure Automation, and Google Cloud Scheduler, covering features such as virtual machine scheduling and scale set scheduling.]
Overall, the cloud-native instance scheduler tools can help you get started on your cost-saving journey, but may not fulfill your longer-term requirements due to their limitations.
Try ParkMyCloud with a free trial — we think you’ll find that it meets your needs in the long run.
If you ask a group of CIOs or analysts for a list of priorities for companies adopting cloud infrastructure, there’s no doubt that cloud visibility would be named near the top. Insight is important for everything from security to cost management. But cloud visibility on its own is not enough, particularly as widespread cloud usage continues to mature.
Don’t Get Us Wrong: Cloud Visibility is Important
Cloud visibility is a broad term, encompassing resource consumption and spend, security and regulatory compliance, and monitoring. In fact, cloud “monitoring” is a term that typically encompasses performance monitoring and security. This is certainly important: some projections show the cloud monitoring market reaching $3.9 billion in 2026, so there is obviously demand for these tools.
Another aspect is cost. Cloud cost visibility is a hot topic right now, and with good reason. Public cloud providers’ bills are confusing, and you need to be able to understand what you’re being charged for. It’s also important to see where your spend is going, ideally with slice-and-dice reporting so you can analyze by user, team, project, and resource type, and ensure internal chargeback based on consumption.
However, in terms of resource and cost management, cloud visibility alone is not enough to make change.
Cloud Visibility is Useless without Action
There’s a reason that this time of year, self-help gurus encourage resolution makers to make their goals actionable. Aspirations are great. Knowledge is great. But without practical application, aspirations and knowledge won’t lead to change.
When it comes to cloud cost management, there are several capabilities that you need in order to capitalize on the insights gained through visibility. Three important ones to keep in mind are:
The ability to allocate costs to teams.
The ability to automate remediation.
The ability to optimize spending.
The popular cloud cost management tools tend to be strong on some combination of analytics, reporting dashboards, chargeback/showback, budget allocation, governance, and recommendations (which can get quite granular in areas such as reserved instances and orphaned resources). However, they require external tools or people to act upon these recommendations and lack automation.
Actionable is Good. Optimization is Better.
As you research cloud visibility and monitoring solutions to address knowledge gaps in your organization, be sure to include a requirement to address cloud waste. Cloud optimization should require little to no manual work on your part by integrating into your cloud operations, allowing you to automatically reap the benefits and savings.
Here’s a first step on your optimization journey: pick a cloud account, plug it into ParkMyCloud, and get immediate recommendations for cost reduction. Click to apply the recommendations – or set a policy to do it automatically – and see the savings start to add up.
It’s that time of year: new gym memberships, fresh diet goals, and plans to reform… cloud spending?
If you’re at all involved in your organization’s public cloud infrastructure, that last one should definitely be on your to-do list. Chances are, if you’re spending money on cloud, some of that money is being wasted. For some, a lot of that money is being wasted. Here are the numbers.
Predicted Cloud Spending 2019
The latest predictions from Gartner estimate that overall IT spending will reach $3.8 trillion this year, a growth of 3.2% over IT spending in 2018.
Of this spend, public cloud spending is expected to reach $206.2 billion — of which, the fastest growing segment is Infrastructure as a Service (IaaS) which Gartner says will grow 27.6 percent in 2019 to reach $39.5 billion, up from $31 billion in 2018.
Now we can subdivide that IaaS spend number further to look just at compute resources — typically about ⅔ of IaaS spend is on compute, or about $26.3 billion. This segment of spend is especially vulnerable to waste, particularly from idle resources and oversized resources.
Wasted Cloud Spending from Idle Resources
Let’s first take a look at idle resources — resources that are being paid for by the hour or minute, but are not actually being used. Typically, this kind of waste occurs in non-production environments – that is, those used for development, testing, staging, and QA. About 44% of compute spend is on non-production resources (that’s our number).
Most non-production resources are only used during a 40-hour work week, and do not need to run 24/7. That means that for the other 128 hours of the week (76%), the resources sit idle, but are still paid for.
So what we get is:
$26.3 billion in compute spend * 0.44 non-production * 0.76 of week idle = $8.8 billion wasted on idle cloud resources
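Working through that arithmetic in code (all inputs are the article's own figures, in billions of dollars):

```python
# Idle-resource waste estimate, using the figures cited above.
compute_spend = 39.5 * (2 / 3)       # ~$26.3B of IaaS spend goes to compute
non_production = 0.44                # share of compute spend (our number)
idle_fraction = (168 - 40) / 168     # 128 of 168 weekly hours idle, ~76%

idle_waste = compute_spend * non_production * idle_fraction
print(round(idle_waste, 1))          # ~8.8, i.e. ~$8.8 billion
```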
Wasted Cloud Spending from Oversized Resources
The other source of wasted cloud spend is oversized infrastructure — that is, paying for resources at a larger capacity than needed.
RightScale found that 40% of instances were sized at least one size larger than needed for their workloads. Just by reducing an instance by one size, the cost is reduced by 50%. Downsizing by two sizes saves 75%.
The data we see in our users’ infrastructure in the ParkMyCloud platform confirms this, and in fact we find that it may even be a conservative estimate. Infrastructure managed in our platform has an average CPU utilization of 4.9%. Of course, this doesn’t take memory into account, and could be skewed by the fact that resources managed in ParkMyCloud are more commonly non-production. However, it still paints a picture of gross underutilization, ripe for rightsizing and optimization.
If we take a conservative estimate of 40% of resources oversized by just one size, we find the following:
$26.3 billion in compute spend * 0.4 oversized * 0.5 overspend per oversized resource = $5.3 billion wasted on oversized resources
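The oversizing estimate follows the same pattern; combined with the ~$8.8 billion of idle waste above, it produces the ~$14.1 billion total:

```python
# Oversized-resource waste estimate, using the figures cited above.
compute_spend = 26.3      # billions; ~2/3 of $39.5B IaaS spend
oversized_share = 0.4     # RightScale: 40% of instances at least one size too big
overspend = 0.5           # one size down halves the cost

oversized_waste = compute_spend * oversized_share * overspend
print(round(oversized_waste, 2))              # 5.26, i.e. ~$5.3 billion

idle_waste = 8.8                              # from the idle-resource estimate
print(round(oversized_waste + idle_waste, 1)) # ~14.1 billion total
```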
Total Cloud Spending to be Wasted in 2019
Between idle resources and overprovisioning, wasted cloud spend will exceed $14.1 billion in 2019.
In fact, this estimation of wasted cloud spend is probably low. This calculation doesn’t even account for waste accumulated through orphaned resources, suboptimal pricing options, misuse of reserved instances, and more.
End the Waste
It’s time to fight this cloud waste. That’s what we’re all about at ParkMyCloud — eliminating wasted cloud spending through scheduling, rightsizing, and optimization.
Ready to join us and become a cloud waste killer? Let’s do it.
Server sizing in the cloud can be tricky. Unless you are about to do some massive high-performance computing project, super-sizing your cloud virtual machines/instances is probably not what you are thinking about when you log in to your favorite cloud service provider. But from looking at customer data within our system, it certainly does look like a lot of folks are walking up to their neighborhood cloud provider and saying exactly that: Super Size Me!
Like at a fast-food place, buying in the super size means paying extra costs…and when you are looking for ways to save money on cloud costs, whether for production or non-production resources, the first place to look is at idle and underutilized resources.
Within the ParkMyCloud SaaS platform, we have collected bazillions (scientific term) of samples of performance data for tens of thousands of virtual machines, across hundreds of customers, and the average of all “Average CPU” readings is an amazing (even to us) 4.9%. When you consider that many of our customers are already addressing underutilization by stopping or “parking” their instances when they are not being used, one can easily conclude that server sizing is out of control and instances are tremendously overbuilt. In other words, they are much more powerful than they need to be…and thus much more expensive than they need to be. As cool as “super sizing” sounds, the real solution is rightsizing: ensuring the instance size and type are better tailored to the actual load.
Before we start talking about what is involved in rightsizing, let’s look at a few more statistics, just because the numbers are pretty cool. Looking at utilization data from about 88.9 million instance-hours on AWS – that’s 10,148 years – we find the following:
So, what is this telling us about server sizing? The percentiles alone tell us that more than 95% of our samples are operating at less than 50% Average CPU – which means if we cut the number of CPUs in half for most of our instances, we would probably still be able to carry our workload. The 95th percentile for Peak CPU is 58%, so if we cut all of those CPUs in half we would either have to be OK with a small degradation in performance, or maybe we select an instance to avoid exceeding 99% peak CPU (which happens around the 93rd percentile – still a pretty massive number).
Looking down at the 75th and 50th percentiles we see instances that could possibly benefit from multiple steps down! As shown in the next section, one step down can save you 50% of the cost for an instance. Two steps down can save you 75%!
Before making an actual server sizing change, this data would need to be further analyzed on an instance by instance basis – it may be that many of these instances have bursty behavior, where their CPUs are more highly utilized for short periods of time, and idle all the rest of the time. Such an instance would be better off being parked or stopped for most of the time, and only started up when needed. Or…depending on the duration and magnitude of the burst, might be better off moving to the AWS T instance family, which accumulates credits for bursts of CPU, and is less expensive than the M family, built for a more continuous performance duty cycle. Also – as discussed below – complete rightsizing would entail looking at some other utilization stats as well, like memory, network, etc.
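The instance-by-instance triage described above can be sketched as a simple decision rule: consistently low CPU (even at peak) points to rightsizing, while a low average with high bursts points to parking or a burstable family. The thresholds below are illustrative only, not ParkMyCloud's actual rules:

```python
# Toy triage of an instance based on its CPU utilization history.
def recommend(avg_cpu, peak_cpu):
    """Return a hypothetical optimization action for one instance."""
    if peak_cpu < 50:
        return "rightsize"          # even the peaks would fit in half the CPUs
    if avg_cpu < 20:
        return "park or burstable"  # mostly idle, with occasional bursts
    return "leave as-is"            # genuinely busy instance

print(recommend(avg_cpu=5, peak_cpu=30))   # rightsize
print(recommend(avg_cpu=5, peak_cpu=90))   # park or burstable
print(recommend(avg_cpu=60, peak_cpu=95))  # leave as-is
```

A real analysis would also weigh memory, network, and disk utilization, as discussed below.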
On every cloud provider there is a clear progression of server sizing and prices within any given instance family. The next size up from where you are is usually twice the CPUs and twice the memory, and as might be expected, twice the price.
Here is a small sample of AWS prices in us-east-1 (N. Virginia) to show you what I mean:
Double the memory and/or double the CPU…and double the price.
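Because each size tier roughly doubles the price, the savings from stepping down compound geometrically — which is where the 50% and 75% figures above come from:

```python
# Savings from stepping down n size tiers within a family,
# assuming each tier doubles the price of the one below it.
def downsizing_savings(steps_down):
    return 1 - 0.5 ** steps_down

print(f"{downsizing_savings(1):.0%}")  # 50%
print(f"{downsizing_savings(2):.0%}")  # 75%
```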
It is important to note that there is more to instance utilization than just the CPU stats. There are a number of applications with low-CPU but high network, memory, disk utilization, or database IOPs, and so a complete set of stats are needed before making a rightsizing decision.
This can be where rightsizing across instance families makes sense.
On AWS, some of the most commonly used instance types are the T and M general purpose families. Many production applications start out on the M family, as it has a good balance of CPU and memory. Let’s look at the m5.4xlarge as a specific example, shown in the middle row below.
If such an instance shows good utilization of its CPU – say an Average CPU of 75% and Peak CPU of 95% – but the memory is extremely underutilized, maybe only 20% consumed, we may want to move to a compute-optimized instance family. From the table below, we can see we could move to a c5.4xlarge, keeping the same number of CPUs but cutting the RAM in half, saving about 11% of our costs.
On the other hand, if the CPU is significantly underutilized – for example an Average CPU of 30% and Peak of 45% – but memory is 85% utilized, we may be better off on a memory-optimized instance family. From the table below, we can move to an r5.2xlarge instance, cutting the vCPUs in half while keeping the same amount of RAM, and saving about 34% of the costs.
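The cross-family comparison works out as follows. The hourly on-demand prices below are approximate us-east-1 figures from around the time of writing — they drift over time, so treat them as illustrative rather than authoritative:

```python
# Approximate us-east-1 on-demand prices (USD/hour); verify current
# pricing before acting on these numbers.
prices = {
    "m5.4xlarge": 0.768,   # 16 vCPU, 64 GiB (general purpose)
    "c5.4xlarge": 0.680,   # 16 vCPU, 32 GiB (compute-optimized)
    "r5.2xlarge": 0.504,   #  8 vCPU, 64 GiB (memory-optimized)
}

def savings(from_type, to_type):
    """Fractional cost saving from moving between instance types."""
    return 1 - prices[to_type] / prices[from_type]

print(f"{savings('m5.4xlarge', 'c5.4xlarge'):.0%}")  # ~11%
print(f"{savings('m5.4xlarge', 'r5.2xlarge'):.0%}")  # ~34%
```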
Within AWS there are additional considerations on the network side. As shown here, available network performance follows the instance size and type. You may find yourself in a situation where memory and CPU are non-issues, but high network bandwidth is critical, and deliberately super-size an instance. Even in this case, though, you should think about whether there is a way to split your workload into multiple smaller instances (and thus multiple network streams) that are less expensive than a beastly machine selected solely on the basis of network performance.
You may also need to consider availability when determining your server sizing. For example, if you need to run in a high-availability mode using an autoscaling group you may be running two instances, either one of which can handle your full load, but both are only 50% active at any given time. As long as they are only 50% active that is fine – but you may want to consider if maybe two instances at half the size would be OK, and then address a surge in load by scaling-up the autoscaling group.
For full cost optimization for your virtual machines, you need to consider appropriate resource scheduling, server sizing, and sustained usage.
Rightsize instances wherever possible. You can easily save 50% just by going down one size tier – and this applies to production resources as well as development and test systems!
Modernize your instance types. This is similar to rightsizing, in that you are changing to the same instance type in a newer generation of the same family, where cloud provider efficiency improvements mean lower costs. For example, moving an application from an m3.xlarge to an m5.xlarge can save 28%!
Park/stop instances when they are not in use. You can save 65% of the cost of a development or test virtual machine by just having it on 12 hours per day on weekdays!
For systems that must be up continually (and once you have settled on the correct instance size), consider purchasing reserved instances, which can save 54-75% off the regular cost. If you would like a review of your resource usage to see where you can best utilize reserved instances, please let us know.
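The parking figure in the list above comes straight from weekly hours: a VM that runs 12 hours a day on weekdays is on for only 60 of the week's 168 hours, so you avoid paying for the other 108 — just over 64%, which rounds to the ~65% quoted:

```python
# Savings from running a dev/test VM 12 hours/day, weekdays only.
hours_on = 12 * 5               # 60 hours per week
savings = 1 - hours_on / (24 * 7)
print(f"{savings:.0%}")          # ~64% of the always-on cost avoided
```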
Last week, ParkMyCloud released the ability to rightsize and modernize instances. This release helps you identify virtual machine and database instances that are underutilized or on an older family, makes smart recommendations for better server sizing and/or family selection based on the level of utilization, and then lets you execute the rightsizing action. We will also be adding a feature for scheduled rightsizing, allowing you to maintain instance continuity while reducing its size during periods of lower utilization.
Amazon Web Services (AWS) has been pumping out announcements in the lead up to their AWS re:Invent conference next week – which is predicted to exceed 50,000 attendees this year. (See you there?) We’re excited to see what big news the cloud giant has for us next week!
In the meantime, here are three AWS announcements from the last few days that will interest anyone who’s concerned with cloud costs.
Predictive Scaling for EC2
AWS’s new predictive scaling for EC2 is a new and improved way to use Auto Scaling to optimize costs. Typically when you set up an Auto Scaling Group, you need to set scaling policies, such as rules for launching instances based on changes in capacity. Given the complexity of these requirements, some users we’ve talked to forgo them altogether, instead using Auto Scaling simply for instance health checks and replacements.
With predictive scaling for EC2, there is very little the user needs to set up. You will simply set up the group, and machine learning models will analyze daily and weekly scaling patterns to predictively scale. You’ll have choices to optimize for availability, or optimize for cost – making it easy to use Auto Scaling to save money. Of course, sometimes you’ll know better than the machine – for example, development and test instances may require on/off or scale-up/scale-down schedules based on when users need them, which won’t always be consistent. For that, use ParkMyCloud to schedule auto scaling groups to turn off or change scaling when you know they will have little or no utilization.
AWS Cost Explorer Forecasting
AWS has announced an improved forecasting engine for the AWS Cost Explorer. It now breaks down historical data based on charge type – distinguishing between On Demand and Reserved Instance charges – and applies machine learning to predict future spend.
They have extended the prediction range from three months to twelve months, which will certainly be of use for budget forecasting. It’s also accessible via the API – we see this being used to show budget predictions on team dashboards in your office, among other applications.
CloudWatch Automatic Dashboards
The third announcement from this week that we’re looking forward to using ourselves here at ParkMyCloud is the new series of CloudWatch Automatic Dashboards. This will make it remarkably easier to navigate through your CloudWatch metrics and monitor costs and performance, and help potential issues break through the noise.
Now, play around with AWS’s new predictive scaling for EC2, then take some time to relax.
Happy Thanksgiving! (And to our non-U.S. readers, enjoy your Thursday!)