Microsoft Azure recently announced an addition designed to help with Azure chargeback: cost allocation, now in preview in Azure Cost Management + Billing. We’re always glad to see cloud providers making an effort to improve their native cost management capabilities for customers, so here’s a quick look at this update.
Chargeback for Cost Accountability
Cost allocation for cloud services is an ongoing challenge. Depending on organizational structure and decisions about billing and budgets, every organization will handle it a bit differently. In some cases, separating by Azure subscription can make this easier, but in others, your organization may have shared costs such as networking or databases that need to be divided by business unit or customer. However it is handled, cost allocation is an obstacle that organizations must address in order to gain visibility, correct inefficiencies, and climb the cloud spend optimization curve to the point of actually taking action to reduce and optimize costs.
Many IT organizations address this via an Azure chargeback setup, in which the IT department provisions and delivers services, and each department or group submits internal payment back to IT based on usage. Thus, it becomes an exercise in determining how to tag and define “usage”.
In some cases, showback can be used as an alternative or stepping stone toward chargeback. The content and dollar amounts are the same – but without the accountability driven by chargeback. For this reason, it can be difficult to motivate teams to reduce costs with showback alone. We have heard of teams using a variation on showback: “shameback”. IT can take the costs they’re showing back and gamify savings, coupled with a public shame/reward mechanism, to drive cost-saving behavior.
What Azure Added with the Preview Cost Allocation Capabilities
The cost allocation capabilities are currently in preview for Enterprise Agreement (EA) and Microsoft Customer Agreement (MCA) accounts. They allow users to identify the costs that need to be split by subscription, resource group, or tag. You can then choose to move them and allocate them in any of the following ways: distribute evenly; distribute proportional to total costs; distribute proportional to network, compute, or storage costs; or choose a custom distribution percentage.
Cost allocation does not affect your Azure invoice, and costs must stay within the original billing account. So, Azure did not actually add chargeback, but they did add visualization and reporting tools to facilitate chargeback processes within your organization, outside of Azure.
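To make the distribution options concrete, here is a minimal sketch of the arithmetic behind three of them: an even split, a split proportional to each target's own costs, and a custom percentage split. This is our own illustration, not Azure's implementation; the function names, target names, and dollar figures are all hypothetical.

```python
from decimal import Decimal


def allocate_evenly(shared_cost, targets):
    """Split shared_cost into equal shares, one per target."""
    share = shared_cost / len(targets)
    return {name: share for name in targets}


def allocate_proportionally(shared_cost, target_costs):
    """Split shared_cost in proportion to each target's own cost."""
    total = sum(target_costs.values())
    return {
        name: shared_cost * cost / total
        for name, cost in target_costs.items()
    }


def allocate_custom(shared_cost, percentages):
    """Split shared_cost by explicit percentages, which must sum to 100."""
    if sum(percentages.values()) != 100:
        raise ValueError("custom percentages must sum to 100")
    return {
        name: shared_cost * pct / 100
        for name, pct in percentages.items()
    }


# Hypothetical example: a $900 shared networking cost and three teams.
shared = Decimal("900")
team_costs = {
    "web": Decimal("1000"),
    "data": Decimal("3000"),
    "ml": Decimal("2000"),
}

print(allocate_evenly(shared, list(team_costs)))
print(allocate_proportionally(shared, team_costs))
print(allocate_custom(shared, {"web": 50, "data": 30, "ml": 20}))
```

Whichever rule you pick, the shares add back up to the original shared cost, which is what lets the result feed a chargeback report outside of Azure.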
Improvements in the Right Direction – or Too Little, Too Late?
Azure and AWS are slowly iterating and improving on their cost visibility, reporting, and management capabilities – but for many customers, it’s too little, too late. The lack of visibility and reporting within the cloud providers’ native offerings is what has led to many of the third-party platforms in the market. We suspect there is still a way to go before customers’ billing and reporting needs are fully met by the CSPs themselves.
And of course, for organizations with a multi-cloud presence, the cloud costs generally need to be managed separately or via a third-party tool. There are some movements within the CSPs to at least acknowledge that their customers are using multiple providers, particularly on the part of Google Cloud. Azure Cost Management has done so in part as well, with the AWS connector addition to the platform, but it’s unclear whether the 1% charge of managed AWS spend is worth the price – especially when you may be able to pay a similar amount for specialized tools that have more features.
G2 Cloud Cost Management Fall 2020 Report Ranks ParkMyCloud First in Usability, Relationship, and Implementation
September 25, 2020 (Dulles, VA) – ParkMyCloud, provider of the leading enterprise platform for continuous cost control in public cloud, was rated #1 in user satisfaction in G2’s Cloud Cost Management Fall 2020 report, published this week. ParkMyCloud was also selected as the #1 product in Usability, Relationship Index, and Implementation for the second consecutive quarter.
“At ParkMyCloud, we aim to deliver the best way to automate cloud cost optimization by finding and eliminating cloud waste, focusing on self-service accessibility for all cloud users in the enterprise,” said ParkMyCloud VP Jay Chapel. “I’d like to thank our customers for sharing their feedback. Not only is this helpful for other potential users, but for driving the direction of the product as well. We review all feedback and use it to innovate and improve so we can create the best cloud cost optimization platform and experience for all public cloud users.”
With more than one million reviews of business software, G2 is a trusted authority for business professionals making purchasing decisions. Its quarterly reports are based on reviews by real, verified users, who provide unbiased ratings on user satisfaction, features, usability, and more.
In the report, ParkMyCloud earned the leading satisfaction score at 94%, as well as 91% satisfaction in ease of administration, 93% in ease of doing business, 91% in ease of use, and 90% in quality of support, with 91% of users likely to recommend the product. ParkMyCloud was also rated as providing the fastest ROI of any product in the category.
Customer reviews on G2 highlight the amount of savings, the usefulness of automated recommendations, integrations with external tools, ease of training new users, reporting, and easy-to-use UI. One recent reviewer stated, “We have saved tens of thousands of dollars … without any loss of productivity.”
ParkMyCloud, a Turbonomic company, provides a self-service SaaS platform that helps enterprises automatically identify and eliminate wasted cloud spend. More than 1,500 enterprises around the world – including Sysco, Workfront, Hitachi ID Systems, Sage Software, and National Geographic – trust ParkMyCloud to cut their cloud spend by tens of millions of dollars annually. ParkMyCloud allows enterprises to easily manage, govern, and optimize their spend across multiple public clouds. For more information, visit www.parkmycloud.com.
Katy Stalcup Director of Marketing, ParkMyCloud (571) 748-5093 email@example.com
It sounds obvious when you first say it: you can scale AWS ASGs (Auto Scaling Groups) down to zero. This can be a cost-savings measure: zero servers means zero cost. But most people do not do this!
Wait – Why Would You Want to?
Maybe you’ve heard the DevOps saying: servers should be cattle, not pets. This outlook would say that you should have no single server that is indispensable – a special “pet”. Instead, servers should be completely replaceable and important only in herd format, like cattle. One way to adhere to this framework is by creating all servers in groups.
Some of our customers follow this principle: they use Auto Scaling Groups for everything. When they create a new app, they create a new ASG – even if it has a group size of one. This can remove obstacles to scaling up in the future. However, it leaves these users with built-in wasted spend.
Here’s a common scenario: a production environment is built with Auto Scaling Groups of EC2 instances and RDS databases. A developer or QA specialist copies production to their testing or staging environment, and soon enough, there are three or four environments of ASGs with huge servers and databases mimicking production, all running, and costing money, when no one is using them.
By setting an on/off schedule on your Auto Scaling Groups, you can scale them down by setting the min/max/desired number of instances to “0” overnight, on weekends, or whenever else these groups are not needed.
In essence, this is just like parking a single EC2 instance when not in use. Even for an EC2 instance, users are unlikely to go into the AWS console at the end of a workday to turn off their non-production servers overnight. For ASGs, it’s even less likely. Where stopping an EC2 instance takes a single right-click, an AWS ASG requires you to go to the ASG settings, edit them, and modify the min/max/desired number of instances – and then remember to do the opposite when you need to turn the group back on.
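For readers who want to script the manual steps themselves, the following is a minimal sketch of scaling a group to zero and restoring it later. It assumes `client` behaves like a boto3 Auto Scaling client (for example `boto3.client("autoscaling")`); the helper names `park_asg` and `unpark_asg` are hypothetical, and this is not how ParkMyCloud itself is implemented.

```python
SIZE_KEYS = ("MinSize", "MaxSize", "DesiredCapacity")


def park_asg(client, group_name):
    """Scale an Auto Scaling Group to zero and return its previous sizes.

    `client` is assumed to expose the boto3 Auto Scaling client methods
    describe_auto_scaling_groups and update_auto_scaling_group.
    """
    response = client.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    group = response["AutoScalingGroups"][0]
    # Remember the current sizes so the group can be restored later.
    saved = {key: group[key] for key in SIZE_KEYS}
    client.update_auto_scaling_group(
        AutoScalingGroupName=group_name,
        MinSize=0,
        MaxSize=0,
        DesiredCapacity=0,
    )
    return saved


def unpark_asg(client, group_name, saved):
    """Restore the min/max/desired sizes captured by park_asg."""
    client.update_auto_scaling_group(
        AutoScalingGroupName=group_name, **saved
    )
```

The saved sizes need to live somewhere between the two calls, such as a tag on the group or a small datastore. That bookkeeping, plus the schedule itself, is the part people tend to forget when doing this by hand.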
How You Can “Scale to Zero” in Practice
One ParkMyCloud customer, Workfront, is using this daily to keep costs in check. Here’s how Rob Weaver described it in a recent interview with Corey Quinn:
Scaling environments are a perfect example. If we left scaling up the entire time – 24/7 – it would cost as much as a production instance. It’s a full set of databases, application servers, everything. For that one, we’ve got it set so the QA engineers push a button [in ParkMyCloud], they start it up. For a certain amount of time before it shuts back down.
In other cases, we’ve got people who go in and use the [ParkMyCloud] UI, push the little toggle that says “turn this on”, choose how long to turn it on, and they’re done.
How else does Workfront apply ParkMyCloud’s automation to reduce costs for a 5X ROI? Find out here.
Another Fun Fact About AWS ASGs
One gripe some users have about Auto Scaling Groups is that they terminate resources when scaling down (one could argue that those users are pro-pet, anti-cattle, but I digress). If your needs require servers in AWS ASGs to be temporarily stopped instead of terminated, ParkMyCloud can do that too, with the “Suspend ASG Processes” option when parking a scale group. This will suspend the automation of an ASG and stop the servers without terminating them, and reverse this process when the ASG is being “unparked”.
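As a rough illustration of the stop-instead-of-terminate idea, the sketch below suspends a group's scaling processes before stopping its instances, then reverses the steps. It assumes `asg_client` and `ec2_client` behave like the boto3 Auto Scaling and EC2 clients; the function names are hypothetical, and we are not claiming this is what the “Suspend ASG Processes” option does internally.

```python
def suspend_and_stop(asg_client, ec2_client, group_name):
    """Suspend an ASG's processes, then stop (not terminate) its instances."""
    # With no ScalingProcesses argument, all processes are suspended, so
    # the group will not replace the instances once they are stopped.
    asg_client.suspend_processes(AutoScalingGroupName=group_name)
    response = asg_client.describe_auto_scaling_groups(
        AutoScalingGroupNames=[group_name]
    )
    group = response["AutoScalingGroups"][0]
    instance_ids = [inst["InstanceId"] for inst in group["Instances"]]
    if instance_ids:
        ec2_client.stop_instances(InstanceIds=instance_ids)
    return instance_ids


def start_and_resume(asg_client, ec2_client, group_name, instance_ids):
    """Start the stopped instances, then resume the ASG's processes."""
    if instance_ids:
        ec2_client.start_instances(InstanceIds=instance_ids)
    asg_client.resume_processes(AutoScalingGroupName=group_name)
```

A production version would likely wait for the instances to be running and healthy before resuming processes. Otherwise the group may treat them as unhealthy and replace them anyway.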
Try both scaling to zero and suspending ASGs – start a free trial of ParkMyCloud to try it out.
In July, Microsoft introduced the Azure Well-Architected Framework – a guide for building and delivering solutions according to Azure’s best practices. If you’ve ever seen the AWS Well-Architected Framework, Azure’s will look… familiar. It shares many similarities with the Google Cloud Architecture Framework as well, which was released in May. This is perhaps a sign that despite the frequently argued differences between the cloud providers (and people love to compare – by far the most-read post on this blog is this one on AWS vs. Azure vs. Google Cloud market share), they are more similar than different. Is this a bad thing? We would argue, no.
There are many aspects of a well-designed architecture and these frameworks to discuss. Given ParkMyCloud’s focus on cost here, we’ll examine the cost optimization principles in Azure’s framework and how they compare to AWS and Google’s.
Architecture Guidelines at a High Level
The three cloud providers each provide architecture frameworks with similar sets of principles. AWS and Azure use the “pillar” metaphor, and in fact, the pillars are almost identically named: both cover cost optimization, operational excellence, performance efficiency, reliability, and security.
While at first it is somewhat amusing to note these similarities (did Azure just ctrl+c?), it is reassuring that the major cloud providers all agree on which components make up the best architecture. Better yet, they are providing ever-improving resources, training, assessments, and support for their users to learn and apply these best practices.
Who Should Use the Azure Well-Architected Framework – and How to Get Started
Speaking of users – which ones are these architecture frameworks for? In their announcement, Azure noted the shifting of responsibility of security, operations, and cost management from centralized teams toward the workload owner. While the truth of this statement will depend on the organization, we have recognized this shift as well.
So while Azure’s framework is aimed largely at new Azure users and/or new applications, we would recommend every Azure user skim the table of contents and take the well-architected review assessment. The assessment takes the form of a multiple-choice “quiz”. At the end of the assessment, you are given a score and results on a scale from 1 to 100. You are also linked to next steps with detailed articles for each question where there is room for improvement. This assessment is worth the time (and won’t take much of it), giving you a straightforward action plan.
The architecture resources provided by Google Cloud are much briefer than AWS and Azure’s frameworks, and they combine performance and cost optimization into one principle, so it’s not surprising several topics are missing – including any discussion of governance or ownership of cost. AWS focuses on this the most, particularly with the new section on cloud financial management, but Azure certainly also discusses organizational structure, governance, centralization, tagging, and policies. We appreciate the stages of cost optimization Azure uses, from design, to provisioning, to monitoring, to optimizing.
All three cloud providers have similar recommendations in cost optimization regarding scalable design, using tagging for cost governance and visibility, using the most efficient resource cost models, and rightsizing.
Azure puts it this way: cost is important, but you should seek to achieve balance between all the pillars. Shoring up any of the other pillars will almost always increase costs. Invest in security first, then performance, then reliability. Operational excellence can increase or decrease costs. Cost optimization will always be important for any organization in public cloud, but it does not stand alone.
Orphaned volumes and snapshots exacerbate the problem of cloud waste. We have previously estimated that $17.6 billion will be wasted this year due to idle and oversized resources in public cloud. Today we’re going to dive into so-called orphaned resources.
A resource can become “orphaned” when it is detached from the infrastructure it was created to support, such as a volume detached from an instance or a snapshot detached from any volumes. Whether you are aware these remain in your cloud environment or not, they can continue to incur costs, wasting money and driving up your cloud bill.
How Resources Become Detached
One form of orphaned resources comes from storage. Volumes or disks such as Amazon EBS, when created, are attached to an EC2 instance. You can attach multiple volumes to a single instance to add storage space. If an instance is terminated but the volumes attached to it are not, “orphaned volumes” have been created. Note that by default, the boot disk attached to every instance is designed to terminate when the instance is terminated (although it is possible to deselect this option), but any additional disks that have been attached do not necessarily follow this same behavior.
Snapshots can also become orphaned resources. A snapshot is a point-in-time image of a volume. In the case of Amazon EBS, AWS snapshots are stored in Amazon S3. EBS snapshots are incremental, meaning that only the blocks on the device that have changed since your most recent snapshot are saved. If the associated instance and volume are deleted, a snapshot could be considered orphaned.
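As a rough way to spot candidates, the sketch below filters volume and snapshot records shaped like the entries boto3's EC2 `describe_volumes` and `describe_snapshots` calls return. The helper names are hypothetical, and "orphaned" here only means "worth a second look", not "safe to delete".

```python
def unattached_volumes(volumes):
    """Return IDs of volumes that are not attached to any instance.

    EC2 reports an unattached volume with the state "available".
    """
    return [v["VolumeId"] for v in volumes if v["State"] == "available"]


def snapshots_without_volume(snapshots, volumes):
    """Return IDs of snapshots whose source volume no longer exists."""
    existing = {v["VolumeId"] for v in volumes}
    return [
        s["SnapshotId"]
        for s in snapshots
        if s.get("VolumeId") not in existing
    ]


# Hypothetical records in the shape of the EC2 API responses.
volumes = [
    {"VolumeId": "vol-aaa", "State": "in-use"},
    {"VolumeId": "vol-bbb", "State": "available"},
]
snapshots = [
    {"SnapshotId": "snap-111", "VolumeId": "vol-aaa"},
    {"SnapshotId": "snap-222", "VolumeId": "vol-gone"},
]

print(unattached_volumes(volumes))
print(snapshots_without_volume(snapshots, volumes))
```

With real data, the inputs would come from something like `ec2.describe_volumes()["Volumes"]` and `ec2.describe_snapshots(OwnerIds=["self"])["Snapshots"]`, with pagination handled for larger accounts.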
Are All Detached Resources Unnecessary?
Just because a resource is detached does not mean it should be deleted. For example, you may want to keep:
The most recent snapshots backing up a volume
Machine images used to create other machines
Snapshots used to inexpensively store the state of a machine you intend to use later, rather than keeping a volume around
However, like the brownish substance in the tupperware in the back of your freezer, anything you want to keep needs to be clearly labeled in order to be useful. By default, snapshots and volumes do not always get automatically tagged with enough information to know what they actually are. In ParkMyCloud we see exabytes of untagged storage in our customers’ environments, with no way of knowing if it is safe to delete. In the AWS console, metadata is not cleanly propagated from the parent instance, and you have to go out of your way to tag snapshots before the parent instances are terminated. Once the parent instance is terminated, it can be impossible to identify the source of an orphaned volume or snapshot without actually re-attaching it to a running instance and looking at the data. Tag early and tag often!
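One way to “tag early” is to copy an instance's tags onto its attached volumes while the instance still exists. The sketch below assumes `instance` is a record shaped like the entries boto3's EC2 `describe_instances` returns and that `ec2_client` exposes boto3's `create_tags`. The `SourceInstanceId` tag key and the function names are our own hypothetical choices.

```python
def tags_to_propagate(instance):
    """Build the tag list to copy from an instance to its volumes."""
    # Tag keys starting with "aws:" are reserved and cannot be set by users.
    tags = [
        tag
        for tag in instance.get("Tags", [])
        if not tag["Key"].startswith("aws:")
    ]
    # Hypothetical extra tag recording where the volume came from.
    tags.append({"Key": "SourceInstanceId", "Value": instance["InstanceId"]})
    return tags


def attached_volume_ids(instance):
    """Return the EBS volume IDs listed in the instance's device mappings."""
    return [
        mapping["Ebs"]["VolumeId"]
        for mapping in instance.get("BlockDeviceMappings", [])
        if "Ebs" in mapping
    ]


def tag_attached_volumes(ec2_client, instance):
    """Copy the instance's tags onto every EBS volume attached to it."""
    volume_ids = attached_volume_ids(instance)
    if volume_ids:
        ec2_client.create_tags(
            Resources=volume_ids, Tags=tags_to_propagate(instance)
        )
    return volume_ids
```

Run on a schedule, something like this means a volume that later becomes detached still carries enough information to identify its owner and purpose.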
The Size of Wasted Spend
To estimate the size of the problem of orphaned volumes and snapshots, we’ll start with some data from ParkMyCloud customers in aggregate. ParkMyCloud customers spend approximately 15% of their bills on storage. We found that 35% of that storage spend was on unattached volumes and snapshots. As detailed above, this doesn’t mean that all of that spend is wasted, but the lack of tagging and the excess of snapshots of individual volumes indicate that much of it is.
Overall, an average of 5.25% of our customers’ bills (35% of the 15% spent on storage) is being spent on unattached volumes and snapshots. Applying that share to the $50 billion estimated to be spent on Infrastructure as a Service (IaaS) this year gives us up to $2.6 billion wasted this year on orphaned volumes and snapshots. This is a huge problem.
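For anyone who wants to check the arithmetic or plug in their own numbers, here is the estimate as a few lines of Python. The inputs are the figures quoted above; the variable names are ours.

```python
storage_share_pct = 15       # share of the bill spent on storage
unattached_share_pct = 35    # share of storage spend that is unattached
iaas_spend_billion = 50      # estimated IaaS spend this year, in $ billions

# 35% of 15% gives the share of the total bill.
unattached_pct_of_bill = storage_share_pct * unattached_share_pct / 100

# Apply that share to total IaaS spend for the upper bound.
max_waste_billion = iaas_spend_billion * unattached_pct_of_bill / 100

print(f"{unattached_pct_of_bill:.2f}% of the bill")
print(f"up to ${max_waste_billion:.3f} billion")
```

The result is an upper bound because, as noted above, not every unattached volume or snapshot is actually waste.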
Based on the size of this waste and customer demand, ParkMyCloud is developing capabilities to add orphaned volume and snapshot management to our cost control platform.
Interested? Let us know here and we’ll notify you when this capability is released.