Today, we propose a new concept to add to the DevOps mindset: Continuous Cost Control.
In DevOps, speed and continuity are king. Continuous Operations, Continuous Delivery, Continuous Integration. Keep everything running and get new features in the hands of users quickly.
For some organizations, this approach leads to a mindset of “speed at any cost”. Especially in the era of easily consumable public cloud, this results in a habit of wasted spend and blown budgets – which may, of course, meet the goals for delivery. But remember that a goal of Continuous Delivery is sustainability. This applies to the coding and backend of the application, but also to the business side.
With that in mind, we get to the cost of development and operations. At some point in every organization’s lifecycle comes the need to control costs. Perhaps it’s when your system or product reaches a certain level of predictability or maturity – i.e. maintenance mode – or perhaps earlier, depending your organization.
We all know that agility has helped companies create competitive advantage; but customers and others tell us it can’t be “agility at any cost.” That’s why we believe the next challenge is cost-effective agility. That’s what Continuous Cost Control is all about.
What is Continuous Cost Control?
Think of it as the ability to see and automatically take action on development and operations resources, so that the amount spent is a controlled factor and not merely a result. This should occur with no impact to delivery.
Think of the spend your department manages. It likely includes software license costs and true-ups and perhaps various service costs. If you’re using private cloud/on-premise infrastructure, you’ve got equipment purchases and depreciations, plus everything to support that equipment, down to the fuel costs for backup generators, to consider.
However, the second biggest line item (after personnel) for many agile teams is public cloud. Within this bucket, consider the compute costs, bandwidth costs, database costs, storage, transactions… and the list goes on.
While private cloud/on-premise infrastructure requires continuous monitoring and cost control, the problem becomes acute when you change to the utility model of the public cloud. Now, more and more people in your organization have the ability to spin up virtual servers. It can be easy to forget that every hour (or minute, depending on the cloud provider) of this compute time costs money – not to mention all the surrounding costs.
Continually controlling these costs means automating your cost savings at all points in the development pipeline. Early in the process, development and test systems should only be run while actually in use. Later, during testing and staging, systems should be automatically turned on for specific tests, then shut down once the tests are complete. During maintenance and production support, make sure your metrics and logs keep you updated on what is being used – and when.
How to get started with Continuous Cost Control
While Continuous Cost Control is an idea that you should apply to your development and operations practices throughout all project phases, there are a few things you can do to start a cultural behavior of controlled costs.
Perhaps your CFO or CTO came to you and gave a directive to save money on Azure. Perhaps you received the bill on your own, and realize that this needs to be reduced. Or maybe you’re just migrating to the cloud and want to make sure you’re set up for cost control in advance (if so, props to you for being proactive!)
Whatever the reason you want to reduce your bill, there are a lot of little tips and tricks out there. But to get started, here are the top 3 ways to save money on Azure.
1. Set a spending limit on your Azure account
Our first recommendation to save money on Azure is to set a spending limit on your Azure account. We especially recommend this if you are using your Azure account for non-production. This is because once your limit is reached, your VMs will be stopped and deallocated. You will get an email alert and an alert in the Azure portal, and you do have the ability to turn these back on, but this is of course not ideal for any production systems.
Additionally, keep in mind that there are still services you will be charged for, even if your spending limit has been reached, including Visual studio licenses, Azure Active Directory premium, and support plans.
One easy way to spend too much on your Azure compute resources is to use VMs that are not properly sized for the workload you are running on them. Use Azure’s Advisor to ensure that you’re not overpaying for processor cores, memory, disk storage, disk I/O, or network bandwidth. More on right-sizing from TechTarget.
While you’re at it, check to see if there’s a less-expensive region you could choose for the VM for additional cost savings.
3. Turn non-production VMs off when they’re not being used
Our third recommendation to save money on Azure is to turn non-production VMs off when they’re not being used – otherwise, you’re paying for time you don’t need. It’s a quick fix, and one that can save 65% of the cost of the VM – if, for example, it was running 24×7 but is only needed 12 hours per day, Monday through Friday.
One basic approach is to ask developers and testers to turn their VMs off when they are done using them — if you do this, ensure that your users are using the Azure portal to put these VMs in the “stopped deallocated” state. If you stop from within a VM, it will be put in a “stopped” state and you will continue to be charged.
However, relying on human memory is not best, so you’ll want to schedule your non-production VMs to shut down on a schedule. You could attempt to script this, but this is counter productive and wastes valuable development resources to write and maintain.
Instead, it’s best to use software like ParkMyCloud’s to automate on/off schedules – including automating schedule and team assignment for access control – and keep your Azure non-production costs in check.
These three methods should get you started on your goal to reduce costs. Have any other preferred methods to save money on Azure? Leave a comment below to let us know.
DevOps cloud cost control: an oxymoron? If you’re in DevOps, you may not think that cloud cost is your concern. When asked what your primary concern is, you might say speed of delivery, or integrations, or automation. However, if you’re using public cloud, cost should be on your list of problems to control.
The Cloud Waste Problem
If DevOps is the biggest change in IT process in decades, then renting infrastructure on demand is the most disruptive change in IT operations. With the switch from traditional datacenters to public cloud, infrastructure is now used like a utility. Like any utility, there is waste. (Think: leaving the lights on or your air conditioner running when you’re not home.)
How big is the problem? In 2016, enterprises spent $23B on public cloud IaaS services. We estimate that about $6B of that was wasted on unneeded resources. The excess expense known as “cloud waste” comprises several interrelated problems: services running when they don’t need to be, improperly sized infrastructure, orphaned resources, and shadow IT.
Everyone who uses AWS, Azure, and Google Cloud Platform is either already feeling the pressure — or soon will be — to reel in this waste. As DevOps teams are primary cloud users in many companies, DevOps cloud cost control processes become a priority.
4 Principles of DevOps Cloud Cost Control
Let’s put this idea of cloud waste in the framework of some of the core principles of DevOps. Here are four key DevOps principles, applied to cloud cost control:
1. Holistic Thinking
In DevOps, you cannot simply focus on your own favorite corner of the world, or any one piece of a project in a vacuum. You must think about your environment as a whole.
For one thing, this means that, as mentioned above, cost does become your concern. Businesses have budgets. Technology teams have budgets. And, whether you care or not, that means DevOps has a budget it needs to stay within. Whether it’s a concern upfront or doesn’t become one until you’re approached by your CTO or CFO, at some point, infrastructure cost is going to be under scrutiny – and if you go too far out of budget, under direct mandates for reduction.
Solving problems not only speedily and elegantly, but cost efficiently becomes a necessity. You can’t just be concerned about Dev and Ops, you need to think about BizDevOps.
Holistic thinking also means that you need to think about ways to solve problems outside of code… more on this below.
2. No Silos
The principle of “no silos” means not only no communication silos, but also, no silos of access. This applies to the problem of cloud cost control when it comes to issues like leaving compute instances running when they’re not needed. If only one person in your organization has the ability to turn instances on and off, then all responsibility to turn those instances off falls on his or her shoulders.
It also means that if you want to use an instance that is scheduled to be turned off… well, too bad. You either call the person with the keys to log in and turn your instance on, or you wait until it’s scheduled to come on. Or if you really need a test environment now, you spin up new instances – completely defeating the purpose of turning the original instances off.
The solution is eliminating the control silo by allowing users to access their own instances to turn them on when they need them and off when they don’t — of course, using governance via user roles and policies to ensure that cost control tactics remain uninhibited.
(In this case, we’re thinking of providing access to outside management tools like the one we provide, but this can apply to your public cloud accounts and other development infrastructure management portals as well.)
3. Rapid, Useful Feedback
In the case of eliminating cloud waste, the feedback you need is where, in fact, waste is occurring. Are your instances sized properly? Are they running when they don’t need to be? Are there orphaned resources chugging away, eating at your budget?
Useful feedback can also come in the form of total cost savings, percentages of time your instances were shut down over the past month, and overall coverage of your cost optimization efforts. Reporting on what is working for your environment helps you decide how to continually address the problem that you are working on next.
You need monitoring tools in place in order to discover the answers to these questions. Preferably, you should be able to see all of your resources in a single dashboard, to ensure that none of these budget-eaters slip through the cracks. Multi-cloud and multi-region environments make this even more important.
The principle of Automation means that you should not waste time creating solutions when you don’t have to. This relates back to the problem of solving problems outside of code mentioned above.
So when automating, keep your eyes open and do your research. If there’s already an existing tool that does what you’re trying to code, it could be a potential time-saver and process-simplifier.
So take a look at your DevOps processes today, and see how you can incorporate a DevOps cloud cost control – or perhaps, “continuous cost control” – mindset to help with your continuous integration and continuous delivery pipelines. Automate cost control to reduce your cloud expenses and make your life easier.
That brings the list of SSO providers you can use with your ParkMyCloud account to:
Active Directory Federation Services (ADFS) – Microsoft
Azure Active Directory – Microsoft
Okta (in Okta App Network)
OneLogin (in App Catalog)
Ping Identity (in App Catalog)
Stay tuned: ParkMyCloud will be listed in the Centrify marketplace shortly.
We have integrated with Centrify for Single Sign-On, as well as the other SSO providers, to make it simpler:
For account administrators, who can use just-in-time provisioning to automatically add their organization members to ParkMyCloud as they are authenticated in Centrify – all you need to do as an administrator is share your organization’s unique ParkMyCloud login link with your users. This can be found in the ParkMyCloud management console.
For users, who will not need separate login information and a password for ParkMyCloud.
For a step-by-step guide for setting up Centrify as a SAML IdP server for ParkMyCloud, please see this article on our support site. Note that you will already need to have your ParkMyCloud account created – though there’s no need to add additional users until you’ve connected with Centrify, at which point you can add them directly from the SSO provider.
If we still don’t support your SSO provider of choice, please leave a comment below or contact us – we’re all about meeting user needs, here!
Before I try to break down the AWS and Azure cloud pricing jargon, let me give you some context. I am a crusty, old CTO who has been working in advanced technology since the 1980’s. (That’s more than 18 Moore’s Law cycles for processor and chipset fans, and I have lost count of how many technology hype cycles that has been.)
I have grown accustomed to the “deal of a lifetime” on the “technology of the decade” coming around about once every week. So, you can believe me, when I tell you have a very low BS threshold for dishonest sales folks and bogus technology claims. Yes, I am jaded.
My latest venture is a platform, ParkMyCloud, that brings together multiple public cloud providers. And I can tell you first hand that it is not for the faint-of-heart. It’s like being dropped off in the middle of the jungle in Papua, New Guinea. Each cloud provider has its own culture, its own philosophy, its own language and customers, its own maturity level and, worst of all — its own pricing strategy — which makes it tough for buyers to manage costs. I am convinced that the lowest circles of hell are reserved for people who develop cloud service pricing models. AWS and Azure cloud pricing gurus, beware. And reader, to you: caveat emptor.
AWS and Azure Terminology Differences
Case in point: You have probably read the comparisons of various services across the top cloud providers, as people try to wrap their minds around all the varying jargon used to describe pretty much the same thing. For example, let’s just look at one service: Cloud Computing.
In AWS, servers are called Elastic Compute Cloud (EC2) “Instances”. In Azure they are called “Virtual Machines” or “VMs”. Flocks of these spun up from a snapshot according scaling rules are called “auto scaling groups” in AWS. The same things are called “scale sets” in Azure.
Of course cloud providers had to start somewhere, then they learned from their mistakes and improved. When AWS started with EC2, they had not yet released virtual private clouds (VPCs), so their instances ran outside of VPCs. Now all the latest stuff runs inside of VPCs. The older ones are called, “classic” and have a number of limitations. The same thing is true of Azure. When they first released, their VMs were not set up to use what is now their Resource Manager or be managed in Resource Groups (the moral equivalent of CloudFormation Stacks in AWS). Now, all of their latest VMs are compatible with Resource Manager. The older ones are called, you guessed it … “classic”. (What genius came up with the idea to call the older versions of these, the ones you’re probably stranded with and no longer want, “classic”?)
Both AWS and Azure have a dizzying array of instances/VMs to choose from, and doing an apples-to-apples comparison between them can be quite daunting. They have different categories: General purpose, compute optimized, storage optimized, disk optimized, etc.
Then within each one of those, there are types or sizes. For example, in AWS the tiny, cheap ones are currently the “t2” family. In Azure, they are the “A” series. On top of that there are different generations of processors. In AWS, they use an integer after the family type, like t2, m3, m4 and there are sizes, t2.small, m3.medium, m4.large, r16.ginormus (OK, I made that one up).
In Azure, they use a number after the family letter to connote size, like A0, A1, A2, D1, etc. and “v1”, “v2” after that to tell what generation it is, like D1v1, D2v2.
The bottom line: this is very confusing for folks moving their workloads to public cloud from on-premise data centers (yet another Wonderland of jargon and confusion in its own right). How does one decide which cloud provider to use? How does one even begin to compare prices with all of this mess? Cheer up … it gets worse!
AWS and Azure Cloud Pricing – Examining Differences in Charging
To add to that confusion, they charge you differently for the compute time you use. What do I mean? AWS prices their compute time by the hour. And by hour, they mean any fraction of an hour: If you start an instance and run it for 61 minutes then shut it down, you get charged for 2 hours of compute time. Microsoft Azure cloud pricing is listed by the hour for each VM, but they charge you by the minute. So, if you run for 61 minutes, you get charged for 61 minutes. On the surface, this sounds very appealing (and makes me want to wag my finger at AWS and say, “shame on you, AWS”).
However, you really have to pay attention to the use case and the comparable instance prices. Let me give you a concrete example. I mentioned my latest venture, ParkMyCloud, earlier. We park (schedule on/off times) for cloud computing resources in non-production environments (without scripting by the way). So, here is a graph of 6 months worth of data from an m4.large instance somewhere in Asia Pac. The m4 processor family is based on the Xeon Broadwell or Haswell processor and it is one of the most commonly used instance types.
This instance is on a ParkMyCloud parking schedule, where it is RUNNING from 8:00 a.m. to 7:00 p.m. on weekdays and PARKED evenings and weekends. This instance, assuming Linux pricing, costs $0.125 per hour in AWS. From November 6, 2016 until May 9, 2017, this instance ran for 111,690 minutes. This is actually about 1,862 hours, but AWS charged for 1,922 hours and it cost $240.25 in compute time.
Why the difference? ParkMyCloud has a very fast and accurate orchestration engine, but when you start and stop instances, the cloud provider and network response can vary from hour-to-hour and day-to-day, depending on their load, so occasionally things will run that extra minute. And, even though this instance is on a parking schedule, when you look at the graph, you can see that the user took manual control a few times. Stuff happens!
What would the cost have been if AWS charged the same way as Azure? It would have only cost $232.69. Well, that’s not too bad over the course of six months, unless you have 1,000 of these. Then it becomes material.
However, I wouldn’t rush to judgment on AWS. If you look at the comparable Azure VM, the standard pricing DS2 V2, also running Linux, costs $0.152/hour. So, this same instance running in Azure would have cost $290.39. Yikes!
Therefore, in my particular use case, unless the Azure cloud pricing drops to make their CPU pricing more competitive, their per minute pricing really doesn’t save money.
The ironic thing about all of this, is that once you get past all the confusing jargon and the ridiculous approaches to pricing and charging for usage, the actual cloud services themselves are much easier to use than legacy on-premise services. The public cloud services do provide much better flexibility and faster time-to-value. The cloud providers simply need to get out of their own way. Pricing is but one example where AWS and Azure need to make things a lot simpler, so that newcomers can make informed decisions.
From a pricing standpoint, AWS on-demand pricing is still more competitive than Azure cloud pricing for comparable compute engine’s, despite Azure’s more enlightened approach to charging for CPU/Hr time. That said, AWS really needs to get in-line with both Azure and Google, who charge by the minute. Nobody likes being charged extra for something they don’t use.
In the meantime, ParkMyCloud will continue to help you turn off non-production cloud resources, when you don’t need them and help save you a lot of money on your monthly cloud bills. If we make anything sound more complex than it needs to, call us out. No hiding behind jargon here.