Kubernetes – The Latest Holy War
Kubernetes v1.0 was released in 2015, after being developed and used internally at Google since 2014. It quickly emerged as the most popular way to manage and orchestrate large numbers of containers, despite competition from Docker Swarm, Apache Mesos, HashiCorp’s Nomad, and various other software from the usual suspects at IBM, HP, and Microsoft. Google partnered with the Linux Foundation in 2015 to form the Cloud Native Computing Foundation (CNCF), with Kubernetes as one of its seed technologies.
This public dedication to open-source technology gave Kubernetes instant nerd bonus points, and it being used internally at one of the largest software companies in the world made it the hot new thing. Soon, there were thousands of articles, tweets, blog posts, and conference talks about moving to a microservices architecture built on containers using Kubernetes to manage the pods and services. It wasn’t long before misguided job postings were looking for 10+ years of Kubernetes experience and every startup pivoted from their blockchain-based app to a Kubernetes-based app overnight.
As with any tech fad, the counterarguments started to mount as new fads rose up. Serverless became the new thing, but those who had put their eggs in the Kubernetes basket resisted the shift from containers to functions. Zealots on the serverless side argued that you could ditch your entire code base and move to Lambda or Azure Functions, while Kubernetes fanatics countered that you didn’t need functions-as-a-service if you just packaged your entire OS into a container and spun up a million of them. So, do you need Kubernetes?
You’re (Probably) Not Google
Here’s the big thing that gets missed when a huge company open-sources their internal tooling – you’re most likely not at their scale. You don’t have the same resources or the same problems as that huge company. Sure, you are working your hardest to make your company so big that you have the same scaling problems as Google, but you’re probably not there yet. Don’t get me wrong: I love it when large enterprises open-source some of their internal tooling (as Netflix and Amazon have), as it benefits the open-source community and is a great learning opportunity – but I have to remind myself that they are solving a fundamentally different problem than I am.
While I’m not suggesting that you avoid planning ahead for scalability, getting something like Kubernetes set up and configured instead of developing your main business application can waste valuable time and funds. There’s a lot of overhead with learning, deploying, and managing Kubernetes that companies like Google can afford. If you can get the same effect from an autoscaling group of VMs with less headache, why wouldn’t you go that route? Remember: something like 60% of global AWS spend is on EC2, and with good reason. You can get surprisingly far using tried-and-true technologies and basics without having to rip everything out and implement the latest fad, which is why Kubernetes (or serverless, or blockchain, or multi-cloud…) shouldn’t be your focus. Kubernetes certainly has its place, and can be the tool you need. But most likely, it’s making things more complex for you without a corresponding benefit.
There’s a lot of talk about multi-cloud architecture – and apparently, a lot of disagreement about whether there is actually any logical use case for using multiple public clouds.
How many use multi-cloud already?
First question: are companies actually using a multi-cloud architecture?
According to a recent survey by IDG: yes. More than half (55%) of respondents use multiple public clouds: 34% use two, 10% use three, and 11% use more than three. IDG did not define “multi-cloud” for respondents. Given the limited list of major public clouds, the “more than three” set might be counting smaller providers. Or respondents could be counting combinations such as AWS EC2 plus Google G Suite or Microsoft 365.
There certainly are some using multiple major providers – as one example, ParkMyCloud has at least one customer using compute infrastructure in AWS, Azure, Google Cloud, and Alibaba Cloud concurrently. In our observation, this is frequently manifested as separate applications architected on separate cloud providers by separate teams within the greater organization.
Why do organizations (say they) prefer multi-cloud?
With more than half of IDG’s respondents reporting a multi-cloud architecture, now we wonder: why? Or at least – since we humans are poor judges of our own behavior – why do they say they use multiple clouds? On survey, public cloud users indicated they adopted a multi-cloud approach to get best-of-breed platform and service options, while other goals included cost savings, risk mitigation, and flexibility.
Are these good reasons to use multiple clouds? Maybe. The idea of mixing service options from different clouds within a single application is more a dream than reality. Even with Kubernetes. (Stay tuned for a rant post on this soon).
Cloud economist Corey Quinn discussed this on a recent livestream with ParkMyCloud customer Rob Weaver. He asked Rob why his team at Workfront hadn’t yet completed a full Kubernetes architecture.
Rob said, “we had everything in a datacenter, and we decided, we’re going to AWS. We’re going there as fast as we can because it’s going to make us more flexible. Once we’re there, we’ll figure out how to make it save us money. We did basically lift and shift. … Then, all of the sudden, we had an enormous deal come up, and we had to go into another cloud. Had we taken the approach of writing our own Lambdas to park this stuff, now GCP comes along. We would have to have written a completely different language, a completely different architecture to do the same thing. The idea of software-as-a-service and making things modular where I don’t really care what the implementation is has a lot of value.”
Corey chimed in, “I tend to give a lot of talks, podcasts, blog posts, screaming at people in the street, etc. about the idea that multi-cloud as a best practice is nuts and you shouldn’t be doing it. Whenever I do that, I always make it a point to caveat that, ‘unless you have a business reason to do it.’ You just gave the perfect example of a business reason that makes sense – you have a customer who requires it for a variety of reasons. When you have a strategic reason to go multi-cloud, you go multi-cloud. It makes sense. But designing that from day one doesn’t always make a lot of sense.”
So, Corey would say: Rob’s situation is the one use case where a multi-cloud architecture actually makes sense. Do you agree?
Cloud spend optimization is always top of mind for public cloud users. It’s usually up there with Security, Governance, and Compliance – and now in 2020, 73% of respondents to Flexera’s State of the Cloud report said that ‘Optimize existing use of cloud (cost savings)’ was their #1 initiative this year.
So – what the heck does that mean? There are many ways to spin it, and while “cost optimization” is broadly applicable, the strategies and tactics to get there will vary widely based on your organization and the maturity of your cloud use.
Having this discussion within enterprises can be challenging, and perspectives change depending on who you talk to within an organization – FinOps? CloudOps? ITOps? DevOps? And outside of operations, what about the Line of Business (LoB) or the application owners? Maybe they care about optimization not in terms of cost but in terms of performance. In reality, optimization can mean something different to cloud owners and users based on your role and responsibilities.
Ultimately though, there are a number of steps that are common no matter who you are. In order to facilitate this discussion and understand where enterprises are in their cloud cost optimization journey, we created a framework called the Cloud Cost Optimization Maturity Curve to identify these common steps.
Cloud Spend Optimization Maturity Curve
While cloud users could be doing any combination of these actions, this is a representation of actions you can take to control cloud spend in order of complexity. For example, Visibility in and of itself does not necessarily save you money but can help identify areas ripe for optimization based on data. And taking scaling actions on IaaS may or may not save you money, but may help you improve application performance through better resource allocation, scaling either up (more $$$) or down (less $$$).
Let’s dig into each in a little more detail:
- Visibility – visibility of all costs across clouds, accounts, and applications. This is cloud cost management 1.0, the ability to see cost data better through budgeting, chargeback, and showback.
- Schedule suspend – turn off idle resources like virtual machines, databases, scale groups, and container services when they’re not being used, such as nights and weekends, based on usage data. This is most common for non-production resources but can deliver substantial savings – 65% is a good target that many ParkMyCloud customers achieve even during a free trial.
- Delete unused resources – this includes identifying orphaned resources and volumes and then deleting them. Even though you may not be using them, your cloud provider is still charging you for them.
- Sizing IaaS (non-production) – many enterprises overprovision their non-production resources, using only 5-10% of a given resource’s capacity – meaning 90% or more sits unused (really!). By leveraging usage data, you can get recommendations to resize those underutilized resources and save 50% or more.
- RI / Savings Plan Management – AWS, Azure, and Google provide the ability to pre-buy capacity and get discounts ranging from 20-60% based on your commitments in both spend and term length. While the savings make it worthwhile, this is not a simple process (though it has improved with AWS’s Savings Plans) and requires a very good understanding of the services you will need 12-36 months out.
- Scaling IaaS (prod) – this requires collecting data and understanding both the infrastructure and application layers and taking sizing actions up or down to improve both performance and cost. Taking these actions on production resources requires strong communication between Operations and LoB.
- Optimizing PaaS – virtual machines, databases, and storage map to discrete resources that can be turned off and resized, but PaaS tops the maturity curve because many PaaS services have to be optimized in other ways, like scaling the service up or down based on usage or rearchitecting parts of your application.
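The savings math behind scheduled suspension is simple enough to sketch. Here’s a minimal illustration (the schedule values are hypothetical, and this is not ParkMyCloud’s implementation) of where a ~65% figure comes from:

```python
# Sketch: estimate savings from an on/off ("parking") schedule.
# Assumes per-hour billing; the schedule below is a hypothetical example.

HOURS_PER_WEEK = 7 * 24  # 168

def schedule_savings(days_on: int, hours_per_day: int) -> float:
    """Fraction of the on-demand bill saved by only running
    `days_on` days a week for `hours_per_day` hours each day."""
    running_hours = days_on * hours_per_day
    return 1 - running_hours / HOURS_PER_WEEK

# A non-production instance running weekdays 7am-7pm:
savings = schedule_savings(days_on=5, hours_per_day=12)
print(f"{savings:.0%}")  # 64% - in line with the 65% target above
```

A resource that runs around the clock seven days a week saves nothing from scheduling alone, which is why this step pairs with usage data to find the idle candidates first.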
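The RI / Savings Plan decision above can likewise be reduced to a break-even calculation: a committed rate only beats on-demand when the workload runs often enough. A small sketch, with an illustrative discount from the 20-60% range mentioned:

```python
def reservation_breakeven(discount: float) -> float:
    """Minimum utilization (fraction of hours actually running) at which
    a reserved/committed rate beats on-demand.
    Paying (1 - discount) of the on-demand rate for 100% of hours beats
    paying full price for a fraction u of the hours when u > 1 - discount."""
    return 1 - discount

# At a 40% discount, the commitment only wins if the workload
# runs more than 60% of the time:
print(reservation_breakeven(0.40))  # 0.6
```

This is also why the 12-36 month forecast matters: commit to capacity you end up using below the break-even utilization, and the “discount” costs you money.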
For more ways to reduce costs, check out the cloud waste checklist for 26 steps to take to eliminate wasted spend at a more granular level.
When most enterprise users hear that their organization will start heavily using ServiceNow governance, they assume that their job is about to get much harder, not easier. This stems from admins putting overly restrictive policies in place, even with the good intention of preventing security or financial problems. The negative side effect often manifests as a huge limitation for users who are just trying to do their jobs. Ultimately, this can lead to “shadow IT”, angry users, and inefficient business processes. So how can you use ServiceNow governance to increase efficiency rather than inhibit it?
What is ServiceNow governance?
One of the main features of ServiceNow is the ability to implement processes for approvals, requests, and delegation. Governance in ServiceNow includes the policies and definitions of how decisions are made and who can make those decisions. For example, if a user needs a new virtual machine in AWS, they can be required to request one through the ServiceNow portal. Depending on the choices made during this request, cloud admins or finance team members can be alerted to this request and be asked to approve the request before it is carried out. Once approved, the VM will have specific tags and configuration options that match compliance and risk profiles.
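To make the request flow above concrete, here’s a hypothetical sketch in Python of how such an approval rule might be expressed. This is not ServiceNow’s actual workflow engine or API – the thresholds, tag names, and approver groups are all invented for illustration:

```python
# Hypothetical sketch of a governance approval rule, loosely modeled on
# the ServiceNow request flow described above. The required tags, cost
# threshold, and approver group names are assumptions for illustration.

REQUIRED_TAGS = {"cost-center", "environment", "owner"}
FINANCE_THRESHOLD = 500.0  # est. monthly USD above which finance signs off

def route_vm_request(est_monthly_cost: float, tags: dict) -> list:
    """Return the approval steps a VM request must pass through."""
    steps = []
    missing = REQUIRED_TAGS - tags.keys()
    if missing:
        # Reject early so the requester can fix compliance tags
        steps.append(("reject", f"missing tags: {sorted(missing)}"))
        return steps
    steps.append(("approve", "cloud-admins"))
    if est_monthly_cost > FINANCE_THRESHOLD:
        steps.append(("approve", "finance"))
    return steps

# A small, properly tagged dev VM only needs the admin sign-off:
request_tags = {"cost-center": "R&D", "environment": "dev", "owner": "alice"}
print(route_vm_request(120.0, request_tags))  # [('approve', 'cloud-admins')]
```

The point of encoding the rule this way is that the “red tape” becomes predictable: requesters can see exactly which condition triggered an extra approval, which helps with the communication problem discussed below.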
What Drives Governance?
Governance policies are implemented with some presumably well-intentioned business goal in mind. Some organizations are trying to keep risk managed through approvals and visibility. Others are trying to rein in IT spending by guiding users to lower-cost alternatives to what they were requesting.
Too often, to the end user, the goal gets lost behind the frustration of actions being slowed, blocked, or redirected by the (beautifully automated) red tape. Admins lose sight of the central business needs while implementing a governance structure that is trying to protect those business needs. For users to comply with these policies, it’s crucial that they understand the motivations behind them – so they don’t work around them.
In practice, this turns into a balancing act. The guiding question that needs to be answered by ServiceNow governance is, “How can we enable our users to do their jobs while preventing costly (or risky) behavior?”
Additionally, it’s critical that new policies are clearly communicated, and that they hook into existing processes. That’s not to say this is easy. To be done well, it requires a team of technical and business stakeholders to contribute their needs and perspectives. The technical possibilities and limitations must be matched with business needs and overall organizational plans, while avoiding roadblocks and managing edge cases. There’s a lot to mesh together, and each organization has unique challenges and desires, which makes this whole process hard to generalize.
The End Result
At ParkMyCloud, we try to help facilitate these kinds of governance frameworks. The ParkMyCloud application allows admins to set permissions on users and give access to teams. By reading from resource tags, existing processes for tagging and naming can be utilized. New processes around resource schedule management can be easily communicated via chat or email notifications. Users get the access they need to keep doing their job, but don’t get more access than required. Employing similar ideas in your ServiceNow governance policies can make your users successful and your admins happy.
Organizations are implementing multi-cloud environments for their public cloud infrastructure now more than ever.
We see this not only in our customers’ environments, where a growing proportion use multiple cloud providers – industry experts and analysts report the same. In early June, IDG released its 8th Cloud Computing Survey results, breaking down IT environments, multi-cloud adoption, and IT budgets by the numbers. The report also goes into both the upsides and downsides of using multiple public clouds. Here’s what they found:
- More than half (55%) of respondents use multiple public clouds:
- 34% use two, 10% use three, and 11% use more than three
- 49% of respondents say they adopted a multi-cloud approach to get best-of-breed platform and service options.
- Other goals include:
- Cost savings/optimization (41%)
- Improved disaster recovery/business continuity (40%)
- Increased platform and service flexibility (39%).
Interestingly, within multi-cloud customers of ParkMyCloud, the majority are users of AWS and Google Cloud, or AWS and Azure; very few are users of Azure and Google Cloud. About 1% of customers have a presence in all three.
Multi-Cloud Across Organizations
The study found that the likelihood of an organization using a multi-cloud environment depends on its size and industry. For instance, government, financial services, and manufacturing organizations are more likely to stick to one cloud, given the security concerns that come with using multiple clouds. IDG concluded that enterprises are more concerned with avoiding vendor lock-in, while SMBs are more likely to make cost savings/optimization a priority (which makes sense – the smaller the company, the more worried it is about finances).
- Fewer than half of SMBs (47%) use multiple public clouds
- Meanwhile, 66% of enterprises use multiple clouds
What are the advantages of multi-cloud?
Since multi-cloud has been a growing trend over the last few years, we thought it’d be interesting to take a look at why businesses are heading in this direction with their infrastructure. The following are typically the most common reasons users adopt multi-cloud:
- Risk Mitigation – create resilient architectures
- Managing vendor lock-in – get price protection
- Workload Optimization – place your workloads to optimize for cost and performance
- Cloud providers’ unique capabilities – take advantage of offerings in AI, IOT, Machine Learning, and more
Taking advantage of features and capabilities from different cloud providers can be a great way to get the most out of what cloud services offer – but if not used well, these strategies can also result in wasted time, money, and computing capacity. The reality is that some of these are only perceived advantages that never come to fruition.
What are the negatives?
As companies implement their multi-cloud environments, they are finding downsides. A staggering 94% of respondents – regardless of the number of clouds they use or the size of their organization – find it hard to fully take advantage of their public cloud resources. The survey cited controlling cloud costs as the biggest challenge – users think they’ll save money but end up spending more. When organizations migrate to multi-cloud, they expect to cut costs, but they typically fail to account for growing cloud services and data, as well as a lack of visibility. For many organizations we talk to, multiple clouds are in use because different groups within the organization chose different cloud providers, which makes centralized control and management a challenge. Addressing these issues brings yet another cost: the cloud management tools needed to control them.
Some other challenges companies using multiple public clouds run into are:
- Data privacy and security issues (38%)
- Securing and protecting cloud resources (31%)
- Governance/ compliance concerns (30%)
- Lack of security skills/expertise (30%)
Configuring and managing different CSPs requires deep expertise, which makes it a pressing need to find employees with the experience and capabilities to manage multiple clouds. More staff are needed to manage multi-cloud environments confidently, in a way that is secure and highly available. A lack of skills and expertise for managing multiple clouds can become a major issue for organizations, as their cloud environments won’t be managed efficiently. To address this, organizations are allocating a sizable share of their IT budget to cloud-specific roles, in the hope that more specialization in this area will improve efficiency.
Multi-Cloud Statistics: Use is Still Growing
The statistics on cloud computing show that companies not only use multiple clouds today, but they have plans to expand multi-cloud investments:
- In a survey of 551 IT professionals involved in the purchasing process for cloud computing, 55% of organizations currently use multiple public clouds.
- Organizations using multiple cloud platforms say they will allocate more of their IT budget (35%) to cloud computing.
- SMBs plan to budget slightly more for cloud computing (33%) than enterprises do.
- While this seems significant, measured in dollars, enterprises plan a much larger cloud spend than SMBs: $158 million compared to $11.5 million.
The Future of Managing Cloud Costs for Multi-Cloud
As cloud costs remain a primary concern, especially for SMBs, it’s important that organizations keep up with the latest cloud usage trends to manage spend and prevent waste. To keep costs in check across multiple clouds, you can make things easier for your IT department by implementing an optimization tool that tracks usage and spend across different cloud providers.
For more insight on the rise of multi-cloud and hybrid cloud strategies, and to see their impact on cloud spend, check out the drain of wasted spend on IT budgets here.
Azure Spot Virtual Machines are a purchasing option that can save significant amounts on infrastructure for certain types of workloads. Azure Spot VMs are not a separate type of VM in Azure; rather, Spot is the ability to bid for spare capacity at a discount from pay-as-you-go pricing. But there’s one caveat: whenever Azure needs the capacity back, the Azure infrastructure will deallocate and evict the Spot VM from your environment.
In the past, Azure offered Low Priority VMs, which were charged at a fixed price. In March 2020, that option was replaced by Azure Spot VMs. With the newer option, you bid by indicating the maximum price you are willing to pay for a VM.
Why Use Azure Spot VMs
Microsoft allows you to use their unused compute capacity at a discounted rate. The discounts are variable and can exceed 90% off pay-as-you-go rates, depending on the size of the VM and the unused capacity available. The amount of capacity available varies by region, time of day, and more.
You can use Azure Spot VMs for workloads that are not critical and don’t need to run 24×7. A basic scenario would be load testing a particular workload for a fraction of the cost. Other use cases include batch processing, stateless applications that can scale out, and short-lived jobs that can be run again if the workload is evicted.
Keep in mind that there are no SLAs or availability guarantees for Spot VMs. The most significant concern users have is that capacity may not be available when they need it, especially at peak load times. That’s not a flaw in the service – it’s how the service is intended to work. Be aware of this when making the decision to use this approach.
Some important things to consider when using Azure Spot VMs:
- VMs are evicted based on capacity, or when the spot price exceeds your maximum set price
- Azure’s infrastructure will evict Spot VMs if Azure needs the capacity for pay-as-you-go workloads
- B-series and promo versions of any size (like Dv2, NV, NC, H promo sizes) are not supported
- A Spot VM cannot be converted to a regular VM or vice versa. You would have to delete the VM and attach the disk to a new VM
- VMs that are evicted and deallocated are not turned back on when capacity or price comes back inside the allowed limits; you will need to turn them back on manually
- You will be unable to create your VM if capacity or pricing is outside the allowed limits
How to Use Azure Spot VMs
You have two choices when deploying Azure Spot VMs. When you enable the feature in your Azure environment, you need to select what type of eviction and eviction policy you want for the capacity:
Types of eviction:
- By capacity only – the VM is evicted when Azure needs capacity. In other words, your maximum price for the spot VM is the current price of the regular VM
- By maximum price – the VM is evicted when the spot price is greater than the maximum price
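The two eviction types above boil down to a price-cap check. A minimal sketch of the logic (a model for illustration, not Azure’s implementation – the “capacity only” case is represented by treating the pay-as-you-go rate as the effective cap, per the note above):

```python
# Illustrative model of Azure Spot VM eviction, based on the two
# eviction types described above. Prices are hypothetical USD/hour.
from typing import Optional

def should_evict(spot_price: float, max_price: Optional[float],
                 pay_as_you_go: float, capacity_reclaimed: bool) -> bool:
    """max_price=None models 'capacity only' eviction: the effective cap
    is the regular pay-as-you-go price, so price swings alone never
    trigger eviction (only capacity reclamation does)."""
    if capacity_reclaimed:
        return True  # Azure always evicts when it needs the capacity back
    cap = pay_as_you_go if max_price is None else max_price
    return spot_price > cap

# Max-price eviction: the current spot rate rises above our bid
print(should_evict(0.035, 0.03, 0.10, capacity_reclaimed=False))  # True
# Capacity-only: a price swing below pay-as-you-go does not evict
print(should_evict(0.09, None, 0.10, capacity_reclaimed=False))   # False
```

Note that in both modes the capacity check dominates: no bid, however high, protects a Spot VM when Azure reclaims the hardware.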
Eviction policy (currently available):
The eviction policy for Spot VMs is set to Stop / Deallocate, which moves your evicted VMs to the stopped-deallocated state and allows you to redeploy them later. Remember that redeploying Spot VMs depends on Spot capacity being available. The deallocated VMs still count against your Spot vCPU quota, and you will be charged for the underlying disks. If your Spot VM is evicted but you still need the capacity right away, Azure recommends using a standard VM instead of a Spot VM.
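Because evicted VMs stay deallocated until you restart them, some teams run a small watchdog that retries the restart. A hedged sketch, assuming a hypothetical `client` object that exposes a `start_vm(name)` call and raises when Spot capacity is unavailable (in a real deployment this would wrap an SDK call such as azure-mgmt-compute’s `begin_start`):

```python
import time

class CapacityError(Exception):
    """Raised by the (hypothetical) client when Spot capacity is unavailable."""

def restart_evicted(client, vm_names, retries=3, delay_s=60):
    """Try to restart deallocated Spot VMs, backing off when capacity
    is still unavailable. Returns the names successfully restarted."""
    restarted = []
    for name in vm_names:
        for _attempt in range(retries):
            try:
                client.start_vm(name)
                restarted.append(name)
                break
            except CapacityError:
                time.sleep(delay_s)  # capacity may come back later
    return restarted
```

The retry count and delay are illustrative; in production you would more likely trigger this from a scheduled job than a blocking loop, and cap how long you wait before falling back to a standard VM as Azure suggests.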
Do Azure Spot VMs Save You Money?
Yes: these discounted VMs can save you money – how much will vary. Azure Spot VM prices are not fixed like standard instances; they change over the day based on supply and demand in a particular region.
Azure Spot VMs are a good option that can provide cost savings if your application can handle unexpected interruptions.
Use Spot VMs as part of your full cost-saving strategy. For on-demand workloads that aren’t needed 24×7, ensure you have appropriate on/off schedules in place. All VMs should be properly sized to the workload. You can start automating these Azure cost optimization tasks with ParkMyCloud today.