Amazon Web Services (AWS) provides a treasure trove of documents and CloudFormation templates in their AWS Solutions portal, including AWS right sizing, the AWS instance scheduler, a chatbot framework, and more. These solutions can be used as-is for immediate integration into your existing environment, or can be the starting point for developing your own unique toolsets. Today, we’re reviewing the AWS Right Sizing tool to see how much it can help you optimize your infrastructure.
AWS Right Sizing: What It Does
AWS offers a variety of types and sizes of EC2 instances. That means it is entirely possible to select an instance type that’s too large for your actual needs, and to pay more than necessary as a result. In fact, the data shows that this is happening most of the time. The AWS Right Sizing tool exists to help users find the correct instance size to meet their needs at the lowest cost.
The tool uses a CloudFormation template that deploys the infrastructure and scripts needed to make right sizing recommendations for your AWS account. This infrastructure includes an EC2 instance that runs Python scripts, a 2-node Redshift cluster for the right sizing analysis, and an S3 bucket for the raw CloudWatch data and the final CSV output. The total cost of this deployment in the us-east-1 region is $0.65 per hour.
The basis of the right sizing logic is to look at the Max CPU from the past 2 weeks of CloudWatch data for each EC2 instance. If the max CPU is above 50% at any point, then it will not recommend a change, but if it is always below 50% then it will attempt to find the cheapest instance size that matches the I/O, memory, network, and at least the max CPU that was found. The final output is a CSV file that includes information about the existing instance sizes, the utilization of those instances, the recommended instance size, and the cost saved per month.
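To make the rule above concrete, here is a minimal Python sketch of that decision logic. It is not the AWS solution's actual code: the `InstanceType` fields, the catalog entries, and the prices are hypothetical, and I/O and network matching are left out for brevity. The text also does not say what happens at exactly 50%, so the sketch assumes that case means "no change".

```python
from dataclasses import dataclass
from typing import Optional, Sequence

CPU_THRESHOLD_PERCENT = 50.0  # cutoff described in the text


@dataclass(frozen=True)
class InstanceType:
    name: str           # hypothetical label, not a real AWS instance type
    cpu_units: float    # relative compute capacity (assumed scale)
    memory_gib: float
    hourly_cost: float  # illustrative USD price, not real AWS pricing


def recommend(current: InstanceType,
              max_cpu_percent: float,
              catalog: Sequence[InstanceType]) -> Optional[InstanceType]:
    """Return a cheaper instance type, or None when no change is recommended."""
    # Assumption: exactly 50% is treated the same as "above 50%".
    if max_cpu_percent >= CPU_THRESHOLD_PERCENT:
        return None
    # CPU the workload actually peaked at, in the same relative units.
    cpu_needed = current.cpu_units * max_cpu_percent / 100.0
    candidates = [
        t for t in catalog
        if t.cpu_units >= cpu_needed             # covers the observed max CPU
        and t.memory_gib >= current.memory_gib   # keeps the same memory
        and t.hourly_cost < current.hourly_cost  # only cheaper options
    ]
    return min(candidates, key=lambda t: t.hourly_cost, default=None)


def monthly_savings(current: InstanceType, recommended: InstanceType,
                    hours_per_month: float = 730.0) -> float:
    """Estimated savings per month, assuming the instance runs all month."""
    return (current.hourly_cost - recommended.hourly_cost) * hours_per_month


# Hypothetical catalog for illustration only.
LARGE = InstanceType("large", 8, 32, 0.40)
MEDIUM = InstanceType("medium", 4, 32, 0.25)
SMALL = InstanceType("small", 2, 16, 0.10)
CATALOG = [LARGE, MEDIUM, SMALL]

pick = recommend(LARGE, 30.0, CATALOG)
print(pick.name if pick else "no change")  # medium
```

Note what the sketch cannot do: because it only looks at max CPU and only searches downward, it never recommends a larger instance, which is one of the limitations discussed next.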
Worth the hassle?
Based on the logic above, the AWS Right Sizing tool does a very basic level of recommendation for instance sizing. There are a few scenarios where these recommendations may not be helpful, such as applications that are memory-intensive or cases where the instance needs to be a larger size than it currently is. The tool also only spits out a CSV with the recommendations, which means you still have to make decisions and take actions based on those recommendations. The CSV file looks like this:
If those recommendations don’t seem to fit what you’re looking for, the nice thing is they offer the full stack, along with all scripts and CloudFormation templates, as an open-source repository. This means you can take the core of the recommendation engine and tweak it to follow your own logic for customized recommendations, or even use it to trigger the resizing of the instance. AWS also offers Trusted Advisor as a part of their Business-level and Enterprise-level support plans, which can offer right sizing recommendations in real time (amongst other health checks and recommendations).
Overall, this AWS right sizing tool can either be a useful check-up tool for your environment, or the basis for your own cost-optimization initiative, but many users will want a more out-of-the-box automated solution for this.
Since changing server sizes and timing this with maintenance windows can be a hassle, ParkMyCloud has introduced a feature to automate the resizing of your EC2 instances. Interested? Check it out with a free trial.
There’s a growing job function among companies using public cloud: the Enterprise Cloud Manager. We recently did a study on ParkMyCloud users which showed that a growing proportion of them have “cloud” or the name of their cloud provider such as “AWS” in their job title. This indicates a growing degree of specialization for individuals who manage cloud infrastructure. In some companies, there is a dedicated role for cloud management – such as an Enterprise Cloud Manager.
Why would you need an Enterprise Cloud Manager?
The world of cloud management is constantly changing and becoming increasingly complex, which can make it confusing, expensive, and hard to control. If someone is not fully versed in this field, they may not always know how to handle problems related to governance, security, and cost control. It is important to dedicate resources in your organization to cloud management and related cloud job roles. This chart from Gartner gives us a look at all the things that are involved in cloud management so we can better understand how many parts need to come together for it to run smoothly.
Having a role in your organization that is dedicated to cloud management allows others, who are not specialized in that field, to focus on their jobs, while also centralizing responsibility. With the help of an Enterprise Cloud Manager, responsibilities are delegated appropriately to ensure cloud environments are handled according to best practices in governance, security, and cost control.
The role of an Enterprise Cloud Manager is to oversee cloud operations. They know all the ins and outs of cloud management so they are able to create processes for resource provisioning and services. Their focus is on optimizing their infrastructure which will help streamline all their cloud operations, improve productivity, and optimize cloud costs.
Automation Tools are Essential
With so much going on in this space, it isn’t possible to expect just one person or a team to manage all of this – you need automation tools. The great thing is that these tools work for companies of any size. Primary users can be people dedicated to this full time, such as an Enterprise Cloud Manager, as well as people managing cloud infrastructure on top of other responsibilities.
Why are these tools important? They provide two main things: visibility into your environment, and the ability to act on the recommendations that visibility produces. Customers that were once managing resources manually are now saving time and money by implementing an automation tool. Take a look at the automation tools that are set up through your cloud vendor, as well as third-party tools that are available. Setting up these tools for automation will lessen the need for routine check-ins and maintenance while ensuring your infrastructure is optimized.
Do I really need one?
If you want your organization to be well informed and up to date, then it is important that you have someone or something in place to oversee your cloud operations – an Enterprise Cloud Manager and automation tools.
Workfront is using ParkMyCloud as their go-to solution for cloud cost control, in addition to multi-cloud management and governance benefits they gain from using CloudHealth. We talked with Randy Goddard, Senior Systems Engineer, about how ParkMyCloud came at the “perfect time” and why he sees it being implemented company-wide over the next 6 months.
Randy, thanks for chatting with us. Can you start by telling us about Workfront, what the company does, and your role in the organization?
Workfront is a category-creating company with a platform centered around work management. We enable people to do their best work and to make it matter. If you think of a system of records, like Salesforce as a system of records for customer contact, or HR as a system of records for employee information, Workfront is a system for operational work.
My role began 5 years ago as a traditional systems engineer and over the last 3 years I have moved into a cloud governance role as we made our transition from data center to cloud services. In my cloud governance role I’m third down from the CTO, reporting to the infrastructure manager.
What public clouds are you using – and how many people at Workfront are using the cloud?
We are multi-cloud, using both AWS and Google Cloud Platform for different workloads — and we have about 200 Workfront employees using these two clouds.
So, you use CloudHealth. Tell us about your experience with their multi-cloud management platform – how did you get started and how does it help you?
We’ve used CloudHealth for roughly 2.5 years. Other members of the team piloted and demoed it to us. They left the company shortly after, so I picked it up right after it was introduced and went on to be part of the implementation.
We use CloudHealth for overall governance of all our cloud services. The benefit is the clear visibility into who is running what, where, and what it costs. The side benefits include rightsizing, security notifications, budgeting, and monitoring, in addition to the major benefit of visibility over resources.
How did you learn about ParkMyCloud?
We learned about ParkMyCloud through CloudHealth, actually. A colleague and I attended a webinar in which they talked about automation and the concept of shutting down resources, introducing ParkMyCloud as the partner solution to accomplish that.
It was perfect timing, really. Just at the moment that CloudHealth and ParkMyCloud partnered and the information was provided in this webinar, one of our busiest units had started working on a homegrown solution. When we became aware of what ParkMyCloud could do, we were in the middle of looking for a solution ourselves, considering build versus buy and determining cost-benefit analysis. We saw the webinar that week, saw the benefit and the cost associated and thought – why would we build our own for the cost that we could get ParkMyCloud?
Was there any pressure from outside of your department to bring cloud costs down?
Since starting on the cloud journey, I have been very well aware of the cost, as has the cloud engineering team. We were really the ones that felt a sense of urgency and paid mind to the actual costs. Outside of this small group, there was a common misconception that the cloud is just free, and there wasn’t an awareness of the need for insight, diligence, and regimen in our cloud environment.
Our team was at the forefront of demonstrating to the business that we need a solution for turning resources off when not using them. We knew we needed to get ahead of costs as they climb and climb and climb, especially in developer environments where resources aren’t required to be on 24/7 and can be oftentimes left unattended for weeks on end. It made a lot of sense to adopt the ParkMyCloud model, pilot it, get it running, and show the business how easy it is to maintain that type of environment.
Funny that you mention the misconception of “it’s free – it’s cloud” – what do you think contributes to that mindset?
I think it’s the migration from traditional data centers in a product-oriented environment or a feature factory. The initial outlay and capital expenditure of buying hardware for a data center is traditionally the only insight that an organization has into how much things cost. But once that capital expenditure is made, the ongoing operational costs are completely obfuscated.
The beauty of cloud is the visibility into how much things actually cost to run. If we want to create widget X, we can now associate direct costs to the infrastructure resources involved in supporting that widget. We never had to pay attention before, but now we have this model where there is free rein in the data center, you get the keys, and you can do what you want. At the same time, there’s a budget associated with all of that and guess who’s in charge? You are. It raises that level of knowledge and awareness that it isn’t just dev costs, it isn’t just the widget, now it’s infrastructure that we have to start paying attention to and architecture around that.
How has your experience been with ParkMyCloud so far?
After a demo, we started a trial and put it to use with cloud credentials for an AWS account that had a lot of development resources. We let the tool model the usage patterns of those resources. After it had enough usage data, we went in to see how automated the process is to spin resources down and back up, and how the scheduling works.
After ParkMyCloud had been running for a couple of weeks, we saw that 7 out of 8 environments with these cloud credentials could be completely shut off for at least 12 hours a day. Because of that, and applying ParkMyCloud to all our enterprise accounts across just the USA, we saw that we could really save a lot of money.
How much are you saving with ParkMyCloud? Any estimates of how much you will save?
The piloting we just did was specifically with automated policy. We set it so that any cloud credential that has ‘-dev’ in the name would be turned off at 7PM our time, and turned on at 7AM. From adding our one cloud credential to see if it could really shut off everything without having to specify the resources by policy, we saw that sure enough it did what we needed it to do and flawlessly. As new things are spun up in that account, they’re shut off at night and turned back on in the morning.
Once we added all of our cloud credentials, we used data from ParkMyCloud’s recommendation screen and our own cost-benefit analysis to present our leadership a safe estimate of $200k in savings a year, but I wouldn’t be surprised at all if it ends up being more. Anytime you can show a cost-benefit analysis with a tool or a resource – that’s solid data you can bank on.
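For readers who want to picture the policy Randy describes, here is a minimal Python sketch of the rule: anything under a credential with ‘-dev’ in its name is off between 7PM and 7AM. This is only an illustration of the logic, not ParkMyCloud's implementation or API, and the function names and example credential names are hypothetical.

```python
from datetime import time

PARK_AT = time(19, 0)   # 7 PM, from the interview
START_AT = time(7, 0)   # 7 AM, from the interview
NAME_MARKER = "-dev"    # credential-name pattern from the interview


def policy_applies(credential_name: str) -> bool:
    """The policy targets any cloud credential with '-dev' in its name."""
    return NAME_MARKER in credential_name


def should_be_running(credential_name: str, now: time) -> bool:
    """True if resources under this credential should be on at `now`."""
    if not policy_applies(credential_name):
        return True  # credentials outside the policy are left untouched
    return START_AT <= now < PARK_AT


def parked_hours_per_day() -> int:
    """Hours per day that a '-dev' resource spends turned off."""
    running_hours = PARK_AT.hour - START_AT.hour
    return 24 - running_hours


# Hypothetical credential names for illustration.
print(should_be_running("platform-dev", time(22, 30)))  # False
print(should_be_running("platform-prod", time(22, 30)))  # True
print(parked_hours_per_day())  # 12
```

Because the rule keys off the credential name rather than a list of resources, anything newly spun up in a matching account is covered automatically, which is the behavior described in the pilot.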
How many teams are using the tool now, and how many could be implementing it in the future?
The cloud engineering team was the poster child, and right now we have 2 full teams. I am going to run through it with another team next week, making 3 total. That team is probably where we will see some of the greatest savings.
Our implementation is ongoing. We recently presented ParkMyCloud and CloudHealth at a company-wide internal product user conference. We participated as individual contributors to demo how we were using the tools that could enable us to be cloud stewards around our cloud spend, prompting a lot of discussion and interest. We walked interested teams through all of our documentation around the tool, providing them with a short onboarding session.
Across the entire product organization, we have 25-30 teams that will be implementing ParkMyCloud.
How are you using ParkMyCloud’s automation functionality?
We’re making good use of SmartParking. One clear benefit is that you can go in and tune your settings to your environment, and once the analysis has been done on your resources, those come up as potential “smartparking recommendations”. It’s kind of a no-brainer – “yeah! turn these off at this time.” We do have some full, customer-facing production accounts that need to stay on, and we can’t spin those down at night, but the other 80% can and should be evaluated with SmartParking.
Another side benefit is that when we onboard teams with ParkMyCloud, the side discussion is always about rightsizing. We can look at the heat map through those SmartParking recommendation settings and see that it doesn’t really make sense to have this m4.2xlarge running 24/7 when it only gets hit certain times of day and max CPU is only going up to 35% – now we can have that rightsizing discussion around resources, opening a dialogue and providing data points. I have also heard some rumblings about automation around rightsizing and we look forward to utilizing that through CloudHealth and ParkMyCloud.
Are you using any other of our tools and features like the Slack integration?
Yes, we do use Slack. In fact, we have been using it since we turned ParkMyCloud on for our development account, and every night we see the report about which resources are spinning down and each morning which ones have been turned back on.
Do you use any other tools or processes in addition to CloudHealth and ParkMyCloud?
No other tools to control costs. We got started with CloudHealth so early on in our journey that I can’t see anything better. Even AWS, with its own dashboard and cloud-native tools, hasn’t matched the reporting, flexibility, and visibility across all of our accounts that CloudHealth gives us – and AWS doesn’t provide multi-cloud management. There aren’t any other tools that we have had to use or employ to get the information that we need.
Now we’re excited to be using ParkMyCloud. We were initially attracted to it because you chose to do one thing and do it well. You’re branching out now, with a couple more things like rightsizing, which you will also do well instead of trying to do a broad spectrum of things poorly or in a mediocre way. That’s what got us – it fits what we need to do.
That’s great to hear. Anything else you would like to add?
This is the beginning of a very good partnership. We have gotten great response and visibility into support and development around the product. I know when I see a problem and I throw it to the ParkMyCloud support team, I always get quick feedback.
That and the obvious: a lot of cloud customers will realize right off the bat that proper governance is not easy. You can’t go into being a cloud user thinking that it’s going to be cheaper or clearly visible, especially with the complexity of adding multiple accounts and then complicating it with multi-cloud management. You’ve got to employ tools that allow you to gain visibility into and management over those resources. Without ParkMyCloud and CloudHealth, we wouldn’t have that.
As container technology moves past something new and into the mainstream, users are concerned about the next step: container optimization. In our conversations with customers and potential customers, containers have been a consistent topic for the last few years, typically focused on production environments. However, recent conversations have become more focused, specifically on how to optimize container spending.
Kubernetes – which seems to be the most popular of container services among our customer base – does allow for a number of ways to optimize for costs and to maximize performance. We have identified five specific opportunities ripe for container optimization. Take a look at these within your own environments.
1) Rightsize Your Pods
Kubernetes Pods are the smallest deployable computing units in the Kubernetes container environment. It is a common practice to use a standard template for limits and requests for pod provisioning. Requests describe the minimum CPU and memory a pod needs in order to be scheduled on a node, while limits describe the maximum amount of CPU and memory the pod can consume on that specific node. Typically engineers set the initial limits by using a rule of thumb, such as doubling the expected usage just to be on the safe side, and then plan to change them later once they have some data to look at. As with many things in life, “later” rarely happens. As a result, the footprint of the cluster inflates over time, exceeding the actual demand for the services running inside the cluster.
Just think about it: if every pod is over-provisioned by 50% (that is, it uses only half of what it has been allocated) and the cluster is always 80% full, then 40% of the cluster capacity is allocated but not used, or, simply put, wasted.
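The arithmetic behind that 40% figure is short enough to write out. The Python sketch below assumes that "over-provisioned by 50%" means half of each pod's allocation goes unused, which is what the doubling rule of thumb produces. The function names are ours, for illustration only.

```python
def unused_share(allocated: float, actually_used: float) -> float:
    """Share of a pod's allocation that it never uses."""
    return 1.0 - actually_used / allocated


def wasted_cluster_fraction(cluster_fill: float, unused: float) -> float:
    """Fraction of total cluster capacity that is allocated but idle."""
    return cluster_fill * unused


# Doubling rule of thumb: allocate 2 CPUs for a pod that needs 1.
share = unused_share(allocated=2.0, actually_used=1.0)
wasted = wasted_cluster_fraction(cluster_fill=0.80, unused=share)

print(f"{share:.0%}")   # 50%
print(f"{wasted:.0%}")  # 40%
```

The same two functions can be fed real numbers from your own monitoring data to see how far your cluster is from this worst-case illustration.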
2) Turn Off Idle Pods
Many standard instances/VMs and databases in non-production environments are idle outside of working hours and can be turned off or “parked”. The same case exists for Pods, which in non-production environments can and should be scheduled in the same way.
3) Rightsize Your Nodes
Too many worker nodes are the wrong size and type. Kubernetes permits co-allocating the applications on the same nodes, which can dramatically reduce the cloud bill. Yet, incorrectly sized instances and volumes can lead to the inflation of the cost of Kubernetes clusters. Rightsizing could save up to 50% (particularly if no previous action has been taken to rightsize your nodes).
Another thing to consider is that smaller nodes have a higher relative OS footprint and increase management overhead. The smaller the node, the higher the proportion of stranded resources. Stranded resources are CPU or memory that sit idle yet cannot be allocated to any pod, because the pods waiting to be scheduled are too big to claim them. If pod sizes are close to the size of the node (server), the percentage of resources that are stranded gets higher.
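A quick way to see the stranded-resource effect is to pack identical pods onto nodes of different sizes. The Python sketch below is a deliberately simplified illustration: it considers CPU only, assumes all pods are the same size, and ignores the OS and system overhead mentioned above. The node and pod sizes are made-up examples.

```python
def stranded_fraction(node_cpu: float, pod_cpu: float) -> float:
    """Fraction of a node's CPU left over that no further pod can claim."""
    pods_that_fit = int(node_cpu // pod_cpu)
    claimed = pods_that_fit * pod_cpu
    return (node_cpu - claimed) / node_cpu


# The same 1.5-CPU pod on a small node and on a large node.
print(f"{stranded_fraction(node_cpu=4, pod_cpu=1.5):.2%}")   # 25.00%
print(f"{stranded_fraction(node_cpu=16, pod_cpu=1.5):.2%}")  # 6.25%

# A pod whose size is close to the node's size strands even more.
print(f"{stranded_fraction(node_cpu=4, pod_cpu=2.5):.2%}")   # 37.50%
```

In each case the leftover CPU is paid for but cannot run another pod of that size, which is why node size and pod size need to be chosen together.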
4) Consider Storage Opportunities
Out of the box, containers lose their data when they restart. This is fine for stateless components but becomes an issue when a persistent data store is required. One place to look for additional container optimization opportunities is the overprovisioning of persistent storage (EBS, Azure Storage Disks, etc) related to your containers. There are a number of options to optimize container storage, particularly virtualized storage that can be shared by multiple containers, and which persists over time, without being destroyed when individual containers are destroyed. There are a few different persistent-storage plugins and plugin-driven storage solutions available from third-party vendors.
5) Review Purchasing Options
All of the preceding options relate to the actual configuration of your container infrastructure. Just as important is ensuring that your purchasing options closely align with your needs. Ensuring the correct instance/VM purchase type for your containerized infrastructure is critical to ensuring flexibility and maximizing ROI. Carefully analyze your purchasing options (e.g. on-demand, reservations and spot) to select the right option for your workload, both in terms of size and usage schedule. Note that reserved instances are not always the best option for resources that can be scheduled to be turned off. Leverage cost optimization tools to support the earlier options for instance scheduling and rightsizing. Such tools can often change the equation and help avoid lock-ins and upfront commitments.
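To illustrate why a reservation is not automatically the cheaper choice for a node that can be scheduled off, here is a small Python comparison. The hourly rates are hypothetical placeholders, not real cloud pricing, and the model assumes a reservation is billed for every hour of the week whether or not the node is running.

```python
HOURS_PER_WEEK = 168


def weekly_on_demand_cost(hourly_rate: float, hours_on: float) -> float:
    """On-demand is billed only for the hours the node is running."""
    return hourly_rate * hours_on


def weekly_reserved_cost(effective_hourly_rate: float) -> float:
    """Assumption: a reservation is paid for all 168 hours of the week."""
    return effective_hourly_rate * HOURS_PER_WEEK


def break_even_hours(on_demand_rate: float, reserved_rate: float) -> float:
    """Weekly running hours below which scheduled on-demand is cheaper."""
    return HOURS_PER_WEEK * reserved_rate / on_demand_rate


# Hypothetical rates: $0.10/hour on-demand, $0.06/hour effective reserved.
ON_DEMAND_RATE = 0.10
RESERVED_RATE = 0.06

# A non-production node running 12 hours a day, 5 days a week.
scheduled_hours = 12 * 5

print(f"{weekly_on_demand_cost(ON_DEMAND_RATE, scheduled_hours):.2f}")  # 6.00
print(f"{weekly_reserved_cost(RESERVED_RATE):.2f}")                     # 10.08
print(f"{break_even_hours(ON_DEMAND_RATE, RESERVED_RATE):.1f}")         # 100.8
```

With these placeholder rates, the scheduled on-demand node wins as long as it runs fewer than about 100 hours a week. Plug in your own rates and schedule before committing to a reservation.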
Container Optimization is Just Another Kind of Resource Optimization
The opportunities to save money through container optimization are in essence no different than for your non-containerized resources. Native tools, from either the cloud provider or open source, can help with this, but their capabilities are limited. For a fully optimized environment, you’ll want to take advantage of the growing ecosystem of specific cost optimization tools.
Stay tuned for news from ParkMyCloud on this front coming soon!
Given our focus on public cloud cost control, we here at ParkMyCloud are always trying to understand more about the future trends in cloud computing, specifically the public cloud infrastructure (IaaS) market. Now that public cloud has reached a key peak in growth, there’s a common theme. While new services and products continue to develop, more and more of them focus not just on creating capabilities that were previously lacking, but on optimizing what already exists.
Are Cloud Services Still Growing?
Before we dive into optimization, let’s take a look at how the cloud market continues to grow in 2019 and beyond. Gartner estimates that $206B will be spent on public cloud services in 2019, up 17% from 2018 as outlined in the table below:
And according to IDC, almost half of IT spending was cloud-based in 2018, “reaching 60% of all IT infrastructure and 60-70% of all software, services and technology spending by 2020.” So, between Gartner and IDC, no one expects cloud adoption and spending to slow down any time soon. So what’s driving this growth and what are the future trends in cloud computing we should be on the lookout for in 2019 and beyond?
The Future Trends in Cloud Computing You’ve Probably Heard About
There is definitely a lot of hype around Blockchain, Quantum Computing, Machine Learning, and AI, as there should be. But at a more basic level, cloud computing is changing businesses in many ways. Whether it is the way they store their data, improving agility and go to market for faster release of new products and services, or how they store and protect their secure information, cloud computing is benefitting all businesses in every sector. Smart businesses are always looking for the most innovative ways to improve and accomplish their business objectives, i.e., make money.
When it comes to cloud technology, more and more businesses are realizing the benefits that cloud can provide them and are beginning to seek more cloud computing options to conduct their business activities. And obviously, Amazon, Microsoft, Google, Alibaba, IBM, and Oracle plan to capture this spend by providing a dizzying array of IaaS and PaaS offerings to help enterprises build and run their services.
How These Trends Make Computing Better
- Containers Become Mainstream: Application containerization is more than just a new buzz-word in cloud computing; it is changing the way in which resources are deployed into the cloud. More and more companies utilized containers in 2018. This is another trend that will continue into 2019 and beyond. How it Optimizes: at a development level, containerization allows applications to be developed and deployed faster than ever before. If used efficiently, they can also result in a lower cloud bill.
- Multi-Cloud and Hybrid Cloud: Once predicted as the future, the time of multi-cloud and hybrid cloud has arrived and will continue to grow. Most enterprises (74 percent) described their strategy as hybrid/multi-cloud in 2018. In addition, 62 percent of public cloud adopters are using 2+ unique cloud environments/platforms. These numbers will only go up in 2019. While this offers plenty of advantages to organizations looking to benefit from different cloud capabilities, using more than one CSP complicates governance, cost optimization, and cloud management further as native CSP tools are not multi-cloud. As cloud computing costs remain a primary concern, it’s crucial for organizations to stay ahead with insight into cloud usage trends to manage spend (and prevent waste). How it Optimizes: it’s a complex problem, but we do see many organizations adopting a multi-cloud strategy with cost control in mind, as it avoids vendor lock-in and allows flexibility for deploying workloads in the most cost-efficient manner (and at a high level, keeps the cloud providers competitive against each other to continually lower prices).
- Growth of Managed Services: The global cloud managed services market grew rapidly in 2018 and is expected to reach USD 82.51 billion by 2025, according to a study conducted by Grand View Research, Inc. Enterprises are focusing on their primary business operations, which results in higher cloud managed services adoption. Business services, security services, network services, data center services, and mobility services are major categories in the cloud managed services market. Implementation of these services will help enterprises reduce IT and operations costs and will also enhance productivity of those enterprises. How it Optimizes: managed service providers – the good ones, anyway – are experts in their field and some of the most informed consumers of public cloud. By handing cloud operations off to an outside provider, companies are not only optimizing their own time and human resources – they’re also pushing MSPs to become efficient cloud managers so they can remain competitive and keep costs down for themselves and their customers.
Cloud Trends Are Always Evolving
While today, it sometimes seems like we’ve seen the main components of cloud operations and all that’s left to do is optimize them, history tells us that’s not the case. Cloud has been and will continue to be a disruptive force in enterprise IT, and future trends in cloud computing will continue to shape the way enterprises leverage public, private and hybrid cloud. Remember: AWS was founded in 2006, the cloud infrastructure revolution is still in early days, and there is plenty more XaaS to be built.
Cofense uses ParkMyCloud for multi-cloud cost management. We talked with Todd Morgan, Senior Systems Engineer, about how his team is using the platform to gain “sizable cost savings” at scale.
Thank you for taking the time to speak with us. Can you tell us about Cofense, your role, and the team you work with?
Cofense is a SaaS company in the cybersecurity world. We’ve been around for about 10 years, so we don’t have a legacy of using on-prem infrastructure. The company has leveraged the cloud for their infrastructure needs. My role is that of engineer and architect working in a traditional IT department, and I’m in charge of managing our resources across cloud service providers.
Can you describe how you’re using the cloud and tell us more about what that looks like in your cloud environments?
We are a multi-cloud customer – it gives us a lot of flexibility. We can make cost decisions around which CSP has the most attractive cost models. Also, some solutions are a better fit for one place versus another. We leverage a wide variety of the cloud services available today, including VMs and RDS.
What was it that drove you to look for a multi-cloud cost management tool?
Part of shopping around for cost optimization was to gain insights and be able to make informed decisions for how we use our CSPs. We had been using a cloud tool for security purposes – to identify risks that we need to mitigate. We weren’t happy with the product, so rather than finding a better product that does the same thing, we expanded our scope to include other features such as cost management and config management, hoping to find one cloud tool that does it all. The search revealed that a single tool to meet all of our requirements doesn’t exist today. So, the goal shifted to finding a couple of tools that complement each other. While focusing on cost management requirements, I landed on ParkMyCloud.
I’ve kept a running scorecard of all the other cloud tools we’ve done trials and demos for. I’ve got some winners in mind to purchase, but we’re also thinking of making our own solution while the marketplace continues to evolve. We bought into ParkMyCloud because we were satisfied with the trial, the product met our requirements, and we were pleased with how the product roadmap aligns with our goals.
How’d you hear about ParkMyCloud and how are you using it?
I learned about ParkMyCloud from networking conversations with current and former co-workers.
One of our requirements was to identify idle resources that were just sitting and not being used. I wanted a tool that would help give me insight into resource utilization and clearly report on idle resources. Where ParkMyCloud shined was in making the scheduling of resource on-hours turnkey.
We have also been using ParkMyCloud’s API to easily override schedules. For example, if someone needs to use a server over the weekend but it’s scheduled to turn off, they can self-service the request to override the schedule.
How do you determine schedules between different departments?
I started with an aggressive plan that was based upon the usage metrics provided by ParkMyCloud. Then I would meet with each team owning a subset of resources, looking to get their sign-off on adjusted schedules. In most cases the teams would outline valid use cases for times when resources looked idle but they do need them on. After shaving back my plan to meet their needs, we still have sizable cost savings at the end of the day.
What other benefits have you gotten from using the ParkMyCloud platform?
Something else that’s been happening is I’m finding servers that don’t need to be on at all. ParkMyCloud is proving to be a conversation starter about resource usage. These business conversations have led me to decommission idle resources altogether.
For the resources we do schedule, the cost savings at scale are sizable. We only have a few examples of resources that need to be always-on 24x7x365. For the majority of resources, we have assigned new schedules. Also, when new resources are provisioned, we’re changing it so the default is now scoped to only be on during working hours.
Anything else to add or feedback to share on your use of the platform?
We’re very happy with the tool and the engagement with your team.
Thank you, Todd!