We talked with Jedidiah Hurt, DevOps and technical lead at Hybrid Events Group, about how his company is using ParkMyCloud to automate EC2 instance scheduling, saving hours of development work. Below is a transcript of our conversation.
Appreciate you taking the time to speak with us today. Can you start off by giving us some background on your role, what Hybrid Events Group does, and why you got into doing what you do?
I do freelance work for Hybrid Events Group and am now moving into the role of technical lead. We had a big client we were working with this spring and we needed to fire up several EC2 instances. We were doing live broadcasting events across the U.S., which is what the company specializes in – events A/V services. So we do live webcasting, and we can do CapturePro, another service we offer where we basically just show up to any event that someone would want to record, which usually is workshops and keynotes at tech conferences, and we record on video and also capture the presenter’s presentation in video in real time.
ParkMyCloud, what we used it for, was just to automate EC2 instances for doing live broadcasts.
Was there any reason you chose AWS over others like Azure or Google Cloud, out of curiosity?
I just had the most experience with AWS; I was using AWS before Azure and Google Cloud existed. So I haven’t, or I can’t say that I’ve actually really given much of a trial to Azure or Google Cloud. I might have to give them a look here sometime in the future.
Do you use any PaaS services in AWS, or do you focus on compute databases and storage?
Yeah, not a whole lot right now. Just your basic S3, EC2, and I think we are probably going to move into elastic load balancing and auto scaling groups within the next few months or so as we build out our platform.
Do you use Agile development process to build out your platform and provide continuous delivery?
So, I am an agile practitioner, but we are just kind of brown fielding the platform. We are in the architecture stage right now, so we will be doing all of that, as far as continuous deployment, and hopefully continuous integration where we actually have some automated testing.
As far as tools, I’m the only developer on the team right now, so we won’t really have a full Agile or be fully into Agile. We haven’t got boards and sprints and planning, weekly meetings, and all those things, because it’s just me. But we integrate portions of it, as far as having stakeholders kind of figuring out what our minimum viable product is.
What drove you to look for something like ParkMyCloud, and how did you come across it?
ParkMyCloud enabled us to automate a process that we were going to do manually, or that I was going to have to write scripts for and maintain. I think initially I was looking into just using the AWS CLI, and some other kind of task scheduler, to bring up the instances and then turn them off after our daily broadcast session was over. I did a little bit of googling to see if there were any time-based solutions available and found ParkMyCloud, and this platform does exactly what’s needed and more.
And you are using the free tier ParkMyCloud, correct?
Yes. I don’t remember what the higher tiers offered, but this was all we really needed. We just had three or four large EC2 instances that we wanted to bring up for four to five hours a day, Monday through Friday, so it had all the core features that we currently need.
Anything that stood out for you in terms of using the product?
I’d say on the plus side I was a little bit concerned at the beginning as far as the reliability of the tool, because we would have been in big trouble with our client if ParkMyCloud failed to bring up an instance at a scheduled start time. We used it, or I guess I would say we relied on it, every day for 2 months solid, and never saw any issues as far as instances not coming up when they were supposed to, or shutting down when they were not supposed to. I was really pleased with, what I would say, the reliability of the tool – that definitely stuck out to me.
From an ROI standpoint, are you satisfied with savings and the way the information is presented to you?
Yeah, absolutely. And I think for us, the ROI wasn’t so much the big difference between having the instances running all the time, or having the instances on a schedule. The ROI was more from the fact that I didn’t have to build the utility to accomplish that because you guys already did that. So in that sense, it probably saved me many hours of development work.
Also, that kind of uneasy feeling you get when you hack up a little script and put it into production versus having a well-tested, fully-automated platform. I’m really happy that we found ParkMyCloud; it has definitely become an important part of our infrastructure management over the last few months.
As our final question, how much overhead or time did you have to spend in getting ParkMyCloud set up to manage your environment, and did you have to do anything on a daily or weekly basis to maintain it?
So, as I said, our particular use case was very basic, so it ended up being three instances that we needed to bring up for three or four hours a day and then shut them down. I’d say it took me ten to fifteen minutes to get rolling with ParkMyCloud and automate EC2 instance scheduling. And now we save thousands of dollars per month on our AWS bill.
Over the past couple of years we have had a lot of conversations with large and small enterprises regarding cloud management and cloud optimization tools, all of whom were looking for cost control. They wanted to reduce their bills, just like any utility you might run at home — why spend more than you need to? Amazon Web Services (AWS) actively promotes optimizing cloud infrastructure, and where they lead, others follow. AWS even goes so far as to suggest the following simple steps to control AWS costs:
- Right-size your services to meet capacity needs at the lowest cost;
- Save money when you reserve;
- Use the spot market;
- Monitor and track service usage;
- Use Cost Explorer to optimize savings; and
- Turn off idle instances (we added this one).
It’s interesting to note the use of the word ‘control’ even though the section is labeled Cost Optimization.
So where is all of this headed? It’s great that AWS offers its own solutions, but what if you want automation built into your DevOps processes, multi-cloud support (or you plan to go multi-cloud), real-time reporting on these savings, and the ability to turn stuff off when you are not using it? Then you likely need a third-party tool to help with these tasks.
Let’s take a quick look at a description of each AWS recommendation above to get a better understanding of each offering. We will then explore whether these cost optimization options can be automated as part of a continuous cost control process:
- Right-sizing – Both the EC2 Right Sizing solution and AWS Trusted Advisor analyze utilization of EC2 instances running during the prior two weeks. The EC2 Right Sizing solution analyzes all instances with a max CPU utilization less than 50% and determines a more cost-effective instance type for that workload, if available.
- Reserved Instances (RI) – For certain services like Amazon EC2 and Amazon RDS, you can invest in reserved capacity. With RIs, you can save up to 75% over equivalent ‘on-demand’ capacity. RIs are available with three payment options: (1) all upfront, (2) partial upfront, or (3) no upfront.
- Spot – Amazon EC2 Spot instances allow you to bid on spare Amazon EC2 computing capacity. Since Spot instances are often available at a discount compared to On-Demand pricing, you can significantly reduce the cost of running your applications, grow your application’s compute capacity and throughput for the same budget, and enable new types of cloud computing applications.
- Monitor and Track Usage – You can use Amazon CloudWatch to collect and track metrics, monitor log files, set alarms, and automatically react to changes in your AWS resources. You can also use Amazon CloudWatch to gain system-wide visibility into resource utilization, application performance, and operational health.
- Cost Explorer – AWS Cost Explorer gives you the ability to analyze your costs and usage. Using a set of default reports, you can quickly get started with identifying your underlying cost drivers and usage trends. From there, you can slice and dice your data along numerous dimensions to dive deeper into your costs.
- Turn off Idle Instances – You can “park” your cloud resources by assigning them schedules of the hours during which they will run or be temporarily stopped (i.e., parked). Most non-production resources (dev, test, staging, and QA) can be parked at nights and on weekends, when they are not being used. On the flip side, some batch processing or load testing applications can only run during non-business hours, so they can be shut down during the day.
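To make the parking idea concrete, here is a minimal Python sketch of a time-based schedule check. The 08:00 to 18:00 weekday window and the function names are assumptions chosen for illustration; they are not taken from AWS or ParkMyCloud, and a real scheduler would also need to call the cloud provider to start or stop the resource.

```python
from datetime import datetime

# Hypothetical schedule (an assumption for this sketch):
# resources run 08:00-18:00, Monday through Friday, and are parked otherwise.
ON_START_HOUR = 8
ON_END_HOUR = 18
WORKDAYS = {0, 1, 2, 3, 4}  # datetime.weekday(): Monday is 0, Sunday is 6


def should_be_running(now: datetime) -> bool:
    """Return True if a resource on this schedule should be on at `now`."""
    return now.weekday() in WORKDAYS and ON_START_HOUR <= now.hour < ON_END_HOUR


def parked_fraction() -> float:
    """Fraction of a 168-hour week that the resource spends parked."""
    on_hours = len(WORKDAYS) * (ON_END_HOUR - ON_START_HOUR)
    return 1 - on_hours / 168
```

Under these assumed hours, a resource is on for 50 of the 168 hours in a week, so it is parked roughly 70% of the time.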
Many of these AWS solutions offer recommendations but require manual effort to realize the benefits. This is why third-party solutions, including cloud management, cloud governance and visibility, and cloud optimization tools, have seen widespread adoption. In part two of this blog we will look at some of those tools, the benefits and approach of each, and the level of automation to be gained.
Not only has it become apparent that public cloud is here to stay, it’s also growing faster as time goes on (by 2020, it is estimated that more than 40% of enterprise workloads will be in the cloud). IT infrastructure has changed permanently, and enterprise organizations are coming to terms with some of the side effects of this shift. One of those side effects is the need for tools and processes (and even teams in larger organizations) dedicated to cloud cost management and cost control. Executives from all teams within an organization want to see costs, projections, usage, savings, and quantifiable efforts to save the company money while maximizing IT throughput as enterprises shift resources to the cloud.
There’s a variety of tools to solve some of these problems, so let’s take a look at a few of the major ones. All of the tools mentioned below support Amazon AWS, Microsoft Azure, and Google Cloud Platform.
CloudHealth provides detailed analytics and reporting on your overall cloud spend, with the ability to slice-and-dice that data in a variety of ways. Recommendations about your instances are made based on a score driven by instance utilization and cloud provider best practices. This data is collected from agents that are installed on the instances, along with cloud-level information. Analysis and business intelligence tools for cloud spend and infrastructure utilization are featured prominently in the dashboard, with governance provided through policies driven by teams for alerts and thresholds. Some actions can be scripted, such as deleting elastic IPs/snapshots and managing EC2 instances, but reporting and dashboards are the main focus.
Overall, the platform seems to be a popular choice for large enterprises wanting cost and governance visibility across their cloud infrastructure. Pricing is based on a percentage of your monthly cloud spend.
CloudCheckr provides visibility into governance, security, compliance, and cost problems by running analytics and checks against logic built into their platform. It relies on non-native tools and integrations to take action on the recommendations, such as Spotinst, Ansible, or Chef. CloudCheckr’s reports cover a wide range of topics, including inventory, utilization, security, costs, and overall best practices. The UI is simple and is likely equally well regarded by technical and non-technical users.
The platform seems to be a popular choice with small and medium-sized enterprises looking for greater overall visibility and recommendations to help optimize their use of cloud. Given their SMB focus, customers are often provided this service through MSPs. Pricing is based on your cloud spend, but a free tier is also available.
Cloudyn (recently acquired by Microsoft) is focused on providing advice and recommendations along with chargeback and showback capabilities for enterprise organizations. Cloud resources and costs can be managed through their hierarchical team structure. Visibility, alerting, and recommendations are made in real time to assist in right-sizing instances and identifying outlying resources. Like CloudCheckr, it relies on external tools or people to act upon recommendations and lacks automation.
Their platform options include supporting MSPs in the management of their end customers’ cloud environments as well as an interesting cloud benchmarking service called Cloudyndex. Pricing for Cloudyn is also based on your monthly cloud spend. Much of the focus seems to be on current Microsoft Azure customers and users.
Unlike the other tools mentioned, ParkMyCloud focuses on actions and automated scheduling of resources to provide optimization and immediate ROI. Reports and dashboards are available to show the cost savings provided by these schedules and recommendations on which instances to park. The schedules can be manually attached to instances, or automatically assigned based on tags or naming schemes through its Policy Engine. It pairs well with the other previously mentioned recommendation-based tools in this space to provide total cost control through both actions and reporting.
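The idea of assigning schedules automatically from tags or naming schemes can be sketched in a few lines of Python. This is only an illustration of rule-based matching in general, not ParkMyCloud’s Policy Engine: the `Rule` class, the `pick_schedule` function, and the schedule names are all hypothetical.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase
from typing import Dict, List, Optional


@dataclass
class Rule:
    """Hypothetical policy rule: attach `schedule` when every set condition holds."""
    schedule: str
    name_pattern: Optional[str] = None  # glob matched against the instance name
    tag_key: Optional[str] = None       # tag that must be present...
    tag_value: Optional[str] = None     # ...with exactly this value

    def matches(self, name: str, tags: Dict[str, str]) -> bool:
        # A rule with no conditions set matches every instance.
        if self.name_pattern is not None and not fnmatchcase(name, self.name_pattern):
            return False
        if self.tag_key is not None and tags.get(self.tag_key) != self.tag_value:
            return False
        return True


def pick_schedule(name: str, tags: Dict[str, str], rules: List[Rule]) -> Optional[str]:
    """Return the schedule of the first matching rule, or None if nothing matches."""
    for rule in rules:
        if rule.matches(name, tags):
            return rule.schedule
    return None


rules = [
    Rule(schedule="weekdays-8-to-6", tag_key="env", tag_value="dev"),
    Rule(schedule="off-on-weekends", name_pattern="qa-*"),
]
print(pick_schedule("web-1", {"env": "dev"}, rules))      # weekdays-8-to-6
print(pick_schedule("prod-api", {"env": "prod"}, rules))  # None
```

Evaluating rules in order and stopping at the first match keeps the outcome predictable when an instance satisfies more than one rule.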
ParkMyCloud is widely used by DevOps and IT Ops in organizations from small startups to global multinationals, all of whom are keen to automate cost control by leveraging ParkMyCloud’s native API and pre-built integrations with tools like Slack, Atlassian, and Jenkins. Pricing is based on a per-instance cost, with a free tier available.
Cloud cost management isn’t just a “should think about” item, it’s a “must have in place” item, regardless of the size of a company’s cloud bill. Specialized tools can help you view, manage, and project your cloud costs no matter which provider you choose. The right toolkit can supercharge your IT infrastructure, so consider a combination of some of the tools above to really get the most out of your AWS, Azure, or Google environment.
Webhooks are user-defined HTTP POST callbacks. They provide a lightweight mechanism for letting remote applications receive push notifications from a service or application, without requiring polling. In today’s IT infrastructure that includes monitoring tools, cloud providers, DevOps processes, and internally-developed applications, webhooks are a crucial way to communicate between individual systems for a cohesive service delivery. Now, in ParkMyCloud, webhooks are available for even more powerful cost control.
For example, you may want to let a monitoring solution like Datadog or New Relic know that ParkMyCloud is stopping a server for some period of time, so that alerts from that monitoring system are suppressed while the server is parked and monitoring is re-enabled once the server is unparked (turned on). Another example would be to have ParkMyCloud post to a chatroom or dashboard when schedules have been overridden by users. This is done by enabling system notifications to cloud webhooks.
Previously, only two options were provided when configuring system level and user notifications in ParkMyCloud: System Errors and Parking Actions. We have added three new notification options for both system level and user notifications. Descriptions for all five options are provided below:
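As a rough sketch of the monitoring example, the Python function below turns the body of a webhook POST into a mute or unmute instruction. The payload shape is an assumption made for illustration: the field names `type`, `action`, and `resource` are hypothetical and do not come from ParkMyCloud’s documentation, so check the real payload before relying on them.

```python
import json


def monitoring_action(body: bytes) -> dict:
    """Map a (hypothetical) parking notification to a monitoring instruction.

    Assumed JSON payload, for illustration only:
        {"type": "parking_action", "action": "stop" | "start", "resource": "<name>"}
    """
    event = json.loads(body)
    if event.get("type") != "parking_action":
        return {"op": "ignore"}
    if event.get("action") == "stop":
        # The server is being parked: silence alerts for it.
        return {"op": "mute", "resource": event.get("resource")}
    if event.get("action") == "start":
        # The server is being unparked: resume alerting.
        return {"op": "unmute", "resource": event.get("resource")}
    return {"op": "ignore"}


payload = b'{"type": "parking_action", "action": "stop", "resource": "batch-1"}'
print(monitoring_action(payload))  # {'op': 'mute', 'resource': 'batch-1'}
```

The function would be called by whatever HTTP endpoint receives the POST. Keeping the decision logic separate from the web server makes it easy to test without any network traffic.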
- System Errors – These are errors occurring within the system itself such as discovery errors, parking errors, invalid credential permissions, etc.
- System Maintenance and Updates – These are the notifications provided via the banner at the top of the dashboard.
- User Actions – These are actions performed by users in ParkMyCloud such as manual resource state toggles, attachment or detachment of schedules, credential updates, etc.
- Parking Actions – These are actions specifically related to parking such as automatic starting or stopping of resources based on defined parking schedules.
- Policy Actions – These are actions specifically related to configured policies in ParkMyCloud such as automatic schedule attachments based on a set rule.
We have made the options more granular to give you better control over which events you do and do not see.
These options can be seen when adding or modifying a channel for system level notifications (Settings > System Level Notifications). In the image shown below, a channel is being added.
Note: For additional information regarding these options, click on the Info Icon to the right of Notify About.
The new notification options are also viewable by users who want to set up their own notifications (Username > My Profile). These personal notifications are sent via email to the address associated with your user. Personal notifications can be set up by any user, while Webhooks must be set up by a ParkMyCloud Admin.
After clicking on Notifications, you will see the above options and may use the checkboxes to select the notifications you want to receive. You can also set each webhook to handle a specific ParkMyCloud team, then set up multiple webhooks to handle different parts of your organization. This offers maximum flexibility based on each team’s tools, processes, and procedures. Once finished, click on Save Changes. Any of these notifications can then be sent to your cloud webhook and even Slack to ensure ParkMyCloud is integrated into your cloud management operations.
Large companies have traditionally had an impressive list of batch workloads, which run at night, when people have gone home for the day. These include such things as application and database backup jobs; extract, transform, and load (ETL) jobs; disaster recovery (DR) environment checks and updates; online analytical processing (OLAP) jobs; and monthly/quarterly billing updates or financial “close”, to name a few.
Traditionally, with on-premise data centers, these workloads have run at night to allow the same hardware infrastructure that supports daytime interactive workloads to be repurposed, if you will, to run these batch workloads at night. This served a couple of purposes:
- It avoided network contention between the two workloads (as both are important), allowing the interactive workloads to remain responsive.
- It avoided data center sprawl by using the same infrastructure to run both, rather than having dedicated infrastructure for interactive and batch.
Things Are Different with Public Cloud
As companies move to the public cloud, they are no longer constrained by having to repurpose the same infrastructure. In fact, they can spin up and spin down new resources on demand in AWS, Azure or Google Cloud Platform (GCP), running both interactive and batch workloads whenever they want.
Network contention is also less of a concern, since the public cloud providers typically have plenty of bandwidth. The exception, of course, is where batch workloads use the same application interfaces or APIs to read/write data.
So, moving to public cloud offers a spectrum of possibilities, and you can use one or any combination of them:
- You can run batch nightly using similar processes as you do in your online data centers, but on separately provisioned instances/virtual machines. This probably requires the least effort to move batch to the public cloud and the least change to your DevOps processes, and it may save you some money, since instances can be sized specifically for the workloads and you can leverage cloud cost savings options (e.g., reserved instances);
- You can run batch on separately provisioned instances/virtual machines, but concurrently with existing interactive workloads. This will likely result in some additional work to change your DevOps processes, but offers more freedom and similar benefits mentioned above. You will still need to pay attention to application interfaces/APIs the workloads may have in common; or
- At the extreme end of the cloud adoption spectrum, you could use cloud provider platform as a service (PaaS) offerings, such as AWS Batch, Microsoft Azure Batch or GCP Cloud Dataflow, where batch is essentially treated as a “black box”. A detailed comparison of these services is beyond the scope of this blog. However, in summary, these are fully managed services, where you queue up input data in an S3 bucket, object blob or volume along with a job definition, appropriate environment variables and a schedule, and you’re off to the races. These services employ containers and autoscaling/resource groups/instance groups where appropriate, with options to use less expensive compute in some cases. (For example, with AWS Batch, you have the option of using spot instances.)
The advantage of this approach is potentially faster time to implement and (maybe) less expensive monthly cloud costs, because the compute services run only at the times you specify. The disadvantages of this approach may be the limited degree of operational/configuration control you have; the fact that these services may be totally foreign to your existing DevOps folks/processes (i.e., there is a steep learning curve); and that it may tie you to that specific cloud provider.
A Simple Alternative
If you are looking to minimize impact to your DevOps processes (that is, the first two approaches mentioned above), but still save money, then ParkMyCloud can help.
Normally, with the first two options, there are cron jobs scheduled to kick off batch jobs at the appropriate times throughout the day, but the underlying instances must be running for cron to do its thing. You could use ParkMyCloud to put parking schedules on these resources, such that they are turned OFF for most of the day, but are turned ON just in time to still allow the cron jobs to execute.
We have been successfully using this approach in our own infrastructure for some time now, to control a batch server used to do database backups. This would, in fact, provide more savings than AWS reserved instances.
Let’s look at a specific example in AWS. Suppose you have an m4.large server you use to run batch jobs. Assuming Linux pricing in us-east-1, this server costs $0.10 per hour, or about $73 per month. Suppose you have configured cron to start batch jobs at midnight UTC and that they normally complete 1 to 1-½ hours later.
You could purchase a Reserved Instance for that server, where you either pay nothing upfront or all upfront and your savings would be 38%-42%.
Or, you could apply a ParkMyCloud schedule under which the instance is only ON from 11 pm-1 am UTC, allowing enough time for the cron jobs to start and run. The savings in that case would be 87.6% (including the cost of ParkMyCloud) without the need for a one-year commitment. Depending on how many batch servers you run in your environment and their sizes, that could be some hefty savings.
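The arithmetic behind that figure can be sketched in Python. The $0.10 hourly rate and the two-hour daily window come from the example above; the 730-hour month is implied by the $73 monthly cost. The ParkMyCloud fee of $3 per instance per month is an assumption made for this sketch, chosen because it reproduces the 87.6% figure.

```python
HOURS_PER_MONTH = 730  # average month; consistent with $0.10/hour being about $73/month


def parked_savings_pct(hourly_rate: float, on_hours_per_day: float,
                       tool_fee_per_month: float) -> float:
    """Net percentage saved versus running the instance around the clock."""
    always_on = hourly_rate * HOURS_PER_MONTH
    scheduled = always_on * on_hours_per_day / 24
    net_saving = always_on - scheduled - tool_fee_per_month
    return 100 * net_saving / always_on


# m4.large at $0.10/hour, ON two hours a day, assumed $3/month tool fee.
print(round(parked_savings_pct(0.10, 2, 3.00), 1))  # 87.6
```

Note that jobs taking the full 1-½ hours after a midnight start would run past a 1 am stop, so a wider window may be safer. Under the same assumptions, a three-hour window still saves about 83.4%.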
Public cloud will offer you a lot of freedom and some potentially attractive cost savings as you move batch workloads from on premise. You are no longer constrained by having the same infrastructure serve two vastly different types of workloads — interactive and batch. The savings you can achieve by moving to public cloud can vary, depending on the approach you take and the provider/service you use.
The approach you take depends on the amount of process change you’re willing to absorb in your DevOps processes. If you are willing to throw caution to the wind, the cloud provider PaaS offerings for batch can be quite compelling.
If you wish to take a more cautious approach, we engineered ParkMyCloud to park servers without the need for scripting, or the need for you to be a DevOps expert. This approach allows you to achieve decent savings, with minimal change to your DevOps batch processes and without the need for Reserved Instances.
We’re happy to introduce ParkMyCloud’s new reporting dashboard! There are now easy-to-access reports that provide greater insight into cloud costs, team rosters, and more. Details on this update can be found in our support portal.
Now, when you click Reports in the left navigational panel, instead of getting the option to download a full savings report, you’ll see your ParkMyCloud reporting dashboard. This provides a quick view of cloud provider, team and resource costs, and information regarding your ParkMyCloud savings. At the top of the reporting dashboard, two drop-down menus are provided for selecting the report type and the time period. The default selections are Dashboard and Trailing 30 Days, which is what you see after clicking on Reports in the left navigational menu. Click on a drop-down menu to choose other available options.
Underneath the Report Type drop-down menu, you will see several options that are broken down into additional sections (Financial, Resource, Administrative, etc.). Click on an option in the menu to view that specific report within the dashboard. These reports can also be shown using a variety of time periods. Reports may also be exported as a CSV or Excel file by clicking on the desired option to the right of the Report and Time Period drop-down menus.
Click on Legacy if you would prefer to still use the previous reporting functionality rather than the new reporting dashboard in ParkMyCloud. A pop-up window will appear for selecting the start and end date along with the type of legacy report. As part of this change, we have also moved Audit Logs underneath reporting. To access this option, you will need to select Reports in the left navigational panel and then Audit Log.
Check It Out
If you don’t yet use ParkMyCloud, you can try it now for free. We offer a 14-day free trial of all ParkMyCloud features, after which you can choose to subscribe to a premium plan or continue parking your instances using ParkMyCloud’s free tier forever.
If you already use ParkMyCloud, you’ll instantly see a visual representation of your cloud savings just by logging in to the platform. We challenge you to use this as a scoreboard, and try to drive your monthly savings as high as you can!