Cloud Optimization Tools = Cloud Cost Control (Part II)

Cloud Optimization Tools = Cloud Cost Control (Part II)

A couple of weeks ago in Part 1 of this blog topic we discussed the need for cloud optimization tools to help enterprises with the problem of cloud cost control. Amazon Web Services (AWS) even goes as far as suggesting the following simple steps to control their costs (which can also be applied  to Microsoft Azure and Google Cloud Platform, but of course with slightly different terminology):

    1. Right-size your services to meet capacity needs at the lowest cost;
    2. Save money when you reserve;
    3. Use the spot market;
    4. Monitor and track service usage;
    5. Use Cost Explorer to optimize savings; and
    6. Turn off idle instances (we added this one).

A variety of third-party tools and services have popped up in the market over the past few years to help with cloud cost optimization – why? Because upwards of $23B was spent on public cloud infrastructure in 2016, and spending continues to grow at a rate of 40% per year. Furthermore, depending on who you talk to, roughly 25% of public cloud spend is wasted or not optimized — that’s a huge market! If left unchecked, this waste problem is supposed to triple to over $20B by 2020 – enter the vultures (full disclosure, we are also a vulture, but the nice kind). Most of these tools are lumped under the Cloud Management category, which includes subcategories like Cost Visibility and Governance, Cost Optimization, and Cost Control vendors – we are a cost control vendor to be sure.

Why do you, an enterprise, care? Because there are very unique and subtle differences between the tools that fit into these categories, so your use case should dictate where you go for what – and that’s what I am trying to help you with. So, why am I a credible source to write about this (and not just because ParkMyCloud is the best thing since sliced bread)?

Well, yesterday we had a demo with a FinTech company in California that was interested in Cost Control, or thought they were. It turns out that what they were actually interested in was Cost Visibility and Reporting; the folks we talked to were in Engineering Finance, so their concerns were primarily with billing metrics, business unit chargeback for cloud usage, RI management, and dials and widgets to view all stuff AWS and GCP billing related. Instead of trying to force a square peg into a round hole, we passed them on to a company in this space who’s better suited to solve their immediate needs. In response, the Finance folks are going to put us in touch with the FinTech Cloud Ops folks who care about automating their cloud cost control as part of their DevOps processes.

This type of situation happens more often than not. We have a lot of enterprise customers using ParkMyCloud along with CloudHealth, CloudChekr, Cloudability, and Cloudyn because in general, they provide Cost Visibility and Governance, and we provide actionable, automated Cost Control.

As this is our blog, and my view from the street – we have 200+ customers now using ParkMyCloud, and we demo to 5-10 enterprises per week. Based on a couple of generic customer uses cases where we have strong familiarity, here’s what you need to know to stay ahead of the game:

  • Cost Visibility and Governance: CloudHealth, CloudChekr, Cloudability and Cloudyn (now owned by Microsoft)
  • Reserved Instance (RI) management – all of the above
  • Spot Instance management – SpotInst
  • Monitor and Track Usage: CloudHealth, CloudChekr, Cloudability and Cloudyn
  • Turn off (park) Idle Resources – ParkMyCloud, Skeddly, Gorilla Stack, BotMetric
  • Automate Cost Control as part of your DevOps Process: ParkMyCloud
  • Govern User Access to Cloud Console for Start/Stop: ParkMyCloud
  • Integrate with Single Sign-On (SSO) for Federated User Access: ParkMyCloud

To summarize, cloud cost control is important, and there are many cloud optimization tools available to assist with visibility, governance, management, and control of your single or multi-cloud environments. However, there are very few tools which allow you to set up automated actions leveraging your existing enterprise tools like Ping, Okta, Atlassian, Jenkins, and Slack.  Make sure you are not only focusing on cost visibility and recommendations, but also on action-oriented platforms to really get the best bang for your buck.

Cloud Optimization Tools = Cloud Cost Control

Cloud Optimization Tools = Cloud Cost Control

Over the past couple of years we have had a lot of conversations with large and small enterprises regarding cloud management and cloud optimization tools, all of whom were looking for cost control. They wanted to reduce their bills, just like any utility you might run at home — why spend more than you need to? Amazon Web Services (AWS) actively promotes optimizing cloud infrastructure, and where they lead, others follow. AWS even goes so far as to suggest the following simple steps to control AWS costs:

  1. Right-size your services to meet capacity needs at the lowest cost;
  2. Save money when you reserve;
  3. Use the spot market;
  4. Monitor and track service usage;
  5. Use Cost Explorer to optimize savings; and
  6. Turn off idle instances (we added this one).

Its interesting to note use of the word ‘control’ even though the section is labeled Cost Optimization.

So where is all of this headed? It’s great that AWS offers their own solutions but what if you want automation into your DevOps processes, multi-cloud support (or plan to be multi cloud), real-time reporting on these savings, and to turn stuff off when you are not using it? Well then you likely need to use a third-party tool to help with these tasks.

Let’s take a quick look at a description of each AWS recommendation above, and get a better understanding of each offering. Following this we will then explore if these cost optimization options can be automated as part of a continuous cost control process:

  1. Right-sizing – Both the EC2 Right Sizing solution and AWS Trusted Advisor analyze utilization of EC2 instances running during the prior two weeks. The EC2 Right Sizing solution analyzes all instances with a max CPU utilization less than 50% and determines a more cost-effective instance type for that workload, if available.
  2. Reserved Instances (RI) – For certain services like Amazon EC2 and Amazon RDS, you can invest in reserved capacity. With RI’s, you can save up to 75% over equivalent ‘on-demand’ capacity. RI’s are available in three options – (1) All up-front, (2) Partial up-front or (3) No upfront payments.
  3. Spot – Amazon EC2 Spot instances allow you to bid on spare Amazon EC2 computing capacity. Since Spot instances are often available at a discount compared to On-Demand pricing, you can significantly reduce the cost of running your applications, grow your application’s compute capacity and throughput for the same budget, and enable new types of cloud computing applications.
  4. Monitor and Track Usage – You can use Amazon CloudWatch to collect and track metrics, monitor log files, set alarms, and automatically react to changes in your AWS resources. You can also use Amazon CloudWatch to gain system-wide visibility into resource utilization, application performance, and operational health.
  5. Cost Explorer – AWS Cost Explorer gives you the ability to analyze your costs and usage. Using a set of default reports, you can quickly get started with identifying your underlying cost drivers and usage trends. From there, you can slice and dice your data along numerous dimensions to dive deeper into your costs.
  6. Turn off Idle Instances – To “park” your cloud resources by assigning them schedules of operating hours they will run or be temporarily stopped – i.e. parked. Most non-production resources (dev, test, staging, and QA) can be parked at nights and on weekends, when they are not being used. On the flip side of this, some batch processing or load testing type applications can only run during non-business hours, so they can be shut down during the day.

Many of these AWS solutions offer recommendations, but do require manual efforts to gain the benefits. This is why third party solutions have have seen widespread adoption and include cloud management, cloud governance and visibility, and cloud optimization tools. In part two of this this blog we will have a look at some of those tools, the benefits of each, approach and the level of automation to be gained.

Cloud Webhooks – Notification Options for System Level Alerts to Improve your Cloud Operations

Cloud Webhooks – Notification Options for System Level Alerts to Improve your Cloud Operations

Webhooks are user-defined HTTP POST callbacks. They provide a lightweight mechanism for letting remote applications receive push notifications from a service or application, without requiring polling. In today’s IT infrastructure that includes monitoring tools, cloud providers, DevOps processes, and internally-developed applications, webhooks are a crucial way to communicate between individual systems for a cohesive service delivery. Now, in ParkMyCloud, webhooks are available for even more powerful cost control.

For example, you may want to let a monitoring solution like Datadog or New Relic know that ParkMyCloud is stopping a server for some period of time and therefore suppress alerts to that monitoring system for the period the server will be parked, and vice versa enable the monitoring once the server is unparked (turned on). Another example would be to have ParkMyCloud post to a chatroom or dashboard when schedules have been overridden by users. We do this by enabling systems notifications to our cloud webhooks.

Previously only two options were provided when configuring system level and user notifications in ParkMyCloud: System Errors and Parking Actions. We have added 3 new notification options for both system level and user notifications. Descriptions for all five options are provided below:

  • System Errors – These are errors occurring within the system itself such as discovery errors, parking errors, invalid credential permissions, etc.
  • System Maintenance and Updates – These are the notifications provided via the banner at the top of the dashboard.
  • User Actions – These are actions performed by users in ParkMyCloud such as manual resource state toggles, attachment or detachment of schedules, credential updates, etc.
  • Parking Actions – These are actions specifically related to parking such as automatic starting or stopping of resources based on defined parking schedules.
  • Policy Actions – These are actions specifically related to configured policies in ParkMyCloud such as automatic schedule attachments based on a set rule.

We have made the options more granular to provide you better control on events you want to see or not see.

These options can be seen when adding or modifying a channel for system level notifications (Settings > System Level Notifications). In the image shown below, a channel is being added.

Note: For additional information regarding these options, click on the Info Icon to the right of Notify About.

The new notification options are also viewable by users who want to set up their own notifications (Username > My Profile).  These personal notifications are sent via email to the address associated with your user.  Personal notifications can be set up by any user, while Webhooks must be set up by a ParkMyCloud Admin.

After clicking on Notifications, you will see the above options and may use the checkboxes to select the notifications you want to receive. You can also set each webhook to handle a specific ParkMyCloud team, then set up multiple webhooks to handle different parts of your organization.  This offers maximum flexibility based on each team’s tools, processes, and procedures. Once finished, click on Save Changes. Any of these notifications can be sent then to your cloud webhook and even Slack to ensure ParkMyCloud is integrated into your cloud management operations.

 

Interview: DevOps in AWS – How to Automate Cloud Cost Savings

Interview: DevOps in AWS – How to Automate Cloud Cost Savings

We chatted with Ryan Alexander, DevOps Engineer at Decision Resources Group (DRG) about his company’s use of AWS and how they automate cloud cost savings. Below is a transcript of our conversion.

Hi Ryan, thanks for speaking with us. To start out, can you please describe what your company does?

Decision Resources Group offers market information and data for the medtech industry. For example, let’s say a medical graduate student is doing a thesis on Viagra use in the Boston area. They can use our tool to see information such as age groups, ethnicities, number of hospitals, and number of people who were issued Viagra in the city of Boston.

What does your team do within the company? What is your role?

I’m a DevOps engineer on a team of two. We provide infrastructure automation to the other teams in the organization. We report to senior tech management, which makes us somewhat of an island within the organization.

Can you describe how you are using AWS?

We have an infrastructure team internally. Once a server or infrastructure is built, we take over to build clusters and environments for what’s required. We utilize pretty much every tool AWS offers — EBS, ELB, RDS, Aurora, CloudFormation, etc.

What prompted you to look for a cost control solution?

When I joined DRG in December, there was a new cost saving initiative developing within the organization. It came from our CTO, who knew we could be doing better and wanted to see where we might be leaving money on the table.

How did you hear about ParkMyCloud?

One of my colleagues actually spoke with your CTO, Dale, at AWS re:Invent, and I had also heard about ParkMyCloud at DevOpsDays Toronto 2016. We realized it could help solve some of our cloud cost control problems and decided to take a look.

What challenges were contributing to the high costs? How has ParkMyCloud helped you solve them?

We knew we had a problem where development, staging, and QA environments were only used for 8 hours a day – but they were running for 24 hours a day. We wanted to shut them down and save money on the off hours, which ParkMyCloud helps us do automatically.

We also have “worker” machines that are used a few times a month, but they need to be there. It was tedious to go in and shut them down individually. Now with ParkMyCloud, I put those in a group and shut them down with one click. It is really just that easy to automate cloud cost savings with ParkMyCloud.

We also have security measures in place, where not everyone has the ability to sign in to AWS and shut down instances. If there was a team that needed them started on demand, but they’re in another country and I’m sleeping, they have to wait until I wake up the next morning, or I get up at 2 AM. Now that we set up Single Sign-On, I can set up the guys who use those servers, and give them the rights to startup and shutdown those servers. This has been more efficient for all of us. I no longer have to babysit and turn those on/off as needed, which saves time for all of us.

With ParkMyCloud, we set up teams and users so they can only see their own instances, so they can’t cause a cascading failure because they can only see the servers they need.

Were there any unexpected benefits of ParkMyCloud?

When I started, I deleted 3 servers that were sitting there doing nothing for a year and costing the company lots of money. With ParkMyCloud, that kind of stuff won’t happen, because everything gets sorted into teams. We can see the costs by team and ask the right questions, like, “why is your team’s cost so expensive right now? Why are you ignoring these recommendations from ParkMyCloud to park these instances?”

 

We rely on tagging to do all of this. Tagging is life in DevOps.

How X-Mode Deals with Rising AWS Costs

How X-Mode Deals with Rising AWS Costs

We sat down with Josh Anton, CEO of X-Mode, a technology app that has been experiencing rapid growth and rising AWS costs. We asked him about his company, what cloud services he uses, and how he goes about mitigating those costs.

Can you start by telling us about X-Mode and what you guys do?

X-Mode is a location platform that currently maps out 5-10% of the U.S. population on a monthly basis and 1-2% of the U.S. population daily, which is about 3-6 million active daily users | 15M to 20M users monthly. X-Mode collect location based data from applications and platforms used by these consumers, and then develop consumer segments or attribution where our customers basically use the data to determine if their advertising is effective and to develop target profiles. For example, based on the number and types of coffee shops a person has visited, we can assume they are this type of coffee drinker. Or a company like McDonald’s will determine if their advertising is effective if they see that an ad is run in a certain area, and a person visits that restaurant in the next few days. The data has many applications.

How did you get this idea, Josh?

We started off as an app called Drunk Mode, which was founded and built while I was at the University of Virginia studying Marketing and IT. After about a year and half our app grew to about 1.5 million users by leveraging influencer marketing via Trend Pie and student campus rep program at 50+ universities. In September of 2016, we realized that if we developed a location-based technology platform we could monetize and capitalize on the location data we collected from the Drunk Mode app. Along with input from our advisors, we developed a strategy to help out other small apps by aggregating their data, crunching it, and packaging it up in real-time to sell to ad agencies and retailers, acting almost as a data wholesaler and helping these small app plays monetize their data as a secondary source of income.

Who’s cloud services are you using and how does X-Mode work?

We use Amazon Web Services (AWS) for all of our cloud infrastructure and primarily use their EC2, RDS, and Elastic Beanstalk services. Our technology works by collecting and aggregating location data based on when and where people go on a daily basis. it is collected locally by iOS and Android devices, and passed to AWS’s cloud using their API gateway function. The cool thing is that we are able to pinpoint a person’s location within feet of a retail location. The location data is batched and sent to our servers every 12 hours and we package it up and license the data out to our vendors. We are processing around 10 to 12 billion location based records per month, and have some proprietary algorithms which make our processing very fast and we have almost no burn on the phone’s battery. Our customers are sent the data daily and we use services like Lambda, RDS and Elastic Beanstalk to make this as efficient as possible. We are now developing the functionality to better triangulate beacons so that we can pinpoint locations even better, and send location data within the hour, rather than within the day.

Why did you pick AWS?

We chose AWS because when X-Mode joined Fishbowl Labs (a startup accelerator run and sponsored by AOL in Northern Virginia), we were given $15,000 in free AWS credits. The free credits have made me very loyal to Amazon’s service and now the switching costs would be fairly high in terms of effort and dollars to move away from Amazon. So even though it’s expensive, we are here to stay and adopting more of AWS’s advanced services in order to improve our platform performance and take advantage of their technology advances. Another reason we stay with AWS is that we know it is going to be there, we previously used another service called Parse.com that was acquired by Facebook and a few years later, they shut down the service, for us performance and stability (the server service existing 10 years from now) are very important to us.

Are you still using AWS credits?

No, we used those up many months ago. We have gone from spending a few hundred dollars a month to spending $25,000 or more a month. While that is a cost, it’s also a blessing in that X-Mode is rapidly growing and scaling. Outside of the cost of people, this is our biggest monthly expense. ParkMyCloud was an easy choice, given 75% or more of our AWS spend is on EC2 and RDS services, and ParkMyCloud’s ability to ‘park’ each service and their flexible governance model for our remote engineering team. So we are very excited about the savings ParkMyCloud will produce for us, along with some of the new design work we will be doing to make our platform even more efficient.

Are there other ways you are looking to optimize your AWS spend?  

We believe that we have to re-architect the system. We have actually done that three times given our rapid platform growth, but it is all about making sure that we are optimizing our import/export process. We are running our servers at maximum capacity to help get information to people, and are continually looking to make our operation more efficient. Along with using ParkMyCloud, we are focusing on general platform optimization to make sure we keep costs down, improve performance and innovate at a rapid pace.

What other tools do you use as part of your DevOps process?

Let’s keep in mind we are startup, but we are getting more and more organized in terms of development cycles and have a solid delivery process. And yes, we use tools like Slack, Jira, BaseCamp, Bitbucket, and Google Drive. Everything is SaaS-based and everything is in the cloud, and we follow an agile development process. On the Sales and Marketing side we are solely a millennial workforce and work in office but our development team is basically stay at home dads distributed around the country, so planning and communication are keys to our success. That’s where Slack and Jira come into play. In terms of processes, we are trying to implement a better QA process so we deliver very vetted code to our end users. We do a lot of development planning and mapping each quarter, so all of this is incredibly important to the growth of the X-Mode platform and to the success of our organization.