When looking to keep Google Cloud Platform (GCP) costs under control, the first place users turn is the discount options offered by the cloud service provider itself, such as Google’s Sustained Use discounts. The question is: do Google Sustained Use discounts actually save you money, when you could just turn the instance off?
How Google Sustained Use discounts work
The idea of the Sustained Use discount is that the longer you run a VM instance in any given month, the bigger discount you will get from the list price. The following shows the incremental discount, and its cumulative impact on a hypothetical $100/month VM instance, where the percentages are against the baseline 730-hour month.
I have to say here that the GCP prices listed can be somewhat misleading unless you read the fine print where it says “Note: Listed monthly pricing includes applicable, automatic sustained use discounts, assuming the instance runs for a 730 hour month.” What this means to us is that the list prices of the instances are actually much higher, but their progressive discount means that no one ever actually pays list price. That said – the list price is what you need to know in order to estimate the actual cost you will pay if you do not plan to leave the instance up for 730 hours/month.
For example, the monthly price shown on the GCP pricing link for an n1-standard-8 instance in the Iowa region is (as of this writing) $194.1800. The list price for this instance would be $194.1800/0.7 = $277.40. This is the figure that must be used as the entry point for the table above to calculate the actual cost, given a certain level of utilization.
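To make the arithmetic concrete, here is a minimal Python sketch of the tiered calculation. It assumes the N1 tier schedule in which each successive quarter of the month is billed at 100%, 80%, 60%, and 40% of the list rate (consistent with the 30% full-month discount described above); the function and constant names are mine, not Google's.

```python
# Sustained use tiers as (upper bound as a fraction of the month, multiplier on list rate).
# Assumption: each quarter of a 730-hour month is billed at 100% / 80% / 60% / 40% of list.
TIERS = [(0.25, 1.0), (0.50, 0.8), (0.75, 0.6), (1.00, 0.4)]
HOURS_PER_MONTH = 730


def monthly_cost(hours_used, list_monthly):
    """Cost of running `hours_used` hours in a month, given the monthly list price."""
    hourly = list_monthly / HOURS_PER_MONTH
    cost = 0.0
    lower = 0.0
    for upper, multiplier in TIERS:
        tier_start = lower * HOURS_PER_MONTH
        tier_end = upper * HOURS_PER_MONTH
        # Hours that fall inside this tier (zero if usage never reaches it).
        hours_in_tier = max(0.0, min(hours_used, tier_end) - tier_start)
        cost += hours_in_tier * hourly * multiplier
        lower = upper
    return cost


def list_price_from_discounted(discounted_monthly):
    """Recover the monthly list price from the advertised full-month price."""
    # A full month costs 70% of list under the tiers above.
    full_month_factor = monthly_cost(HOURS_PER_MONTH, 1.0)
    return discounted_monthly / full_month_factor


list_price = list_price_from_discounted(194.18)
print(round(list_price, 2))                          # list price for the n1-standard-8
print(round(monthly_cost(730, list_price), 2))       # full month, fully discounted
print(round(monthly_cost(182.5, list_price), 2))     # first tier only, no discount
```

Passing in a partial month shows how little of the discount applies at low utilization: usage that stays under 182.5 hours is billed entirely at the list rate.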
What if you parked the VM instance instead?
Here at ParkMyCloud, we’re all about scheduling resources to turn off when you’re not using them, i.e., “parking” them. With this mindset, I wondered about the impact of the sustained use discounts on the schedule-based savings. The following chart plots the cost of that n1-standard-8 VM instance, showing Google sustained use discounts combined with a parking schedule.
In this graph, the blue and orange lines show the percent savings from the sustained use discount and scheduling, respectively, based on the number of usage hours. The grey line shows the blended savings, and the yellow line shows the blended net cost. I am sure the sustained usage discount is described someplace with the typical hype of “the more you spend, the more you save!” But, the reality of the matter is the more you spend…the more you spend!
Looking at what this means for ParkMyCloud users, here is the monthly uptime for a few common parking schedules, and the associated cost:
|Assigned schedule||Uptime per 730-hour month||Actual monthly cost for notional n1-standard-8 instance||Savings compared to sustained-use cost of $194.18|
|7am – 7pm weekdays|
|8am – 6pm weekdays|
|8am – 5pm weekdays|
|9am – 5pm weekdays|
Short version: while the 30% sustained use discount does seem like a great deal, any scheduled off time saves you money. Even with the most wide-open “work day” schedule of running 12 hours per weekday, the monthly cost is $93.13, a 52% savings compared to the full sustained-use cost of $194.18. This includes the 20% sustained use discount for the usage over 182.5 hours. A welcome discount to be sure, but not a huge impact on the bottom line.
Another way our users keep these utilization hours low is by keeping their VM instances “always parked” and temporarily overriding the schedule for a set number of hours (such as for an 8-hour workday) when their non-production resources are needed. When the duration of the override expires, the instance is automatically shut down. This gives the best possible savings, and the instance usually does not even reach the first GCP discount tier.
Do Google Sustained Use discounts save you money?
Well, it depends on how you look at it. If you are looking at the cost one hour at a time, and you can see the discounts kick in, then it will probably feel like you are saving money. But if you are looking at the price for a whole month (the way it shows up on your bill), then there are no net savings off the publicly listed (and already discounted) price.
To get the optimal savings on your resources, keep them running only when you’re actually using them, and park them when you’re not. If you meet the usage threshold for any of the Sustained Use Discounts then they will further lower your cost per hour. These two savings options combined will optimize your costs and provide the maximum savings.
Like other cloud providers, the Google Cloud Platform (GCP) charges for compute virtual machine instances by the amount of time they are running — which may lead you to search for a Google Cloud instance scheduling solution. If your GCP instances are only busy during or after normal business hours, or only at certain times of the week or month, you can save money by shutting these instances down when they are not being used. So can you set up this scheduling through the Google Cloud console? And if not – what’s the best way to do it?
Why bother scheduling a Google VM to turn off?
As mentioned, depending on your purchasing option, Google Cloud pricing is based on the amount of time an instance is running, charged at a per-second rate. We find that at least 40% of an organization’s cloud resources (and often much more) are for non-production purposes such as development, testing, staging, and QA. These resources are only needed when employees are actively using them for those purposes, so every second that they are left running when not being used is wasted spend. Non-production VM instances often have predictable workloads, such as a 7 AM to 7 PM workday, 5 days a week, which means that if they are left running around the clock, the other 64% of the spend is completely wasted. Inconceivable!
The good news is, that means these resources can be scheduled to turn off during nights and weekends to save money. So, let’s take a look at a couple of cloud scheduling options.
Scheduling Option 1: GCP set-scheduling Command
If you were to do a Google search on “google cloud instance scheduling,” hoping to find out how to shut your compute instances down when they are not in use, you would see numerous promising links. The first couple of references appear to discuss how to set instance availability policies and mention a gcloud command line interface for “compute instances set-scheduling”. However, a little digging shows that these interfaces and commands simply describe how to fine-tune what happens when the underlying hardware for your Google virtual machine goes down for maintenance. The options in this case are to migrate the VM to another host (which appears to be a live migration) or to terminate the VM, plus a setting for whether the instance should be restarted if it is terminated. The documentation for the command goes so far as to say that the command is intended to let you set “scheduling options.” While it is great to have control over these behaviors, I feel I have to paraphrase Inigo Montoya – You keep using that word “scheduling” – I do not think it means what you think it means…
Scheduling Option 2: GCP Compute Task Scheduling
The next thing that looks schedule-like is the GCP Cron Service. This is a highly reliable networked version of the Unix cron service, letting you leverage the GCP App Engine services to do all sorts of interesting things. One article describes how to use the Cron Service and Google App Engine to schedule tasks to execute on your Compute Instances. With some App Engine code, you could use this system to start and stop instances as part of regularly recurring task sequences. This could be an excellent technique for controlling instances for scheduled builds, or calculations that happen at the same time of a day/week/month/etc.
While very useful for certain tasks, this technique really lacks flexibility. Google Cloud Cron Service schedules are configured by creating a cron.yaml file inside the App Engine application. The GCP Cron Service triggers events in the application, and getting the application to do things like start/stop instances is left as an exercise for the developer. If you need to modify the schedule, you need to go back in and modify the cron.yaml. Also, it can be non-intuitive to build a schedule around your working hours, in that you would need one event for when you want to start an instance, and another for when you want to stop it. If you want to set multiple instances to be on different schedules, they would each need to have their own events. This brings us to the final issue, which is that any given application is limited to 20 events for free, up to a maximum of 250 events for a paid application. Those sound like some eel-infested waters.
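To see how quickly those limits bite, here is a small Python sketch that counts the events a set of per-instance schedules would need, assuming one start event and one stop event per instance as described above. The instance names and times are made up for illustration, and this only models the counting, not the Cron Service itself.

```python
FREE_EVENT_LIMIT = 20    # per-application limit for free apps, as quoted above
PAID_EVENT_LIMIT = 250   # per-application limit for paid apps, as quoted above


def build_events(schedules):
    """Turn {instance: (start_time, stop_time)} into a flat list of cron-style events."""
    events = []
    for instance, (start, stop) in sorted(schedules.items()):
        events.append((start, "start", instance))
        events.append((stop, "stop", instance))
    return events


def fits_limit(schedules, limit):
    """True if every instance can have its own start/stop pair within the limit."""
    return len(build_events(schedules)) <= limit


# Hypothetical example: 11 dev instances, each on its own schedule.
schedules = {f"dev-{n}": ("07:00", "19:00") for n in range(11)}
print(len(build_events(schedules)))             # events needed
print(fits_limit(schedules, FREE_EVENT_LIMIT))  # does it fit in the free tier?
print(fits_limit(schedules, PAID_EVENT_LIMIT))  # does it fit in a paid app?
```

With two events per instance, a free application runs out of room after ten independently scheduled instances, and a paid one after 125.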
Scheduling Option 3: ParkMyCloud Google Cloud Instance Scheduling
Google Cloud Platform and ParkMyCloud – mawwage – that dweam within a dweam….
Given the lack of other viable instance scheduling options, we at ParkMyCloud created a SaaS app to automate instance scheduling, helping organizations cut cloud costs by 65% or more on their monthly cloud bill with AWS, Azure, and, of course, Google Cloud.
We aim to provide a number of benefits that you won’t find with, say, the GCP Cron Service. ParkMyCloud’s cloud management software:
- Automates the process of switching non-production instances on and off with a simple, easy-to-use platform – more reliable than the manual process of switching GCP Compute instances off via the GCP console.
- Provides a single-pane-of-glass view, allowing you to consolidate multiple clouds, multiple accounts within each cloud, and multiple regions within each account, all in one easy-to-use interface.
- Does not require a developer background, coding, or custom scripting. It is also more flexible and cost-effective than having developers write scheduling scripts.
- Can be used with a mobile phone or tablet.
- Avoids the hard-coded schedules of the Cron Service. Users can temporarily override schedules if they need to use an instance on short notice.
- Supports Teams and User Roles (with optional SSO), ensuring users will only have access to the resources you grant.
- Helps you identify idle instances by monitoring instance performance metrics, displaying utilization heatmaps, and automatically generating utilization-based “SmartParking” schedule recommendations, which you can accept or modify as you wish.
- Provides “rightsizing” recommendations to identify resources that are routinely underutilized and can be converted to a different Google Cloud server size to save 50-75% of the cost of the resource.
- Has a 14-day free trial, so you can try the platform out in your own environment. There’s also a free-forever tier, useful for startups and those on the Google Cloud free tier, as well as paid tiers with more advanced options for enterprises with a larger Google Cloud footprint.
How Much Can You Save with Scheduling?
While it depends on your exact schedule, many non-production Google Cloud VMs – those used for development, testing, staging, and QA – can be turned off for 12 hours/day on weekdays, and 24 hours/day on weekends. For example, the resource might be running from 7 AM to 7 PM Monday through Friday, and “parked” the rest of the week. This comes out to about 64% savings per resource.
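The 64% figure is simple week-based arithmetic. As a quick sketch in Python (assuming a flat rate per running hour and ignoring any tiered discounts; the function name is mine):

```python
HOURS_PER_WEEK = 7 * 24  # 168


def parked_fraction(hours_per_day, days_per_week):
    """Fraction of the week an instance is parked, given its running schedule."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK


# 7 AM to 7 PM, Monday through Friday: 60 running hours out of 168.
print(f"{parked_fraction(12, 5):.0%}")
# A tighter 8-hour weekday schedule parks the instance even more of the week.
print(f"{parked_fraction(8, 5):.0%}")
```

The same function makes it easy to compare candidate schedules before applying one.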
Currently, the average savings per scheduled VM in the ParkMyCloud platform is about $200/month.
How Enterprises Are Benefitting from ParkMyCloud’s Scheduling Software
If you’re not quite ready to start your own trial, take a look at this use case from Workfront, a work management software provider. Workfront uses both AWS and Google Cloud Compute Engine, and needed to coordinate cloud management software across both public clouds. They required automation in order to optimize and control cloud resource costs, especially given users’ tendency to leave resources running when they weren’t being used.
Workfront found that ParkMyCloud would meet their automatic scheduling needs. Now, 200 users throughout the company use ParkMyCloud to:
- Get recommendations of resources that are not being used 24×7, and use policies to automatically apply on/off schedules to them
- Get notifications and control the state of their resources through Slack
- Easily report savings to management
- Save over $200,000 per year
Ways to Save on Google Cloud VMs, Beyond Scheduling
Google has done a great job of creating offerings for customers to save money through regular cloud usage. The two you’ll see mentioned the most are sustained use discounts and committed use discounts. Sustained use discounts give Google Cloud users automatic discounts the longer an instance is run. This post outlines the break-even points between letting an instance run for the discount vs. parking it. Committed use discounts, on the other hand, require an upfront commitment for 1 or 3 years’ usage. We have found that they’re best suited to predictable workloads such as production environments. There are also the preemptible VMs, which are offered at a discount from on-demand VMs in exchange for being short-lived – up to 24 hours.
How to Create a Google Cloud Schedule with ParkMyCloud
Getting started with ParkMyCloud is easy. Simply register for a free trial with your email address and connect to your Google Cloud Platform to allow ParkMyCloud to discover and manage your resources. A 14-day free trial gives your organization the opportunity to evaluate the benefits of ParkMyCloud while you only pay for the cloud computing power you use. At the end of the trial, there is no obligation on you to continue with our service, and all the money your organization has saved is, of course, yours to keep.
Have fun storming the castle!
In this blog we are going to examine how Microsoft Azure region pricing varies and how region selection can help you reduce cloud spending.
How Organizations Select Public Cloud Regions
There are many comparisons that go into pricing differences between AWS vs Azure vs GCP, etc. At the end of the day, however, most organizations select one primary cloud service provider (CSP) for most of their workloads, plus maybe another for multi-cloud redundancy of critical services. Once selected, organizations then typically put many of their workloads in the region closest to their offices, plus maybe some geographic redundancy in their production systems. In other situations, a certain region is selected because that is the first region to support some new CSP feature. As time goes by, other regions become options because either those new features are propagated through the system, or whole new regions are created.
CSP regions tend to cluster around certain larger geographic regions, which I will call “areas” for the purpose of this blog. Looking at Azure in particular, we can see that Azure has three major US areas (Western, Central, and Eastern). The Western and Eastern US areas each have two Azure regions, and the Central area has four Azure regions. The UK, Europe and Australia areas each have two Azure regions. There are a number of other Azure regions as well, but they are far enough dispersed that I would consider them to be areas with a single region.
How Does Azure Region Pricing Vary?
With this regional distribution as a starting point, let’s look next at costs for instances. Here is a somewhat random selection of Azure region pricing data, looking at a variety of instance types (cost data as of approximately March 1, 2018).
While this graphic is a bit busy, there are a couple things that jump out at us:
- Within most of the areas, there are clearly more expensive regions and less expensive regions.
- The least expensive regions, on average across these instance types, are us-west-2, us-west-central, and korea-south.
- The most expensive regions are asia-pacific-east, japan-east, and australia-east.
- Windows instances are about 1.5-3 times more expensive than their Linux-based counterparts.
Let’s zoom in on the Azure Standard_DS2_v2 instance type, which comprises almost 60% of the total population of Azure instances customers are managing in the ParkMyCloud platform.
We can clearly see the relative volatility in the cost of this instance type across regions. And, while the Windows instance is about 1.5-2 times the cost of the Linux instance, the volatility is fairly closely mirrored across the regions.
Of more interest, however, is how the costs can differ within a given area. From that comparison we can see that there are some real savings to be gained by careful region selection within an area:
Over the course of a year, strategic region selection of a Windows DS2 instance could save up to $578 for the asia-pacific regions, $298 for the us-east regions, and $228 for the Korean regions.
How to Save Using Regions
By comparing regions within your desired “area” as illustrated above, the savings over a quantity of instances can be significant. Good region selection is fundamental to controlling Azure costs, and for costs across the other clouds as well.
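The comparison itself is easy to script. Here is a minimal Python sketch that picks the cheapest region in an area and annualizes the difference for an always-on instance. The region names and hourly prices are hypothetical placeholders, not actual Azure prices; substitute current figures from the Azure pricing pages.

```python
HOURS_PER_YEAR = 24 * 365  # 8,760 hours for an always-on instance


def annual_savings(hourly_prices, instance_count=1):
    """Given {region: hourly price} for one area, return
    (cheapest region, most expensive region, yearly savings from choosing the cheapest)."""
    cheapest = min(hourly_prices, key=hourly_prices.get)
    priciest = max(hourly_prices, key=hourly_prices.get)
    delta = hourly_prices[priciest] - hourly_prices[cheapest]
    return cheapest, priciest, delta * HOURS_PER_YEAR * instance_count


# Hypothetical prices for two regions in the same area (illustration only).
area_prices = {"region-a": 0.20, "region-b": 0.25}
cheapest, priciest, savings = annual_savings(area_prices, instance_count=10)
print(cheapest, priciest, round(savings, 2))
```

Even a few cents per hour adds up once it is multiplied across a year and a fleet of instances.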
If you are using AWS EC2 in production, chances are good that you’re using the AWS M instance type. The M family is a “General Purpose” instance type in AWS, most closely matching a typical off-the-shelf server one would buy from Dell, HP, etc., and was the first instance family released by AWS in 2006.
If you are looking for mnemonics for an AWS certification exam, you may want to think of the M instance type as the Main choice, or the happy Medium between the more specialized instances. The M instance provides a good balance of CPU, RAM, and disk size/performance. The other instance types specialize in different ways, providing above average CPU, RAM, or disk size/performance, and include a price premium. The one exception is the “T” instance type, discussed further below.
For a normal web or application server workload, the M instance type is probably the best tool for the job. Unless you KNOW you are going to be running a highly RAM/CPU/IO-intensive workload, you can usually start with an M instance, monitor its performance for a while, and then if the instance is performance-limited by one of the hardware characteristics, switch over to a more specialized instance to remove the constraint. For example:
- “C” instances for Compute/CPU performance.
- “R” or “X” instances for lots of memory – RAM or eXtreme RAM
- “D”, “H”, or “I” instances optimize for storage with different types/quantities of local storage drives (i.e., HDD or SSD that are part of the physical hardware the instance is running on) for high-Density storage (up to 48TB), High sequential throughput, or fast random I/O, respectively. (The latter two categories are much more specialized – see here for more details.)
The “T” instance family is much like the “M” family, in that it is aimed at general purpose workloads, but at a lower price point. The key difference (and perhaps the only difference) is that the CPU performance is restricted to bursts of high performance (or “bursTs”) that are tracked by AWS through a system of CPU credits. Credits build up when the system is idle, and are consumed when the CPU load exceeds a certain baseline. When the CPU credit balance is used up, the CPU is Throttled to a fraction of its full speed. T instances are good for low-load web servers and non-production systems, such as those used by developers or testers, where continuous predictable high performance is not needed.
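The credit mechanism is easier to see in a toy simulation. The Python sketch below is a simplified, hour-by-hour model for a single vCPU, assuming one credit equals one vCPU-minute at 100% utilization. The earn rate of 6 credits per hour is an illustrative parameter, and the real accounting is finer-grained and varies by instance size.

```python
def simulate_cpu_credits(hourly_load, credits_per_hour, start_balance=0.0,
                         max_balance=float("inf")):
    """Simulate a credit balance hour by hour.

    hourly_load: requested CPU utilization per hour, each between 0.0 and 1.0.
    Returns a list of (delivered_utilization, balance_after_hour) tuples.
    """
    balance = start_balance
    history = []
    for load in hourly_load:
        available = balance + credits_per_hour  # credits usable during this hour
        needed = load * 60.0                    # 60 credits = one hour at 100%
        if needed <= available:
            delivered, spent = load, needed
        else:
            # Not enough credits: the CPU is throttled to what the credits allow.
            delivered, spent = available / 60.0, available
        balance = min(available - spent, max_balance)
        history.append((delivered, balance))
    return history


# Three idle hours build up credits, then two hours of sustained full load.
for delivered, balance in simulate_cpu_credits([0, 0, 0, 1.0, 1.0], credits_per_hour=6):
    print(f"delivered {delivered:.0%} of a vCPU, balance {balance:.0f} credits")
```

In this model the first busy hour burns through the saved-up credits, and the second is held to the baseline that the hourly earn rate can sustain.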
Looking at some statistics, the Botmetric Public Cloud Usage Report for 2017 states that 46% of AWS EC2 usage is on the M family, and 83% of non-production workloads are on T instances. Within the ParkMyCloud environment, we see the following top instance family statistics across our customers’ environments:
- I instances: 39%
- M instances: 22%
- T instances: 27%
Since many of our customers are focused on cost optimization for non-production cloud resources (i.e., a lot of developers and test environments), we are probably seeing more “T” instances than “M” instances as they are less expensive, and the “bursty” nature of T instances is not a factor in their work. For a production workload, M instances with dedicated CPU resources are more predictable. While we cannot say for sure why we are also seeing a very large number of “I” instances, it is quite possible that developers/testers are running database software in an EC2 instance, rather than in RDS, in order to have more direct control and visibility into the database system. Still, 49% of the resources are in the General Purpose M and T families.
The Nitty and/or Gritty
Assuming you have decided that an M instance is the right tool for your job, your next choice will be to decide which one. As of the date of this blog, there are twelve different instance types within the M family, covering two generations of systems.
Table 1 – The M Instance Family Specs (Pricing per hour for on-demand instances in US-East-1 Region)
The M4 generation was released in June 2015. The M4 runs 64-bit operating systems on hardware with the 2.3 GHz Intel Xeon E5-2686 v4 (Broadwell) or 2.4 GHz Intel Xeon E5-2676 v3 (Haswell) processors, potentially jumping to 3GHz with Turbo Boost. None of the M4 instance types support instance store disks, but they are all EBS-optimized by default. These instances also support Enhanced Networking, a no-extra-cost option that allows up to 10 Gbps of network bandwidth.
The M5 generation was just released this past November at re:Invent 2017. The M5 generation is based on custom Intel Xeon Platinum 8175M processors running at 2.5GHz. When communicating with other systems in a Cluster Placement Group (a grouping of instances in a single Availability Zone), the m5.24xlarge instance can support an amazing 25 Gbps of network bandwidth. The M5 type also supports EBS via an NVMe driver, a block storage interface designed for flash memory. Interestingly, AWS has not jacked up the EBS performance guarantee for this faster EBS interface. This may be because it is the customer’s responsibility to install the right driver to get the higher performance on older OS images, so this could also be a cheap/free performance win if you can migrate to M5.
Amazon states that the M5 generation delivers 14% better price/performance on a per-core basis than the M4 generation. In the pricing above, one can do the math and find that all of the M5 instances cost $0.048 per vCPU per hour, and that the M4 instances all cost $0.05 per vCPU per hour. So right out of the box, the M5 is priced 4% cheaper than an equivalently configured M4. Do the same math for RAM vs vCPU and you can see that AWS allocates 4GB of RAM per vCPU in both the M4 and M5 generations. This probably says a lot about how the underlying hardware is sliced/diced for virtual machines in the AWS data centers.
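Here is that math as a short Python sketch. The hourly prices for the 2-vCPU "large" sizes are derived from the per-vCPU figures quoted above (2 x $0.05 and 2 x $0.048), and the 8GB RAM figure is the one listed for both sizes; treat them as a snapshot, not current pricing.

```python
# (hourly on-demand price in USD, vCPUs, RAM in GB), derived from the figures above.
INSTANCES = {
    "m4.large": (0.100, 2, 8),
    "m5.large": (0.096, 2, 8),
}


def price_per_vcpu(name):
    price, vcpus, _ram = INSTANCES[name]
    return price / vcpus


def ram_per_vcpu(name):
    _price, vcpus, ram = INSTANCES[name]
    return ram / vcpus


def percent_cheaper(new_price, old_price):
    """How much cheaper new_price is than old_price, in percent."""
    return (old_price - new_price) / old_price * 100


m4 = price_per_vcpu("m4.large")
m5 = price_per_vcpu("m5.large")
print(round(m4, 3), round(m5, 3))                # price per vCPU per hour
print(round(percent_cheaper(m5, m4), 1))         # M5 discount relative to M4
print(ram_per_vcpu("m4.large"), ram_per_vcpu("m5.large"))  # GB of RAM per vCPU
```

Because both generations scale price linearly with vCPU count, the same ratio holds for the larger sizes in each family.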
For more thoughts on historic M instance pricing, please see our other blog about the dropping cost of cloud services.
Some key takeaways:
- If you are not sure how your application is going to behave under a production load, start with an M instance and migrate to something more specialized if needed.
- If you do not need consistent and continuous high CPU performance, like for dev/test or low usage systems, consider using the similarly General Purpose T instance family.
- If you are launching a new instance, use the M5 generation for the better value.
Overall, the M family gives the best price/performance for General Purpose production systems, Making it your Main choice for Middlin’ performance of Most workloads!
You might read the headline statement that the cost of cloud computing is dropping and say “Well, duh!” Or maybe you’re on the other side of the fence. A coworker recently referred me to a very interesting blog on the Kapwing site that states Cloud costs aren’t actually dropping dramatically. The author defines “dramatically” based on the targets set by Moore’s Law or the more recently proposed Bezos’ Law, which states that “a unit of [cloud] computing power price is reduced by 50 percent approximately every three years.” The blog focused on the cost of the Google Cloud Platform (GCP) n1-standard-8 machine type, and illustrated historical data for the Iowa region:
|Date||N1-standard-8 Cost per Hour|
The Kapwing blog also illustrates that the GCP storage and network egress costs have not changed at all in three years. These figures certainly add up to a conclusion that Bezos’ Law is not working…at least not for GCP.
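For reference, Bezos’ Law as quoted above is just exponential decay with a three-year halving period. A minimal Python sketch of the target price curve follows; the function names are mine and the prices in the example are made-up placeholders, not GCP data.

```python
def bezos_law_price(initial_price, years, halving_period_years=3.0):
    """Price predicted by a 50% reduction every `halving_period_years`."""
    return initial_price * 0.5 ** (years / halving_period_years)


def equivalent_annual_reduction(halving_period_years=3.0):
    """The constant yearly price cut that compounds to 50% over the halving period."""
    return 1 - 0.5 ** (1 / halving_period_years)


def keeps_pace(initial_price, current_price, years):
    """True if the observed price has dropped at least as fast as the law predicts."""
    return current_price <= bezos_law_price(initial_price, years)


print(f"{equivalent_annual_reduction():.1%} per year")  # yearly cut implied by the law
print(bezos_law_price(1.00, 3))   # a $1.00/hour price after three years
print(keeps_pace(1.00, 0.90, 3))  # hypothetical: a 10% cut in three years
```

A price that stays flat, or drops by only a few percent over three years, falls well short of the roughly one-fifth annual reduction the law implies.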
Whose law is it anyway?
If we turn this around and try to apply Bezos’ Law to, well, Bezos’ Cloud we see a somewhat different story.
The approach to measuring AWS pricing changes needs to be a bit more systematic than for GCP, as the AWS instance types have been evolving quite a bit over their history. This evolution is shown by the digit that follows the first character in the instance type, indicating the version or generation number of the given instance type. For example, m1.large vs. m5.large. These are similar virtual machines in terms of specifications, with 2 vCPUs and about 8GB RAM, but the m1.large was released in October 2007, and the m5.large in November 2017. While the “1” in the GCP n1-standard-8 could also be a version number, it is still the only version I can see back to at least 2013. For AWS, changes in these generation numbers happen more frequently and likely reflect the new generations of underlying hardware on which the instance can be run.
Show me the data!
In any event, when we make use of the Internet Archive to look at pricing changes of the specific instance type as well as the instance type “family” as it evolves, we see the following (all prices are USD cost per hour for Linux on-demand from the us-east-1 region in the earliest available archived month of data for the quoted year):
*Latest Internet Archive data from Dec 2017 but confirmed to match current Jan 2018 AWS pricing.
FWIW: The second generation m2.large instance type was skipped, though in October 2012 AWS released the “Second Generation Standard” instances for Extra Large and Double Extra Large – along with about an 18% price reduction for the first generation.
To confirm that we can safely compare these prices, we need to look at how the mX.large family has evolved over the years:
|m1.large (originally defined as the “Standard Large” type)||2vCPU w/ECU of 4, 7.5GB RAM|
|m3.large||2vCPU w/ECU of 6.5, 7.5GB RAM|
|m4.large||2vCPU w/ECU of 6.5, 8GB RAM|
|m5.large||2vCPU w/ECU of 10, 8GB RAM|
A couple of notes on this:
- ECU is “Elastic Compute Unit” – a standardized measure AWS uses to support comparison between CPUs on different instance types. At one point, 1 ECU was defined as the compute-power of a 1GHz CPU circa 2007.
- I realize that the AWS mX.large family is not equivalent to the GCP n1-standard-8 machine type mentioned earlier, but I was looking for an AWS machine type family with a long history and fairly consistent configuration (and this is not intended to be a GCP vs AWS cost comparison).
The drop in the cost of cloud computing looks kinda dramatic to me…
The net average of the 3-year reduction figures is -58%, so Bezos’ Law is looking pretty good. (And there is probably an interesting grad-student dissertation somewhere about how serverless technologies fit into Bezos’ Law…) When you factor the m1.large ECU of 4 versus the m5.large ECU of 10 into the picture, more than doubling the net computing power, one could easily argue that Bezos’ Law significantly understates the situation. Overall, there is a trend here of not just significantly declining prices, but also greatly increased capability (higher ECU and more RAM), certainly reflecting an increased value to the customer.
So, why has the pricing of the older m1 and m3 generations gone flat but is still so much more expensive? On the one hand, one could imagine that the older generations of underlying hardware consume more rack space and power, and thus cost Amazon more to operate. On the other hand, they have LONG since amortized this hardware cost, so maybe they could drop the prices. The reality is probably somewhere in between, where they are trying to motivate customers to migrate to newer hardware, allowing them to eventually retire the old hardware and reuse the rack space.
There is definite motivation here to do a lateral inter-generation “rightsizing” move. We most commonly think of rightsizing as moving an over-powered/under-utilized virtual machine from one instance size to another, like m5.xlarge to m5.large, but intergenerational rightsizing can add up to some serious savings very quickly. For example, an older m3.large instance could be moved to an m5.large instance in about 1 minute or less (I just did it in 55 seconds: Stop instance, Change Instance Type, Start Instance), immediately saving 39%. This can frequently be done without any impact to the underlying OS. I essentially just pulled out my old CPU and RAM chips and dropped in new ones. Note that it is not necessarily this easy for all instance types – some older AMIs can break the transition to a newer instance type because of network or other drivers, but it is worth a shot, and the AWS Console should let you know if the transition is not supported (of course: as always, make a snapshot first!).
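The same stop / change type / start sequence can be scripted. Below is a minimal Python sketch that expects a boto3 EC2 client to be passed in; it assumes an EBS-backed instance, does not take the snapshot for you, and the instance ID in the usage comment is a placeholder.

```python
def change_instance_type(ec2, instance_id, new_type):
    """Stop an EBS-backed instance, change its type, and start it again.

    `ec2` is expected to behave like a boto3 EC2 client, e.g. boto3.client("ec2").
    """
    ec2.stop_instances(InstanceIds=[instance_id])
    # The instance type can only be changed while the instance is stopped.
    ec2.get_waiter("instance_stopped").wait(InstanceIds=[instance_id])

    ec2.modify_instance_attribute(
        InstanceId=instance_id,
        InstanceType={"Value": new_type},
    )

    ec2.start_instances(InstanceIds=[instance_id])
    ec2.get_waiter("instance_running").wait(InstanceIds=[instance_id])


# Example usage (requires boto3 and AWS credentials; the ID is a placeholder):
#   import boto3
#   change_instance_type(boto3.client("ec2"), "i-0123456789abcdef0", "m5.large")
```

Taking the client as a parameter keeps the function easy to dry-run against a stub before pointing it at a real account. If the new type is not compatible with the AMI, the modify or start call is where you would expect to see the error.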
For the full view of cloud compute cost trends, we need to look at both the cost of specific instance types, and the continually evolving generations of that instance type. When we do this, we can see that the cost of cloud computing is, in fact, dropping dramatically…at least on AWS.