We get requests from customers occasionally about whether ParkMyCloud can manage Microsoft Azure Classic vs. ARM VMs. Short answer: no. Since Azure has already announced the deprecation of Azure classic resources – albeit not until March 2023 – you’ll find similar answers from other third-party services. Microsoft advises only to use resource manager VMs. And in fact, unless you already had classic VMs as of February 2020, you are not able to create new classic VMs.
As of February, though, 10% of IaaS VMs still used the classic deployment model – so there are a lot of users with workloads that need to be migrated in order to use third-party tools, new services, and avoid 2023 deprecation.
Azure Classic vs. ARM VM Comparison
Azure Classic and Azure Resource Manager (ARM) are two different deployment models for Azure VMs. In the classic model, resources exist independently, without groups for applications. In the classic deployment model, resource states, policies, and tags are all managed individually. If you need to delete resources, you do so individually. This quickly becomes a management challenge, with individual VMs liable to be left running, or untagged, or with the wrong access permissions.
Azure Resource Manager, on the other hand, provides a deployment model that allows you to manage resources in groups, which are typically divided by application with sub-groups for production and non-production, although you can use whatever groupings make sense for your workloads. Groups can consist of VMs, storage, virtual networks, web apps, databases, and/or database servers. This allows you to maintain consistent role-based access controls, tagging, cost management policies, and to create dependencies between resources so they’re deployed in the correct order. Read more: how to use Azure Resource Groups for better VM management.
How to Migrate to Azure Resource Manager VMs
For existing classic VMs that you wish to migrate to ARM, Azure recommends planning and a lab test in advance. There are four ways to migrate various resources:
- Migration of VMs, not in a virtual network – they will need to be on a virtual network on ARM, so you can choose a new or existing virtual network. These VMs will need to be restarted as part of the migration.
- Migration of VMs in a virtual network – these VMs do not need to be restarted and applications will not incur downtime, as only the metadata is migrating – the underlying VMs run on the same hardware, in the same network, and with the same storage.
- Migration of storage accounts – you can deploy Resource Manager VMs in a classic storage account, so that compute and network resources can be migrated independently of storage. Then, migrate over storage accounts.
- Migration of unattached resources – the following may be migrated independently: storage accounts with no associated disks or VMs, and network security groups, route tables, and reserved IPs that are not attached to VMs or networks.
There are a few methods you can choose to migrate:
We recommend that you move toward ARM resources and eliminate or migrate classic resources as soon as you can. Farewell, classic!
Whether you’re new to public cloud altogether or already use one provider and are interested in trying another, you may be interested in a comparison of the AWS vs Azure vs Google free tier. The big three cloud providers – AWS, Azure and Google Cloud – each have a free tier available that’s designed to give users the cloud experience without all the costs. They include free trial versions of numerous services so users can test out different products and learn how they work before they make a huge commitment. While they may only cover a small environment, it’s a good way to learn more about each cloud provider. For all of the cloud providers, the 12-month free trials are available to only new users.
AWS Free Tier Offerings
AWS free tier includes more than 60 products. There are two different types of free options that are available depending on the product used: always free and 12 months free. To help customers get started on AWS, the services that fall under the free 12-months are for new trial customers and give customers the ability to use the products for free (up to a specific level of usage) for one year from the date the account was created. Keep in mind that once the free 12 months are up, your services will start to be charged at the normal rate. Be prepared and review this checklist of things to do when you outgrow the AWS free tier.
Azure Free Tier Offerings
The Azure equivalent of a free tier is referred to as a free account. As a new user in Azure, you’re given a $200 credit that has to be used in the first 30 days after activating your account. When you’ve used up the credit or 30 days have expired, you’ll have to upgrade to a paid account if you wish to continue using certain products. Ensure that you have a plan to reduce Azure costs in place. If you don’t need the paid products, there’s also the always free option.
Some of the ways people choose to use their free account are to gain insights from their data, test and deploy enterprise apps, create custom mobile experiences and more.
Google Cloud Free Tier Offerings
The Google Cloud Free Tier is essentially an extended free trial that gives you access to free cloud resources so you can learn about Google Cloud services by trying them on your own.
The Google Cloud Free Tier has two parts – a 12-month free trial with a $300 credit to use with any Google Cloud services and always free, which provides limited access to many common Google Cloud resources, free of charge. Google Cloud gives you a little more time with your credit than Azure, you get the full 12 months of the free trial to use your credit. Unlike free trials from the other cloud providers, Google does not automatically charge you once the trial ends – this way you’re guaranteed that the free tier is actually 100% free. Keep in mind that your trial ends after 12 months or once you’ve exhausted the $300 credit. Any usage beyond the free monthly usage limits are covered by the $300 free credit – you must upgrade to a paid account to continue using Google Cloud.
Free Tier Limitations
It’s important to note that the always-free services vary widely between the cloud providers and there are usage limitations. Keep in mind the cloud providers’ motivations: they want you to get attached to the services so you start paying for them. So, be aware of the limits before you spin up any resources, and don’t be surprised by any charges.
In AWS, when your free tier expires or if your application use exceeds the free tier limits, you pay standard, pay-as-you-go service rates. Azure and Google both offer credits for new users that start a free trial, which are a handy way to set a spending limit. However, costs can get a little tricky if you aren’t paying attention. Once the credits have been used you’ll have to upgrade your account if you wish to continue using the products. Essentially, the credit that was acting as a spending limit is automatically removed so whatever you use beyond the free amounts, you will now have to pay for. In Google Cloud, there is a cap on the number of virtual CPUs you can use at once – and you can’t add GPUs or use Windows Server instances.
For 12 months after you upgrade your account, certain amounts of popular products are free. After 12 months, unless decommissioned, any products you may be using will continue to run, and you’ll be billed at the standard pay-as-you-go rates.
Another limitation is that commercial software and operating system licenses typically aren’t available under the free tiers.
These offerings are “use it or lose it” – if you don’t use all your credits or utilize all your usage, there will be no rollover into future months.
Popular Services, Products, and Tools to Check Out for Free
AWS has 33 products that fall under the one-year free tier – here are some of the most popular:
- Amazon EC2 Compute: 750 hours per month of compute time, per month of Linux, RHEL, SLES t2.micro or t3.micro instance and Windows t2.micro or t3.micro instance dependent on region.
- Amazon S3 Storage: 5GB of standard storage
- Amazon RDS Database: 750 hours per month of db.t2.micro database usage using MySQL, PostgreSQL, MariaDB, Oracle BYOL, or SQL Server, 20 GB of General Purpose (SSD) database storage and 20 GB of storage for database backups and DB Snapshots.
For the always-free option, you’ll find a number of products as well, some of these include:
- AWS Lambda: 1 million free compute requests per month and up to 3.2 million seconds of compute time per month.
- Amazon DynamoDB: 25 GB of database storage per month, enough to handle up to 200M requests per month.
- Amazon CloudWatch: 10 custom metrics and alarms per month, 1,000,000 API requests, 5GB of Log Data Ingestion and Log Data Archive and 3 Dashboards with up to 50 metrics.
Azure has 19 products that are free each month for 12 months – here are some of the most popular:
- Linux and Windows virtual machines: 750 hours (using B1S VM) of compute time
- Managed Disk Storage: 64 GB x 2 (P6 SSD)
- Blob Storage: 5GB (LRS hot block)
- File Storage: 5GB (LRS File Storage)
- SQL databases: 250 GB
For their always free offerings, you’ll find even more popular products – here are a few:
- Azure Kubernetes Service: no charge for cluster management, you only pay for the virtual machines and the associated storage and networking resources consumed.
- Azure DevOps: 5 users for open source projects and small projects (with unlimited private Git repos). For larger teams, the cost ranges from $6-$90 per month.
- Azure Cosmos DB (400 RU/s provisioned throughput)
Unlike AWS and Azure, Google Cloud does not have a 12 months free offerings. However, Google Cloud does still have a free tier with a wide range of always free services – some of the most popular ones include:
- Google BigQuery: 1 TB of queries and 10 GB of storage per month.
- Kubernetes Engine: One zonal cluster per month
- Google Compute Engine: 1 f1-micro instance per month only in U.S. regions. 30 GB-months HDD, 5 GB-months snapshot in certain regions and 1 GB of outbound network data from North America to all region destinations per month.
- Google Cloud Storage: 5 GB of regional storage per month, only in the US. 5,000 Class A, and 50,000 Class B operations, and 1 GB of outbound network data from North America to all region destinations per month.
Check out these blog posts on free credits for each cloud provider to see how you can start saving:
Azure Spot Virtual Machines are a purchasing option that can save significant amounts on infrastructure, for certain types of workloads. Azure Spot VMs are not created as a separate type of VM in Azure, instead, it’s a capability to bid for spare capacity at a discount from on demand pricing. But there’s one caveat: at any given point in time when Azure needs the capacity back, the Azure infrastructure will deallocate and evict the Spot VM from your environment.
In the past, Azure offered Low Priority VMs, which were charged at a fixed price. In March this year, that option was replaced by Azure Spot VMs. With the newer option, you bid by indicating the maximum price you are willing to pay for a VM.
Why Use Azure Spot VMs
Microsoft allows you to use their unused compute capacity at a discounted rate. These discounts are variable and can go up to >90% of the pay-as-you-go rates, depending on the size of the VM and the unused capacity available. The amount of capacity available can vary based on region, time of day, and more.
You can use Azure Spot VMs for workloads that are not critical or need to run 24×7. For example, a basic scenario would be for testing the load of a particular workload that you want to perform for a fraction of the cost. Other use cases include batch processing, stateless applications that can scale out, short-lived jobs that can be run again if the workload is evicted, etc.
Keeping in mind that there are no SLAs or availability guarantees for these Spot VMs. The most significant concern users have is that they may not be available to you to get resources, especially at peak load times. The issue is not with the service, it’s with how it is intended to work. Be aware of this when making the decision to use this approach.
Some important things to consider when using Azure Spot VMs:
- VMs are evicted based on capacity or by if the price exceeds your maximum set price
- Azure’s infrastructure will evict Spot VMs if Azure needs the capacity for pay-as-you-go workloads
- B-series and promo versions of any size (like Dv2, NV, NC, H promo sizes) are not supported
- A Spot VM cannot be converted to a regular VM or vice versa. You would have to delete the VM and attach the disk to a new VM
- VMs that are evicted and deallocated are not turned back on when capacity or price comes back inside allowed limits, you will need to manually turn them back on
- You will be unable to create your VM if the capacity or pricing are not inside the allowed limits
How to Use Azure Spot VMs
You have two choices when deploying Azure Spot VMs. When you enable the feature in your Azure environment, you need to select what type of eviction and eviction policy you want for the capacity:
Types of eviction:
- By capacity only – the VM is evicted when Azure needs capacity. In other words, your maximum price for the spot VM is the current price of the regular VM
- By maximum price – the VM is evicted when the spot price is greater than the maximum price
Eviction policy (currently available):
The eviction policy for Spot VMs is set to Stop / Deallocate which moves your evicted VMs to the stopped-deallocated state, allowing you to redeploy the evicted VMs at a later time. Remember reallocating Spot VMs will be dependent on there being available Spot capacity. However, the deallocated VMs will count against your spot vCPU quota and you will be charged for your underlying disks. If your Spot VM is evicted, but you still need capacity right away, Azure recommends you use a standard VM instead of Spot VM.
Do Azure Spot VMs Save You Money?
Yes: these discounted VMs can save you money. How much will vary? Azure Spot VMs prices are not fixed like standard instances, they change over the day and vary based on the supply and demand in a particular region.
Azure Spot VMs are a good option that can provide cost savings if your application can handle unexpected interruptions.
Use Spot VMs as part of your full cost-saving strategy. For on-demand workloads that aren’t needed 24×7, ensure you have appropriate on/off schedules in place. All VMs should be properly sized to the workload. You can start automating these Azure cost optimization tasks with ParkMyCloud today.
There are a wide range of Microsoft Azure VM types that are optimized to meet various needs. Machine types are specialized, and vary by virtual CPU (vCPU), disk capability, and memory size, offering a number of options to match any workload.
With so many options available, finding the right machine type for your workload can be confusing – which is why we’ve created this overview of Azure VM types (as we’ve done with EC2 instance types, and Google Cloud machine types). Note that while AWS EC2 instance types have names associated with their purpose, Azure instance type names are simply in a series from A to N. The chart below and written descriptions are a brief and easy reference, but remember that finding the right machine type for your workload will always depend on your needs.
General purpose VMs have a balanced CPU and memory, making them a great option for testing and development, smaller to medium databases, and web servers with low to medium traffic:
The newest size recommendation in the DC-series, the DCsv2, stands out because of the data protection and code confidentiality it provides while it’s being processed in the cloud. SGX technology and the latest generation of Intel XEON E-2288G Processor back these machines – these VMs can go up to 5.0GHz.
A-series VMs have a CPU-to-memory ratio that works best for entry-level workloads, like those for development and testing. Sizing is throttled for consistent processor performance to run the instance. Av2-series has the option to be deployed on a number of hardware types and processors. To figure out which hardware the size should be deployed on, users must query the virtual hardware in the VM.
Dv2 and Dsv2-series
Dv2 VMs boast powerful CPUs – roughly 35% faster than D-series VMs – and optimized memory, great for production workloads. They have the same memory and disk configurations as the D-series, based upon either a 2.1 GHz, 2.3 GHz or 2.4 GHz processor and Intel Boost Technology.
Dsv2-series sizes run on the same Dv2 processors with Intel Turbo Boost Technology 2.0 and also use premium storage.
With expanded memory (from ~3.5 GiB/vCPU to 4 GiB/vCPU) and adjustments for disk and network limits, the Dv3 series Azure VM type offers the most value to general purpose workloads. The sizes in this series offer a combination of memory, temporary storage, and vCPU that best fits best for enterprise applications, relational databases, in-memory caching, and analytics. It’s important to note that the Dv3-series no longer has the high memory VM sizes of the D/Dv2-series.
This series’ sizes feature premium storage disks and run on 2.1, 2.3, or 2.4 GHz Intel Xeon processors with Intel Turbo Boost Technology 2.0, the Dsv3-series is best suited for most production workloads.
Similar to the AWS t-series machine type family, B-series burstable VMs and ideal for workloads that do not rely on full and continuous CPU performance. Use cases for this series’ VM types include small databases, dev and test environments and low-traffic web servers, microservices and more. Thanks to the B-series, customers can purchase a VM size that builds up credits when underutilized compared to its base performance, and the accumulated credits can be used as bursts. Spikes in compute power allow the VM to burst above the base performance if for higher CPU performance when needed.
Dav4 and Dasv4-series
Dav4-series are one of the new sizes that utilize a 2.35Ghz AMD EPYCTM 7452 processor and can reach a max frequency of 3.35GHz. The combination of memory, temporary storage and vCPU makes these VMs suitable for most production workloads. For premium SSD, Dasv4-series sizes are the best option.
Ddv4 and Ddsv4-series
Similar to other VMs in the D-series, these sizes utilize a combination of memory, temporary disk storage and vCPU that provides a better value for most general-purpose workloads. These new VM sizes have faster and 50% larger local storage (up to 2,400 GiB) and are designed for applications that benefit from low latency, high-speed local storage. The Ddv4-series processors run in a hyper-threaded configuration making them a great option for enterprise-grade applications, relational databases, in-memory caching, and analytics.
The major difference between the two series is that the Ddsv4-series supports Premium Storage and premium Storage caching, while Ddv4-series does not.
Dv4 and Dsv4-series
Both of these new series are currently in preview. The Dv4-series is optimal for general purpose workloads since they run on processors in a hyper-threaded configuration. It features a sustained all core Turbo clock speed of 3.4 GHz.
The Dsv4-series runs on the same processors as the Dv4-series, and even has the same features. The major difference between the two series is that the Dsv4-series supports Premium Storage and premium Storage caching, while Dv4-series does not.
Compute optimized Azure VM types offer a high CPU-to-memory ratio. They’re suitable for medium traffic web servers, network appliances, batch processing, and application servers.
With a base core frequency of 3.4 GHz and a maximum single-core turbo frequency of 3.7 GHz, Fsv2 series VM types offer up to twice the performance boost for vector processing workloads. Not only do they offer great speed for any workload, the Fsv2 also offers the best value for its price based on the ratio of Azure Compute Unit (ACU) per vCPU.
Memory optimized VM types are higher in memory as opposed to CPU, and best suited for relational database services, analytics, and larger caches.
Enterprise applications and large databases will benefit most from the M-series for having the most memory (up to 3.8 TiB) and the highest vCPU count (up to 128) of any VM in the cloud.
The VMs in this series offer the highest vCPU count (up to 416 vCPUs) and largest memory (up to 11.4 TiB) of any VM. Because of these features, It’s ideal for extremely large databases or applications that benefit from high vCPU counts and large amounts of memory. The Mv2-series runs on an Intel® Xeon® processor with an all core base frequency of 2.5 GHz and a max turbo frequency of 3.8 GHz.
Dv2 and DSv2-series 11-15
The Dv2 and DSv2-series 11-15 follow in the footsteps of the original D-series, the main differentiation is a more powerful CPU. For applications that require fast vCPUs, reliable temporary storage, and demand more memory, the Dv2 and DSv2-series all fit the bill for enterprise applications. The Dv2 series offers speed and power with a CPU about 35% faster than that of the D-series. Based on the 2.1, 2.3 and 2.4 GHz Intel Xeon® processors and with Intel Turbo Boost Technology 2.0, they can reach up to 3.1 GHz. The Dv2-series also has the same memory and disk configurations as the D-series.
Ev3-series and Esv3-series
The Ev3 follows in the footsteps of the high memory VM sizes originating from the D/Dv2 families. This Azure VM types provides excellent value for general purpose workloads, boasting expanded memory (from 7 GiB/vCPU to 8 GiB/vCPU) with adjustments to disk and network limits per core basis in alignment with the move to hyperthreading.
The Esv3-series is the optimal choice for memory-intensive enterprise applications. If you want premium storage disks, the Esv3-series sizes are the perfect ones. A difference between the two series is that the Esv3-series supports Premium Storage and premium Storage caching, while Ev3-series does not.
Eav4 and Easv4-series
The Eav4 and Easv4-series utilize the processors they run on in a multi-threaded configuration increasing options for running memory optimized workloads. Though the Eav4-series and Easv4-series have the same memory and disk configurations as the Ev3 & Esv3-series, the Eav4-series sizes are ideal for memory-intensive enterprise applications.
Use the Easv4-series sizes for premium SSD. The Easv4-series sizes are ideal for memory-intensive enterprise applications. Easv4-series sizes can achieve a boosted maximum frequency of 3.35GHz.
Edv4 and Edsv4-series
High vCPU counts and large amounts of memory make Edv4 and Edsv4-series the ideal option for extremely large databases and other applications that benefit from these features. It features a sustained all core Turbo clock speed of 3.4 GHz and many new technology features. Unlike the Ev3/Esv3 sizes with Gen2 VMs, these new VM sizes will have 50% larger local storage, as well as better local disk IOPS for both read and write.
The Edv4 and Edsv4 virtual machine sizes feature up to 504 GiB of RAM, in addition to fast and large local SSD storage (up to 2,400 GiB). These virtual machines are ideal for memory-intensive enterprise applications and applications that benefit from low latency, high-speed local storage. You can attach Standard SSDs and Standard HDDs disk storage to the Edv4 VMs.
Ev4 and Esv4-series
These new sizes are currently under Public Preview Only – you can signup to access them here.
The Ev4 and Esv4-series are ideal for various memory-intensive enterprise applications. They run in a hyper-threaded configuration on 2nd Generation Intel® Xeon® processors and feature up to 504 GiB of RAM.
For big data, data warehousing, large transactional databases, SQL, and NoSQL databases, storage optimized VMs are the best type for their high disk throughput and IO.
Lsv2-series VMs provide high throughput, low latency, directly mapped local NVMe making it these VMs ideal for NoSQL stores such as Apache Cassandra and MongoDB. The Lsv2-series comes in sizes 8 to 80 vCPU and each vCPU has 8 GiB of memory. VMs in this series are optimized and use the local disk on the node that is attached directly to the VM.
GPU VM types, specialized with single or multiple NVIDIA GPUs, work best for video editing and heavy graphics rendering – as in compute-intensive, graphics-intensive, and visualization workloads.
NC, NCv2 and NCv3-series
The sizes in these series are optimized for compute-intensive and network-intensive applications and algorithms. The NCv2-series is powered by NVIDIA Tesla P100 GPUs and provides more than double the computational performance of the NC-series. The NCv3-series is powered by NVIDIA Tesla V100 GPUs and can provide 1.5x the computational performance of the NCv2-series.
NV and NVv3-series
These sizes were made and optimized for remote visualization, streaming, gaming, encoding, and VDI scenarios. These VMs are targeted for GPU accelerated graphics applications and virtual desktops where customers want to visualize their data, simulate results to view, work on CAD, or render and stream content.
ND and NDv2-series
These series are focused on training and inference scenarios for deep learning. The ND-series VMs are a new addition to the GPU family and offer excellent performance for training and inference making them ideal for Deep Learning workloads and AI. The ND-series is also enabled to fit much larger neural net models thanks to the much larger GPU memory size (24 GB).
The NDv2-series is another new addition to the GPU family and with its excellent performance, it meets the needs of the most demanding machine learning, GPU-accelerated AI, HPC workloads and simulation.
The NVv4-series VMs are optimized and designed for remote visualization and VDI. With partitioned GPUs, NVv4 offers the right size for workloads requiring smaller GPU resources. With separated GPUs, this series offers the perfect size VMs for workloads that require smaller GPU resources.
High Performance Compute
For the fastest and most powerful virtual machines, high performance compute is the best choice with optional high-throughput network interfaces (RDMA).
The H-series VMs were built for handling batch workloads, analytics, molecular modeling, and fluid dynamics. These 8 or 16 vCPU VMs are built on the Intel Haswell E5-2667 v3 processor technology and up to 14 GB of RAM per CPU core, and no hyperthreading.
Besides sizable CPU power, the H-series provides options for low latency RDMA networking with FDR InfiniBand and different memory configurations for supporting memory intensive compute requirements.
Applications driven by memory bandwidth, such as explicit finite element analysis, fluid dynamics, and weather modeling are the best fit for HB-series VMs. These VMs feature 4 GB of RAM per CPU core and no simultaneous multithreading.
For applications driven by dense computation, like implicit finite element analysis, molecular dynamics, and computational chemistry HC-series VMs are the best fit. HC VMs feature 8 GB of RAM per CPU core, and no hyperthreading.
Similar to other VMs in the High Performance compute family, HBv2-series VMs are optimized for applications driven by memory bandwidth, such as fluid dynamics, finite element analysis, and reservoir simulation. HBv2 VMs feature 4 GB of RAM per CPU core, and no simultaneous multithreading. These VMs enhance application performance, scalability, and consistency.
What Azure VM type is right for your workload?
The good news is that with this many options VM types, you’re bound to find the right type to meet your computing needs – as long as you know what those needs are. With good insight into your workload, usage trends, and business needs, you’ll be able to find the Azure VM type that’s right for your workloads.
Spot instances and similar “spare capacity” models are frequently cited as one of the top ways to save money on public cloud. However, we’ve noticed that fewer cloud customers are taking advantage of this discounted capacity than you might expect.
We say “spot instances” in this article for simplicity, but each cloud provider has their own name for the sale of discounted spare capacity – AWS’s spot instances, Azure’s spot VMs and Google Cloud’s preemptible VMs.
Spot instances are a type of purchasing option that allows users to take advantage of spare capacity at a low price, with the possibility that it could be reclaimed for other workloads with just brief notice.
In the past, AWS’s model required users to bid on Spot capacity. However, the model has since been simplified so users don’t actually have to bid for Spot Instances anymore. Instead, they pay the Spot price that’s in effect for the current hour for the instances that they launch. The prices are now more predictable with much less volatility. Customers still have the option to control costs by providing a maximum price that they’re willing to pay in the console when they request Spot Instances.
Spot Instances in Each Cloud
Variations of spot instances are offered across different cloud providers. AWS has Spot Instances while Google Cloud offers preemptible VMs and as of March of this year, Microsoft Azure announced an even more direct equivalent to Spot Instances, called Azure Spot Virtual Machines.
Spot VMs have replaced the preview of Azure’s low-priority VMs on scale sets – all eligible low-priority VMs on scale sets have automatically been transitioned to Spot VMs. Azure Spot VMs provide access to unused Azure compute capacity at deep discounts. Spot VMs can be evicted at any time if Azure needs capacity.
AWS spot instances have variable pricing. Azure Spot VMs offer the same characteristics as a pay-as-you-go virtual machine, the differences being pricing and evictions. Google Preemptible VMs offer a fixed discounting structure. Google’s offering is a bit more flexible, with no limitations on the instance types. Preemptible VMs are designed to be a low-cost, short-duration option for batch jobs and fault-tolerant workloads.
Adoption of Spot Instances
Our research indicates that less than 20% of cloud users use spot instances on a regular basis, despite spot being on nearly every list of ways to reduce costs (including our own).
While applications can be built to withstand interruption, specific concerns remain, such as loss of log data, exhausting capacity and fluctuation in the spot market price.
In AWS, it’s important to note that while spot prices can reach the on-demand price, since they are driven by long-term supply and demand, they don’t normally reach on-demand price.
A Spot Fleet, in which you specify a certain capacity of instances you want to maintain, is a collection of Spot Instances and can also include On-Demand Instances. AWS attempts to meet the target capacity specified by using a Spot Fleet to launch the number of Spot Instances and On-Demand Instances specified in the Spot Fleet request.
To help reduce the impact of interruptions, you can set up Spot Fleets to respond to interruption notices by hibernating or stopping instances instead of terminating when capacity is no longer available. Spot Fleets will not launch on-demand capacity if Spot capacity is not available on all the capacity pools specified.
AWS also has a capability that allows you to use Amazon EC2 Auto Scaling to scale Spot Instances – this feature also combines different EC2 instance types and pricing models. You are in control of the instance types used to build your group – groups are always looking for the lowest cost while meeting other requirements you’ve set. This option may be a popular choice for some as ASGs are more familiar to customers compared to Fleet, and more suitable for many different workload types. If you switch part or all of your ASGs over to Spot Instances, you may be able to save up to 90% when compared to On-Demand Instances.
Another interesting feature worth noting is Amazon’s capacity-optimized spot instance allocation strategy. When customers diversify their Fleet or Auto Scaling group, the system will launch capacity from the most available capacity pools, effectively decreasing interruptions. In fact, by switching to capacity-optimized allocation users are able to reduce their overall interruption rate by about 75%.
Is “Eviction” Driving People Away?
There is one main caveat when it comes to spot instances – they are interruptible. All three major cloud providers have mechanisms in place for these spare capacity resources to be interrupted, related to changes in capacity availability and/or changes in pricing.
This means workloads can be “evicted” from a spot instance or VM. Essentially, this means that if a cloud provider needs the resource at any given time, your workloads can be kicked off. You are notified when an AWS spot instance is going to be evicted: AWS emits an event two minutes prior to the actual interruption. In Azure, you can opt to receive notifications that tell you when your VM is going to be evicted. However, you will have only 30 seconds to finish any jobs and perform shutdown tasks prior to the eviction making it almost impossible to manage. Google Cloud also gives you 30 seconds to shut down your instances when you’re preempted so you can save your work for later. Google also always terminates preemptible instances after 24 hours of running. All of this means your application must be designed to be interruptible, and should expect it to happen regularly – difficult for some applications, but not so much for others that are rather stateless, or normally process work in small chunks.
Companies such as Spot – recently acquired by NetApp (congrats!) – help in this regard by safely moving the workload to another available spot instance automatically.
Our research has indicated that fewer than one-quarter of users agree that their spot eviction rate was too low to be a concern – which means for most, eviction rate is a concern. Of course, it’s certainly possible to build applications to be resilient to eviction. For instance, applications can make use of many instance types in order to tolerate market fluctuations and make appropriate bids for each type.
AWS also offers an automatic scaling feature that has the ability to increase or decrease the target capacity of your Spot Fleet automatically based on demand. The goal of this is to allow users to scale in conservatively in order to protect your application’s availability.
Early Adopters of Spot and Other Innovations May be One and the Same
People who are hesitant to build for spot more likely use regular VMs, perhaps with Reserved Instances for savings. It’s likely that people open to the idea of spot instances are the same who would be early adopters for other tech, like serverless, and no longer have a need for Spot.
For the right architecture, spot instances can provide significant savings. It’s a matter of whether you want to bother.