AWS vs Alibaba Cloud Pricing: A Comparison of Compute Options

As Alibaba Cloud continues to attract cloud users and extend its global presence, we’ll review a comparison of AWS vs Alibaba Cloud pricing. Commonly recognized as the #4 cloud provider (from a revenue perspective, anyway), Alibaba is one of the fastest-growing companies in the space today.

Alibaba has been getting a lot of attention lately, given its rapid growth, and making headlines after the release of their latest quarterly revenue and full fiscal year 2019 reports. Alibaba is at the top of the market in Asia, and dominating in China with cloud revenue up 66% year-over-year. While Alibaba is in the top 5 CSPs worldwide, they still have a lot of plans for the future to maintain this growth and continue to move up. 

The company said it is focused on high-value security, analytics, and artificial intelligence tools and “rationalizing our offerings of commodity products and services.” With an annual revenue run rate of $4.5 billion, it is clear that Alibaba Cloud intends to compete globally with AWS and other major cloud providers. 

However, on a global scale, AWS continues to dominate the market. In the latest quarter, Amazon reported Amazon Web Services (AWS) sales of $7.7 billion, compared to $5.44 billion at this time last year. AWS revenue grew 41% in the first quarter – at this time last year, that number was 49%.

ParkMyCloud supports Alibaba Cloud and AWS, and with that, let us focus on pricing and cost savings – our forte. In this blog, we dive a bit into the pricing of Alibaba Elastic Compute Service (ECS), compare it with that of the AWS EC2 service, and ask whether Alibaba Cloud computing can offer better value than AWS.

Alibaba ECS vs AWS EC2

Elastic Compute Service (ECS) and Elastic Compute Cloud (EC2), respectively, are the standard compute services offered by Alibaba Cloud and AWS.

Both cloud computing services provide the same core features:

  • The ability to choose from dozens of instance types.
  • Support for virtual as well as bare-metal servers.
  • Compatibility with a variety of Windows and Linux-based operating systems.
  • The ability to create custom images.

The major differences between Alibaba Cloud ECS and AWS EC2 are that Alibaba Cloud provides a wider range of instance families and that AWS offers more regions globally.

Alibaba vs Aliyun

Finding actual pricing for comparison purposes can be a bit complicated, as the prices are listed in a couple of different places and do not quite match up: pricing varies between instance types, and no instances from the two companies are identical. If one searches for Alibaba pricing, one ends up here, which I am going to call the “Alibaba Cloud” site. However, when you actually get an account and want to purchase an instance, you end up here or here, both of which I will call the “Aliyun” site. [Note that you may not be able to see the Aliyun sites without signing up for an account and actually logging in.]

Aliyun (literally translated “Ali Cloud”) was the original name of the company, and the name was changed to Alibaba Cloud in July 2017. Unsurprisingly, the Aliyun name has stuck around on the actual operational guts of the company, reflecting that it is probably hard-coded all over the place, both internally and externally with customers. (Supernor’s 3rd Conjecture: Engineering can never keep up with Marketing.)

Both sites show that, like the other major cloud providers, Alibaba’s pricing model includes a Pay-As-You-Go (PAYG) offering, with per-second billing. Note, however, that in order to save money on stopped instances, one must specifically enable a “No fees for stopped instances” feature. Luckily, this is a global one-time setting that applies to all Pay-As-You-Go VPC instances, and you can set it and forget it. Unlike on AWS, this feature is not available for any instances with local disks (this and other aspects of the description lead me to believe that Alibaba instances tend to be “sticky” to the underlying hardware instance). On AWS, local disks are described as ephemeral and are simply deallocated when they are not in use. Like AWS, Alibaba Cloud system/data disks continue to accrue costs even when an instance is stopped.

Both sites also show that Alibaba has a one-month prepaid Subscription model. Based on a review of the pricing listed for the us-east-1 region on the Alibaba Cloud site, the monthly subscription offers a substantial 30-60% discount compared to the cost of a PAYG instance that is left up for a full month. For a non-production environment that may only need to be up during normal business hours (say, 9 hours per day, weekdays only), it may be more cost-effective to go with PAYG pricing and use the ParkMyCloud service to shut the instances down during off-hours, saving about 73%.
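The 73% figure is simple arithmetic on a 168-hour week, and the same numbers show why scheduled PAYG can beat the subscription. Here is a minimal Python sketch, assuming the 9-hour weekday schedule and the 30-60% subscription discount range mentioned above; the function names are illustrative, and disk charges that continue while an instance is stopped are ignored.

```python
HOURS_PER_WEEK = 7 * 24  # 168 hours

def parked_savings(hours_per_day: float, days_per_week: int) -> float:
    """Fraction of PAYG compute cost avoided by running only on the schedule."""
    running_hours = hours_per_day * days_per_week
    return 1 - running_hours / HOURS_PER_WEEK

def payg_beats_subscription(uptime_fraction: float, subscription_discount: float) -> bool:
    """True when scheduled PAYG costs less than a full-month subscription.

    Both costs are expressed relative to the full-month PAYG price:
    scheduled PAYG costs uptime_fraction, the subscription costs
    1 - subscription_discount.
    """
    return uptime_fraction < 1 - subscription_discount

savings = parked_savings(9, 5)  # 9 hours per day, weekdays only
print(f"{savings:.1%}")  # 73.2%

# Compare against both ends of the 30-60% subscription discount range.
for discount in (0.30, 0.60):
    print(discount, payg_beats_subscription(1 - savings, discount))
```

With the instance up roughly 27% of the time, scheduled PAYG comes out ahead even against the 60% subscription discount, since that subscription still costs 40% of the full-month PAYG price.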

But this is where the similarities between the sites end. For actual pricing, instance availability, and even the actual instance types, one really needs to dive into a live Alibaba account. In particular, if PAYG is your preference, note that the Alibaba public site appears to have PAYG pricing listed for all of their available instance types, which is not consistent with what I found in the actual purchasing console.

Low-End Instance Types – “Entry Level” and “Basic”

The Alibaba Cloud site breaks down the instance types into “Entry Level” and “Enterprise”, listing numerous instance types under both categories. All of the Entry Level instance types are described as “Shared Performance”, which appears to mean the underlying hardware resources are shared amongst multiple instances in a potentially unpredictable way, or as described by Alibaba: “Their computing performance may be unstable, but the cost is relatively low” – an entertaining description to say the least. I did find these instance types on the internal purchasing site, but did not delve any further with them, as they do not offer a point of reference for our AWS vs. Alibaba Cloud pricing comparison. They may be an interesting path for additional investigation for non-production instance types where unstable computing performance may be OK in exchange for a lower price.

That said…after logging in to the Alibaba management console, reaching the Aliyun side of the website, there is no mention of Entry Level vs Enterprise. Instead, we see the top-level options of “Basic Purchase” vs “Advanced Purchase”. Under Basic Purchase, there are four “t5” instance types. The t5 types appear to directly correspond to the first four AWS t2 instance types, in terms of building up CPU credits.

These four instance types do not appear to support the PAYG pricing model. Pricing is only offered on a monthly subscription basis. A 1-year purchase plan is also offered, but the math shows this is just the monthly price x12. It is important to note that the Aliyun site itself has issues, as it lists the t5 instance types in all of the Alibaba regions, but I was unable to purchase any of them in the us-east-1 region – “The configuration for the instance you are creating is currently not supported in this zone.”  (A purchase in us-west-1, slightly more expensive, was fine).

The following shows a price comparison for Alibaba vs AWS for “t” instance prices in a number of regions. The AWS prices reflect the hourly PAYG pricing, multiplied by an average 730 hour month. I was not able to get pricing for any AWS China region, so the Alibaba pricing is provided for reference.
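For readers who want to reproduce the monthly AWS figures, the conversion is a single multiplication. This is a small sketch in Python; the hourly rate used here is a made-up placeholder, not a quoted AWS price.

```python
HOURS_PER_MONTH = 730  # 8,760 hours per year divided by 12 months

def monthly_from_hourly(hourly_rate: float) -> float:
    """Convert an hourly PAYG rate to an average monthly cost, rounded to cents."""
    return round(hourly_rate * HOURS_PER_MONTH, 2)

example_rate = 0.10  # hypothetical hourly rate in USD, for illustration only
print(monthly_from_hourly(example_rate))  # 73.0
```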

While the AWS prices are higher, the AWS instances are PAYG, and thus could be stopped when not being used, common for t2 instances used in a dev-test environment, and potentially saving over 73%. One can easily see that this kind of savings is needed to compete with the comparatively low Alibaba prices. I do have to wonder what is up with that Windows pricing in China….does Microsoft know about this??

Aliyun “Advanced Purchase”

Looking at the “Advanced” side of the Aliyun purchasing site, we get a lot more options, including Pay-As-You-Go instances. To keep the comparison simple, I am going to limit the scope here to a couple of instance types, trying to compare a couple m5 and i3 instances with their Alibaba equivalents. I will list PAYG pricing where offered.

In this table, the listed monthly AWS prices reflect the hourly pay-as-you-go price, multiplied by an average 730 hour month.

The italicized/grey numbers under Alibaba indicate PAYG numbers that had to be pulled from the public-facing website, as the instance type was not available for PAYG purchase on the internal site. From a review of the various options on the internal Aliyun site, it appears the PAYG option is not actually offered for very many standalone instance types on Alibaba…

The main reason I pulled in the PAYG prices from the second source was for auto scaling, which is normally charged at PAYG prices. In Alibaba, “all ECS instances that Auto Scaling automatically creates, or manually adds to a scaling group will be charged according to their instance types. Note that you will still be charged for Pay-As-You-Go instances even after you stop them.”  It is possible, however, to manually add subscription-based instances to an auto scaling group, and configure them to be not removed when the group scales-down.

In general, the full price of the AWS Linux instances over a month is 22-35% higher than that of an Alibaba 1-month subscription. A full-price AWS Windows instance over a month is 9-25% higher than that of an Alibaba subscription. (And once again, it appears Windows licensing fees are not a factor in China.)

When it comes to Alibaba Cloud pricing vs AWS, Alibaba Cloud is trying to attract business and expand its global footprint by offering special promotions, typically consisting of free trials, specially priced starter packages, and time-limited discounts on premium services. In many cases, taking advantage of these promotions can save money, but AWS has cost-saving options of its own.

AWS Introduces Savings Plans for EC2

Amazon has its fair share of money-saving offerings as well. AWS announced the release of AWS Savings Plans – a new system for getting a discount on committed usage for EC2.

There are two kinds of Savings Plan: 

  • Compute Savings Plan – Apply to EC2 usage regardless of instance family, size, AZ, region, OS, or tenancy.  For any given instance configuration, pricing is similar (if not identical) to an equivalent Convertible RI, giving up to a 66% discount.
  • EC2 Instance Savings Plan – Specific to EC2 instances within a family in a specific region, but regardless of size, OS, or tenancy.  For any given instance configuration, pricing is similar to an equivalent Standard RI, giving up to a 72% discount in exchange for the reduced flexibility.

AWS Reserved Instance new queuing option

You can now purchase Reserved Instances that, rather than going into effect immediately, are scheduled to go into effect at a future date.

Now, when planned correctly, you can avoid lapsing on Reserved Instance coverage for your workloads by scheduling a new reservation purchase to go into effect as soon as the previous one expires. The furthest in advance you can schedule a purchase is three years, which is also the longest RI term available. 

However, queued RI purchases have a few limitations: they can be queued for regional Reserved Instances, but not zonal Reserved Instances. Regional RIs are the broader option as they cover any availability zone in a region, while zonal RIs are for a specific availability zone and actually reserve capacity as well.
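The two queuing constraints described above (regional scope only, and no more than three years ahead) can be sketched as a simple check. This is a hypothetical Python helper for planning purposes, not an AWS API call; the scope labels and the function name are assumptions made for illustration.

```python
from datetime import datetime, timedelta

# "Three years" approximated in days; an assumption for this sketch.
MAX_QUEUE_AHEAD = timedelta(days=3 * 365)

def can_queue_purchase(scope: str, purchase_time: datetime, now: datetime) -> bool:
    """Check whether a queued RI purchase fits the constraints described above."""
    if scope != "regional":  # zonal RIs cannot be queued
        return False
    ahead = purchase_time - now
    return timedelta(0) < ahead <= MAX_QUEUE_AHEAD

now = datetime(2020, 1, 1)
current_ri_expiry = datetime(2020, 6, 30)  # hypothetical expiry of an existing RI

print(can_queue_purchase("regional", current_ri_expiry, now))  # True
print(can_queue_purchase("zonal", current_ri_expiry, now))     # False
```

Queuing the replacement for the moment the existing reservation expires is what avoids a lapse in coverage.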

AWS vs Alibaba Cloud Pricing: Alibaba is cheaper, but…

Alibaba definitely comes out as less expensive in this AWS vs Alibaba Cloud pricing comparison – the one-month subscription has a definite impact. However, for longer-lived instances, AWS Reserved Instances will certainly be less expensive, running about 40-75% below AWS PAYG, and thus less than some if not all of the Alibaba monthly subscriptions. AWS RIs are also more easily applicable to auto scaling groups than a monthly subscription instance.
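To see how the Reserved Instance discount interacts with the earlier price gap, here is a rough Python sketch. It assumes the ranges quoted in this post (AWS Linux PAYG 22-35% above the Alibaba one-month subscription, RIs 40-75% below AWS PAYG) and normalizes the Alibaba subscription price to 1.0; the function name is illustrative.

```python
def ri_cost_relative_to_subscription(payg_premium: float, ri_discount: float) -> float:
    """AWS RI cost as a multiple of the Alibaba one-month subscription price.

    payg_premium: how much higher AWS PAYG is than the subscription (0.22 = 22%).
    ri_discount:  RI discount off AWS PAYG (0.40 = 40%).
    """
    return (1 + payg_premium) * (1 - ri_discount)

# Least favorable case for AWS: largest PAYG premium, smallest RI discount.
worst = ri_cost_relative_to_subscription(0.35, 0.40)
# Most favorable case for AWS: smallest PAYG premium, largest RI discount.
best = ri_cost_relative_to_subscription(0.22, 0.75)

for label, ratio in (("least favorable", worst), ("most favorable", best)):
    print(f"{label}: RI costs {ratio:.0%} of the subscription price")
```

Under these assumed ranges the ratio stays below 1.0 at both ends, which is consistent with the claim above, though actual results depend on instance type, region, and RI term.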

For non-production instances that can be shut down when not in use, PAYG is less expensive for both cloud providers, and ParkMyCloud can help you schedule the downtime. The difficulty with Alibaba will be finding instance types that can actually be purchased with the PAYG option.

Cloud Certification Guide: How to Master & Showcase Your Expertise in AWS, Azure, & Google Cloud

Each of the ‘big three’ cloud providers (AWS, Azure, GCP) offers a number of cloud certification options that individuals can earn to validate their cloud knowledge and skill set, while helping them advance in their careers and broaden the scope of their achievements.

Between the PaaS-specific, role-based (such as developer or architect), and domain-focused certifications, CSPs have numerous options available to help you bring more value to your organization as you keep up with new business demands and continue to challenge yourself. With these certifications, you are more likely to achieve business goals thanks to your proficiency in specific areas – and benefit from an extra edge on your resume in your next job search.

Here’s an overview of the certifications offered by AWS, Azure, and GCP and what capabilities an individual validates by completing these certifications. 

Amazon Web Services (AWS) Certifications

AWS offers certifications for different learning levels. The four different categories/levels of certifications include:

  • Foundational: individuals should have at least six months of basic/foundational industry and AWS knowledge.
  • Associate: expected to have one year of experience solving problems and implementing solutions with AWS.
  • Professional: aimed at individuals who have two years of comprehensive experience operating, designing and troubleshooting solutions using AWS.
  • Specialty: each of the certifications in this category is based on technical AWS experience in the specialty domain. Requirements for these certifications range from a minimum of 6 months to 5 years of hands-on experience.

AWS certifications offered include:

  • AWS Certified Cloud Practitioner 
    • Individuals are expected to effectively demonstrate a comprehensive understanding of AWS fundamentals and best practices. 
  • AWS Certified Solutions Architect – Associate 
    • Individuals in an associate solutions architect role have 1+ years of experience designing available, fault-tolerant, scalable, and most importantly cost-efficient, distributed systems on AWS.
    • Can demonstrate how to build and deploy applications on AWS.
  • AWS Certified SysOps Administrator – Associate
    • This certification is meant for systems administrators that hold a systems operations role and have at least one year of hands-on experience in management, operations and deployments on AWS.
    • They must be able to migrate on-premises workloads to AWS
    • They can estimate usage costs and identify operational cost control methods. 
    • Must prove knowledge of deploying, operating and managing highly available, scalable and fault-tolerant systems on AWS.
  • AWS Certified Developer – Associate
    • This is for individuals who hold a development role and have one or more years of experience developing and maintaining AWS-based applications.
    • Display a basic understanding of core AWS services, uses, and basic AWS architecture best practices.
    • Demonstrate that they are capable of developing, deploying, and debugging cloud-based applications using AWS
  • AWS Certified Solutions Architect – Professional 
    • Individuals in a professional solutions architect role have two or more years of experience operating and managing systems on AWS.
    • They must be able to design and deploy scalable, highly available, and fault-tolerant applications on AWS.
    • Must demonstrate knowledge of migrating complex, multi-tier applications on AWS
    • They are responsible for implementing cost-control strategies.
  • AWS Certified DevOps Engineer – Professional
    • Intended for individuals who have a DevOps engineer role and two or more years of experience operating, provisioning and managing AWS environments.
    • They are able to implement and manage continuous delivery systems and methodologies on AWS.
    • Additionally, they must be able to implement and automate security controls, governance processes, and compliance validation.
    • Can deploy and define metrics, monitoring and logging systems on AWS. 
    • Are responsible for designing, managing, and maintaining tools that automate operational processes.
  • AWS Certified Advanced Networking – Specialty
    • Intended for individuals who perform intricate networking tasks.
    • Design, develop, and deploy cloud-based solutions using AWS
    • Design and maintain network architecture for all AWS services
    • Leverage tools to automate AWS networking tasks
  • AWS Certified Big Data – Specialty
    • For individuals who perform complex Big Data analyses and have at least two years of experience using AWS.
    • Implement core AWS Big Data services according to basic architecture best practices
    • Design and maintain Big Data
    • Leverage tools to automate data analysis
  • AWS Certified Security – Specialty
    • Individuals who have a security role and at least two years of hands-on experience securing AWS workloads.
    • Exhibit an understanding of specialized data classifications and AWS data protection mechanisms as well as data encryption methods and secure Internet protocols and AWS mechanisms to implement them 
    • Knowledge of AWS security services and features to provide a secure production environment
    • An understanding of security operations and risk
  • AWS Certified Machine Learning – Specialty
    • Intended for individuals in a development or data science role.
    • Ability to design, implement, deploy and maintain machine learning solutions for specific business problems. 
  • AWS Certified Alexa Skill Builder – Specialty
    • Intended for individuals who have a role as an Alexa skill builder. 
    • Individuals have demonstrated an ability to design, build, test, publish and manage Amazon Alexa skills.

Microsoft Azure Certifications

Following the Azure learning path under Microsoft, there are certifications available that allow you to demonstrate your expertise in Microsoft cloud-related technologies and advance your career by earning one of the new Azure role-based certifications or an Azure-related certification in platform, development, or data.

Azure certifications include:

  • Azure Solutions Architect Expert
    • Intended for individuals that have an expertise in network, compute, security and storage so that they can design solutions that run on Azure
  • Azure Fundamentals
    • Individuals will prove their understanding of cloud concepts, Azure pricing and support, core Azure services, as well as the fundamentals of cloud privacy, security, trust and compliance. 
  • Azure DevOps Engineer Expert
    • Individuals will demonstrate an ability to combine people, process, and technologies to continuously deliver valuable products and services that meet business objectives in addition to end user needs. 
  • Azure Developer Associate
    • For individuals that can design, build, test and maintain cloud solutions – such as applications and services – and partner with cloud solutions architects, cloud administrators, cloud DBAs, and clients in order to implement these solutions. 
  • Azure Data Scientist Associate
    • Intended for individuals that apply Azure’s machine learning techniques to train, evaluate, and deploy models that will ultimately help solve business problems. 
  • Azure Data Engineer Associate
    • For individuals that design and implement the management, security, monitoring, and privacy of data – using the full stack of Azure data services – to satisfy business needs. 
  • Azure AI Engineer Associate
    • Intended for individuals that use Machine Learning, Knowledge Mining, and Cognitive Services to architect and implement Microsoft AI solutions – this involves natural language processing, computer vision, speech, agents and bots. 
  • Azure Administrator Associate
    • Individuals must demonstrate their ability to implement, monitor and maintain Azure solutions – this includes major services related to storage, compute, security and network. 
  • Azure Security Engineer Associate 
    • Individuals are expected to be able to implement security controls and threat protection, manage identity and access. Additionally, they must be able to protect data, applications, and networks in the cloud as well as hybrid environments as part of end-to-end infrastructure. 
  • Azure for SAP Workloads Specialty 
    • In this specialty, architects have extensive experience and knowledge of the SAP Landscape Certification process and industry standards that are specific and critical to the long-term operation of an SAP solution. 
  • Azure IoT Developer Specialty
    • In this specialty, individuals must prove that they understand how to implement the Azure services that form an IoT solution – this includes data analysis, data processing, data storage options, and PaaS options. 
    • Must be able to recognize Azure IoT service configuration settings within the code portion of an IoT solution.

GCP Certifications

Google offers three different levels of available certifications:

  • Associate certification – focused on the fundamental skills of deploying, monitoring, and maintaining projects on Google Cloud.
    • This certification is a good starting point for those new to cloud and can be used as a path to professional level certifications.
    • Recommended experience: 6+ months building on Google Cloud
  • Professional certification – span key technical job functions and assess advanced skills in design, implementation, and management.
    • These certifications are recommended for individuals with industry experience and familiarity with Google Cloud products and solutions.
    • Recommended experience: 3+ years of industry experience, including 1+ years on Google Cloud
  • User certification – intended for individuals with experience using G Suite and determines an individual’s ability to use core collaboration tools.
    • Recommended experience: Completion of Applied Digital Skills training course and G Suite Essentials quest, and 1+ months on G Suite.

Available certifications include:

  • Associate Cloud Engineer
    • Intended for individuals that can deploy applications, monitor operations, and manage enterprise solutions. 
    • Individuals display an ability to use the Google Cloud Console and the command-line interface to perform common platform-based tasks to maintain one or more deployed solutions that leverage Google-managed or self-managed services on Google Cloud.
    • Individuals display an ability to set up a cloud solution environment, plan and configure a cloud solution, deploy and implement a cloud solution, ensure successful operation of a cloud solution, and configure access and security.
  • Professional Cloud Architect 
    • For individuals that enable organizations to leverage Google Cloud technologies. 
    • These individuals can design, develop, and manage secure, scalable, and highly available solutions that drive business objectives.
    • Individuals display an ability to design and plan a cloud solution architecture, manage and provision the cloud solution infrastructure, design for security and compliance, analyze and optimize technical and business processes, manage implementations of cloud architecture, and ensure solution and operations reliability. 
  • Professional Cloud Developer
    • These individuals build scalable and highly available applications using Google recommended practices and tools that leverage fully managed services. 
    • Have experience with next generation databases, runtime environments, and developer tools. 
    • Have proficiency with at least one general purpose programming language and are skilled in using Stackdriver.
    • Individuals display an ability to design highly scalable, available, and reliable cloud-native applications, build and test applications, deploy applications, integrate Google Cloud Platform services, and manage application performance monitoring.
  • Professional Data Engineer
    • Intended for individuals that enable data-driven decision making by collecting, transforming, and publishing data. 
    • Individuals should be able to design, build, operate, manage, and monitor secure data processing systems.
    • Individuals display an ability to design data processing systems, build and operationalize data processing systems, operationalize machine learning models, and ensure solution quality.
  • Professional Cloud DevOps Engineer
    • Individuals are responsible for efficient development operations that can balance service reliability and delivery speed. 
    • Individuals are expected to be skilled in using Google Cloud Platform to build software delivery pipelines, deploy and monitor services, and manage and learn from incidents.
    • Individuals display an ability to apply site reliability engineering principles to a service, optimize service performance, implement service monitoring strategies, build and implement CI/CD pipelines for a service, and manage service incidents.
  • Professional Cloud Security Engineer
    • Intended for individuals that enable organizations to design and implement a secure infrastructure on Google Cloud Platform.
    • They are expected to have a thorough understanding of security best practices and industry security requirements.
    • These individuals design, develop, and manage a secure infrastructure leveraging Google security technologies and should be proficient in all aspects of Cloud Security.
    • Individuals display an ability to configure access within a cloud solution environment, configure network security, ensure data protection, manage operations within a cloud solution environment and ensure compliance.
  • Professional Cloud Network Engineer
    • Intended for individuals who implement and manage network architectures in Google Cloud Platform. 
    • These individuals ensure successful cloud implementations using the command line interface or the Google Cloud Platform Console.
    • Individuals display an ability to design, plan, and prototype a GCP Network, implement a GCP Virtual Private Cloud (VPC), configure network services and implement hybrid interconnectivity.
  • Professional Collaboration Engineer
    • Intended for individuals that transform business objectives into tangible configurations, policies, and security practices as they relate to users, content, and integrations. 
    • Individuals use tools, programming languages, and APIs to automate workflows. 
    • Individuals display an ability to plan and implement G Suite authorization and access, manage user, resource, and Team Drive lifecycles, manage mail, control and configure G Suite services, configure and manage endpoint access, monitor organizational operations and advance G Suite adoption and collaboration.
  • G Suite User – User Certification
    • This certification lets employers know that you possess the digital skills to work collaboratively and productively in a professional environment, complete common workplace activities using cloud-based tools to create and share documents, spreadsheets, presentations, and files. 

Where to Start

If you aren’t sure where to start, each cloud provider offers a certification that only requires a basic understanding of the platform and is a great way to get the ball rolling on your cloud certification journey. The three certifications for beginners are: AWS Certified Cloud Practitioner, Microsoft Certified Azure Fundamentals, and Google Associate Cloud Engineer. Good luck!

Further reading:

5 Favorite AWS Training Resources

5 Free Azure Training Resources

5 Free Google Cloud Training Resources

AWS Trusted Advisor Implies The Existence Of AWS Doubted Advisor

AWS Trusted Advisor is a service that helps you understand if you are using your AWS services well. It does this by looking at 72 different best practices across 5 categories: Cost Optimization, Performance, Security, Fault Tolerance, and Service Limits. All AWS users have access to 7 of those best practices, while Business Support and Enterprise Support customers have access to all items in all categories. Let’s dive into each category to see what is there and what is missing.

Cost Optimization

A category that is near and dear to our hearts here at ParkMyCloud, the Cost Optimization category includes items related to the following services:

  • EC2 – Reserved Instance purchase recommendations, underutilized VMs, Reserved Instance lease expirations
  • Load Balancers – idle LBs
  • EBS – Underutilized volumes
  • Elastic IP – unassociated addresses
  • RDS – Idle databases
  • Route 53 – Inefficient latency record sets
  • Redshift – Underutilized clusters

This list includes many of the services that are often the most expensive line items in an AWS account, but doesn’t take into account a large percentage of the AWS services available. Also, these recommendations only provide links to other AWS documentation that might help you solve the problem, as opposed to a service like ParkMyCloud that provides both the recommendations and ability to take the action of shutting down idle instances or resizing those instances for you.

Performance

This category caters more towards production instances, as it aims to make sure the performance of your applications is not hindered due to overutilization (as opposed to the Cost Optimization category above, which is focused more on underutilization). This includes:

  • EC2 – highly-utilized VMs, large number of security group rules (per instance or per security group)
  • EBS – SSD volume configuration, overutilized magnetic volumes, EC2 to EBS throughput
  • Route 53 – alias record sets
  • CloudFront – CDN optimization, header forwarding, cache hit ratio, alternate domain names

This category is one of the weakest in terms of services supported, so you may want to factor that in if you’re trying to make sure your production applications are performing well on alternative AWS services.

Security

The security checks of AWS Trusted Advisor will look at the following items:

  • Security Groups – Unrestricted ports, unrestricted access, RDS access risk
  • IAM – Use of Roles/Users, key rotation, root account MFA, password policy
  • S3 – Bucket permissions
  • CloudTrail – logging use
  • Route 53 – MX and SPF record sets
  • ELB – Listener security, Security groups
  • CloudFront – Custom SSL certificates, certificates on the origin server
  • Access keys – Exposed keys
  • Snapshots – EBS public snapshots, RDS public snapshots

Security is a tough category to get right, as almost every one of these needs to be reviewed for your business needs.  While this isn’t an exhaustive list of security considerations, it certainly helps your organization cover the basics and prevent some “I can’t believe we did that” moments.

Fault Tolerance

One of the main benefits of the cloud that often gets overlooked is the use of distributed resources to increase fault tolerance for your services.  These items in the fault tolerance category are focused on increasing the redundancy and availability of your applications. They include:

  • EBS – Snapshots
  • EC2 – Availability Zone balance
  • Load Balancer – optimization
  • VPN Tunnel – redundancy
  • Auto Scaling Groups – general ASG usage, health check
  • RDS – backups, multi-AZ configuration
  • S3 – bucket logging, bucket versioning
  • Route 53 – Name server delegations, record sets with high TTL or failover resources, deleted health checks
  • ELB – connection draining, cross-zone load balancing
  • Direct Connect – Connection / location / virtual interface redundancy
  • Aurora DB – instance accessibility
  • EC2 Windows – EC2Config agent age, PV driver versions, ENA driver versions, NVMe driver versions

Overall, this turns out to be a great list of checks that can help your production applications run with minimal downtime and minimal latency.  Additionally, some features, like snapshots and versioning, help with recovering from problems in a timely fashion.

Service Limits

One of the hidden limitations that AWS puts on each account is a limit on how many resources you can spin up at any given time.  This makes sense for AWS: it keeps new users from unintentionally (or intentionally!) causing a denial of service (DoS) for other users. These service limits can be increased if you ask nicely, but this is one of the few places where you can actually see if you’re coming close.  The services covered are:

  • DynamoDB
  • EBS
  • EC2
  • Kinesis
  • RDS
  • Route 53
  • SES
  • VPC
  • ASG
  • CloudFormation
  • ELB
  • IAM
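
As a rough illustration of what "coming close" to a limit means, the sketch below flags any resource whose usage has reached a chosen fraction of its limit. The resource names, usage figures, limits, and the 80% threshold are all assumptions made up for this example; they are not read from an AWS account or from Trusted Advisor.

```python
def near_limit(usage, limits, threshold=0.8):
    """Return resources whose usage is at or above threshold * limit."""
    flagged = {}
    for name, used in usage.items():
        ratio = used / limits[name]
        if ratio >= threshold:
            flagged[name] = round(ratio, 2)
    return flagged

# Hypothetical limits and usage, for illustration only.
limits = {"EC2 on-demand instances": 20, "VPCs": 5, "EBS snapshots": 10000}
usage = {"EC2 on-demand instances": 18, "VPCs": 2, "EBS snapshots": 4000}

flagged = near_limit(usage, limits)
print(flagged)
```

With these made-up numbers, only the EC2 entry is flagged, since 18 of 20 is 90% of the limit.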

Verdict: Helpful, But Not Game-Changing

While these checks and advice from AWS Trusted Advisor certainly help AWS users see ways to improve their usage of AWS, the lack of one-click-action makes these recommendations just that – recommendations.  Someone still has to go verify the recommendations and take the actions, which means that in practice, a lot of this gets left as-is. That said, while I wouldn’t suggest upgrading your support just for Trusted Advisor, it certainly can provide value if you’re already on Business Support or Enterprise Support.

Our Favorite Announcements from AWS re:Invent 2019

There have been about 1.3 zillion blogs posted this week recapping the announcements from AWS re:Invent 2019, and of course we have our own spin on the topic.  Looking primarily at cost optimization and cost visibility, there were a few cool new features announced. None of them were quite as awesome as the new Savings Plan announcement last month, but they are still worthy of note.

AWS Compute Optimizer

With AWS jumping feet-first into machine learning, it is no surprise that they turned it loose on instance rightsizing.  

The Compute Optimizer is a standalone service in AWS, falling under the Management & Governance heading (yes, it is buried in the gigantic AWS menu).  It offers rightsizing for the M, C, R, T, and X instance families and Auto Scaling groups of a fixed size (with the same values for desired/min/max capacity).  To use the service you must first “opt-in” in each of your AWS accounts. Navigate to AWS Compute Optimizer and click the “Get Started” button.  

Interestingly, they only promise a cost reduction “up to 25%”.  This is probably a realistic yet humble claim, given that the savings for a single downsize in the same instance family is typically 50%.  That said, the only way to get that 50% cost reduction is to install the AWS CloudWatch Agent on your instances and configure it to send memory metrics to CloudWatch. If you are not running the agent…then no memory metrics.  Like ParkMyCloud rightsizing, in the absence of memory metrics, the AWS Compute Optimizer can only make cross-family recommendations that change only the CPU or network configuration, leaving memory constant. Hence – a potential 25% cost reduction.
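
To make the 50% versus 25% comparison concrete, here is a minimal sketch of the savings arithmetic. The hourly prices are hypothetical placeholders, not real AWS rates: the first pair assumes one size step down in the same family halves the price, and the second pair assumes a cross-family move that keeps memory constant.

```python
def savings_pct(current_hourly, recommended_hourly):
    """Percentage saved by moving from the current to the recommended price."""
    return round((current_hourly - recommended_hourly) / current_hourly * 100, 1)

# Hypothetical hourly prices in USD, for illustration only.
same_family_downsize = savings_pct(0.20, 0.10)   # one size step down
cross_family_move = savings_pct(0.20, 0.15)      # memory kept constant

print(f"Same-family downsize: {same_family_downsize}%")
print(f"Cross-family move:    {cross_family_move}%")
```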

The best part?  It is free! All in all, this feature looks an awful lot like ParkMyCloud rightsizing recommendations, though I believe we add a bit more value by making our recommendations a bit more prominent in our Console – not mixed-in with 100+ other menu items…  The jury is still out on the quality of the recommendations; watch for another blog soon with a deeper dive.

Amazon EC2 Inf1 Instance Family

Every time you congratulate yourself on how much you have been able to save on your cloud costs, AWS comes up with a new way to help you spend that money you had “left over.”  In this case, AWS has created a custom chip, the “Inferentia”, purposely designed to optimize machine learning inference applications.  

Inference applications essentially take a machine learning model that has already been trained via some deep-learning framework like TensorFlow, and use that model to make predictions based on new data.  Examples of such applications include fraud detection and image or speech recognition.

The Inferentia is combined in the new Inf1 family with Intel® Xeon® CPUs to make a blazingly fast machine for this special-purpose processing.  This higher processing speed allows you to do more work in less time than you could do with the previous instance type used for inferencing applications, the EC2 G4 family.  The G4 is built around Graphics Processing Unit (GPU) chips, so it is pretty easy to see that a purpose-built machine learning chip can be made a lot faster. AWS claims that the Inf1 family will have a “40% lower cost per inference than Amazon EC2 G4 instances.”  This is a huge immediate savings, with only the work of having to recompile your trained model using AWS Neuron, which will optimize it for use with the Inferentia chip.

Next Generation Graviton2 Instances

The final cool cost-savings item is another new instance type that fits into the more commonly used M, C, and R instance families.  These new instance types are built around another custom AWS chip (watch out, Intel and AMD…): the Graviton2.  The Graviton chips, in general, are built around the ARM processor design, more commonly found in smartphones and the like.  Graviton was first released last year on the A1 instance family and honestly, we have not seen too many of them pass through the ParkMyCloud system.  Since the Graviton2 is built to support M, C, and R, I think we are much more likely to see widespread use.

Looking at how they perform relative to the current M5 family, AWS described the following performance improvements:

  • HTTPS load balancing with Nginx: +24%
  • Memcached: +43% performance, at lower latency
  • X.264 video encoding: +26%
  • EDA simulation with Cadence Xcelium: +54%

Overall, the new instances offer “40% better price performance over comparable current generation instances.”

The new instance types will be the M6g and M6gd (“g”=Graviton, “d”=NVMe local storage), the C6g and C6gd, and the R6g and R6gd.  The new family is still in Preview mode, so pricing is not yet posted, but AWS is claiming a net “20% lower cost and up to 40% higher performance over Amazon EC2 M5 instances, based on internal testing of workloads.”  We will definitely be trying these new instance types when they release in 2020!

Summary

All in all, there were no real HUGE announcements that would impact your costs, but baby steps are OK too!

Google Cloud Platform vs AWS: Is the answer obvious? Maybe not.

Google Cloud Platform vs AWS: what’s the deal? A while back, we also asked the same question about Azure vs AWS. After the release of the latest earnings reports a few weeks ago from AWS, Azure, and GCP, it’s clear that Microsoft is continuing to see growth, Amazon is maintaining a steady lead, and Google is stepping in. Now that Google Cloud Platform has solidly secured a spot among the “big three” cloud providers, we think it’s time to take a closer look and see how the underdog matches up to the rest of the competition. 

Is Google Cloud catching up to AWS?

As they’ve been known to do, Amazon, Google, and Microsoft all released their recent quarterly earnings at around the same time, on the same day. At first glance, the headlines tell it all.

The obvious conclusion is that AWS continues to dominate in the cloud war. With all major cloud providers reporting earnings around the same time, we have an ideal opportunity to examine the numbers and determine if there’s more to the story. Here’s what the quarterly earning reports tell us:

  • AWS had its slowest growth since it began breaking out its cloud results – up just 37% from last year.
  • Microsoft Azure reported a revenue growth rate of 59%.
    • Microsoft doesn’t break out specific revenue amounts for Azure, but Microsoft did report that its “Intelligent Cloud” business revenue increased 27% to $10.8 billion, with revenue from server products and cloud services increasing 30%
  • Google’s revenue has cloud sales lumped together with hardware and revenue from the Google Play app store, summing up to a total of $6.43 billion for the last quarter. 
    • To compare, last year during Q3 their revenue was at $4.64 billion.
  • During their second-quarter conference call in July, Google said their cloud is on an $8 billion revenue run rate – meaning cloud sales have doubled in less than 18 months.
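
For readers who want to check the figures above, here is a small sketch of the arithmetic. It assumes that a "run rate" simply annualizes a single quarter (quarterly revenue times four). The growth figure it computes is for the combined segment that lumps cloud together with hardware and Play, not for Google Cloud alone.

```python
def yoy_growth_pct(current, previous):
    """Year-over-year growth as a percentage, rounded to one decimal place."""
    return round((current - previous) / previous * 100, 1)

def quarterly_from_run_rate(annual_run_rate):
    """Assumes the run rate is one quarter's revenue multiplied by four."""
    return annual_run_rate / 4

# Figures quoted in the list above, in billions of USD.
segment_growth = yoy_growth_pct(6.43, 4.64)
implied_quarter = quarterly_from_run_rate(8.0)

print(f"Combined segment growth: {segment_growth}%")
print(f"Implied quarterly cloud revenue: ${implied_quarter}B")
```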

 

While Google is the smallest of the “big three” providers, it has shown the most growth – from Q1 2018 to Q1 2019, Google Cloud grew 83%. While they still have a ways to go before surpassing AWS and Microsoft, they are moving quickly in the right direction: Canalys reported they were the fastest-growing cloud-infrastructure vendor in the last year. 

It’s also important to note that Google is just getting started. Also making headlines was an increase in new hires, adding 6,450 in the last quarter, and most of them going to positions in their cloud sector. Google’s headcount now stands at over 114,000 employees in total.

The Obvious: Google is not surpassing AWS

When it comes to Google Cloud Platform vs AWS, we have a clear winner. Amazon continues to have the advantage as the biggest and most successful cloud provider in the market. While AWS is growing at a smaller rate now than both Google Cloud and Azure, Amazon still holds the largest market share of all three. AWS is the clear competitor to beat as they are the first and most successful cloud provider to date, with the widest range of services, and a strong familiarity among developers.

The Less Obvious: Google is actually gaining more ground

While it’s easy to write off Google Cloud Platform, AWS is not untouchable. AWS has already solidified itself in the cloud market, but with the new features and partnerships, Google Cloud is proving to be a force to be reckoned with. 

Where is Google actually gaining ground?

We know that AWS is at the forefront of cloud providers today, but that doesn’t mean Google Cloud is very far behind. AWS is now just one of the three major cloud providers – with two more (IBM and Alibaba) gaining more popularity as well. Google Cloud Platform has more in store for its cloud business in 2020. 

A big step for Google came earlier this year at Google Cloud’s conference, Google Cloud Next, where the CEO of Google Cloud announced a retail platform to compete directly with Amazon, called Google Cloud for Retail. What’s different about their product? For starters, they are partnering with companies such as Kohl’s, Target, Bed Bath & Beyond, and Shopify – retailers known for competing directly with Amazon. In addition, this will be the first time that Google Cloud has had an AI product designed to address a business process for a specific vertical. Google doesn’t appear to be stopping at retail – Thomas Kurian said they are planning to build capabilities to assist companies in specialized industries such as healthcare, manufacturing, media, and more. 

Google’s stock continues to rise. With nearly 6,450 new hires added to the headcount, a vast majority of them in cloud-related jobs, it’s clear that Google is serious about expanding its role in the cloud market. In April of this year, Google reported that 103,459 people now work there. Google CFO Ruth Porat said, “Cloud has continued to be the primary driver of headcount.” 

Google Cloud’s new CEO, Thomas Kurian, understands that Google is lagging behind the other two cloud giants, and plans to close that gap in the next two years by growing sales headcount. 

Deals have been made with major retailer Kohl’s department store, and payments processor giant, PayPal. Google CEO Sundar Pichai lists the cloud platform as one of the top three priorities for the company, confirming that they will continue expanding their cloud sales headcount. 

In the past few months, Pichai added his thoughts on why he believes the Google Cloud Platform is on a set path for strong growth. He credits their success to customer confidence in Google’s impressive technology and its leadership in machine learning, naming the company’s open-source software TensorFlow as a prime example. Another key component of growth is strategic partnerships, such as the deal with Cisco that is driving co-innovation in the cloud, with both products benefiting from each other’s features, as well as teaming up with VMware and Pivotal. 

Driving Google’s growth is also the fact that the cloud market itself is growing so rapidly. The move to the cloud has prompted large enterprises to use multiple cloud providers in building their applications. Companies such as Home Depot Inc. and Target Corp. rely on different cloud vendors to manage their multi-cloud environments. 

Home Depot, in particular, uses both Azure and Google Cloud Platform, and a spokesman for the home improvement retailer explains why that was intentional: “Our philosophy here is to be cloud-agnostic, as much as we can.” This philosophy goes to show that as long as there is more than one major cloud provider in the mix, enterprises will continue trying, comparing, and adopting more than one cloud at a time – making way for Google Cloud to gain more ground.

Multi-cloud environments have become increasingly popular because companies enjoy the advantage of the cloud’s global reach, scalability, and flexibility. Google Cloud has been the most avid supporter of multi-cloud out of the three major providers. Earlier this year at Google Cloud Next, they announced the launch of Anthos, a new managed service offering for hybrid and multi-cloud environments to give enterprises operational consistency. They do this by running quickly on any existing hardware, leveraging open APIs, and giving developers the freedom to modernize. There’s also Google Cloud Composer, which is a fully managed workflow orchestration service built on Apache Airflow that allows users to monitor, schedule, and manage workflows across hybrid and multi-cloud environments.

Google Cloud Platform vs. AWS – Why Does It Matter?

Google Cloud Platform vs AWS is only one of the battles to consider in the ongoing cloud war. The truth is, market performance is only one factor in choosing the best cloud provider. As we always say, the specific needs of your business are what will ultimately drive your decision. 

What we do know: the public cloud market is not just growing – it’s booming. Referring back to our Azure vs AWS comparison – the basic questions still remain the same when it comes to choosing the best cloud provider: 

  • Are the public cloud offerings to new customers easily comprehensible?
  • What is the pricing structure and how much do the products cost?
  • Are there adequate customer support and growth options?
  • Are there useful management tools?
  • Will our DevOps processes translate to these offerings?
  • Can the PaaS offerings speed time-to-value and simplify things sufficiently, to drive stickiness?

Right now AWS is certainly in the lead among major cloud providers, but for how long? We will continue to track and compare cloud providers as earnings are reported, offers are increased, and price options grow and change. To be continued in 2020…

Cloud Control: Why Is It So Hard?

According to most organizations, the biggest drivers to cloud are elasticity and agility. In other words, it allows you to instantly provision and de-provision resources based on the needs of the business. You no longer have to build the church for Sunday. Once in the cloud though, 80% of companies report receiving bills 2-3 times what they expected. The truth is, that while the promise of cloud is that you only pay for what you use, the reality is that you pay for what you allocate. The gap between consumption and allocation is what causes the large and unexpected bills.

Cost isn’t the only challenge. While most organizations report cost being their biggest problem in managing a public cloud environment, you cannot truly separate performance from cost; the two are tightly coupled. If an organization were optimizing for cost alone, moving all applications to the smallest instance type would be the way to go, but no one is willing to take the performance hit. In the cloud, more than ever, cost and performance are tied together.

To guarantee their SLAs, applications require access to all the resources they need. Developers, in an effort to make sure their applications behave as expected, allocate resources based on peak demand to ensure they have access to those resources if they need them. Without constantly monitoring and adjusting the resources allocated to each application, over-allocation is the only way to assure application performance. Overprovisioning of virtualized workloads is so prevalent, that it’s estimated that more than 50% of data centers are over-allocated.

On-premises, over-allocation of resources, while still costly, is significantly less impactful to the bottom line. On-premises, the overprovisioning is masked by over-allocated hardware and hypervisors that allow for sharing resources. In the cloud, where resources are charged by the second or minute, this overprovisioning is extremely costly, resulting in bills much larger than expected.

The only way to solve this problem is to find a way to calibrate the allocation of resources continuously based on demand, or in other words, match supply and demand. This would result in TRULY only paying for the resources you need when you need them, the holy grail of cost efficiency. The ideal state is to have the right amount of resources at the right time, no more, and the only way to achieve that is through automation.
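
The gap between allocating for peak and matching supply to demand can be sketched with a toy model. Everything here is an assumption for illustration: the three-size catalog, the prices (in cents per hour, to keep the arithmetic exact), and the hourly vCPU demand series. It also ignores the time and disruption involved in resizing.

```python
# Hypothetical catalog, sorted smallest first: (name, vCPUs, cents per hour).
CATALOG = [("small", 2, 10), ("medium", 4, 20), ("large", 8, 40)]

def smallest_fit(vcpus_needed):
    """Return (name, price) of the smallest size that covers the demand."""
    for name, vcpus, price in CATALOG:
        if vcpus >= vcpus_needed:
            return name, price
    raise ValueError("demand exceeds the largest size in the catalog")

def static_cost(demand):
    """Cost of allocating for peak demand and never resizing."""
    _, price = smallest_fit(max(demand))
    return price * len(demand)

def matched_cost(demand):
    """Cost of picking the smallest sufficient size for every hour."""
    return sum(smallest_fit(d)[1] for d in demand)

# Hypothetical vCPU demand for six consecutive hours.
demand = [1, 1, 2, 7, 3, 1]
print(static_cost(demand), matched_cost(demand))
```

In this made-up series, sizing for the single peak hour costs 240 cents, while matching each hour costs 100 cents. The difference is the allocation gap described above.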

So why doesn’t everyone do that?

This is a complicated problem to solve. To achieve that we must look at all resources required by each application and match them to the best instance type, storage tier and network configuration in real time.

Let’s take a simple application running a front end and a back end on AWS EC2 in the Ohio region using EBS storage. There are over 70 instance types available. Each instance type defines the allocated memory, CPU, the benchmarked performance of the CPU to be expected (not all CPU cores perform equally), the available bandwidth for network and IO, the amount of local disk available and more. On top of that, there are 5 storage tiers on EBS that would further define the IOPS and IO throughput capabilities of the applications. This alone results in over 350 options for each component of the application.

Taking a closer look at network complicates matters even further.

Placing the two components in different AZs incurs data transfer charges for traffic back and forth between the AZs. In addition, latency across AZs, even in the same region, is higher than within a single AZ, so depending on the latency sensitivity of the application, the choice of AZ affects performance, not just cost. Placing both in the same AZ is not a great option either – it increases the risk to the organization in case of an outage in that zone. Cloud providers would only guarantee five 9s (99.999%) uptime when instances are spread across more than a single zone. In the Ohio region, there are 5 availability zones, which brings us up to 1,750 options to evaluate for each component of the application. Each of these options needs to be evaluated against memory, CPU, IO, IOPS, network throughput, and so on.
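
The option counts above are a straightforward product of the choices, which the following sketch reproduces. It uses the round numbers from the text (70 instance types, 5 storage tiers, 5 zones) with placeholder names. The final line is an extension that is not in the original: it assumes the front end and back end are chosen independently.

```python
from itertools import product

# Placeholder names; only the counts come from the text.
instance_types = [f"type-{i}" for i in range(70)]
storage_tiers = [f"tier-{i}" for i in range(5)]
zones = [f"az-{i}" for i in range(5)]

# Instance type x storage tier.
per_component = len(list(product(instance_types, storage_tiers)))

# Adding the availability zone choice.
with_zones = len(list(product(instance_types, storage_tiers, zones)))

# Assumption: two components (front end, back end) chosen independently.
two_components = with_zones ** 2

print(per_component, with_zones, two_components)
```

Since the text says "over 70" instance types, these counts are lower bounds.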

The problem is just as complicated on Azure, with over X instance types and different levels of premium and standard storage tiers and the recent introduction of availability zones.

Where you get the data to back up your decisions is important as well. When looking at the monitored data at the IaaS layer alone, neither performance nor efficiency can be guaranteed. Let’s take a simple JVM as an example. When looking at the memory monitored at the IaaS layer, it will always report using 100% of the heap, but is it utilizing it? Is the application garbage collecting every minute or once a day? The heap itself should be adjusted based on that to make sure the application gets the resources it needs, when it needs them. CPU isn’t better. If the IaaS layer is reporting an application consuming 95% of a single CPU core, most would argue that it needs to be moved to a 2-core instance type. Looking into the application layer allows you to understand how the application is using that CPU. If a single thread is responsible for the bulk of the resource consumption, adding another core wouldn’t help; moving to an instance family with stronger CPU performance would be a better solution.
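
The CPU reasoning above can be sketched as a simple decision rule. This is only an illustration: the per-thread CPU figures are invented, and the two thresholds (how busy counts as "busy", and how large a share makes one thread "dominant") are assumptions you would tune for a real workload.

```python
def cpu_recommendation(thread_cpu_pct, cores,
                       busy_threshold=90.0, dominant_share=0.8):
    """Suggest a direction given per-thread CPU use (percent of one core).

    thread_cpu_pct: CPU consumed by each application thread.
    cores: number of cores on the current instance.
    """
    total = sum(thread_cpu_pct)
    if total / cores < busy_threshold:
        return "no change"
    # One thread doing most of the work cannot spread across extra cores.
    if max(thread_cpu_pct) / total >= dominant_share:
        return "faster cores"
    return "more cores"

# Hypothetical samples: both total 95% of a single core at the IaaS layer.
print(cpu_recommendation([90.0, 3.0, 2.0], cores=1))
print(cpu_recommendation([35.0, 30.0, 30.0], cores=1))
```

Both samples look identical from the IaaS layer, yet the first points to an instance family with stronger per-core performance and the second to an instance with more cores.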

To sum it up, assuring application performance while maintaining efficiency is more difficult than ever. To truly pay only for what you use, you must match supply and demand across multiple resources, from the application layer down to the IaaS layer, in real time.