AWS recently announced AWS Server Fleet Management, a new offering that combines AWS Systems Manager and Amazon Inspector. The goal of this service is to provide a way to secure, automate, and configure a large array of servers through multiple AWS services working together. Some enterprises already have a config management tool in place, but might be looking for a more AWS-centric way to manage their numerous EC2 servers. Let’s look at how Server Fleet Management works, how it stacks up against other config management tools, and some of the pros and cons of using this solution.
How It Works
AWS Server Fleet Management utilizes quite a few AWS services under the hood. The good news is that you don’t have to deploy these services manually, as there’s a CloudFormation template available that will build the entire stack for you. The services include:
- Amazon CloudWatch – emits scheduled events that trigger the other services
- Amazon Inspector – runs the configuration and security assessments
- Amazon SNS – notification topics used to pass instance IDs and send email alerts
- AWS Lambda – handles various tasks, including querying Inspector and updating Systems Manager
- AWS Systems Manager – tracks inventory and configuration for EC2 instances and manages OS patches
- Amazon S3 – secure storage of artifacts
Before deploying the CloudFormation stack, you’ll need to enter a few configuration details. The main one is the “Managed Instances Tag Value”, which is the tag you’ll place on any EC2 server you want managed via Server Fleet Management. This can work in conjunction with the “Patch Group” tag in AWS Systems Manager if you want the instance to be automatically patched. Once you specify the tag, an email address, and whether you want a sample fleet to be deployed, you’re ready to create the stack!
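Opting instances in is just a matter of tagging them. As a rough sketch in Python with boto3 — the tag key and value below are examples, not fixed names; use whatever you entered for “Managed Instances Tag Value” when creating the stack:

```python
# Sketch: tag EC2 instances so Server Fleet Management picks them up.
# The tag key/value are example assumptions, not names fixed by the solution.

def build_tag_request(instance_ids, tag_key="ManagedInstance", tag_value="fleet-managed"):
    """Build the parameter dict for an EC2 CreateTags call."""
    return {
        "Resources": list(instance_ids),
        "Tags": [{"Key": tag_key, "Value": tag_value}],
    }

# To actually apply the tags (requires boto3 and AWS credentials):
#   import boto3
#   boto3.client("ec2").create_tags(**build_tag_request(["i-0abc123example"]))
```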
Comparison to Other Tools
In the config management world, there are a few major players, including Chef, Puppet, Ansible, and SaltStack. From a purely configuration perspective, Server Fleet Management doesn’t offer anything new. However, if you’re fully bought-in to running everything within AWS, the flexibility of using Lambda functions in addition to other AWS services can be a huge advantage. On the flip side of that, enterprises that are multi-cloud may want to keep using a cloud-agnostic tool.
Pros and Cons
Along with the possible benefit of being purely within the AWS ecosystem, another major pro of AWS Server Fleet Management is the combination of security enforcement and patch management. Solving both of those problems often requires multiple tools, so this can trim down your list of applications. This solution also has lots of opportunities to tie into other existing AWS solutions or to be customized to fit your use cases.
That expandability can also be considered a con, as the built-in uses are fairly specific and larger fleets will require more customization. Some things that aren’t included: cost management (we’ve got you covered), security audits for non-EC2 services, application grouping, and cross-account access. There also aren’t any built-in hooks to existing config management tools that are likely already in use.
Automated Security and Patching
All in all, AWS Server Fleet Management is worth looking into if you’ve got a large EC2 deployment. Even if you don’t use the pre-made stack, it might give you some ideas on how to use the underlying AWS services to help secure and manage your fleet. With the included sample fleet, it’s easy to get it set up and try it out!
If you use DevOps processes, automation and orchestration are king — which is why the Google Cloud cron service can be a great tool for managing your Google Compute Engine instances via Google App Engine code. This kind of automation can often involve multiple Google Cloud services, which is great for learning about them or running scheduled tasks that might need to touch multiple instances. Here are a few ideas on how to use the Google Cloud cron service:
1. Automated Snapshots
Since Google Compute Engine lets you take incremental snapshots of the attached disks, you can use the Google App Engine cron to take these snapshots on a daily or weekly basis. This lets you go back in time on any of your compute instances if you mess something up or have some systems fail. If you use Google’s Pub/Sub service, you can have the snapshots take place on all instances that are subscribed to that topic.
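As a sketch, the cron entry in App Engine’s cron.yaml could look like the following — the /tasks/snapshot handler is a hypothetical endpoint you’d write to call the Compute Engine snapshot API:

```yaml
cron:
- description: nightly snapshot of attached disks
  url: /tasks/snapshot        # hypothetical handler in your App Engine app
  schedule: every day 02:00
  timezone: America/New_York
```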
As a bonus, you can use a similar idea to manage old snapshots and delete the ones you don’t need anymore. For example, schedule a Google Cloud cron to clean up snapshots three months after a server is decommissioned, or to migrate those snapshots to long-term storage.
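The clean-up decision itself is simple date math. A minimal sketch, assuming each snapshot record carries the date its server was decommissioned (the field names here are made up for illustration):

```python
from datetime import date, timedelta

def snapshots_to_delete(snapshots, today, retention_days=90):
    """Return IDs of snapshots whose server was decommissioned long enough ago.

    `snapshots` is a list of dicts like:
      {"id": "snap-1", "decommissioned_on": date(2018, 1, 5)}
    A decommissioned_on of None means the server is still active, so keep it.
    """
    cutoff = today - timedelta(days=retention_days)
    return [s["id"] for s in snapshots
            if s["decommissioned_on"] is not None
            and s["decommissioned_on"] <= cutoff]
```

A cron handler could feed this list to the snapshot-delete API, or to a job that copies the snapshots to long-term storage first.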
2. Autoscaling a Kubernetes Cluster
With Google on the forefront of Kubernetes development, many GCP users make heavy use of GKE, the managed Kubernetes service. In order to save some money and make sure your containers aren’t running when they aren’t needed, you could set up a cron job to run at 5:00 p.m. each weekday to scale your Kubernetes cluster down to a size of 0. For maximum cost savings, you can just leave it off until you need it, then manually spin up the cluster, or you could use a second cron to spin your clusters up at 8:00 a.m. so they’re ready for the day.
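The decision logic for such a schedule is small enough to sketch; the 8:00-to-5:00 weekday window and the node count of 3 are just example values:

```python
from datetime import datetime

def desired_node_count(now, workday_nodes=3):
    """Return how many nodes the cluster should have at time `now`."""
    if now.weekday() >= 5:      # Saturday/Sunday: stay scaled to zero
        return 0
    if 8 <= now.hour < 17:      # weekday working hours
        return workday_nodes
    return 0
```

A cron handler could compare this value to the cluster’s current size and call the GKE API to resize only when they differ.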
(By the way — we’re working on functionality to let you do this automatically in ParkMyCloud, just like you can for VMs. Interested? Let us know & we’ll notify you on release.)
3. Send Weekly Reports
Is your boss hounding you for updates? Does your team need to know the status of the service? Is your finance group wondering how your GCP costs are trending for this week? Automate these reports using the Google Cloud cron service! You can gather the info needed and post these reports to a Pub/Sub topic, send them out directly, or display them on your internal dashboard or charting tool for mass consumption. These reports can cover various metrics or services, including Google Compute Engine, Cloud SQL, or the billing information for your various projects.
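As an illustrative sketch, a report handler might boil per-project cost numbers down to a plain-text summary before sending it anywhere — the data shape here is an assumption, not any particular billing API’s output:

```python
# Hypothetical sketch: format {project: weekly_cost} into a short report
# that a cron handler could email or publish to a Pub/Sub topic.

def weekly_cost_report(costs_by_project):
    """Return a plain-text weekly cost summary, one line per project."""
    lines = ["Weekly GCP cost report"]
    total = 0.0
    for project in sorted(costs_by_project):
        cost = costs_by_project[project]
        total += cost
        lines.append(f"  {project}: ${cost:,.2f}")
    lines.append(f"  Total: ${total:,.2f}")
    return "\n".join(lines)
```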
Other Google Cloud Cron Ideas? Think Outside The Box!
Got any other ideas for using the Google Cloud cron service to automate your Google Cloud environment? Let us know how you’re using it and why it helps you manage your cloud infrastructure.
When companies move from on-prem workloads to the cloud, common concerns arise around costs, security, and cloud user management. Each cloud provider handles user permissions in a slightly different way, with varying terminology and roles available to assign to each of your end users. Let’s explore a few of the differences in users and roles within Amazon Web Services (AWS), Microsoft Azure, Google Cloud Platform, and Alibaba Cloud.
AWS IAM Users and Roles
AWS captures all user and role management within IAM, which stands for “Identity and Access Management”. Through IAM, you can manage your users and roles, along with all the permissions and visibility those users and service accounts have within your AWS account. There are a couple of different IAM entities:
- Users – used when an actual human will be logging in
- Roles – used when service accounts or scripts will be interacting with resources
Both users and roles can have IAM policies attached, which give specific permissions to operate or view any of the other AWS services.
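For reference, IAM policies use a standard JSON document format. Here’s a minimal example that grants read-only visibility into EC2 — the statement itself is just an illustration:

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ec2:Describe*"],
      "Resource": "*"
    }
  ]
}
```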
Azure RBAC
Azure utilizes the RBAC (“Role Based Access Control”) system within Resource Manager for user permissions. Granting access to Azure resources starts with creating a Security Principal, which can be one of three types:
- User – a person who exists in Azure Active Directory
- Group – a collection of users in Azure Active Directory
- Service Principal – an application or service that needs to access a resource
Each Security Principal can be assigned a Role Definition, which is a collection of permissions that they can utilize to view or access resources in Azure. There are a few built-in Role Definitions, such as Owner, Contributor, Reader, and User Access Administrator, but you can also create custom Role Definitions depending on your cloud user management needs. Roles may be assigned on a subscription-by-subscription basis.
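As a sketch, a custom Role Definition in the JSON shape used by the Azure CLI might look like this — the role name is made up and the subscription ID is a placeholder:

```json
{
  "Name": "VM Operator",
  "Description": "Can start and stop virtual machines.",
  "Actions": [
    "Microsoft.Compute/virtualMachines/read",
    "Microsoft.Compute/virtualMachines/start/action",
    "Microsoft.Compute/virtualMachines/deallocate/action"
  ],
  "NotActions": [],
  "AssignableScopes": ["/subscriptions/00000000-0000-0000-0000-000000000000"]
}
```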
Google Cloud Platform IAM
Google Cloud Platform also uses the term IAM for their user permissions. The general workflow is to grant each “identity” a role that applies to each resource within a project. An identity can be any of the following:
- Google account – any user with an email that is associated with a Google account
- Service account – an application that logs in through the Google Cloud API
- Google group – a collection of Google accounts and service accounts
- G Suite domain – all Google accounts under a domain in G Suite
- Cloud Identity domain – all Google accounts in a non-G-Suite organization
Roles in Google Cloud IAM are a collection of permissions. There are some primitive roles (Owner, Editor, and Viewer), some predefined roles, and the ability to create custom roles with specific permissions through an IAM policy.
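An IAM policy in GCP is a set of bindings from roles to members. For example — the user and service account names below are placeholders:

```json
{
  "bindings": [
    {
      "role": "roles/viewer",
      "members": [
        "user:alice@example.com",
        "serviceAccount:reporter@my-project.iam.gserviceaccount.com"
      ]
    }
  ]
}
```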
Alibaba Cloud RAM
Alibaba Cloud has a service called RAM (Resource Access Management) for managing user identities. These identities work in slightly different ways than those of the other cloud service providers, though they have similar names:
- RAM-User – a single real identity, usually a person but can also be a service account
- RAM-Role – a virtual identity that can be assigned to multiple real identities
RAM users and roles can have one or more authorization policies attached to them, each of which can contain multiple permissions. These permissions then work similarly to other CSPs, where a User or Role can have access to view or act upon a given resource.
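RAM authorization policies use a JSON format that will look familiar if you’ve used AWS IAM, though the policy version string differs. A sketch granting read-only ECS access (the statement is illustrative):

```json
{
  "Version": "1",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": ["ecs:Describe*"],
      "Resource": ["*"]
    }
  ]
}
```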
Cloud User Management – Principles to Follow, No Matter the Provider
As you can see, each cloud service provider has a way to enable users to access the resources they need in a limited scope, though each method is slightly different. Your organization will need to come up with the policies and roles you want your users to have, which is a balancing act between allowing users to do their jobs and not letting them break the bank (or your infrastructure). The good news is that you will certainly have the tools available to provide granular access control for your cloud user management, regardless of the cloud (or clouds) you’re using.
In the world of infrastructure as code, the biggest divide seems to come in the war between Hashicorp’s Terraform and AWS CloudFormation. Both tools can help you deploy new cloud infrastructure in a repeatable way, but they have some significant differences that can mean the difference between a smooth rollout and a never-ending battle with your tooling. Let’s look at some of the similarities and some of the differences between the two.
Similarities
While the tools each have some unique features, they also share some common aspects. In general, both CloudFormation and Terraform help you provision new AWS resources from a text file, which means you can iterate on and manage the entire infrastructure stack the same as you would any other piece of code. Both tools are also declarative: you define what you want the end state to be, rather than saying how to get there (as with tools like Chef or Puppet). This isn’t necessarily a good or bad thing, but it’s good to know if you’re used to other config management tools.
Unique Characteristics of CloudFormation
One of the biggest benefits of using CloudFormation is that it is an AWS product, which means it has tighter tie-ins to other AWS services. This can be a huge benefit if you’re all-in on AWS products and services, as this can help you maximize your cost-effectiveness and efficiency within the AWS ecosystem. CloudFormation also makes use of either YAML or JSON as the format for your code, which might be familiar to those with dev experience. Along the same lines, each change to your infrastructure is a changeset from the previous one, so devs will feel right at home.
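For a feel of the format, here’s a minimal CloudFormation template in YAML that launches a single tagged EC2 instance — the AMI ID is a placeholder:

```yaml
AWSTemplateFormatVersion: '2010-09-09'
Resources:
  AppServer:
    Type: AWS::EC2::Instance
    Properties:
      InstanceType: t2.micro
      ImageId: ami-0123456789abcdef0   # placeholder AMI ID
      Tags:
        - Key: Environment
          Value: dev
```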
There are some additional tools available around CloudFormation, such as:
- Stacker – for handling multiple CloudFormation stacks simultaneously
- Troposphere – if you prefer Python for creating your configuration files
- StackMaster – if you prefer Ruby
- Sceptre – for organizing CloudFormation stacks into environments
Unique Characteristics of Terraform
Just as being an AWS product is a benefit of CloudFormation if you’re in AWS, the fact that Terraform isn’t affiliated with any particular cloud makes it much more suited for multi-cloud and hybrid-cloud environments, and of course, for non-AWS clouds. There are Terraform modules for almost any major cloud or hypervisor in the Terraform Registry, and you can even write your own modules if necessary.
Terraform treats all deployed infrastructure as a state, with any subsequent changes to any particular piece being an update to the state (unlike the changesets mentioned above for CloudFormation). This means you can keep the state and share it, so others know what your stack should look like, and also means you can see what would change if you modify part of your configuration before you actually decide to do it. The Terraform configuration files are written in HCL (Hashicorp Configuration Language), which some consider easier to read than JSON or YAML.
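For a feel of the syntax, a single tagged EC2 instance in HCL might look like this (the AMI ID is a placeholder):

```hcl
resource "aws_instance" "app_server" {
  ami           = "ami-0123456789abcdef0"   # placeholder AMI ID
  instance_type = "t2.micro"

  tags = {
    Environment = "dev"
  }
}
```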
More on Terraform: How to Use Terraform Provisioning and ParkMyCloud to Manage AWS
Terraform vs. CloudFormation: Which to choose?
The good news is that if you’re trying to decide between Terraform vs. CloudFormation, you can’t really go wrong with either. Both tools have large communities with lots of support and examples, and both can get the job done in terms of creating stacks of resources in your environments. Both are also free: CloudFormation has no cost (aside from the infrastructure that gets created), and Terraform is open-source, with a paid Enterprise version offering additional collaboration and governance options. Each has its pros and cons, but using either one will help you scale up your infrastructure and manage it all as code.
Since the beginning of public cloud, users have been attempting to improve cloud automation. This can be driven by laziness, scale, organizational mandate, or some combination of those. Since the rise of DevOps practices and principles, this “automate everything” approach has become even more popular, as it’s one of the main pillars of DevOps. One of the ways you can help sort, filter, and automate your cloud environment is to utilize tags on your cloud resources.
In the cloud infrastructure world, tags are labels or identifiers that are attached to your instances. This is a way for you to provide custom metadata to accompany the existing metadata, such as instance family and size, region, VPC, IP information, and more. Tags are created as key/value pairs, although the value is optional if you just want to use the key. For instance, your key could be “Department” with a value of “Finance”, or you could have a key of just “Finance”.
There are four general tag categories, as laid out in the best practices from AWS:
- Technical – This often includes things like the application that is running on the resource, what cluster it belongs to, or which environment it’s running in (such as “dev” or “staging”).
- Automation – These tags are read by automated software, and can include things like dates for when to decommission the resource, a flag for opting in or out of a service, or what version of a script or package to install.
- Business and billing – Companies with lots of resources need to track which department or user owns a resource for billing purposes, which customer an instance is serving, or some sort of tracking ID or internal asset management tag.
- Security – Tags can help with compliance and information security, as well as with access controls for users and roles who may be listing and accessing resources.
In general, more tags are better, even if you aren’t actively using those tags just yet. Planning ahead for ways you might search through or group instances and resources can help save headaches down the line. You should also ensure that you standardize your tags by being consistent with the capitalization/spelling and limiting the scope of both the keys and the values for those keys. Using management and provisioning tools like Terraform or Ansible can automate and maintain your tagging standards.
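Enforcing that standardization can be a small piece of code. Here’s a minimal sketch, assuming an example allow-list of three keys — your real list would come from your own tagging policy:

```python
# Example allow-list; in practice this comes from your tagging standards.
ALLOWED_KEYS = {"Department", "Environment", "Owner"}

def normalize_tags(tags):
    """Map keys to their canonical capitalization and drop unknown keys."""
    canonical = {k.lower(): k for k in ALLOWED_KEYS}
    normalized = {}
    for key, value in tags.items():
        match = canonical.get(key.strip().lower())
        if match is not None:
            normalized[match] = value.strip()
    return normalized
```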
Once you’ve got your tagging system implemented and your resources labelled properly, you can really dive into your cloud automation strategy. Many different automation tools can read these tags and utilize them, but here are a few ideas to help make your life better:
- Configuration Management – Tools like Chef, Puppet, Ansible, and Salt are often used for installing and configuring systems once they are provisioned. This can determine which settings to change or configuration bundles to run on the instances.
- Cost Control – This is the automation area we focus on at ParkMyCloud: our platform’s automated policies can read the tags on servers, scale groups, and databases to determine which schedule to apply and which team to assign the resource to, among other actions.
- CI/CD – If your build tool (like Jenkins or Bamboo) is set to provision or utilize cloud resources for the build or deployment, you can use tags for the build number or code repository to help with the continuous integration or continuous delivery.
- Cloud Account Clean-up – Scripts and tools that help keep your account tidy can use tags that set an end date for the resource as a way to ensure that only necessary systems are around long-term. You can also take steps to automatically shut down or terminate instances that aren’t properly tagged, so you know your resources won’t be orphaned.
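The clean-up idea above can be sketched as a simple check against an end-date tag; the "EndDate" key and YYYY-MM-DD format are assumptions for illustration, not a convention from any particular tool:

```python
from datetime import date

def is_cleanup_candidate(tags, today):
    """True if the resource is past its end date or has no EndDate tag at all.

    Per the clean-up idea above, untagged resources are flagged too, so
    nothing sticks around orphaned.
    """
    end = tags.get("EndDate")
    if end is None:
        return True
    year, month, day = (int(part) for part in end.split("-"))
    return date(year, month, day) < today
```

A nightly script could run this over the instance list and shut down or report whatever comes back True.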
Conclusion: Tagging Will Improve Your Cloud Automation
As your cloud use grows, implementing cloud automation will be a crucial piece of your infrastructure management. Utilizing tags not only helps with human sorting and searching, but also with automated tasks and scripts. If you’re not already tagging your systems, having a strategy on the tagging and the automation can save you both time and money.