Earlier this week we discussed ways to improve cloud automation through tagging. Today, I want to extend the conversation to look at how one ParkMyCloud user is applying tagging best practices to improve their cloud governance.
The company we talked to — they’re in media, so let’s call them MediaCorp — has about 10,000 employees, which means the Cloud Engineering team has several hundred cloud users to manage, with a combined 100+ AWS accounts and more than 5,000 active AWS resources. The only way they can maintain security and cost control in a cloud environment of this magnitude is through automated governance. Here’s how they do it.
Tagging Best Practices #1: Always Tag
MediaCorp has a strict policy: every AWS resource must have the same set of five tags attached to it:
- team: essential to establishing ownership of the resource, both for maintenance and for billing
- environment: knowing whether the resource is for production, staging, or QA has implications for on/off schedules
- application: MediaCorp uses this as a trigger for Chef cookbooks, but it can also be used for billing
- expiration date: any non-production resource has a stated expiration date to prevent orphaned resources
- cost center: the finance department has internal billing codes for all IT resources
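To make the "always tag" rule concrete, here is a minimal Python sketch of a check for the five required tags. The exact tag key spellings and the `missing_tags` helper are assumptions for illustration, not MediaCorp's actual implementation.

```python
# Hypothetical key spellings; the post names the tags but not their exact keys.
REQUIRED_TAGS = ("team", "environment", "application", "expiration date", "cost center")


def missing_tags(tags: dict[str, str]) -> list[str]:
    """Return the required tag keys that are absent or blank, in policy order."""
    return [key for key in REQUIRED_TAGS if not tags.get(key, "").strip()]


if __name__ == "__main__":
    # A resource that only carries two of the five tags.
    print(missing_tags({"team": "video", "environment": "dev"}))
```

An empty list means the resource is compliant; anything else is a list of tags somebody still needs to fill in.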
How does MediaCorp ensure that all resources are tagged?
Tagging Best Practices #2: Automated Compliance
The key is to use automated rules to enforce that every resource has the five required tags, and this is where ParkMyCloud’s policy engine comes into play. MediaCorp has configured a set of policies that check for the five tags. If a resource is missing any of them, it is immediately put on an “always parked” schedule and moved to a team (a way to group instances in ParkMyCloud) reserved for mistagged resources.
When this happens, the Cloud Engineering team gets an email and a Slack notification, so they can track down the creator of the offending resource and correct the process that created it.
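The logic of that rule can be sketched in a few lines of Python. This is not ParkMyCloud's API: the `Resource` class, the `enforce_tag_policy` function, the "mistagged" team name, and the `notify` callback (standing in for the email and Slack alerts) are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass

# Hypothetical key spellings for the five required tags.
REQUIRED_TAGS = ("team", "environment", "application", "expiration date", "cost center")


@dataclass
class Resource:
    resource_id: str
    tags: dict
    schedule: str = "none"
    team: str = "default"


def enforce_tag_policy(resource, notify):
    """Park and quarantine a resource that lacks any required tag.

    Returns the list of missing tag keys (empty when compliant).
    """
    missing = [key for key in REQUIRED_TAGS if not resource.tags.get(key)]
    if missing:
        resource.schedule = "always parked"
        resource.team = "mistagged"  # hypothetical name for the holding team
        notify(f"{resource.resource_id} is missing tags: {', '.join(missing)}")
    return missing


if __name__ == "__main__":
    server = Resource("i-0123", {"team": "video"})
    enforce_tag_policy(server, print)
    print(server.schedule, server.team)
```

Passing the notifier in as a function keeps the rule itself independent of how the alert is delivered.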
Tagging Best Practices #3: Optimize Workflows
Now the tags themselves come into play. MediaCorp uses their five-tag system for three main purposes:
- Configuration management: as mentioned above, they use tags as the trigger for Chef cookbooks. The same approach works for Puppet modules or Ansible playbooks.
- CI/CD: MediaCorp uses Jenkins to provision cloud resources, so they use tags to associate build and deployment servers with their corresponding repository and build number, for both automated and manual development tasks.
- Cost control: the “environment” tag determines which parking schedule is applied to each resource. Production resources run 24×7, of course, while “dev” or “test” resources are put on a schedule to park 7:00 PM – 7:00 AM and on weekends. (Users can always log in to override these schedules if needed.)
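The cost-control schedule is simple enough to sketch in Python. The `should_be_running` function is a hypothetical illustration, and it assumes that only “dev” and “test” resources are parked and that every other environment is left running, since the schedules for staging and QA are not spelled out here.

```python
from datetime import datetime

# Assumption: only these environment values are put on the parking schedule.
PARKED_ENVIRONMENTS = {"dev", "test"}


def should_be_running(environment: str, now: datetime) -> bool:
    """Return True if a resource with this environment tag should be on at `now`."""
    if environment not in PARKED_ENVIRONMENTS:
        return True  # e.g. production runs around the clock
    if now.weekday() >= 5:  # Saturday (5) or Sunday (6)
        return False
    return 7 <= now.hour < 19  # on from 7:00 AM up to, but not including, 7:00 PM


if __name__ == "__main__":
    print(should_be_running("dev", datetime.now()))
```

A manual override, like the one mentioned above, would sit on top of this check rather than inside it.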
Conclusion: Tagging is Worth the Effort
It may at first seem unnecessarily harsh to automatically park any resource that doesn’t have proper tags applied, but this process is what allows MediaCorp to keep a well-governed, cost-controlled infrastructure. You can always adapt their approach to your own needs by simply moving mistagged resources to another team and notifying the owners that action is needed, without changing the state or schedule of the resource.
Either way, with a rigorous application of tagging best practices in place, you can automate governance and improve your workflows.