Your Approval Workflows for Cloud Resource Provisioning Are Costing You Money

ITSM tools like ServiceNow let you build approval workflows and change-control processes for tasks such as cloud resource provisioning. Managers and C-level executives love the governance this provides: it helps ensure full compliance with regulations and laws while preventing rogue IT usage across the enterprise. Without these approval workflows, a user could provision a giant virtual machine without anyone knowing about it, or a firewall change could inadvertently take down your network. Surely these are good uses of approvals, right?

Training Users To Spend Money

One downside I’ve covered before is that these processes often slow users down, wasting time or leading them to circumvent the system altogether. Another consequence of using approval workflows for cloud resource provisioning is that they train your users to always provision as much as possible. For example, let’s say your cloud operations team must approve all new VM requests in AWS, as well as all VM size change requests. If I’m a user on the fence about whether I need an m5.large or an m5.xlarge server, I might as well request the m5.xlarge now instead of having to submit another request later to change the size. Each size up doubles the cost, so this single VM now costs the company twice as much just because the user doesn’t want to go through additional approvals!
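To put a rough dollar figure on that doubling, here's a quick sketch. The hourly rates below are illustrative on-demand prices (roughly us-east-1 at the time of writing) and will vary by region and over time:

```python
# "Just request the bigger size" in numbers. Prices are assumed, not quoted.
HOURS_PER_MONTH = 730

m5_large_hourly = 0.096   # assumed m5.large on-demand rate, USD/hour
m5_xlarge_hourly = 0.192  # assumed m5.xlarge on-demand rate, USD/hour

large_monthly = m5_large_hourly * HOURS_PER_MONTH
xlarge_monthly = m5_xlarge_hourly * HOURS_PER_MONTH
premium = xlarge_monthly - large_monthly

print(f"m5.large:  ${large_monthly:.2f}/month")
print(f"m5.xlarge: ${xlarge_monthly:.2f}/month")
print(f"'Just in case' premium: ${premium:.2f}/month per VM")
```

Multiply that monthly premium across a fleet of over-provisioned dev servers and the cost of the approval workflow becomes very real.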

Let’s look at another example that we at ParkMyCloud are very familiar with: turning off resources overnight. An organization might set up a workflow in which any resource on a schedule that shuts it down overnight requires approval to turn back on outside of business hours. If a user has the choice between not applying that schedule and needing approval to turn the resource back on when they need it, they’ll do whatever it takes to avoid scheduling that resource. This leads to unnecessary cloud waste.
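The waste at stake here is substantial. A back-of-the-envelope calculation, assuming a typical 12-hour weekday schedule, shows what users leave on the table when they avoid the schedule entirely:

```python
# How much of the week does an instance actually need to run if it's parked
# nights and weekends? Assumes a 12-hour weekday schedule (e.g., 7am-7pm).
WEEK_HOURS = 7 * 24     # 168 hours in a week
running_hours = 5 * 12  # weekdays only, 12 business hours each

savings = 1 - running_hours / WEEK_HOURS
print(f"Running {running_hours}/{WEEK_HOURS} hours saves ~{savings:.0%} of on-demand cost")
```

That is roughly two-thirds of the on-demand bill forfeited on every instance left unscheduled just to dodge an approval step.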

Give Users The Tools To Succeed

At ParkMyCloud, we believe in guardrails and RBAC. However, we also know that empowering users to manage their own servers and databases through a self-service portal leads to more scheduling, rightsizing, and cost savings. By allowing a user to override a scheduled instance for a couple of hours when they need it instead of hassling their IT manager for approval, a user can get their job done while still allowing the instance to shut down.

Approval workflows for cloud resource provisioning, resource scheduling, and other infrastructure tasks do help to a certain extent, and compliance with laws and regulations is a necessity. With that said, don’t be the CloudOps team that puts approval processes on every minor detail or dev server, or you’ll hinder your users more than help them. Cost savings is a team effort, so when you implement tools, look for a self-service model like ParkMyCloud’s to help you manage resources while empowering users. You’ll save money while making users happy, which is a win-win for your enterprise.

Why Kubernetes If It Makes Your Life Worse?

The bleeding-edge tech community is full of fast-moving recommendations telling you why Kubernetes, blockchain, serverless, or the latest JavaScript library is the pinnacle of technology, and that you’ll be left behind if you don’t drop what you’re doing RIGHT NOW and fully convert all of your applications. Inevitably, after the initial rush of posts expounding upon the life-changing benefits of these hot new technologies, eager followers wake up to the fact that perhaps the new fad isn’t a magic bullet after all. This is how new tech holy wars begin, with both sides yelling that the other is a member of a cult or trying to sell you something. So how do you weed through the noise and decide which technologies might actually improve your operations? And: is Kubernetes one of them?

Kubernetes – The Latest Holy War

Kubernetes v1.0 was released in 2015, after Google began developing it in 2014, drawing on years of experience with Borg, its internal cluster manager. It quickly emerged as the most popular way to manage and orchestrate large numbers of containers, despite competition from Docker Swarm, Apache Mesos, HashiCorp’s Nomad, and various other software from the usual suspects at IBM, HP, and Microsoft. Google partnered with the Linux Foundation in 2015 to form the Cloud Native Computing Foundation (CNCF), with Kubernetes as one of the seed technologies.

This public dedication to open-source technology gave Kubernetes instant nerd bonus points, and its pedigree at one of the largest software companies in the world made it the hot new thing. Soon, there were thousands of articles, tweets, blog posts, and conference talks about moving to a microservices architecture built on containers, with Kubernetes managing the pods and services. It wasn’t long before misguided job postings were looking for 10+ years of Kubernetes experience and every startup pivoted from their blockchain-based app to a Kubernetes-based app overnight.

As with any tech fad, the counterarguments started to mount while new fads rose up. Serverless became the new thing, but those who had put their eggs in the Kubernetes basket resisted the shift from containers to functions. Zealots on the serverless side argued that you could ditch your entire code base and move to Lambda or Azure Functions, while Kubernetes fanatics countered that you didn’t need functions-as-a-service if you just packaged your entire OS into a container and spun up a million of them. So, do you need Kubernetes?

You’re (Probably) Not Google

Here’s the big thing that gets missed when a huge company open-sources its internal tooling: you’re most likely not at their scale. You don’t have the same resources or the same problems as that huge company. Sure, you’re working your hardest to make your company so big that you have the same scaling problems as Google, but you’re probably not there yet. Don’t get me wrong: I love when large enterprises like Netflix or Amazon open-source some of their internal tooling, as it benefits the open-source community and is a great learning opportunity, but I have to remind myself that they are solving a fundamentally different problem than I am.

While I’m not suggesting you avoid planning ahead for scalability, spending your time getting something like Kubernetes set up and configured instead of developing your main business application can waste valuable time and funds. There’s a lot of overhead in learning, deploying, and managing Kubernetes, overhead that companies like Google can absorb. If you can get the same effect from an autoscaling group of VMs with less headache, why wouldn’t you go that route? Remember: something like 60% of global AWS spend is on EC2, and with good reason. You can get surprisingly far with tried-and-true technologies and the basics, without ripping everything out to implement the latest fad, which is why Kubernetes (or serverless, or blockchain, or multi-cloud…) shouldn’t be your focus. Kubernetes certainly has its place, and it may be the tool you need. But most likely, it’s making things more complex for you without a corresponding benefit.

ServiceNow Governance Should Enable Users, but It Usually Constrains Them

When most enterprise users hear that their organization will start heavily using ServiceNow governance, they assume their job is about to get much harder, not easier. This stems from admins putting overly restrictive policies in place, even with the good intentions of preventing security or financial problems. The negative side effect often manifests as a huge limitation for users who are just trying to do their jobs. Ultimately, this can lead to “shadow IT”, angry users, and inefficient business processes. So how can you use ServiceNow governance to increase efficiency rather than hinder it?

What is ServiceNow governance?

One of the main features of ServiceNow is the ability to implement processes for approvals, requests, and delegation. Governance in ServiceNow includes the policies and definitions of how decisions are made and who can make them. For example, a user who needs a new virtual machine in AWS can be required to request one through the ServiceNow portal. Depending on the choices made during the request, cloud admins or finance team members can be alerted and asked to approve it before it is carried out. Once approved, the VM is created with specific tags and configuration options that match compliance and risk profiles.

What Drives Governance?

Governance policies are implemented with some presumably well-intentioned business goal in mind. Some organizations are trying to manage risk through approvals and visibility. Others are trying to rein in IT spending by guiding users to lower-cost alternatives to what they were requesting.

Too often, to the end user, the goal gets lost behind the frustration of actions being slowed, blocked, or redirected by the (beautifully automated) red tape. Admins lose sight of the central business needs while implementing a governance structure meant to protect those very needs. For users to comply with these policies rather than work around them, it’s crucial that they understand the motivations behind them.

In practice, this turns into a balancing act. The guiding question that needs to be answered by ServiceNow governance is, “How can we enable our users to do their jobs while preventing costly (or risky) behavior?” 

Additionally, it’s critical that new policies are clearly communicated and that they hook into existing processes. That’s not to say this is easy. Done well, it requires a team of technical and business stakeholders to contribute their needs and perspectives. The technical possibilities and limitations must be matched with business needs and overall organizational plans, all while avoiding roadblocks and managing edge cases. There’s a lot to mesh together, and each organization has unique challenges and desires, which makes the whole process hard to generalize.

The End Result

At ParkMyCloud, we help facilitate these kinds of governance frameworks. The ParkMyCloud application allows admins to set permissions on users and give access to teams. Because it reads resource tags, your existing tagging and naming processes can be reused. New processes around resource schedule management can be easily communicated via chat or email notifications. Users get the access they need to keep doing their jobs, but no more than required. Employing similar ideas in your ServiceNow governance policies can make your users successful and your admins happy.

The 3 Must-Ask Questions When Using Google Cloud IAM

Google Cloud IAM (Identity and Access Management) is the core component of Google Cloud that keeps you secure. By adopting the “principle of least privilege” methodology, you can work toward making your infrastructure accessible only to those who need it. As your organization grows, keeping your IAM permissions correct can seem daunting, so here’s a checklist of what to think about before changing permissions. It can also help as you continuously enforce your access management.

1. Who? (The “Identity”)

Narrowing down the person or system that will be accessing resources is the first step in granting IAM permissions. The identity can be one of several options, including:

  • A Google account (usually used by a human)
  • A service account (usually used by a script/tool)
  • A Google group
  • A G Suite domain

Our biggest recommendation for this step is to keep the list of identities as short as possible. While you may eventually need to assign permissions to a larger group, it’s much safer to start with a smaller subset and add permissions as necessary over time. Also consider whether the access is for an automated task or a real person, since service accounts with distinct, single purposes are easier to track and limit.

2. What Access? (The “Role”)

Google Cloud permissions often correspond directly with a specific Google Cloud REST API method. These permissions are named based on the GCP service, the specific resource, and the verb that is being allowed. For example, ParkMyCloud requires a permission named “compute.instances.start” in order to issue a start command to Google Compute Engine instances.

These permissions are not granted directly, but instead are included in a role that gets assigned to the identity you’ve chosen. There are three different types of roles:

  • Primitive Roles – These broad roles (Owner, Editor, and Viewer) include a huge number of permissions across all GCP services, and should be avoided in favor of more specific roles based on need.
  • Predefined Roles – Google provides many roles that describe a collection of permissions for a specific service, like “roles/cloudsql.client” (which includes the permissions “cloudsql.instances.connect” and “cloudsql.instances.get”). Some roles are broad, while others are limited.
  • Custom Roles – If a predefined role doesn’t exist that matches what you need, you can create a custom role that includes a list of specific permissions.

Our recommendation for this step is to use a predefined role where possible, but don’t hesitate to use a custom role. The ParkMyCloud setup uses a custom role that lists exactly the REST API commands the system needs. This ensures there is no way for our platform to do anything you don’t intend. When following the “least privilege” methodology, you’ll find yourself using custom roles often.
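As a sketch of what a least-privilege custom role looks like, here is a request body shaped like what the Cloud IAM REST API's roles.create method expects. The role ID and most of the permission list are hypothetical examples for a VM-scheduling role; only compute.instances.start comes from the text above:

```python
# Hypothetical custom role for a tool that only lists, starts, and stops VMs.
create_role_request = {
    "roleId": "vmScheduler",  # hypothetical custom role ID
    "role": {
        "title": "VM Scheduler",
        "description": "Can list, start, and stop Compute Engine instances",
        "stage": "GA",
        "includedPermissions": [
            "compute.instances.list",
            "compute.instances.get",
            "compute.instances.start",
            "compute.instances.stop",
        ],
    },
}

# Least-privilege sanity check: every permission is explicit, no wildcards.
permissions = create_role_request["role"]["includedPermissions"]
assert all("*" not in p for p in permissions)
print(f"Custom role grants exactly {len(permissions)} permissions")
```

The same role can also be created from the command line with `gcloud iam roles create`, passing the permission list directly.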

3. Which Item? (The “Resource”)

Once you’ve decided on the identity and the permissions, you’ll need to assign those permissions to a resource using a Cloud IAM policy. A resource can be very granular or very broad, including things like:

  • GCP Projects
  • Single Compute Engine instances
  • Cloud Storage buckets

Each predefined role has a “lowest level” of resource that can be set. For example, the “App Engine Admin” role must be set at the project level, but the “Compute Load Balancer Admin” can be set at the compute instance level. You can always go higher up the resource hierarchy than the minimum. In the hierarchy, you have individual service resources, which all belong to a project, which can either be a part of a folder (in an organization) or directly a part of the organization.

Our recommendation, as with the Identity question, is to limit this to as few resources as possible. In practice, this might mean making a separate project to group together resources so you can assign a project-level role to an identity. Alternatively, you can just select a few resources within a project, or even an individual resource if possible.
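Tying the three questions together, a Cloud IAM policy is just a list of bindings, each granting one role to a set of members on a resource. A minimal sketch, using the predefined Cloud SQL role mentioned earlier (the project name and service account address are hypothetical placeholders):

```python
# One binding: a role, granted to members, attached at the project level.
policy = {
    "bindings": [
        {
            "role": "roles/cloudsql.client",  # predefined role from the example above
            "members": [
                "serviceAccount:reporting@my-project.iam.gserviceaccount.com",
            ],
        },
    ],
}

# Keep the blast radius small: one role, one identity, one resource.
for binding in policy["bindings"]:
    for member in binding["members"]:
        print(f"{member} -> {binding['role']}")
```

Attaching this policy at the project level grants the role on every Cloud SQL instance in the project; attaching it to a single instance narrows it further.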

And That’s All That IAM

These three questions cover the crucial decisions you must make for Google Cloud IAM assignments. By thinking through each of them, you can raise your security and lower your risk. For an example of ParkMyCloud’s recommended setup, in which a custom role is assigned to a new service account to schedule and resize your VMs and databases, check out the documentation for ParkMyCloud GCP access, and sign up for a free trial today to get it connected securely to your environment.

Use this Azure IAM Checklist When You Add New Users

Microsoft Azure IAM, also known as Access Control (IAM), is Azure’s product for RBAC and governance of users and roles. Identity management is a crucial part of cloud operations because of the security risks that come from misapplied permissions. Whenever you have a new identity (a user, group, or service principal) or a new resource (such as a virtual machine, database, or storage blob), you should grant the proper access with as limited a scope as possible. Here are some of the questions to ask yourself to maintain maximum security:

1. Who needs access?

Granting access to an identity covers both human users and programmatic access from applications and scripts. If you’re using Azure Active Directory, you’ll likely want to use those managed identities for role assignments. Consider using an existing group of users, or creating a new one, to apply similar permissions across a set of users; you can then remove a user from that group later to revoke those permissions.

Programmatic access is typically granted through Azure service principals. Since it’s not a user logging in, the application or script will use the App Registration credentials to connect and run any commands. As an example, ParkMyCloud uses a service principal to get a list of managed resources, start them, stop them, and resize them.

2. What role do they need?

Azure IAM uses roles to give specific permissions to identities. Azure has a number of built-in roles based on a few common functions: 

  • Owner – Full management access, including granting access to others
  • Contributor – Management access to perform all actions except granting access to others
  • User Access Administrator – Specific access to grant access to others
  • Reader – View-only access

These built-in roles can be more specific, such as “Virtual Machine Contributor” or “Log Analytics Reader”. However, even with these specific predefined roles, you’re almost always granting more access than the principle of least privilege calls for.

For even more granular permissions, you can create Azure custom roles that list the specific commands that can be run. As an example, ParkMyCloud recommends creating a custom role listing only the specific commands its features use. This way, you start with minimal permissions and slowly build up based on the needs of the user or service account. Not only can this prevent data leaks or theft, it can also protect against threats like malware, former-employee revenge, and rogue bitcoin mining.
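For illustration, here is a minimal custom role definition in the JSON format that `az role definition create --role-definition @role.json` accepts. The role name, action list, and subscription ID are hypothetical examples of a narrow start/stop role, not ParkMyCloud's actual recommended role:

```python
import json

# Hypothetical custom role: read, start, and deallocate VMs, nothing else.
custom_role = {
    "Name": "VM Power Operator",
    "IsCustom": True,
    "Description": "Can read, start, and deallocate virtual machines",
    "Actions": [
        "Microsoft.Compute/virtualMachines/read",
        "Microsoft.Compute/virtualMachines/start/action",
        "Microsoft.Compute/virtualMachines/deallocate/action",
    ],
    "NotActions": [],  # nothing carved out; the Actions list is already minimal
    "AssignableScopes": [
        "/subscriptions/00000000-0000-0000-0000-000000000000",  # placeholder ID
    ],
}

print(json.dumps(custom_role, indent=2))
```

Because the Actions list enumerates exact operations, anyone assigned this role cannot delete VMs, change networking, or touch any other resource type.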

3. Where do they need access?

The final piece of an Azure IAM permission set is deciding the specific resource that the identity should be able to access. This should be at the most granular level possible to maintain maximum security. For example, a Cloud Operations Manager may need access at the management group or subscription level, while a SQL Server utility may just need access to specific database resources. When creating or assigning the role, this is typically referred to as the “scope” in Azure. 

Our suggestion is to think twice before using a subscription or management group as the scope. How your subscriptions are organized matters: organizations with many small, narrowly focused subscriptions may be able to use subscription-level scope more often. On the flip side, companies with broader subscriptions tend to use resource groups or tags to limit access, which means the scope is usually smaller than a whole subscription.
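Azure scopes are hierarchical resource ID paths, so choosing a scope really means choosing how deep in the path to stop. A short sketch (all IDs below are hypothetical placeholders):

```python
# Each narrower scope is a path extension of the broader one above it.
subscription = "/subscriptions/00000000-0000-0000-0000-000000000000"
resource_group = subscription + "/resourceGroups/analytics-rg"
sql_database = (resource_group
                + "/providers/Microsoft.Sql/servers/prod-sql/databases/reporting")

# A role assigned at a broader scope is inherited by everything beneath it,
# which is exactly why the narrowest workable scope is the safest choice.
assert resource_group.startswith(subscription)
assert sql_database.startswith(resource_group)

for scope in (subscription, resource_group, sql_database):
    print(scope)
```

Assigning the SQL utility's role at the database path grants nothing elsewhere in the resource group or subscription, while the Cloud Operations Manager's role at the subscription path is inherited by everything below it.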

More Secure, Less Worry

By revisiting these questions for each new resource or new identity that is created, you can quickly develop habits to maintain a high level of security using Azure IAM. For a real-world look at how we suggest setting up a service principal with a custom role to manage the power scheduling and rightsizing of your VMs, scale sets, and AKS clusters, check out the documentation for ParkMyCloud Azure access, and sign up for a free trial today to get it connected securely to your environment.