CloudFormation

A description of the Getting Started CloudFormation file and permissions

The Getting Started with Karpenter guide uses CloudFormation to bootstrap the cluster to enable Karpenter to create and manage nodes, as well as to allow Karpenter to respond to interruption events. This document describes the cloudformation.yaml file used in that guide. These descriptions should allow you to understand:

  • What Karpenter is authorized to do with your EKS cluster and AWS resources when using the cloudformation.yaml file
  • What permissions you need to set up if you are adding Karpenter to an existing cluster

Overview

To download a particular version of cloudformation.yaml, set the version and use curl to pull the file to your local system:

export KARPENTER_VERSION=v0.31.0
curl https://raw.githubusercontent.com/aws/karpenter/"${KARPENTER_VERSION}"/website/content/en/preview/getting-started/getting-started-with-karpenter/cloudformation.yaml > cloudformation.yaml

Following some header information, the rest of the cloudformation.yaml file describes the resources that CloudFormation deploys. The sections of that file can be grouped together under the following general headings:

  • Node Authorization: Creates a NodeInstanceProfile, attaches a NodeRole to it, and connects it to an IAM Identity Mapping used to authorize nodes to the cluster. This defines the permissions each node managed by Karpenter has to access EC2 and other AWS resources. This doesn’t actually create the IAM Identity Mapping. That part is orchestrated by eksctl in the Getting Started guide.
  • Controller Authorization: Creates the KarpenterControllerPolicy that is attached to the service account. Again, the actual creation of the service account (karpenter), which is combined with the KarpenterControllerPolicy, is orchestrated by eksctl in the Getting Started guide.
  • Interruption Handling: Allows the Karpenter controller to see and respond to interruptions that occur with the nodes that Karpenter is managing. See the Interruption section of the Disruption page for details.

A lot of the object naming that is done by cloudformation.yaml is based on the following:

  • Cluster name: With a username of bob, the Getting Started guide would name your cluster bob-karpenter-demo. That name is then substituted into any name below where ${ClusterName} is included.

  • Partition: Any time an ARN is used, it includes the partition name to identify where the object is found. In most cases, that partition name is aws. However, it could also be aws-cn (for China Regions) or aws-us-gov (for AWS GovCloud US Regions).
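
As a rough illustration of the naming scheme, the sketch below derives a few of the names and one ARN from a cluster name and partition. The function name karpenter_names and the account ID 111122223333 are placeholders invented for this example; the name patterns themselves come from the template sections described later on this page.

```python
def karpenter_names(cluster_name: str, partition: str = "aws",
                    account_id: str = "111122223333") -> dict:
    """Return names that the template derives from ${ClusterName}.

    account_id is a placeholder value, not a real account.
    """
    node_role = f"KarpenterNodeRole-{cluster_name}"
    return {
        "node_role_name": node_role,
        "controller_policy_name": f"KarpenterControllerPolicy-{cluster_name}",
        # The interruption queue is named after the cluster itself.
        "queue_name": cluster_name,
        # ARNs carry the partition (aws, aws-cn, or aws-us-gov).
        "node_role_arn": f"arn:{partition}:iam::{account_id}:role/{node_role}",
    }


names = karpenter_names("bob-karpenter-demo")
print(names["node_role_name"])
print(names["node_role_arn"])
```

Passing partition="aws-cn" or partition="aws-us-gov" changes only the second field of the ARN.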

Node Authorization

The following sections of the cloudformation.yaml file set up IAM permissions for Kubernetes nodes created by Karpenter. In particular, this involves setting up a node role that can be attached and passed to instance profiles that Karpenter generates at runtime:

  • KarpenterNodeRole

KarpenterNodeRole

This section of the template defines the IAM role attached to generated instance profiles. Given a cluster name of bob-karpenter-demo, this role would end up being named KarpenterNodeRole-bob-karpenter-demo.

KarpenterNodeRole:
  Type: "AWS::IAM::Role"
  Properties:
    RoleName: !Sub "KarpenterNodeRole-${ClusterName}"
    Path: /
    AssumeRolePolicyDocument:
      Version: "2012-10-17"
      Statement:
        - Effect: Allow
          Principal:
            Service:
              !Sub "ec2.${AWS::URLSuffix}"
          Action:
            - "sts:AssumeRole"
    ManagedPolicyArns:
      - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKS_CNI_Policy"
      - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEKSWorkerNodePolicy"
      - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonEC2ContainerRegistryReadOnly"
      - !Sub "arn:${AWS::Partition}:iam::aws:policy/AmazonSSMManagedInstanceCore"

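The role created here includes several AWS managed policies, which are designed to provide permissions for specific uses needed by the nodes to work with EC2 and other AWS resources. These include:

  • AmazonEKS_CNI_Policy: Provides the permissions that the Amazon VPC CNI plugin needs to configure EKS worker nodes.
  • AmazonEKSWorkerNodePolicy: Lets Amazon EKS worker nodes connect to EKS clusters.
  • AmazonEC2ContainerRegistryReadOnly: Allows read-only access to repositories in Amazon Elastic Container Registry.
  • AmazonSSMManagedInstanceCore: Adds AWS Systems Manager core functionality for Amazon EC2 instances.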

If you were to use a node role from an existing cluster, you could skip this provisioning step and pass this node role to any EC2NodeClasses that you create. Additionally, you would ensure that the Controller Policy has iam:PassRole permission to the role attached to the generated instance profiles.

Controller Authorization

This section sets the AWS permissions for the Karpenter Controller. When used in the Getting Started guide, eksctl uses these permissions to create a service account (karpenter) that is combined with the KarpenterControllerPolicy.

The resources defined in this section are associated with:

  • KarpenterControllerPolicy

Because the scope of the KarpenterControllerPolicy is an AWS region, the cluster’s AWS region is included in the AllowScopedEC2InstanceActions.

KarpenterControllerPolicy

A KarpenterControllerPolicy object sets the name of the policy, then defines a set of resources and actions allowed for those resources. For our example, the KarpenterControllerPolicy would be named: KarpenterControllerPolicy-bob-karpenter-demo

KarpenterControllerPolicy:
  Type: AWS::IAM::ManagedPolicy
  Properties:
    ManagedPolicyName: !Sub "KarpenterControllerPolicy-${ClusterName}"
    # The PolicyDocument must be in JSON string format because we use a StringEquals condition that uses an interpolated
    # value in one of its key parameters which isn't natively supported by CloudFormation
    PolicyDocument: !Sub |
      {
        "Version": "2012-10-17",
        "Statement": [

Someone wanting to add Karpenter to an existing cluster, instead of using cloudformation.yaml, would need to create the IAM policy directly and assign that policy to the role leveraged by the service account using IRSA.
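
If you create the policy outside CloudFormation, the ${...} placeholders in the statements that follow have to be filled in some other way. The sketch below is one minimal way to do that in Python, assuming a simple textual substitution is enough. The helper render_sub is hypothetical and only imitates the basic ${Name} replacement of !Sub; it does not handle escapes such as ${!Literal}.

```python
import json
import re


def render_sub(template: str, variables: dict) -> str:
    """Replace every ${Name} with variables[Name].

    Raises KeyError if the template uses a name that was not supplied.
    """
    return re.sub(r"\$\{([^}]+)\}",
                  lambda match: variables[match.group(1)],
                  template)


# A fragment with an interpolated condition key, as used in the policy.
template = '''{
  "Condition": {
    "StringEquals": {
      "aws:RequestTag/kubernetes.io/cluster/${ClusterName}": "owned"
    }
  }
}'''

rendered = json.loads(render_sub(template, {"ClusterName": "bob-karpenter-demo"}))
condition_keys = list(rendered["Condition"]["StringEquals"])
print(condition_keys)
```

Parsing the result with json.loads is a cheap check that the rendered text is still valid JSON before you hand it to IAM.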

AllowScopedEC2InstanceActions

The AllowScopedEC2InstanceActions statement ID (Sid) identifies a set of EC2 resources that are allowed to be accessed with RunInstances and CreateFleet actions. For RunInstances and CreateFleet actions, the Karpenter controller can read (but not create) image, snapshot, spot-instances-request, security-group, subnet and launch-template EC2 resources, scoped for the particular AWS partition and region.

{
  "Sid": "AllowScopedEC2InstanceActions",
  "Effect": "Allow",
  "Resource": [
    "arn:${AWS::Partition}:ec2:${AWS::Region}::image/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}::snapshot/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:spot-instances-request/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:security-group/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:subnet/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:launch-template/*"
  ],
  "Action": [
    "ec2:RunInstances",
    "ec2:CreateFleet"
  ]
}

AllowScopedEC2InstanceActionsWithTags

The AllowScopedEC2InstanceActionsWithTags Sid allows the RunInstances, CreateFleet, and CreateLaunchTemplate actions requested by the Karpenter controller to create all fleet, instance, volume, network-interface, or launch-template EC2 resources (for the partition and region), and requires that the kubernetes.io/cluster/${ClusterName} tag be set to owned and a karpenter.sh/nodepool tag be set to any value. This ensures that Karpenter is only allowed to create instances for a single EKS cluster.

{
  "Sid": "AllowScopedEC2InstanceActionsWithTags",
  "Effect": "Allow",
  "Resource": [
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:fleet/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:instance/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:volume/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:network-interface/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:launch-template/*"
  ],
  "Action": [
    "ec2:RunInstances",
    "ec2:CreateFleet",
    "ec2:CreateLaunchTemplate"
  ],
  "Condition": {
    "StringEquals": {
      "aws:RequestTag/kubernetes.io/cluster/${ClusterName}": "owned"
    },
    "StringLike": {
      "aws:RequestTag/karpenter.sh/nodepool": "*"
    }
  }
}
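
To make the effect of the two condition blocks concrete, the following sketch imitates them in plain Python. It is a simplified model, not how IAM evaluates policies internally: the function statement_matches is made up for this example, it looks only at request tags, and it uses fnmatchcase as a stand-in for StringLike.

```python
from fnmatch import fnmatchcase

CLUSTER_NAME = "bob-karpenter-demo"  # example cluster name used on this page


def statement_matches(request_tags: dict, cluster_name: str = CLUSTER_NAME) -> bool:
    """Return True if the request tags satisfy both condition blocks."""
    # StringEquals: the cluster tag must be present and exactly "owned".
    owned = request_tags.get(f"kubernetes.io/cluster/{cluster_name}") == "owned"
    # StringLike "*": the nodepool tag must be present; any value is accepted.
    nodepool = request_tags.get("karpenter.sh/nodepool")
    has_nodepool = nodepool is not None and fnmatchcase(nodepool, "*")
    # Condition blocks within one statement are combined with AND.
    return owned and has_nodepool


print(statement_matches({
    "kubernetes.io/cluster/bob-karpenter-demo": "owned",
    "karpenter.sh/nodepool": "default",
}))
print(statement_matches({"kubernetes.io/cluster/other-cluster": "owned",
                         "karpenter.sh/nodepool": "default"}))
```

A request tagged for a different cluster, or one that lacks the karpenter.sh/nodepool tag, does not match this statement and so gains no permission from it.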

AllowScopedResourceCreationTagging

The AllowScopedResourceCreationTagging Sid allows EC2 CreateTags actions on fleet, instance, volume, network-interface, and launch-template resources while making RunInstances, CreateFleet, or CreateLaunchTemplate calls. Additionally, this ensures that resources can’t be tagged arbitrarily by Karpenter after they are created.

{
  "Sid": "AllowScopedResourceCreationTagging",
  "Effect": "Allow",
  "Resource": [
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:fleet/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:instance/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:volume/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:network-interface/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:launch-template/*"
  ],
  "Action": "ec2:CreateTags",
  "Condition": {
    "StringEquals": {
      "aws:RequestTag/kubernetes.io/cluster/${ClusterName}": "owned",
      "ec2:CreateAction": [
        "RunInstances",
        "CreateFleet",
        "CreateLaunchTemplate"
      ]
    },
    "StringLike": {
      "aws:RequestTag/karpenter.sh/nodepool": "*"
    }
  }
}

AllowScopedResourceTagging

The AllowScopedResourceTagging Sid allows EC2 CreateTags actions on all instances created by Karpenter after their creation. It enforces that Karpenter is only able to update the tags on cluster instances it is operating on through the kubernetes.io/cluster/${ClusterName} and karpenter.sh/nodepool tags.

{
  "Sid": "AllowScopedResourceTagging",
  "Effect": "Allow",
  "Resource": "arn:${AWS::Partition}:ec2:${AWS::Region}:*:instance/*",
  "Action": "ec2:CreateTags",
  "Condition": {
    "StringEquals": {
      "aws:ResourceTag/kubernetes.io/cluster/${ClusterName}": "owned"
    },
    },
    "StringLike": {
      "aws:ResourceTag/karpenter.sh/nodepool": "*"
    },
    "ForAllValues:StringEquals": {
      "aws:TagKeys": [
        "karpenter.sh/nodeclaim",
        "Name"
      ]
    }
  }
}
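
The ForAllValues:StringEquals block on aws:TagKeys limits which tag keys a CreateTags request may contain. As a rough model, it behaves like a subset test: every key in the request must be one of the listed keys. The sketch below shows that idea only; the function name tag_keys_allowed is invented for the example, and the other conditions of the statement are not modeled.

```python
# Tag keys listed under aws:TagKeys in the statement above.
ALLOWED_TAG_KEYS = {"karpenter.sh/nodeclaim", "Name"}


def tag_keys_allowed(requested_keys) -> bool:
    """Return True if every requested tag key is in the allowed set."""
    return set(requested_keys) <= ALLOWED_TAG_KEYS


print(tag_keys_allowed(["Name"]))
print(tag_keys_allowed(["Name", "team"]))
```

In this model, a request that adds Name or karpenter.sh/nodeclaim passes, while a request that also tries to set any other key, such as team, does not.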

AllowScopedDeletion

The AllowScopedDeletion Sid allows TerminateInstances and DeleteLaunchTemplate actions to delete instance and launch-template resources, provided that karpenter.sh/nodepool and kubernetes.io/cluster/${ClusterName} tags are set. These tags must be present on all resources that Karpenter is going to delete. This ensures that Karpenter can only delete instances and launch templates that are associated with it.

{
  "Sid": "AllowScopedDeletion",
  "Effect": "Allow",
  "Resource": [
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:instance/*",
    "arn:${AWS::Partition}:ec2:${AWS::Region}:*:launch-template/*"
  ],
  "Action": [
    "ec2:TerminateInstances",
    "ec2:DeleteLaunchTemplate"
  ],
  "Condition": {
    "StringEquals": {
      "aws:ResourceTag/kubernetes.io/cluster/${ClusterName}": "owned"
    },
    "StringLike": {
      "aws:ResourceTag/karpenter.sh/nodepool": "*"
    }
  }
}

AllowRegionalReadActions

The AllowRegionalReadActions Sid allows DescribeAvailabilityZones, DescribeImages, DescribeInstances, DescribeInstanceTypeOfferings, DescribeInstanceTypes, DescribeLaunchTemplates, DescribeSecurityGroups, DescribeSpotPriceHistory, and DescribeSubnets actions for the current AWS region. This allows the Karpenter controller to do any of those read-only actions across all related resources for that AWS region.

{
  "Sid": "AllowRegionalReadActions",
  "Effect": "Allow",
  "Resource": "*",
  "Action": [
    "ec2:DescribeAvailabilityZones",
    "ec2:DescribeImages",
    "ec2:DescribeInstances",
    "ec2:DescribeInstanceTypeOfferings",
    "ec2:DescribeInstanceTypes",
    "ec2:DescribeLaunchTemplates",
    "ec2:DescribeSecurityGroups",
    "ec2:DescribeSpotPriceHistory",
    "ec2:DescribeSubnets"
  ],
  "Condition": {
    "StringEquals": {
      "aws:RequestedRegion": "${AWS::Region}"
    }
  }
}

AllowSSMReadActions

The AllowSSMReadActions Sid allows the Karpenter controller to read SSM parameters (ssm:GetParameter) from the current region for SSM parameters generated by AWS services.

NOTE: If potentially sensitive information is stored in SSM parameters, you could consider restricting access to these parameters further.

{
  "Sid": "AllowSSMReadActions",
  "Effect": "Allow",
  "Resource": "arn:${AWS::Partition}:ssm:${AWS::Region}::parameter/aws/service/*",
  "Action": "ssm:GetParameter"
}
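
The Resource pattern above covers only parameters under /aws/service/ with an empty account field in the ARN. The sketch below approximates that match with fnmatchcase to show which parameter ARNs fall inside the pattern. The function name readable_by_controller, the region us-west-2, and the account ID 111122223333 are assumptions made for the example.

```python
from fnmatch import fnmatchcase


def readable_by_controller(parameter_arn: str, partition: str = "aws",
                           region: str = "us-west-2") -> bool:
    """Return True if the ARN matches the statement's Resource pattern."""
    pattern = f"arn:{partition}:ssm:{region}::parameter/aws/service/*"
    return fnmatchcase(parameter_arn, pattern)


# A public parameter published by an AWS service (empty account field).
public_arn = ("arn:aws:ssm:us-west-2::parameter/aws/service/eks/optimized-ami/"
              "1.27/amazon-linux-2/recommended/image_id")
# A hypothetical parameter owned by your own account.
private_arn = "arn:aws:ssm:us-west-2:111122223333:parameter/myapp/db/password"

print(readable_by_controller(public_arn))
print(readable_by_controller(private_arn))
```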

AllowPricingReadActions

Because pricing information does not exist in every region at the moment, the AllowPricingReadActions Sid allows the Karpenter controller to get product pricing information (pricing:GetProducts) for all related resources across all regions.

{
  "Sid": "AllowPricingReadActions",
  "Effect": "Allow",
  "Resource": "*",
  "Action": "pricing:GetProducts"
}

AllowInterruptionQueueActions

Karpenter supports interruption queues, which you can create as described in the Interruption section of the Disruption page. This section of the cloudformation.yaml template gives Karpenter permission to access those queues by specifying the resource ARN. For the interruption queue you created (${KarpenterInterruptionQueue.Arn}), the AllowInterruptionQueueActions Sid gives the Karpenter controller permission to delete messages (DeleteMessage), get queue attributes (GetQueueAttributes), get the queue URL (GetQueueUrl), and receive messages (ReceiveMessage).

{
  "Sid": "AllowInterruptionQueueActions",
  "Effect": "Allow",
  "Resource": "${KarpenterInterruptionQueue.Arn}",
  "Action": [
    "sqs:DeleteMessage",
    "sqs:GetQueueAttributes",
    "sqs:GetQueueUrl",
    "sqs:ReceiveMessage"
  ]
}

AllowPassingInstanceRole

The AllowPassingInstanceRole Sid gives the Karpenter controller permission to pass (iam:PassRole) the node role (KarpenterNodeRole-${ClusterName}) to generated instance profiles. This gives EC2 explicit permission to use the KarpenterNodeRole-${ClusterName} when assigning permissions to generated instance profiles while launching nodes.

{
  "Sid": "AllowPassingInstanceRole",
  "Effect": "Allow",
  "Resource": "arn:${AWS::Partition}:iam::${AWS::AccountId}:role/KarpenterNodeRole-${ClusterName}",
  "Action": "iam:PassRole",
  "Condition": {
    "StringEquals": {
      "iam:PassedToService": "ec2.amazonaws.com"
    }
  }
}

AllowScopedInstanceProfileCreationActions

The AllowScopedInstanceProfileCreationActions Sid gives the Karpenter controller permission to create a new instance profile with iam:CreateInstanceProfile, provided that the request is made to a cluster with kubernetes.io/cluster/${ClusterName} set to owned and is made in the current region. Also, karpenter.sh/nodeclass must be set to some value. This ensures that Karpenter can generate instance profiles on your behalf based on the roles specified in the EC2NodeClasses that you use to configure Karpenter.

{
  "Sid": "AllowScopedInstanceProfileCreationActions",
  "Effect": "Allow",
  "Resource": "*",
  "Action": [
    "iam:CreateInstanceProfile"
  ],
  "Condition": {
    "StringEquals": {
      "aws:RequestTag/kubernetes.io/cluster/${ClusterName}": "owned",
      "aws:RequestTag/topology.kubernetes.io/region": "${AWS::Region}"
    },
    "StringLike": {
      "aws:RequestTag/karpenter.sh/nodeclass": "*"
    }
  }
}

AllowScopedInstanceProfileTagActions

The AllowScopedInstanceProfileTagActions Sid gives the Karpenter controller permission to tag an instance profile with iam:TagInstanceProfile, based on the values shown below. Also, karpenter.sh/nodeclass must be set to some value. This ensures that Karpenter is only able to act on instance profiles that it provisions for this cluster.

{
  "Sid": "AllowScopedInstanceProfileTagActions",
  "Effect": "Allow",
  "Resource": "*",
  "Action": [
    "iam:TagInstanceProfile"
  ],
  "Condition": {
    "StringEquals": {
      "aws:ResourceTag/kubernetes.io/cluster/${ClusterName}": "owned",
      "aws:ResourceTag/topology.kubernetes.io/region": "${AWS::Region}",
      "aws:RequestTag/kubernetes.io/cluster/${ClusterName}": "owned",
      "aws:RequestTag/topology.kubernetes.io/region": "${AWS::Region}"
    },
    "StringLike": {
      "aws:ResourceTag/karpenter.sh/nodeclass": "*",
      "aws:RequestTag/karpenter.sh/nodeclass": "*"
    }
  }
}

AllowScopedInstanceProfileActions

The AllowScopedInstanceProfileActions Sid gives the Karpenter controller permission to perform iam:AddRoleToInstanceProfile, iam:RemoveRoleFromInstanceProfile, and iam:DeleteInstanceProfile actions, provided that the request is made to a cluster with kubernetes.io/cluster/${ClusterName} set to owned and is made in the current region. Also, karpenter.sh/nodeclass must be set to some value. This permission is further enforced by the iam:PassRole permission. If Karpenter attempts to add a role to an instance profile that it doesn’t have iam:PassRole permission on, that call will fail. Therefore, if you configure Karpenter to use a new role through the EC2NodeClass, ensure that you also specify that role within your iam:PassRole permission.

{
  "Sid": "AllowScopedInstanceProfileActions",
  "Effect": "Allow",
  "Resource": "*",
  "Action": [
    "iam:AddRoleToInstanceProfile",
    "iam:RemoveRoleFromInstanceProfile",
    "iam:DeleteInstanceProfile"
  ],
  "Condition": {
    "StringEquals": {
      "aws:ResourceTag/kubernetes.io/cluster/${ClusterName}": "owned",
      "aws:ResourceTag/topology.kubernetes.io/region": "${AWS::Region}"
    },
    "StringLike": {
      "aws:ResourceTag/karpenter.sh/nodeclass": "*"
    }
  }
}

AllowInstanceProfileReadActions

The AllowInstanceProfileReadActions Sid gives the Karpenter controller permission to perform iam:GetInstanceProfile actions to retrieve information about a specified instance profile, including determining whether an instance profile has been provisioned for an EC2NodeClass or needs to be re-provisioned.

{
  "Sid": "AllowInstanceProfileReadActions",
  "Effect": "Allow",
  "Resource": "*",
  "Action": "iam:GetInstanceProfile"
}

AllowAPIServerEndpointDiscovery

You can optionally allow the Karpenter controller to discover the Kubernetes cluster’s external API endpoint to enable EC2 nodes to successfully join the EKS cluster.

Note: If you are not using an EKS control plane, you will have to specify this endpoint explicitly. See the description of the aws.clusterEndpoint setting in the ConfigMap documentation for details.

The AllowAPIServerEndpointDiscovery Sid allows the Karpenter controller to get that information (eks:DescribeCluster) for the cluster (cluster/${ClusterName}).

{
  "Sid": "AllowAPIServerEndpointDiscovery",
  "Effect": "Allow",
  "Resource": "arn:${AWS::Partition}:eks:${AWS::Region}:${AWS::AccountId}:cluster/${ClusterName}",
  "Action": "eks:DescribeCluster"
}

Interruption Handling

Settings in this section allow the Karpenter controller to stand up an interruption queue to receive notification messages from other AWS services about the health and status of instances. For example, this interruption queue makes Karpenter aware of Spot interruption notices, which are sent 2 minutes before Spot instances are reclaimed by EC2. Adding this queue allows Karpenter to be proactive in migrating workloads to new nodes. See the Interruption section of the Disruption page for details.

Defining the KarpenterInterruptionQueuePolicy allows Karpenter to see and respond to the following:

  • AWS health events
  • Spot interruptions
  • Spot rebalance recommendations
  • Instance state changes

The resources defined in this section include:

  • KarpenterInterruptionQueue
  • KarpenterInterruptionQueuePolicy
  • ScheduledChangeRule
  • SpotInterruptionRule
  • RebalanceRule
  • InstanceStateChangeRule

KarpenterInterruptionQueue

The AWS::SQS::Queue resource is used to create an Amazon SQS standard queue. Properties of that resource set the QueueName to the name of your cluster, set the time for which SQS retains each message (MessageRetentionPeriod) to 300 seconds, and enable server-side encryption using SQS-owned encryption keys by setting SqsManagedSseEnabled to true. See SetQueueAttributes for descriptions of some of these attributes.

KarpenterInterruptionQueue:
  Type: AWS::SQS::Queue
  Properties:
    QueueName: !Sub "${ClusterName}"
    MessageRetentionPeriod: 300
    SqsManagedSseEnabled: true

KarpenterInterruptionQueuePolicy

The Karpenter interruption queue policy allows the AWS services that send instance notifications to push notification messages to the queue. The AWS::SQS::QueuePolicy resource here applies EC2InterruptionPolicy to the KarpenterInterruptionQueue. The policy allows sqs:SendMessage actions from the events.amazonaws.com and sqs.amazonaws.com services. The Resource of the statement is the ARN of the queue, which the GetAtt function reads from KarpenterInterruptionQueue.Arn.

KarpenterInterruptionQueuePolicy:
  Type: AWS::SQS::QueuePolicy
  Properties:
    Queues:
      - !Ref KarpenterInterruptionQueue
    PolicyDocument:
      Id: EC2InterruptionPolicy
      Statement:
        - Effect: Allow
          Principal:
            Service:
              - events.amazonaws.com
              - sqs.amazonaws.com
          Action: sqs:SendMessage
          Resource: !GetAtt KarpenterInterruptionQueue.Arn

Rules

The rules in this section collect events from AWS services, such as AWS Health Events, and direct them to a queue where they can be consumed by Karpenter. These rules include:

  • ScheduledChangeRule: The AWS::Events::Rule creates a rule where the EventPattern is set to send events from the aws.health source to KarpenterInterruptionQueue.

    ScheduledChangeRule:
      Type: 'AWS::Events::Rule'
      Properties:
        EventPattern:
          source:
            - aws.health
          detail-type:
            - AWS Health Event
        Targets:
          - Id: KarpenterInterruptionQueueTarget
            Arn: !GetAtt KarpenterInterruptionQueue.Arn
    
  • SpotInterruptionRule: An EC2 Spot Instance Interruption warning tells you that AWS is about to reclaim a Spot instance you are using. This rule allows Karpenter to gather EC2 Spot Instance Interruption Warning events and direct them to a queue where they can be consumed by Karpenter. In particular, the AWS::Events::Rule here creates a rule where the EventPattern is set to send events from the aws.ec2 source to KarpenterInterruptionQueue.

    SpotInterruptionRule:
      Type: 'AWS::Events::Rule'
      Properties:
        EventPattern:
          source:
            - aws.ec2
          detail-type:
            - EC2 Spot Instance Interruption Warning
        Targets:
          - Id: KarpenterInterruptionQueueTarget
            Arn: !GetAtt KarpenterInterruptionQueue.Arn
    
  • RebalanceRule: An EC2 Instance Rebalance Recommendation signal tells you that a Spot instance is at a heightened risk of being interrupted, allowing Karpenter to get new instances or simply rebalance workloads. This rule allows Karpenter to gather EC2 Instance Rebalance Recommendation signals and direct them to a queue where they can be consumed by Karpenter. In particular, the AWS::Events::Rule here creates a rule where the EventPattern is set to send events from the aws.ec2 source to KarpenterInterruptionQueue.

    RebalanceRule:
      Type: 'AWS::Events::Rule'
      Properties:
        EventPattern:
          source:
            - aws.ec2
          detail-type:
            - EC2 Instance Rebalance Recommendation
        Targets:
          - Id: KarpenterInterruptionQueueTarget
            Arn: !GetAtt KarpenterInterruptionQueue.Arn
    
  • InstanceStateChangeRule: An EC2 Instance State-change Notification signal tells you that the state of an instance has changed to one of the following states: pending, running, stopping, stopped, shutting-down, or terminated. This rule allows Karpenter to gather EC2 Instance State-change signals and direct them to a queue where they can be consumed by Karpenter. In particular, the AWS::Events::Rule here creates a rule where the EventPattern is set to send events from the aws.ec2 source to KarpenterInterruptionQueue.

    InstanceStateChangeRule:
      Type: 'AWS::Events::Rule'
      Properties:
        EventPattern:
          source:
            - aws.ec2
          detail-type:
            - EC2 Instance State-change Notification
        Targets:
          - Id: KarpenterInterruptionQueueTarget
            Arn: !GetAtt KarpenterInterruptionQueue.Arn
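
To see how the four event patterns divide incoming events, the sketch below applies a simplified version of the matching logic in Python. It assumes exact matching on top-level fields only, where an event matches a rule when each field named in the pattern equals one of the listed values. Real EventBridge patterns support more than this, and the function matching_rules and the sample event values are illustrative.

```python
# Event patterns copied from the four rules described above.
RULES = {
    "ScheduledChangeRule": {
        "source": ["aws.health"],
        "detail-type": ["AWS Health Event"],
    },
    "SpotInterruptionRule": {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Spot Instance Interruption Warning"],
    },
    "RebalanceRule": {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance Rebalance Recommendation"],
    },
    "InstanceStateChangeRule": {
        "source": ["aws.ec2"],
        "detail-type": ["EC2 Instance State-change Notification"],
    },
}


def matching_rules(event: dict) -> list:
    """Return the names of the rules whose pattern matches the event."""
    return [
        name
        for name, pattern in RULES.items()
        # Every field in the pattern must match one of its listed values.
        if all(event.get(field) in allowed for field, allowed in pattern.items())
    ]


# Sample event with made-up detail values.
spot_warning = {
    "source": "aws.ec2",
    "detail-type": "EC2 Spot Instance Interruption Warning",
    "detail": {"instance-id": "i-0123456789abcdef0", "instance-action": "terminate"},
}
print(matching_rules(spot_warning))
```

Because all three aws.ec2 rules share a source, the detail-type value is what separates a Spot interruption warning from a rebalance recommendation or a state change.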