Running self-hosted GitHub runners on an Auto Scaling group gives organizations high runner availability during active development, so development teams do not have to wait for capacity. Teams get flexibility comparable to GitHub-hosted runners while keeping the benefits of self-hosted runners, such as access to private networks and resources and support for custom software requirements.
Once successfully registered, the Amazon EC2 instances appear in the GitHub organization under Settings → Code, planning, and automation → Actions → Runners. The registration process was covered in detail in the note Build Secure GitHub Self-Hosted Runners on Amazon EC2 with Terraform.
However, managing the self-hosted architecture can become challenging for operations teams during scale-in events. When the Auto Scaling group terminates unused Amazon EC2 instances to control costs, the runner services on those instances remain registered in GitHub and show up as offline, orphaned runners.

The solution is to automatically deregister the self-hosted runner service before the Amazon EC2 instance terminates. This approach keeps the architecture fully automated for both scale-out and scale-in events and maintains a clean GitHub runner inventory. This post demonstrates an event-driven implementation that deregisters the runner when an Amazon EC2 instance is terminated.
Since this is the continuation of the previous post, where we learned how to register EC2 instances as self-hosted runners (link shared above), I highly recommend you familiarize yourself with the implementation. The code to deploy this infrastructure is available at the GitHub repository: kunduso-org/github-self-hosted-runner-amazon-ec2-terraform.
Solution Overview
This solution uses an event-driven architecture to deregister GitHub runners when EC2 instances terminate. The process works for both Auto Scaling Group scale-in events and manual instance terminations.

The deregistration flow:
1. Termination Trigger: Auto Scaling Group initiates instance termination, or manual termination occurs
2. Lifecycle Hook Activation: ASG lifecycle hook pauses the termination process and sends a notification to SNS
3. Lambda Function Invocation: SNS notification triggers the Lambda function with instance details
4. GitHub API Deregistration: Lambda authenticates with GitHub App credentials and removes the runner
5. Lifecycle Completion: Lambda signals ASG to proceed with instance termination
6. Monitoring & Logging: All events are logged to CloudWatch for tracking and troubleshooting
Prerequisites
This use case requires three prerequisites. These are:
PreReq-1. Administrative access to an AWS account to deploy the AWS cloud resources and configure IAM roles for GitHub Actions integration
PreReq-2. Administrative access to a GitHub organization or enterprise to create GitHub Apps and manage self-hosted runners
PreReq-3. All resources from the previous note have been deployed, and EC2 instances are registered with GitHub self-hosted runners
Implementation
The Solution Overview section above can be broken down into the following implementation steps to create the AWS cloud services:
1. Add a lifecycle hook to the Auto Scaling Group
2. Create an SNS topic to capture notifications from the Auto Scaling Group
3. Deploy an AWS Lambda function to handle the deregistration process
4. Configure lifecycle completion to notify the Auto Scaling Group when Lambda finishes
Let us understand them in detail.
Implementation-1: Add a lifecycle hook to the Auto Scaling Group
A lifecycle hook is an AWS Auto Scaling feature that pauses the scaling process and enables custom actions before an instance is launched or terminated. Using a lifecycle hook, applications can intercept instance state changes and execute additional logic, such as graceful application shutdown or cleanup tasks.

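In this use case, when the ASG receives a termination/scale-in request, it triggers the lifecycle hook, which pauses the termination process for up to 300 seconds (the hook's timeout) and sends a notification to an SNS topic. The pause ends earlier if the hook is completed before the timeout expires.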
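The repository defines the hook in Terraform. Purely to illustrate which settings are involved, here is a minimal Python sketch of the equivalent Auto Scaling API call. The hook name, the `DefaultResult` value, and the function names are assumptions made for this example and are not taken from the repository.

```python
def build_terminating_hook(asg_name, topic_arn, role_arn, timeout_seconds=300):
    """Return keyword arguments for the Auto Scaling put_lifecycle_hook call."""
    return {
        "LifecycleHookName": "deregister-github-runner",  # hypothetical name
        "AutoScalingGroupName": asg_name,
        # Fire the hook when an instance is about to be terminated.
        "LifecycleTransition": "autoscaling:EC2_INSTANCE_TERMINATING",
        # How long the instance waits before the default result applies.
        "HeartbeatTimeout": timeout_seconds,
        # Assumption: let termination continue if nothing completes the hook.
        "DefaultResult": "CONTINUE",
        # SNS topic that receives the notification, and the role the ASG
        # assumes to publish to it.
        "NotificationTargetARN": topic_arn,
        "RoleARN": role_arn,
    }


def create_hook(params):
    """Create or update the hook; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above has no dependencies

    boto3.client("autoscaling").put_lifecycle_hook(**params)
```

Keeping the parameter builder separate from the API call makes the settings easy to inspect without touching AWS.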
Implementation-2: Create an SNS topic to capture notifications from the Auto Scaling Group
Amazon SNS (Simple Notification Service) is a fully managed messaging service that delivers messages from publishers to subscribers through topics. In this architecture, SNS acts as the communication bridge between the Auto Scaling Group lifecycle hook and the Lambda function.

It receives the termination notifications with essential payload data, including the instance ID, lifecycle hook name, Auto Scaling Group name, and lifecycle action token required for the deregistration process.
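As a rough sketch of how a Lambda function might pull those fields out of the SNS event, the snippet below parses the JSON message that the lifecycle hook publishes. The function name and the keys of the returned dictionary are hypothetical; the message field names follow the Auto Scaling lifecycle notification format.

```python
import json

TERMINATING = "autoscaling:EC2_INSTANCE_TERMINATING"


def extract_lifecycle_details(event):
    """Return the lifecycle fields from an SNS-triggered Lambda event, or None."""
    # SNS delivers the lifecycle notification as a JSON string in "Message".
    message = json.loads(event["Records"][0]["Sns"]["Message"])
    if message.get("LifecycleTransition") != TERMINATING:
        # For example, the test notification sent when a hook is created.
        return None
    return {
        "instance_id": message["EC2InstanceId"],
        "hook_name": message["LifecycleHookName"],
        "asg_name": message["AutoScalingGroupName"],
        "token": message["LifecycleActionToken"],
    }
```

Returning `None` for anything other than a terminating transition lets the handler exit early instead of attempting a deregistration with incomplete data.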
Implementation-3: Deploy an AWS Lambda function to handle the deregistration process
The AWS Lambda function in this architecture is triggered by the SNS notification and runs Python code that authenticates with GitHub using the App credentials stored in Secrets Manager. The function generates a JWT, exchanges it for a GitHub installation access token, searches for the runner by instance ID (taken from the payload), and removes it from the organization using the GitHub API. It also logs the entire deregistration process to CloudWatch for monitoring and troubleshooting. This is the same CloudWatch log group that was used for registration in the previous note on registering Amazon EC2 instances as self-hosted GitHub runners.
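The authentication and removal steps can be sketched as follows. This is an illustration under stated assumptions, not the repository's code: all function names are hypothetical, the runner is assumed to have been registered with a name that contains the EC2 instance ID, and only the first page of runners is checked. The REST endpoints are the documented GitHub ones for installation tokens and organization runners.

```python
import json
import time
import urllib.request

API = "https://api.github.com"


def build_app_jwt_claims(app_id, now=None):
    """Claims for a GitHub App JWT: issued 60s in the past, valid ~9 minutes."""
    now = int(time.time()) if now is None else now
    return {"iat": now - 60, "exp": now + 540, "iss": str(app_id)}


def sign_app_jwt(claims, private_key_pem):
    """Sign the claims with the App's private key (RS256)."""
    import jwt  # PyJWT, assumed to be provided by the Lambda layer

    return jwt.encode(claims, private_key_pem, algorithm="RS256")


def github_request(method, path, token):
    """Call the GitHub REST API and return the decoded JSON body, if any."""
    req = urllib.request.Request(
        API + path,
        method=method,
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        body = resp.read()
    return json.loads(body) if body else None


def find_runner_id(runners, instance_id):
    """Assumption: the runner name contains the EC2 instance ID."""
    for runner in runners:
        if instance_id in runner["name"]:
            return runner["id"]
    return None


def deregister(org, installation_id, instance_id, app_jwt):
    """Exchange the App JWT for an installation token and delete the runner."""
    token = github_request(
        "POST", f"/app/installations/{installation_id}/access_tokens", app_jwt
    )["token"]
    # Only the first 100 runners are listed here; larger fleets need paging.
    runners = github_request(
        "GET", f"/orgs/{org}/actions/runners?per_page=100", token
    )["runners"]
    runner_id = find_runner_id(runners, instance_id)
    if runner_id is None:
        return False  # nothing to remove
    github_request("DELETE", f"/orgs/{org}/actions/runners/{runner_id}", token)
    return True
```

Backdating `iat` by a minute guards against clock drift between Lambda and GitHub, and the expiry stays under the ten-minute limit GitHub places on App JWTs.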

You will notice that the Lambda function uses layers here. That is because the function requires external Python dependencies, namely the PyJWT and cryptography libraries, for GitHub App authentication and JWT generation. Lambda layers provide a way to package and share these dependencies separately from the function code, which keeps the function's deployment package small and enables reuse across multiple functions. Note that the layer contents are still loaded when the function initializes, so layers should not be expected to shorten cold starts on their own; the benefit is in packaging and reuse. For a detailed guide on creating Lambda layers with Docker and Terraform, refer to this note: Create AWS Lambda Layer using Docker, Terraform, and GitHub Actions.
Implementation-4: Configure lifecycle completion to notify the Auto Scaling Group when Lambda finishes
This task occurs via the Python code executing in the Lambda function.

The final critical step notifies the Auto Scaling Group that the deregistration process has completed. The Lambda function calls complete_lifecycle_action() with the lifecycle hook details received from the SNS payload: the lifecycle hook name, Auto Scaling Group name, instance ID, and lifecycle action token. The LifecycleActionResult parameter is set to 'CONTINUE' on successful deregistration or 'ABANDON' on failure. For a terminating hook, the ASG proceeds with instance termination in both cases; 'ABANDON' additionally stops any remaining lifecycle actions. Without this completion call, the instance would stay in the Terminating:Wait state until the hook's timeout expires.
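A minimal sketch of that completion call is shown below. The function name and the keys of the `details` dictionary are hypothetical; the keyword arguments are those of the boto3 Auto Scaling client's `complete_lifecycle_action` method. The client is passed in as an argument so the function can be exercised without AWS access.

```python
def complete_lifecycle(autoscaling_client, details, success):
    """Tell the ASG the hook is finished and return the result that was sent."""
    result = "CONTINUE" if success else "ABANDON"
    autoscaling_client.complete_lifecycle_action(
        LifecycleHookName=details["hook_name"],
        AutoScalingGroupName=details["asg_name"],
        LifecycleActionToken=details["token"],
        LifecycleActionResult=result,
        InstanceId=details["instance_id"],
    )
    return result
```

In a Lambda handler this would typically be called as `complete_lifecycle(boto3.client("autoscaling"), details, success)` inside a `finally` block, so the hook is completed even when the GitHub call raises an exception.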
These are all the critical components in understanding the deregistration process.
Deployment and Validation
Once all the code was ready, it was deployed via GitHub Actions using the pipeline code in the .github/workflows folder. The workflow is designed to run the command terraform apply only when changes are merged into the main branch, using OIDC authentication for secure, temporary AWS credentials. This ensures all infrastructure modifications are reviewed through pull requests, maintaining best practices for CI/CD and infrastructure as code. To learn more about executing Terraform via GitHub Actions to provision AWS cloud resources, please check CI-CD with Terraform and GitHub Actions to deploy to AWS.
After deployment, the deregistration system can be validated by monitoring CloudWatch logs during instance termination events and confirming runners are correctly removed from the GitHub organization.

For each instance, there are three log streams under the /github-self-hosted-runner/lifecycle log group. These streams are labeled ${instanceID}/registration, ${instanceID}/execution, and ${instanceID}/deregistration. The registration stream captures the initial runner setup process, the execution stream logs runtime activities, and the deregistration stream tracks the Lambda-driven cleanup process when instances are terminated.
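To make the naming scheme concrete, here is a small sketch of how a deregistration log entry could be assembled for the CloudWatch Logs `put_log_events` call. The function name and the JSON fields inside the message are assumptions for illustration; only the log group and stream naming come from the description above.

```python
import json
import time

LOG_GROUP = "/github-self-hosted-runner/lifecycle"


def build_log_call(instance_id, status, message, now_ms=None):
    """Return keyword arguments for the CloudWatch Logs put_log_events call."""
    # CloudWatch Logs expects timestamps in milliseconds since the epoch.
    timestamp = int(time.time() * 1000) if now_ms is None else now_ms
    return {
        "logGroupName": LOG_GROUP,
        "logStreamName": f"{instance_id}/deregistration",
        "logEvents": [
            {
                "timestamp": timestamp,
                # Hypothetical structured payload; one JSON object per event.
                "message": json.dumps(
                    {"instance_id": instance_id, "status": status, "message": message}
                ),
            }
        ],
    }
```

The resulting dictionary can be passed as `boto3.client("logs").put_log_events(**call)` once the log stream exists.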
Here is a screenshot of a log stream with deregistration messages showing the complete lifecycle from hook trigger to successful runner removal and lifecycle completion.

Security Best Practices
The deregistration architecture introduces additional security considerations beyond the foundational security measures covered in the previous note. These are:
Event-Driven Security: SNS topics and Lambda functions use customer-managed KMS keys to encrypt lifecycle event data at rest, so sensitive instance information stays protected throughout the deregistration process
Lambda Network Isolation: The deregistration Lambda function operates within VPC private subnets with no direct internet access, using the NAT Gateway for both AWS service communication and GitHub API calls
Audit Trail Enhancement: Structured CloudWatch logging captures all deregistration events with timestamps and status codes, enabling security monitoring and compliance reporting for runner lifecycle management
These measures complement the existing security framework while ensuring the automated deregistration process maintains the same security posture as the registration infrastructure.
Conclusion
This event-driven deregistration solution eliminates orphaned GitHub runners through automated lifecycle management using Auto Scaling Group hooks, SNS notifications, and Lambda functions. The architecture reduces operational overhead and maintains clean runner inventories without manual intervention.
Combined with the registration process from the previous post, this creates a fully automated, production-ready self-hosted runner infrastructure that scales seamlessly while maintaining security best practices and comprehensive monitoring capabilities.