VED-1233: Replace manual release steps #1437
- Added steps to set the Terraform workspace and manage shared Lambda triggers during blue/green deployments in the deploy-backend.yml workflow.
- Introduced a new script, manage_blue_green_event_source_mappings.sh, to handle the preparation and cleanup of event source mappings for Lambda functions.
- Updated README.md to document the new blue/green Lambda trigger handoff process, removing manual steps from the deployment flow.
This branch is working on a ticket in the NHS England VED JIRA Project. Here's a handy link to the ticket: VED-1233
…event_source_mappings.sh
- Introduced a new delete_mapping function to handle the deletion of AWS Lambda event source mappings, including a timeout mechanism for deletion confirmation.
- Updated the adopt_mapping function to use the new delete_mapping function, improving the logic for handling target and counterpart mapping UUIDs.
- Enhanced code clarity and maintainability by restructuring the mapping lookup and deletion process.
| @@ -0,0 +1,170 @@ | |||
| #!/usr/bin/env bash | |||
Can this script delete a live Lambda event source mapping before terraform apply, outside any saved plan?
This should only be used for the controlled migration. If the script fails between the adopt and the apply, Terraform state and AWS can get out of sync, and that drift will not be recorded in the artifact.
Can we move this to a dedicated migration workflow (or put it behind a one-time flag per env)?
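To make the one-time flag idea concrete, here is a rough sketch of a guard the script could call before any adopt or delete step. The `ALLOW_TRIGGER_MIGRATION` variable, the marker directory, and both function names are hypothetical; a real marker would need to live somewhere durable rather than on the runner's disk.

```shell
#!/usr/bin/env bash
# Sketch only. ALLOW_TRIGGER_MIGRATION and the marker directory are
# hypothetical names, not part of this PR.

# Succeeds only when the migration was explicitly requested for this
# environment and no marker says it has already run.
migration_allowed() {
  local environment="$1"
  local marker_dir="$2"

  if [[ "${ALLOW_TRIGGER_MIGRATION:-}" != "${environment}" ]]; then
    echo "migration not requested for ${environment}; skipping adopt step" >&2
    return 1
  fi
  if [[ -e "${marker_dir}/${environment}.migrated" ]]; then
    echo "migration already recorded for ${environment}; refusing to rerun" >&2
    return 1
  fi
  return 0
}

# Records that the migration completed for this environment.
record_migration() {
  local environment="$1"
  local marker_dir="$2"

  mkdir -p "${marker_dir}"
  date -u +%Y-%m-%dT%H:%M:%SZ > "${marker_dir}/${environment}.migrated"
}
```

The normal deploy path would then never reach the delete logic unless someone sets the flag for that specific env.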
| working-directory: infrastructure/instance | ||
| run: make workspace | ||
|
|
||
| - name: Terraform Apply |
The caller of this workflow (continuous-deployment) sets concurrency, but this workflow does not, so any other caller or future release workflow could race on the shared trigger workspace for preprod and prod.
Should we add concurrency here, maybe keyed by the shared scope?
| apply: workspace | ||
| $(tf_cmd) apply $(tf_vars) --auto-approve | ||
|
|
||
| destroy: workspace |
Does this command mean that a destroy will also destroy the mapping set shared between the blue and green envs?
Should we have an allow flag for the shared scope that determines whether the destroy passes or fails?
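Something along these lines is what I had in mind for the allow flag: a small guard that the destroy target runs first. This is a sketch under assumptions; the `ALLOW_SHARED_SCOPE_DESTROY` variable and the `shared` scope label are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch only. ALLOW_SHARED_SCOPE_DESTROY and the "shared" scope label
# are hypothetical names, not part of this PR.

# Fails unless the shared scope has been explicitly unlocked for destroy.
# Any other scope passes through untouched.
guard_destroy() {
  local scope="$1"

  if [[ "${scope}" == "shared" && "${ALLOW_SHARED_SCOPE_DESTROY:-false}" != "true" ]]; then
    echo "refusing to destroy shared event source mappings;" \
         "set ALLOW_SHARED_SCOPE_DESTROY=true to override" >&2
    return 1
  fi
  return 0
}
```

Defaulting to refuse means a routine teardown of one side cannot take the shared mappings with it by accident.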
| counterpart_id_sync_function="imms-${counterpart_sub_environment}-id-sync-lambda" | ||
| fi | ||
|
|
||
| adopt_mapping \ |
Do we need a validate step and a separate trigger-state plan artifact before terraform apply? Maybe also a tflint and basic policy pass?
| local function_name="$2" | ||
| local mapping_uuid | ||
|
|
||
| mapping_uuid="$(aws lambda list-event-source-mappings \ |
Will this be a problem after a failed cutover? Could you ever end up with duplicate, stale, disabled, or partially deleted mappings?
If so, the script should fail on ambiguity, ignore mappings in the Deleting state, log the UUIDs and current states it found, and verify the final mapping target after apply.
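As a rough illustration of the lookup behaviour described above, the function below reads `UUID<TAB>State` lines on stdin, logs everything it sees, skips `Deleting` entries, and fails when more than one candidate remains. The function name is hypothetical, and feeding it from `list-event-source-mappings` with a `--query` projection is an assumption about how the script would be wired.

```shell
#!/usr/bin/env bash
# Sketch only. resolve_mapping_uuid is a hypothetical helper name.
#
# Assumed input, one mapping per line as "UUID<TAB>State", e.g. from:
#   aws lambda list-event-source-mappings \
#     --function-name "${function_name}" \
#     --event-source-arn "${event_source_arn}" \
#     --query 'EventSourceMappings[].[UUID,State]' --output text

# Prints the single usable mapping UUID, prints nothing when there is
# none, and fails when the result is ambiguous.
resolve_mapping_uuid() {
  local uuid state
  local -a candidates=()

  while IFS=$'\t' read -r uuid state || [[ -n "${uuid:-}" ]]; do
    [[ -z "${uuid}" ]] && continue
    # Log every mapping found, including ones that will be skipped.
    echo "found mapping ${uuid} in state ${state}" >&2
    [[ "${state}" == "Deleting" ]] && continue
    candidates+=("${uuid}")
  done

  if (( ${#candidates[@]} > 1 )); then
    echo "ambiguous: ${#candidates[@]} live mappings found, refusing to pick one" >&2
    return 1
  fi
  if (( ${#candidates[@]} == 1 )); then
    printf '%s\n' "${candidates[0]}"
  fi
  return 0
}
```

Disabled mappings still count as candidates here, so a disabled leftover next to an enabled mapping is reported as ambiguous rather than silently ignored.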
| @@ -0,0 +1,4 @@ | |||
| output "id_sync_queue_arn" { | |||
Could we have outputs for both mapping UUIDs, function ARNs, and state?
Also, a documented rollback path and verification commands would help.
|
|
||
| ## Lambda Trigger Handoff | ||
|
|
||
| The `delta_trigger` and `id_sync_sqs_trigger` event source mappings are managed from `../event_source_mappings` so the main instance plan does not rewrite shared backend state. The deploy workflow applies the main instance first, then adopts or updates the trigger mappings from the dedicated trigger workspace. |
It may be worth adding a runbook for first cutover, rollback, and failed-apply recovery, including the exact AWS verification commands.
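For the verification part, the runbook could include a post-apply check along these lines. This is a sketch, not a tested command set: `verify_mapping_target` is a hypothetical helper, and it assumes the mapping UUID and the expected function name are already known (for example from Terraform outputs).

```shell
#!/usr/bin/env bash
# Sketch only. verify_mapping_target is a hypothetical helper name.

# Checks that the mapping with the given UUID points at the expected
# Lambda function and is enabled. Logs what it found either way.
verify_mapping_target() {
  local uuid="$1"
  local expected_function="$2"
  local function_arn="" state=""

  IFS=$'\t' read -r function_arn state < <(
    aws lambda get-event-source-mapping --uuid "${uuid}" \
      --query '[FunctionArn,State]' --output text
  )
  echo "mapping ${uuid}: target=${function_arn} state=${state}" >&2

  # Accept both unqualified ARNs and ARNs with an alias or version suffix.
  case "${function_arn}" in
    *":function:${expected_function}" | *":function:${expected_function}:"*) ;;
    *)
      echo "mapping ${uuid} does not target ${expected_function}" >&2
      return 1
      ;;
  esac

  if [[ "${state}" != "Enabled" ]]; then
    echo "mapping ${uuid} is ${state}, expected Enabled" >&2
    return 1
  fi
  return 0
}
```

Running this for both the delta and id-sync mappings after apply, and again after a rollback, would give the runbook a clear pass/fail signal.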
avshetty1980 left a comment
Looking really good - just a few questions and comments.
- Added a new workflow for migrating event source mappings, allowing controlled one-time migrations for specific environments.
- Updated the deploy-backend.yml workflow to include concurrency settings and additional steps for Terraform initialization, formatting, validation, and applying event source mappings.
- Refactored the Makefile to introduce new commands for formatting checks, validation, and applying Terraform plans.
- Enhanced the adopt_event_source_mappings.sh script to support verification of event source mappings and improved logging for existing mappings.
- Updated README.md to document the new migration process and rollback procedures for event source mappings.



Automated blue/green Lambda trigger handoff in deployment by adding pre-plan state adoption and pre-apply stale-trigger cleanup steps to the backend workflow.
Added manage_blue_green_event_source_mappings.sh to resolve live mapping UUIDs, re-import shared delta and id-sync event source mappings into the target Terraform workspace, and delete obsolete side-specific mappings.
This removes the manual release checklist steps (“Disable delta” and “Disable ID sync”), making releases faster and reducing risk of human error during blue/green cutovers.