feat(tfe-job-agent): webhook-based TFC job agent#847
Conversation
Replace polling-based TFE dispatcher with fire-and-forget webhook model: - Go dispatcher: upsert workspace, sync variables, ensure notification config, create run, return immediately - API webhook route: POST /api/tfe/webhook with HMAC-SHA512 verification - Status mapping: TFC notification triggers → ctrlplane job statuses - Job correlation: extract job UUID from TFC run message (same pattern as GitHub webhook handler) - Links/metadata set by webhook handler from TFC payload data - OpenAPI schema: add webhookUrl (required) and triggerRunOnChange (optional) to TerraformCloudJobAgentConfig - Fix workspace-engine build: main.go → . in package.json Tested end-to-end locally against TFC via smee.io webhook tunnel.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
📝 WalkthroughWalkthroughAdds Terraform Cloud webhook handling and a Terraform Cloud job agent: new POST /api/tfe/webhook with HMAC-SHA512 verification, notification→JobStatus mapping and job/metadata updates; workspace-engine gains a TFE job agent (workspace upsert, variable sync, optional run creation, webhook config). Tests and docs updated. Changes
Sequence Diagram(s)sequenceDiagram
participant TFC as Terraform Cloud
participant API as Ctrlplane API (/api/tfe/webhook)
participant Verify as verifySignature
participant Handler as handleRunNotification
participant DB as Database (schema.job / schema.jobMetadata)
participant Queue as Reconciliation Queue
TFC->>API: POST /webhook (notifications + x-tfe-notification-signature)
API->>Verify: compute HMAC-SHA512 & timing-safe compare
alt valid signature
Verify-->>API: ok
API->>Handler: forward payload (notifications)
alt notifications non-empty
Handler->>DB: update job (status, externalId, timestamps, message)
Handler->>DB: upsert jobMetadata (links, run_url)
Handler->>DB: resolve workspaceId via joins
alt workspaceId found
Handler->>Queue: enqueueAllReleaseTargetsDesiredVersion(workspaceId)
end
else notifications empty
Handler-->>API: no-op (200)
end
API-->>TFC: 200 OK
else invalid/missing signature
Verify-->>API: fail
API-->>TFC: 401 Unauthorized
end
alt handler error
API-->>TFC: 500 Internal Server Error (error message)
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 6
🧹 Nitpick comments (4)
apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/config.go (1)
47-48: String parsing fortriggerRunOnChangeonly recognizes lowercase "false".The current logic
val != "false"means values like"FALSE","False", or"0"will be interpreted astrue. Consider case-insensitive comparison or usingstrconv.ParseBoolfor more robust boolean parsing.♻️ Suggested improvement
+import "strings" + case string: - triggerRunOnChange = val != "false" + triggerRunOnChange = strings.ToLower(val) != "false"Or for full boolean string support:
import "strconv" case string: parsed, err := strconv.ParseBool(val) if err == nil { triggerRunOnChange = parsed }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/config.go` around lines 47 - 48, The string branch that sets triggerRunOnChange currently uses a case-sensitive comparison (val != "false"), so values like "False" or "0" are mis-parsed; update the case string branch to use strconv.ParseBool to perform robust, case-insensitive boolean parsing and only set triggerRunOnChange when ParseBool returns no error, and add the strconv import; locate the code around the case string handling of triggerRunOnChange to apply this change.apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe.go (1)
74-74:os.Getenvcalled on every dispatch.Reading
TFE_WEBHOOK_SECRETfrom the environment on each dispatch works but adds minor overhead. If runtime secret rotation isn't required, consider reading it once at construction time. Current approach is valid if dynamic secret updates are desired.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe.go` at line 74, The code currently calls os.Getenv("TFE_WEBHOOK_SECRET") on every dispatch (webhookSecret variable); change this to read the secret once during the TFE dispatcher/controller construction and store it on the struct (e.g., add a field on the TFE client/dispatcher type), then have the dispatch handler use that stored value instead of calling os.Getenv each time; keep the current runtime behavior only if you intentionally want dynamic secret rotation.apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe_test.go (1)
71-82: Consider adding a test case fortriggerRunOnChange: "true"as a string.The tests cover
"false"as a string but not"true". Given the parsing logic treats any non-"false"string as true, adding explicit coverage for the"true"string case would improve test clarity.🧪 Suggested test case
t.Run("string true", func(t *testing.T) { cfg := oapi.JobAgentConfig{ "address": "https://app.terraform.io", "token": "t", "organization": "o", "template": "t", "triggerRunOnChange": "true", } parsed, err := parseJobAgentConfig(cfg) require.NoError(t, err) assert.True(t, parsed.triggerRunOnChange) })🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe_test.go` around lines 71 - 82, Add a test case in tfe_test.go that mirrors the existing "string false" test but sets "triggerRunOnChange" to the string "true"; call parseJobAgentConfig(cfg), assert no error with require.NoError, and assert that parsed.triggerRunOnChange is true (suggested subtest name "string true") so parseJobAgentConfig's string-to-bool handling is explicitly covered.apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go (1)
318-319: Category validation error silently ignored intoCreateOptions/toUpdateOptions.The
categoryType()error is discarded with_. WhilesyncVariablesvalidates categories before calling these methods, direct use oftoCreateOptions/toUpdateOptionscould silently use an empty category. Consider returning an error or validating at the struct level.Also applies to: 330-331
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go` around lines 318 - 319, The methods VariableTemplate.toCreateOptions and toUpdateOptions currently ignore the error returned by categoryType(), which can mask invalid categories; change their signatures to return (tfe.VariableCreateOptions, error), call categoryType() and if it returns an error return that error instead of discarding it, and update callers (e.g., syncVariables) to handle the error; alternatively, perform validation earlier when constructing VariableTemplate (validate via categoryType in the constructor or a new Validate method) and have toCreateOptions/toUpdateOptions assume a validated struct—reference the methods VariableTemplate.toCreateOptions, VariableTemplate.toUpdateOptions, categoryType, and syncVariables when making the change.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@apps/api/src/config.ts`:
- Line 35: The TFE_WEBHOOK_SECRET environment schema currently allows an empty
string which becomes a valid HMAC key; update the zod schema for
TFE_WEBHOOK_SECRET in config.ts to normalize blank strings to undefined (e.g.,
use z.preprocess((val) => val === "" ? undefined : val, z.string().optional()))
so that an empty env value disables the webhook feature consistently with the
router checks (routes/tfe index uses undefined/null guarding).
In `@apps/api/src/routes/tfe/index.ts`:
- Around line 20-24: The HMAC is being computed from JSON.stringify(req.body)
which can differ from the original payload; capture the raw request bytes for
the /api/tfe route (e.g., using express.json's verify option to save the raw
buffer to req.rawBody or using express.raw for this webhook route) and then use
that raw buffer when calling crypto.createHmac(...).update(...).digest(...)
instead of JSON.stringify(req.body); update the code that computes expected (the
createHmac/update call that currently uses body and secret) to use the saved raw
body (e.g., req.rawBody) and keep the rest of the signature comparison logic the
same.
In `@apps/api/src/routes/tfe/run_notification.ts`:
- Around line 113-114: The call to enqueueAllReleaseTargetsDesiredVersion in
run_notification.ts is currently fire-and-forget and can cause unhandled promise
rejections; change the invocation to await the promise so failures propagate to
the async handler (i.e., replace the direct call to
enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId) with an awaited
call: await enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId)) so
any queue write errors are properly caught by the surrounding async flow.
In `@apps/workspace-engine/oapi/spec/schemas/jobs.jsonnet`:
- Around line 147-154: The schema currently lists webhookUrl as required in
apps/workspace-engine/oapi/spec/schemas/jobs.jsonnet but runtime/validation
treat webhookUrl as optional; to fix, either remove "webhookUrl" from the
required array in that schema (update the required:
['address','organization','token','template','webhookUrl'] to exclude
webhookUrl) or add a DB migration that backfills a safe default for all existing
TFC job configs before making webhookUrl required (write a migration that
updates persisted TFC agent configs to include a non-null webhookUrl value,
ensure the migration runs in this PR, and update any tests and validation logic
to reflect the chosen approach so schema, runtime validation, and persisted data
remain consistent).
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go`:
- Around line 173-176: NotificationConfigurations.List returns paginated
results, so change the call in workspace.go to iterate through all pages and
accumulate configurations (similar to how syncVariables handles pagination)
instead of only using the first page; repeatedly call
NotificationConfigurations.List with the appropriate pagination/options until no
more pages, search the combined list for the "ctrlplane-webhook" config to avoid
creating duplicates, and return/use the aggregated configs variable throughout
the surrounding logic that currently assumes a single-page response.
- Around line 102-105: The call to client.Variables.List in workspace.go only
fetches the first page of variables so workspaces with >20 variables aren't
fully synced; update the logic that calls client.Variables.List (the variable
listing block) to iterate through all pages returned by the API (use the
pager/next page mechanism returned by the SDK, calling List repeatedly or using
the provided pager helper) and accumulate variables into existingVars before
proceeding with update/delete logic; ensure you stop when there are no more
pages and propagate any errors encountered during pagination.
---
Nitpick comments:
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/config.go`:
- Around line 47-48: The string branch that sets triggerRunOnChange currently
uses a case-sensitive comparison (val != "false"), so values like "False" or "0"
are mis-parsed; update the case string branch to use strconv.ParseBool to
perform robust, case-insensitive boolean parsing and only set triggerRunOnChange
when ParseBool returns no error, and add the strconv import; locate the code
around the case string handling of triggerRunOnChange to apply this change.
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe_test.go`:
- Around line 71-82: Add a test case in tfe_test.go that mirrors the existing
"string false" test but sets "triggerRunOnChange" to the string "true"; call
parseJobAgentConfig(cfg), assert no error with require.NoError, and assert that
parsed.triggerRunOnChange is true (suggested subtest name "string true") so
parseJobAgentConfig's string-to-bool handling is explicitly covered.
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe.go`:
- Line 74: The code currently calls os.Getenv("TFE_WEBHOOK_SECRET") on every
dispatch (webhookSecret variable); change this to read the secret once during
the TFE dispatcher/controller construction and store it on the struct (e.g., add
a field on the TFE client/dispatcher type), then have the dispatch handler use
that stored value instead of calling os.Getenv each time; keep the current
runtime behavior only if you intentionally want dynamic secret rotation.
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go`:
- Around line 318-319: The methods VariableTemplate.toCreateOptions and
toUpdateOptions currently ignore the error returned by categoryType(), which can
mask invalid categories; change their signatures to return
(tfe.VariableCreateOptions, error), call categoryType() and if it returns an
error return that error instead of discarding it, and update callers (e.g.,
syncVariables) to handle the error; alternatively, perform validation earlier
when constructing VariableTemplate (validate via categoryType in the constructor
or a new Validate method) and have toCreateOptions/toUpdateOptions assume a
validated struct—reference the methods VariableTemplate.toCreateOptions,
VariableTemplate.toUpdateOptions, categoryType, and syncVariables when making
the change.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 754d8290-955a-43bf-942b-8c2b1d6111a8
📒 Files selected for processing (15)
.env.composeapps/api/src/config.tsapps/api/src/routes/tfe/__tests__/run_notification.test.tsapps/api/src/routes/tfe/__tests__/webhook_router.test.tsapps/api/src/routes/tfe/index.tsapps/api/src/routes/tfe/run_notification.tsapps/api/src/server.tsapps/workspace-engine/oapi/spec/schemas/jobs.jsonnetapps/workspace-engine/package.jsonapps/workspace-engine/svc/controllers/jobdispatch/controller.goapps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/config.goapps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe.goapps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe_test.goapps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.godocs/integrations/job-agents/terraform-cloud.mdx
| GITHUB_BOT_APP_ID: z.string().optional(), | ||
| GITHUB_BOT_PRIVATE_KEY: z.string().optional(), | ||
| GITHUB_WEBHOOK_SECRET: z.string().optional(), | ||
| TFE_WEBHOOK_SECRET: z.string().optional(), |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
# First, let's check the config.ts file around line 35
fd -t f "config.ts" apps/api/src | head -5Repository: ctrlplanedev/ctrlplane
Length of output: 88
🏁 Script executed:
# Then read the specific file with context around line 35
cat -n apps/api/src/config.ts | sed -n '30,50p'Repository: ctrlplanedev/ctrlplane
Length of output: 815
🏁 Script executed:
# Check the routes/tfe/index.ts file to see how the secret is used
find apps/api/src -name "index.ts" -path "*/routes/tfe/*" -type fRepository: ctrlplanedev/ctrlplane
Length of output: 98
🏁 Script executed:
# Check .env.compose for context
find . -name ".env*compose*" -type f | head -10Repository: ctrlplanedev/ctrlplane
Length of output: 80
🏁 Script executed:
# Read the TFE route file to see how the secret is validated
cat -n apps/api/src/routes/tfe/index.tsRepository: ctrlplanedev/ctrlplane
Length of output: 1919
🏁 Script executed:
# Read .env.compose to see how TFE_WEBHOOK_SECRET is configured
cat -n ./.env.composeRepository: ctrlplanedev/ctrlplane
Length of output: 1322
Normalize blank webhook secrets to undefined.
The schema accepts TFE_WEBHOOK_SECRET= as a valid empty string, and the router only guards against null/undefined (line 15 of routes/tfe/index.ts). A blank env entry becomes a valid HMAC key instead of disabling the feature, allowing anyone to compute a valid signature using an empty secret.
Suggested fix
- TFE_WEBHOOK_SECRET: z.string().optional(),
+ TFE_WEBHOOK_SECRET: z.preprocess(
+ (value) =>
+ typeof value === "string" && value.trim() === "" ? undefined : value,
+ z.string().min(1).optional(),
+ ),📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| TFE_WEBHOOK_SECRET: z.string().optional(), | |
| TFE_WEBHOOK_SECRET: z.preprocess( | |
| (value) => | |
| typeof value === "string" && value.trim() === "" ? undefined : value, | |
| z.string().min(1).optional(), | |
| ), |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/api/src/config.ts` at line 35, The TFE_WEBHOOK_SECRET environment schema
currently allows an empty string which becomes a valid HMAC key; update the zod
schema for TFE_WEBHOOK_SECRET in config.ts to normalize blank strings to
undefined (e.g., use z.preprocess((val) => val === "" ? undefined : val,
z.string().optional())) so that an empty env value disables the webhook feature
consistently with the router checks (routes/tfe index uses undefined/null
guarding).
There was a problem hiding this comment.
It's also the exact same pattern as GITHUB_WEBHOOK_SECRET on line 34. Fixing one without the other would be inconsistent.
There was a problem hiding this comment.
@zacharyblasczyk, that's a fair point — the same empty-string concern applies to GITHUB_WEBHOOK_SECRET as well. For true consistency, the fix should be applied to both:
- GITHUB_WEBHOOK_SECRET: z.string().optional(),
- TFE_WEBHOOK_SECRET: z.string().optional(),
+ GITHUB_WEBHOOK_SECRET: z.preprocess(
+ (value) =>
+ typeof value === "string" && value.trim() === "" ? undefined : value,
+ z.string().min(1).optional(),
+ ),
+ TFE_WEBHOOK_SECRET: z.preprocess(
+ (value) =>
+ typeof value === "string" && value.trim() === "" ? undefined : value,
+ z.string().min(1).optional(),
+ ),That said, if the project's convention is to leave webhook secrets as plain z.string().optional() and rely on runtime guards in the route handlers instead, I understand treating both consistently with the existing pattern. The risk is low as long as developers know not to set these variables to empty strings in production.
( ͡° ͜ʖ ͡°) 🐇
| if (result?.workspaceId != null) | ||
| enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId); |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
cat -n apps/api/src/routes/tfe/run_notification.ts | head -150Repository: ctrlplanedev/ctrlplane
Length of output: 4556
🏁 Script executed:
rg "enqueueAllReleaseTargetsDesiredVersion" --type ts -B 5 -A 10Repository: ctrlplanedev/ctrlplane
Length of output: 19957
🏁 Script executed:
fd "version-manager" -e ts -o | head -5Repository: ctrlplanedev/ctrlplane
Length of output: 182
🏁 Script executed:
sed -n '37,115p' apps/api/src/routes/tfe/run_notification.tsRepository: ctrlplanedev/ctrlplane
Length of output: 2392
Await the desired-version enqueue.
This promise is currently fire-and-forget, so a queue write failure will result in an unhandled rejection. Since the handler is async, await it here:
Suggested fix
if (result?.workspaceId != null)
- enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId);
+ await enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| if (result?.workspaceId != null) | |
| enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId); | |
| if (result?.workspaceId != null) | |
| await enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId); |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@apps/api/src/routes/tfe/run_notification.ts` around lines 113 - 114, The call
to enqueueAllReleaseTargetsDesiredVersion in run_notification.ts is currently
fire-and-forget and can cause unhandled promise rejections; change the
invocation to await the promise so failures propagate to the async handler
(i.e., replace the direct call to enqueueAllReleaseTargetsDesiredVersion(db,
result.workspaceId) with an awaited call: await
enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId)) so any queue
write errors are properly caught by the surrounding async flow.
There was a problem hiding this comment.
This is intentional — it's an enqueue operation (write to a queue table) that's designed to be best-effort. If it fails, the reconciler will pick it up on the next cycle anyway. Adding await here would make the webhook response wait on the enqueue write, and if it throws, the webhook would return 500 to TFC causing unnecessary retries for a non-critical side effect.
There was a problem hiding this comment.
@zacharyblasczyk, that's a reasonable design choice — keeping the webhook response fast and not penalizing TFC for a non-critical side effect makes sense.
That said, there's still a risk of an unhandled promise rejection if the enqueue fails, which can crash or warn the Node.js process even though the behavior is intentional. The idiomatic fix for intentional fire-and-forget is to explicitly mark it with void and attach a .catch() to swallow/log the error:
if (result?.workspaceId != null)
void enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId).catch(
(err) => logger.warn("Failed to enqueue desired version (best-effort)", { err }),
);This way:
- The webhook handler returns immediately (no awaiting).
- No 500 is ever returned to TFC for this side effect.
- The promise rejection is explicitly handled, preventing unhandled rejection warnings/crashes.
- The warning log gives you observability if the enqueue fails.
✏️ Learnings added
Learnt from: zacharyblasczyk
Repo: ctrlplanedev/ctrlplane PR: 847
File: apps/api/src/routes/tfe/run_notification.ts:115-116
Timestamp: 2026-03-20T16:24:22.839Z
Learning: In apps/api/src/routes/tfe/run_notification.ts, the call to `enqueueAllReleaseTargetsDesiredVersion` at the end of `handleRunNotification` is intentionally fire-and-forget. The enqueue is a best-effort write to a queue table; if it fails, the reconciler picks it up on the next cycle. Awaiting it would delay the webhook response and could return 500 to TFC causing unnecessary retries. The preferred pattern here is `void enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId).catch(err => logger.warn(...))` to prevent unhandled rejections while keeping non-blocking behavior.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: adityachoudhari26
Repo: ctrlplanedev/ctrlplane PR: 627
File: apps/event-worker/src/workers/job-update/trigger-dependendent-targets.ts:0-0
Timestamp: 2025-07-23T22:42:20.933Z
Learning: In apps/event-worker/src/workers/job-update/trigger-dependendent-targets.ts, when implementing error handling for the triggerDependentTargets function, only a try-catch wrapper is needed around the entire function body. Additional null checks after takeFirst() calls are unnecessary since takeFirst throws an error if no record is found.
Learnt from: adityachoudhari26
Repo: ctrlplanedev/ctrlplane PR: 501
File: apps/webservice/src/app/api/v1/resources/route.ts:69-72
Timestamp: 2025-04-22T04:35:36.016Z
Learning: In Node.js applications, non-awaited Promises that reject after a function returns won't be caught by the outer try-catch block, potentially causing unhandled rejections. Either explicitly mark them with `void` and add `.catch()` handlers, or use `await` to ensure proper error handling.
apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go
Outdated
Show resolved
Hide resolved
| configs, err := client.NotificationConfigurations.List(ctx, workspaceID, nil) | ||
| if err != nil { | ||
| return fmt.Errorf("failed to list notification configs: %w", err) | ||
| } |
There was a problem hiding this comment.
Pagination not handled when listing notification configurations.
Similar to syncVariables, NotificationConfigurations.List() returns paginated results. If a workspace has many notification configs, the ctrlplane-webhook config might not be found on the first page, causing duplicate configs to be created.
🔧 Suggested fix: paginate through all configs
func ensureNotificationConfig(
ctx context.Context,
client *tfe.Client,
workspaceID, webhookURL, webhookSecret string,
) error {
- configs, err := client.NotificationConfigurations.List(ctx, workspaceID, nil)
- if err != nil {
- return fmt.Errorf("failed to list notification configs: %w", err)
- }
-
- for _, cfg := range configs.Items {
+ opts := &tfe.NotificationConfigurationListOptions{}
+ for {
+ configs, err := client.NotificationConfigurations.List(ctx, workspaceID, opts)
+ if err != nil {
+ return fmt.Errorf("failed to list notification configs: %w", err)
+ }
+ for _, cfg := range configs.Items {
+ if cfg.Name != notificationConfigName {
+ continue
+ }
+ // ... existing matching logic
+ }
+ if configs.NextPage == 0 {
+ break
+ }
+ opts.PageNumber = configs.NextPage
+ }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go`
around lines 173 - 176, NotificationConfigurations.List returns paginated
results, so change the call in workspace.go to iterate through all pages and
accumulate configurations (similar to how syncVariables handles pagination)
instead of only using the first page; repeatedly call
NotificationConfigurations.List with the appropriate pagination/options until no
more pages, search the combined list for the "ctrlplane-webhook" config to avoid
creating duplicates, and return/use the aggregated configs variable throughout
the surrounding logic that currently assumes a single-page response.
There was a problem hiding this comment.
A TFC workspace has at most a handful of notification configs — the UI itself only supports a small number. You'd need 20+ notification configurations on a single workspace to hit a second page, which is unrealistic. Not worth the code complexity.
There was a problem hiding this comment.
@zacharyblasczyk, that's a fair point — TFC's own UI enforces a low practical ceiling on notification configs per workspace, making pagination a non-issue here. Thanks for the context!
ʕ •ᴥ•ʔ
✏️ Learnings added
Learnt from: zacharyblasczyk
Repo: ctrlplanedev/ctrlplane PR: 847
File: apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go:179-182
Timestamp: 2026-03-20T17:07:42.750Z
Learning: In `apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go`, `ensureNotificationConfig` intentionally does not paginate `NotificationConfigurations.List`. TFC workspaces support only a handful of notification configs (the UI itself enforces a small limit), so a workspace would never have 20+ configs needed to trigger a second page. Pagination would add unnecessary complexity.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: zacharyblasczyk
Repo: ctrlplanedev/ctrlplane PR: 847
File: apps/api/src/routes/tfe/run_notification.ts:115-116
Timestamp: 2026-03-20T16:24:22.839Z
Learning: In apps/api/src/routes/tfe/run_notification.ts, the call to `enqueueAllReleaseTargetsDesiredVersion` at the end of `handleRunNotification` is intentionally fire-and-forget. The enqueue is a best-effort write to a queue table; if it fails, the reconciler picks it up on the next cycle. Awaiting it would delay the webhook response and could return 500 to TFC causing unnecessary retries. The preferred pattern here is `void enqueueAllReleaseTargetsDesiredVersion(db, result.workspaceId).catch(err => logger.warn(...))` to prevent unhandled rejections while keeping non-blocking behavior.
Learnt from: CR
Repo: ctrlplanedev/ctrlplane PR: 0
File: apps/workspace-engine/CLAUDE.md:0-0
Timestamp: 2025-08-12T18:13:54.630Z
Learning: Applies to apps/workspace-engine/**/*.go : Write comments that explain why, document complex logic/algorithms, provide non-obvious context, include TODO/FIXME, and document exported functions/types/methods
- Use COALESCE for startedAt so run:applying doesn't overwrite the original start time set by run:planning - Paginate variable listing to handle workspaces with >20 variables - Always update notification config (URL + token) so secret rotation propagates without manual intervention
There was a problem hiding this comment.
Actionable comments posted: 3
🧹 Nitpick comments (2)
apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go (2)
52-52: Trim the obvioustemplateWorkspacecomment.Line 52 just restates the function name. Either remove it or use the comment to capture the non-obvious contract here.
Based on learnings, "Do not add comments that simply restate what the code does".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go` at line 52, Remove the redundant comment that merely repeats the function name for templateWorkspace; either delete it or replace it with a concise comment describing the non-obvious contract of templateWorkspace (e.g., the expected shape of dispatch.Context it uses, which template variables must be present, error behavior, and any side-effects such as escaping or trimming of values) so readers understand preconditions/outputs rather than restating the name.
15-50: Add doc comments for the exported template types.
VCSRepoTemplate,WorkspaceTemplate, andVariableTemplateare exported, so they should each carry a brief package comment describing how they fit into the Terraform Cloud template model.As per coding guidelines, "document exported functions/types/methods".
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go` around lines 15 - 50, Add Go doc comments above each exported type (VCSRepoTemplate, WorkspaceTemplate, VariableTemplate) describing their role in the Terraform Cloud workspace template model and how they are used (e.g., VCSRepoTemplate represents VCS connection details for a workspace; WorkspaceTemplate defines workspace-level settings such as execution mode, triggers, VCSRepo and Variables; VariableTemplate describes a workspace variable including category/HCL/sensitive flags). Ensure each comment starts with the type name (e.g., "VCSRepoTemplate ..." ) and briefly mention important fields or semantics so godoc users understand intent and usage.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go`:
- Around line 250-252: toUpdateOptions() currently omits setting the
WorkspaceUpdateOptions.Project field so workspace project changes never
propagate; mirror the toCreateOptions() pattern by adding a conditional that
when w.Project is non-empty sets opts.Project = &tfe.Project{ID: w.Project}
(same guard and assignment used in toCreateOptions()), ensuring to reference the
WorkspaceUpdateOptions instance named opts in toUpdateOptions().
- Around line 158-164: Remove the hardcoded auto-apply override: delete the
autoApply variable and do not set the AutoApply field when constructing
tfe.RunCreateOptions in the client.Runs.Create call (leave AutoApply omitted so
the go-tfe SDK will use the workspace's configured behavior); update the
RunCreateOptions invocation in workspace.go where run is created (the call to
client.Runs.Create with Workspace: &tfe.Workspace{ID: workspaceID}, Message:
&message) to exclude AutoApply entirely.
- Around line 188-228: The update path currently only sets URL and Token on
tfe.NotificationConfigurationUpdateOptions, so it can leave Enabled and Triggers
out-of-sync; update the logic around NotificationConfigurationUpdateOptions
(used in the client.NotificationConfigurations.Update call) to also set Enabled
(true) and the full Triggers slice (the same values used in the create opts) and
ensure Token is conditionally set when webhookSecret != ""; use the same
webhookURL, webhookSecret, enabled flag and triggers as in the create branch so
the notification config is fully reconciled to the desired state.
---
Nitpick comments:
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go`:
- Line 52: Remove the redundant comment that merely repeats the function name
for templateWorkspace; either delete it or replace it with a concise comment
describing the non-obvious contract of templateWorkspace (e.g., the expected
shape of dispatch.Context it uses, which template variables must be present,
error behavior, and any side-effects such as escaping or trimming of values) so
readers understand preconditions/outputs rather than restating the name.
- Around line 15-50: Add Go doc comments above each exported type
(VCSRepoTemplate, WorkspaceTemplate, VariableTemplate) describing their role in
the Terraform Cloud workspace template model and how they are used (e.g.,
VCSRepoTemplate represents VCS connection details for a workspace;
WorkspaceTemplate defines workspace-level settings such as execution mode,
triggers, VCSRepo and Variables; VariableTemplate describes a workspace variable
including category/HCL/sensitive flags). Ensure each comment starts with the
type name (e.g., "VCSRepoTemplate ..." ) and briefly mention important fields or
semantics so godoc users understand intent and usage.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: b6737286-a88d-468a-af9c-48acf13a6469
📒 Files selected for processing (2)
apps/api/src/routes/tfe/run_notification.tsapps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go
🚧 Files skipped from review as they are similar to previous changes (1)
- apps/api/src/routes/tfe/run_notification.ts
apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go
Outdated
Show resolved
Hide resolved
apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go
Outdated
Show resolved
Hide resolved
apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go
Show resolved
Hide resolved
Align runtime validation with the OpenAPI schema — webhookUrl is required for status updates to flow back via webhooks.
… project to update - Remove hardcoded autoApply=true so runs defer to the workspace setting - Reconcile enabled/triggers on notification config update, not just URL/token - Add missing Project field to toUpdateOptions
There was a problem hiding this comment.
🧹 Nitpick comments (2)
apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe.go (1)
74-83: Consider logging when webhook secret is not configured.When
TFE_WEBHOOK_SECRETis empty, the notification config is created without HMAC authentication, meaning incoming webhooks cannot be verified. While the code handles this gracefully (skipping token setup), it might be worth logging a warning so operators are aware their webhook endpoint is unprotected.💡 Optional: Add a warning when secret is missing
webhookSecret := os.Getenv("TFE_WEBHOOK_SECRET") + if webhookSecret == "" { + log.Warn("TFE_WEBHOOK_SECRET not set, webhook authentication disabled") + } if err := ensureNotificationConfig(🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe.go` around lines 74 - 83, The code reads TFE_WEBHOOK_SECRET into webhookSecret and calls ensureNotificationConfig without warning when it's empty; add a conditional check after reading webhookSecret (before calling ensureNotificationConfig) to detect an empty value and emit a warning via the existing logger (e.g., log.Warn) that the webhook URL (cfg.webhookUrl) will be created without HMAC authentication, referencing TFE_WEBHOOK_SECRET and suggesting operators set it; then proceed to call ensureNotificationConfig as before.apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go (1)
193-200: Consider extracting duplicate trigger list.The same trigger list is defined in both the update (lines 193-200) and create (lines 224-231) paths. Extracting it to a package-level variable would reduce duplication and ensure they stay in sync.
♻️ Optional: Extract common trigger list
+var notificationTriggers = []tfe.NotificationTriggerType{ + tfe.NotificationTriggerCreated, + tfe.NotificationTriggerPlanning, + tfe.NotificationTriggerNeedsAttention, + tfe.NotificationTriggerApplying, + tfe.NotificationTriggerCompleted, + tfe.NotificationTriggerErrored, +} + func ensureNotificationConfig( ctx context.Context, client *tfe.Client, workspaceID, webhookURL, webhookSecret string, ) error { // ... list configs ... updateOpts := tfe.NotificationConfigurationUpdateOptions{ URL: &webhookURL, Enabled: &enabled, - Triggers: []tfe.NotificationTriggerType{ - tfe.NotificationTriggerCreated, - tfe.NotificationTriggerPlanning, - tfe.NotificationTriggerNeedsAttention, - tfe.NotificationTriggerApplying, - tfe.NotificationTriggerCompleted, - tfe.NotificationTriggerErrored, - }, + Triggers: notificationTriggers, } // ... and similarly for create opts ...Also applies to: 224-231
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go` around lines 193 - 200, Extract the duplicated slice of tfe.NotificationTriggerType values into a single package-level variable (e.g., workspaceNotificationTriggers) and replace the inline slices in both the update and create code paths with that variable; locate the two inline slices that contain tfe.NotificationTriggerCreated, tfe.NotificationTriggerPlanning, tfe.NotificationTriggerNeedsAttention, tfe.NotificationTriggerApplying, tfe.NotificationTriggerCompleted, tfe.NotificationTriggerErrored and assign them to the new package-level var so both the update and create paths reference the same slice.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Nitpick comments:
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe.go`:
- Around line 74-83: The code reads TFE_WEBHOOK_SECRET into webhookSecret and
calls ensureNotificationConfig without warning when it's empty; add a
conditional check after reading webhookSecret (before calling
ensureNotificationConfig) to detect an empty value and emit a warning via the
existing logger (e.g., log.Warn) that the webhook URL (cfg.webhookUrl) will be
created without HMAC authentication, referencing TFE_WEBHOOK_SECRET and
suggesting operators set it; then proceed to call ensureNotificationConfig as
before.
In
`@apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go`:
- Around line 193-200: Extract the duplicated slice of
tfe.NotificationTriggerType values into a single package-level variable (e.g.,
workspaceNotificationTriggers) and replace the inline slices in both the update
and create code paths with that variable; locate the two inline slices that
contain tfe.NotificationTriggerCreated, tfe.NotificationTriggerPlanning,
tfe.NotificationTriggerNeedsAttention, tfe.NotificationTriggerApplying,
tfe.NotificationTriggerCompleted, tfe.NotificationTriggerErrored and assign them
to the new package-level var so both the update and create paths reference the
same slice.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 30886f6a-17a8-40a0-8ac5-25fec11a5111
📒 Files selected for processing (4)
apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/config.goapps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe.goapps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe_test.goapps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/workspace.go
✅ Files skipped from review due to trivial changes (2)
- apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/config.go
- apps/workspace-engine/svc/controllers/jobdispatch/jobagents/terraformcloud/tfe_test.go
Replace polling-based TFE dispatcher with fire-and-forget webhook model:
Tested end-to-end locally against TFC via smee.io webhook tunnel.
Summary by CodeRabbit
New Features
Documentation
Configuration
Tests