Skip to content

[CFX-5205][CFX-5207][CFX-5208] first pass at wiring up Amplitude telemetry#403

Open
ajalon1 wants to merge 11 commits intodatarobot-oss:feat/telemetryfrom
ajalon1:aj/CFX-5205-add-amplitude
Open

[CFX-5205][CFX-5207][CFX-5208] first pass at wiring up Amplitude telemetry#403
ajalon1 wants to merge 11 commits intodatarobot-oss:feat/telemetryfrom
ajalon1:aj/CFX-5205-add-amplitude

Conversation

@ajalon1
Copy link
Contributor

@ajalon1 ajalon1 commented Mar 19, 2026

RATIONALE

On discussion with @victorborshak I am working off of a feature branch for now. Soon enough I will be merging said feature branch into main.

CHANGES

  1. add amplitude-go as a direct dependency

  2. Create a new module internal/telemetry, which contains:

    1. telemetry.go – client init, IsEnabled() guard, a non-blocking Track() wrapper, and Flush()
    2. events.go – typed event constructors per CLI command
    3. properties.go – common property and property group builders (CLI environment, session, template)
  3. Wire up the disable-telemetry opt-out flag

  4. In the root command's PersistentPreRunE hook, collect common event properties and store them in telemetry context

  5. In the root command's PersistentPostRun hook, flush telemetry events before exit

  6. add the AMPLITUDE_API_KEY and InstallMethod as ldflags in goreleaser configuration; this will be used to enable Amplitude writes (duh) and to add context to Amplitude events about how the CLI was installed

  7. add to the internal/http client the ability to get UserInfo, which will be added as UserID in event context

  8. add AMPLITUDE_API_KEY as a GH Release secret, so that the API key will be baked into the release binary but NOT in the source code itself

PR Automation

Comment-Commands: Trigger CI by commenting on the PR:

  • /trigger-smoke-test or /trigger-test-smoke - Run smoke tests
  • /trigger-install-test or /trigger-test-install - Run installation tests

Labels: Apply labels to trigger workflows:

  • run-smoke-tests or go - Run smoke tests on demand (only works for non-forked PRs)

Important

For Forked PRs: If you're an external contributor, the run-smoke-tests label won't work. Only maintainers can trigger smoke tests on forked PRs by applying the approved-for-smoke-tests label after security review. Please comment requesting maintainer review if you need smoke tests to run.


Note

Medium Risk
Introduces a new telemetry subsystem that runs on every CLI invocation and embeds an Amplitude API key into release builds; failures should be no-ops, but mistakes could impact privacy expectations or runtime behavior at startup/shutdown.

Overview
Adds first-pass Amplitude telemetry support to the CLI. A new internal/telemetry package provides an Amplitude-backed client with Track/Flush, common property collection (session/OS/version/instance/template), and typed event constructors, while remaining a safe no-op when disabled or when no API key is present.

Wires telemetry initialization and shutdown into the root command lifecycle, adds a --disable-telemetry flag (bound via Viper), and updates release tooling to inject AMPLITUDE_API_KEY/install method via goreleaser ldflags and pass the secret through the GitHub release workflow. Also adds a placeholder drapi.GetUserID() for future user identification in telemetry.

Written by Cursor Bugbot for commit 13321d7. This will update automatically on new commits. Configure here.

ajalon1 added 5 commits March 19, 2026 14:35
Add ldflags variables for telemetry configuration:
- AmplitudeAPIKey: injected via GitHub Actions secret at build time
- InstallMethod: set to 'release' for official builds, 'source' for dev

Update release workflow to include AMPLITUDE_API_KEY secret.
Add GetUserID() function to fetch user ID from DataRobot API
(GET /api/v2/userinfo/) for telemetry common properties.
- Add --disable-telemetry persistent flag bound to Viper
- Initialize telemetry client in PersistentPreRunE
- Flush telemetry events in PersistentPostRun with 3s timeout
ajalon1 and others added 4 commits March 19, 2026 15:36
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Update comments to reflect that UserID is now a placeholder, simplify
GetUserID call in CollectCommonProperties, and remove unused imports.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
@ajalon1 ajalon1 changed the title [CFX-5205][CFX-5208] first pass at wiring up Amplitude telemetry [CFX-5205][CFX-5207][CFX-5208] first pass at wiring up Amplitude telemetry Mar 19, 2026
ajalon1 and others added 2 commits March 19, 2026 16:01
- Add missing 'encoding/json' and 'time' imports to internal/drapi/get.go
- Fix undefined type error by changing 'types.Client' to 'amplitude.Client' in telemetry.go

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
Amplitude SDK requires UserID or DeviceID as top-level fields on the event
struct, not just in EventProperties. Without this, events are silently
rejected by Amplitude's HTTP API with 'missing required field' errors.

Co-authored-by: factory-droid[bot] <138933559+factory-droid[bot]@users.noreply.github.com>
MACOS_NOTARY_ISSUER_ID: ${{ secrets.MACOS_NOTARY_ISSUER_ID }}
MACOS_NOTARY_KEY_ID: ${{ secrets.MACOS_NOTARY_KEY_ID }}
MACOS_NOTARY_KEY: ${{ secrets.MACOS_NOTARY_KEY }}
AMPLITUDE_API_KEY: ${{ secrets.AMPLITUDE_API_KEY }}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is currently no GH secret for this, so it will be blank

cmd/root.go Outdated
Comment on lines +92 to +96
PersistentPostRun: func(cmd *cobra.Command, _ []string) {
// Flush telemetry events before exit
if client, ok := cmd.Context().Value(telemetryClientKey{}).(*telemetry.Client); ok {
client.Flush(3 * time.Second)
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I actually need to check to see if this is executed on error, not just success.

Copy link
Contributor Author

@ajalon1 ajalon1 Mar 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing these generated for me, I wonder if it make sense to have a single event type (NewCLICommandEvent) and then a property for the actual event. That would, I think, decrease our Amplitude costs. Or to top-level command events like dr_plugin, dr_start, dr_dotenv ...

Comment on lines +44 to +45
Environment string // US, EU, JP, or custom — from endpoint URL
DataRobotInstance string // Base URL of configured DataRobot instance
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think Environment is extra information, and DataRobotInstance is sufficient.

Comment on lines +82 to +85
if c.amp == nil {
log.Debug("Telemetry event (dry-run)", "type", event.EventType, "properties", event.EventProperties)
return
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is how I can test without actually wiring up to our test Amplitude account.

Copy link

@cursor cursor bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 1 potential issue.

Fix All in Cursor


// Initialize telemetry client
props := telemetry.CollectCommonProperties()
client := telemetry.NewClient(props)
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unnecessary I/O on every invocation when telemetry disabled

Medium Severity

CollectCommonProperties() is called unconditionally before NewClient() checks IsEnabled(). This means filesystem I/O (walking directories via repo.FindRepoRoot(), os.ReadDir, os.Getwd, and viper config reads) runs on every CLI invocation even when telemetry is disabled — which includes all dev builds (empty AmplitudeAPIKey) and opted-out users. The IsEnabled() check lives inside NewClient but the expensive property collection happens before it.

Additional Locations (1)
Fix in Cursor Fix in Web

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant