diff --git a/modules/ROOT/nav.adoc b/modules/ROOT/nav.adoc
index 419854e..7f2f8eb 100644
--- a/modules/ROOT/nav.adoc
+++ b/modules/ROOT/nav.adoc
@@ -49,6 +49,7 @@
 ** xref:ai-gateway:gateway-quickstart.adoc[Quickstart]
 ** xref:ai-gateway:gateway-architecture.adoc[Architecture]
 ** xref:ai-gateway:configure-provider.adoc[Configure Your LLM Provider]
+*** xref:ai-gateway:bedrock-setup.adoc[Set Up AWS Bedrock]
 ** xref:ai-gateway:aggregation.adoc[MCP Aggregation]
 ** xref:ai-gateway:connect-agent.adoc[Connect Your Agent]
 *** xref:ai-gateway:admin/index.adoc[For Admins]
diff --git a/modules/agents/pages/create-agent.adoc b/modules/agents/pages/create-agent.adoc
index e73a68d..cb77e37 100644
--- a/modules/agents/pages/create-agent.adoc
+++ b/modules/agents/pages/create-agent.adoc
@@ -178,6 +178,52 @@ When the agent is running, ADP shows its HTTP endpoint URL on the *Configuration

 You can use this URL to call the agent programmatically or integrate it with external systems. See xref:agents:integration-overview.adoc[].

+== Issue agent credentials
+
+Every agent has a hidden service account in AI Gateway that scopes the calls the agent makes to MCP servers and the Redpanda broker. To call the agent from an external system as that service account, issue an OAuth 2.0 client credential on the agent's *Credentials* tab.
+
+The *Setup* tab on the agent detail page shows the equivalent OAuth client configuration (token endpoint, scopes, sample request) after a credential exists.
+
+[IMPORTANT]
+====
+The `client_secret` value is plaintext and returned only at create time. ADP cannot retrieve it again. Capture the value into your secret store before you leave the dialog. To rotate, create a new credential, update the consumer, then delete the old one.
+====
+
+. Open the agent and click the *Credentials* tab.
+. Click *Create credential*.
+. Optionally enter a Description (up to 1024 characters) to identify what the credential is used for, for example `customer-support-bot prod`.
+. Optionally pick an Expires at date. Leave it unset for a non-expiring credential.
+. Click *Create*.
++
+ADP shows a one-time dialog containing:
++
+* *Client ID*: An OAuth 2.0 client identifier in the form `serviceaccounts/`. Stable across credentials and rotations: every credential on the same agent carries the same value.
+* *Client secret*: The plaintext OAuth client secret, shown only at create time.
+
+. Copy both values into your secret store before closing the dialog. The credential row appears in the Credentials table with its description, expiry, and creation time, but the secret is never retrievable again.
+
+To list existing credentials, return to the *Credentials* tab. The list shows the metadata fields for each credential (description, expiry, creation time) but not the secret.
+
+To delete a credential, click the delete action on the credential row. Delete is idempotent: deleting a credential that no longer exists succeeds. After delete, the OAuth token endpoint rejects the corresponding `client_id` / `client_secret` pair.
+
+=== Required permissions
+
+Credential operations are governed by their own permission set:
+
+[cols="1,2"]
+|===
+|Permission |Allows
+
+|`dataplane_adp_agent_credential_create`
+|Issue new credentials for the agent.
+
+|`dataplane_adp_agent_credential_list`
+|List credential metadata for the agent.
+
+|`dataplane_adp_agent_credential_delete`
+|Delete credentials for the agent.
+|===
+
 == Configure A2A discovery metadata (optional)

 A2A discovery metadata lets external systems find and invoke the agent through capability-based discovery. Configure this after creation, on the agent's *A2A* tab.
diff --git a/modules/ai-gateway/pages/bedrock-setup.adoc b/modules/ai-gateway/pages/bedrock-setup.adoc
new file mode 100644
index 0000000..12598c1
--- /dev/null
+++ b/modules/ai-gateway/pages/bedrock-setup.adoc
@@ -0,0 +1,172 @@
+= Set Up AWS Bedrock as an LLM Provider
+:description: Create the IAM user, policy, and access keys required for AI Gateway to invoke Amazon Bedrock models, then register the provider in ADP.
+:page-topic-type: how-to
+:personas: platform_admin
+:learning-objective-1: Create an IAM policy that grants AI Gateway permission to invoke Bedrock foundation models and cross-region inference profiles
+:learning-objective-2: Create a dedicated IAM user, attach the policy, and generate access keys for AI Gateway
+:learning-objective-3: Register Bedrock as an LLM provider in ADP and select the models you want to expose
+
+// Source: cloudv2 `apps/aigw/docs/customer/bedrock-setup-guide.md` on origin/main, verified 2026-05-19.
+
+This guide walks you through the AWS-side setup AI Gateway needs to invoke Amazon Bedrock, then through the ADP UI flow that registers Bedrock as an LLM provider. For background on how Bedrock foundation models, cross-region inference profiles, and IAM patterns map to the provider form, see xref:ai-gateway:configure-provider.adoc#bedrock-inference-profiles[AWS Bedrock: Inference profiles and IAM] on the main provider configuration page.
+
+After completing this guide, you will be able to:
+
+* [ ] {learning-objective-1}
+* [ ] {learning-objective-2}
+* [ ] {learning-objective-3}
+
+== Prerequisites
+
+* A Redpanda Cloud cluster with ADP enabled.
+* An AWS account with Bedrock model access enabled in the region you plan to call. Model availability varies by region; see link:https://docs.aws.amazon.com/bedrock/latest/userguide/models-regions.html[Bedrock models by region^].
+* Access to the AWS CLI configured with credentials that can create IAM users, policies, and access keys.
+* Access to the Redpanda Cloud UI.
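
Before starting, a quick preflight can confirm that the command-line tools this guide relies on are installed and show which AWS account the CLI is signed in to. The following is a minimal sketch, not part of the official setup: the helper names `missing_tools` and `current_account_id` are illustrative, and `jq` is checked only because the optional verification step later in this guide uses it.

```bash
# Print the name of every listed tool that is not on PATH, one per line.
# `missing_tools` is a hypothetical helper name, not a Redpanda or AWS command.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Print the ID of the AWS account that the current CLI credentials belong to.
# This needs working AWS credentials, so it is defined here but not called.
current_account_id() {
  aws sts get-caller-identity --query Account --output text
}
```

Running `missing_tools aws jq` prints nothing when both tools are installed. Running `current_account_id` prints the account ID, which is the value needed later when you build the policy ARN for `attach-user-policy`.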
+
+== Create the IAM policy
+
+Create a policy that grants the two Bedrock invoke actions on both foundation-model ARNs and cross-region inference-profile ARNs:
+
+[,bash]
+----
+aws iam create-policy \
+  --policy-name RedpandaBedrockInvoke \
+  --policy-document '{
+    "Version": "2012-10-17",
+    "Statement": [
+      {
+        "Sid": "BedrockInvoke",
+        "Effect": "Allow",
+        "Action": [
+          "bedrock:InvokeModel",
+          "bedrock:InvokeModelWithResponseStream"
+        ],
+        "Resource": [
+          "arn:aws:bedrock:*::foundation-model/*",
+          "arn:aws:bedrock:*:*:inference-profile/*"
+        ]
+      }
+    ]
+  }'
+----
+
+The second resource entry enables cross-region inference profiles such as `us.anthropic.claude-sonnet-4-6`, which AI Gateway uses when the model identifier carries a geography prefix. See xref:ai-gateway:configure-provider.adoc#bedrock-inference-profiles[AWS Bedrock: Inference profiles and IAM] for the full prefix list and pricing implications.
+
+NOTE: Anthropic Claude 4.6 and later models cannot be invoked with the bare foundation-model ID and require an inference profile. Without the second `Resource` entry, those calls fail with `AccessDenied`.
+
+To restrict the policy to specific models and regions for production, replace the wildcard resources with explicit ARNs. For example:
+
+[,json]
+----
+{
+  "Resource": [
+    "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-sonnet-4-6",
+    "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-haiku-4-5-20251001"
+  ]
+}
+----
+
+== Create the IAM user
+
+Create a dedicated IAM user for AI Gateway and attach the policy:
+
+[,bash]
+----
+aws iam create-user --user-name redpanda-bedrock-invoker
+
+aws iam attach-user-policy \
+  --user-name redpanda-bedrock-invoker \
+  --policy-arn arn:aws:iam::<account-id>:policy/RedpandaBedrockInvoke
+----
+
+Replace `<account-id>` with the account ID returned in the `create-policy` output (visible in the policy ARN).
+
+TIP: Don't reuse an existing IAM user. A dedicated user makes it easy to rotate credentials or revoke access without affecting other AWS workloads.
+
+== Generate access keys
+
+Generate the access keys AI Gateway will use:
+
+[,bash]
+----
+aws iam create-access-key --user-name redpanda-bedrock-invoker
+----
+
+Save the `AccessKeyId` and `SecretAccessKey` from the output. You need both in the next section to register them as Redpanda Cloud secrets.
+
+CAUTION: AWS displays the secret access key only at creation. Store it in a password manager or pass it directly into the secret-creation flow in the next section.
+
+== Verify Bedrock access (optional)
+
+Confirm the IAM user can invoke Bedrock before moving to the UI:
+
+[,bash]
+----
+aws bedrock-runtime invoke-model \
+  --model-id us.anthropic.claude-haiku-4-5-20251001-v1:0 \
+  --region us-east-1 \
+  --content-type application/json \
+  --accept application/json \
+  --body "$(echo -n '{"anthropic_version":"bedrock-2023-05-31","max_tokens":32,"messages":[{"role":"user","content":"Hello"}]}' | base64)" \
+  /tmp/bedrock-test.json \
+  && jq . /tmp/bedrock-test.json && rm /tmp/bedrock-test.json
+----
+
+A successful model response confirms the IAM policy, region, and credentials are correct. If you see `AccessDenied`, check the policy resource list and confirm Bedrock model access is enabled in the target region.
+
+== Register Bedrock as an LLM provider
+
+. Sign in to the Redpanda Cloud UI and open ADP.
+. Open *LLM Providers* in the sidebar and click *Create provider*.
+. Select *AWS Bedrock* as the provider type.
+. Enter a Name such as `my-bedrock`. Use lowercase letters, digits, and hyphens. The name is immutable and appears in the proxy URL.
+. Select the Region where you want to invoke Bedrock, such as `us-east-1`.
+. Configure the credentials:
++
+.. In the Access key ID ref dropdown, type a secret name such as `AWS_ACCESS_KEY_ID`. The inline secret-creation form opens.
+.. Paste the `AccessKeyId` value from the IAM user setup and click *Create*. The secret is stored in the cloud secret store, scoped to AI Gateway.
+.. Repeat for Secret access key ref. Use a name such as `AWS_SECRET_ACCESS_KEY` and paste the `SecretAccessKey` value.
++
+Secret names are normalized to `UPPER_SNAKE_CASE` automatically and are scoped to AI Gateway so only LLM providers can reference them.
+
+. Select the models you want to expose through this provider, for example:
++
+* `anthropic.claude-sonnet-4-6`
+* `anthropic.claude-haiku-4-5-20251001`
+* `amazon.nova-pro-v1:0`
++
+For Anthropic Claude 4.6 and later, pick the inference profile (for example, `us.anthropic.claude-sonnet-4-6`) rather than the bare foundation-model ID.
+
+. Click *Create provider*.
+. On the provider detail page, scroll to the Verify connection section, pick a model, and click *Test Connection*. A successful response confirms that the credentials, region, and IAM policy are correctly configured.
+
+== Cross-region inference profile billing
+
+When you call a cross-region inference profile (any model identifier with a `us.`, `eu.`, `apac.`, `au.`, `jp.`, or `global.` prefix), AI Gateway bills at the regional rate for that profile. The regional prefix is preserved end to end so usage in the Governance dashboard and the *Cost & Usage* tab on *LLM Providers* reflects the correct per-region price.
+
+For example, requests to `eu.anthropic.claude-haiku-4-5` bill at the EU Haiku rate, not the headline foundation-model rate. The `global.` profile shares the headline rate; the geography-specific profiles (`us.`, `eu.`, `apac.`, `au.`, `jp.`) carry approximately a 10% cross-region inference premium.
+
+== Troubleshooting
+
+[cols="1,2"]
+|===
+|Symptom |What to check
+
+|`AccessDenied` from Bedrock
+|Confirm the IAM policy includes both `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream`, and that the resource list covers the model or inference profile you're calling. For Claude 4.6 and later, the policy must include `arn:aws:bedrock:*:*:inference-profile/*` or an explicit inference-profile ARN.
+
+|`secret "" not found`
+|Confirm the secret exists in the cloud secret store and the reference in the provider configuration matches exactly. Secret names are `UPPER_SNAKE_CASE`.
+
+|`ValidationException: model ID not supported`
+|The model isn't enabled in the region you chose. Open the AWS Bedrock console, switch to the target region, and enable model access for the foundation models you want to expose.
+
+|`Invocation of model ID … with on-demand throughput isn't supported`
+|You called a Claude 4.6 or later model with a bare foundation-model ID. Switch to an inference profile, for example `us.anthropic.claude-sonnet-4-6` instead of `anthropic.claude-sonnet-4-6`. See xref:ai-gateway:configure-provider.adoc#bedrock-inference-profiles[AWS Bedrock: Inference profiles and IAM].
+|===
+
+== Next steps
+
+* xref:ai-gateway:configure-provider.adoc[Configure an LLM provider]
+* xref:ai-gateway:connect-agent.adoc[Connect your agent]
+* xref:governance:dashboard/overview.adoc[Read the governance overview]
diff --git a/modules/ai-gateway/pages/configure-provider.adoc b/modules/ai-gateway/pages/configure-provider.adoc
index e1fe15a..fc09fa8 100644
--- a/modules/ai-gateway/pages/configure-provider.adoc
+++ b/modules/ai-gateway/pages/configure-provider.adoc
@@ -67,7 +67,7 @@ The *Provider type* card shows five cards. Pick the one that matches your upstre
 |Reach Gemini Pro, Flash, and multimodal models through Google AI Studio. Ideal for long-context workloads and image/video inputs.

 |*AWS Bedrock*
-|Invoke foundation models (Claude, Llama, Titan, Nova, Mistral, AI21 Jamba) hosted inside your AWS account. Requires an AWS region and credentials (static, STS-assumed role, or the default credential chain). Supports the native Bedrock APIs (`InvokeModel`, `Converse`) and an OpenAI-compatible Chat Completions endpoint for `gpt-oss` models. See <<bedrock-inference-profiles>> for picking the right model identifier.
+|Invoke foundation models (Claude, Llama, Titan, Nova, Mistral, AI21 Jamba) hosted inside your AWS account. Requires an AWS region and credentials (static, STS-assumed role, or the default credential chain). Supports the native Bedrock APIs (`InvokeModel`, `Converse`) and an OpenAI-compatible Chat Completions endpoint for `gpt-oss` models. See <<bedrock-inference-profiles>> for picking the right model identifier, and xref:ai-gateway:bedrock-setup.adoc[Set up AWS Bedrock as an LLM provider] for a step-by-step IAM and access-key walkthrough.

 |*OpenAI-compatible*
 |Point at any OpenAI-compatible endpoint that ships `/v1/chat/completions` (vLLM, Ollama, LM Studio, LocalAI, Together, Groq, OpenRouter). Useful for self-hosted models and aggregator gateways. Requires a *Base URL*. Authentication is optional.
@@ -162,6 +162,11 @@ OpenAI-compatible::
 TIP: OpenAI-compatible endpoints can serve any model. Enter the exact model identifiers your upstream server exposes (for example, `meta-llama/Llama-3.3-70B-Instruct` or `qwen3:8b`).
 ======

+[NOTE]
+====
+For the OpenAI, Google AI, and AWS Bedrock provider types, AI Gateway validates that the credential references resolve before it accepts the create or update. A missing or empty secret reference is rejected at save time instead of failing at first call. The OpenAI-compatible type does not require a credential reference, so it can be created with no authentication for local runtimes such as Ollama or vLLM.
+====
+
 [[select-models]]
 == Select models

@@ -236,6 +241,8 @@ Older 4.5 and earlier Claude models still accept bare IDs.

 Pricing varies by profile. The bare foundation-model ID and the `global.` profile share AWS's headline rate; geo profiles (`us.`, `eu.`, `apac.`, `au.`, `jp.`) carry approximately a 10% cross-region inference premium. Use `global.` when you want the headline rate and don't need a specific geography. Use `us.` / `eu.` / `apac.` when data residency matters.

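The prefix rule above is mechanical enough to check in a script, for example when auditing which configured model identifiers bill at a geography-specific rate. The following is a minimal sketch that assumes the prefix list in this section is complete. The function name `bedrock_profile_scope` is illustrative and is not part of AI Gateway or the AWS CLI.

```bash
# Classify a Bedrock model identifier by its inference-profile prefix.
# Prints the prefix (us, eu, apac, au, jp, or global), or "bare" when the
# identifier carries none of the prefixes listed in this section.
# `bedrock_profile_scope` is a hypothetical helper name.
bedrock_profile_scope() {
  case "$1" in
    us.*|eu.*|apac.*|au.*|jp.*|global.*)
      # Strip everything from the first dot onward, leaving only the prefix.
      printf '%s\n' "${1%%.*}"
      ;;
    *)
      printf 'bare\n'
      ;;
  esac
}
```

For instance, `bedrock_profile_scope eu.anthropic.claude-haiku-4-5` prints `eu`, while `bedrock_profile_scope anthropic.claude-sonnet-4-6` prints `bare`. Vendor names such as `amazon.` are not mistaken for the `au.` prefix because the match requires the full prefix followed by a dot.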
+AI Gateway preserves the regional prefix end to end when it records spend, so usage in the xref:governance:dashboard/overview.adoc[governance dashboard] and the *Cost & Usage* tab is attributed to the correct regional rate. A call to `eu.anthropic.claude-haiku-4-5` is billed at the EU Haiku rate, not the headline foundation-model rate.
+
 === IAM policy patterns

 Bedrock IAM resources have different ARN structures depending on whether you reference a foundation model, a system-defined inference profile, or an account-scoped application inference profile. The provider's IAM principal needs `bedrock:InvokeModel` and `bedrock:InvokeModelWithResponseStream` on every resource it calls.
@@ -327,8 +334,16 @@ Use *Filter* to narrow the charts by provider, model, cost type, token type, or

 The date-range picker supports last 7 days, last 14 days, last 30 days, last 90 days, month to date, quarter to date, year to date, and custom ranges. The chart subtitle shows the selected date range and bucket size.

+A custom range writes `customStart` and `customEnd` ISO-8601 timestamps to the page URL, so the view is shareable: copy the URL after picking a custom range and any teammate who opens it lands on the same window.
+
+The chart renders empty buckets in the selected range as zero-height bars rather than gaps, so quiet days line up with their date label and the trend stays readable when traffic is bursty.
+
+The chart palette is colorblind-safe. When multiple providers of the same type exist (for example, two OpenAI providers), the chart renders each one with a distinct hatched pattern so the series stay visually distinguishable.
+
 The spend chart footer summarizes the selected view by cost bucket, including total, input, output, cached, cache writes, and reasoning when the selected traffic includes those categories.

+// TODO: When the cached-token totals reconciliation fix lands (tracked internally as AI-1152), confirm the spend footer cost-bucket totals match cached-token spend across mixed cache-hit / cache-miss traffic, then this comment can be removed.
+
 == Edit, disable, or delete a provider

 * *Edit*: Click *Edit* on the detail page. You can change any field *except* `Name` and `Type`, which are immutable. Model lists, credential references, and the enabled state can all change.
diff --git a/modules/governance/pages/dashboard/overview.adoc b/modules/governance/pages/dashboard/overview.adoc
index 762f872..1d66496 100644
--- a/modules/governance/pages/dashboard/overview.adoc
+++ b/modules/governance/pages/dashboard/overview.adoc
@@ -107,23 +107,27 @@ The *Agents* table lists agents registered in the deployment.
 |Column |What it shows

 |Name
-|Agent display name
+|Agent display name. Clicking the name opens the agent's detail page.

 |Type
 |Agent type. The UI labels agents as Redpanda Managed because dataplane v1alpha3 does not distinguish managed agents from BYOA agents in this table.

 |LLM provider
-|The provider the agent calls
+|The provider the agent calls.

 |Model
-|The model identifier the agent uses
+|The model identifier the agent uses.
 |===

+The table column headers carry filter controls. Filter by name, provider, or model to narrow the list before drilling into a specific agent's spend.
+
 == Read the Top users panel

-The *Top users* panel shows the highest-spending users in the selected time range. It ranks user-level spend, shows up to five users per page, and includes a heatmap for the ranked users.
+The Top users panel ranks identified users by spend in the selected time range. The panel shows five users per page, fetches the top 20 from the backend, and supports paging between groups of five.
+
+For the users currently visible on the page, a heatmap renders one row per user and one column per time bucket (hourly for ranges of 24 hours or less, daily otherwise). Each cell's color intensity reflects the user's spend in that bucket: gray for low or no spend, red for the heaviest spend on the page. Hover a cell to see the exact bucket spend.

-If no user-level spend exists, the panel stays empty until agents send requests on behalf of identified users.
+If no user-level spend exists, the panel stays empty until agents send requests on behalf of identified users. Anonymous traffic doesn't appear here.

 == Next steps

diff --git a/modules/mcp/pages/create-server.adoc b/modules/mcp/pages/create-server.adoc
index dfd09b9..1f49162 100644
--- a/modules/mcp/pages/create-server.adoc
+++ b/modules/mcp/pages/create-server.adoc
@@ -73,6 +73,13 @@ Each managed type ships its own configuration schema. The form on this page is r

 For per-type fields, see the xref:mcp:managed/managed-catalog.adoc[Managed catalog]: a reference of every managed MCP type Redpanda hosts, grouped by category, with a description and a link to its deep-dive page where one exists.

+[NOTE]
+====
+MCP enforces a 64-character limit on tool names. For managed MCP types whose generated names exceed that limit, ADP truncates the prefix and replaces it with a hash, so the long-form name becomes something like `64ghux5adn_github_read_v1_GitHubReadService_GetAuthenticatedUser`. ADP always preserves the version, service, and method suffix, so the short tool name an agent sees (for example, `get_authenticated_user`) stays stable across truncations.
+
+You don't configure the truncation. This detail matters only when you correlate tool calls in logs or transcripts against the generated proto names.
+====
+
 == Configure the self-managed flow (Remote/Proxied only)

 Two fields on top of the identity fields:
diff --git a/modules/mcp/pages/oauth-providers.adoc b/modules/mcp/pages/oauth-providers.adoc
index 76ec2f9..383bf94 100644
--- a/modules/mcp/pages/oauth-providers.adoc
+++ b/modules/mcp/pages/oauth-providers.adoc
@@ -163,7 +163,7 @@ Walk through the create form to register the upstream:
 +
 * *HTTP Basic*: `client_id:client_secret` sent as the Basic auth header. Most common.
 * *POST body*: Credentials sent as form fields in the token-request body.
-* *None*: For public clients that rely on PKCE only.
+* *None*: For public clients that rely on PKCE only. Pick this when the upstream OAuth app is registered as a public client and AI Gateway authenticates by proving possession of a PKCE code verifier rather than a stored client secret. Leave the client-secret reference unset.
 +
 . Provide the *Client ID* and a secret reference for the *Client secret* (for example, `SLACK_CLIENT_SECRET`).
 . Define the *Supported scopes*. Include every scope any MCP server may need.