What Happened?
When Portkey normalizes Anthropic model responses to the OpenAI schema, `prompt_tokens` has different semantics depending on which provider is used to access the same Anthropic model.
| Provider | `prompt_tokens` includes cache tokens? | `total_tokens` includes cache tokens? |
|---|---|---|
| `anthropic` (direct API) | No | Yes |
| `vertex-ai` (Anthropic models) | No | Yes |
| `bedrock` (Anthropic models) | Yes | Yes |
For the same Anthropic model (e.g. `claude-sonnet-4-20250514`), sending the same prompt with cache:
- Anthropic direct / Vertex AI: `prompt_tokens` = 100 (non-cached input only), `cache_read_input_tokens` = 50, `completion_tokens` = 100, `total_tokens` = 250
- Bedrock: `prompt_tokens` = 150 (includes cached input), `cache_read_input_tokens` = 50, `completion_tokens` = 100, `total_tokens` = 250
Anthropic direct and Vertex AI set `prompt_tokens = input_tokens` (excludes cache). Bedrock sets `prompt_tokens = inputTokens + cacheReadInputTokens + cacheWriteInputTokens` (includes cache).
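The divergence can be reproduced with a few lines of arithmetic; the numbers mirror the example above (the variable names are illustrative, not from the Portkey codebase):

```typescript
// Hypothetical raw usage from the same cached Anthropic response.
const raw = {
  input_tokens: 100, // non-cached input only
  output_tokens: 100,
  cache_read_input_tokens: 50,
  cache_creation_input_tokens: 0,
};

// Anthropic direct / Vertex AI path: cache tokens excluded from prompt_tokens.
const directPromptTokens = raw.input_tokens; // 100

// Bedrock path: cache tokens folded into prompt_tokens.
const bedrockPromptTokens =
  raw.input_tokens +
  raw.cache_read_input_tokens +
  raw.cache_creation_input_tokens; // 150

// All three paths agree on total_tokens.
const totalTokens =
  raw.input_tokens +
  raw.output_tokens +
  raw.cache_read_input_tokens +
  raw.cache_creation_input_tokens; // 250
```

So a consumer reading `usage.prompt_tokens` sees 100 or 150 for the identical request, depending only on which gateway path served it.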
What Should Have Happened?
All three Anthropic access paths should normalize `prompt_tokens` consistently. The OpenAI convention (which Portkey normalizes to) is that `prompt_tokens` includes cached tokens, with the breakdown available in `prompt_tokens_details.cached_tokens`. Anthropic direct and Vertex AI should match Bedrock's behavior.
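As a sketch of what consistent normalization could look like, a hypothetical shared helper (`toOpenAIUsage` is not an existing Portkey function) could fold cache tokens into `prompt_tokens` the way Bedrock already does:

```typescript
// Sketch only: shape of Anthropic's reported usage fields.
interface AnthropicUsage {
  input_tokens: number;
  output_tokens: number;
  cache_read_input_tokens?: number;
  cache_creation_input_tokens?: number;
}

// Normalize to the OpenAI convention: prompt_tokens includes cache tokens,
// with the cached breakdown in prompt_tokens_details.cached_tokens.
function toOpenAIUsage(u: AnthropicUsage) {
  const cacheRead = u.cache_read_input_tokens ?? 0;
  const cacheWrite = u.cache_creation_input_tokens ?? 0;
  const promptTokens = u.input_tokens + cacheRead + cacheWrite;
  return {
    prompt_tokens: promptTokens,
    completion_tokens: u.output_tokens,
    total_tokens: promptTokens + u.output_tokens,
    prompt_tokens_details: { cached_tokens: cacheRead },
  };
}
```

With this shape, `total_tokens` is always `prompt_tokens + completion_tokens`, which also removes the need for each transform to add cache tokens into the total separately.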
Relevant Code Snippet
`anthropic/chatComplete.ts#L612C9-L627C11`

```typescript
usage: {
  prompt_tokens: input_tokens,
  completion_tokens: output_tokens,
  total_tokens:
    input_tokens +
    output_tokens +
    (cache_creation_input_tokens ?? 0) +
    (cache_read_input_tokens ?? 0),
  prompt_tokens_details: {
    cached_tokens: cache_read_input_tokens ?? 0,
  },
  ...(shouldSendCacheUsage && {
    cache_read_input_tokens: cache_read_input_tokens,
    cache_creation_input_tokens: cache_creation_input_tokens,
  }),
},
```
`google-vertex-ai/chatComplete.ts#L898C6-L909C9`

```typescript
usage: {
  prompt_tokens: input_tokens,
  completion_tokens: output_tokens,
  total_tokens: totalTokens,
  prompt_tokens_details: {
    cached_tokens: cache_read_input_tokens,
  },
  ...(shouldSendCacheUsage && {
    cache_read_input_tokens: cache_read_input_tokens,
    cache_creation_input_tokens: cache_creation_input_tokens,
  }),
},
```
`bedrock/chatComplete.ts#L550C4-L565C9`

```typescript
usage: {
  prompt_tokens:
    response.usage.inputTokens +
    cacheReadInputTokens +
    cacheWriteInputTokens,
  completion_tokens: response.usage.outputTokens,
  total_tokens: response.usage.totalTokens, // contains the cache usage as well
  prompt_tokens_details: {
    cached_tokens: cacheReadInputTokens,
  },
  // we only want to be sending this for anthropic models and this is not openai compliant
  ...((cacheReadInputTokens > 0 || cacheWriteInputTokens > 0) && {
    cache_read_input_tokens: cacheReadInputTokens,
    cache_creation_input_tokens: cacheWriteInputTokens,
  }),
},
```
Your Twitter/LinkedIn
No response