
ttok4bedrock – ttok-style token counting for Amazon Bedrock

Token counting for Amazon Bedrock models - drop-in replacement for ttok with exact CLI/SDK compatibility.

Features

  • ✅ 100% ttok-compatible interface - Drop-in replacement for the ttok CLI and SDK
  • 🎯 Anthropic Claude models - Works with the Claude models listed below
  • 🔧 AWS native - Uses the boto3 default credential/region chain
  • 📊 Accurate counts - Uses the Bedrock CountTokens API
  • ⚡ Simple and fast - Built-in LRU caching keeps repeated lookups instant

For more information on this project, see this blog post:

Token Counting Meets Amazon Bedrock

📖 Interactive Code Walkthrough — explore the codebase with annotated explanations

Supported Models

Anthropic Claude models with CountTokens API support. See the AWS documentation for details.

Claude Models:

  • anthropic.claude-sonnet-4-6 (default)
  • anthropic.claude-opus-4-6-v1
  • anthropic.claude-sonnet-4-5-20250929-v1:0
  • anthropic.claude-opus-4-5-20251101-v1:0
  • anthropic.claude-haiku-4-5-20251001-v1:0
  • anthropic.claude-opus-4-1-20250805-v1:0
  • anthropic.claude-sonnet-4-20250514-v1:0
  • anthropic.claude-opus-4-20250514-v1:0
  • anthropic.claude-3-7-sonnet-20250219-v1:0
  • anthropic.claude-3-5-sonnet-20241022-v2:0
  • anthropic.claude-3-5-sonnet-20240620-v1:0
  • anthropic.claude-3-5-haiku-20241022-v1:0

Installation

Prerequisites: Install uv if you haven't already.

Option 1: Install from GitHub repository

uv tool install git+https://github.com/danilop/ttok4bedrock.git

Option 2: Install from local source

# Clone the repository
git clone https://github.com/danilop/ttok4bedrock.git
cd ttok4bedrock

# Install with uv
uv tool install .

Create a convenient alias

After installation, you can create an alias to use ttok instead of ttok4bedrock:

# Add to your shell profile (~/.bashrc, ~/.zshrc, etc.)
alias ttok='ttok4bedrock'

# Or, for one-off use without installing
alias ttok='uv tool run ttok4bedrock'

Quick Start

CLI Usage (Identical to ttok)

After installation with uv tool install, you can use ttok4bedrock directly or create an alias for ttok:

# Count tokens (default: Claude Sonnet 4.6)
ttok4bedrock "Hello, world!"
# Output: 11

# With alias (if you created one)
ttok "Hello, world!"
# Output: 11

# Count from stdin
echo "Count these tokens" | ttok4bedrock
cat document.txt | ttok4bedrock

# Truncate to N tokens
ttok4bedrock -t 100 "Very long text..."
cat large.txt | ttok4bedrock -t 100 > truncated.txt

# Use specific Bedrock model (full model ID)
ttok4bedrock -m anthropic.claude-3-5-sonnet-20241022-v2:0 "Text"
ttok4bedrock -m anthropic.claude-3-7-sonnet-20250219-v1:0 "Text"

# Specify AWS region (uses default if not specified)
ttok4bedrock --aws-region us-west-2 "Text"

Note: If you haven't installed with uv tool install, you can still use uv run ttok4bedrock as shown in the migration section below.

Algorithm Description

Smart Truncation Algorithm

The truncation algorithm is designed to minimize API calls while converging on an exact token count. Here's how it works; an illustrative Python sketch follows each phase:

Phase 1: Initial Assessment

  1. Full Text Analysis: Count tokens for the entire input text
  2. Smart Estimation: Analyze text characteristics (punctuation density, word length, spacing) to improve initial character-to-token ratio estimation
  3. Target Calculation: Use the improved ratio to estimate the target character length
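As a rough illustration of Phase 1, here is a minimal sketch of how text statistics could feed the ratio estimate. The function names, weights, and the 4.0 baseline are hypothetical, not the library's actual internals:

import string

def estimate_chars_per_token(text: str, base_ratio: float = 4.0) -> float:
    """Adjust a baseline chars/token ratio from simple text statistics."""
    if not text:
        return base_ratio
    words = text.split()
    avg_word_len = sum(len(w) for w in words) / max(len(words), 1)
    punct_density = sum(ch in string.punctuation for ch in text) / len(text)
    ratio = base_ratio
    ratio -= 1.5 * punct_density        # punctuation splits into short tokens
    ratio += 0.2 * (avg_word_len - 5)   # long words carry more chars per token
    return max(ratio, 1.0)

def estimate_target_chars(text: str, max_tokens: int) -> int:
    """Estimate the character length that corresponds to max_tokens."""
    return min(len(text), int(max_tokens * estimate_chars_per_token(text)))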

Phase 2: Adaptive Learning Loop

  1. Linear Interpolation: Start with the estimated character length
  2. Token Measurement: Count tokens for the estimated text length
  3. Ratio Refinement: Update the character-to-token ratio based on actual results
  4. Convergence Check: Continue until the token count is within 1-2 tokens of the target
  5. API Limit Protection: Self-imposed limit of 20 API calls to prevent runaway loops
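A hedged sketch of the Phase 2 loop, assuming a count_tokens callable that performs one Bedrock CountTokens request per call (it reuses estimate_target_chars from the Phase 1 sketch):

MAX_API_CALLS = 20  # self-imposed cap to prevent runaway loops

def converge(text: str, max_tokens: int, count_tokens) -> str:
    chars = estimate_target_chars(text, max_tokens)
    for _ in range(MAX_API_CALLS):
        candidate = text[:chars]
        tokens = count_tokens(candidate)       # one API call
        if abs(tokens - max_tokens) <= 2:      # close enough to fine-tune
            return candidate
        # Refine the ratio from the measurement, then re-interpolate.
        ratio = len(candidate) / max(tokens, 1)
        chars = min(len(text), int(max_tokens * ratio))
    return text[:chars]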

Phase 3: Fine-Tuning

  1. Chunked Adjustment: Add/remove characters in small chunks (5-10 chars) when close to target
  2. Character-by-Character: Final precision adjustment (1-5 characters) only when very close to boundary
  3. Exact Boundary: Find the exact character position where token count crosses the limit
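And a sketch of Phase 3, again with an assumed count_tokens callable; stepping through decreasing chunk sizes lands exactly on the token boundary:

def fine_tune(text: str, chars: int, max_tokens: int, count_tokens) -> str:
    for step in (10, 5, 1):
        # Shrink in chunks while over budget...
        while chars > 0 and count_tokens(text[:chars]) > max_tokens:
            chars = max(0, chars - step)
        # ...then grow in chunks while still within budget.
        while chars + step <= len(text) and count_tokens(text[:chars + step]) <= max_tokens:
            chars += step
    return text[:chars]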

Key Optimizations

  • Smart Text Analysis: Accounts for punctuation, word length, and spacing patterns
  • LRU Caching: Uses functools.lru_cache to avoid repeated API calls for identical text (sketched after this list)
  • Adaptive Learning: Each API call improves the estimation for subsequent iterations
  • Efficient Convergence: Typically achieves exact results in 3-5 API calls
  • Overhead Removal: Automatically subtracts message structure overhead for intuitive token counts
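The caching optimization boils down to memoizing the raw API call. A self-contained sketch; the _count_raw name is illustrative, and its crude length-based body merely stands in for the real Bedrock request:

from functools import lru_cache

@lru_cache(maxsize=1000)
def _count_raw(text: str, model_id: str) -> int:
    """Illustrative stand-in for one Bedrock CountTokens request."""
    return max(1, len(text) // 4)  # the real library calls Bedrock here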

Performance Characteristics

  • Small texts (≤200 tokens): 3-4 API calls, 100% accuracy
  • Medium texts (200-1000 tokens): 4-5 API calls, 100% accuracy
  • Large texts (>1000 tokens): 5-17 API calls, 93-100% accuracy
  • Cache hits: 0.000s (instant) vs 1+ seconds for API calls

LRU Caching

The library includes intelligent caching to minimize API calls:

  • Automatic Caching: Uses Python's functools.lru_cache for optimal performance
  • Configurable Size: Default 1000 entries, customizable via constructor
  • Cache Statistics: Monitor hit rates and performance via get_cache_info()
  • Memory Efficient: Automatic eviction of least recently used entries
# Monitor cache performance (assumes BedrockTokenCounter is exported
# at the package top level)
from ttok4bedrock import BedrockTokenCounter

counter = BedrockTokenCounter(cache_size=500)
# ... use counter ...
stats = counter.get_cache_info()
print(f"Cache hit rate: {stats['hit_rate']:.1%}")

Overhead Removal

The library automatically removes message structure overhead to provide intuitive token counts:

  • Message Overhead: Bedrock API wraps text in message structures that add ~7 tokens
  • Automatic Subtraction: Token counts show only the actual text content
  • Unified Caching: Overhead calculation uses the same LRU cache as regular token counting
  • Transparent Operation: Users see clean token counts without API complexity
# Before: "Hello" = 8 tokens (including message overhead)
# After:  "Hello" = 1 token (text content only)
from ttok4bedrock import BedrockTokenCounter

counter = BedrockTokenCounter()
count = counter.count_tokens("Hello", "anthropic.claude-3-5-haiku-20241022-v1:0")
print(count)  # Output: 1
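One plausible way to implement the subtraction, consistent with the description above: count an empty message once to measure the wrapper, then deduct it from every result. This reuses the illustrative _count_raw from the caching sketch and is not the library's actual code:

def count_text_only(text: str, model_id: str) -> int:
    overhead = _count_raw("", model_id)   # tokens added by the message structure
    total = _count_raw(text, model_id)    # text plus message structure
    return max(0, total - overhead)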

Python SDK Usage (ttok-compatible)

# Import as drop-in replacement
import ttok4bedrock as ttok

# Count tokens
count = ttok.count_tokens("Hello, world!")
print(count)  # 11

# Use specific model (full Bedrock model ID)
count = ttok.count_tokens(
    "Text to count",
    model="anthropic.claude-3-5-sonnet-20241022-v2:0"
)

# Specify AWS region
count = ttok.count_tokens(
    "Text", 
    model="anthropic.claude-3-5-haiku-20241022-v1:0",
    aws_region="eu-west-1"
)

# Truncate text
truncated = ttok.truncate(
    "Very long text...",
    max_tokens=50,
    model="anthropic.claude-3-5-sonnet-20241022-v2:0"
)

AWS Configuration

Credentials

Uses the standard AWS credential chain (boto3):

  1. Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
  2. AWS credentials file (~/.aws/credentials)
  3. AWS profile (AWS_PROFILE environment variable)
  4. IAM role (for EC2, Lambda, ECS, etc.)

Region

Order of precedence (illustrated in the sketch after this list):

  1. --aws-region CLI option or aws_region parameter
  2. AWS_DEFAULT_REGION environment variable
  3. AWS config file (~/.aws/config)
  4. Instance metadata (for EC2)
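The same precedence is visible directly in boto3. A small illustrative sketch of how an explicit aws_region argument would override the default chain (resolve_region is a hypothetical helper):

import boto3

def resolve_region(aws_region: str | None = None) -> str | None:
    if aws_region:                 # 1. explicit CLI option / parameter wins
        return aws_region
    session = boto3.Session()
    return session.region_name     # 2-4. boto3's env/config/metadata chain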

Required IAM Permissions

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:CountTokens"
      ],
      "Resource": "arn:aws:bedrock:*:*:foundation-model/*"
    }
  ]
}

Migration from ttok

CLI Migration

# Before (ttok with OpenAI)
ttok "Count my tokens"
ttok -m gpt-4 "Text"
cat large.txt | ttok -t 100

# After installation with uv tool install
ttok4bedrock "Count my tokens"  # Same interface!
ttok4bedrock -m anthropic.claude-3-5-sonnet-20241022-v2:0 "Text"
cat large.txt | ttok4bedrock -t 100  # Identical usage

# With alias (recommended)
alias ttok='ttok4bedrock'
ttok "Count my tokens"  # Drop-in replacement!
ttok -m anthropic.claude-3-5-sonnet-20241022-v2:0 "Text"
cat large.txt | ttok -t 100

# Alternative: without installation (uv run)
uv run ttok4bedrock "Count my tokens"  # Same interface!
uv run ttok4bedrock -m anthropic.claude-3-5-sonnet-20241022-v2:0 "Text"
cat large.txt | uv run ttok4bedrock -t 100  # Identical usage

Python Migration

# Before (ttok)
import ttok
count = ttok.count_tokens("Text", model="gpt-4")

# After (ttok4bedrock)
import ttok4bedrock as ttok
count = ttok.count_tokens("Text", model="anthropic.claude-3-5-sonnet-20241022-v2:0")

Error Handling

The tool provides clear error messages for AWS issues:

# Model not found
ttok4bedrock -m anthropic.claude-invalid-model "text"
# Error: AWS Bedrock API error (ValidationException): Model not found

# No credentials
ttok4bedrock "text"
# Error: Unable to locate AWS credentials. Please configure AWS credentials...

# No region
ttok4bedrock "text"
# Error: No AWS region configured. Use --aws-region option or configure a default region.

Note: If running without installing, replace ttok4bedrock with uv run ttok4bedrock in the examples above.
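From Python, the equivalent failures surface as standard botocore exceptions, assuming the library lets them propagate; the handler below is illustrative:

import ttok4bedrock as ttok
from botocore.exceptions import ClientError, NoCredentialsError, NoRegionError

try:
    count = ttok.count_tokens("text")
except NoCredentialsError:
    print("Unable to locate AWS credentials. Please configure AWS credentials.")
except NoRegionError:
    print("No AWS region configured. Use aws_region or set a default region.")
except ClientError as e:
    print(f"AWS Bedrock API error: {e.response['Error']['Code']}")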

Development

# Install dev dependencies
uv sync --extra dev

# Run tests
uv run pytest

# Run with coverage
uv run pytest --cov=ttok4bedrock

License

MIT License (see LICENSE file for details)
