Releases: amperity/chuck-data
v0.5.0
v0.4.9
Implemented Snowflake as a Data Provider
v0.4.8
Bug Fixes
- fix (agent): Include active database in LLM context for Redshift provider
v0.4.7 — AWS Bedrock PII Scan Fix
Bug Fixes
Fixed `/scan-pii` silently returning 0 results when using AWS Bedrock (#75)
`/scan-pii` would complete without error but report "0 PII columns found" on all tables when the `aws_bedrock` provider was active. Two issues combined to cause this:
- **Wrong parameter key** — The LLM provider factory passed `active_model` to `AWSBedrockProvider` as `model`, but the provider's constructor only accepts `model_id`, causing a `TypeError` on initialization.
- **Databricks model ID suffix** — Models selected from the Databricks model list include a context window size suffix (e.g. `anthropic.claude-3-5-sonnet-20241022-v2:0:200k`). The `:200k` portion is Databricks-specific and invalid in the Bedrock API, causing `ResourceNotFoundException: Model not found` for every table scanned.
Both errors were swallowed per-table, so the scan appeared to succeed with no PII found instead of reporting the actual failure.
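The suffix stripping described above can be sketched as a small regex helper. This is a minimal illustration, not the project's actual code — the function name and logging details are assumptions:

```python
import re
import logging

logger = logging.getLogger(__name__)

# Databricks model IDs may carry a trailing context-window suffix
# (e.g. ":200k", ":32k") that the Bedrock API does not recognize.
_CONTEXT_SUFFIX = re.compile(r":\d+k$")

def strip_context_window_suffix(model_id: str) -> str:
    """Remove a Databricks-style context-window suffix, warning when one is found."""
    stripped = _CONTEXT_SUFFIX.sub("", model_id)
    if stripped != model_id:
        logger.warning(
            "Stripped context-window suffix from model ID: %s -> %s",
            model_id, stripped,
        )
    return stripped
```

Anchoring the pattern with `$` ensures only a trailing suffix is removed, so the version marker in IDs like `...-v2:0` is left intact.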
Changes
| File | Change |
|---|---|
| `chuck_data/llm/factory.py` | Translate `model` → `model_id` when constructing `AWSBedrockProvider` |
| `chuck_data/llm/providers/aws_bedrock.py` | Strip Databricks context-window suffixes (`:200k`, `:32k`, etc.) from model IDs at init, with a warning log when a suffix is removed |
| `tests/unit/llm/test_factory.py` | Tests for `model` → `model_id` key translation and end-to-end suffix stripping |
| `tests/unit/llm/providers/test_aws_bedrock.py` | Tests for model ID suffix stripping in isolation |
v0.4.6
Bug fixes:
- Fix `active_model` config being silently ignored by LLM providers — the model override was stored under the wrong key (`model_id` instead of `model`), causing providers to always use their default model regardless of the configured `active_model`.
v0.4.5
Release Notes - v0.4.5
Bug Fixes
S3 AWS Profile Handling
- Fixed S3 client creation in `setup_stitch` to correctly use the AWS profile from configuration instead of kwargs
- Added helper functions for cleaner S3 operations:
  - `upload_manifest_to_s3`: Dedicated function for uploading Stitch manifest files
  - `upload_init_script_to_s3`: Dedicated function for uploading initialization scripts
- Improved code organization and maintainability for S3 upload operations
Testing
- Added comprehensive tests for the new helper functions to ensure reliable S3 uploads
- Improved test coverage for manifest and init script upload workflows
Technical Details
This release refactors the S3 client creation logic to properly respect AWS profile configuration, fixing issues where the profile wasn't being correctly applied during S3 operations. The changes make the codebase more maintainable by extracting upload logic into dedicated, well-tested helper functions.
Full Changelog: v0.4.4...v0.4.5
v0.4.4
Release v0.4.4
🐛 Bug Fixes
Redshift Support Improvements
- Fixed `/list-tables` column display: Tables now correctly show column counts instead of displaying 0. Added fallback logic to handle cases where metadata queries return empty results
- Fixed `/scan-pii` model selection: PII scanning now properly uses the selected LLM model (e.g., Claude) instead of falling back to incorrect defaults
- Fixed `/help` command filtering: Help text now correctly shows only provider-relevant commands (Databricks vs Redshift)
- Fixed setup wizard auto-start: Wizard now only starts when authentication is required, not on every launch
Technical Improvements
- Enhanced provider-aware configuration validation
- Improved Redshift metadata fetching with multiple fallback strategies
- Updated help formatter to support explicit provider filtering
- LLM factory now properly respects `active_model` configuration
🧪 Testing
- All 1145 tests passing
- Added comprehensive test coverage for Redshift scenarios
- Updated test fixtures for dict-based API responses
📦 What's Changed
Full changeset: v0.4.3...v0.4.4
v0.4.3
Enabled AWS (Redshift + EMR) in chuck-data
Users can now run Stitch on their data inside Redshift, with compute provided by EMR.
v0.3.2
Bug Fix
- Fix semantic tags on numeric types (LONG, BIGINT, INT, etc.) causing Stitch job failures
- Updated AI prompt to explicitly prevent semantic tag assignment to numeric columns
- Added code-level filtering as defense in depth
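The code-level filtering mentioned above could look like the following sketch. The function name, column representation, and the exact set of numeric types are assumptions for illustration:

```python
# Numeric column types that must never carry a semantic tag,
# since tagging them was causing Stitch job failures.
NUMERIC_TYPES = {"LONG", "BIGINT", "INT", "SMALLINT", "FLOAT", "DOUBLE", "DECIMAL"}

def filter_numeric_semantic_tags(columns: list[dict]) -> list[dict]:
    """Defense-in-depth filter: drop semantic tags the LLM may still have
    assigned to numeric columns, regardless of what the prompt forbids."""
    cleaned = []
    for col in columns:
        if col.get("type", "").upper() in NUMERIC_TYPES and "semantic_tag" in col:
            col = {k: v for k, v in col.items() if k != "semantic_tag"}
        cleaned.append(col)
    return cleaned
```

Filtering in code, in addition to instructing the model via the prompt, means a single misbehaving LLM response cannot fail the downstream job.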
v0.3.1
Changes
- Fix version mismatch between pyproject.toml and chuck_data/version.py
Installation
# Via pip
pip install chuck-data==0.3.1
# Via Homebrew
brew tap amperity/chuck-data && brew install chuck-data