⬆️ Update chardet requirement from <6.0.0,>=5.0.0 to >=5.0.0,<8.0.0 #9

Open
dependabot[bot] wants to merge 1 commit into cookiecutter from dependabot/pip/chardet-gte-5.0.0-and-lt-8.0.0

dependabot bot commented on behalf of github on Mar 4, 2026

Updates the requirements on chardet to permit the latest version.
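
To make the effect of the new range concrete, here is a minimal sketch that compares the old bound (`>=5.0.0,<6.0.0`) with the new one (`>=5.0.0,<8.0.0`). It assumes plain `X.Y.Z` version strings and uses simple tuple comparison; pip itself resolves versions with full PEP 440 rules, so treat this only as an illustration. The helper names `parse` and `allowed` are hypothetical.

```python
def parse(version: str) -> tuple[int, ...]:
    """Turn a plain 'X.Y.Z' string into a tuple of ints for comparison."""
    return tuple(int(part) for part in version.split("."))


def allowed(version: str, lower: str, upper: str) -> bool:
    """Return True if lower <= version < upper (simplified, not full PEP 440)."""
    return parse(lower) <= parse(version) < parse(upper)


# (inclusive lower bound, exclusive upper bound)
OLD_RANGE = ("5.0.0", "6.0.0")
NEW_RANGE = ("5.0.0", "8.0.0")

for candidate in ("5.2.0", "6.0.0", "7.0.0", "8.0.0"):
    print(
        candidate,
        "old:", allowed(candidate, *OLD_RANGE),
        "new:", allowed(candidate, *NEW_RANGE),
    )
```

Under these assumptions, 6.0.0 and 7.0.0 are rejected by the old range and accepted by the new one, while 8.0.0 stays excluded by both.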

Release notes

Sourced from chardet's releases.

7.0.0

Ground-up, MIT-licensed rewrite of chardet. Same package name, same public API — drop-in replacement for chardet 5.x/6.x. Just way faster and more accurate!

Highlights:

  • MIT license (previous versions were LGPL)
  • 96.8% accuracy on 2,179 test files (+2.3pp vs chardet 6.0.0, +7.7pp vs charset-normalizer)
  • 41x faster than chardet 6.0.0 with mypyc (28x pure Python), 7.5x faster than charset-normalizer
  • Language detection for every result (90.5% accuracy across 49 languages)
  • 99 encodings across six eras (MODERN_WEB, LEGACY_ISO, LEGACY_MAC, LEGACY_REGIONAL, DOS, MAINFRAME)
  • 12-stage detection pipeline — BOM, UTF-16/32 patterns, escape sequences, binary detection, markup charset, ASCII, UTF-8 validation, byte validity, CJK gating, structural probing, statistical scoring, post-processing
  • Bigram frequency models trained on CulturaX multilingual corpus data for all supported language/encoding pairs
  • Optional mypyc compilation — 1.49x additional speedup on CPython
  • Thread-safe detect() and detect_all() with no measurable overhead; scales on free-threaded Python 3.13t+
  • Negligible import memory (96 B)
  • Zero runtime dependencies

Breaking changes vs 6.0.0:

  • detect() and detect_all() now default to encoding_era=EncodingEra.ALL (6.0.0 defaulted to MODERN_WEB)
  • Internal architecture is completely different (probers replaced by pipeline stages). Only the public API is preserved.
  • LanguageFilter is accepted but ignored (deprecation warning emitted)
  • chunk_size is accepted but ignored (deprecation warning emitted)

Changelog

Sourced from chardet's changelog.

7.0.0 (2026-03-02)

Ground-up, MIT-licensed rewrite of chardet. Same package name, same public API — drop-in replacement for chardet 5.x/6.x.

Highlights:

  • MIT license (previous versions were LGPL)
  • 96.8% accuracy on 2,179 test files (+2.3pp vs chardet 6.0.0, +7.7pp vs charset-normalizer)
  • 41x faster than chardet 6.0.0 with mypyc (28x pure Python), 7.5x faster than charset-normalizer
  • Language detection for every result (90.5% accuracy across 49 languages)
  • 99 encodings across six eras (MODERN_WEB, LEGACY_ISO, LEGACY_MAC, LEGACY_REGIONAL, DOS, MAINFRAME)
  • 12-stage detection pipeline — BOM, UTF-16/32 patterns, escape sequences, binary detection, markup charset, ASCII, UTF-8 validation, byte validity, CJK gating, structural probing, statistical scoring, post-processing
  • Bigram frequency models trained on CulturaX multilingual corpus data for all supported language/encoding pairs
  • Optional mypyc compilation — 1.49x additional speedup on CPython
  • Thread-safe detect() and detect_all() with no measurable overhead; scales on free-threaded Python 3.13t+
  • Negligible import memory (96 B)
  • Zero runtime dependencies

Breaking changes vs 6.0.0:

  • detect() and detect_all() now default to encoding_era=EncodingEra.ALL (6.0.0 defaulted to MODERN_WEB)
  • Internal architecture is completely different (probers replaced by pipeline stages). Only the public API is preserved.
  • LanguageFilter is accepted but ignored (deprecation warning emitted)
  • chunk_size is accepted but ignored (deprecation warning emitted)
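
Because the ignored arguments only produce deprecation warnings, they are easy to miss after upgrading. One possible way to surface them is to promote `DeprecationWarning` to an error around the detection call, for example in a test suite. The sketch below uses only the standard `warnings` module. `detect_strict` is a hypothetical helper, and `fake_detect` is a stand-in that merely imitates the behaviour described above; it is not chardet's implementation, and which chardet entry point accepts `chunk_size` is not stated in the notes.

```python
import warnings


def detect_strict(detect, data, **kwargs):
    """Call a detector with DeprecationWarning promoted to an exception."""
    with warnings.catch_warnings():
        warnings.simplefilter("error", DeprecationWarning)
        return detect(data, **kwargs)


def fake_detect(data, chunk_size=None):
    """Hypothetical stand-in: warns when the ignored argument is passed."""
    if chunk_size is not None:
        warnings.warn(
            "chunk_size is accepted but ignored",
            DeprecationWarning,
            stacklevel=2,
        )
    return {"encoding": "utf-8"}


# No deprecated argument: the call goes through unchanged.
print(detect_strict(fake_detect, b"hello"))

# Deprecated argument: the warning is raised as an exception.
try:
    detect_strict(fake_detect, b"hello", chunk_size=1024)
except DeprecationWarning as exc:
    print("deprecated usage found:", exc)
```

In a real project the detector passed in would be the library function itself. Running the test suite with `python -W error::DeprecationWarning` achieves a similar effect globally.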

6.0.0 (2026-02-22)

Features:

  • Unified single-byte charset detection with proper language-specific bigram models for all single-byte encodings (replaces Latin1Prober and MacRomanProber heuristics)
  • 38 new languages: Arabic, Belarusian, Breton, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Farsi, Finnish, French, German, Icelandic, Indonesian, Irish, Italian, Kazakh, Latvian, Lithuanian,

... (truncated)

Commits
  • 4b89d62 Reach 100% test coverage across all chardet modules
  • d0896b1 Add CLI, confusion, models, and script utility tests
  • dba8d66 Add tests for uncovered branches across pipeline modules
  • 6ef5eda refactor: remove dead code branches and add pragma for invariant assertion
  • f91071f docs: add test coverage gap implementation plan
  • 692c6d5 docs: add test coverage gap analysis and improvement design
  • e01e148 Add known failure entries for new DOS codepage test files
  • d9f1f3c Remove known failure entries for deleted duplicate test files
  • 3f1495f Allow symlink tests/data to simplify updating test-data repo
  • dabd85b Make tests/data get ignored if it is a symlink too
  • Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

  • @dependabot rebase will rebase this PR
  • @dependabot recreate will recreate this PR, overwriting any edits that have been made to it
  • @dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
  • @dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • @dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Updates the requirements on [chardet](https://github.com/chardet/chardet) to permit the latest version.
- [Release notes](https://github.com/chardet/chardet/releases)
- [Changelog](https://github.com/chardet/chardet/blob/main/docs/changelog.rst)
- [Commits](chardet/chardet@5.0.0...7.0.0)

---
updated-dependencies:
- dependency-name: chardet
  dependency-version: 7.0.0
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>

dependabot bot commented on behalf of github Mar 4, 2026

Labels

The following labels could not be found: dependency. Please create it before Dependabot can add it to a pull request.

Please fix the above issues or remove invalid values from dependabot.yml.
