
fix: add encoding=utf-8 to CLI file operations #1807

Open
nightcityblade wants to merge 1 commit into unclecode:main from nightcityblade:fix/issue-1762


@nightcityblade

Summary

On Windows, Python's open() uses the locale's default encoding (e.g. cp1252) when no encoding is given. That encoding cannot represent many Unicode characters found in crawled content, so writing output files via the CLI fails with 'charmap' codec can't encode character errors.

Fixes #1762

List of files changed and why

crawl4ai/cli.py — Added encoding="utf-8" to all 6 open() calls (config read/write + 4 output file writes)
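As a rough illustration of the pattern (not the actual code in crawl4ai/cli.py), the change amounts to passing encoding="utf-8" on every text-mode open(). The helper names `save_config`, `load_config`, and `write_output` below are hypothetical and exist only for this sketch.

```python
import json


def save_config(path, config):
    # Hypothetical helper: write a config dict as JSON, always as UTF-8.
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, ensure_ascii=False, indent=2)


def load_config(path):
    # Hypothetical helper: read the config back with the same encoding.
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


def write_output(path, text):
    # Hypothetical helper: write crawled text without relying on the
    # platform's locale encoding.
    with open(path, "w", encoding="utf-8") as f:
        f.write(text)
```

Reading and writing with the same explicit encoding keeps the config round trip consistent regardless of the platform's locale settings.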

How Has This Been Tested?

  • Code review: all file I/O paths in CLI now explicitly use UTF-8
  • The fix matches Python best practice for cross-platform Unicode file handling
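The failure mode can also be reproduced on any platform by forcing the encoding that Windows would pick by default. The snippet below is a minimal sketch, assuming cp1252 as the locale encoding; the `sample` string and the `try_write` helper are made up for illustration and are not part of the PR.

```python
import os
import tempfile

# Illustrative text containing characters that cp1252 cannot represent.
sample = "crawled text with an arrow \u2192 and CJK \u4e2d\u6587"


def try_write(encoding):
    """Return True if `sample` can be written with `encoding`, else False."""
    fd, path = tempfile.mkstemp(suffix=".md")
    os.close(fd)
    try:
        with open(path, "w", encoding=encoding) as f:
            f.write(sample)
        return True
    except UnicodeEncodeError:
        # With cp1252 this is the "'charmap' codec can't encode character" case.
        return False
    finally:
        os.remove(path)


cp1252_ok = try_write("cp1252")  # simulates the Windows default
utf8_ok = try_write("utf-8")     # the encoding this PR sets explicitly
```

A check like this could serve as the basis for a regression test, since it does not depend on running under a Windows locale.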

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have added/updated unit tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

On Windows, the default encoding (e.g. 'charmap' / cp1252) cannot
encode certain Unicode characters found in crawled content, causing
'charmap' codec errors when writing output files.

Explicitly set encoding='utf-8' on all file open() calls in the CLI.

Fixes unclecode#1762

Development

Successfully merging this pull request may close these issues.

[Bug]: CLI Error charmap
