feat(backend): add conversation export and import API (#1808) #3176
mrveiss merged 1 commit into Dev_new_gui
Add JSON/Markdown single-session export, bulk JSON archive, and JSON round-trip import with skip/replace/rename conflict resolution. Register as optional feature router. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
        )
    except ValueError:
        continue
    if not os.path.exists(chat_file):
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Copilot Autofix
AI about 5 hours ago
Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.
        continue
    if not os.path.exists(chat_file):
        continue
    async with aiofiles.open(chat_file, "r", encoding="utf-8") as fh:
Check failure
Code scanning / CodeQL
Uncontrolled data used in path expression High
Copilot Autofix
AI about 5 hours ago
At a high level, we should ensure that user-controlled session_id cannot produce arbitrary filenames even within the trusted chats_directory. The simplest robust fix is to sanitize or validate session_id locally in _load_full_session_data and reject values that contain characters which could change the intended naming pattern (path separators, traversal sequences, wildcards, etc.). This is in addition to the existing validate_relative_path protection, giving a clear guarantee that only simple, expected session IDs are used to construct filenames.
Best targeted fix without changing existing functionality:
- Inside `_load_full_session_data` (in `autobot-backend/services/conversation_export.py`), before constructing `filename_template`, normalize and validate the `session_id`.
- Enforce a strict pattern for `session_id` at this layer, for example allowing only ASCII letters, digits, underscore, and hyphen. This matches typical session or UUID formats and should not break legitimate usage.
- If `session_id` fails this check, log a warning and return `None` (behaving as "session not found"), so the API continues to raise a `ResourceNotFoundError` as it already does when `_load_full_session_data` returns `None`.
- Implement the check using the standard library only (e.g., `re`), to avoid new dependencies.
Concretely:
- Add `import re` at the top of `autobot-backend/services/conversation_export.py`.
- In `_load_full_session_data`, right after entering the `try` block (before getting `chats_directory` and before building `filename_template`), add a small validation snippet:

      if not re.fullmatch(r"[A-Za-z0-9_-]+", session_id):
          logger.warning("Rejected session_id with invalid characters: %r", session_id)
          return None

This ensures filename_template stays a simple filename, eliminating any chance to smuggle directory separators or similar, and should satisfy CodeQL since the path now depends on a strictly validated ID.
@@ -15,6 +15,7 @@
 import logging
 import time
 from typing import Any, Dict, List, Optional, Tuple
+import re

 logger = logging.getLogger(__name__)

@@ -271,6 +272,15 @@
     Returns None when the session does not exist or loading fails.
     """
     try:
+        # Defensive validation: ensure session_id cannot influence directory
+        # structure or introduce unexpected characters in filenames.
+        if not re.fullmatch(r"[A-Za-z0-9_-]+", session_id):
+            logger.warning(
+                "Rejected session_id with invalid characters for file access: %r",
+                session_id,
+            )
+            return None
+
         chats_directory = chat_history_manager._get_chats_directory()
         import os

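The allow-list check from the suggestion can be exercised in isolation. The following is a minimal sketch, not the project's code: `is_safe_session_id` and `resolve_chat_file` are hypothetical helper names, and the `chat_{session_id}.json` naming is an assumption made for illustration, since the real `filename_template` is not shown in this excerpt. It pairs the same regex with a resolved-path containment check as a second layer of defense.

```python
import re
from pathlib import Path
from typing import Optional

# Same allow-list as the autofix suggestion: ASCII letters, digits, "_" and "-".
_SESSION_ID_RE = re.compile(r"[A-Za-z0-9_-]+")


def is_safe_session_id(session_id: str) -> bool:
    """Return True only if session_id consists solely of allow-listed characters."""
    return bool(_SESSION_ID_RE.fullmatch(session_id))


def resolve_chat_file(chats_directory: str, session_id: str) -> Optional[Path]:
    """Build the chat file path, or return None if the ID is rejected.

    The "chat_{id}.json" naming is hypothetical and used only for illustration.
    """
    if not is_safe_session_id(session_id):
        return None
    base = Path(chats_directory).resolve()
    candidate = (base / f"chat_{session_id}.json").resolve()
    # Defense in depth: the resolved path must still sit under the base directory.
    if base not in candidate.parents:
        return None
    return candidate
```

Because `re.fullmatch` must consume the entire string, an empty ID or an ID with a trailing newline is rejected as well, which a `re.match` call with a `$` anchor would not guarantee.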
Summary
- `services/conversation_export.py`: JSON and Markdown single-session export, bulk JSON archive, JSON import with skip/replace/rename conflict resolution
- `api/conversation_export.py`: `GET /conversations/{id}/export?format=json|markdown`, `GET /conversations/export-all`, `POST /conversations/import`
- `feature_routers.py`

Closes #1808
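The three import conflict strategies named in the summary can be illustrated as follows. This is a hedged sketch only: `plan_import`, the in-memory set of existing IDs, and the `-imported-N` rename suffix are all assumptions for illustration. The excerpt does not show how the service actually implements skip, replace, or rename.

```python
from typing import Optional, Set, Tuple


def plan_import(
    session_id: str, existing_ids: Set[str], strategy: str
) -> Tuple[str, Optional[str]]:
    """Decide what to do with one imported session (hypothetical helper).

    Returns (action, target_id), where action is "create", "skip", or "replace".
    """
    if session_id not in existing_ids:
        # No conflict: import under the original ID.
        return ("create", session_id)
    if strategy == "skip":
        return ("skip", None)
    if strategy == "replace":
        return ("replace", session_id)
    if strategy == "rename":
        # Assumed suffix scheme: pick the first unused "-imported-N" variant.
        n = 1
        while f"{session_id}-imported-{n}" in existing_ids:
            n += 1
        return ("create", f"{session_id}-imported-{n}")
    raise ValueError(f"unknown conflict strategy: {strategy!r}")
```

The assumed rename suffix uses only letters, digits, and hyphens, so a renamed ID would still pass an allow-list such as `[A-Za-z0-9_-]+`.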
Test plan
🤖 Generated with Claude Code