feat(perf): enhance performance by improving disk caching and applying other optimizations #556
Add persistent namespace caching to significantly improve warm start performance. Cached namespaces are loaded from disk instead of being rebuilt from scratch.

Key changes:
- Add NamespaceMetaData and NamespaceCacheData frozen dataclasses for cache serialization with validation fields (mtime, content_hash, python_executable, sys_path_hash)
- Add atomic cache writes using temp file + rename pattern
- Add reverse dependency tracking for efficient library/variable change propagation (get_library_users, get_variables_users)
- Skip content hash computation when mtime AND size match
- Add ResourceMetaData for resource caching

Tests:
- Unit tests for PickleDataCache atomic writes (28 tests)
- Unit tests for NamespaceMetaData and cached entries (20 tests)
- Unit tests for ResourceMetaData cache keys (15 tests)
- Integration tests for namespace caching behavior (11 tests)
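The temp file + rename pattern mentioned above can be sketched as follows; the function name is illustrative, not the PR's actual API. The key property is that readers only ever see a complete old file or a complete new file, never a partial write:

```python
import os
import pickle
import tempfile
from pathlib import Path


def atomic_cache_write(cache_path: Path, data: object) -> None:
    """Write pickled data to a temp file in the target directory, then
    atomically rename it into place with os.replace()."""
    cache_path.parent.mkdir(parents=True, exist_ok=True)
    # Temp file must be on the same filesystem as the target for the
    # rename to be atomic.
    fd, tmp_name = tempfile.mkstemp(dir=cache_path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            pickle.dump(data, f, protocol=pickle.HIGHEST_PROTOCOL)
        os.replace(tmp_name, cache_path)  # atomic on POSIX and Windows
    except BaseException:
        os.unlink(tmp_name)  # never leave a half-written temp file behind
        raise
```

A crash between `pickle.dump` and `os.replace` leaves at most a stray `.tmp` file, never a corrupt cache entry.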
Optimize workspace-wide reference operations from O(D) to O(k), where D = total documents and k = documents actually referencing the target.

Changes:
- Add reverse index data structures in DocumentsCacheHelper to track which documents reference each keyword/variable
- Use stable (source, name) tuple keys resilient to cache invalidation
- Implement diff-based updates to handle removed references after edits
- Add get_keyword_ref_users() and get_variable_ref_users() for O(1) lookup
- Update Find References to use reverse index with workspace scan fallback
- Update unused keyword/variable detection to use reverse index
I've included the second performance optimization, which can be found in this commit: 59e0314

Problem

Previously, workspace-wide reference operations (Find References, unused keyword/variable detection) required scanning all documents in the workspace for each lookup. This resulted in O(D) per-lookup complexity, where D is the total number of documents. Checking K keywords for unused detection therefore required O(K × D) operations, causing noticeable delays in large workspaces.

Solution

Added a reverse index that maps each keyword/variable to the documents that reference it. This reduces lookup complexity from O(D) to O(k), where k is the number of documents that actually use the target (typically much smaller than D).

Architecture

Before: O(D) Workspace Scan

```mermaid
flowchart LR
    subgraph "Find References (Before)"
        A[Request: Find refs to 'My Keyword'] --> B[Scan ALL documents]
        B --> C[doc_001.robot]
        B --> D[doc_002.robot]
        B --> E[doc_003.robot]
        B --> F[...]
        B --> G[doc_999.robot]
        C --> H[Check namespace]
        D --> H
        E --> H
        F --> H
        G --> H
        H --> I[Return matches]
    end
```
After: O(k) Reverse Index Lookup

```mermaid
flowchart LR
    subgraph "Find References (After)"
        A[Request: Find refs to 'My Keyword'] --> B[Lookup reverse index]
        B --> C["index[('source', 'My Keyword')]"]
        C --> D[Return: doc_003, doc_047, doc_891]
        D --> E[Scan only 3 documents]
        E --> F[Return matches]
    end
```
Data Structures

```mermaid
flowchart TB
    subgraph "Reverse Index Structure"
        direction TB
        A["_keyword_ref_users<br/>dict[tuple[str, str], WeakSet[TextDocument]]"]
        B["Key: (source, name)<br/>e.g. ('common.resource', 'My Keyword')"]
        C["Value: WeakSet of documents<br/>that reference this keyword"]
        A --> B
        A --> C
    end
    subgraph "Forward Index (for diff-based updates)"
        direction TB
        D["_doc_keyword_refs<br/>WeakKeyDictionary[TextDocument, set]"]
        E["Key: TextDocument"]
        F["Value: set of (source, name) tuples<br/>this document references"]
        D --> E
        D --> F
    end
```
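The reverse/forward index pair and the diff-based update can be sketched as below. This is a minimal illustration using plain sets and string document ids; the actual implementation uses `WeakSet[TextDocument]` and `WeakKeyDictionary` so entries disappear when documents are garbage-collected:

```python
from collections import defaultdict


class ReverseRefIndex:
    """Sketch of the reverse index (who uses a keyword?) plus the forward
    index (what does a document use?) needed for diff-based updates."""

    def __init__(self) -> None:
        # reverse index: stable (source, name) key -> documents referencing it
        self._keyword_ref_users: dict[tuple[str, str], set[str]] = defaultdict(set)
        # forward index: document -> keys it currently references
        self._doc_keyword_refs: dict[str, set[tuple[str, str]]] = {}

    def update_document(self, doc: str, refs: set[tuple[str, str]]) -> None:
        """Diff-based update: remove entries the edit dropped, add new ones."""
        old = self._doc_keyword_refs.get(doc, set())
        for key in old - refs:  # references removed by the edit
            users = self._keyword_ref_users[key]
            users.discard(doc)
            if not users:
                del self._keyword_ref_users[key]
        for key in refs - old:  # newly added references
            self._keyword_ref_users[key].add(doc)
        self._doc_keyword_refs[doc] = refs

    def get_keyword_ref_users(self, source: str, name: str) -> set[str]:
        """O(1) lookup: only the documents that actually reference the keyword."""
        return set(self._keyword_ref_users.get((source, name), ()))
```

Find References then only needs to scan the k documents returned here, instead of all D documents in the workspace.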
Extend namespace disk caching to include keyword references, variable references, and local variable assignments. This allows the analysis phase to be completely skipped when loading from a valid cache.

Key changes:
- Add KeywordRefKey and VariableRefKey stable keys for serialization
- Serialize/restore keyword_references, variable_references, and local_variable_assignments in namespace cache
- Implement 10% staleness threshold: if >10% of cached references cannot be resolved, fall back to fresh analysis
- Track references when loading fully-analyzed namespaces from cache
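The 10% staleness threshold amounts to a simple ratio check; a minimal sketch (function name and signature are illustrative):

```python
def should_discard_cache(
    total_refs: int, unresolved_refs: int, threshold: float = 0.10
) -> bool:
    """If more than `threshold` of the cached references cannot be resolved
    against the current workspace, distrust the cache and fall back to a
    fresh analysis instead of restoring stale data."""
    if total_refs == 0:
        return False  # nothing cached, nothing to distrust
    return unresolved_refs / total_refs > threshold
```

A small number of unresolved references is tolerated (e.g. a single renamed keyword), while widespread resolution failures indicate the cache no longer matches the workspace.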
- Format 4 source files to pass ruff style checks
- Fix filepath_base tests to use Path for cross-platform compatibility instead of hardcoded Unix byte strings for hash computation
Enable cache restoration for files using keywords with [Arguments],
significantly improving warm start performance for real-world codebases.
Previously, cache restoration failed for any file that referenced keywords
with arguments because ArgumentDefinition objects from [Arguments] sections
were not included in the variable lookup. This caused the 10% missing
threshold to be exceeded, discarding the cache and triggering full
re-analysis - making the cache ineffective for most production code.
Fix:
- Include ArgumentDefinitions from resource/library keyword [Arguments]
sections in variable reference lookup during cache restoration
- Handle position encoding differences between _get_argument_definitions_from_line
(original ${} positions) and _visit_Arguments (stripped positions)
- Enable caching of KeywordDoc.argument_definitions by removing nosave metadata
Code quality improvements:
- Extract _make_variable_ref_key() helper to eliminate duplication
- Add _VAR_PREFIX_LEN/_VAR_WRAPPER_LEN constants for variable syntax parsing
- Refactor nested function to testable class method
- Fix end_col_offset calculation for recreated VariableDefinitions
- Enhance debug logging for cache troubleshooting
…ache support
- Add source/name indexed caches for O(1) LibraryDoc lookups, avoiding
repeated iteration during namespace cache restoration
- Extract _cache_libdoc() helper and use targeted cache invalidation
- Add _add_embedded_argument_definitions_from_keywords() to restore
variable refs for embedded args in keyword names (e.g., "My KW ${arg}")
- Normalize variable cache keys to strip RF 7.3+ type annotations
- Handle library arguments with col_offset=-1 sentinel value
- Fix import sorting in namespace.py
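Stripping the RF 7.3+ type annotations mentioned above can be sketched as follows. This is a hypothetical helper, not the PR's actual code, and it assumes the `${name: type}` spelling with a space after the colon that Robot Framework 7.3 introduced:

```python
import re

# Matches a "name: type" body, e.g. "timeout: int" inside "${timeout: int}".
# The space after the colon is required by the RF 7.3 syntax.
_TYPE_ANNOTATION = re.compile(r"^(?P<name>[^:]+):\s.*$")


def normalize_variable_key(name: str) -> str:
    """Hypothetical sketch: reduce '${timeout: int}' to '${timeout}' so an
    annotated and an unannotated spelling produce the same cache key."""
    if not (name.startswith("${") and name.endswith("}")):
        return name
    body = name[2:-1]
    m = _TYPE_ANNOTATION.match(body)
    if m:
        body = m.group("name").rstrip()
    return "${" + body + "}"
```

Without this normalization, a file written for RF 7.3 and a cache entry built under an older parse would key the same variable differently and spuriously miss the cache.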
Fix cache restoration issues that caused a 0% hit rate for fully analyzed
namespaces and only a 72.5% hit rate on RF 7.3+.
Key fixes:
1. Add end_col_offset to VariableRefKey for accurate position restoration
- Different variable types use different position encodings (stripped vs original)
- Storing end_col_offset directly avoids calculation mismatches
2. Fix RF 7.3+ type annotation handling in _strip_argument_definition
- Argument names may include type annotations (e.g., ${timeout: int})
- Now normalizes names before computing position lengths
3. Fix embedded argument restoration to use ORIGINAL positions
- Changed from stripped positions to match namespace_analyzer.py behavior
- Extract type annotations and patterns using EMBEDDED_ARGUMENTS_MATCHER
4. Cache FOR loop and exception variables in _local_variable_assignments
- These were added to _variable_references but not serialized
- Now properly tracked for cache serialization
Performance impact:
- Cache hit rate: 0% → 100% (all RF versions)
- Analysis phase speedup: ~15-40x for cached files
- Overall warm start speedup: ~2-4x
Sorry it took me so long to come back to this PR — I wanted to take the time to review it properly. Thanks for the PR — there are some good and thoughtful ideas in here, especially around cache usage, invalidation, and avoiding unnecessary repeated work.

I’m going to close the PR in its current form for now. The main reason is that it has grown too large for me to review properly and with enough confidence. It bundles a lot of different changes together, and alongside the actual functional work there are also changes that do not directly belong to the problem being addressed. For example, there are unrelated cleanup-style changes such as converting type hints to the Python 3.10 style, which makes the PR harder to review as a focused change. That makes it difficult to separate the relevant behavioral changes from unrelated cleanup, and in turn makes the PR more difficult to review with enough trust.

On top of that, my impression during review was that in several places the PR is trying to mitigate observed symptoms rather than addressing the underlying cause in a clear and isolated way. That makes it harder for me to build confidence in the change as a whole.

I also would have liked to see tests focused more directly on the behavior that actually carries risk here, especially around cache validity, invalidation, and consistency of results. Some of the added tests did not give me much additional confidence in those areas, because a few of them appear to mostly verify very simple data-structure behavior or dataclass field handling rather than the actual high-risk logic in this change. In those cases, the tests feel closer to checking basic Python functionality than meaningfully reducing the risk of the PR itself.

That said, I did pick up a few of the ideas from this PR, because I think the general direction makes sense. I’ve just implemented parts of it differently and in a more focused way for the concrete issue I wanted to solve.
More broadly, I’m currently thinking about reworking the workspace analysis layer more substantially. That part of the codebase has grown quite a bit over time, and even for me it has become increasingly hard to fully understand and reason about some of the code I wrote there earlier ;-). Because of that, I’m a bit hesitant to merge larger incremental changes on top of the current structure unless I’m very confident in them.

Once I have something in a shape that is easier to discuss, I’d be very interested in hearing your thoughts. And if you’d like to be involved, I’d be happy to have your help with reviewing, testing, or discussing the overall direction.

Thanks again for taking the time to put this together and for the work you put into it. I appreciate it, and I’d be happy to continue the discussion once I have something in a shape that is easier to share.
Summary
This PR implements a comprehensive performance improvement plan that dramatically reduces Language Server startup time and improves edit responsiveness.
- `LibraryDoc` objects to `.robotcode_cache/*/`
- `resource/`
- `max_workers=min(cpu_count, 4))`
- `get_library_users()` and `get_variables_users()` for instant change propagation

Testing Plan
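The shared, size-capped process pool (`max_workers=min(cpu_count, 4)`) referenced above can be sketched as a lazily created singleton; the function name is illustrative, not the PR's actual API:

```python
import os
from concurrent.futures import ProcessPoolExecutor
from typing import Optional

_executor: Optional[ProcessPoolExecutor] = None


def get_shared_executor() -> ProcessPoolExecutor:
    """Create one process pool on first use and reuse it for all library
    loads, instead of spawning a fresh pool per import. Worker count is
    capped at 4 so large machines don't fork dozens of Python processes."""
    global _executor
    if _executor is None:
        _executor = ProcessPoolExecutor(max_workers=min(os.cpu_count() or 1, 4))
    return _executor
```

Every library import then submits to the same pool, so worker startup cost is paid once per language-server session rather than once per library.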
Architecture
Architecture Overview
```mermaid
flowchart TB
    subgraph cache["Disk Cache (.robotcode_cache/)"]
        lib["libdoc/<br/>Library docs"]
        res["resource/<br/>Resource docs"]
        ns["namespace/<br/>Namespace state"]
    end
    subgraph perf["Performance Optimizations"]
        p1["Shared Process Pool"]
        p2["Resource Caching"]
        p3["Targeted Invalidation"]
        p4["Parallel Library Loading"]
        p5["O(1) Dependency Lookups"]
        p7["Namespace Caching"]
        p9["Atomic Writes"]
    end
    subgraph maps["Reverse Dependency Maps"]
        imp["_importers<br/>source → documents"]
        lu["_library_users<br/>lib → documents"]
        vu["_variables_users<br/>var → documents"]
    end
    p1 --> lib
    p2 --> res
    p7 --> ns
    p9 --> ns
    p3 --> imp
    p5 --> lu
    p5 --> vu
```

Cold vs Warm Start Flow
```mermaid
flowchart TB
    subgraph startup["IDE Startup"]
        open["User Opens VS Code"]
    end
    subgraph cold["Cold Start (No Cache) ~2-4 min"]
        c1["Parse .robot files"]
        c2["Resolve imports in parallel"]
        c3["Load libraries<br/>(shared executor)"]
        c4["Build namespaces"]
        c5["Save to cache<br/>(atomic write)"]
        c1 --> c2 --> c3 --> c4 --> c5
    end
    subgraph warm["Warm Start (Cache Hit) ~10-20 sec"]
        w1["Check .cache.pkl exists"]
        w2{"Validate:<br/>mtime + size?"}
        w3["Load (meta, spec)"]
        w4["Check environment identity"]
        w5["Restore namespace"]
        w1 --> w2
        w2 -->|"Match"| w3 --> w4 --> w5
        w2 -->|"Changed"| miss["Rebuild"]
    end
    subgraph runtime["Runtime Editing"]
        r1["File changed"]
        r2["O(1) lookup affected docs"]
        r3["Targeted invalidation"]
        r4["Rebuild only affected"]
        r1 --> r2 --> r3 --> r4
    end
    open --> cold
    open --> warm
    style cold fill:#ffcccc,stroke:#cc0000
    style warm fill:#ccffcc,stroke:#00cc00
    style runtime fill:#cce5ff,stroke:#0066cc
```

Cache Validation Chain
```mermaid
sequenceDiagram
    participant LS as Language Server
    participant Cache as Disk Cache
    participant FS as File System
    Note over LS,FS: Warm Start Validation
    LS->>Cache: Load .cache.pkl
    Cache-->>LS: (meta, spec) tuple
    LS->>FS: stat(source_file)
    FS-->>LS: mtime, size
    alt mtime matches
        alt size matches
            Note over LS: Skip content hash!<br/>(Fast path)
            LS->>LS: Check python_executable
            LS->>LS: Check sys_path_hash
            alt Environment matches
                LS->>LS: Restore from cache ✓
            else Environment changed
                LS->>LS: Rebuild namespace
            end
        else size differs
            LS->>FS: Read first+last 64KB
            FS-->>LS: content chunks
            LS->>LS: Compute tiered hash
            alt Hash matches
                LS->>LS: Restore from cache ✓
            else Hash differs
                LS->>LS: Rebuild namespace
            end
        end
    else mtime differs
        LS->>LS: Rebuild namespace
    end
```
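The mtime+size fast path and the tiered first+last-64KB hash from the validation chain can be sketched as below. This is a simplified illustration (names are not the PR's actual API, and the environment-identity checks are omitted):

```python
import hashlib
import os
import tempfile
from pathlib import Path

_CHUNK = 64 * 1024  # hash only the first and last 64 KB of the file


def tiered_content_hash(path: Path) -> str:
    """Hash the file size plus its first and last 64 KB instead of the
    whole file, so validating large suites stays cheap."""
    size = path.stat().st_size
    h = hashlib.sha256(str(size).encode())
    with path.open("rb") as f:
        h.update(f.read(_CHUNK))
        if size > _CHUNK:
            # Seek past the already-read prefix to avoid double-hashing
            # overlapping bytes in mid-sized files.
            f.seek(max(size - _CHUNK, _CHUNK))
            h.update(f.read(_CHUNK))
    return h.hexdigest()


def cache_is_valid(
    path: Path, cached_mtime: float, cached_size: int, cached_hash: str
) -> bool:
    """Fast path: if mtime AND size both match, skip hashing entirely.
    Otherwise fall back to comparing the tiered content hash."""
    st = path.stat()
    if st.st_mtime == cached_mtime and st.st_size == cached_size:
        return True
    return tiered_content_hash(path) == cached_hash
```

A `touch`ed but unmodified file still validates via the hash fallback, while any real content change at the head or tail of the file invalidates the entry.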