Parallelize file search and text search operations using async for improved performance #27

@EH-MLS

Description

Currently, the file search and text search operations (e.g., in SearchTextOperation._search_text) run sequentially. For large repositories, or when searching across many files, this can become a significant performance bottleneck.

Proposed Improvement:

  • Refactor the file search and/or text search logic to utilize asynchronous programming (using asyncio and related async file IO libraries) for searching through files and/or lines, rather than using ThreadPoolExecutor or ProcessPoolExecutor.
  • Ensure that resource usage is managed carefully to avoid overwhelming the system, especially with very large file sets.
  • Maintain correct exception handling, especially for IO errors or regex issues, in an async execution context.
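
A minimal sketch of what the three points above could look like together, not the project's actual implementation. The names search_file, search_files, _scan_file, and max_concurrency are hypothetical. It uses only the standard library: asyncio.to_thread stands in for an async file IO library, an asyncio.Semaphore caps how many files are read at once, and IO errors are returned per file while an invalid regex fails up front.

```python
import asyncio
import re
from pathlib import Path

# All names below are illustrative assumptions, not taken from the project.


def _scan_file(path: Path, pattern: re.Pattern) -> list[tuple[int, str]]:
    """Blocking scan of one file; returns (line_number, line) per matching line."""
    matches = []
    with path.open("r", encoding="utf-8", errors="replace") as handle:
        for line_number, line in enumerate(handle, start=1):
            if pattern.search(line):
                matches.append((line_number, line.rstrip("\n")))
    return matches


async def search_file(path: Path, pattern: re.Pattern, limiter: asyncio.Semaphore):
    """Search one file; returns (path, matches, error) so one failure is isolated."""
    async with limiter:  # bounds the number of files being read concurrently
        try:
            # Stand-in for an async file IO library: run the blocking read off-loop.
            matches = await asyncio.to_thread(_scan_file, path, pattern)
            return path, matches, None
        except OSError as exc:
            # Report the IO error for this file instead of aborting the whole search.
            return path, [], exc


async def search_files(paths, regex: str, max_concurrency: int = 32):
    """Search all paths concurrently; results come back in input order."""
    try:
        pattern = re.compile(regex)
    except re.error as exc:
        # Fail once, before any file is opened, if the pattern itself is invalid.
        raise ValueError(f"invalid search pattern: {exc}") from exc
    limiter = asyncio.Semaphore(max_concurrency)
    tasks = (search_file(Path(p), pattern, limiter) for p in paths)
    return await asyncio.gather(*tasks)
```

From synchronous code this could be called as asyncio.run(search_files(paths, r"TODO")); inside an already-async operation it would simply be awaited. The value 32 for max_concurrency is an arbitrary placeholder and would need tuning against real repositories.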

Potential benefits:

  • Significantly improved search speed, especially for large codebases or patterns that match in many files.
  • More responsive behavior for users performing large searches.

Acceptance Criteria:

  • File and/or text search operations utilize async processing.
  • Performance improvements are measurable for large-scale searches.
  • No regression in error handling or result accuracy.
  • Parallelization is implemented in a way that is compatible with the async context of the operation.
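
One possible way to check the measurability and no-regression criteria is a small harness that times both code paths and compares their output. This is a sketch under assumptions: compare_searches is a hypothetical helper, and it assumes both search functions take (paths, pattern) and return a list of sortable items such as (path, line_number, line) tuples.

```python
import asyncio
import time


def compare_searches(sequential_search, async_search, paths, pattern):
    """Time both implementations and verify they return the same matches.

    sequential_search: hypothetical callable (paths, pattern) -> list of results
    async_search: hypothetical coroutine function with the same parameters
    Returns (sequential_seconds, async_seconds).
    """
    paths = list(paths)

    start = time.perf_counter()
    expected = sequential_search(paths, pattern)
    sequential_seconds = time.perf_counter() - start

    start = time.perf_counter()
    actual = asyncio.run(async_search(paths, pattern))
    async_seconds = time.perf_counter() - start

    # Compare order-insensitively, since concurrent code may reorder results.
    if sorted(expected) != sorted(actual):
        raise AssertionError("async search returned different results")
    return sequential_seconds, async_seconds
```

A single run is noisy, so any real measurement would repeat this several times on a large repository and compare the distributions rather than one pair of numbers.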

Metadata

Labels

enhancement: New feature or request
