Updated files per bugs 5970369, 5966307, and 5966925 #1740

Open
kheiss-uwzoo wants to merge 2 commits into NVIDIA:26.03 from kheiss-uwzoo:kheiss/qa-review6
@kheiss-uwzoo
Collaborator

Summary

This branch brings the 26.03 / NeMo Retriever Library work forward relative to main: library and pipeline behavior, Helm chart and deployment assets, extraction documentation, and supporting API/CI changes. It is a large, multi-topic delta (on the order of 110 files, ~4.8k insertions / ~2.2k deletions vs main).

The most recent commit (NVBugs 5970369, 5966307, 5966925) is a focused documentation fix. It aligns the audio extraction docs with the Parakeet ASR NIM actually used in docker-compose (nvcr.io/nim/nvidia/parakeet-1-1b-ctc-en-us); replaces outdated Riva naming and broken Riva links with NVIDIA Speech NIM documentation; corrects support matrix table labels and related links; updates quickstart and Python API cross-references; and fixes the ulimit examples in troubleshooting by removing invalid comma thousands separators from the shell snippets.
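
For reviewers, the ulimit fix can be illustrated with a minimal sketch. The value `65,536` is hypothetical, since the actual numbers in the troubleshooting page are not shown in this summary. The point is only that `ulimit` expects a plain integer, so a comma-formatted value is typically rejected as an invalid number.

```shell
#!/bin/sh
# Hypothetical value for illustration only; not taken from the docs in this PR.
requested="65,536"

# Strip thousands separators so the shell receives a plain integer.
plain="$(printf '%s' "$requested" | tr -d ',')"

# Query the current soft limit on open file descriptors.
current="$(ulimit -n)"

echo "requested=${plain} current=${current}"

# To apply the limit, pass the plain integer. This may fail if the value
# exceeds the hard limit for the current user:
#   ulimit -n "${plain}"
```

The `ulimit -n "${plain}"` call is left commented out so the sketch does not change limits on the machine where it is run.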

What changed (high level)

  • NeMo Retriever Library (nemo_retriever/): ingest modes, batch/in-process pipelines, PDF and text handling, markdown output, OCR/page elements, rerank support and tests, retriever/recall behavior, benchmarks, and dependency updates.
  • Helm: chart and values updates, Nemotron / NIM naming alignment in templates (e.g. page elements, table structure, graphic elements, OCR, embedder, reranker), RTX PRO 4500 overrides, README and gotemplate adjustments.
  • Deployment: docker-compose and profile-specific compose files, deployment validation script tweaks.
  • Docs (docs/docs/extraction/): extraction guides, support matrix, quickstart, API references, release notes, and related pages refreshed for current product and pipeline defaults.
  • API / services: internal NIM interfaces (e.g. OCR, YOLOX), text splitting, Redis ingest service touchpoints.
  • Repo hygiene: root README and CONTRIBUTING updates, CI workflow for retriever unit tests, harness/tooling adjustments.

@kheiss-uwzoo kheiss-uwzoo requested a review from a team as a code owner March 27, 2026 16:56
@kheiss-uwzoo kheiss-uwzoo requested review from charlesbluca and removed request for a team March 27, 2026 16:56
@kheiss-uwzoo kheiss-uwzoo requested review from jdye64 and sosahi March 27, 2026 16:57
