4 changes: 2 additions & 2 deletions docs/docs/extraction/nv-ingest-python-api.md
@@ -372,7 +372,7 @@ You can use this to generate descriptions of unstructured images, infographics,

!!! note

-    To use the `caption` option, enable the `vlm` profile when you start the NeMo Retriever Library services. The default model used by `caption` is `nvidia/llama-3.1-nemotron-nano-vl-8b-v1`. For more information, refer to [Profile Information in the Quickstart Guide](quickstart-guide.md#profile-information).
+    To use the `caption` option, enable the `vlm` profile when you start the NeMo Retriever Library services. The default model used by `caption` is `nvidia/nemotron-nano-12b-v2-vl`. For more information, refer to [Profile Information in the Quickstart Guide](quickstart-guide.md#profile-information).

### Basic Usage

@@ -389,7 +389,7 @@ To specify a different API endpoint, pass additional parameters to `caption`.
```python
ingestor = ingestor.caption(
endpoint_url="https://integrate.api.nvidia.com/v1/chat/completions",
-    model_name="nvidia/llama-3.1-nemotron-nano-vl-8b-v1",
+    model_name="nvidia/nemotron-nano-12b-v2-vl",
api_key="nvapi-"
)
```
8 changes: 4 additions & 4 deletions docs/docs/extraction/python-api-reference.md
@@ -464,7 +464,7 @@ You can use this to generate descriptions of unstructured images, infographics,

!!! note

-    To use the `caption` option, enable the `vlm` profile when you start the NeMo Retriever Library services. The default model used by `caption` is `nvidia/llama-3.1-nemotron-nano-vl-8b-v1`. For more information, refer to [Profile Information in the Quickstart Guide](quickstart-guide.md#profile-information).
+    To use the `caption` option, enable the `vlm` profile when you start the NeMo Retriever Library services. The default model used by `caption` is `nvidia/nemotron-nano-12b-v2-vl`. For more information, refer to [Profile Information in the Quickstart Guide](quickstart-guide.md#profile-information).

### Basic Usage

@@ -481,7 +481,7 @@ To specify a different API endpoint, pass additional parameters to `caption`.
```python
ingestor = ingestor.caption(
endpoint_url="https://integrate.api.nvidia.com/v1/chat/completions",
-    model_name="nvidia/llama-3.1-nemotron-nano-vl-8b-v1",
+    model_name="nvidia/nemotron-nano-12b-v2-vl",
api_key="nvapi-"
)
```
@@ -570,11 +570,11 @@ The `store` task uses [fsspec](https://filesystem-spec.readthedocs.io/) for stor
| Amazon S3 | `s3://` | `s3://my-bucket/extracted-images` |
| Google Cloud Storage | `gs://` | `gs://my-bucket/images` |
| Azure Blob Storage | `abfs://` | `abfs://container@account.dfs.core.windows.net/images` |
-| MinIO (S3-compatible) | `s3://` | `s3://nemo-retriever/artifacts/store/images` (default) |
+| MinIO (S3-compatible) | `s3://` | `s3://nv-ingest/artifacts/store/images` (default) |

!!! tip

-    `storage_uri` defaults to the server-side `IMAGE_STORAGE_URI` environment variable (commonly `s3://nemo-retriever/...`). If you change that variable—for example to a host-mounted `file://` path—restart the NeMo Retriever Library runtime so the container picks up the new value.
+    `storage_uri` defaults to the server-side `IMAGE_STORAGE_URI` environment variable (commonly `s3://nv-ingest/...`). If you change that variable—for example to a host-mounted `file://` path—restart the NeMo Retriever Library runtime so the container picks up the new value.

When `public_base_url` is provided, the metadata returned from `ingest()` surfaces that HTTP(S) link while still recording the underlying storage URI. Leave it unset when the storage endpoint itself is already publicly reachable.
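To illustrate how the URI scheme in `storage_uri` selects a backend from the table above, here is a small stdlib-only sketch. The `storage_backend` helper and its mapping are hypothetical, written for this documentation; the real dispatch is performed by fsspec inside the service.

```python
from urllib.parse import urlparse

def storage_backend(uri: str) -> str:
    # Hypothetical helper: map a storage_uri scheme to the backend
    # named in the table above. fsspec does the real dispatch.
    backends = {
        "s3": "Amazon S3 (also MinIO)",
        "gs": "Google Cloud Storage",
        "abfs": "Azure Blob Storage",
    }
    scheme = urlparse(uri).scheme
    return backends.get(scheme, "local filesystem")

print(storage_backend("s3://nv-ingest/artifacts/store/images"))
# Amazon S3 (also MinIO)
```

A bare path such as `/data/images` has no scheme, so it falls through to the local filesystem, mirroring how fsspec treats scheme-less URIs.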

2 changes: 1 addition & 1 deletion docs/docs/extraction/quickstart-guide.md
@@ -395,7 +395,7 @@ You can specify multiple `--profile` options.
| `retrieval` | Core | Enables the embedding NIM and (optional) GPU-accelerated Milvus. Omit this profile to use the default LanceDB backend. |
| `audio` | Advanced | Use [Riva](https://docs.nvidia.com/deeplearning/riva/user-guide/docs/index.html) for processing audio files. For more information, refer to [Audio Processing](audio.md). |
| `nemotron-parse` | Advanced | Use [nemotron-parse](https://build.nvidia.com/nvidia/nemotron-parse), which adds state-of-the-art text and table extraction. For more information, refer to [Advanced Visual Parsing](nemoretriever-parse.md). |
-| `vlm` | Advanced | Use [llama 3.1 Nemotron 8B Vision](https://build.nvidia.com/nvidia/llama-3.1-nemotron-nano-vl-8b-v1/modelcard) for image captioning of unstructured images and infographics. This profile enables the `caption` method in the Python API to generate text descriptions of visual content. For more information, refer to [Use Multimodal Embedding](vlm-embed.md) and [Extract Captions from Images](nv-ingest-python-api.md#extract-captions-from-images). |
+| `vlm` | Advanced | Use [Nemotron Nano 12B v2 VL](https://build.nvidia.com/nvidia/nemotron-nano-12b-v2-vl/modelcard) for image captioning of unstructured images and infographics. This profile enables the `caption` method in the Python API to generate text descriptions of visual content. For more information, refer to [Use Multimodal Embedding](vlm-embed.md) and [Extract Captions from Images](nv-ingest-python-api.md#extract-captions-from-images). |

### Example: Using the VLM Profile for Infographic Captioning
