Commit a0cfa7a

Add vllm provider system (#45)
* Add vllm provider system
* use the user host when remote, not the default
* Update image calls to use user server config
* lint tweaks

Signed-off-by: jphillips <josh.phillips@fearnworks.com>
1 parent d65ed62 commit a0cfa7a

24 files changed

Lines changed: 886 additions & 281 deletions

Taskfile.yml

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ includes:
   data: ./servers/data_service/Taskfile.data.yml
   inference: ./servers/inference_bridge/server/Taskfile.inference.yml
   studio: ./graphcap_studio/Taskfile.studio.yml
-
+  models: ./servers/model_runners/Task.models.yml
 tasks:
   dev:
     desc: Start the Docker Compose services in watch mode
Lines changed: 103 additions & 0 deletions
@@ -0,0 +1,103 @@
===================================
Running Models with vLLM Provider
===================================

This guide explains how to run pre-configured Large Language Models (LLMs) using the vLLM engine via Docker containers managed by Taskfile.

Prerequisites
=============

- Docker installed and running.
- NVIDIA drivers supporting CUDA installed.
- NVIDIA Container Toolkit installed.
- Task (Go Task runner) installed (see https://taskfile.dev/installation/).
- A Hugging Face Hub token set in your environment variable ``HUGGING_FACE_HUB_TOKEN``.

Available Models and Configurations
===================================

The models are managed via the ``servers/model_runners/Task.models.yml`` file. Each task corresponds to a specific model and hardware configuration.

Key Configuration Parameters
----------------------------

The Taskfile sets environment variables that configure the vLLM server within the Docker container:

- ``MODEL_ID``: The Hugging Face identifier for the model (e.g., ``Qwen/Qwen2.5-VL-32B-Instruct-AWQ``).
- ``QUANTIZATION``: The quantization method used (e.g., ``awq_marlin``). Determines the ``dtype`` used (``float16`` for AWQ/Marlin, ``bfloat16`` otherwise).
- ``VRAM_TARGET``: The target VRAM per GPU in GB (e.g., ``48``, ``24``). Used to select appropriate memory settings.
- ``TENSOR_PARALLEL``: The number of GPUs to use for tensor parallelism (e.g., ``1``, ``2``).
- ``GPU_MEM_UTIL``: Target GPU memory utilization fraction (e.g., ``0.90``).
- ``MAX_SEQS``: Maximum number of sequences the engine can handle concurrently.
- ``MAX_MODEL_LEN``: Maximum sequence length the model can process. This limits KV cache size.
- ``HOST_PORT``: The port on the host machine the server will be exposed on (default ``12434``).
- ``SERVED_MODEL_NAME``: The name the model is served under via the OpenAI API endpoint (default ``vision-worker``).

Running a Model
===============

To run a specific model configuration, use the ``task`` command followed by the task name or its alias. The tasks are defined in ``servers/model_runners/Task.models.yml``.

Example Commands:
-----------------

Run Qwen2.5-VL-32B-AWQ on a single 48GB GPU:

.. code-block:: bash

   task models:r32:48

Run Qwen2.5-VL-7B-AWQ on a single 16GB GPU:

.. code-block:: bash

   task models:r7:16

Run Qwen2.5-VL-32B-AWQ on two 24GB GPUs (TP=2):

.. code-block:: bash

   task models:r32:2x24

How it Works:
-------------

When you run a task:

1. The Taskfile sets the appropriate environment variables for the selected model and VRAM target.
2. It checks for and stops any existing container using the target ``HOST_PORT``.
3. It executes the ``servers/model_runners/_run_vllm_model.sh`` script.
4. The script pulls the latest ``vllm/vllm-openai:latest`` Docker image.
5. The script starts a Docker container using the environment variables to configure the ``vllm`` server arguments (like ``--model``, ``--quantization``, ``--tensor-parallel-size``, ``--max-model-len``, etc.).
6. The vLLM server starts inside the container, listening on the specified port (forwarded to ``HOST_PORT``).
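
The mapping from environment variables to server arguments in step 5 can be sketched as follows. This is only an illustration written in TypeScript: the real logic lives in the ``_run_vllm_model.sh`` shell script, which is not shown in this commit view. The ``VllmConfig`` shape, the helper names, the example values for ``maxSeqs`` and ``maxModelLen``, and the way "AWQ/Marlin" is detected from the quantization name are all assumptions. The flag names themselves are standard vLLM server options.

```typescript
// Hypothetical config shape mirroring the documented environment variables.
interface VllmConfig {
  modelId: string; // MODEL_ID
  quantization?: string; // QUANTIZATION
  tensorParallel: number; // TENSOR_PARALLEL
  gpuMemUtil: number; // GPU_MEM_UTIL
  maxSeqs: number; // MAX_SEQS
  maxModelLen: number; // MAX_MODEL_LEN
  servedModelName: string; // SERVED_MODEL_NAME
}

// Documented rule: float16 for AWQ/Marlin, bfloat16 otherwise.
// Assumption: the quantization name contains "awq" or "marlin" in those cases.
function selectDtype(quantization?: string): "float16" | "bfloat16" {
  return /awq|marlin/i.test(quantization ?? "") ? "float16" : "bfloat16";
}

// Build the argument list that would be passed to the vLLM server.
function buildVllmArgs(cfg: VllmConfig): string[] {
  const result = [
    "--model", cfg.modelId,
    "--dtype", selectDtype(cfg.quantization),
    "--tensor-parallel-size", String(cfg.tensorParallel),
    "--gpu-memory-utilization", String(cfg.gpuMemUtil),
    "--max-num-seqs", String(cfg.maxSeqs),
    "--max-model-len", String(cfg.maxModelLen),
    "--served-model-name", cfg.servedModelName,
  ];
  // Only pass --quantization when a method is configured.
  if (cfg.quantization) {
    result.push("--quantization", cfg.quantization);
  }
  return result;
}

const exampleArgs = buildVllmArgs({
  modelId: "Qwen/Qwen2.5-VL-32B-Instruct-AWQ",
  quantization: "awq_marlin",
  tensorParallel: 1,
  gpuMemUtil: 0.9,
  maxSeqs: 8, // illustrative value, not taken from the Taskfile
  maxModelLen: 8192, // illustrative value, not taken from the Taskfile
  servedModelName: "vision-worker",
});
console.log(exampleArgs.join(" "));
```

Keeping the dtype rule in one small function makes the documented "AWQ/Marlin versus everything else" decision easy to check in isolation.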

Utility Tasks
=============

Stopping Containers
-------------------

To stop running vLLM server containers:

- Stop all containers started by this Taskfile:

  .. code-block:: bash

     task models:stop

- Stop a specific container by its name (e.g., the one started by ``r32:48``):

  .. code-block:: bash

     task models:stop -- --name vllm-qwen25vl32b-48gb-tp1

Following Logs
--------------

To follow the logs of a specific running container:

.. code-block:: bash

   task models:logs -- --name vllm-qwen25vl32b-48gb-tp1

You must specify the container name using the ``--name`` flag.

graphcap_studio/src/components/responsive-image/useResponsiveImage.ts

Lines changed: 15 additions & 2 deletions
@@ -1,3 +1,5 @@
+import { SERVER_IDS } from "@/features/server-connections/constants";
+import { useServerConnections } from "@/features/server-connections/useServerConnections";
 import { getThumbnailUrl } from "@/services/images";
 import { generateSrcSet } from "@/utils/imageSrcSet";
 // SPDX-License-Identifier: Apache-2.0
@@ -50,6 +52,13 @@ export function useResponsiveImage({
   const [error, setError] = useState<Error | null>(null);
   const [retryCount, setRetryCount] = useState(0);

+  // Get connections state and find media server URL
+  const { connections } = useServerConnections();
+  const mediaServerConnection = connections.find(
+    (conn) => conn.id === SERVER_IDS.MEDIA_SERVER,
+  );
+  const mediaServerUrl = mediaServerConnection?.url ?? "";
+
   // Use refs to track timeout and abort controller
   const timeoutRef = useRef<NodeJS.Timeout | null>(null);
   const abortControllerRef = useRef<AbortController | null>(null);
@@ -164,8 +173,12 @@ export function useResponsiveImage({

   // Memoize the srcset string (default format, e.g., JPEG)
   const srcSet = useMemo(() => {
-    return generateSrcSet(imagePath, getThumbnailUrl, undefined, aspectRatio);
-  }, [imagePath, aspectRatio]);
+    // Partially apply getThumbnailUrl with the mediaServerUrl
+    const getThumbnailWithUrl = (path: string, width: number, height: number, format?: string) =>
+      getThumbnailUrl(mediaServerUrl, path, width, height, format);
+
+    return generateSrcSet(imagePath, getThumbnailWithUrl, undefined, aspectRatio);
+  }, [imagePath, aspectRatio, mediaServerUrl]);

   return {
     loading,
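
The partial application used in the `useMemo` callback above can be tried outside React with a small sketch. Only the parameter order of `getThumbnailUrl` (server URL first, then path, width, height, format) is taken from the diff. The URL layout, the `buildSrcSet` helper, and the example server address are hypothetical stand-ins, since the real `getThumbnailUrl` and `generateSrcSet` implementations are not part of this view.

```typescript
// Callback shape the srcset builder expects: no server URL parameter.
type ThumbnailUrlFn = (path: string, width: number, height: number, format?: string) => string;

// Hypothetical stand-in for getThumbnailUrl; the query layout is an assumption.
function getThumbnailUrl(
  baseUrl: string,
  path: string,
  width: number,
  height: number,
  format?: string,
): string {
  const query = `path=${encodeURIComponent(path)}&width=${width}&height=${height}`;
  return `${baseUrl}/thumbnail?${query}${format ? `&format=${format}` : ""}`;
}

// Simplified, hypothetical srcset builder that only knows the callback shape.
function buildSrcSet(
  path: string,
  getUrl: ThumbnailUrlFn,
  widths: number[],
  aspectRatio: number,
): string {
  return widths
    .map((w) => `${getUrl(path, w, Math.round(w / aspectRatio))} ${w}w`)
    .join(", ");
}

const mediaServerUrl = "http://localhost:32400"; // placeholder address

// Fix the first argument so the result matches ThumbnailUrlFn.
const getThumbnailWithUrl: ThumbnailUrlFn = (path, width, height, format) =>
  getThumbnailUrl(mediaServerUrl, path, width, height, format);

const srcSet = buildSrcSet("/datasets/demo/cat.jpg", getThumbnailWithUrl, [200, 400], 2);
console.log(srcSet);
```

Because the closure captures `mediaServerUrl`, the memoized value has to be recomputed when the URL changes, which is why the diff adds it to the dependency array.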

graphcap_studio/src/features/datasets/components/DeleteDatasetModal.tsx

Lines changed: 10 additions & 1 deletion
@@ -1,3 +1,5 @@
+import { SERVER_IDS } from "@/features/server-connections/constants";
+import { useServerConnections } from "@/features/server-connections/useServerConnections";
 import { useDeleteDataset } from "@/services/dataset";
 import { toast } from "@/utils/toast";
 // SPDX-License-Identifier: Apache-2.0
@@ -30,7 +32,14 @@ export function DeleteDatasetModal({
   onDatasetDeleted,
 }: DeleteDatasetModalProps) {
   const [error, setError] = useState<string | null>(null);
-  const deleteDatasetMutation = useDeleteDataset();
+
+  const { connections } = useServerConnections();
+  const mediaServerConnection = connections.find(
+    (conn) => conn.id === SERVER_IDS.MEDIA_SERVER,
+  );
+  const mediaServerUrl = mediaServerConnection?.url ?? "";
+
+  const deleteDatasetMutation = useDeleteDataset(mediaServerUrl);
   const isDeleting = deleteDatasetMutation.isPending;

   // Ref for the button that should receive initial focus (the Cancel button is safer)
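
The same three-line lookup appears in every component and hook touched by this commit. Read as a pure function, it behaves like the sketch below. The `ServerConnection` shape and the `"media-server"` id value are assumptions for illustration; the diff only shows that connections carry an `id` and a `url`, and that a missing connection falls back to an empty string.

```typescript
// Assumed minimal shape of a connection entry.
interface ServerConnection {
  id: string;
  url: string;
}

// Hypothetical id value; the real constant lives in server-connections/constants.
const SERVER_IDS = { MEDIA_SERVER: "media-server" } as const;

// Return the media server URL, or "" when no such connection is configured.
function findMediaServerUrl(connections: ServerConnection[]): string {
  const mediaServerConnection = connections.find(
    (conn) => conn.id === SERVER_IDS.MEDIA_SERVER,
  );
  return mediaServerConnection?.url ?? "";
}

const configured = findMediaServerUrl([
  { id: "inference-bridge", url: "http://localhost:32100" },
  { id: "media-server", url: "http://localhost:32400" },
]);
const missing = findMediaServerUrl([]);
console.log(JSON.stringify({ configured, missing }));
```

The empty-string fallback is what lets callers treat "no media server configured" as a falsy value when deciding whether to issue requests.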

graphcap_studio/src/features/datasets/components/image-uploader/useImageUploader.ts

Lines changed: 11 additions & 5 deletions
@@ -1,7 +1,7 @@
+import { SERVER_IDS } from "@/features/server-connections/constants";
+import { useServerConnections } from "@/features/server-connections/useServerConnections";
 import { useUploadImage } from "@/services/dataset";
 import { toast } from "@/utils/toast";
-// SPDX-License-Identifier: Apache-2.0
-// TODO: RESOLVE OLD DATASET NAME SYSTEM
 import { useCallback, useMemo, useState } from "react";
 import { useDropzone } from "react-dropzone";
 import { useDatasetContext } from "../../context/DatasetContext"; // Import context hook
@@ -41,11 +41,17 @@ export function useImageUploader({
     datasetError,
   } = useDatasetContext();

-  // Use the upload image mutation
-  const uploadImageMutation = useUploadImage();
+
+  const { connections } = useServerConnections();
+  const mediaServerConnection = connections.find(
+    (conn) => conn.id === SERVER_IDS.MEDIA_SERVER,
+  );
+  const mediaServerUrl = mediaServerConnection?.url ?? "";
+
+  const uploadImageMutation = useUploadImage(mediaServerUrl); // Pass URL to hook

   // Determine if the uploader should be disabled (no valid dataset, loading, or error)
-  const isDisabled = !selectedDataset || isLoadingDataset || !!datasetError;
+  const isDisabled = !selectedDataset || isLoadingDataset || !!datasetError || !mediaServerUrl;

   const onDrop = useCallback(
     async (acceptedFiles: File[]) => {
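
The widened `isDisabled` condition can be checked as a plain predicate. This is a sketch under assumptions: the `UploaderState` interface and the string type for `selectedDataset` are hypothetical, and only the boolean expression itself is taken from the diff.

```typescript
// Hypothetical snapshot of the values the hook combines.
interface UploaderState {
  selectedDataset: string | null;
  isLoadingDataset: boolean;
  datasetError: Error | null;
  mediaServerUrl: string;
}

// Same expression as in the diff: any missing prerequisite disables the uploader.
function isUploaderDisabled(state: UploaderState): boolean {
  return (
    !state.selectedDataset ||
    state.isLoadingDataset ||
    !!state.datasetError ||
    !state.mediaServerUrl
  );
}

const ready: UploaderState = {
  selectedDataset: "demo",
  isLoadingDataset: false,
  datasetError: null,
  mediaServerUrl: "http://localhost:32400", // placeholder address
};

console.log(isUploaderDisabled(ready));
console.log(isUploaderDisabled({ ...ready, mediaServerUrl: "" }));
```

With the extra clause, an unconfigured media server disables the drop zone up front instead of letting an upload fail after the user has selected files.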

graphcap_studio/src/features/datasets/context/DatasetContext.tsx

Lines changed: 9 additions & 2 deletions
@@ -1,4 +1,6 @@
 // SPDX-License-Identifier: Apache-2.0
+import { SERVER_IDS } from "@/features/server-connections/constants";
+import { useServerConnections } from "@/features/server-connections/useServerConnections";
 import { useListDatasets } from "@/services/dataset";
 import type { Dataset, Image } from "@/types";
 import {
@@ -55,12 +57,17 @@ const DatasetContext = createContext<DatasetContextType | undefined>(undefined);
  * Manages dataset state, including fetching the list internally.
  */
 export function DatasetProvider({ children }: DatasetProviderProps) {
-  // Fetch datasets internally
+  const { connections } = useServerConnections();
+  const mediaServerConnection = connections.find(
+    (conn) => conn.id === SERVER_IDS.MEDIA_SERVER,
+  );
+  const mediaServerUrl = mediaServerConnection?.url ?? "";
+
   const {
     data: datasetListResponse,
     isLoading: isLoadingList,
     error: listError,
-  } = useListDatasets();
+  } = useListDatasets(mediaServerUrl);

   // State for the *target* ID that we want to select
   const [targetDatasetId, setTargetDatasetId] = useState<string | undefined>(

graphcap_studio/src/features/datasets/hooks/useDatasets.ts

Lines changed: 23 additions & 9 deletions
@@ -1,3 +1,5 @@
+import { SERVER_IDS } from "@/features/server-connections/constants";
+import { useServerConnections } from "@/features/server-connections/useServerConnections";
 import {
   queryKeys,
   useAddImageToDataset as useAddImageToDatasetMutation,
@@ -32,6 +34,12 @@ export function useDatasets(datasetId: string | undefined) {

   const { navigateToDataset } = useDatasetNavigation();

+  const { connections } = useServerConnections();
+  const mediaServerConnection = connections.find(
+    (conn) => conn.id === SERVER_IDS.MEDIA_SERVER,
+  );
+  const mediaServerUrl = mediaServerConnection?.url ?? "";
+
   // Track the most recently uploaded images to prioritize them in the sort
   const [recentlyUploadedImages, setRecentlyUploadedImages] = useState<Set<string>>(new Set());

@@ -40,9 +48,8 @@ export function useDatasets(datasetId: string | undefined) {
     selectDatasetById(datasetId);
   }, [datasetId, selectDatasetById]);

-  // Mutations remain the same
-  const createDatasetMutation = useCreateDatasetMutation();
-  const addImageToDatasetMutation = useAddImageToDatasetMutation();
+  const createDatasetMutation = useCreateDatasetMutation(mediaServerUrl);
+  const addImageToDatasetMutation = useAddImageToDatasetMutation(mediaServerUrl);

   // Filter images based on the selectedDataset from context and the subfolder
   const filteredImages = useMemo(() => {
@@ -116,11 +123,14 @@ export function useDatasets(datasetId: string | undefined) {
    */
   const handleUploadComplete = useCallback(() => {
     const sharedQueryClient = getQueryClient();
-    // Invalidate the query that DatasetProvider uses internally
-    sharedQueryClient.invalidateQueries({ queryKey: queryKeys.datasetImages });
+
+    if (mediaServerUrl) {
+      sharedQueryClient.invalidateQueries({ queryKey: queryKeys.datasetImages(mediaServerUrl) });
+    } else {
+      console.warn("Cannot invalidate dataset images: Media Server URL not available.");
+    }

     // Optionally, manage recent images state locally as before
-    // This part might need refinement depending on how quickly context state updates
     if (selectedDataset?.images) {
       const newRecentImages = new Set(recentlyUploadedImages);
       for (const image of selectedDataset.images) {
@@ -132,13 +142,17 @@ export function useDatasets(datasetId: string | undefined) {
       }, 5 * 60 * 1000);
     }

-  }, [selectedDataset, recentlyUploadedImages]); // Added queryKeys
+  }, [selectedDataset, recentlyUploadedImages, mediaServerUrl]); // Added mediaServerUrl

   // Function to refetch datasets (might not be needed if context invalidates properly)
   const refetch = useCallback(() => {
     const sharedQueryClient = getQueryClient();
-    sharedQueryClient.refetchQueries({ queryKey: queryKeys.datasetImages });
-  }, []); // Added queryKeys
+    if (mediaServerUrl) {
+      sharedQueryClient.refetchQueries({ queryKey: queryKeys.datasetImages(mediaServerUrl) });
+    } else {
+      console.warn("Cannot refetch dataset images: Media Server URL not available.");
+    }
+  }, [mediaServerUrl]);

   return {
     // State derived directly from the refactored context
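
In this file `queryKeys.datasetImages` changes from a static key to a function of the media server URL, and invalidation is skipped when the URL is empty. A minimal sketch of that pattern follows. The key contents (`"datasets"`, `"images"`) and the `invalidateDatasetImages` helper are hypothetical, and the `invalidate` callback stands in for a query client's `invalidateQueries` so the example runs without any library.

```typescript
type QueryKey = readonly unknown[];

// Hypothetical key factory: the server URL becomes part of the cache key.
const queryKeys = {
  datasetImages: (mediaServerUrl: string) =>
    ["datasets", "images", mediaServerUrl] as const,
};

// Invalidate only when a URL is available; report whether anything was done.
function invalidateDatasetImages(
  invalidate: (key: QueryKey) => void,
  mediaServerUrl: string,
): boolean {
  if (!mediaServerUrl) {
    console.warn("Cannot invalidate dataset images: Media Server URL not available.");
    return false;
  }
  invalidate(queryKeys.datasetImages(mediaServerUrl));
  return true;
}

// Record invalidated keys instead of calling a real query client.
const invalidated: QueryKey[] = [];
const record = (key: QueryKey) => {
  invalidated.push(key);
};

const skipped = invalidateDatasetImages(record, "");
const applied = invalidateDatasetImages(record, "http://localhost:32400"); // placeholder address
console.log(JSON.stringify({ skipped, applied, invalidated }));
```

Scoping the key by URL keeps cached results from one media server from being reused after the user points the app at a different host.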

graphcap_studio/src/features/editor/components/ImageEditor.tsx

Lines changed: 14 additions & 5 deletions
@@ -1,8 +1,10 @@
+import { SERVER_IDS } from "@/features/server-connections/constants";
+import { useServerConnections } from "@/features/server-connections/useServerConnections";
 import {
-  type ImageProcessResponse,
-  getImageUrl,
-  useProcessImage,
+  getImageUrl,
+  useProcessImage,
 } from "@/services/images";
+import type { ImageProcessResponse } from "@/types";
 import { toast } from "@/utils/toast";
 import { useCallback, useState } from "react";
 import Cropper, { type Area } from "react-easy-crop";
@@ -25,7 +27,14 @@ export function ImageEditor({ imagePath, onSave, onCancel }: ImageEditorProps) {
   const [isSaving, setIsSaving] = useState(false);
   const [isEditing, setIsEditing] = useState(false);

-  const processImageMutation = useProcessImage();
+  // Get connections state and find media server URL
+  const { connections } = useServerConnections();
+  const mediaServerConnection = connections.find(
+    (conn) => conn.id === SERVER_IDS.MEDIA_SERVER,
+  );
+  const mediaServerUrl = mediaServerConnection?.url ?? "";
+
+  const processImageMutation = useProcessImage(mediaServerUrl);

   const onCropComplete = useCallback(
     (croppedArea: Area, croppedAreaPixels: Area) => {
@@ -119,7 +128,7 @@ export function ImageEditor({ imagePath, onSave, onCancel }: ImageEditorProps) {
       {isEditing ? (
         <>
           <Cropper
-            image={getImageUrl(imagePath)}
+            image={getImageUrl(mediaServerUrl, imagePath)}
             crop={crop}
             zoom={zoom}
             rotation={rotation}

graphcap_studio/src/features/editor/hooks/useImageEditor.ts

Lines changed: 14 additions & 3 deletions
@@ -1,3 +1,5 @@
+import { SERVER_IDS } from "@/features/server-connections/constants";
+import { useServerConnections } from "@/features/server-connections/useServerConnections";
 import { queryKeys } from "@/services/dataset";
 import type { Image } from "@/types";
 import { toast } from "@/utils/toast";
@@ -21,6 +23,12 @@ export function useImageEditor({ selectedDataset }: UseImageEditorProps) {
   const [isEditing, setIsEditing] = useState(false);
   const queryClient = useQueryClient();

+  const { connections } = useServerConnections();
+  const mediaServerConnection = connections.find(
+    (conn) => conn.id === SERVER_IDS.MEDIA_SERVER,
+  );
+  const mediaServerUrl = mediaServerConnection?.url ?? "";
+
   /**
    * Start editing an image
    */
@@ -40,13 +48,16 @@ export function useImageEditor({ selectedDataset }: UseImageEditorProps) {
     setIsEditing(false);

     // Invalidate cache for this dataset to refresh the images
-    if (selectedDataset) {
+    if (selectedDataset && mediaServerUrl) {
       queryClient.invalidateQueries({
         queryKey: queryKeys.datasetByName(selectedDataset),
       });
-      queryClient.invalidateQueries({ queryKey: queryKeys.datasetImages });
+
+      queryClient.invalidateQueries({ queryKey: queryKeys.datasetImages(mediaServerUrl) });
+    } else {
+      console.warn("Cannot invalidate queries: Dataset name or Media Server URL missing.");
     }
-  }, [queryClient, selectedDataset]);
+  }, [queryClient, selectedDataset, mediaServerUrl]);

   /**
    * Cancel editing
