114 commits
a167a22
feat: tensorrt support
matejpekar Jan 17, 2026
1d3310f
fix: remove flush
matejpekar Jan 17, 2026
4e27a48
feat: add docker files for cpu/gpu
Feb 8, 2026
fd3154d
feat: add PVC for TensorRT
Feb 8, 2026
eaac807
feat: add support of TensorRT for models
Feb 8, 2026
46fe8b1
feat: add TensorRT cache to workers
Feb 8, 2026
f07723e
add Jiri as coauthor
Feb 8, 2026
9d6e265
fix: remove gpu number from serve.deployment in code
Feb 8, 2026
e7612f9
fix: warning suppress
Feb 9, 2026
5945f10
feat: add jobs to download virchow2
Feb 10, 2026
8ef4cc5
feat: add model provider for hf
Feb 10, 2026
a6c427e
feat: add pvc for huggingface
Feb 10, 2026
27c7801
feat: add virchow2 model
Feb 10, 2026
e5d84cb
fix
Feb 10, 2026
e1fcb6c
fix: fine tune
Feb 10, 2026
e7ac073
feat: add into dockerfile
Feb 14, 2026
51f07a4
fix: remove installs from model
Feb 14, 2026
178f226
fix: based on official docs
Feb 14, 2026
5cc123f
fix
Feb 14, 2026
964114e
fix: remove comment
Feb 14, 2026
32434f6
feat: add more smaller yamls
Feb 23, 2026
3f74fb2
feat: add deploy script
Feb 23, 2026
df76f0f
feat: add deploy script
Feb 23, 2026
181f79e
chore: update docker gpu file
Jurgee Mar 13, 2026
156a7d4
feat: optimalize virchow2 deployment
Jurgee Mar 13, 2026
57176d6
fix: remove hf token, create new secret
Jurgee Mar 14, 2026
fe51ee2
fix
Jurgee Mar 14, 2026
9f37d03
Merge branch 'main' into feature/virchow2-model
Jurgee Mar 14, 2026
210c7e6
fix: remove intra threads
Jurgee Mar 14, 2026
bf7cff1
fix: lint
Jurgee Mar 14, 2026
6813264
fix: remove duplicity
Jurgee Mar 14, 2026
7510c9f
fixes
Jurgee Mar 14, 2026
2eae503
docker files
Jurgee Mar 14, 2026
c5095bd
fix: docker
Jurgee Mar 14, 2026
e94baec
chore: new docker image
Jurgee Mar 14, 2026
7cdd290
chore: cpu docker
Jurgee Mar 14, 2026
8cce2cb
fix
Jurgee Mar 14, 2026
7e329a8
final changes
Jurgee Mar 14, 2026
8dfea82
fix: usage of master branch
Jurgee Mar 15, 2026
e6f8603
Potential fix for pull request finding
Jurgee Mar 15, 2026
fb646c4
Potential fix for pull request finding
Jurgee Mar 15, 2026
b2d083c
Potential fix for pull request finding
Jurgee Mar 15, 2026
bfd90a9
Potential fix for pull request finding
Jurgee Mar 15, 2026
d64f50b
chore: prostate opt
Jurgee Mar 17, 2026
75ed923
fix: comments
Jurgee Mar 18, 2026
bfcbe5b
fix
Jurgee Mar 18, 2026
1247e72
fix: comment remove
Jurgee Mar 18, 2026
d1a3d97
chore: new docker image
Jurgee Mar 18, 2026
3d26432
fix: simple pip command
Jurgee Mar 22, 2026
5ccf860
feat: newly optimalized model
Jurgee Mar 23, 2026
c62f2ef
fix: remove hf token from gpu workers
Jurgee Mar 23, 2026
cea82e7
refactor: replace read_region_relative with read_tile in fetch_tissue…
Jurgee Mar 23, 2026
5816940
fix: Type error
Jurgee Mar 23, 2026
dbd60aa
refactor: new prostate optimalization
Jurgee Mar 23, 2026
327910e
fix: name
Jurgee Mar 23, 2026
3da8983
fix: previous version
Jurgee Mar 23, 2026
c4f21a3
fixes
Jurgee Mar 23, 2026
503d485
fix: mypy
Jurgee Mar 23, 2026
7bbd50c
fix: config mutation
Jurgee Mar 24, 2026
ae96ca1
fix: remove index url
Jurgee Mar 28, 2026
5cd8344
refactor: clean up Virchow2 deployment and simplify model loading
Jurgee Mar 28, 2026
92ac910
fixes
Jurgee Mar 28, 2026
fffcd4f
fix: add workspace size
Jurgee Mar 28, 2026
cc966e3
Update models/virchow2.py
Jurgee Mar 29, 2026
52a4361
fix: remove sufix
Jurgee Mar 29, 2026
9c1f23f
fix: review comments
Jurgee Mar 29, 2026
9075231
Update models/virchow2.py
Jurgee Mar 31, 2026
9a28ac9
fix: remove provider and logger
Jurgee Mar 31, 2026
c12c7cd
fix: main tokens in root
Jurgee Mar 31, 2026
27abbee
fix: remove hf model provider
Jurgee Apr 4, 2026
256df86
fix
Jurgee Apr 4, 2026
1beaa79
fixes
Jurgee Apr 6, 2026
7b5d423
fix: better imports
Jurgee Apr 6, 2026
0c6d2fe
fix: add intra num threads back to models
Jurgee Apr 7, 2026
44be907
fix: reviewer comments
Jurgee Apr 7, 2026
c66e911
fix: correct dim
Jurgee Apr 10, 2026
f1b934b
add testing logger
Jurgee Apr 10, 2026
f461e49
remove logger
Jurgee Apr 10, 2026
276824b
new metrics
Jurgee Apr 10, 2026
ca2f878
fix dim
Jurgee Apr 10, 2026
3d8749a
Update models/virchow2.py
Jurgee Apr 10, 2026
52c5373
fix: metrics
Jurgee Apr 10, 2026
819f1be
add gpu replicas
Jurgee Apr 10, 2026
17b9bbc
Merge branch 'feature/virchow2-model' into feature/prostate-opt
Jurgee Apr 12, 2026
5141517
replicas
Jurgee Apr 12, 2026
47e1e80
remove duplicity
Jurgee Apr 12, 2026
06f4b5d
Merge branch 'main' into feature/prostate-opt
Jurgee Apr 13, 2026
85b07ee
optimalizations
Jurgee Apr 14, 2026
f283aab
comments
Jurgee Apr 14, 2026
9c1385e
lower memory
Jurgee Apr 14, 2026
dada6f9
fix: optional trt sett
Jurgee Apr 15, 2026
5c55363
Merge branch 'feature/prostate-opt' into feature/spilt-ray-serve
Jurgee Apr 15, 2026
166f98b
feat: changeto optimalized models
Jurgee Apr 15, 2026
74ead7f
feat: only one place for name
Jurgee Apr 15, 2026
9433940
Update deploy.sh
Jurgee Apr 15, 2026
d5952e3
mypy fix
Jurgee Apr 15, 2026
5b7974e
Merge branch 'feature/spilt-ray-serve' of https://github.com/RationAI…
Jurgee Apr 15, 2026
1f9cbbd
change name
Jurgee Apr 16, 2026
354c2ee
remove
Jurgee Apr 16, 2026
47d6885
Merge branch 'main' into feature/spilt-ray-serve
Jurgee Apr 16, 2026
62d8303
fix name
Jurgee Apr 17, 2026
791dddf
Update deploy.sh
Jurgee Apr 17, 2026
6452ee0
Update deploy.sh
Jurgee Apr 17, 2026
2aeba33
Update pyproject.toml
Jurgee Apr 17, 2026
c4eb0cb
uv lock
Jurgee Apr 17, 2026
4c3ff7b
review fixes
Jurgee Apr 18, 2026
3413047
fix script
Jurgee Apr 18, 2026
38e3b3f
switch from klustomize to helm templates
Jurgee Apr 20, 2026
c0f03c0
better structure
Jurgee Apr 22, 2026
10c9ab5
uv lock
Jurgee Apr 22, 2026
8c11a54
fix name
Jurgee Apr 22, 2026
0f982e6
review fixes
Jurgee Apr 22, 2026
78c923f
rename gpu workers
Jurgee Apr 22, 2026
43f652b
drop enabled key
Jurgee Apr 22, 2026
5 changes: 5 additions & 0 deletions helm/rayservice/Chart.yaml
@@ -0,0 +1,5 @@
apiVersion: v2
name: rayservice
description: Model service Helm chart
version: 0.1.0
type: application
32 changes: 32 additions & 0 deletions helm/rayservice/applications/episeg-1.yaml
@@ -0,0 +1,32 @@
- name: episeg-1
  import_path: models.semantic_segmentation:app
  route_prefix: /episeg-1
  runtime_env:
    working_dir: https://github.com/RationAI/model-service/archive/refs/heads/main.zip
  deployments:
    - name: SemanticSegmentation
      max_ongoing_requests: 16
      max_queued_requests: 32
      autoscaling_config:
        min_replicas: 0
        max_replicas: 4
        target_ongoing_requests: 4
      ray_actor_options:
        num_cpus: 4
        memory: 12884901888
        num_gpus: 1
        runtime_env:
          env_vars:
            MLFLOW_TRACKING_URI: http://mlflow.rationai-mlflow:5000
      user_config:
        tile_size: 1024
        mpp: 0.468
        max_batch_size: 8
        batch_wait_timeout_s: 0.1
        intra_op_num_threads: 4
        trt_max_workspace_size: 6442450944
        trt_cache_path: /mnt/cache/trt_cache/episeg-1/b8
        trt_builder_optimization_level: 3
        model:
          _target_: providers.model_provider:mlflow
          artifact_uri: mlflow-artifacts:/10/39f821ed5b964c71a603cc6db196f9fd/artifacts/checkpoints/epoch=19-step=32020/model.onnx/model.onnx
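Reviewer note: `memory` and `trt_max_workspace_size` are written as raw byte counts, which are hard to read at a glance. The sketch below converts the values from this file into GiB. The helper name `to_gib` is hypothetical and not part of the repository.

```python
GIB = 2**30  # bytes per gibibyte


def to_gib(num_bytes: int) -> float:
    """Convert a raw byte count to GiB."""
    return num_bytes / GIB


# Values copied from episeg-1.yaml.
episeg_memory = 12884901888
episeg_trt_workspace = 6442450944

print(to_gib(episeg_memory))         # 12.0
print(to_gib(episeg_trt_workspace))  # 6.0
```

So this deployment asks Ray for 12 GiB of memory per replica and gives the TensorRT builder a workspace of up to 6 GiB.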
19 changes: 19 additions & 0 deletions helm/rayservice/applications/heatmap-builder.yaml
@@ -0,0 +1,19 @@
- name: heatmap-builder
  import_path: builders.heatmap_builder:app
  route_prefix: /heatmap-builder
  runtime_env:
    working_dir: https://github.com/RationAI/model-service/archive/refs/heads/main.zip
  deployments:
    - name: HeatmapBuilder
      max_ongoing_requests: 16
      max_queued_requests: 32
      autoscaling_config:
        min_replicas: 0
        max_replicas: 4
        target_ongoing_requests: 8
      ray_actor_options:
        num_cpus: 8
        memory: 12884901888
      user_config:
        num_threads: 8
        max_concurrent_tasks: 16
31 changes: 31 additions & 0 deletions helm/rayservice/applications/prostate-classifier-1.yaml
@@ -0,0 +1,31 @@
- name: prostate-classifier-1
  import_path: models.binary_classifier:app
  route_prefix: /prostate-classifier-1
  runtime_env:
    working_dir: https://github.com/RationAI/model-service/archive/refs/heads/main.zip
  deployments:
    - name: BinaryClassifier
      max_ongoing_requests: 512
      max_queued_requests: 1024
      autoscaling_config:
        min_replicas: 0
        max_replicas: 4
        target_ongoing_requests: 128
      ray_actor_options:
        num_cpus: 4
        num_gpus: 1
        memory: 12884901888
        runtime_env:
          env_vars:
            MLFLOW_TRACKING_URI: http://mlflow.rationai-mlflow:5000
      user_config:
        tile_size: 512
        max_batch_size: 256
        batch_wait_timeout_s: 0.05
        intra_op_num_threads: 4
        trt_max_workspace_size: 8589934592
        trt_cache_path: /mnt/cache/trt_cache/binary-classifier-1/b256
        trt_builder_optimization_level: 3
        model:
          _target_: providers.model_provider:mlflow
          artifact_uri: mlflow-artifacts:/65/aebc892f526047249b972f200bef4381/artifacts/checkpoints/epoch=0-step=6972/prostate_model_norm.onnx
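Reviewer note: to reason about how `autoscaling_config` interacts with load, a simplified steady-state estimate can help. The sketch below assumes the replica count is roughly the total number of ongoing requests divided by `target_ongoing_requests`, clamped to `[min_replicas, max_replicas]`. It deliberately ignores the delays and smoothing that Ray Serve's real autoscaler applies, and the function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(ongoing: int, target: int, min_replicas: int, max_replicas: int) -> int:
    """Simplified steady-state estimate, not Ray Serve's actual policy."""
    raw = math.ceil(ongoing / target)
    return max(min_replicas, min(max_replicas, raw))


# Settings copied from prostate-classifier-1.yaml.
cfg = {"target": 128, "min_replicas": 0, "max_replicas": 4}

print(desired_replicas(0, **cfg))     # 0 (scale to zero when idle)
print(desired_replicas(300, **cfg))   # 3
print(desired_replicas(1000, **cfg))  # 4 (capped by max_replicas)
```

Under this approximation, the four-replica cap is reached once more than 384 requests are in flight across the deployment.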
28 changes: 28 additions & 0 deletions helm/rayservice/applications/virchow2.yaml
@@ -0,0 +1,28 @@
- name: virchow2
  import_path: models.virchow2:app
  route_prefix: /virchow2
  runtime_env:
    config:
      setup_timeout_seconds: 1800
    working_dir: https://github.com/RationAI/model-service/archive/refs/heads/main.zip
  deployments:
    - name: Virchow2
      max_ongoing_requests: 1024
      max_queued_requests: 2048
      autoscaling_config:
        min_replicas: 0
        max_replicas: 4
        target_ongoing_requests: 256
      ray_actor_options:
        num_cpus: 4
        num_gpus: 1
        memory: 8589934592
        runtime_env:
          env_vars:
            HF_HOME: /mnt/huggingface_cache
      user_config:
        tile_size: 224
        max_batch_size: 512
        batch_wait_timeout_s: 0.1
        model:
          repo_id: paige-ai/Virchow2
65 changes: 65 additions & 0 deletions helm/rayservice/templates/rayservice.yaml
@@ -0,0 +1,65 @@
apiVersion: ray.io/v1
kind: RayService
metadata:
  name: {{ .Release.Name }}
  namespace: {{ .Release.Namespace | default "rationai-jobs-ns" }}
spec:
  rayClusterConfig:
    rayVersion: 2.53.0
    enableInTreeAutoscaling: true
    autoscalerOptions:
      idleTimeoutSeconds: 60
      securityContext:
        runAsUser: 1000
        allowPrivilegeEscalation: false
        capabilities:
          drop: ["ALL"]
    headGroupSpec:
      rayStartParams:
        num-cpus: "0"
        dashboard-host: "0.0.0.0"
      template:
        spec:
          securityContext:
            fsGroupChangePolicy: OnRootMismatch
            runAsNonRoot: true
            seccompProfile:
              type: RuntimeDefault
          containers:
            - name: ray-head
              image: rayproject/ray:2.53.0-py312
              imagePullPolicy: Always
              resources:
                limits:
                  cpu: 0
                  memory: 4Gi
                requests:
                  cpu: 0
                  memory: 4Gi
              env:
                - name: HTTPS_PROXY
                  value: http://proxy.ics.muni.cz:3128
              ports:
                - containerPort: 6379
                  name: gcs-server
                - containerPort: 8265
                  name: dashboard
                - containerPort: 10001
                  name: client
                - containerPort: 8000
                  name: serve
              securityContext:
                runAsUser: 1000
                allowPrivilegeEscalation: false
                capabilities:
                  drop: ["ALL"]
    workerGroupSpecs:
      {{- range $workerName := .Values.workers }}
      {{- $workerContent := printf "workers/%s.yaml" $workerName | $.Files.Get | fromYaml }}
      - {{ toYaml $workerContent | nindent 8 | trim }}
      {{- end }}
  serveConfigV2: |
    applications:
    {{- range $appName := .Values.applications }}
{{ printf "applications/%s.yaml" $appName | $.Files.Get | indent 4 }}
    {{- end }}
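Reviewer note: the `serveConfigV2` block is assembled by indenting each application fragment by four spaces. The Python sketch below emulates that step to show the expected shape of the rendered block scalar. It assumes the `indent` template function pads every line of its input, that the `{{- end }}` action trims the trailing newline of each fragment, and that the template line itself contributes no extra leading whitespace. The function names and the two sample fragments are hypothetical.

```python
def indent(spaces: int, text: str) -> str:
    """Pad every line of text with the given number of spaces."""
    pad = " " * spaces
    return pad + text.replace("\n", "\n" + pad)


def render_serve_config(fragments: list[str]) -> str:
    """Emulate the body of the serveConfigV2 block scalar."""
    lines = [indent(4, "applications:")]
    for fragment in fragments:
        # rstrip stands in for the whitespace trimming done by `{{- end }}`.
        lines.append(indent(4, fragment.rstrip("\n")))
    return "\n".join(lines)


# Hypothetical, shortened fragments in the style of applications/*.yaml.
sample = [
    "- name: a\n  route_prefix: /a\n",
    "- name: b\n  route_prefix: /b\n",
]
print(render_serve_config(sample))
```

Each fragment must start with `- name:` at column 0 for the list items to line up under `applications:` after indentation.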
9 changes: 9 additions & 0 deletions helm/rayservice/values.yaml
@@ -0,0 +1,9 @@
workers:
  - cpu-workers
  - mig20-workers

applications:
  - episeg-1
  - heatmap-builder
  - prostate-classifier-1
  - virchow2
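Reviewer note: every name in `values.yaml` has to match a file under `workers/` or `applications/`. As far as I know, `.Files.Get` returns an empty string for a missing file rather than failing, so a typo here could render an incomplete manifest without an error. A pre-deploy check along the following lines might catch that. The helper `missing_fragments` and the chart path are assumptions for illustration, not part of this PR.

```python
from pathlib import Path


def missing_fragments(chart_dir: Path, kind: str, names: list[str]) -> list[str]:
    """Return the names that have no <chart_dir>/<kind>/<name>.yaml file."""
    return [n for n in names if not (chart_dir / kind / f"{n}.yaml").is_file()]


if __name__ == "__main__":
    chart = Path("helm/rayservice")  # assumed chart location
    workers = ["cpu-workers", "mig20-workers"]
    applications = ["episeg-1", "heatmap-builder", "prostate-classifier-1", "virchow2"]
    print("missing workers:", missing_fragments(chart, "workers", workers))
    print("missing applications:", missing_fragments(chart, "applications", applications))
```

The name lists are hard-coded here to keep the sketch free of third-party dependencies. A real check would read them from `values.yaml`.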
66 changes: 66 additions & 0 deletions helm/rayservice/workers/cpu-workers.yaml
@@ -0,0 +1,66 @@
groupName: cpu-workers
replicas: 0
minReplicas: 0
maxReplicas: 2
template:
  spec:
    securityContext:
      fsGroupChangePolicy: OnRootMismatch
      runAsNonRoot: true
      seccompProfile:
        type: RuntimeDefault
    containers:
      - name: ray-worker
        image: cerit.io/rationai/model-service:2.53.0
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 8
            memory: 16Gi
          requests:
            cpu: 8
            memory: 16Gi
        env:
          - name: HTTPS_PROXY
            value: http://proxy.ics.muni.cz:3128
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
          runAsUser: 1000
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "ray stop"]
        volumeMounts:
          - name: data
            mountPath: /mnt/data
          - name: public-data
            mountPath: /mnt/data/Public
          - name: projects
            mountPath: /mnt/projects
          - name: bioptic-tree
            mountPath: /mnt/bioptic_tree
          - name: trt-cache-volume
            mountPath: /mnt/cache
          - name: huggingface-cache
            mountPath: /mnt/huggingface_cache
    volumes:
      - name: data
        persistentVolumeClaim:
          claimName: data-ro
      - name: public-data
        persistentVolumeClaim:
          claimName: rationai-data-ro-pvc-jobs
      - name: projects
        persistentVolumeClaim:
          claimName: projects-rw
      - name: bioptic-tree
        persistentVolumeClaim:
          claimName: bioptictree-ro
      - name: trt-cache-volume
        persistentVolumeClaim:
          claimName: tensorrt-cache-pvc
      - name: huggingface-cache
        persistentVolumeClaim:
          claimName: huggingface-cache-pvc
76 changes: 76 additions & 0 deletions helm/rayservice/workers/mig20-workers.yaml
@@ -0,0 +1,76 @@
groupName: mig20-gpu-workers
replicas: 0
minReplicas: 0
maxReplicas: 4
rayStartParams:
  num-gpus: "1"
template:
  spec:
    securityContext:
      fsGroup: 1000
      fsGroupChangePolicy: OnRootMismatch
      runAsNonRoot: true
      runAsUser: 1000
      seccompProfile:
        type: RuntimeDefault
    containers:
      - name: ray-worker
        image: cerit.io/rationai/model-service:2.54.0-gpu
        imagePullPolicy: Always
        resources:
          limits:
            cpu: 8
            memory: 24Gi
            nvidia.com/mig-2g.20gb: 1
          requests:
            cpu: 8
            memory: 24Gi
        env:
          - name: HTTPS_PROXY
            value: http://proxy.ics.muni.cz:3128
          - name: HF_TOKEN
            valueFrom:
              secretKeyRef:
                name: huggingface-secret
                key: token
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop: ["ALL"]
          runAsUser: 1000
        lifecycle:
          preStop:
            exec:
              command: ["/bin/sh", "-c", "ray stop"]
        volumeMounts:
          - name: data
            mountPath: /mnt/data
          - name: public-data
            mountPath: /mnt/data/Public
          - name: projects
            mountPath: /mnt/projects
          - name: bioptic-tree
            mountPath: /mnt/bioptic_tree
          - name: trt-cache-volume
            mountPath: /mnt/cache
          - name: huggingface-cache
            mountPath: /mnt/huggingface_cache
    volumes:
      - name: data
        persistentVolumeClaim:
          claimName: data-ro
      - name: public-data
        persistentVolumeClaim:
          claimName: rationai-data-ro-pvc-jobs
      - name: projects
        persistentVolumeClaim:
          claimName: projects-rw
      - name: bioptic-tree
        persistentVolumeClaim:
          claimName: bioptictree-ro
      - name: trt-cache-volume
        persistentVolumeClaim:
          claimName: tensorrt-cache-pvc
      - name: huggingface-cache
        persistentVolumeClaim:
          claimName: huggingface-cache-pvc