Spilt ray serve #4
Open
Jurgee wants to merge 114 commits into main from feature/spilt-ray-serve
Commits
a167a22 feat: tensorrt support (matejpekar)
1d3310f fix: remove flush (matejpekar)
4e27a48 feat: add docker files for cpu/gpu
fd3154d feat: add PVC for TensorRT
eaac807 feat: add support of TensorRT for models
46fe8b1 feat: add TensorRT cache to workers
f07723e add Jiri as coauthor
9d6e265 fix: remove gpu number from serve.deployment in code
e7612f9 fix: warning suppress
5945f10 feat: add jobs to download virchow2
8ef4cc5 feat: add model provider for hf
a6c427e feat: add pvc for huggingface
27c7801 feat: add virchow2 model
e5d84cb fix
e1fcb6c fix: fine tune
e7ac073 feat: add into dockerfile
51f07a4 fix: remove installs from model
178f226 fix: based on official docs
5cc123f fix
964114e fix: remove comment
32434f6 feat: add more smaller yamls
3f74fb2 feat: add deploy script
df76f0f feat: add deploy script
181f79e chore: update docker gpu file (Jurgee)
156a7d4 feat: optimalize virchow2 deployment (Jurgee)
57176d6 fix: remove hf token, create new secret (Jurgee)
fe51ee2 fix (Jurgee)
9f37d03 Merge branch 'main' into feature/virchow2-model (Jurgee)
210c7e6 fix: remove intra threads (Jurgee)
bf7cff1 fix: lint (Jurgee)
6813264 fix: remove duplicity (Jurgee)
7510c9f fixes (Jurgee)
2eae503 docker files (Jurgee)
c5095bd fix: docker (Jurgee)
e94baec chore: new docker image (Jurgee)
7cdd290 chore: cpu docker (Jurgee)
8cce2cb fix (Jurgee)
7e329a8 final changes (Jurgee)
8dfea82 fix: usage of master branch (Jurgee)
e6f8603 Potential fix for pull request finding (Jurgee)
fb646c4 Potential fix for pull request finding (Jurgee)
b2d083c Potential fix for pull request finding (Jurgee)
bfd90a9 Potential fix for pull request finding (Jurgee)
d64f50b chore: prostate opt (Jurgee)
75ed923 fix: comments (Jurgee)
bfcbe5b fix (Jurgee)
1247e72 fix: comment remove (Jurgee)
d1a3d97 chore: new docker image (Jurgee)
3d26432 fix: simple pip command (Jurgee)
5ccf860 feat: newly optimalized model (Jurgee)
c62f2ef fix: remove hf token from gpu workers (Jurgee)
cea82e7 refactor: replace read_region_relative with read_tile in fetch_tissue… (Jurgee)
5816940 fix: Type error (Jurgee)
dbd60aa refactor: new prostate optimalization (Jurgee)
327910e fix: name (Jurgee)
3da8983 fix: previous version (Jurgee)
c4f21a3 fixes (Jurgee)
503d485 fix: mypy (Jurgee)
7bbd50c fix: config mutation (Jurgee)
ae96ca1 fix: remove index url (Jurgee)
5cd8344 refactor: clean up Virchow2 deployment and simplify model loading (Jurgee)
92ac910 fixes (Jurgee)
fffcd4f fix: add workspace size (Jurgee)
cc966e3 Update models/virchow2.py (Jurgee)
52a4361 fix: remove sufix (Jurgee)
9c1f23f fix: review comments (Jurgee)
9075231 Update models/virchow2.py (Jurgee)
9a28ac9 fix: remove provider and logger (Jurgee)
c12c7cd fix: main tokens in root (Jurgee)
27abbee fix: remove hf model provider (Jurgee)
256df86 fix (Jurgee)
1beaa79 fixes (Jurgee)
7b5d423 fix: better imports (Jurgee)
0c6d2fe fix: add intra num threads back to models (Jurgee)
44be907 fix: reviewer comments (Jurgee)
c66e911 fix: correct dim (Jurgee)
f1b934b add testing logger (Jurgee)
f461e49 remove logger (Jurgee)
276824b new metrics (Jurgee)
ca2f878 fix dim (Jurgee)
3d8749a Update models/virchow2.py (Jurgee)
52c5373 fix: metrics (Jurgee)
819f1be add gpu replicas (Jurgee)
17b9bbc Merge branch 'feature/virchow2-model' into feature/prostate-opt (Jurgee)
5141517 replicas (Jurgee)
47e1e80 remove duplicity (Jurgee)
06f4b5d Merge branch 'main' into feature/prostate-opt (Jurgee)
85b07ee optimalizations (Jurgee)
f283aab comments (Jurgee)
9c1385e lower memory (Jurgee)
dada6f9 fix: optional trt sett (Jurgee)
5c55363 Merge branch 'feature/prostate-opt' into feature/spilt-ray-serve (Jurgee)
166f98b feat: changeto optimalized models (Jurgee)
74ead7f feat: only one place for name (Jurgee)
9433940 Update deploy.sh (Jurgee)
d5952e3 mypy fix (Jurgee)
5b7974e Merge branch 'feature/spilt-ray-serve' of https://github.com/RationAI… (Jurgee)
1f9cbbd change name (Jurgee)
354c2ee remove (Jurgee)
47d6885 Merge branch 'main' into feature/spilt-ray-serve (Jurgee)
62d8303 fix name (Jurgee)
791dddf Update deploy.sh (Jurgee)
6452ee0 Update deploy.sh (Jurgee)
2aeba33 Update pyproject.toml (Jurgee)
c4eb0cb uv lock (Jurgee)
4c3ff7b review fixes (Jurgee)
3413047 fix script (Jurgee)
38e3b3f switch from klustomize to helm templates (Jurgee)
c0f03c0 better structure (Jurgee)
10c9ab5 uv lock (Jurgee)
8c11a54 fix name (Jurgee)
0f982e6 review fixes (Jurgee)
78c923f rename gpu workers (Jurgee)
43f652b drop enabled key (Jurgee)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,5 @@ | ||
| apiVersion: v2 | ||
| name: rayservice | ||
| description: Model service Helm chart | ||
| version: 0.1.0 | ||
| type: application | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,32 @@ | ||
| - name: episeg-1 | ||
| import_path: models.semantic_segmentation:app | ||
| route_prefix: /episeg-1 | ||
| runtime_env: | ||
| working_dir: https://github.com/RationAI/model-service/archive/refs/heads/main.zip | ||
| deployments: | ||
| - name: SemanticSegmentation | ||
| max_ongoing_requests: 16 | ||
| max_queued_requests: 32 | ||
| autoscaling_config: | ||
| min_replicas: 0 | ||
| max_replicas: 4 | ||
| target_ongoing_requests: 4 | ||
| ray_actor_options: | ||
| num_cpus: 4 | ||
| memory: 12884901888 | ||
| num_gpus: 1 | ||
| runtime_env: | ||
| env_vars: | ||
| MLFLOW_TRACKING_URI: http://mlflow.rationai-mlflow:5000 | ||
| user_config: | ||
| tile_size: 1024 | ||
| mpp: 0.468 | ||
| max_batch_size: 8 | ||
| batch_wait_timeout_s: 0.1 | ||
| intra_op_num_threads: 4 | ||
| trt_max_workspace_size: 6442450944 | ||
| trt_cache_path: /mnt/cache/trt_cache/episeg-1/b8 | ||
| trt_builder_optimization_level: 3 | ||
| model: | ||
| _target_: providers.model_provider:mlflow | ||
| artifact_uri: mlflow-artifacts:/10/39f821ed5b964c71a603cc6db196f9fd/artifacts/checkpoints/epoch=19-step=32020/model.onnx/model.onnx | ||
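
The `memory` and `trt_max_workspace_size` fields in this application file are raw byte counts, which are hard to read at a glance. As a minimal sketch, the hypothetical helper `to_gib` below (not part of this PR) converts the two values from the `episeg-1` config to GiB:

```python
GIB = 1024 ** 3  # bytes per gibibyte


def to_gib(num_bytes: int) -> float:
    """Convert a raw byte count to GiB (hypothetical helper, for illustration only)."""
    return num_bytes / GIB


# Values copied from the episeg-1 application config in this diff.
sizes = {
    "ray_actor_options.memory": 12884901888,
    "user_config.trt_max_workspace_size": 6442450944,
}

for field, value in sizes.items():
    print(f"{field}: {to_gib(value):g} GiB")
```

Both values are exact multiples of 1024³, so the replica memory request works out to 12 GiB and the TensorRT workspace limit to 6 GiB.
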
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,19 @@ | ||
| - name: heatmap-builder | ||
| import_path: builders.heatmap_builder:app | ||
| route_prefix: /heatmap-builder | ||
| runtime_env: | ||
| working_dir: https://github.com/RationAI/model-service/archive/refs/heads/main.zip | ||
| deployments: | ||
| - name: HeatmapBuilder | ||
| max_ongoing_requests: 16 | ||
| max_queued_requests: 32 | ||
| autoscaling_config: | ||
| min_replicas: 0 | ||
| max_replicas: 4 | ||
| target_ongoing_requests: 8 | ||
| ray_actor_options: | ||
| num_cpus: 8 | ||
| memory: 12884901888 | ||
| user_config: | ||
| num_threads: 8 | ||
| max_concurrent_tasks: 16 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,31 @@ | ||
| - name: prostate-classifier-1 | ||
| import_path: models.binary_classifier:app | ||
| route_prefix: /prostate-classifier-1 | ||
| runtime_env: | ||
| working_dir: https://github.com/RationAI/model-service/archive/refs/heads/main.zip | ||
| deployments: | ||
| - name: BinaryClassifier | ||
| max_ongoing_requests: 512 | ||
| max_queued_requests: 1024 | ||
| autoscaling_config: | ||
| min_replicas: 0 | ||
| max_replicas: 4 | ||
| target_ongoing_requests: 128 | ||
| ray_actor_options: | ||
| num_cpus: 4 | ||
| num_gpus: 1 | ||
| memory: 12884901888 | ||
| runtime_env: | ||
| env_vars: | ||
| MLFLOW_TRACKING_URI: http://mlflow.rationai-mlflow:5000 | ||
| user_config: | ||
| tile_size: 512 | ||
| max_batch_size: 256 | ||
| batch_wait_timeout_s: 0.05 | ||
| intra_op_num_threads: 4 | ||
| trt_max_workspace_size: 8589934592 | ||
| trt_cache_path: /mnt/cache/trt_cache/binary-classifier-1/b256 | ||
| trt_builder_optimization_level: 3 | ||
| model: | ||
| _target_: providers.model_provider:mlflow | ||
| artifact_uri: mlflow-artifacts:/65/aebc892f526047249b972f200bef4381/artifacts/checkpoints/epoch=0-step=6972/prostate_model_norm.onnx | ||
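
To get a feel for how `target_ongoing_requests`, `min_replicas`, and `max_replicas` interact, here is a rough sketch of the core autoscaling idea: aim for about `target_ongoing_requests` in-flight requests per replica, clamped to the configured bounds. This is a simplified model written for illustration, not Ray Serve's actual implementation, which also applies delays and smoothing that are ignored here. The function name `desired_replicas` is hypothetical.

```python
import math


def desired_replicas(total_ongoing: int, target_ongoing_requests: int,
                     min_replicas: int, max_replicas: int) -> int:
    """Simplified estimate: one replica per `target_ongoing_requests` in flight."""
    raw = math.ceil(total_ongoing / target_ongoing_requests)
    # Clamp the estimate to the configured replica bounds.
    return max(min_replicas, min(max_replicas, raw))


# Settings copied from the prostate-classifier-1 config in this diff.
for load in (0, 100, 300, 2000):
    n = desired_replicas(load, target_ongoing_requests=128,
                         min_replicas=0, max_replicas=4)
    print(f"{load} ongoing requests -> {n} replica(s)")
```

Under this model an idle application scales to zero, and any sustained load above 512 in-flight requests saturates at the `max_replicas` cap of 4.
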
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,28 @@ | ||
| - name: virchow2 | ||
| import_path: models.virchow2:app | ||
| route_prefix: /virchow2 | ||
| runtime_env: | ||
| config: | ||
| setup_timeout_seconds: 1800 | ||
| working_dir: https://github.com/RationAI/model-service/archive/refs/heads/main.zip | ||
| deployments: | ||
| - name: Virchow2 | ||
| max_ongoing_requests: 1024 | ||
| max_queued_requests: 2048 | ||
| autoscaling_config: | ||
| min_replicas: 0 | ||
| max_replicas: 4 | ||
| target_ongoing_requests: 256 | ||
| ray_actor_options: | ||
| num_cpus: 4 | ||
| num_gpus: 1 | ||
| memory: 8589934592 | ||
| runtime_env: | ||
| env_vars: | ||
| HF_HOME: /mnt/huggingface_cache | ||
| user_config: | ||
| tile_size: 224 | ||
| max_batch_size: 512 | ||
| batch_wait_timeout_s: 0.1 | ||
| model: | ||
| repo_id: paige-ai/Virchow2 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,65 @@ | ||
| apiVersion: ray.io/v1 | ||
| kind: RayService | ||
| metadata: | ||
| name: {{ .Release.Name }} | ||
| namespace: {{ .Release.Namespace | default "rationai-jobs-ns" }} | ||
| spec: | ||
| rayClusterConfig: | ||
| rayVersion: 2.53.0 | ||
| enableInTreeAutoscaling: true | ||
| autoscalerOptions: | ||
| idleTimeoutSeconds: 60 | ||
| securityContext: | ||
| runAsUser: 1000 | ||
| allowPrivilegeEscalation: false | ||
| capabilities: | ||
| drop: ["ALL"] | ||
| headGroupSpec: | ||
| rayStartParams: | ||
| num-cpus: "0" | ||
| dashboard-host: "0.0.0.0" | ||
| template: | ||
| spec: | ||
| securityContext: | ||
| fsGroupChangePolicy: OnRootMismatch | ||
| runAsNonRoot: true | ||
| seccompProfile: | ||
| type: RuntimeDefault | ||
| containers: | ||
| - name: ray-head | ||
| image: rayproject/ray:2.53.0-py312 | ||
| imagePullPolicy: Always | ||
| resources: | ||
| limits: | ||
| cpu: 0 | ||
| memory: 4Gi | ||
| requests: | ||
| cpu: 0 | ||
| memory: 4Gi | ||
| env: | ||
| - name: HTTPS_PROXY | ||
| value: http://proxy.ics.muni.cz:3128 | ||
| ports: | ||
| - containerPort: 6379 | ||
| name: gcs-server | ||
| - containerPort: 8265 | ||
| name: dashboard | ||
| - containerPort: 10001 | ||
| name: client | ||
| - containerPort: 8000 | ||
| name: serve | ||
| securityContext: | ||
| runAsUser: 1000 | ||
| allowPrivilegeEscalation: false | ||
| capabilities: | ||
| drop: ["ALL"] | ||
| workerGroupSpecs: | ||
| {{- range $workerName := .Values.workers }} | ||
| {{- $workerContent := printf "workers/%s.yaml" $workerName | $.Files.Get | fromYaml }} | ||
| - {{ toYaml $workerContent | nindent 8 | trim }} | ||
| {{- end }} | ||
| serveConfigV2: | | ||
| applications: | ||
| {{- range $appName := .Values.applications }} | ||
| {{ printf "applications/%s.yaml" $appName | $.Files.Get | indent 4 }} | ||
| {{- end }} | ||
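
The `serveConfigV2` block in this template is assembled by looping over `.Values.applications`, reading `applications/<name>.yaml` from the chart, and indenting each file by four spaces. The Python sketch below mimics that assembly so the resulting text is easier to picture. It is an illustration only: the `files` dict stands in for the chart's files, `render_serve_config` is a hypothetical name, and the four-space offset of `applications:` is an assumption, since the diff view does not preserve the template's indentation.

```python
def sprig_indent(spaces: int, text: str) -> str:
    """Mimic the template `indent` function: pad every line, including the first."""
    pad = " " * spaces
    return pad + text.replace("\n", "\n" + pad)


def render_serve_config(applications: list[str], files: dict[str, str]) -> str:
    """Build the body of the serveConfigV2 block scalar from per-application files."""
    parts = [sprig_indent(4, "applications:")]  # assumed offset, see lead-in
    for app_name in applications:
        content = files[f"applications/{app_name}.yaml"]
        parts.append(sprig_indent(4, content))
    return "\n".join(parts)


# Stand-in for the chart's file set; real files carry the full application config.
files = {
    "applications/heatmap-builder.yaml": "- name: heatmap-builder\n  route_prefix: /heatmap-builder",
    "applications/virchow2.yaml": "- name: virchow2\n  route_prefix: /virchow2",
}

print(render_serve_config(["heatmap-builder", "virchow2"], files))
```

Because each application file starts with a `- name:` list item, concatenating the indented files under `applications:` yields one YAML list with an entry per application.
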
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,9 @@ | ||
| workers: | ||
| - cpu-workers | ||
| - mig20-workers | ||
| | ||
| applications: | ||
| - episeg-1 | ||
| - heatmap-builder | ||
| - prostate-classifier-1 | ||
| - virchow2 | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,66 @@ | ||
| groupName: cpu-workers | ||
| replicas: 0 | ||
| minReplicas: 0 | ||
| maxReplicas: 2 | ||
| template: | ||
| spec: | ||
| securityContext: | ||
| fsGroupChangePolicy: OnRootMismatch | ||
| runAsNonRoot: true | ||
| seccompProfile: | ||
| type: RuntimeDefault | ||
| containers: | ||
| - name: ray-worker | ||
| image: cerit.io/rationai/model-service:2.53.0 | ||
| imagePullPolicy: Always | ||
| resources: | ||
| limits: | ||
| cpu: 8 | ||
| memory: 16Gi | ||
| requests: | ||
| cpu: 8 | ||
| memory: 16Gi | ||
| env: | ||
| - name: HTTPS_PROXY | ||
| value: http://proxy.ics.muni.cz:3128 | ||
| securityContext: | ||
| allowPrivilegeEscalation: false | ||
| capabilities: | ||
| drop: ["ALL"] | ||
| runAsUser: 1000 | ||
| lifecycle: | ||
| preStop: | ||
| exec: | ||
| command: ["/bin/sh", "-c", "ray stop"] | ||
| volumeMounts: | ||
| - name: data | ||
| mountPath: /mnt/data | ||
| - name: public-data | ||
| mountPath: /mnt/data/Public | ||
| - name: projects | ||
| mountPath: /mnt/projects | ||
| - name: bioptic-tree | ||
| mountPath: /mnt/bioptic_tree | ||
| - name: trt-cache-volume | ||
| mountPath: /mnt/cache | ||
| - name: huggingface-cache | ||
| mountPath: /mnt/huggingface_cache | ||
| volumes: | ||
| - name: data | ||
| persistentVolumeClaim: | ||
| claimName: data-ro | ||
| - name: public-data | ||
| persistentVolumeClaim: | ||
| claimName: rationai-data-ro-pvc-jobs | ||
| - name: projects | ||
| persistentVolumeClaim: | ||
| claimName: projects-rw | ||
| - name: bioptic-tree | ||
| persistentVolumeClaim: | ||
| claimName: bioptictree-ro | ||
| - name: trt-cache-volume | ||
| persistentVolumeClaim: | ||
| claimName: tensorrt-cache-pvc | ||
| - name: huggingface-cache | ||
| persistentVolumeClaim: | ||
| claimName: huggingface-cache-pvc | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,76 @@ | ||
| groupName: mig20-gpu-workers | ||
| replicas: 0 | ||
| minReplicas: 0 | ||
| maxReplicas: 4 | ||
| rayStartParams: | ||
| num-gpus: "1" | ||
| template: | ||
| spec: | ||
| securityContext: | ||
| fsGroup: 1000 | ||
| fsGroupChangePolicy: OnRootMismatch | ||
| runAsNonRoot: true | ||
| runAsUser: 1000 | ||
| seccompProfile: | ||
| type: RuntimeDefault | ||
| containers: | ||
| - name: ray-worker | ||
| image: cerit.io/rationai/model-service:2.54.0-gpu | ||
| imagePullPolicy: Always | ||
| resources: | ||
| limits: | ||
| cpu: 8 | ||
| memory: 24Gi | ||
| nvidia.com/mig-2g.20gb: 1 | ||
| requests: | ||
| cpu: 8 | ||
| memory: 24Gi | ||
| env: | ||
| - name: HTTPS_PROXY | ||
| value: http://proxy.ics.muni.cz:3128 | ||
| - name: HF_TOKEN | ||
| valueFrom: | ||
| secretKeyRef: | ||
| name: huggingface-secret | ||
| key: token | ||
| securityContext: | ||
| allowPrivilegeEscalation: false | ||
| capabilities: | ||
| drop: ["ALL"] | ||
| runAsUser: 1000 | ||
| lifecycle: | ||
| preStop: | ||
| exec: | ||
| command: ["/bin/sh", "-c", "ray stop"] | ||
| volumeMounts: | ||
| - name: data | ||
| mountPath: /mnt/data | ||
| - name: public-data | ||
| mountPath: /mnt/data/Public | ||
| - name: projects | ||
| mountPath: /mnt/projects | ||
| - name: bioptic-tree | ||
| mountPath: /mnt/bioptic_tree | ||
| - name: trt-cache-volume | ||
| mountPath: /mnt/cache | ||
| - name: huggingface-cache | ||
| mountPath: /mnt/huggingface_cache | ||
| volumes: | ||
| - name: data | ||
| persistentVolumeClaim: | ||
| claimName: data-ro | ||
| - name: public-data | ||
| persistentVolumeClaim: | ||
| claimName: rationai-data-ro-pvc-jobs | ||
| - name: projects | ||
| persistentVolumeClaim: | ||
| claimName: projects-rw | ||
| - name: bioptic-tree | ||
| persistentVolumeClaim: | ||
| claimName: bioptictree-ro | ||
| - name: trt-cache-volume | ||
| persistentVolumeClaim: | ||
| claimName: tensorrt-cache-pvc | ||
| - name: huggingface-cache | ||
| persistentVolumeClaim: | ||
| claimName: huggingface-cache-pvc | ||
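
When pairing worker groups with the `ray_actor_options` of each deployment, it helps to estimate how many replicas fit on one worker pod. The sketch below is a back-of-the-envelope upper bound, not how Ray schedules actors: it assumes the pod's full CPU, memory, and GPU limits are available as Ray resources, whereas Ray typically advertises less memory than the container limit. The function name `replicas_per_worker` and the dict layout are hypothetical.

```python
GIB = 1024 ** 3  # bytes per gibibyte


def replicas_per_worker(worker: dict, replica: dict) -> int:
    """Upper bound on replicas per pod: the scarcest resource decides."""
    fits = [
        int(worker.get(resource, 0) // needed)
        for resource, needed in replica.items()
        if needed > 0
    ]
    return min(fits) if fits else 0


# GPU worker limits from this diff: 8 CPUs, 24Gi memory, one MIG slice (num-gpus: "1").
gpu_worker = {"cpu": 8, "memory": 24 * GIB, "gpu": 1}

# ray_actor_options copied from the application configs in this PR.
episeg = {"cpu": 4, "memory": 12884901888, "gpu": 1}
virchow2 = {"cpu": 4, "memory": 8589934592, "gpu": 1}

print("episeg-1 replicas per GPU worker:", replicas_per_worker(gpu_worker, episeg))
print("virchow2 replicas per GPU worker:", replicas_per_worker(gpu_worker, virchow2))
```

In both cases the single GPU is the limiting resource, so each GPU worker pod can host at most one replica of these deployments under the stated assumptions.
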