
Model service documentation#3

Open
Jurgee wants to merge 26 commits into main from docs/model-service

@Jurgee
Collaborator

@Jurgee Jurgee commented Mar 13, 2026

This PR adds and polishes MkDocs documentation for the model-service. Changes:

  • Add/iterate MkDocs documentation (docs/): quick start, architecture overview, deployment guide, configuration reference, adding models, troubleshooting.
  • Align README with the current Ray Serve + KubeRay RayService setup.
  • Add docs dependency group in pyproject.toml ([dependency-groups].docs: mkdocs, mkdocs-material, pymdown-extensions).
  • Add GitLab CI job for docs build: in .gitlab-ci.yml, the docs:build job runs mkdocs build --strict.
  • Update mkdocs.yml as needed for the docs site.

Summary by CodeRabbit

  • Documentation

    • Added a comprehensive documentation site covering architecture, request lifecycle, batching, queues/backpressure, quick start, deployment, model integration, configuration reference, and troubleshooting.
    • Rewrote the README with Kubernetes quick-start, deploy/access steps, prerequisites, and a reference model test example using compressed binary payloads.
    • Added guidance for autoscaling, resource sizing, and operational troubleshooting.
  • Chores

    • Added documentation tooling configuration, dependencies, and an automated docs build workflow.

@coderabbitai

coderabbitai bot commented Mar 13, 2026

Walkthrough

Replaces the template README with a Ray Serve–focused README, adds MkDocs configuration and a docs site with ten new pages (architecture, guides, troubleshooting, quick-start), introduces a docs dependency group and CI workflow for building docs, and makes minor non-functional edits to two model files.

Changes

| Cohort | File(s) | Summary |
|---|---|---|
| Top-level README & repo manifest | README.md, ray-service.yaml | Replace generic template README with Ray Serve / Kubernetes–focused README referencing ray-service.yaml, quick-start, testing, support, and license details. |
| Docs site config & tooling | mkdocs.yml, pyproject.toml | Add MkDocs Material site configuration and a docs dependency group (mkdocs, mkdocs-material, pymdown-extensions). |
| Documentation build workflow | .github/workflows/build-docs.yml | Add GitHub Actions workflow to build MkDocs site on docs changes and upload site/ artifact. |
| Docs index & quick start | docs/index.md, docs/get-started/quick-start.md | Add documentation homepage and Kubernetes quick-start with prerequisites, kubectl apply/port-forward steps, and verification guidance. |
| Architecture docs | docs/architecture/overview.md, docs/architecture/request-lifecycle.md, docs/architecture/queues-and-backpressure.md, docs/architecture/batching.md | Add four architecture pages covering system overview, end-to-end request lifecycle, multi-level queuing/backpressure semantics, and replica-local batching behavior and tuning. |
| Guides & operational docs | docs/guides/adding-models.md, docs/guides/deployment-guide.md, docs/guides/configuration-reference.md, docs/guides/troubleshooting.md | Add guides for adding models, production deployment workflow, RayService configuration reference (YAML fields, autoscaling, resources), and troubleshooting/playbook. |
| Model minor edits | models/binary_classifier.py, models/semantic_segmentation.py | Small formatting/comment adjustments in trt_options entries (removed inline type-ignore comments / trailing comma edits); no runtime behavior changes. |

Sequence Diagram(s)

(omitted)

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 I hopped through docs to tidy each part,
Pages stacked neat — a rabbitly art.
From cluster start to batched request,
Models hum along, all set to test.
MkDocs lights the way — a hopping heart. 🥕📚

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 50.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title 'Model service documentation' directly reflects the main change in the PR: comprehensive additions of documentation for Model Service. |


@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the Model Service's discoverability and usability by introducing a comprehensive, structured documentation portal. The new documentation, built with MkDocs, covers essential aspects from quick deployment to detailed architectural insights and troubleshooting, making it easier for users to understand, deploy, and manage machine learning models. Accompanying these changes are updates to the main README, CI/CD pipelines for documentation validation, and proper dependency management for the documentation tools.

Highlights

  • Comprehensive Documentation: Introduced a new MkDocs-based documentation site covering quick start, architecture, deployment, configuration, adding models, and troubleshooting for the model service.
  • README Update: Rewrote the README.md to align with the current Ray Serve + KubeRay setup and provide a high-level overview and links to the new documentation.
  • CI/CD Integration: Added a GitLab CI job to build and validate the MkDocs documentation, ensuring its integrity.
  • Dependency Management: Introduced a dedicated docs dependency group in pyproject.toml for documentation-related tools.
  • MkDocs Configuration: Configured mkdocs.yml to define the documentation site structure, theme, and navigation.
Changelog
  • .gitlab-ci.yml
    • Included the MkDocs.gitlab-ci.yml template.
    • Added a deploy stage to the CI pipeline.
  • README.md
    • Rewrote the entire content to provide a modern, focused overview of the Model Service.
    • Added sections for documentation links, quick start instructions for Kubernetes, testing the reference model, and repository structure.
    • Removed generic GitLab-generated boilerplate.
  • docs/architecture/batching.md
    • Added a new document explaining Ray Serve's batching mechanism, including its API, internal workings, and tuning.
  • docs/architecture/overview.md
    • Added a new document providing a high-level architectural overview of the Model Service, covering system components, scaling, and fault tolerance.
  • docs/architecture/queues-and-backpressure.md
    • Added a new document detailing Ray Serve's queueing and backpressure mechanisms, explaining proxy-side and replica-side queues.
  • docs/architecture/request-lifecycle.md
    • Added a new document tracing the detailed request lifecycle through the Model Service stack, from client to model execution.
  • docs/get-started/quick-start.md
    • Added a new quick start guide for deploying a sample model on Kubernetes.
  • docs/guides/adding-models.md
    • Added a new guide on how to implement, configure, and deploy new machine learning models.
  • docs/guides/configuration-reference.md
    • Added a new reference document explaining key configuration parameters for RayService, applications, and deployments.
  • docs/guides/deployment-guide.md
    • Added a new comprehensive guide for deploying models to production, including resource planning and best practices.
  • docs/guides/troubleshooting.md
    • Added a new troubleshooting guide addressing common deployment and runtime issues.
  • docs/index.md
    • Added the main landing page for the Model Service documentation, outlining its purpose, features, and content structure.
  • mkdocs.yml
    • Added the MkDocs configuration file, defining the site name, description, repository, theme, navigation structure, and Markdown extensions.
  • pyproject.toml
    • Added a docs dependency group including mkdocs, mkdocs-material, and pymdown-extensions.

@Jurgee Jurgee marked this pull request as ready for review March 13, 2026 19:48
@Jurgee Jurgee requested review from a team, JakubPekar, Copilot and ejdam87 March 13, 2026 19:48

@gemini-code-assist gemini-code-assist bot left a comment


Code Review

The pull request introduces a comprehensive documentation site for the Model Service, covering its architecture, deployment guides, model integration, configuration, and troubleshooting, while also integrating the documentation build process into the CI pipeline. Review comments suggest improving clarity in the README.md regarding kubectl namespace placeholders, completing a placeholder URL in the configuration reference, addressing a potential LaTeX rendering issue in the configuration-reference.md by suggesting a pymdownx.arithmatex extension, resolving a missing JSON curl example in deployment-guide.md, and adding a final newline character to mkdocs.yml for consistency.


Copilot AI left a comment


Pull request overview

This PR adds an MkDocs documentation site for the model-service and updates repository metadata/CI to build the docs alongside the existing Ray Serve + KubeRay RayService setup.

Changes:

  • Adds a new MkDocs site (mkdocs.yml) and a full set of docs pages under docs/ (quick start, guides, architecture).
  • Updates README.md to match the current Ray Serve + KubeRay deployment model and reference model payload format.
  • Adds a docs dependency group to pyproject.toml and updates .gitlab-ci.yml to include the docs build template.

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 8 comments.

| File | Description |
|---|---|
| pyproject.toml | Adds a docs dependency group for MkDocs tooling. |
| mkdocs.yml | Introduces MkDocs Material configuration, nav, and Markdown extensions. |
| docs/index.md | Adds the documentation landing page content and navigation pointers. |
| docs/get-started/quick-start.md | Adds a Kubernetes quick start for deploying the RayService. |
| docs/guides/deployment-guide.md | Adds a detailed production deployment guide and ops considerations. |
| docs/guides/configuration-reference.md | Adds a reference for RayService / Serve knobs (autoscaling, backpressure, etc.). |
| docs/guides/adding-models.md | Adds guidance and examples for implementing and integrating models. |
| docs/guides/troubleshooting.md | Adds common failure modes and triage steps for RayService/Serve on K8s. |
| docs/architecture/overview.md | Adds a high-level architecture overview of the stack and scaling model. |
| docs/architecture/request-lifecycle.md | Documents end-to-end request flow and queueing points. |
| docs/architecture/queues-and-backpressure.md | Explains queueing controls (max_queued_requests, max_ongoing_requests). |
| docs/architecture/batching.md | Explains Ray Serve batching behavior and tuning considerations. |
| README.md | Replaces the GitLab template README with repo-specific usage and payload details. |
| .gitlab-ci.yml | Includes the MkDocs CI template and adds a deploy stage for docs build. |


@Jurgee Jurgee self-assigned this Mar 13, 2026
@matejpekar matejpekar removed the request for review from JakubPekar March 13, 2026 23:38
@matejpekar matejpekar requested review from Adames4 and matejpekar and removed request for ejdam87 March 13, 2026 23:38

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 5

🧹 Nitpick comments (3)
docs/architecture/request-lifecycle.md (1)

9-83: Consider specifying language for the ASCII diagram code block.

The ASCII diagram is in a fenced code block without a language specifier. While this works, explicitly marking it as text or plain can improve rendering consistency across different Markdown processors.

📝 Optional: Add language specifier

Change the opening fence on line 9 from a bare fence to one tagged with the `text` language specifier.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/architecture/request-lifecycle.md` around lines 9 - 83, The fenced ASCII
diagram block lacks a language tag; update the opening fence for the ASCII
diagram (the large block containing the "External Client", "HTTP Proxy Actor",
"Replica Actor", etc.) to include a language specifier like text or plain (e.g.,
change ``` to ```text) so Markdown renderers consistently treat it as
preformatted text; leave the closing fence unchanged.
pyproject.toml (1)

22-22: Consider using more restrictive version bounds to prevent unexpected breaking changes.

The documentation dependencies use minimum version constraints (>=) which allows any newer version. Verification confirms these packages have progressed significantly beyond the pinned minimums (mkdocs-material from 9.6.0 → 9.7.6, pymdown-extensions from 10.0 → 10.21), which could introduce breaking changes and affect build reproducibility.

Consider using more restrictive version bounds such as mkdocs>=1.6.0,<2.0 or mkdocs~=1.6.0 to allow patch updates while preventing major version changes.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@pyproject.toml` at line 22, Replace the open-ended >= version constraints in
the docs extras list with more restrictive bounds to avoid unexpected breaking
changes: update the docs = [...] entry to use either compatible release
operators or upper bounds (e.g., mkdocs~=1.6.0 or mkdocs>=1.6.0,<2.0 for mkdocs,
and similarly mkdocs-material~=9.6.0 or mkdocs-material>=9.6.0,<10.0, and
pymdown-extensions~=10.0 or pymdown-extensions>=10.0,<11.0) so the docs extras
(the docs = [...] line and the package names mkdocs, mkdocs-material,
pymdown-extensions) lock major versions while still permitting safe patch/minor
updates.
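
The two styles of bound suggested above are not interchangeable, which a small sketch can make concrete. The helpers below are hypothetical and only handle purely numeric releases; they mirror the PEP 440 rule that `~=X.Y.Z` means `>=X.Y.Z, ==X.Y.*`, and are not a substitute for a real version parser.

```python
def parse_release(version: str) -> tuple[int, ...]:
    """Parse a purely numeric release such as '9.7.6' into (9, 7, 6)."""
    return tuple(int(part) for part in version.split("."))


def _padded(a: tuple[int, ...], b: tuple[int, ...]):
    """Pad both tuples with zeros so '1.6' and '1.6.0' compare as equal."""
    width = max(len(a), len(b))
    return a + (0,) * (width - len(a)), b + (0,) * (width - len(b))


def satisfies_compatible_release(candidate: str, spec: str) -> bool:
    """Return True if `candidate` satisfies `~=spec` (numeric releases only)."""
    base = parse_release(spec)
    if len(base) < 2:
        raise ValueError("~= requires at least two release segments")
    cand, padded_base = _padded(parse_release(candidate), base)
    # Every segment of the spec except the last must match exactly.
    prefix = base[:-1]
    return cand >= padded_base and cand[: len(prefix)] == prefix


def satisfies_range(candidate: str, minimum: str, below_major: int) -> bool:
    """Return True if `candidate` satisfies `>=minimum,<below_major.0`."""
    cand, low = _padded(parse_release(candidate), parse_release(minimum))
    return cand >= low and cand[0] < below_major


# Version numbers taken from the review comment above.
print(satisfies_compatible_release("9.7.6", "9.6.0"))  # ~=9.6.0 pins to 9.6.*
print(satisfies_range("9.7.6", "9.6.0", 10))           # >=9.6.0,<10.0
print(satisfies_compatible_release("10.21", "10.0"))   # ~=10.0 allows 10.*
```

Under these rules, `mkdocs-material~=9.6.0` would exclude the 9.7.6 release mentioned above, while `>=9.6.0,<10.0` would accept it, so the choice depends on whether minor updates should be picked up automatically.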
docs/guides/adding-models.md (1)

121-124: Consider using async def reconfigure() for consistency.

The repository's reference implementation (models/binary_classifier.py) uses async def reconfigure(). Making this example async would better align with the production pattern and allow for async operations during reconfiguration (e.g., async model loading).

♻️ Proposed change
-    def reconfigure(self, config: Config):
+    async def reconfigure(self, config: Config):
         self.threshold = config["threshold"]
         self.batch_size = config["batch_size"]
         print(f"Reconfigured: threshold={self.threshold}")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/adding-models.md` around lines 121 - 124, The example's
reconfigure method is synchronous but the reference implementation uses async;
change the signature from def reconfigure(self, config: Config) to async def
reconfigure(self, config: Config) and update any callers/docs to await
self.reconfigure(...) where appropriate; ensure the body still sets
self.threshold and self.batch_size and optionally show how to await async
model-loading calls inside reconfigure (mirroring models/binary_classifier.py's
pattern).
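
As a rough illustration of the suggested change, the sketch below shows an async `reconfigure` on a plain Python class, with no Ray dependency so it can run on its own. `ExampleModel`, the `Config` fields' types, and `_load_weights` are assumptions made for illustration and are not taken from models/binary_classifier.py.

```python
import asyncio
from typing import TypedDict


class Config(TypedDict):
    # Field names come from the reviewed snippet; the types are assumed.
    threshold: float
    batch_size: int


class ExampleModel:
    """Hypothetical stand-in for the deployment class in the guide."""

    def __init__(self) -> None:
        self.threshold = 0.5
        self.batch_size = 1

    async def _load_weights(self) -> None:
        # Placeholder for asynchronous work such as loading a model.
        await asyncio.sleep(0)

    async def reconfigure(self, config: Config) -> None:
        self.threshold = config["threshold"]
        self.batch_size = config["batch_size"]
        # Being async allows awaiting other coroutines during reconfiguration.
        await self._load_weights()
        print(f"Reconfigured: threshold={self.threshold}")


model = ExampleModel()
asyncio.run(model.reconfigure({"threshold": 0.8, "batch_size": 4}))
```

In a real deployment the serving framework, rather than user code, would normally be the caller of `reconfigure`; the explicit `asyncio.run` here exists only to exercise the method in isolation.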
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@docs/architecture/overview.md`:
- Around line 95-100: Update the example under workerGroupSpecs to match the
deployed ray-service.yaml by changing maxReplicas from 4 to 2, add the replicas
field (to show the current replica count), and include the template pod
specification (e.g., container image, resources, and ports) so the snippet
mirrors the real configuration; locate the existing workerGroupSpecs entry with
groupName: cpu-workers and edit it to include replicas and a template block and
set maxReplicas: 2 to match production.

In `@docs/guides/adding-models.md`:
- Around line 33-39: The example's method-level code is misindented: align the
comment "# Process data and return prediction", the lines "result =
self.predict(data)" and "return {"prediction": result}" and the "def
predict(self, data: dict):" block with the other class methods so they are
indented as instance methods; adjust indentation of the predict method body (the
comment and "return data") to be one additional indent level inside predict to
match method scope, ensuring "predict" and its body are at the same indentation
as other class methods.

In `@docs/guides/configuration-reference.md`:
- Line 44: The `runtime_env` line references a broken anchor
`../guides/adding-models.md#6-managing-dependencies`; update the link to point
to the correct target by either changing it to `../guides/adding-models.md`
(linking the whole guide) or to the exact heading anchor that exists in the
adding-models guide (replace `#6-managing-dependencies` with the correct slug),
or remove the anchor entirely if the reference isn't required; update the
`runtime_env` entry accordingly.

In `@mkdocs.yml`:
- Line 4: Update the mkdocs configuration so edit links point to the
repository's default branch: change the edit_uri value from "edit/master/docs/"
to "edit/main/docs/" in mkdocs.yml (look for the edit_uri key).

In `@README.md`:
- Around line 87-97: The fenced code block showing the repository tree in
README.md is missing a language identifier; update that block (the
triple-backtick before the "model-service/" tree) to include a language token
such as "text" (i.e., change ``` to ```text) so markdownlint is satisfied and
syntax highlighting is applied to the model-service/ tree snippet.
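
For the indentation comment on docs/guides/adding-models.md (lines 33-39) above, the intended layout might look like the following sketch. The class name `ExampleModel` and the `__call__` entry point are assumptions; only the comment text, the `predict` signature, and the statements inside the methods come from the review comment.

```python
class ExampleModel:  # hypothetical class name
    def __call__(self, data: dict) -> dict:  # assumed entry point
        # Process data and return prediction
        result = self.predict(data)
        return {"prediction": result}

    def predict(self, data: dict):
        # Placeholder logic: the body sits one level inside the method.
        return data


print(ExampleModel()({"x": 1}))
```

Both methods sit at the same indentation level inside the class, and each body is indented one further level, which is what the comment asks for.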


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: c814cd11-97de-4f6e-95b7-23c0cadbdc2d

📥 Commits

Reviewing files that changed from the base of the PR and between 6beb06f and 3317382.

📒 Files selected for processing (14)
  • .gitlab-ci.yml
  • README.md
  • docs/architecture/batching.md
  • docs/architecture/overview.md
  • docs/architecture/queues-and-backpressure.md
  • docs/architecture/request-lifecycle.md
  • docs/get-started/quick-start.md
  • docs/guides/adding-models.md
  • docs/guides/configuration-reference.md
  • docs/guides/deployment-guide.md
  • docs/guides/troubleshooting.md
  • docs/index.md
  • mkdocs.yml
  • pyproject.toml
💤 Files with no reviewable changes (1)
  • .gitlab-ci.yml

@matejpekar
Member

@Jurgee add workflow for building documentation


@coderabbitai coderabbitai bot left a comment


♻️ Duplicate comments (1)
docs/guides/configuration-reference.md (1)

44-44: ⚠️ Potential issue | 🟡 Minor

Fix broken anchor link.

The link targets ../guides/adding-models.md#6-managing-dependencies, but the adding-models guide doesn't have a section with that anchor. Update the link to point to the correct target or remove the anchor.

🔗 Suggested fixes

Option 1: Link to the entire guide:

-- `runtime_env`: dynamic environment setup (see [Managing Dependencies](../guides/adding-models.md#6-managing-dependencies)).
+- `runtime_env`: dynamic environment setup (see [Managing Dependencies](../guides/adding-models.md)).

Option 2: Update to use the correct anchor if the section exists with a different slug (verify first).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@docs/guides/configuration-reference.md` at line 44, The `runtime_env` entry
contains a broken anchor link
(`../guides/adding-models.md#6-managing-dependencies`); update the link in the
`runtime_env` line to either point to the whole guide
(`../guides/adding-models.md`) or replace the fragment with the correct section
anchor if the "Managing Dependencies" heading exists under a different
slug—verify the actual heading slug in the adding-models content and update the
href accordingly.
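
One way to carry out the "verify the actual heading slug" step is to derive slugs from the headings and compare them with the link fragment. The sketch below approximates the default slug rule of Python-Markdown's `toc` extension (drop punctuation, lowercase, join words with hyphens); it is an assumption that the site uses that default, and the helper names and sample text are hypothetical. It ignores headings inside code fences and the numeric suffixes added to duplicate headings.

```python
import re
import unicodedata


def slugify(heading: str) -> str:
    """Approximate the default `toc` slug for a heading's text."""
    text = unicodedata.normalize("NFKD", heading)
    text = text.encode("ascii", "ignore").decode("ascii")
    text = re.sub(r"[^\w\s-]", "", text).strip().lower()
    return re.sub(r"[-\s]+", "-", text)


def heading_slugs(markdown: str) -> set[str]:
    """Collect slugs for ATX headings ('# ...' through '###### ...')."""
    slugs = set()
    for line in markdown.splitlines():
        match = re.match(r"^#{1,6}\s+(.*?)\s*#*\s*$", line)
        if match:
            slugs.add(slugify(match.group(1)))
    return slugs


def anchor_exists(markdown: str, anchor: str) -> bool:
    """Return True if the fragment (with or without '#') matches a heading."""
    return anchor.lstrip("#") in heading_slugs(markdown)


# Hypothetical page content; the real adding-models.md is not shown here.
sample = "# Adding Models\n\n## Managing Dependencies\n"
print(anchor_exists(sample, "#6-managing-dependencies"))
print(anchor_exists(sample, "#managing-dependencies"))
```

A fragment such as `#6-managing-dependencies` would only resolve if the heading text itself starts with "6.", so renumbering or renaming a section silently breaks links that carry the old slug.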

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 1e244c17-9339-4b96-a53d-be95a1af6fa8

📥 Commits

Reviewing files that changed from the base of the PR and between 5d1c74e and 0b282eb.

📒 Files selected for processing (5)
  • .github/workflows/build-docs.yml
  • docs/guides/adding-models.md
  • docs/guides/configuration-reference.md
  • docs/guides/deployment-guide.md
  • mkdocs.yml
✅ Files skipped from review due to trivial changes (4)
  • .github/workflows/build-docs.yml
  • mkdocs.yml
  • docs/guides/deployment-guide.md
  • docs/guides/adding-models.md

Member


This doesn't work. Use the provided RationAI template.

4 participants