YOLO-Toys

Multi-model vision inference platform for YOLOv8, DETR, OWL-ViT, Grounding DINO, and BLIP behind one FastAPI + WebSocket interface.

Simplified Chinese · GitHub Pages · Docs · Issues

Why this project exists

YOLO-Toys packages several practical vision tasks behind one consistent API:

- object detection, segmentation, and pose estimation
- open-vocabulary detection
- image captioning and visual question answering
- REST inference and low-latency WebSocket streaming

The project is optimized for people who want to compare model families quickly, build a lightweight demo/backend, or study how to unify mixed vision stacks under one handler architecture.

Quick start

Docker

docker run -p 8000:8000 ghcr.io/lessup/yolo-toys:latest

Open http://localhost:8000.
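
A container can take a while to start accepting requests, so a script may want to wait before sending work. The sketch below polls the `/health` endpoint listed under "What you get". It assumes, without confirmation from this README, that `/health` answers HTTP 200 once the service is ready; the function name `wait_until_healthy` and its parameters are illustrative only.

```python
import time
import urllib.request


def wait_until_healthy(base_url="http://localhost:8000", timeout=60.0, interval=1.0):
    """Poll GET {base_url}/health until it answers 200 or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(f"{base_url}/health", timeout=interval) as resp:
                # ASSUMPTION: a 200 response means the service is ready.
                if resp.status == 200:
                    return True
        except OSError:
            # Connection refused, timeouts, and HTTP error statuses all land here
            # (urllib.error.URLError and HTTPError are subclasses of OSError).
            pass
        time.sleep(interval)
    return False
```

Calling `wait_until_healthy()` right after `docker run` returns `True` once the endpoint responds and `False` if it never does within the timeout.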

Local development

git clone https://github.com/LessUp/yolo-toys.git
cd yolo-toys
bash scripts/dev.sh setup
. .venv/bin/activate
make run

What you get

| Surface | What it is for |
| --- | --- |
| `/infer` | Detection / segmentation / pose / open-vocabulary inference |
| `/caption` | BLIP image captioning |
| `/vqa` | BLIP visual QA |
| `/models`, `/labels` | Model discovery |
| `/ws` | Real-time streaming inference |
| `/metrics`, `/health`, `/system/*` | Operations and observability |
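
As a rough illustration of calling the REST surface, the sketch below posts an image to `/infer` using only the Python standard library. This README does not document the request format, so several details are assumptions: that `/infer` accepts a `multipart/form-data` upload, that the file field is named `file`, and that the model is chosen with a `model` query parameter. Check the docs for the actual parameter names before relying on this.

```python
import json
import urllib.parse
import urllib.request
import uuid

BASE_URL = "http://localhost:8000"  # address from the Docker quick start


def build_multipart(field_name, filename, payload, content_type="image/jpeg"):
    """Encode one file as multipart/form-data; returns (body, Content-Type header)."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field_name}"; filename="{filename}"\r\n'
        f"Content-Type: {content_type}\r\n\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + payload + tail, f"multipart/form-data; boundary={boundary}"


def infer(image_path, model="yolov8n.pt", base_url=BASE_URL):
    """POST an image to /infer and return the decoded JSON response."""
    with open(image_path, "rb") as fh:
        # ASSUMPTION: the upload field is called "file".
        body, content_type = build_multipart("file", "image.jpg", fh.read())
    # ASSUMPTION: the model is selected through a "model" query parameter.
    url = f"{base_url}/infer?" + urllib.parse.urlencode({"model": model})
    request = urllib.request.Request(
        url, data=body, headers={"Content-Type": content_type}, method="POST"
    )
    with urllib.request.urlopen(request, timeout=30) as resp:
        # ASSUMPTION: the response body is JSON.
        return json.loads(resp.read().decode("utf-8"))
```

With the server running, `infer("photo.jpg")` would send the image for detection; `photo.jpg` is a placeholder path.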

Model families

| Family | Examples | Tasks |
| --- | --- | --- |
| YOLOv8 | `yolov8n.pt`, `yolov8n-seg.pt`, `yolov8n-pose.pt` | detect / segment / pose |
| DETR | `facebook/detr-resnet-50` | detect |
| OWL-ViT / Grounding DINO | `google/owlvit-base-patch32` | zero-shot detect |
| BLIP | `Salesforce/blip-image-captioning-base`, `Salesforce/blip-vqa-base` | caption / vqa |
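
One way to picture the table above is as a lookup from model identifier to family and task. The sketch below is a hypothetical helper written for this README, not the project's actual registry or handler code; `ModelInfo` and `describe_model` are invented names, and the prefix rules simply mirror the example identifiers in the table. Grounding DINO is left out because the table gives no example identifier for it.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelInfo:
    family: str
    task: str


def describe_model(model_id: str) -> ModelInfo:
    """Map an example model identifier from the table to its family and task."""
    name = model_id.lower()
    if name.startswith("yolov8"):
        # The YOLOv8 examples encode the task as a filename suffix.
        if "-seg" in name:
            return ModelInfo("YOLOv8", "segment")
        if "-pose" in name:
            return ModelInfo("YOLOv8", "pose")
        return ModelInfo("YOLOv8", "detect")
    if name.startswith("facebook/detr"):
        return ModelInfo("DETR", "detect")
    if name.startswith("google/owlvit"):
        return ModelInfo("OWL-ViT", "zero-shot detect")
    if name.startswith("salesforce/blip-vqa"):
        return ModelInfo("BLIP", "vqa")
    if name.startswith("salesforce/blip-image-captioning"):
        return ModelInfo("BLIP", "caption")
    raise ValueError(f"unrecognized model identifier: {model_id}")
```

For example, `describe_model("yolov8n-pose.pt")` yields the YOLOv8 family with the pose task, while an identifier outside the table raises `ValueError`.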

Repository guide

| Path | Role |
| --- | --- |
| `app/` | backend runtime |
| `tests/` | pytest suite |
| `openspec/` | current specs and change workflow |
| `docs/` | long-form docs |
| root Jekyll files | GitHub Pages landing + navigation |
| `.github/` | workflows, templates, Copilot instructions |

Development workflow

Non-trivial work is OpenSpec-first:

1. explore or clarify
2. propose a change
3. implement from tasks
4. review at phase boundaries
5. archive the completed change

Canonical local commands:

make lint
make format
make hooks
make typecheck
make test

Next step

- Want a guided overview? Start at the GitHub Pages landing site.
- Want setup and API details? Go to docs.
- Want to contribute or finish repository cleanup? Read CONTRIBUTING.md.
