Enterprise video conferencing powered by a custom SFU built from scratch in Rust.
Vertex is a high-performance video conferencing platform designed for real-time communication at scale. At its core is a Selective Forwarding Unit (SFU) implemented entirely in Rust, enabling low-latency, multi-party video calls with adaptive quality through simulcast.
- Multi-party video calls -- support for up to 50 concurrent participants in a single session.
- Screen sharing -- present your screen to all participants with minimal latency.
- In-session chat -- text messaging alongside video for seamless collaboration.
- Waiting room -- control participant entry with a managed waiting room.
- Recording -- capture sessions for later review or distribution.
- Admin dashboard -- monitor active sessions, manage users, and configure platform settings.
- Simulcast -- three quality layers (high, medium, low) for adaptive video delivery based on each participant's network conditions.
Video conferencing architectures generally fall into three categories:
- Mesh: Every participant sends their media stream directly to every other participant. This approach is simple but scales poorly -- bandwidth and CPU usage grow quadratically with the number of participants.
- MCU (Multipoint Control Unit): A central server receives all streams, decodes them, composites them into a single mixed stream, and sends that back to each participant. This reduces client-side bandwidth but demands significant server-side processing power.
- SFU (Selective Forwarding Unit): A central server receives each participant's stream and selectively forwards it to other participants without decoding or re-encoding. This strikes the best balance between scalability and resource efficiency.
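The scaling difference between the three topologies can be made concrete with a quick stream count. The functions below are an illustrative sketch (names are ours, not part of Vertex's codebase): in a mesh, the total number of streams on the network grows quadratically with participant count, while under an SFU each client's uplink stays constant at one stream.

```rust
/// Total media streams crossing the network in a full mesh of `n` clients:
/// each of the n clients sends to the other n - 1.
fn mesh_total_streams(n: u64) -> u64 {
    n * (n - 1)
}

/// With an SFU, every client uploads exactly one stream regardless of n.
fn sfu_client_uplink(_n: u64) -> u64 {
    1
}

/// Each client still downloads one forwarded stream per other participant.
fn sfu_client_downlink(n: u64) -> u64 {
    n - 1
}
```

At 50 participants a mesh would carry 2,450 streams in total, while an SFU client still uploads a single stream; this is the asymmetry the SFU design exploits.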
Vertex uses a custom SFU built from the ground up in Rust. The server receives media streams via WebRTC, makes intelligent forwarding decisions, and relays the appropriate streams to each participant. Because the SFU never decodes or re-encodes video, server-side CPU usage remains low even at high participant counts.
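The forward-without-transcoding idea can be sketched in a few lines of Rust. This is a simplified, hypothetical model (the `Sfu` struct and its methods are illustrative, not Vertex's actual types): encoded packets are routed by publisher ID to each subscriber's outbound queue, and the payload bytes are never inspected, decoded, or re-encoded.

```rust
use std::collections::HashMap;

/// Minimal sketch of an SFU forwarding core. Real SFUs operate on RTP
/// packets over async sockets; this model keeps only the routing logic.
struct Sfu {
    /// publisher id -> ids of participants subscribed to that stream
    subscriptions: HashMap<u32, Vec<u32>>,
    /// participant id -> queue of encoded packets awaiting delivery
    outboxes: HashMap<u32, Vec<Vec<u8>>>,
}

impl Sfu {
    fn new() -> Self {
        Sfu { subscriptions: HashMap::new(), outboxes: HashMap::new() }
    }

    fn subscribe(&mut self, publisher: u32, subscriber: u32) {
        self.subscriptions.entry(publisher).or_default().push(subscriber);
        self.outboxes.entry(subscriber).or_default();
    }

    /// Relay an encoded packet verbatim to every subscriber --
    /// no decode, no re-encode, just a copy into each outbound queue.
    fn forward(&mut self, publisher: u32, packet: &[u8]) {
        if let Some(subs) = self.subscriptions.get(&publisher) {
            for sub in subs {
                if let Some(outbox) = self.outboxes.get_mut(sub) {
                    outbox.push(packet.to_vec());
                }
            }
        }
    }
}
```

Because `forward` only copies bytes, its cost is independent of codec complexity, which is why server-side CPU stays flat as participants join.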
Each publishing client encodes its video at three quality layers:
| Layer | Resolution | Use Case |
|---|---|---|
| High | Full | Active speaker, pinned video |
| Medium | Reduced | Gallery view, moderate bandwidth |
| Low | Minimal | Thumbnails, constrained networks |
The SFU dynamically selects which layer to forward to each subscriber based on factors such as available bandwidth, viewport size, and speaker activity. This ensures every participant receives the best possible quality given their constraints, without requiring the publisher to send separate streams to each viewer.
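A layer-selection policy of this kind might look like the sketch below. The thresholds and the `is_pinned` signal are illustrative assumptions, not Vertex's actual tuning: the point is that the decision is a cheap per-subscriber function of observed conditions.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Layer {
    High,
    Medium,
    Low,
}

/// Hypothetical per-subscriber layer selection. Bandwidth cutoffs here
/// are placeholder values chosen for illustration only.
fn select_layer(bandwidth_kbps: u32, is_pinned: bool) -> Layer {
    if is_pinned && bandwidth_kbps >= 1500 {
        // Pinned or active-speaker tiles get the full-resolution layer
        // when the link can sustain it.
        Layer::High
    } else if bandwidth_kbps >= 500 {
        Layer::Medium
    } else {
        Layer::Low
    }
}
```

Because the publisher already sends all three layers, switching a subscriber between them is just a change in which packets the SFU forwards; no renegotiation with the publisher is needed.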
Backend:

| Technology | Role |
|---|---|
| Rust | Core language for the SFU and server logic |
| Tokio | Asynchronous runtime for concurrent I/O |
| Axum | HTTP framework for API endpoints |
| WebRTC | Real-time media transport protocol |
Frontend:

| Technology | Role |
|---|---|
| Next.js | React framework for the UI |
| WebRTC API | Browser-side media handling |
Prerequisites:

- Rust (stable toolchain)
- Node.js (v18 or later)
- npm
```
cd sfu
cargo run --release
```

The SFU server will start and listen for WebRTC connections.
```
cd frontend
npm install
npm run dev
```

The frontend development server will be available at http://localhost:3000.
| Metric | Value |
|---|---|
| Max concurrent participants | 50 |
| Simulcast layers | 3 (high/med/low) |
| Audio latency | < 150ms |
The custom Rust SFU, combined with Tokio's async runtime, ensures efficient resource utilization under load. Simulcast reduces bandwidth consumption by forwarding only the appropriate quality layer to each participant, rather than relaying full-resolution streams to all viewers.
This project is licensed under the MIT License.