Production-ready #![no_std] HTTP/WebSocket server with async/await via Linux syscalls on x86_64.
- Zero Dependencies: Pure Rust, no standard library, no external crates (except
spinfor locks) - 32KB Binary: Stripped, statically linked ELF executable
- Lock-Free Scheduler: Treiber stack for task scheduling
- Multi-Threaded: Real threads with TLS support (2-16 workers)
- Full Async/Await: Poll-based futures with proper waker support
- HTTP Server: Static file serving with embedded HTML
- WebSocket Server: Full handshake + frame parsing with echo functionality
- Production Ready: 28/28 tests passing, stress tested
Workspace Structure:
├── src/main.rs # Entry point (108 lines)
└── crates/
├── syscall/ # Raw syscall wrappers (407 lines)
├── runtime/ # Core async runtime (413 lines)
│ ├── config.rs # Configuration constants
│ ├── allocator.rs # 16MB bump allocator
│ ├── scheduler.rs # Lock-free task scheduler
│ └── io_registry.rs # FD waker registration
├── executor/ # Worker thread pool (86 lines)
├── net/ # Async network futures (185 lines)
├── http/ # HTTP request/response (110 lines)
└── websocket/ # WebSocket protocol (437 lines)
Total: ~2000 lines of Rust code
# Build (requires nightly for build-std)
cargo +nightly build --release
# Run server
./target/x86_64-unknown-none/release/async-nostd <workers> [ip] [port]
# Examples
./target/x86_64-unknown-none/release/async-nostd 8 # 8 workers, 0.0.0.0:8000
./target/x86_64-unknown-none/release/async-nostd 4 127.0.0.1 8080 # custom# All tests (28 tests)
python3 test.py
# Filtered tests
python3 test.py http # HTTP tests only
python3 test.py ws # WebSocket tests only
python3 test.py stress # Stress tests only
python3 test.py browser # Browser simulation tests
python3 test.py concurrent # Concurrent tests only
# Test modes
python3 test.py all # All tests (default)
python3 test.py multi # All multi-threaded tests- Server Log:
/tmp/async-nostd.log(truncated on each start) - Test Log:
/tmp/test_all_output.log(full test output)
Log format:
[ACCEPT] fd=5 # Connection accepted
[HTTP] fd=5 recv request # HTTP request received
[HTTP] fd=5 route=/ (index) # Route determined
[HTTP] fd=5 done, closing # Request completed
[WS] fd=7 handshake complete # WebSocket handshake
[WS] fd=7 recv 18 bytes # WebSocket data
[ppoll] monitoring 3 fds # ppoll status
[ppoll] removing closed fd=7 # Cleanup
Edit crates/runtime/config.rs:
pub const MAX_TASK_SLOTS: usize = 1024; // Max concurrent tasks
pub const HEAP_SIZE: usize = 16 * 1024 * 1024; // Allocator heap
pub const WORKER_STACK_SIZE: usize = 64 * 1024; // Thread stack
pub const LISTEN_BACKLOG: i32 = 128; // Socket backlog- Binary Size: 35KB (stripped)
- Memory: 16MB heap + 64KB per worker thread
- Concurrency: Handles 100+ concurrent connections
- Latency: Sub-millisecond response time
- Throughput: 10K+ requests/second (8 workers)
- Dedicated blocking thread for
accept4()syscall - Eliminates accept/ppoll race conditions
- Spawns async task per connection
- Treiber stack for ready tasks
- 1024 task slots with generation counters
- Node freelist (256 cap) to reduce allocations
ppoll()with infinite timeout- Eventfd for task wake notifications
- Automatic cleanup of closed FDs (POLLHUP/POLLERR)
- Bump allocator: 16MB mmap, CAS-based, no deallocation
- Stack per worker: 64KB mmap
- TLS per thread: 4KB with self-pointer (x86_64 ABI)
- SHA-1 + Base64 handshake
- Frame fragmentation support
- Ping/Pong handling
- Binary opcode for ArrayBuffer compatibility
Segfault?
- Check stack alignment in
_start()(must be 16-byte aligned) - Verify clone stack pointer points to TOP of mmap
- Ensure
unsafewrappers aroundasm!(Edition 2024)
Hangs?
- Check if fd registered with
register_fd_waker() - Verify socket is non-blocking (
O_NONBLOCK) - Monitor logs:
tail -f /tmp/async-nostd.log
Build errors?
- Use
cargo +nightly(build-std requires nightly) - Check
.cargo/config.tomlexists
MIT