⚡ Zero-Stall MoE Inference via Lookahead Prediction & Async DMA Prefetching. Optimized for SSD I/O with Hybrid MLA+Sliding Window Attention.
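The heart of the zero-stall claim is overlapping SSD reads with compute: while layer i runs, a cheap predictor guesses which experts layer i+1 will route to, and their weights are fetched asynchronously. A minimal sketch of that lookahead-prefetch pattern, assuming a blocking `load_expert_from_ssd` callable supplied by the caller; all names here are hypothetical stand-ins, not this repo's API:

```python
# Sketch: overlap expert I/O with compute. Hypothetical names throughout.
from concurrent.futures import ThreadPoolExecutor

class ExpertPrefetcher:
    def __init__(self, load_expert_from_ssd, num_workers=2):
        self.load = load_expert_from_ssd              # blocking SSD read (assumed)
        self.pool = ThreadPoolExecutor(max_workers=num_workers)
        self.pending = {}                             # expert_id -> Future

    def prefetch(self, predicted_expert_ids):
        # Kick off async loads for the experts the lookahead predictor expects.
        for eid in predicted_expert_ids:
            if eid not in self.pending:
                self.pending[eid] = self.pool.submit(self.load, eid)

    def get(self, expert_id):
        # Zero-stall when the prediction was right; falls back to a
        # synchronous load on a misprediction.
        fut = self.pending.pop(expert_id, None)
        return fut.result() if fut is not None else self.load(expert_id)

# usage (hypothetical predictor):
# prefetcher.prefetch(predict_next_layer_experts(router_logits))
# ... compute current layer ...
# weights = prefetcher.get(expert_id)  # ready if the guess was right
```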
Notes on the Mistral AI model
🚀 Sliding Window Attention Training for Efficient Large Language Models
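Sliding window attention restricts each token to attend to a fixed-size window of preceding tokens, which keeps attention memory and compute linear in sequence length during training. A minimal sketch of the causal banded mask this family of repos builds on, assuming a PyTorch layout; the window size is illustrative:

```python
import torch

def sliding_window_mask(seq_len: int, window: int) -> torch.Tensor:
    i = torch.arange(seq_len).unsqueeze(1)  # query positions (rows)
    j = torch.arange(seq_len).unsqueeze(0)  # key positions (cols)
    # causal (j <= i) and within the last `window` tokens (i - j < window)
    return (j <= i) & (i - j < window)

scores = torch.randn(8, 8)
masked = scores.masked_fill(~sliding_window_mask(8, window=3), float("-inf"))
attn = masked.softmax(dim=-1)  # each row attends to at most 3 positions
```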
Ring sliding window attention implementation with flash attention
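Ring attention shards a long sequence across devices and rotates key/value blocks around a ring; combined with a sliding window, each rank only needs the few blocks covering its window rather than a full rotation. A rough single-process simulation of that schedule, using FlashAttention-style online-softmax accumulation; block and window sizes are illustrative, and the ring is simulated by indexing rather than real communication:

```python
import torch

def ring_sliding_attention(q_blocks, k_blocks, v_blocks, window_blocks):
    # Each "rank" r owns one query block; after `step` rotations it holds
    # the K/V block originally at rank r - step. A causal sliding window
    # of `window_blocks` blocks means at most that many ring steps.
    outs = []
    for r, q in enumerate(q_blocks):
        m = torch.full((q.shape[0], 1), float("-inf"))  # running row max
        l = torch.zeros(q.shape[0], 1)                  # running softmax denom
        acc = torch.zeros_like(q)                       # running weighted sum
        for step in range(window_blocks):
            src = r - step
            if src < 0:
                break  # nothing earlier in the sequence
            s = q @ k_blocks[src].T / q.shape[-1] ** 0.5
            if step == 0:
                # causal mask inside the diagonal block
                upper = torch.triu(torch.ones_like(s), diagonal=1).bool()
                s = s.masked_fill(upper, float("-inf"))
            # online-softmax update, as in FlashAttention
            m_new = torch.maximum(m, s.max(dim=-1, keepdim=True).values)
            p = (s - m_new).exp()
            scale = (m - m_new).exp()
            l = l * scale + p.sum(dim=-1, keepdim=True)
            acc = acc * scale + p @ v_blocks[src]
            m = m_new
        outs.append(acc / l)
    return torch.cat(outs)

# four "ranks", block size 16, window of 2 blocks
blocks = [torch.randn(16, 64) for _ in range(4)]
out = ring_sliding_attention(blocks, blocks, blocks, window_blocks=2)
```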
A high-performance terminal-integrated LLM engine in Rust