Does RoPE mess with the semantics of the vectors, and what would you do differently? ➝
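For context on the question, here is a minimal numpy sketch of what RoPE actually does to a query/key vector: it rotates consecutive coordinate pairs by position-dependent angles, so attention scores depend only on relative position while the vector's norm is preserved. Dimensions and the base constant below are the usual defaults, used here purely for illustration.

```python
import numpy as np

def rope(x: np.ndarray, pos: int, base: float = 10000.0) -> np.ndarray:
    """Apply rotary position embedding to one vector x at position `pos`.

    Each consecutive pair (x[2i], x[2i+1]) is rotated by the angle
    theta_i = pos / base**(2i/d). A pure rotation, so the norm is kept."""
    d = x.shape[-1]
    assert d % 2 == 0, "RoPE needs an even dimension"
    half = np.arange(d // 2)
    theta = pos / base ** (2 * half / d)   # per-pair rotation angles
    cos, sin = np.cos(theta), np.sin(theta)
    x1, x2 = x[0::2], x[1::2]              # even/odd coordinates of each pair
    out = np.empty_like(x)
    out[0::2] = x1 * cos - x2 * sin        # 2D rotation of each pair
    out[1::2] = x1 * sin + x2 * cos
    return out

q, k = np.random.randn(8), np.random.randn(8)
# Relative-position property: <rope(q, m), rope(k, n)> depends only on m - n.
a = rope(q, 5) @ rope(k, 3)
b = rope(q, 9) @ rope(k, 7)
print(np.allclose(a, b))  # True: same offset m - n = 2
# "Semantics" in the norm sense is untouched: rotations preserve length.
print(np.allclose(np.linalg.norm(rope(q, 5)), np.linalg.norm(q)))  # True
```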
Claim: For any n-gram language model, there exists a state space language model that can simulate it with arbitrarily small error.
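A toy sketch of the intuition behind the claim (this illustrates the construction idea, not the paper's actual proof or error bound): a state-space recurrence whose state acts as a shift register over one-hot encodings of the last n-1 tokens can feed a readout that reproduces any n-gram table. Vocabulary size and n below are arbitrary.

```python
import numpy as np

V, n = 3, 3                      # trigram model over a 3-token vocabulary
rng = np.random.default_rng(0)
table = rng.dirichlet(np.ones(V), size=(V,) * (n - 1))  # P(x_t | last n-1)

d = (n - 1) * V                  # state: n-1 one-hot blocks of size V
A = np.zeros((d, d))
A[V:, :-V] = np.eye(d - V)       # transition shifts blocks down by one slot
B = np.zeros((d, V))
B[:V] = np.eye(V)                # input writes the new token into block 0

def step(h, token):
    """One SSM step: h' = A h + B e_token (a purely linear state update)."""
    e = np.zeros(V)
    e[token] = 1.0
    return A @ h + B @ e

def readout(h):
    """Decode the stored context from the state and look up P(next token)."""
    ctx = tuple(int(np.argmax(h[i * V:(i + 1) * V])) for i in range(n - 1))
    return table[ctx[::-1]]       # blocks hold the most recent token first

h = np.zeros(d)
for tok in [0, 2]:               # feed a context of n-1 = 2 tokens
    h = step(h, tok)
print(readout(h))                # matches table[0, 2] exactly
print(table[0, 2])               # the trigram distribution for context (0, 2)
```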
DeepSeek's sparse attention mechanism for efficient long-context processing.
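As a rough illustration of the general idea behind this family of methods (not DeepSeek's exact design, which I won't reproduce here): each query attends only to its top-k keys under a relevance score, cutting attention cost from O(L²) toward O(L·k). The scoring rule and shapes below are assumptions for the sketch; efficient implementations avoid materializing the full score matrix.

```python
import numpy as np

def topk_sparse_attention(Q, K, V, k=4):
    """Each query attends only to its k highest-scoring keys.

    Toy dense-score version for clarity; the point is that the softmax
    and value mixing run over k keys instead of all L of them."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])             # (L, L) relevance
    idx = np.argpartition(scores, -k, axis=-1)[:, -k:]  # top-k key indices
    out = np.zeros_like(Q)
    for i in range(Q.shape[0]):
        s = scores[i, idx[i]]
        w = np.exp(s - s.max())
        w /= w.sum()                                    # softmax over k keys
        out[i] = w @ V[idx[i]]
    return out

L, d = 16, 8
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((L, d)) for _ in range(3))
print(topk_sparse_attention(Q, K, V, k=4).shape)        # (16, 8)
```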
How a 7M-parameter model beats models 100x its size at Sudoku, Mazes, and ARC-AGI using recursive reasoning.
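The gist of recursive reasoning, as a loose caricature rather than the paper's architecture: a tiny network is applied to its own latent scratchpad over and over, so compute depth comes from iteration rather than parameter count. Everything below (sizes, update rule, step count) is illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
d = 32
W = rng.standard_normal((d, d)) / np.sqrt(d)  # one small shared layer

def refine(x, z, steps=16):
    """Recursive-reasoning caricature: reuse the SAME small network
    `steps` times, updating a latent scratchpad z conditioned on the
    puzzle embedding x, then read the answer off the final z."""
    for _ in range(steps):
        z = np.tanh(W @ z + x)   # weight sharing: depth without parameters
    return z

x = rng.standard_normal(d)       # puzzle embedding (illustrative)
z = np.zeros(d)                  # initial scratchpad
print(refine(x, z)[:4])          # refined latent after 16 recursive steps
```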
NVIDIA's 4-bit training methodology, achieving a 2-3x speedup and 50% memory reduction.
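Very roughly, low-precision training quantizes tensors to 4-bit on the fly inside matmuls while a higher-precision master copy of the weights receives the updates. The sketch below uses a plain symmetric int4 quantizer, which is an assumption, not NVIDIA's actual 4-bit format or training recipe.

```python
import numpy as np

def quantize_int4(x):
    """Symmetric per-tensor int4 'fake quantization': round to one of 16
    levels, return the dequantized float a 4-bit matmul would see."""
    scale = np.abs(x).max() / 7.0 + 1e-12   # int4 range -8..7; use +/-7
    q = np.clip(np.round(x / scale), -8, 7)
    return q * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 64)).astype(np.float32)  # fp32 master weights
x = rng.standard_normal(64).astype(np.float32)

y = quantize_int4(W) @ quantize_int4(x)  # forward pass in simulated 4-bit
y_ref = W @ x                            # full-precision reference
print(np.abs(y - y_ref).mean())          # quantization error stays modest
```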
Diffusion Transformers with Representation Autoencoders achieve a state-of-the-art FID of 1.13 on ImageNet.
Quantization-enhanced Reinforcement Learning for LLMs enables RL training of 32B models on a single GPU.
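The memory math that makes a single GPU plausible (a generic quantized-base-plus-LoRA sketch; whether this matches the paper's exact scheme is an assumption): the base weights sit frozen in 4-bit, and only small low-rank adapters receive gradients and optimizer state from the RL objective.

```python
import numpy as np

rng = np.random.default_rng(0)
d, r = 64, 4                                   # hidden size, LoRA rank

def quantize_int4(x):
    scale = np.abs(x).max() / 7.0 + 1e-12
    return np.clip(np.round(x / scale), -8, 7) * scale

W_q = quantize_int4(rng.standard_normal((d, d)))  # frozen 4-bit base weight
A = rng.standard_normal((r, d)) * 0.01            # trainable LoRA factor
B = np.zeros((d, r))                              # B starts at zero

def forward(x):
    # effective weight = quantized frozen base + low-rank trainable delta
    return W_q @ x + B @ (A @ x)

x = rng.standard_normal(d)
print(forward(x)[:4])
# Only A and B (2 * d * r = 512 floats here) need gradients and optimizer
# state; the d*d base weight stays quantized and frozen, which is where
# the memory savings for a 32B model would come from.
```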