Deliverables
Add an RTL implementation of the nonlinear operators to the prototype lib (under the arch path).
Open a pull request (PR) containing a test written in C for the common nonlinear operations and a README that introduces your design.
Report the performance results in this issue.
Task Description
LLMs rely heavily on nonlinear operators like SiLU, RMSNorm, and Softmax. While linear layers are highly optimised, these nonlinear functions often require expensive high-precision floating-point arithmetic. Traditional lookup table (LUT) methods often fail to handle the extreme outliers common in modern LLMs, leading to significant accuracy degradation.
NLI (Non-uniform Linear Interpolation) is a calibration-free, hardware-friendly framework designed to efficiently approximate different nonlinear functions. By finding optimal cutpoints offline, NLI maintains high accuracy even for the extreme outliers seen in LLM inference, where traditional LUT methods usually collapse.
The methodology is described in the paper "NLI: Non-uniform Linear Interpolation Approximation of Nonlinear Operations for Efficient LLMs Inference". Design a ball for Buckyball based on this methodology.