Rate-distortion optimization for transformer inference

Split computing for language models, extending the theory of usable information.

April 2026 · Anderson de Andrade, Alon Harell, Ivan V. Bajić