Rate-distortion optimization for transformer inference
Split computing for language models, extending the theory of usable information.
Isolate the common information between two dependent computer vision tasks.
Theoretical considerations and evaluation of split and distillation points.
Task reconstruction loss acts as a regularizer, improving rate-distortion performance in coding for humans and machines (a minimal objective sketch follows this list).
Improving the shared channel in coding for machines (CfM).
A comparison between conditional and residual entropy codecs for a two-channel system of tasks with nested information.
Unified batch and online transformer inference.
Graph representations using a learnable attention mechanism to sample node neighbourhoods within a graph.
Dataset for emotion classification of long-form narratives.
Sentence embeddings augmented by universal part-of-speech tags, evaluated on low-resource languages.
Dictionary-based approach for the extraction of “aspect-of” relationships.
Evaluate the performance impact of optimization algorithms, activation functions, dropout, and maxout networks in CNNs.
Benchmark of stochastic gradient descent and Nesterov’s accelerated gradient for text classification (see the optimizer sketch after this list).
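
For the task-reconstruction regularizer mentioned above, here is a minimal sketch of one way such a combined objective could look, assuming PyTorch. The function name `rd_task_loss`, the weights `lam` and `beta`, and the dummy tensors are illustrative assumptions, not the project's actual implementation.

```python
import torch
import torch.nn.functional as F

def rd_task_loss(x, x_hat, rate, logits, labels, lam=1.0, beta=0.1):
    """Combined objective: rate + lam * distortion + beta * task loss.

    The task term regularizes the codec toward representations that
    stay useful for the downstream machine task (hypothetical weighting).
    """
    distortion = F.mse_loss(x_hat, x)        # distortion for human viewing
    task = F.cross_entropy(logits, labels)   # downstream machine-task loss
    return rate + lam * distortion + beta * task

# Illustrative usage with dummy tensors:
x = torch.randn(8, 3, 32, 32)                # original images
x_hat = x + 0.05 * torch.randn_like(x)       # stand-in reconstruction
rate = torch.tensor(0.2)                     # stand-in estimated rate term
logits = torch.randn(8, 10)                  # stand-in task-head outputs
labels = torch.randint(0, 10, (8,))
print(rd_task_loss(x, x_hat, rate, logits, labels).item())
```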
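
And for the SGD vs. Nesterov benchmark, a minimal sketch of how such a comparison might be set up, again assuming PyTorch. The linear model, dimensions, and hyperparameters are placeholders, not the benchmark's actual configuration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def final_loss(nesterov: bool, steps: int = 50) -> float:
    """Train a toy classifier with plain or Nesterov momentum SGD."""
    torch.manual_seed(0)                     # identical init/data for both runs
    model = nn.Linear(300, 2)                # toy bag-of-embeddings classifier
    opt = torch.optim.SGD(model.parameters(), lr=0.1,
                          momentum=0.9, nesterov=nesterov)
    x = torch.randn(64, 300)                 # dummy sentence vectors
    y = torch.randint(0, 2, (64,))           # dummy binary labels
    for _ in range(steps):
        loss = F.cross_entropy(model(x), y)
        opt.zero_grad()
        loss.backward()
        opt.step()
    return loss.item()

for nesterov in (False, True):
    print(f"nesterov={nesterov}: loss after 50 steps = {final_loss(nesterov):.4f}")
```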