Poster is on display.
Feather is a high-performance emulation library that brings FP8 (E5M2 \& E4M3) precision arithmetic to older GPU architectures (Ampere, Turing, Volta) that lack native hardware support.
By implementing custom bit-packing and optimized Triton kernels, Feather bypasses the memory bandwidth bottleneck, achieving 3x speedup over native PyTorch FP32 \& PyTorch FP16 operations on memory-bound workloads.