Post-transformer inference: 224× compression of Llama-70B with improved accuracy
72 points • anima-core • 3 days ago • 55 comments