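Blazing-fast inference across hardware: Gemma 2 is optimized to run at high speed on a range of hardware, from powerful gaming laptops and high-end desktops to cloud-based setups. Try Gemma 2 at full precision in Google AI Studio, unlock local performance on a CPU with the quantized version via Gemma.cpp, or run it on a home computer equipped with an NVIDIA RTX or GeForce RTX GPU via Hugging Face Transformers.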
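As a rough illustration of the Hugging Face Transformers route mentioned above, here is a minimal sketch. It assumes that `torch`, `transformers`, and `accelerate` are installed, that the 9B instruction-tuned checkpoint `google/gemma-2-9b-it` is the one you want, and that you have accepted the model license on Hugging Face and are logged in. The helper name `generate_with_gemma2` is hypothetical and only for illustration.

```python
# Assumed checkpoint name on the Hugging Face Hub (gated; license acceptance required).
MODEL_ID = "google/gemma-2-9b-it"


def generate_with_gemma2(prompt: str, max_new_tokens: int = 64) -> str:
    """Load Gemma 2 and generate a completion for `prompt` (illustrative helper)."""
    # Imported lazily so that defining this helper does not require a GPU stack.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # half-precision weights to reduce memory use
        device_map="auto",           # let accelerate place layers on available devices
    )

    # Tokenize the prompt and move the tensors to the model's device.
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    with torch.no_grad():
        output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)

    # The decoded text includes the prompt followed by the generated continuation.
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)
```

Calling `generate_with_gemma2("Explain quantization in one sentence.")` downloads the weights on first use. Whether the model fits depends on the GPU's memory, so a smaller or quantized variant may be needed on consumer cards.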
Original: Now we’re officially releasing Gemma 2 to researchers and developers globally. Available in both 9 billion (9B) and 27 billion (27B) parameter sizes, Gemma 2 is higher-performing and more efficient at inference than the first generation, with significant safety advancements built in. In fact, at 27B, it offers competitive alternatives to models more than twice its size, delivering the kind of performance that was only possible with proprietary models as recently as December. And that’s now achievable on a single NVIDIA H100 Tensor Core GPU or TPU host, significantly reducing deployment costs.
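To make the single-accelerator claim concrete, the sketch below estimates the memory needed for model weights alone at different precisions. It is a back-of-the-envelope calculation, not a figure from the announcement: it uses the nominal parameter counts (9B and 27B), assumes an 80 GB H100 variant, and ignores the KV cache, activations, and framework overhead, all of which add to real usage. The function names are hypothetical.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"float32": 4.0, "bfloat16": 2.0, "int8": 1.0, "int4": 0.5}

# Assumption: the 80 GB H100 variant; other variants have different capacities.
H100_MEMORY_GB = 80.0


def weight_memory_gb(num_params: float, dtype: str) -> float:
    """Approximate memory for the weights only, in gigabytes (1 GB = 1e9 bytes)."""
    return num_params * BYTES_PER_PARAM[dtype] / 1e9


def weights_fit(num_params: float, dtype: str, device_memory_gb: float) -> bool:
    """True if the weights alone fit in the given device memory."""
    return weight_memory_gb(num_params, dtype) <= device_memory_gb


if __name__ == "__main__":
    for name, params in [("Gemma 2 9B", 9e9), ("Gemma 2 27B", 27e9)]:
        for dtype in BYTES_PER_PARAM:
            gb = weight_memory_gb(params, dtype)
            fits = weights_fit(params, dtype, H100_MEMORY_GB)
            print(f"{name} @ {dtype}: ~{gb:.1f} GB weights, fits in 80 GB: {fits}")
```

Under these assumptions, the 27B weights take about 54 GB in bfloat16, which leaves headroom on an 80 GB device, while float32 would need about 108 GB and would not fit.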