链载Ai

Title: Llama-3 120B: everyone who has tried it praises it, and Ollama can run it in 48 GB of VRAM!

Author: 链载Ai  Time: the day before yesterday, 09:49

Meta-Llama-3-120B-Instruct has made it into the Top 10 of the Hugging Face trending list. It is a self-merge of Meta-Llama-3-70B-Instruct, produced with the MergeKit tool.

Comments from users

Llama3-120B really does show more intelligence than GPT-4 on these hard questions

query: Does observing the Higgs field change its state?

GPT-4 -> No
Llama3-120B -> Only if we question the Copenhagen interpretation of quantum mechanics. Let me explain...

https://twitter.com/spectate_or/status/1787308316152242289

Asking Llama-3-120B to explain the joke below (which actually happened)

It easily beat im-also-a-good-gpt2-chatbot and im-a-good-gpt2-chatbot.

https://twitter.com/spectate_or/status/1788031383052374069

llama3-120B performs quite well in bfloat16

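It has some weak spots in math and coding, but it is the first open-source (OSS) model I have seen that can reliably compete with Opus and GPT-4 across a variety of tasks. With a good finetune and some additional RLHF (reinforcement learning from human feedback), it could come close to matching them.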

https://twitter.com/_xjdr/status/1787666447612985456

An interesting topic: Meta-Llama3-120B, a native self-merge of Llama3 meant to beat GPT-4

Although I do not endorse every view in the video

https://twitter.com/GG_Ashbrook/status/1788365679860596957

Chatting with the Llama3-120B version: this thing is so smart

https://twitter.com/erhartford/status/1787050962114207886

Ollama+Llama3-120b

Emergence of intelligence: data + model depth?

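The only difference between Llama3-120b and Llama3-70b is the extra layers, and those are simply duplicated layers. No new information was trained in. So this level of intelligence really does emerge from the depth of the model. It is not just a function of the training data; it is a combination of data and depth.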

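Does this suggest that intelligence emerges not only from the amount of training data, but from the combination of data and model depth (that is, the model's complexity or number of layers)?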
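To make the "only extra layers" point concrete, here is a rough parameter-count sketch. The architecture constants are assumptions taken from the published Llama-3-70B configuration and are not stated in this post; the 140-layer figure assumes the merge stacks seven slices of 20 layers each. The 48 GB figure comes from the title and is treated here as weights only, ignoring the KV cache and activations.

```python
# Assumed Llama-3-70B architecture values (not given in the post itself).
HIDDEN = 8192          # hidden size
INTERMEDIATE = 28672   # MLP intermediate size
N_HEADS = 64           # attention heads
N_KV_HEADS = 8         # key/value heads (grouped-query attention)
VOCAB = 128256         # vocabulary size


def params_per_layer():
    """Parameters in one decoder layer: attention + MLP + two norm vectors."""
    head_dim = HIDDEN // N_HEADS
    kv_dim = N_KV_HEADS * head_dim
    attn = 2 * HIDDEN * HIDDEN + 2 * HIDDEN * kv_dim  # q and o, then k and v
    mlp = 3 * HIDDEN * INTERMEDIATE                   # gate, up, down projections
    norms = 2 * HIDDEN
    return attn + mlp + norms


def total_params(n_layers):
    """Total parameters for a model with n_layers decoder layers."""
    embed_and_head = 2 * VOCAB * HIDDEN  # untied input embedding and output head
    final_norm = HIDDEN
    return n_layers * params_per_layer() + embed_and_head + final_norm


def bits_per_weight(memory_bytes, n_params):
    """Average bits available per weight if memory_bytes holds only the weights."""
    return memory_bytes * 8 / n_params


base = total_params(80)     # original 70B model
merged = total_params(140)  # assumed depth of the self-merge
print(f"80 layers:  {base / 1e9:.1f}B parameters")
print(f"140 layers: {merged / 1e9:.1f}B parameters")
print(f"48 GB budget: {bits_per_weight(48e9, merged):.2f} bits per weight")
```

Under these assumptions the growth from roughly 70B to roughly 120B parameters comes entirely from repeated decoder layers, and fitting the weights into 48 GB would call for a fairly aggressive quantization of about 3 bits per weight.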

Llama3-120b configuration

slices:
- sources:
  - layer_range: [0, 20]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [10, 30]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [20, 40]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [30, 50]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [40, 60]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [50, 70]
    model: meta-llama/Meta-Llama-3-70B-Instruct
- sources:
  - layer_range: [60, 80]
    model: meta-llama/Meta-Llama-3-70B-Instruct
merge_method: passthrough
dtype: float16
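As a small illustration of what this configuration stacks, the sketch below expands the slice ranges into the order of source layers in the merged model. It assumes each layer_range is half-open, [start, end), which matches seven slices of 20 layers; the helper name stacked_layers is made up for this example and is not part of MergeKit.

```python
from collections import Counter

# Slice ranges copied from the configuration above, read as half-open [start, end).
SLICES = [(0, 20), (10, 30), (20, 40), (30, 50), (40, 60), (50, 70), (60, 80)]


def stacked_layers(slices):
    """Return the source-layer index of every layer in the merged model, in order."""
    order = []
    for start, end in slices:
        order.extend(range(start, end))
    return order


layers = stacked_layers(SLICES)
copies = Counter(layers)  # how many times each source layer is used

print("merged depth:", len(layers))
print("layers used once:", sorted(i for i, n in copies.items() if n == 1))
print("layers used twice:", sum(1 for n in copies.values() if n == 2))
```

Read this way, the first and last ten layers of the 70B model appear once, and every layer in between appears twice, with each slice overlapping the previous one by ten layers.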

https://hf-mirror.com/mlabonne/Meta-Llama-3-120B-Instruct

Welcome to 链载Ai (https://www.lianzai.com/)