
CogVideoX-5B: Newly Open-Sourced! You Can Now Run AI Text-to-Video Generation Locally (12 GB VRAM)

Posted by 链载Ai, yesterday at 11:13

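Zhipu has open-sourced CogVideoX-5b, the newest model in its CogVideoX series. It generates higher-quality video with better visual results than CogVideoX-2B, the version open-sourced earlier.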

The GIF is a bit choppy ...

Test case 1 (AI-杨百万):
Prompt: Picture this: a sleek, confident cat lounging casually in a sun-drenched room, its fur glistening under the warm rays. But what sets this feline apart is not just its glossy coat or the graceful poise it exudes; it's the pair of stylish sunglasses perched on its nose, adding an air of mystery and coolness to its demeanor. The sunglasses, with their reflective lenses, hide the cat's enigmatic eyes, making it seem as if it's pondering life's mysteries or perhaps just planning its next mischievous adventure. As sunlight filters through the window, casting patterns on the floor, the cat, utterly unfazed by its unusual accessory, gives off a vibe of effortless chic. It sits there, a picture of serenity and detachment, occasionally flicking its tail or letting out a soft purr, completely embodying the essence of cool. This cat doesn't just wear sunglasses; it owns the look, making anyone who glances its way do a double-take, charmed by the sight of such an unexpected yet striking fashion statement.

Test case 2 (AI-杨百万):
Prompt: In a heartwarming scene, a delightful panda bear finds itself in the gentle embrace of a human, engaging in what can only be described as a whimsical dance. The panda, with its striking black and white fur, looks up with trusting, curious eyes, its round face framed by fuzzy ears. The human, filled with joy and awe, carefully supports the panda's soft, plump body, guiding it in a series of gentle, swaying movements. As they move together, the panda's clumsy yet endearing attempts to mimic the rhythm create a moment of pure magic. Its tiny paws occasionally reach out, touching the human's hands, as if trying to understand this novel form of interaction. Around them, the air is filled with laughter and soft music, enhancing the enchantment of their dance. This unique encounter, a blend of nature's innocence and human affection, unfolds like a tender dance of friendship, leaving an indelible mark of joy and connection on all who witness it.

Hardware requirements for inference:

CogVideoX-2B:

  • FP16 precision
    • Using diffusers: 12.5 GB VRAM

  • INT8 precision
    • Using diffusers with torchao: 7.8 GB VRAM

CogVideoX-5B:

  • BF16 precision
    • Using diffusers: 20.7 GB VRAM

  • INT8 precision
    • Using diffusers with torchao: 11.4 GB VRAM
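To see at a glance which of these configurations fits a particular card, the figures can be put into a small lookup. This is only a sketch: the numbers are the ones listed above, while the helper name `fitting_configs` and the choice to compare against raw VRAM with no safety margin are assumptions made for this example.

```python
# VRAM requirements in GB, copied from the list above.
REQUIREMENTS_GB = {
    ("CogVideoX-2B", "FP16"): 12.5,
    ("CogVideoX-2B", "INT8"): 7.8,
    ("CogVideoX-5B", "BF16"): 20.7,
    ("CogVideoX-5B", "INT8"): 11.4,
}


def fitting_configs(vram_gb):
    """Return the (model, precision) pairs whose listed requirement fits.

    Hypothetical helper: it compares raw numbers only and ignores memory
    taken by the desktop, other processes, or framework overhead.
    """
    return sorted(
        (model, precision)
        for (model, precision), needed in REQUIREMENTS_GB.items()
        if needed <= vram_gb
    )


if __name__ == "__main__":
    for vram in (8, 12, 24):
        print(f"{vram} GB:", fitting_configs(vram))
```

By these figures a 12 GB card fits both INT8 configurations but neither FP16 nor BF16, which matches the 12 GB mentioned in the title.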

The demo interface looks like this:

Quick start

The model can now be deployed with Hugging Face's diffusers library. Follow the steps below.

Install the required dependencies

# diffusers>=0.30.1
# transformers>=4.44.0
# accelerate>=0.33.0 (installing from source is recommended)
# imageio-ffmpeg>=0.5.1

pip install --upgrade transformers accelerate diffusers imageio-ffmpeg
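After installing, it can help to confirm that the environment actually meets those minimums. The sketch below uses only the standard library. The helper names `parse` and `check` are made up for this example, the version parser is deliberately simplistic (it keeps only the leading numeric parts), and the transformers minimum is taken to be 4.44.0 on the assumption that the requirement refers to the 4.x release line.

```python
from importlib.metadata import PackageNotFoundError, version

# Minimum versions from the dependency list above.
# Assumption: the transformers minimum is 4.44.0 (4.x release numbering).
MINIMUMS = {
    "diffusers": "0.30.1",
    "transformers": "4.44.0",
    "accelerate": "0.33.0",
    "imageio-ffmpeg": "0.5.1",
}


def parse(v):
    """Turn '0.30.1' or '0.31.0.dev0' into a tuple of leading integers."""
    parts = []
    for piece in v.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break  # stop at the first non-numeric component, e.g. 'dev0'
        parts.append(int(digits))
    return tuple(parts)


def check(minimums=MINIMUMS):
    """Return {package: (installed_version_or_None, meets_minimum)}."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            report[name] = (None, False)
            continue
        report[name] = (installed, parse(installed) >= parse(minimum))
    return report


if __name__ == "__main__":
    for name, (installed, ok) in check().items():
        status = "ok" if ok else "needs upgrade"
        print(f"{name}: {installed or 'not installed'} -> {status}")
```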

Run the code (BF16 / FP16)

import torch
from diffusers import CogVideoXPipeline
from diffusers.utils import export_to_video

prompt = (
    "A panda, dressed in a small, red jacket and a tiny hat, sits on a wooden stool "
    "in a serene bamboo forest. The panda's fluffy paws strum a miniature acoustic "
    "guitar, producing soft, melodic tunes. Nearby, a few other pandas gather, "
    "watching curiously and some clapping in rhythm. Sunlight filters through the tall "
    "bamboo, casting a gentle glow on the scene. The panda's face is expressive, showing "
    "concentration and joy as it plays. The background includes a small, flowing stream "
    "and vibrant green foliage, enhancing the peaceful and magical atmosphere of this "
    "unique musical performance."
)

# Load the 5B checkpoint in BF16.
pipe = CogVideoXPipeline.from_pretrained(
    "THUDM/CogVideoX-5b",
    torch_dtype=torch.bfloat16,
)

# Reduce peak VRAM: offload idle submodules to CPU and decode the VAE in tiles.
pipe.enable_model_cpu_offload()
pipe.vae.enable_tiling()

video = pipe(
    prompt=prompt,
    num_videos_per_prompt=1,
    num_inference_steps=50,
    num_frames=49,
    guidance_scale=6,
    generator=torch.Generator(device="cuda").manual_seed(42),  # fixed seed
).frames[0]

# 49 frames at 8 fps gives a clip of about 6 seconds.
export_to_video(video, "output.mp4", fps=8)
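The heading mentions both BF16 and FP16, while the script loads only the 5B checkpoint in BF16. Going by the hardware list, FP16 is the precision given for the 2B model. A minimal sketch of choosing the dtype per checkpoint follows. The repository id `THUDM/CogVideoX-2b` is assumed here by analogy with the 5B id used above, and the helper names `dtype_name_for` and `load_pipeline` are hypothetical.

```python
# Precision listed for each checkpoint in the hardware requirements.
# Assumption: the 2B repository id follows the same naming as the 5B one.
DTYPE_BY_MODEL = {
    "THUDM/CogVideoX-2b": "float16",
    "THUDM/CogVideoX-5b": "bfloat16",
}


def dtype_name_for(model_id):
    """Return the name of the torch dtype listed for this checkpoint."""
    try:
        return DTYPE_BY_MODEL[model_id]
    except KeyError:
        raise ValueError(f"no precision listed for {model_id!r}") from None


def load_pipeline(model_id):
    """Load a pipeline in the listed precision with the same memory savers."""
    # Imported lazily so the lookup above works without torch installed.
    import torch
    from diffusers import CogVideoXPipeline

    pipe = CogVideoXPipeline.from_pretrained(
        model_id,
        torch_dtype=getattr(torch, dtype_name_for(model_id)),
    )
    pipe.enable_model_cpu_offload()
    pipe.vae.enable_tiling()
    return pipe
```

Calling `load_pipeline("THUDM/CogVideoX-2b")` would then stand in for the `from_pretrained` call and the two memory-saving lines in the script, with the rest unchanged.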
