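Kuaishou recently open-sourced Kolors (可图), a text-to-image generation model that understands both English and Chinese in depth and generates high-quality, photorealistic images. The technical report highlights several key contributions.

First, Kolors is built on the General Language Model (ChatGLM) rather than on the large language model T5 used by Imagen and Stable Diffusion 3, which strengthens its understanding of English and Chinese. The multimodal large language model CogVLM is also used to regenerate more detailed captions for the images in the training set.

Second, training is split into two phases, a concept-learning phase and a quality-improvement phase. Specific datasets are used to increase visual appeal, and image quality is improved by introducing high-quality data and optimizing high-resolution training techniques.

Finally, the Kolors team proposes KolorsPrompts, a category-balanced benchmark dataset, to guide the training and evaluation of Kolors.

Experiments show that Kolors performs well even with a U-Net backbone: it surpasses existing open-source models in human evaluation and reaches the level of Midjourney-v6. The Kolors code and weights are open source.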
Code: https://github.com/Kwai-Kolors/Kolors
Model: https://modelscope.cn/models/Kwai-Kolors/Kolors
Technical report: https://github.com/Kwai-Kolors/Kolors/blob/master/imgs/Kolors_paper.pdf
Download options:

SDK download:
    # Model download
    from modelscope import snapshot_download
    model_dir = snapshot_download('Kwai-Kolors/Kolors')

Git download:
    git clone https://www.modelscope.cn/Kwai-Kolors/Kolors.git

CLI download:
    modelscope download --model=Kwai-Kolors/Kolors --local_dir ./Kolors/
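Following the open-source project https://github.com/kijai/ComfyUI-KwaiKolorsWrapper, we set up a ComfyUI environment for Kolors on the ModelScope community's free GPU compute and tried the model out.
The environment used here is a ModelScope community Notebook running the Kolors model.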

Setting up ComfyUI
Install from the latest ComfyUI code (run as a notebook cell):

    #@title Environment Setup

    from pathlib import Path

    # Feature switches for the setup steps below
    OPTIONS = {}
    UPDATE_COMFY_UI = True  #@param {type:"boolean"}
    INSTALL_COMFYUI_MANAGER = True  #@param {type:"boolean"}
    INSTALL_KOLORS = True  #@param {type:"boolean"}
    INSTALL_CUSTOM_NODES_DEPENDENCIES = True  #@param {type:"boolean"}
    OPTIONS['UPDATE_COMFY_UI'] = UPDATE_COMFY_UI
    OPTIONS['INSTALL_COMFYUI_MANAGER'] = INSTALL_COMFYUI_MANAGER
    OPTIONS['INSTALL_KOLORS'] = INSTALL_KOLORS
    OPTIONS['INSTALL_CUSTOM_NODES_DEPENDENCIES'] = INSTALL_CUSTOM_NODES_DEPENDENCIES

    current_dir = !pwd
    WORKSPACE = f"{current_dir[0]}/ComfyUI"

    %cd /mnt/workspace/

    # Clone ComfyUI only if it is not already present
    ![ ! -d $WORKSPACE ] && echo -= Initial setup ComfyUI =- && git clone https://github.com/comfyanonymous/ComfyUI
    %cd $WORKSPACE

    if OPTIONS['UPDATE_COMFY_UI']:
      !echo "-= Updating ComfyUI =-"
      !git pull

    if OPTIONS['INSTALL_COMFYUI_MANAGER']:
      %cd custom_nodes
      ![ ! -d ComfyUI-Manager ] && echo -= Initial setup ComfyUI-Manager =- && git clone https://github.com/ltdrdata/ComfyUI-Manager
      %cd ComfyUI-Manager
      !git pull

    if OPTIONS['INSTALL_KOLORS']:
      %cd ../
      ![ ! -d ComfyUI-KwaiKolorsWrapper ] && echo -= Initial setup KOLORS =- && git clone https://github.com/kijai/ComfyUI-KwaiKolorsWrapper.git
      %cd ComfyUI-KwaiKolorsWrapper
      !git pull

    %cd $WORKSPACE

    if OPTIONS['INSTALL_CUSTOM_NODES_DEPENDENCIES']:
      !pwd
      !echo "-= Install custom nodes dependencies =-"
      ![ -f "custom_nodes/ComfyUI-Manager/scripts/colab-dependencies.py" ] && python "custom_nodes/ComfyUI-Manager/scripts/colab-dependencies.py"
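The setup cell uses a "clone if missing, then pull" pattern for each repository. Outside a notebook, the same decision could be sketched in plain Python as below. This is only an illustration: the helper name `git_sync_commands` is hypothetical, it assumes `git` is on the PATH, and it only builds the command lists without running them.

```python
from pathlib import Path
from typing import List


def git_sync_commands(repo_url: str, dest: Path, update: bool = True) -> List[List[str]]:
    """Build the git commands that bring `dest` in sync with `repo_url`.

    Mirrors the notebook logic: clone only when the directory is missing,
    then optionally pull. Nothing is executed here.
    """
    commands: List[List[str]] = []
    if not dest.is_dir():
        commands.append(["git", "clone", repo_url, str(dest)])
    if update:
        # `git -C <dir>` runs the command inside <dir> without changing cwd
        commands.append(["git", "-C", str(dest), "pull"])
    return commands
```

Each returned command can be passed to `subprocess.run(cmd, check=True)`; keeping planning and execution separate makes the logic easy to inspect before anything touches the network.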
Downloading the model weights

    #@markdown ###Download standard resources

    OPTIONS = {}

    #@markdown **unet**
    !wget -c "https://modelscope.cn/models/Kwai-Kolors/Kolors/resolve/master/unet/diffusion_pytorch_model.fp16.safetensors" -P ./models/diffusers/Kolors/unet/
    !wget -c "https://modelscope.cn/models/Kwai-Kolors/Kolors/resolve/master/unet/config.json" -P ./models/diffusers/Kolors/unet/

    #@markdown **encoder**
    !modelscope download --model=ZhipuAI/chatglm3-6b-base --local_dir ./models/diffusers/Kolors/text_encoder/

    #@markdown **vae**
    !wget -c "https://modelscope.cn/models/AI-ModelScope/sdxl-vae-fp16-fix/resolve/master/sdxl.vae.safetensors" -P ./models/vae/ #sdxl-vae-fp16-fix.safetensors

    #@markdown **scheduler**
    !wget -c "https://modelscope.cn/models/Kwai-Kolors/Kolors/resolve/master/scheduler/scheduler_config.json" -P ./models/diffusers/Kolors/scheduler/

    #@markdown **modelindex**
    !wget -c "https://modelscope.cn/models/Kwai-Kolors/Kolors/resolve/master/model_index.json" -P ./models/diffusers/Kolors/
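Because the weights arrive through several separate commands, it can help to confirm that everything landed where the download commands put it before starting ComfyUI. The following is a minimal sketch, not part of the original notebook: the function name `missing_resources` is hypothetical, the file list is taken from the download commands above, and the text encoder directory is only checked for being non-empty because its exact contents are not listed here.

```python
from pathlib import Path
from typing import List

# Files fetched by the wget commands, relative to the ComfyUI directory
EXPECTED_FILES = [
    "models/diffusers/Kolors/unet/diffusion_pytorch_model.fp16.safetensors",
    "models/diffusers/Kolors/unet/config.json",
    "models/diffusers/Kolors/scheduler/scheduler_config.json",
    "models/diffusers/Kolors/model_index.json",
    "models/vae/sdxl.vae.safetensors",
]
# Filled by `modelscope download`; only checked for being non-empty
TEXT_ENCODER_DIR = "models/diffusers/Kolors/text_encoder"


def missing_resources(comfy_root: Path) -> List[str]:
    """Return the expected resources that are absent under `comfy_root`."""
    missing = [rel for rel in EXPECTED_FILES if not (comfy_root / rel).is_file()]
    encoder = comfy_root / TEXT_ENCODER_DIR
    if not encoder.is_dir() or not any(encoder.iterdir()):
        missing.append(TEXT_ENCODER_DIR + "/")
    return missing
```

In the notebook this would be called as `missing_resources(Path("/mnt/workspace/ComfyUI"))`; an empty list means every expected path exists, though it says nothing about whether the files downloaded completely.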
Launching ComfyUI through cloudflared

    !wget "https://modelscope.oss-cn-beijing.aliyuncs.com/resource/cloudflared-linux-amd64.deb"
    !dpkg -i cloudflared-linux-amd64.deb

    %cd /mnt/workspace/ComfyUI
    import subprocess
    import threading
    import time
    import socket
    import urllib.request

    def iframe_thread(port):
      # Poll until ComfyUI accepts connections on the given port
      while True:
          time.sleep(0.5)
          sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
          result = sock.connect_ex(('127.0.0.1', port))
          if result == 0:
            break
          sock.close()
      print("\nComfyUI finished loading, trying to launch cloudflared (if it gets stuck here cloudflared is having issues)\n")

      # Open a tunnel to the local server and print the public URL from cloudflared's log
      p = subprocess.Popen(["cloudflared", "tunnel", "--url", "http://127.0.0.1:{}".format(port)], stdout=subprocess.PIPE, stderr=subprocess.PIPE)
      for line in p.stderr:
        l = line.decode()
        if "trycloudflare.com " in l:
          print("This is the URL to access ComfyUI:", l[l.find("http"):], end='')
        #print(l, end='')

    threading.Thread(target=iframe_thread, daemon=True, args=(8188,)).start()

    !python main.py --dont-print-server
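The launch cell does two things in its background thread: it polls the local port until ComfyUI answers, then scans cloudflared's log for the public URL. Those two steps can be isolated into small standard-library helpers, sketched below. This is a variant rather than the notebook's code: the names `wait_for_port` and `extract_tunnel_url` are hypothetical, and unlike the original loop the wait gives up after a timeout.

```python
import socket
import time
from typing import Optional


def wait_for_port(port: int, host: str = "127.0.0.1",
                  timeout: float = 60.0, interval: float = 0.5) -> bool:
    """Poll until a TCP server accepts connections on host:port, or time out."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            sock.settimeout(interval)
            # connect_ex returns 0 on success instead of raising
            if sock.connect_ex((host, port)) == 0:
                return True
        time.sleep(interval)
    return False


def extract_tunnel_url(log_line: str) -> Optional[str]:
    """Return the trycloudflare.com URL found in a log line, if any."""
    if "trycloudflare.com" not in log_line:
        return None
    start = log_line.find("http")
    if start == -1:
        return None
    # Keep only the URL itself, dropping trailing log decoration
    return log_line[start:].split()[0]
```

With these helpers the thread body reduces to calling `wait_for_port(8188)`, starting cloudflared, and printing the first non-`None` result of `extract_tunnel_url` over its stderr lines.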
Click Load on the right to load the workflow provided by the ComfyUI-KwaiKolorsWrapper project.
Text-to-image:

Image-to-image (a white car):
GPU memory usage:
Results

Simple prompts

Complex prompts
a bear standing on a car, sunset, winter | a boy and a girl, the boy stands at the left side, the boy wears a red t-shirt and blue pants, the girl wears a green t-shirt and pink pants. | the apple is in the box, the box is on the chair, the chair is in the room.

Multi-entity generation is strong: colors can be controlled separately for each entity, and the spatial relationships come out close to perfect.
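Multiple styles
中国水墨画,女孩,长长的头发,闪亮的大眼睛 (Chinese ink painting, a girl, long hair, big shining eyes) | oil painting, a girl, long hair, colorful hair, shining eyes | anime, a girl, long hair, colorful hair, shining eyes

The range of styles is impressive.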
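Text rendering
一只小狗的照片,微距,变焦,高质量,电影,小狗拿着一个牌子,写着“可图魔搭” (a photo of a puppy, macro, zoom, high quality, cinematic, the puppy holds a sign that reads "可图魔搭") | 一张瓢虫的照片,微距,变焦,高质量,电影,拿着一个牌子,写着“可图” (a photo of a ladybug, macro, zoom, high quality, cinematic, holding a sign that reads "可图") | 天空,上面写着"可图"的文字 (the sky, with the text "可图" written on it)

The model can handle simple text.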
Diversity
Output diversity is fairly good.

Performance
At 1024 resolution on an A10, generating one image (25 steps) takes 7 seconds.
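To put that number in more familiar units, the arithmetic below converts it into per-step time and hourly throughput. It is a rough sketch under one assumption: the 7 seconds is treated as the full wall-clock time per image, so text encoding and VAE decoding are folded into the per-step figure. The function name `throughput` is hypothetical.

```python
def throughput(seconds_per_image: float, steps: int) -> dict:
    """Derive simple throughput figures from one measured generation time."""
    return {
        "seconds_per_step": seconds_per_image / steps,
        "steps_per_second": steps / seconds_per_image,
        "images_per_hour": 3600 / seconds_per_image,
    }


# Figures reported above: 25 steps, 7 seconds per 1024-resolution image
stats = throughput(seconds_per_image=7.0, steps=25)
for name, value in stats.items():
    print(f"{name}: {value:.2f}")
```

This works out to about 0.28 seconds per step, roughly 3.6 steps per second, and a little over 500 images per hour on a single A10 at this setting.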