ChatTTS WebUI Is Here: Lifelike Speech Made Easy - 链载Ai

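ChatTTS has been the hottest open-source project of the past few days, collecting 19.7k stars in a very short time. It is a text-to-speech model designed specifically for dialogue scenarios, such as LLM assistant conversations. It supports two languages, English and Chinese. The largest model was trained on more than 100,000 hours of Chinese and English data and produces natural, fluent speech.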

1 ChatTTS

Project page: https://github.com/2noise/ChatTTS/. The official project ships only a Python package, ChatTTS, which is called from Python:

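import torch
import torchaudio
import ChatTTS
from IPython.display import Audio

chat = ChatTTS.Chat()
chat.load_models(compile=False)  # Set to True for better performance

texts = ["PUT YOUR TEXT HERE",]

# infer() returns one waveform per input text
wavs = chat.infer(texts,)

# Save the first waveform as a 24 kHz WAV file
torchaudio.save("output1.wav", torch.from_numpy(wavs[0]), 24000)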

Advanced usage:

###################################
# Sample a speaker from Gaussian.

rand_spk = chat.sample_random_speaker()

params_infer_code = {
    'spk_emb': rand_spk,  # add sampled speaker
    'temperature': .3,    # using custom temperature
    'top_P': 0.7,         # top P decode
    'top_K': 20,          # top K decode
}

###################################
# For sentence level manual control.

# use oral_(0-9), laugh_(0-2), break_(0-7)
# to generate special token in text to synthesize.
params_refine_text = {
    'prompt': '[oral_2][laugh_0][break_6]'
}

wav = chat.infer(texts, params_refine_text=params_refine_text, params_infer_code=params_infer_code)

###################################
# For word level manual control.
text = 'What is [uv_break]your favorite english food?[laugh][lbreak]'
wav = chat.infer(text, skip_refine_text=True, params_refine_text=params_refine_text, params_infer_code=params_infer_code)
# Save the result of the call above (the original snippet saved the stale `wavs` variable)
torchaudio.save("output2.wav", torch.from_numpy(wav[0]), 24000)

Running this code produces fairly natural speech, in both male and female voices.
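The sentence-level control prompt above is just a string of three bracketed tokens, so it can be assembled and range-checked before it is passed to chat.infer. The following is a minimal sketch: build_refine_prompt is a hypothetical helper, not part of ChatTTS, and the ranges are taken from the comment in the snippet above (oral 0-9, laugh 0-2, break 0-7).

```python
def build_refine_prompt(oral: int, laugh: int, break_: int) -> str:
    """Return a prompt such as '[oral_2][laugh_0][break_6]'.

    Hypothetical helper; the allowed ranges come from the README comment
    quoted in the article, not from inspecting the library.
    """
    limits = {"oral": (oral, 9), "laugh": (laugh, 2), "break": (break_, 7)}
    parts = []
    for name, (level, upper) in limits.items():
        if not 0 <= level <= upper:
            raise ValueError(f"{name} must be between 0 and {upper}, got {level}")
        parts.append(f"[{name}_{level}]")
    return "".join(parts)


# Same value as the hand-written prompt in the advanced example
params_refine_text = {"prompt": build_refine_prompt(oral=2, laugh=0, break_=6)}
print(params_refine_text)  # {'prompt': '[oral_2][laugh_0][break_6]'}
```

The resulting dictionary can be passed as params_refine_text exactly as in the advanced example.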

2 ChatTTS Webui

Project page: https://github.com/Gouryella/ChatTTS-webui. ChatTTS Webui is a browser application built on ChatTTS. It imports the original package as an API and wraps it in a FastAPI back end, server.py:

import asyncio
import os
from io import BytesIO

import numpy as np
import soundfile as sf
import torch
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.responses import StreamingResponse
from pydantic import BaseModel

import ChatTTS

chat = ChatTTS.Chat()
app = FastAPI()

# Allow the front end (served from a different port) to call the API
app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

class Text2Speech(BaseModel):
    text: str
    voice_adj: int
    temperature: float
    top_p: float
    top_k: int

model_path = os.path.join(os.path.dirname(__file__), 'models')
model_files = [
    os.path.join(model_path, 'asset/Decoder.pt'),
    os.path.join(model_path, 'asset/DVAE.pt'),
    os.path.join(model_path, 'asset/GPT.pt'),
    os.path.join(model_path, 'asset/spk_stat.pt'),
    os.path.join(model_path, 'asset/tokenizer.pt'),
    os.path.join(model_path, 'asset/Vocos.pt'),
    os.path.join(model_path, 'config/decoder.yaml'),
    os.path.join(model_path, 'config/dvae.yaml'),
    os.path.join(model_path, 'config/gpt.yaml'),
    os.path.join(model_path, 'config/path.yaml'),
    os.path.join(model_path, 'config/vocos.yaml')
]

# Refuse to start unless every model file has been downloaded
all_files_exist = all(os.path.exists(file_path) for file_path in model_files)
assert all_files_exist, "Model files do not exist, please download the models."
print('Load models from local path.')
chat.load_models(source='local', local_path=model_path)

@app.post("/generate")
async def generate_text(request: Text2Speech):
    text = request.text
    # voice_adj seeds the RNG, so the same value samples the same speaker
    torch.manual_seed(request.voice_adj)
    params_infer_code = {
        'spk_emb': chat.sample_random_speaker(),
        'temperature': request.temperature,
        'top_P': request.top_p,
        'top_K': request.top_k,
    }

    # Run the blocking inference in a worker thread to keep the event loop free
    wavs = await asyncio.to_thread(chat.infer, text, use_decoder=True, params_infer_code=params_infer_code)
    audio_data = np.array(wavs[0])
    if audio_data.ndim == 1:
        audio_data = np.expand_dims(audio_data, axis=0)

    # Encode as 24 kHz WAV in memory and stream it back
    audio_buffer = BytesIO()
    sf.write(audio_buffer, audio_data.T, 24000, format='WAV')
    audio_buffer.seek(0)

    return StreamingResponse(audio_buffer, media_type='audio/wav')

if __name__ == "__main__":
    import uvicorn
    uvicorn.run(app, host="0.0.0.0", port=8000)
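Because the back end is a plain HTTP endpoint, it can also be called without the browser front end. The sketch below uses only the Python standard library and assumes the server is running locally on port 8000, as in the uvicorn.run call above. The function names are hypothetical. The temperature, top_p and top_k defaults mirror the values in the advanced example earlier, while the voice_adj default of 2 is an arbitrary placeholder seed.

```python
import json
import urllib.request

# Assumption: server.py is running on this machine with its default port
API_URL = "http://127.0.0.1:8000/generate"


def build_generate_request(text, voice_adj=2, temperature=0.3, top_p=0.7,
                           top_k=20, url=API_URL):
    """Build a POST request whose JSON body matches the Text2Speech model."""
    payload = {
        "text": text,
        "voice_adj": voice_adj,      # used by the server as the speaker seed
        "temperature": temperature,
        "top_p": top_p,
        "top_k": top_k,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def synthesize(text, out_path="output.wav", **options):
    """Send the request and write the returned WAV bytes to out_path."""
    request = build_generate_request(text, **options)
    # Generation can be slow, so allow a generous timeout
    with urllib.request.urlopen(request, timeout=300) as response:
        audio = response.read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With the server running, synthesize("Hello from ChatTTS.", voice_adj=5) should write output.wav to the current directory. Repeating the call with the same voice_adj is expected to sample the same speaker, since the server seeds torch with that value before sampling.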

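The front end is a clean interface written in Nuxt, served on port 3001.

The interface also generates a QR code, so a phone on the same network can scan it and open the page.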

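3 Usage

Type the text to convert into the text box; laughter and pauses can be added within a paragraph. The options include a choice between a male and a female voice, which corresponds to the voice adjustment value (the voice_adj field that server.py uses as the random seed for the speaker), and this value can also be set by hand. The other three options can stay at their defaults. Click generate to produce the audio, which takes about one minute. The result can be played in the browser and saved as an audio file.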

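4 Installation

Installation is straightforward: clone the project, install the front-end dependencies, create a Python virtual environment, install the Python packages, and clone the models. The prerequisites are git, node, and anaconda, which must already be installed.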
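Before starting, it can help to confirm that the prerequisite tools are actually on PATH. This is a small sketch using the standard library. The helper name is hypothetical, and npm is included on the assumption that it was installed together with Node, since the later steps call it.

```python
import shutil

# git, node and conda come from the prerequisites; npm is assumed to ship with node
REQUIRED_TOOLS = ("git", "node", "npm", "conda")


def missing_tools(tools=REQUIRED_TOOLS, which=shutil.which):
    """Return the tools that cannot be found on PATH."""
    return [tool for tool in tools if which(tool) is None]


missing = missing_tools()
if missing:
    print("Missing from PATH: " + ", ".join(missing))
else:
    print("All prerequisites found.")
```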

1 Clone the project and install dependencies

git clone https://github.com/Gouryella/ChatTTS-webui.git
cd ChatTTS-webui
npm install

2 Create the virtual environment

conda create -n chattts python=3.10
conda activate chattts
conda install pytorch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2 pytorch-cuda=12.1 -c pytorch -c nvidia

# If you are using MacOS or do not support CUDA, use
pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2

pip install -r requirements.txt

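On Windows, run the third command (conda install) if you have an NVIDIA GPU; if not, run the fourth command (pip install torch) instead. In both cases, finish with pip install -r requirements.txt.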
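The choice between the two PyTorch install commands can be expressed as a tiny helper. This is only a sketch: the function name is hypothetical, and looking for nvidia-smi on PATH is a rough heuristic for "an NVIDIA driver is installed", not a guarantee that CUDA 12.1 will work on the machine.

```python
import shutil

# Both command strings are copied from the installation steps above
CONDA_CUDA = ("conda install pytorch==2.2.2 torchvision==0.17.2 "
              "torchaudio==2.2.2 pytorch-cuda=12.1 -c pytorch -c nvidia")
PIP_CPU = "pip install torch==2.2.2 torchvision==0.17.2 torchaudio==2.2.2"


def torch_install_command(has_nvidia_gpu: bool) -> str:
    """Pick the install command for machines with or without an NVIDIA GPU."""
    return CONDA_CUDA if has_nvidia_gpu else PIP_CPU


# Heuristic (assumption): nvidia-smi is present when the NVIDIA driver is installed
has_gpu = shutil.which("nvidia-smi") is not None
print(torch_install_command(has_gpu))
```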

3 Clone the models

cd api
git clone https://huggingface.co/2Noise/ChatTTS.git models
cd ..

The models are large, so the download takes a while and needs a good network connection.
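Since server.py refuses to start when any model file is missing, an interrupted clone is easier to diagnose with a check that names the missing files. The sketch below reuses the file list from server.py. The helper name is hypothetical, and the api/models path assumes the script is run from the project root after the clone step above.

```python
from pathlib import Path

# Relative paths taken from the model_files list in server.py
EXPECTED_FILES = [
    "asset/Decoder.pt",
    "asset/DVAE.pt",
    "asset/GPT.pt",
    "asset/spk_stat.pt",
    "asset/tokenizer.pt",
    "asset/Vocos.pt",
    "config/decoder.yaml",
    "config/dvae.yaml",
    "config/gpt.yaml",
    "config/path.yaml",
    "config/vocos.yaml",
]


def missing_model_files(model_dir):
    """Return the expected files that are absent under model_dir."""
    root = Path(model_dir)
    return [rel for rel in EXPECTED_FILES if not (root / rel).is_file()]


# Assumption: run from the project root, with the models cloned into api/models
missing = missing_model_files("api/models")
if missing:
    print("Missing model files:")
    for rel in missing:
        print("  " + rel)
else:
    print("All model files are present.")
```

Note that this only checks that the files exist. It does not verify that large files were downloaded completely.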

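4 Start

npm run dev
python api/server.py

Run both commands from the project directory, with the Python command inside the chattts virtual environment. They need two separate terminals, one for each command. Once both are running, open http://127.0.0.1:3001/ in a browser.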
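If the page does not load, a quick way to see which of the two processes is not up is to probe their ports. This is a minimal sketch with the standard library: port 3001 is the front end and port 8000 the API, as described above, and the helper names are hypothetical.

```python
import socket
import time


def port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def wait_for_port(host, port, deadline_s=60.0, interval_s=1.0):
    """Poll until the port accepts connections or the deadline passes."""
    end = time.monotonic() + deadline_s
    while time.monotonic() < end:
        if port_open(host, port):
            return True
        time.sleep(interval_s)
    return False


# One-off status check for the two services started above
for name, port in (("front end", 3001), ("API", 8000)):
    state = "up" if port_open("127.0.0.1", port) else "not reachable"
    print(f"{name} on port {port}: {state}")
```

wait_for_port is useful in a launcher script, for example to hold off opening the browser until the API has finished loading the models.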