On November 28, 2024, Alibaba's Qwen team released a new model, QwQ-32B-Preview. QwQ stands for "Qwen with Questions"; it is an experimental research model focused on enhancing AI reasoning capabilities, and as a preview release it already shows promising analytical ability. In the author's own testing, the model can be deployed for inference on a machine with two GPUs of 32 GB VRAM each. Below is an introduction to and summary of the model.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Qwen/QwQ-32B-Preview"

# Load the model, sharding it across available GPUs with automatic dtype selection
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype="auto",
    device_map="auto"
)
tokenizer = AutoTokenizer.from_pretrained(model_name)

prompt = "How many r in strawberry."
messages = [
    {"role": "system", "content": "You are a helpful and harmless assistant. You are Qwen developed by Alibaba. You should think step-by-step."},
    {"role": "user", "content": prompt}
]
# Render the conversation with the model's chat template
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True
)
model_inputs = tokenizer([text], return_tensors="pt").to(model.device)

generated_ids = model.generate(
    **model_inputs,
    max_new_tokens=512
)
# Strip the prompt tokens, keeping only the newly generated portion
generated_ids = [
    output_ids[len(input_ids):]
    for input_ids, output_ids in zip(model_inputs.input_ids, generated_ids)
]
response = tokenizer.batch_decode(generated_ids, skip_special_tokens=True)[0]
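The test prompt above has a simple ground truth that plain Python can confirm, which is what makes it a handy sanity check for a reasoning model — the letter "r" appears three times in "strawberry":

```python
# Verify the expected answer to the test prompt: count "r" in "strawberry"
word = "strawberry"
r_count = word.count("r")
print(r_count)  # → 3
```

A model that answers "2" here (a common failure mode caused by subword tokenization hiding individual letters) is exactly the kind of case QwQ's step-by-step reasoning is meant to handle.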