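When we previously streamed responses from the OpenAI API, we detected the end of the stream by looking for a [DONE] marker. As we started working with more LLMs, we ran into backends that never send [DONE]. After some reading, it turned out that relying on [DONE] alone is not a robust way to detect the end of an OpenAI-style message stream: [DONE] is a sentinel on the transport (SSE) layer, which OpenAI itself sends but some OpenAI-compatible backends omit. The signal carried in the payload is finish_reason, which is null on intermediate chunks and non-null on the chunk that ends a choice. Here is a code example: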
from openai import OpenAI

client = OpenAI(
    api_key="xxxx",
    base_url="xxxx"
)

response = client.chat.completions.create(
    model="minimaxai/minimax-m3",
    messages=[
        {
            "role": "user",
            "content": "What is the difference between USB-C and USB-A?",
        }
    ],
    stream=True,
    # max_tokens=1024,
    temperature=0.7
)

print("Answer: ", end="")
for chunk in response:
    # Some chunks (for example a trailing usage chunk) may carry no choices
    if not chunk.choices:
        continue
    choice = chunk.choices[0]
    # Normal content output
    delta_content = choice.delta.content
    if delta_content:
        print(delta_content, end="", flush=True)
    # End-of-stream check: a non-empty finish_reason means this reply is complete
    if choice.finish_reason:
        print("\n[DONE] Streaming output finished")
print()

We can use the following methods to print the raw chunk structure and the raw stream:
1. Using the openai library
from openai import OpenAI

client = OpenAI(
    api_key="xxxx",
    base_url="xxxx"
)

response = client.chat.completions.create(
    model="minimaxai/minimax-m3",
    messages=[
        {"role": "user", "content": "What is the difference between USB-C and USB-A?"}
    ],
    stream=True,
    temperature=0.7
)

print("===== Raw streaming chunk objects =====")
for idx, chunk in enumerate(response):
    # Print the full chunk structure
    print(f"\nChunk {idx}: {chunk}")
    # Print the serialized dict, which makes it easier to check the fields
    print(f"Dict: {chunk.model_dump()}")

2. Using the requests library
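import requests

API_KEY = "xxxx"
# Note: requests needs the full endpoint URL (for OpenAI-compatible services,
# typically the base_url used above plus /chat/completions)
BASE_URL = "xxxx"

headers = {
    "Authorization": f"Bearer {API_KEY}",
    "Content-Type": "application/json"
}
payload = {
    "model": "minimaxai/minimax-m3",
    "messages": [{"role": "user", "content": "What is the difference between USB-C and USB-A?"}],
    "stream": True,
    "temperature": 0.7
}

print("===== Raw streaming byte output =====")
resp = requests.post(BASE_URL, json=payload, headers=headers, stream=True)
# Print the raw bytes line by line to inspect the b'xxx' content directly
for raw_line in resp.iter_lines():
    print(raw_line)  # prints the raw bytes object as-is, e.g. b'xxxx'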
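
Once the raw lines are visible, the same end-of-stream check can be applied to them directly. The following is only a minimal sketch, assuming the backend sends OpenAI-style SSE lines of the form `data: {...}`. The function name `collect_stream` and the sample lines are made up for illustration and are not part of any library. It treats [DONE] as optional and records finish_reason from the payload.

```python
import json
from typing import Iterable, Optional, Tuple


def collect_stream(raw_lines: Iterable[bytes]) -> Tuple[str, Optional[str], bool]:
    """Assemble the answer text from raw SSE lines (hypothetical helper).

    Returns (text, finish_reason, saw_done_sentinel).
    Content from all choices is merged, which is fine for the usual n=1 case.
    """
    parts = []
    finish_reason = None
    saw_done = False
    for raw in raw_lines:
        line = raw.decode("utf-8").strip()
        # Skip blank separator lines and anything that is not a data field.
        if not line.startswith("data:"):
            continue
        data = line[len("data:"):].strip()
        if data == "[DONE]":
            # Optional sentinel: some backends never send it.
            saw_done = True
            break
        chunk = json.loads(data)
        for choice in chunk.get("choices") or []:
            delta = choice.get("delta") or {}
            if delta.get("content"):
                parts.append(delta["content"])
            # A non-null finish_reason marks the chunk that ends this choice.
            if choice.get("finish_reason"):
                finish_reason = choice["finish_reason"]
    return "".join(parts), finish_reason, saw_done


# Made-up sample lines shaped like an OpenAI-style stream without [DONE].
sample = [
    b'data: {"choices": [{"index": 0, "delta": {"role": "assistant", "content": ""}, "finish_reason": null}]}',
    b"",
    b'data: {"choices": [{"index": 0, "delta": {"content": "Hello"}, "finish_reason": null}]}',
    b"",
    b'data: {"choices": [{"index": 0, "delta": {}, "finish_reason": "stop"}]}',
]
text, reason, saw_done = collect_stream(sample)
print(text, reason, saw_done)
```

With a real response, the lines from `resp.iter_lines()` could be passed in place of `sample`. Checking both signals means the loop ends cleanly whether or not the backend emits [DONE].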