API Reference

Request / response schemas and curl examples for the 4 endpoints

GPUShare exposes 4 client protocol endpoints. Every endpoint accepts the same sk-gpushare-* key.

Endpoint                                     Protocol            Primary use
POST /v1/chat/completions                    OpenAI Chat         Most widely applicable; cross-vendor
POST /v1/messages                            Anthropic Messages  Direct connection from the Anthropic SDK
POST /v1beta/models/{model}:generateContent  Gemini Native       Direct connection from the Google genai SDK
POST /v1/responses                           OpenAI Responses    GPT-5.x built-in tools (web_search / image_generation)

Base URL: https://api.dflop.top

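Authentication

See Authentication for details. Use any one of the three methods; the gateway falls back through them in this priority order:

  1. x-api-key: sk-gpushare-xxx header (recommended)
  2. ?key=sk-gpushare-xxx query (Gemini SDK default)
  3. Authorization: Bearer sk-gpushare-xxx header (OpenAI / Anthropic SDK default)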
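
The fallback order can be pictured as a small lookup. The sketch below only models the priority documented above; it is not the gateway's code, and resolve_api_key and its arguments are hypothetical names. Case-insensitive header matching is an assumption based on general HTTP behavior.

```python
from typing import Mapping, Optional


def resolve_api_key(headers: Mapping[str, str], query: Mapping[str, str]) -> Optional[str]:
    """Return the key selected by the documented priority, or None if absent."""
    # HTTP header names are case-insensitive, so normalize them first.
    lowered = {name.lower(): value for name, value in headers.items()}

    # 1. x-api-key header (recommended)
    if lowered.get("x-api-key"):
        return lowered["x-api-key"]
    # 2. ?key= query parameter (Gemini SDK default)
    if query.get("key"):
        return query["key"]
    # 3. Authorization: Bearer header
    auth = lowered.get("authorization", "")
    if auth.lower().startswith("bearer "):
        return auth[len("bearer "):].strip() or None
    return None
```

If a request carries more than one credential, this model picks the x-api-key header first, then the query parameter, then the Bearer token.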

POST /v1/chat/completions

OpenAI Chat Completions-compatible endpoint. The most general one; it supports all 33 models.

Request

{
  "model": "claude-sonnet-4-5-20250929",
  "messages": [
    {"role": "system", "content": "You are helpful."},
    {"role": "user", "content": "Hello"}
  ],
  "stream": false,
  "max_tokens": 1024,
  "temperature": 0.7,
  "tools": [
    {"type": "function", "function": {...}}
  ]
}
Field            Required  Type           Description
model                      string         Model ID; see Model list
messages                   array          Conversation history; role ∈ {system, user, assistant, tool}
stream                     bool           true enables SSE streaming
max_tokens                 int            Upper limit on generated tokens
temperature                float          0-2
tools                      array          Function tools, or {type:"web_search"} / {type:"image_generation"}
tool_choice                string|object  auto / none / {type:"function","function":{...}}
stream_options             object         When streaming, {"include_usage": true} makes the trailing chunk carry token usage
response_format            object         {"type":"json_object"} forces JSON output

Response (non-streaming)

{
  "id": "chatcmpl-...",
  "object": "chat.completion",
  "created": 1715845200,
  "model": "claude-sonnet-4-5-20250929",
  "choices": [{
    "index": 0,
    "message": {"role": "assistant", "content": "Hello!"},
    "finish_reason": "stop"
  }],
  "usage": {
    "prompt_tokens": 10,
    "completion_tokens": 5,
    "total_tokens": 15
  }
}

Response (streaming)

With stream: true the endpoint returns an SSE stream in which each data: line is one chunk. See Streaming responses.
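
A minimal sketch of consuming such a stream, assuming the chunks follow the OpenAI chat.completion.chunk layout (text under choices[].delta.content) and that the stream ends with a data: [DONE] sentinel. Both are upstream OpenAI conventions rather than something this page confirms for the gateway; iter_chat_deltas is a hypothetical helper name.

```python
import json
from typing import Iterable, Iterator


def iter_chat_deltas(lines: Iterable[str]) -> Iterator[str]:
    """Yield text fragments from the data: lines of a chat completions SSE stream."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and any non-data lines
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel (OpenAI convention)
            break
        chunk = json.loads(payload)
        # A trailing usage chunk (stream_options.include_usage) may carry no choices.
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


sample = [
    'data: {"choices":[{"index":0,"delta":{"content":"Hel"}}]}',
    "",
    'data: {"choices":[{"index":0,"delta":{"content":"lo!"}}]}',
    "",
    'data: {"choices":[],"usage":{"total_tokens":15}}',
    "",
    "data: [DONE]",
]
reply = "".join(iter_chat_deltas(sample))
```

In real use the lines would come from the HTTP response body, decoded line by line, instead of the in-memory sample.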

curl

curl https://api.dflop.top/v1/chat/completions \
  -H "Authorization: Bearer $GPUSHARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.4",
    "messages": [{"role": "user", "content": "Hello"}]
  }'
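
The same call can be sketched in Python with only the standard library. build_chat_request and first_reply are hypothetical helper names; the URL, headers, and field paths mirror the curl example and the non-streaming response shown above.

```python
import json
import urllib.request

BASE_URL = "https://api.dflop.top"


def build_chat_request(api_key: str, model: str, messages: list) -> urllib.request.Request:
    """Build (but do not send) a POST to /v1/chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def first_reply(response: dict) -> str:
    """Extract the assistant text from a non-streaming response body."""
    return response["choices"][0]["message"]["content"]
```

To send it, pass the request to urllib.request.urlopen and hand the parsed JSON body to first_reply.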

POST /v1/messages

Anthropic Messages-compatible endpoint.

Request

{
  "model": "claude-sonnet-4-5-20250929",
  "max_tokens": 1024,
  "system": "You are helpful.",
  "messages": [
    {"role": "user", "content": "Hello"}
  ],
  "stream": false,
  "tools": [...]
}
Field            Required  Type           Description
model                      string         Model ID
max_tokens                 int            Required by the Anthropic protocol (unlike OpenAI)
messages                   array          Conversation; role ∈ {user, assistant}
system                     string         System prompt (a top-level field, not placed in messages)
stream                     bool
tools                      array          Anthropic tool format (name / description / input_schema)
tool_choice                object         {"type":"auto"|"any"|"tool", "name": "..."}

Response (non-streaming)

{
  "id": "msg_...",
  "type": "message",
  "role": "assistant",
  "model": "claude-sonnet-4-5-20250929",
  "content": [
    {"type": "text", "text": "Hello!"}
  ],
  "stop_reason": "end_turn",
  "usage": {"input_tokens": 10, "output_tokens": 5}
}

curl

curl https://api.dflop.top/v1/messages \
  -H "x-api-key: $GPUSHARE_API_KEY" \
  -H "anthropic-version: 2023-06-01" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claude-sonnet-4-5-20250929",
    "max_tokens": 1024,
    "messages": [{"role": "user", "content": "Hello"}]
  }'
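
A hedged Python counterpart to the curl call, using only the standard library. build_messages_request and message_text are hypothetical helper names; the headers and the content[] layout are taken from the examples above.

```python
import json
import urllib.request

BASE_URL = "https://api.dflop.top"


def build_messages_request(api_key, model, messages, max_tokens=1024, system=None):
    """Build (but do not send) a POST to /v1/messages."""
    payload = {"model": model, "max_tokens": max_tokens, "messages": messages}
    if system is not None:
        payload["system"] = system  # top-level field, not an entry in messages
    return urllib.request.Request(
        f"{BASE_URL}/v1/messages",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "x-api-key": api_key,
            "anthropic-version": "2023-06-01",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def message_text(response: dict) -> str:
    """Concatenate the text blocks of a non-streaming Messages response."""
    return "".join(
        block["text"]
        for block in response.get("content", [])
        if block.get("type") == "text"
    )
```

Unlike the chat completions sketch, max_tokens is always sent here because the Anthropic protocol requires it.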

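Limitations

  • Gemini models (gemini-*) are not yet supported on this endpoint; see Compatibility matrix
  • The SDK injects the anthropic-version header automatically; when calling with curl directly, set it to 2023-06-01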

Gemini Native

Two related endpoints:

  • POST /v1beta/models/{model}:generateContent (non-streaming)
  • POST /v1beta/models/{model}:streamGenerateContent (streaming)

Write the model ID directly in the URL in place of the {model} placeholder, e.g. /v1beta/models/gemini-2.5-pro:generateContent

Request

{
  "contents": [
    {"role": "user", "parts": [{"text": "Hello"}]}
  ],
  "systemInstruction": {
    "parts": [{"text": "You are helpful."}]
  },
  "generationConfig": {
    "maxOutputTokens": 1024,
    "temperature": 0.7
  },
  "tools": [...]
}

Response

{
  "candidates": [{
    "content": {
      "role": "model",
      "parts": [{"text": "Hello!"}]
    },
    "finishReason": "STOP",
    "index": 0
  }],
  "usageMetadata": {
    "promptTokenCount": 10,
    "candidatesTokenCount": 5,
    "totalTokenCount": 15
  }
}

curl

curl "https://api.dflop.top/v1beta/models/gemini-2.5-pro:generateContent?key=$GPUSHARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "contents": [{"parts": [{"text": "Hello"}]}]
  }'
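
A small Python sketch of the same request, showing how the model ID and the ?key= query parameter end up in the URL. build_gemini_request and candidate_text are hypothetical helper names; the stream flag simply switches between the two actions listed above.

```python
import json
import urllib.parse
import urllib.request

BASE_URL = "https://api.dflop.top"


def build_gemini_request(api_key, model, prompt, stream=False):
    """Build (but do not send) a Gemini Native request with ?key= auth."""
    action = "streamGenerateContent" if stream else "generateContent"
    url = (
        f"{BASE_URL}/v1beta/models/{urllib.parse.quote(model, safe='')}:{action}"
        f"?{urllib.parse.urlencode({'key': api_key})}"
    )
    payload = {"contents": [{"role": "user", "parts": [{"text": prompt}]}]}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def candidate_text(response: dict) -> str:
    """Join the text parts of the first candidate."""
    parts = response["candidates"][0]["content"]["parts"]
    return "".join(part.get("text", "") for part in parts)
```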

Limitations

  • GPT-5.x is not yet supported on this endpoint

POST /v1/responses

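OpenAI Responses API-compatible endpoint, the native protocol of the GPT-5.x series. Compared with /v1/chat/completions it adds two native tools: web_search (web retrieval) and image_generation (image output via GPT-image-2). The response carries full structured fields for reasoning, output_text, and tool calls.

Supported models

GPT-5 family only (other models go through their own native endpoints; calling with a non-GPT-5 model returns model_not_allowed):

Model    Notes
gpt-5.4
gpt-5.5  Recommended: supports reasoning summary + built-in tools

Request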

{
  "model": "gpt-5.5",
  "input": "Say hello in 5 words.",
  "instructions": "You are concise.",
  "stream": false,
  "max_output_tokens": 1024,
  "temperature": 1.0,
  "top_p": 0.98,
  "reasoning": {"effort": "medium"},
  "tools": [
    {"type": "web_search"},
    {"type": "image_generation", "size": "1024x1024"},
    {"type": "function", "name": "get_weather", "description": "...", "parameters": {...}}
  ],
  "tool_choice": "auto",
  "parallel_tool_calls": true
}
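Field                Required  Type             Description
model                          string           gpt-5.4, gpt-5.5
input                          string | array   A string is used directly as the user prompt; or pass a message array (see below)
instructions                   string           System prompt. A top-level field, not messages[0] (unlike /v1/chat/completions)
stream                         bool             true returns an SSE event stream
max_output_tokens              int              Generation cap (the Responses API name contains _output_; it is not max_tokens)
reasoning                      object           {"effort": "low"|"medium"|"high"} controls the intensity of internal reasoning
temperature                    float            Standard sampling parameter
top_p                          float            Standard sampling parameter
frequency_penalty              float            Standard sampling parameter
presence_penalty               float            Standard sampling parameter
tools                          array            Tool types
tool_choice                    string | object  auto / none / {type:"function","name":"..."}
parallel_tool_calls            bool             Defaults to true

input as an array (multi-turn)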

"input": [
  {"role": "user", "content": "What is 2+2?"},
  {"role": "assistant", "content": "4"},
  {"role": "user", "content": "What was my first question?"}
]

Text-only content can use the string shorthand; vision requires content parts:

{"role": "user", "content": [
  {"type": "input_text", "text": "Describe this image."},
  {"type": "input_image", "image_url": "https://...", "detail": "auto"}
]}

Note: input_image.image_url must be a publicly reachable URL (the upstream fetches it itself). CDN thumbnail URLs and auth-gated URLs may return upstream_error.
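
The two content forms can be produced by one helper. This is a sketch only: user_turn is a hypothetical name, and the scheme check is a rough client-side guard that cannot prove the URL is actually reachable by the upstream.

```python
def user_turn(text, image_url=None, detail="auto"):
    """Return one user item for the input array, with or without an image."""
    if image_url is None:
        # Text-only turns can use the string shorthand.
        return {"role": "user", "content": text}
    if not image_url.startswith(("http://", "https://")):
        raise ValueError("image_url should be a publicly reachable http(s) URL")
    # Vision turns must use content parts.
    return {
        "role": "user",
        "content": [
            {"type": "input_text", "text": text},
            {"type": "input_image", "image_url": image_url, "detail": detail},
        ],
    }


history = [
    user_turn("What is 2+2?"),
    {"role": "assistant", "content": "4"},
    user_turn("Describe this image.", "https://example.com/cat.png"),
]
```

The example.com URL is a placeholder; the resulting history list is what would be sent as the input field.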

Tools (tools)

// 1. Built-in web search: the model decides whether to call it and directly returns an output_text that contains the answer
{"type": "web_search"}

// 2. Built-in image generation: output contains an image_generation_call item (with a base64 result)
{"type": "image_generation", "size": "1024x1024"}

// 3. User-defined function: output contains a function_call item; you execute it and send back a function_call_output
{
  "type": "function",
  "name": "get_weather",
  "description": "Get current weather",
  "parameters": {"type": "object", "properties": {"city": {"type": "string"}}, "required": ["city"]}
}
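
The function round trip could look roughly like this. It is a sketch under assumptions: the function_call_output item shape (type, call_id, output as a string) follows the upstream OpenAI Responses convention and is not spelled out on this page, and get_weather, REGISTRY, and answer_function_calls are hypothetical names.

```python
import json


def get_weather(city: str) -> dict:
    # Hypothetical local implementation; a real one would query a weather service.
    return {"city": city, "forecast": "unknown"}


REGISTRY = {"get_weather": get_weather}


def answer_function_calls(output: list) -> list:
    """Build one function_call_output item per function_call found in output[]."""
    results = []
    for item in output:
        if item.get("type") != "function_call":
            continue
        args = json.loads(item["arguments"])  # arguments arrives as a JSON string
        result = REGISTRY[item["name"]](**args)
        results.append({
            "type": "function_call_output",
            "call_id": item["call_id"],  # ties the result back to the call
            "output": json.dumps(result),
        })
    return results
```

Because this endpoint has no previous_response_id support, a follow-up request would presumably carry the earlier input, the model's function_call items, and these outputs together in the input array.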

Response (non-streaming)

{
  "id": "resp_0a430185e6bd1abb016a1576c7bbb08198be6868b655d19349",
  "object": "response",
  "created_at": 1779791559,
  "status": "completed",
  "model": "gpt-5.5",
  "instructions": "...",
  "output": [
    {
      "type": "message",
      "role": "assistant",
      "content": [{"type": "output_text", "text": "Hello, hope you are well."}]
    }
  ],
  "reasoning": {"context": "current_turn", "effort": "medium", "summary": null},
  "usage": {
    "input_tokens": 25,
    "input_tokens_details": {"cached_tokens": 0},
    "output_tokens": 51,
    "output_tokens_details": {"reasoning_tokens": 38},
    "total_tokens": 76
  }
}

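The output[] array is ordered by occurrence; each item's type is one of:

Type                   Meaning
message                Assistant text reply; the text is in content[].text
reasoning              Internal reasoning summary (may be empty)
function_call          The model decided to call your function; fields call_id / name / arguments (a JSON string)
image_generation_call  Result of the built-in image_generation tool; result is a base64 PNG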
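
A short sketch of walking output[] by type, pulling out the reply text and decoding any generated images. output_text and generated_images are hypothetical helper names; the field paths are the ones listed in the table above.

```python
import base64


def output_text(response: dict) -> str:
    """Join the output_text parts of every message item, in order."""
    chunks = []
    for item in response.get("output", []):
        if item.get("type") != "message":
            continue
        for part in item.get("content", []):
            if part.get("type") == "output_text":
                chunks.append(part["text"])
    return "".join(chunks)


def generated_images(response: dict) -> list:
    """Decode the base64 result of every image_generation_call item to bytes."""
    return [
        base64.b64decode(item["result"])
        for item in response.get("output", [])
        if item.get("type") == "image_generation_call" and item.get("result")
    ]
```

Each element returned by generated_images can be written to a .png file in binary mode.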

Response (streaming)

With stream: true the endpoint returns SSE, and event names carry the response. prefix. The key event sequence:

event: response.created            // Overall response skeleton (usage=null)
event: response.in_progress
event: response.output_item.added  // Output item N starts
event: response.output_text.delta  // Text increment; the delta field holds the chunk
event: response.output_text.done   // Text of item N is complete
event: response.output_item.done   // Item N is finished as a whole
event: response.completed          // Everything is done; usage is populated

Each data: payload is a single JSON object that carries a monotonically increasing sequence_number. See Streaming responses.
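
A minimal reader for this event stream might look like the following. It assumes each event arrives as an event: line followed by one data: line, and that the response.completed payload nests the final object under a response key (the upstream OpenAI layout; not confirmed on this page). read_response_stream is a hypothetical name.

```python
import json
from typing import Iterable, Optional, Tuple


def read_response_stream(lines: Iterable[str]) -> Tuple[str, Optional[dict]]:
    """Return (accumulated text, usage) from a /v1/responses SSE stream."""
    text_parts = []
    usage = None
    event = None
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line.startswith("event:"):
            event = line[len("event:"):].strip()
        elif line.startswith("data:"):
            data = json.loads(line[len("data:"):].strip())
            if event == "response.output_text.delta":
                text_parts.append(data["delta"])
            elif event == "response.completed":
                # Assumed layout: the final response object sits under "response".
                usage = (data.get("response") or {}).get("usage")
        elif line == "":
            event = None  # a blank line ends the current SSE event
    return "".join(text_parts), usage


sample = [
    "event: response.created",
    'data: {"sequence_number": 0, "response": {"usage": null}}',
    "",
    "event: response.output_text.delta",
    'data: {"sequence_number": 1, "delta": "1, 2"}',
    "",
    "event: response.output_text.delta",
    'data: {"sequence_number": 2, "delta": ", 3"}',
    "",
    "event: response.completed",
    'data: {"sequence_number": 3, "response": {"usage": {"total_tokens": 76}}}',
    "",
]
text, usage = read_response_stream(sample)
```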

curl

# Basic call
curl https://api.dflop.top/v1/responses \
  -H "Authorization: Bearer $GPUSHARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "input": "Hello",
    "max_output_tokens": 100
  }'

# Web search
curl https://api.dflop.top/v1/responses \
  -H "Authorization: Bearer $GPUSHARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "input": "What is the date today in Shanghai?",
    "tools": [{"type": "web_search"}]
  }'

# Streaming
curl -N https://api.dflop.top/v1/responses \
  -H "Authorization: Bearer $GPUSHARE_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "gpt-5.5",
    "input": "Count 1 to 3.",
    "stream": true
  }'

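Limitations

  • Multi-turn conversations must maintain their own history through the input array. The previous_response_id field is not implemented on the HTTP endpoint, and passing it returns invalid_request: the upstream restricts that field to the Responses WebSocket V2 protocol, which the gateway does not yet expose
  • Only GPT-5 series models are routed to this endpoint. Claude / Gemini / GLM / DeepSeek and others are not in the sub2api upstream pool, so passing them returns model_not_allowed. Call /v1/chat/completions or the model's own native endpoint instead
  • Vision is subject to the upstream's URL fetching: some CDN thumbnail, auth-gated, or hotlink-protected URLs will fail. Upload the image to your own public R2 / S3 bucket first, then pass that URL
  • The image_generation tool does not force streaming. This endpoint is an HTTP passthrough, so even with stream:false the response's output[] contains image_generation_call.result (base64 PNG). This differs from /v1/chat/completions, where image_generation goes through the sub2api WebSocket V2 adapter, which forces stream:true and embeds the image as markdown inside delta.content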
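
Since history has to be kept on the client, a small wrapper can re-send the whole conversation each turn. This is one possible shape, not a prescribed client; Conversation and its methods are hypothetical names.

```python
class Conversation:
    """Keeps the input array locally because previous_response_id is unavailable."""

    def __init__(self, model, instructions=None):
        self.model = model
        self.instructions = instructions
        self.history = []

    def request_body(self, user_text):
        """Append the user turn and return the body for the next POST /v1/responses."""
        self.history.append({"role": "user", "content": user_text})
        body = {"model": self.model, "input": list(self.history)}
        if self.instructions:
            body["instructions"] = self.instructions
        return body

    def record_reply(self, response):
        """Store the assistant text from a response so the next turn includes it."""
        text = "".join(
            part["text"]
            for item in response.get("output", [])
            if item.get("type") == "message"
            for part in item.get("content", [])
            if part.get("type") == "output_text"
        )
        self.history.append({"role": "assistant", "content": text})
        return text
```

Each call to request_body sends the full history, so token usage grows with the length of the conversation.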

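Error responses

When any endpoint returns HTTP 4xx/5xx, the response body uses the official error schema of that endpoint's own protocol; schemas are never mixed: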

OpenAI Chat (/v1/chat/completions)

{"error": {"message": "...", "type": "...", "code": "..."}}

Anthropic Messages (/v1/messages)

{"type": "error", "error": {"type": "...", "message": "..."}}

Gemini Native (/v1beta/...)

{"error": {"code": 400, "message": "...", "status": "INVALID_ARGUMENT"}}

For the meaning of each error code, see Error codes.
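
A client that talks to more than one endpoint may want to fold the three shapes into one. The sketch below tells them apart purely by the structures shown above; normalize_error and the returned keys are hypothetical, not part of the API.

```python
def normalize_error(body: dict) -> dict:
    """Map any of the three documented error bodies to {protocol, code, message}."""
    err = body.get("error") or {}
    if body.get("type") == "error":
        # Anthropic Messages: top-level type is "error", detail type is nested.
        return {"protocol": "anthropic", "code": err.get("type"), "message": err.get("message")}
    if "status" in err:
        # Gemini Native: numeric code plus a status string such as INVALID_ARGUMENT.
        return {"protocol": "gemini", "code": err.get("status"), "message": err.get("message")}
    # OpenAI Chat: prefer code, fall back to type.
    return {"protocol": "openai", "code": err.get("code") or err.get("type"), "message": err.get("message")}
```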

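Rate limits

There is currently no hard QPS limit. Usage is billed against the key's balance; once the balance is exhausted, requests return 412 quota_exhausted. See Authentication.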