A lightweight HTTP proxy server for OpenAI APIs. It wraps the official OpenAI SDK and exposes simplified endpoints for chat completions, audio transcriptions, and embeddings with built-in authentication, logging, and metrics.
- Responses API — OpenAI's most advanced interface with tools, web search, file search, MCP, function calling, and streaming
- Chat Completions — Full OpenAI Chat Completions API support including vision (images)
- Audio Transcriptions — Whisper-based speech-to-text
- Embeddings — Generate text embeddings
- Simple Chat — Simplified /chatgpt endpoint for quick prompts
- Health Checks — Built-in health endpoints
- Logging & Metrics — Request/response logs, token usage stats, error tracking
- CORS — Enabled for all origins
npm install

Create a .env or .env.local file in the project root:
OPENAI_API_KEY=sk-...
OPENAI_PROJECT_KEY=proj_... # Optional
SECURITY_KEY=your-secret-key # Required for authenticated endpoints
OPENAI_PROXY_UPSTREAM_TIMEOUT_MS=600000
OPENAI_PROXY_UPSTREAM_MAX_TIMEOUT_MS=900000
OPENAI_PROXY_MAX_PARALLEL_REQUESTS=32

- OPENAI_PROXY_UPSTREAM_TIMEOUT_MS sets the upstream OpenAI SDK timeout used when a caller does not provide timeout.
- OPENAI_PROXY_UPSTREAM_MAX_TIMEOUT_MS caps caller-provided timeout values. Values above the cap are clamped.
- OPENAI_PROXY_MAX_PARALLEL_REQUESTS bounds concurrent OpenAI work inside the proxy. When the limit is reached, the proxy rejects new upstream work with 503 and Retry-After: 1.
Default values:
- default upstream timeout: 600000 ms
- maximum upstream timeout: 900000 ms
- maximum parallel requests: 32
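To make the concurrency limit concrete, here is a minimal sketch of how such a guard might work. This is illustrative only, not the proxy's actual code; the names MAX_PARALLEL, tryAcquire, and release are hypothetical.

```typescript
// Hypothetical sketch of a concurrency guard like the one described above.
const MAX_PARALLEL = 32; // OPENAI_PROXY_MAX_PARALLEL_REQUESTS
let inFlight = 0;

function tryAcquire(): boolean {
  if (inFlight >= MAX_PARALLEL) return false; // over the limit: caller sends 503
  inFlight += 1;
  return true;
}

function release(): void {
  inFlight = Math.max(0, inFlight - 1);
}

// In a request handler (pseudo-usage):
// if (!tryAcquire()) {
//   res.set("Retry-After", "1").status(503).json({ error: { type: "overloaded" } });
//   return;
// }
// try { /* call OpenAI */ } finally { release(); }
```

Because the guard rejects before any upstream work starts, a 503 from it is always safe for a client to retry after the advertised delay.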
Development (with hot reload):

npm run local:watch

Production:

npm run start

The server starts on http://localhost:3002 by default.
Returns a simple HTML page to verify the server is running.
Main endpoint for OpenAI Chat Completions API.
Proxies requests to POST https://api.openai.com/v1/chat/completions.
| Field | Type | Required | Description |
|---|---|---|---|
| security_key | string | ✅ | Must match SECURITY_KEY env variable |
| openai_api_key | string | ❌ | Override the default API key |
| project | string | ❌ | OpenAI project ID |
| organization | string | ❌ | OpenAI organization ID |
| image | object | ❌ | Image for vision models (see below) |
| model | string | ✅ | Model ID (e.g., gpt-4o, gpt-4o-mini) |
| messages | array | ✅ | Array of message objects |
| temperature | number | ❌ | Sampling temperature (0-2) |
| top_p | number | ❌ | Nucleus sampling (0-1) |
| max_tokens | number | ❌ | Max tokens to generate |
| max_completion_tokens | number | ❌ | Max completion tokens |
| ... | ... | ❌ | Any other Chat Completions API parameters |
Note: stream: true is not supported on this endpoint.
{
"url": "https://example.com/image.png"
}

or
{
"base64": "iVBORw0KGgoAAAANSUhEUg..."
}

curl -X POST http://localhost:3002/openai \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o-mini",
"messages": [
{"role": "system", "content": "You are a helpful assistant."},
{"role": "user", "content": "What is TypeScript?"}
],
"temperature": 0.7,
"max_tokens": 500
}'

curl -X POST http://localhost:3002/openai \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"messages": [
{"role": "user", "content": "What is in this image?"}
],
"image": {
"url": "https://upload.wikimedia.org/wikipedia/commons/thumb/a/a7/Camponotus_flavomarginatus_ant.jpg/800px-Camponotus_flavomarginatus_ant.jpg"
}
}'

Standard OpenAI Chat Completion response:
{
"id": "chatcmpl-...",
"object": "chat.completion",
"created": 1234567890,
"model": "gpt-4o-mini",
"choices": [
{
"index": 0,
"message": {
"role": "assistant",
"content": "TypeScript is..."
},
"finish_reason": "stop"
}
],
"usage": {
"prompt_tokens": 25,
"completion_tokens": 100,
"total_tokens": 125
}
}

OpenAI Responses API endpoint — The most advanced interface for generating model responses with support for tools, files, web search, MCP, function calling, and more.
Proxies requests to POST https://api.openai.com/v1/responses.
The proxy also exposes the other documented Responses API operations:
- POST /openai2/compact → POST /v1/responses/compact
- POST /openai2/input_tokens → POST /v1/responses/input_tokens
- GET /openai2/:response_id → GET /v1/responses/:response_id
- GET /openai2/:response_id/input_items → GET /v1/responses/:response_id/input_items
- POST /openai2/:response_id/cancel → POST /v1/responses/:response_id/cancel
- DELETE /openai2/:response_id → DELETE /v1/responses/:response_id
| Field | Type | Required | Description |
|---|---|---|---|
| security_key | string | ✅ | Must match SECURITY_KEY env variable |
| openai_api_key | string | ❌ | Override the default API key |
| project | string | ❌ | OpenAI project ID |
| organization | string | ❌ | OpenAI organization ID |
| model | string | ✅ | Model ID (e.g., gpt-4o, gpt-4.1, o3) |
| input | string/array | ✅ | Text, image, or file inputs to the model |
| instructions | string | ❌ | System/developer message inserted into context |
| tools | array | ❌ | Array of tools (web_search, file_search, function, mcp, etc.) |
| tool_choice | string/object | ❌ | How model should select tools (auto, none, required, or specific tool) |
| stream | boolean | ❌ | Enable Server-Sent Events streaming (default: false) |
| temperature | number | ❌ | Sampling temperature (0-2, default: 1) |
| top_p | number | ❌ | Nucleus sampling (0-1, default: 1) |
| max_output_tokens | integer | ❌ | Max tokens for response including reasoning |
| max_tool_calls | integer | ❌ | Max total calls to built-in tools |
| parallel_tool_calls | boolean | ❌ | Allow parallel tool calls (default: true) |
| previous_response_id | string | ❌ | ID of previous response for multi-turn conversations |
| conversation | string/object | ❌ | Conversation context (cannot use with previous_response_id) |
| store | boolean | ❌ | Store response for later retrieval (default: true) |
| metadata | object | ❌ | Up to 16 key-value pairs for additional info |
| include | array | ❌ | Additional output data to include (see below) |
| text | object | ❌ | Text response configuration (format, structured output) |
| reasoning | object | ❌ | Reasoning model configuration (effort, summary) |
| truncation | string | ❌ | Truncation strategy (auto or disabled) |
| background | boolean | ❌ | Run response in background (default: false) |
| service_tier | string | ❌ | Processing tier (auto, default, flex, priority) |
| timeout | number | ❌ | Proxy-specific OpenAI SDK request timeout in milliseconds for this single upstream call |
- timeout is expressed in milliseconds.
- For POST endpoints, pass timeout as a JSON number when possible. Numeric strings are also normalized safely.
- For GET/DELETE Responses endpoints, pass timeout as a query parameter.
- The proxy applies this value to the single upstream OpenAI SDK request for that operation. It does not carry over to later retrieve, list, cancel, or delete calls.
- If no timeout is provided, the proxy uses OPENAI_PROXY_UPSTREAM_TIMEOUT_MS.
- Missing, invalid, non-finite, or non-positive values fall back to OPENAI_PROXY_UPSTREAM_TIMEOUT_MS.
- Values above OPENAI_PROXY_UPSTREAM_MAX_TIMEOUT_MS are clamped before the upstream SDK call is made.
- If OPENAI_PROXY_UPSTREAM_MAX_TIMEOUT_MS is configured below OPENAI_PROXY_UPSTREAM_TIMEOUT_MS, the effective max becomes the default timeout.
- The effective timeout is logged in the proxy's structured logs.
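The resolution rules above can be sketched as a single function. This is a hedged illustration of the documented behavior, not the proxy's actual implementation; resolveTimeout and the constant names are hypothetical.

```typescript
// Illustrative sketch of the documented timeout resolution rules.
const DEFAULT_TIMEOUT_MS = 600_000; // OPENAI_PROXY_UPSTREAM_TIMEOUT_MS
const MAX_TIMEOUT_MS = 900_000;     // OPENAI_PROXY_UPSTREAM_MAX_TIMEOUT_MS

function resolveTimeout(raw: unknown): number {
  // Numeric strings are normalized; anything else falls through to the default.
  const value = typeof raw === "string" ? Number(raw) : raw;
  // Missing, invalid, non-finite, or non-positive values use the default.
  if (typeof value !== "number" || !Number.isFinite(value) || value <= 0) {
    return DEFAULT_TIMEOUT_MS;
  }
  // If the configured max is below the default, the default becomes the cap.
  const effectiveMax = Math.max(MAX_TIMEOUT_MS, DEFAULT_TIMEOUT_MS);
  // Values above the cap are clamped before the upstream SDK call.
  return Math.min(value, effectiveMax);
}
```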
The timeout policy is applied across the OpenAI-backed proxy routes, including /openai, /openai2, /openai/audio/transcriptions, and /embeddings.
OpenAI-facing routes now return structured JSON errors instead of generic plain-text 500 responses:
{
"error": {
"message": "Timeout while waiting for OpenAI response",
"type": "upstream_timeout",
"code": "OPENAI_PROXY_TIMEOUT",
"requestId": "a7a27871-9d49-40c0-8c7b-7d44d2770ce8"
}
}

Failure categories:
- OpenAI API errors with an upstream HTTP status preserve that status and include sanitized upstream metadata.
- Transport timeouts without a valid upstream response return 504.
- Transport failures such as DNS, TLS, socket reset, or other connection failures return 502.
- Local overload from the concurrency guard returns 503 with Retry-After: 1.
- Validation failures return 400.
- Proxy auth failures remain 403.
- Client disconnects abort upstream work and are logged as cancellations instead of generic server failures.
- Automatic retries are disabled for non-idempotent create-style calls such as /openai, /openai2, /openai2/compact, /openai/audio/transcriptions, and /embeddings to avoid duplicating billed work.
- The official OpenAI SDK retry mechanism is still used on the safer read-only or idempotent operations exposed by the proxy:
  - POST /openai2/input_tokens
  - GET /openai2/:response_id
  - GET /openai2/:response_id/input_items
  - DELETE /openai2/:response_id
- Retry attempts are logged with request ID, endpoint, attempt number, and sanitized failure details.
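Since the proxy does not retry create-style calls, a client that wants resilience against the 503 overload responses can retry those itself. A minimal sketch, assuming a global fetch and the Retry-After behavior described above; the function name and retry budget are illustrative:

```typescript
// Client-side sketch: retry only on the proxy's 503 overload responses,
// which are safe to repeat because no upstream work was started.
async function postWithOverloadRetry(
  url: string,
  body: unknown,
  maxAttempts = 3,
): Promise<Response> {
  for (let attempt = 1; ; attempt++) {
    const res = await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
    });
    if (res.status !== 503 || attempt >= maxAttempts) return res;
    // Honor Retry-After (seconds); fall back to 1 second.
    const delaySec = Number(res.headers.get("Retry-After") ?? "1") || 1;
    await new Promise((r) => setTimeout(r, delaySec * 1000));
  }
}
```

Other statuses (400, 403, 5xx from upstream) are returned to the caller unchanged, matching the proxy's own policy of not repeating billed work.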
Each OpenAI-backed request emits a structured completion log entry with:
- request ID
- endpoint and method
- model when present
- streaming flag
- effective timeout and timeout source
- start time and duration
- final result category and returned HTTP status
- retry count
- overload and cancellation flags
Secrets such as API keys, bearer tokens, proxy security keys, cookies, and access tokens are redacted before they are stored or printed.
Specify additional output data to include:
- web_search_call.action.sources — Include web search sources
- code_interpreter_call.outputs — Include code interpreter outputs
- file_search_call.results — Include file search results
- message.input_image.image_url — Include input image URLs
- message.output_text.logprobs — Include logprobs with messages
- reasoning.encrypted_content — Include encrypted reasoning tokens
Web Search Tool:
{
"type": "web_search_preview",
"search_context_size": "medium"
}

File Search Tool:
{
"type": "file_search",
"vector_store_ids": ["vs_abc123"],
"max_num_results": 20
}

Function Calling Tool:
{
"type": "function",
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string", "description": "City name" }
},
"required": ["location"]
}
}

MCP (Model Context Protocol) Tool:
{
"type": "mcp",
"server_label": "my-mcp-server",
"server_url": "https://my-mcp-server.example.com",
"allowed_tools": ["tool1", "tool2"]
}

Code Interpreter Tool:
{
"type": "code_interpreter",
"container": { "type": "auto" }
}

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "Tell me a three sentence bedtime story about a unicorn."
}'

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"instructions": "You are a helpful coding assistant. Always provide code examples.",
"input": "How do I read a file in Python?"
}'

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "What are the latest news about AI?",
"tools": [
{ "type": "web_search_preview" }
],
"include": ["web_search_call.action.sources"]
}'

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "What does the documentation say about authentication?",
"tools": [
{
"type": "file_search",
"vector_store_ids": ["vs_abc123"],
"max_num_results": 10
}
],
"include": ["file_search_call.results"]
}'

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "What is the weather in San Francisco?",
"tools": [
{
"type": "function",
"name": "get_weather",
"description": "Get current weather for a location",
"parameters": {
"type": "object",
"properties": {
"location": { "type": "string", "description": "City name" },
"unit": { "type": "string", "enum": ["celsius", "fahrenheit"] }
},
"required": ["location"]
}
}
]
}'

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "Search my Google Drive for Q4 reports",
"tools": [
{
"type": "mcp",
"server_label": "google-drive",
"server_url": "https://mcp.example.com/google-drive",
"allowed_tools": ["search_files", "read_file"]
}
]
}'

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": [
{ "type": "input_text", "text": "What is in this image?" },
{ "type": "input_image", "image_url": "https://example.com/image.png" }
]
}'

# First request
curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "My name is Alice."
}'
# Response includes "id": "resp_abc123..."
# Second request with previous_response_id
curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "What is my name?",
"previous_response_id": "resp_abc123..."
}'

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "Write a short poem about coding.",
"stream": true
}'

GET /openai2/:response_id supports the documented Responses retrieval query parameters:
- include
- stream
- include_obfuscation
- starting_after
- timeout (milliseconds, per upstream retrieve request, practical max 900000)
GET /openai2/:response_id/input_items supports:
- after
- include
- limit
- order
- timeout (milliseconds, per upstream list request, practical max 900000)
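The after/limit parameters combine with the has_more and last_id fields of the list response for cursor pagination. A client-side sketch, assuming a global fetch; the function name and page size are illustrative:

```typescript
// Sketch: page through all input items for a response via the proxy.
async function listAllInputItems(
  responseId: string,
  securityKey: string,
): Promise<unknown[]> {
  const items: unknown[] = [];
  let after: string | undefined;
  do {
    const params = new URLSearchParams({ security_key: securityKey, limit: "20" });
    if (after) params.set("after", after);
    const res = await fetch(
      `http://localhost:3002/openai2/${responseId}/input_items?${params}`,
    );
    // Expected shape: { object: "list", data, first_id, last_id, has_more }
    const page = await res.json();
    items.push(...page.data);
    after = page.has_more ? page.last_id : undefined; // cursor for the next page
  } while (after);
  return items;
}
```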
curl -X POST http://localhost:3002/openai2/compact \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-5",
"input": "Summarize this long-running conversation."
}'

curl -X POST http://localhost:3002/openai2/input_tokens \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "Count the tokens in this prompt."
}'

curl "http://localhost:3002/openai2/resp_abc123/input_items?security_key=your-secret-key&limit=20&order=desc"

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "Extract the name and age from: John is 30 years old.",
"text": {
"format": {
"type": "json_schema",
"name": "person_info",
"schema": {
"type": "object",
"properties": {
"name": { "type": "string" },
"age": { "type": "integer" }
},
"required": ["name", "age"]
}
}
}
}'

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "o3",
"input": "Solve this complex math problem: ...",
"reasoning": {
"effort": "high",
"summary": "auto"
}
}'

curl -X POST http://localhost:3002/openai2 \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"model": "gpt-4o",
"input": "Find the current stock price of Apple and calculate a 10% increase",
"tools": [
{ "type": "web_search_preview" },
{
"type": "function",
"name": "calculate_percentage",
"description": "Calculate percentage of a number",
"parameters": {
"type": "object",
"properties": {
"number": { "type": "number" },
"percentage": { "type": "number" }
},
"required": ["number", "percentage"]
}
}
],
"parallel_tool_calls": true
}'

Standard OpenAI Responses API response:
{
"id": "resp_67ccd2bed1ec8190b14f964abc054267...",
"object": "response",
"created_at": 1741476542,
"status": "completed",
"completed_at": 1741476543,
"model": "gpt-4o-2024-08-06",
"output": [
{
"type": "message",
"id": "msg_67ccd2bf17f0819081ff3bb2cf6508e6...",
"status": "completed",
"role": "assistant",
"content": [
{
"type": "output_text",
"text": "In a peaceful grove beneath a silver moon...",
"annotations": []
}
]
}
],
"parallel_tool_calls": true,
"reasoning": { "effort": null, "summary": null },
"store": true,
"temperature": 1.0,
"tool_choice": "auto",
"tools": [],
"usage": {
"input_tokens": 36,
"input_tokens_details": { "cached_tokens": 0 },
"output_tokens": 87,
"output_tokens_details": { "reasoning_tokens": 0 },
"total_tokens": 123
}
}

Retrieve a stored response by ID.
| Field | Type | Required | Description |
|---|---|---|---|
| security_key | string | ✅ | Must match SECURITY_KEY env variable |
| openai_api_key | string | ❌ | Override the default API key |
| project | string | ❌ | OpenAI project ID |
| organization | string | ❌ | OpenAI organization ID |
curl "http://localhost:3002/openai2/resp_abc123?security_key=your-secret-key"

Delete a stored response.
| Field | Type | Required | Description |
|---|---|---|---|
| security_key | string | ✅ | Must match SECURITY_KEY env variable |
| openai_api_key | string | ❌ | Override the default API key |
curl -X DELETE "http://localhost:3002/openai2/resp_abc123?security_key=your-secret-key"

{
"id": "resp_abc123",
"object": "response",
"deleted": true
}

Cancel a background response (only for responses created with background: true).
| Field | Type | Required | Description |
|---|---|---|---|
| security_key | string | ✅ | Must match SECURITY_KEY env variable |
| openai_api_key | string | ❌ | Override the default API key |
curl -X POST http://localhost:3002/openai2/resp_abc123/cancel \
-H "Content-Type: application/json" \
-d '{ "security_key": "your-secret-key" }'

List input items for a response.
| Field | Type | Required | Description |
|---|---|---|---|
| security_key | string | ✅ | Must match SECURITY_KEY env variable |
| openai_api_key | string | ❌ | Override the default API key |
curl "http://localhost:3002/openai2/resp_abc123/input_items?security_key=your-secret-key"

{
"object": "list",
"data": [
{
"id": "msg_abc123",
"type": "message",
"role": "user",
"content": [
{ "type": "input_text", "text": "Tell me a story." }
]
}
],
"first_id": "msg_abc123",
"last_id": "msg_abc123",
"has_more": false
}

Simplified chat endpoint using the chatgpt library.
| Field | Type | Required | Description |
|---|---|---|---|
| security_key | string | ✅ | Must match SECURITY_KEY env variable |
| prompt | string | ✅ | The user message |
| model | string | ❌ | Model ID (default: gpt-4o-mini) |
| temperature | number | ❌ | Sampling temperature (0-2) |
| top_p | number | ❌ | Nucleus sampling (0-1) |
| max_tokens | number | ❌ | Max tokens to generate |
| max_completion_tokens | number | ❌ | Max completion tokens |
curl -X POST http://localhost:3002/chatgpt \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"prompt": "Explain quantum computing in simple terms",
"model": "gpt-4o-mini",
"temperature": 0.8
}'

Audio transcription using OpenAI Whisper.
| Field | Type | Required | Description |
|---|---|---|---|
| file | file | ✅ | Audio file (wav, mp3, m4a, etc.) |
| security_key | string | ✅ | Must match SECURITY_KEY env variable |
| openai_api_key | string | ❌ | Override the default API key |
| project | string | ❌ | OpenAI project ID |
| organization | string | ❌ | OpenAI organization ID |
| model | string | ❌ | Model ID (default: whisper-1) |
| language | string | ❌ | Language code (e.g., en, es) |
| prompt | string | ❌ | Optional prompt to guide transcription |
| temperature | number | ❌ | Sampling temperature (default: 0) |
| response_format | string | ❌ | json, text, srt, verbose_json, vtt |
| timestamp_granularities[] | string | ❌ | word and/or segment |
curl -X POST http://localhost:3002/openai/audio/transcriptions \
-F "file=@audio.mp3" \
-F "security_key=your-secret-key" \
-F "model=whisper-1" \
-F "language=en" \
-F "response_format=json"

{
"text": "Hello, this is a transcription of the audio file."
}

Generate text embeddings.
| Field | Type | Required | Description |
|---|---|---|---|
| security_key | string | ✅ | Must match SECURITY_KEY env variable |
| input | string or array | ✅ | Text(s) to embed |
| model | string | ❌ | Model ID (default: text-embedding-3-large) |
| dimensions | number | ❌ | Output dimensions |
| encoding_format | string | ❌ | float or base64 |
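A common use of this endpoint is semantic similarity. The sketch below fetches embeddings through the proxy and compares them with cosine similarity; the similarity helper is standard math, while the request shape follows the parameters above (function names are illustrative).

```typescript
// Sketch: embed two texts via the proxy and compare them.
async function embed(text: string, securityKey: string): Promise<number[]> {
  const res = await fetch("http://localhost:3002/embeddings", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      security_key: securityKey,
      input: text,
      model: "text-embedding-3-small",
    }),
  });
  const json = await res.json();
  return json.data[0].embedding; // float vector from the list response
}

// Cosine similarity: dot(a, b) / (|a| * |b|), in [-1, 1].
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}
```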
curl -X POST http://localhost:3002/embeddings \
-H "Content-Type: application/json" \
-d '{
"security_key": "your-secret-key",
"input": "The quick brown fox jumps over the lazy dog",
"model": "text-embedding-3-small",
"dimensions": 512
}'

{
"object": "list",
"data": [
{
"object": "embedding",
"index": 0,
"embedding": [0.0023064255, -0.009327292, ...]
}
],
"model": "text-embedding-3-small",
"usage": {
"prompt_tokens": 9,
"total_tokens": 9
}
}

| Status | Description |
|---|---|
| 400 | Bad Request — Invalid input or streaming not supported |
| 403 | Forbidden — Invalid or missing security_key |
| 404 | Not Found — Unknown endpoint |
| 429 | Too Many Requests — Rate limit exceeded (health/log endpoints) |
| 500 | Internal Server Error — OpenAI API error or server issue |
Build and push Docker image:
npm run docker

Or manually:

docker build -t chatgpt-proxy .
docker run -p 3002:3002 --env-file .env chatgpt-proxy

- Port: 3002
- Request Timeout: 15 minutes (900,000 ms)
- Keep-Alive Timeout: 15 minutes
- Headers Timeout: ~16 minutes
Client timeout overrides cannot increase these server-side HTTP limits.
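While a client cannot raise these server-side limits, it can enforce a shorter deadline of its own with AbortController, which also lets the proxy treat the disconnect as a cancellation. A sketch with an illustrative function name:

```typescript
// Client-side sketch: cap how long we wait for a proxy call.
// This can only shorten the effective wait, never extend the server limits.
async function postWithDeadline(
  url: string,
  body: unknown,
  ms: number,
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), ms);
  try {
    return await fetch(url, {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify(body),
      signal: controller.signal, // abort propagates as a client disconnect
    });
  } finally {
    clearTimeout(timer);
  }
}
```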
ISC