Completions

The /completions endpoint generates text completions based on the provided prompt.

HTTP Request

POST /completions
POST /v1/completions
Host: your-aqueduct-domain.com
Authorization: Bearer YOUR_AQUEDUCT_TOKEN
Content-Type: application/json

Request Body

The request body should be a JSON object compatible with the OpenAI CompletionCreateParams schema.

Parameter           Type                Description
model               string              The name of the model to use.
prompt              string or [string]  The prompt(s) to generate completions for.
suffix              string              Optional text that comes after the generated completion.
max_tokens          integer             Maximum number of tokens to generate.
temperature         number              Sampling temperature to use.
top_p               number              Nucleus sampling probability.
n                   integer             Number of completions to generate for each prompt.
stream              boolean             If true, partial progress is sent back as server-sent events.
stop                string or [string]  Up to 4 sequences where the API will stop generating further tokens.
presence_penalty    number              Penalizes new tokens based on whether they already appear in the text so far.
frequency_penalty   number              Penalizes new tokens based on their frequency in the text so far.
user                string              A unique identifier for the end user.

See the OpenAI documentation for a full list of parameters.
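As a sketch, a request body that combines several of the optional parameters above can be built and serialized like this (the model name, prompts, and stop sequences are placeholders, not values required by the gateway):

```python
import json

# Illustrative values only: the model name, prompts, and stop
# sequences below are placeholders.
payload = {
    "model": "your-model-name",
    "prompt": ["Once upon a time", "In another land"],  # a batch of two prompts
    "max_tokens": 50,
    "temperature": 0.7,
    "n": 2,                       # two completions per prompt
    "stop": ["\n\n", "THE END"],  # up to 4 stop sequences
}

body = json.dumps(payload)  # send this as the POST body
```

With n set to 2 and two prompts, the response will contain four choices in total, one pair per prompt.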

Examples

cURL Example

curl https://your-aqueduct-domain.com/completions \
  -H "Authorization: Bearer YOUR_AQUEDUCT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-name",
    "prompt": "Once upon a time",
    "max_tokens": 50,
    "temperature": 0.7
  }'

Python Example (OpenAI SDK)

from openai import OpenAI

client = OpenAI(
    base_url="https://your-aqueduct-domain.com",
    api_key="YOUR_AQUEDUCT_TOKEN",
)

response = client.completions.create(
    model="your-model-name",
    prompt="Once upon a time",
    max_tokens=50,
    temperature=0.7,
)
print(response.choices[0].text)

Streaming Responses

To receive a streamed response, set "stream": true in the request body. The Aqueduct Gateway will return a Server-Sent Events (SSE) stream of data: ... chunks following the OpenAI streaming format, terminated by a final data: [DONE] event.

curl https://your-aqueduct-domain.com/completions \
  -H "Authorization: Bearer YOUR_AQUEDUCT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-name",
    "prompt": "Once upon a time",
    "max_tokens": 50,
    "stream": true
  }'
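The OpenAI SDK decodes the SSE stream automatically; when using a plain HTTP client instead, the chunks can be parsed by hand. A minimal sketch of that parsing (the sample events below are illustrative, not actual gateway output):

```python
import json

# Illustrative SSE body, not actual gateway output.
SAMPLE_SSE = """\
data: {"choices": [{"text": "Once", "index": 0}]}

data: {"choices": [{"text": " upon", "index": 0}]}

data: [DONE]
"""

def parse_sse(body: str) -> str:
    """Concatenate the text of each data: chunk, stopping at [DONE]."""
    parts = []
    for line in body.splitlines():
        if not line.startswith("data: "):
            continue  # skip blank separator lines
        payload = line[len("data: "):]
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        parts.append(chunk["choices"][0]["text"])
    return "".join(parts)

print(parse_sse(SAMPLE_SSE))  # -> Once upon
```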

Error Responses

Aqueduct maps backend errors to HTTP status codes similar to the OpenAI API:

Status Code  Description
200          OK
400          Bad request (invalid parameters)
401          Unauthorized (invalid API token)
403          Forbidden (permission denied)
404          Not found (model or endpoint not available)
422          Unprocessable entity
429          Rate limit exceeded
500          Internal server error (upstream or gateway error)
503          Service unavailable
504          Gateway timeout
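Client code typically retries the transient errors above (rate limits and gateway-side failures) while failing fast on client errors. A minimal sketch of such retry logic; the retryable status set and backoff schedule are assumptions to adapt for your deployment, not gateway requirements:

```python
# Assumed retryable statuses: rate limits and transient gateway errors.
RETRYABLE = {429, 500, 503, 504}

def should_retry(status_code: int, attempt: int, max_attempts: int = 3) -> bool:
    """Retry transient errors up to max_attempts; never retry other 4xx errors."""
    return status_code in RETRYABLE and attempt < max_attempts

def backoff_seconds(attempt: int, base: float = 0.5) -> float:
    """Exponential backoff: 0.5s, 1s, 2s, ... for attempts 0, 1, 2, ..."""
    return base * (2 ** attempt)
```

A caller would sleep for backoff_seconds(attempt) between attempts while should_retry returns true.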