Completions
The /completions endpoint generates text completions based on the provided prompt.
HTTP Request
POST /completions
POST /v1/completions
Host: your-aqueduct-domain.com
Authorization: Bearer YOUR_AQUEDUCT_TOKEN
Content-Type: application/json
Request Body
The request body should be a JSON object compatible with the OpenAI CompletionCreateParams schema.
Parameter | Type | Description |
---|---|---|
model | string | The name of the model to use. |
prompt | string or [string] | The prompt(s) to generate completions for. |
suffix | string | Optional text that follows the generated completion (used for insertion). |
max_tokens | integer | Maximum number of tokens to generate. |
temperature | number | Sampling temperature to use. |
top_p | number | Nucleus sampling probability. |
n | integer | Number of completions to generate for each prompt. |
stream | boolean | If true, send back partial progress as events. |
stop | string or [string] | Up to 4 sequences where the API will stop generating. |
presence_penalty | number | Penalizes new tokens based on whether they already appear in the text so far. |
frequency_penalty | number | Penalizes new tokens based on how often they already appear in the text so far. |
user | string | A unique identifier for the end-user. |
See the OpenAI documentation for a full list of parameters.
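As a rough illustration of how these parameters combine, the sketch below assembles a request body in Python and checks the one limit stated in the table (at most 4 stop sequences). `build_completion_body` is a hypothetical helper written for this example; it is not part of Aqueduct or the OpenAI SDK, and the gateway may enforce further rules not shown here.

```python
import json


def build_completion_body(model, prompt, stop=None, **options):
    """Assemble a /completions request body (hypothetical helper)."""
    body = {"model": model, "prompt": prompt}
    if stop is not None:
        # The table allows a single string or a list of strings.
        sequences = [stop] if isinstance(stop, str) else list(stop)
        if len(sequences) > 4:
            raise ValueError("stop accepts at most 4 sequences")
        body["stop"] = stop
    # Pass through only the optional parameters that were actually set.
    body.update({k: v for k, v in options.items() if v is not None})
    return body


body = build_completion_body(
    "your-model-name",
    ["Once upon a time", "In a galaxy far away"],  # prompt may be a list
    stop=["\n\n"],
    max_tokens=50,
    n=2,
)
print(json.dumps(body, indent=2))
```

The resulting dictionary can be sent as the JSON body of the POST request, or its keys passed as keyword arguments to `client.completions.create`.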
Examples
cURL Example
curl https://your-aqueduct-domain.com/completions \
  -H "Authorization: Bearer YOUR_AQUEDUCT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-name",
    "prompt": "Once upon a time",
    "max_tokens": 50,
    "temperature": 0.7
  }'
Python Example (OpenAI SDK)
from openai import OpenAI

# Point the SDK at the Aqueduct gateway instead of api.openai.com.
client = OpenAI(
    base_url="https://your-aqueduct-domain.com",
    api_key="YOUR_AQUEDUCT_TOKEN",
)

response = client.completions.create(
    model="your-model-name",
    prompt="Once upon a time",
    max_tokens=50,
    temperature=0.7,
)

# Each choice carries the generated text for one completion.
print(response.choices[0].text)
Streaming Responses
To receive a streamed response, set "stream": true in the request body. The Aqueduct Gateway returns a Server-Sent Events (SSE) stream of data: ... chunks that follow the OpenAI streaming format.
curl https://your-aqueduct-domain.com/completions \
  -H "Authorization: Bearer YOUR_AQUEDUCT_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "your-model-name",
    "prompt": "Once upon a time",
    "max_tokens": 50,
    "stream": true
  }'
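For clients that read the stream without the SDK, the following is a minimal sketch of how the data: lines might be decoded. It assumes each chunk is a JSON object with a `choices` list whose entries carry a `text` field, and that the stream ends with the `data: [DONE]` sentinel used by the OpenAI streaming format. The `iter_completion_text` function and the sample lines are made up for illustration.

```python
import json


def iter_completion_text(lines):
    """Yield text fragments from an iterable of SSE lines (illustrative)."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and non-data fields
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # assumed end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            yield choice.get("text", "")


# Hand-written sample lines standing in for a real response body.
sample = [
    'data: {"choices": [{"index": 0, "text": " there"}]}',
    "",
    'data: {"choices": [{"index": 0, "text": " was"}]}',
    "",
    "data: [DONE]",
]
print("".join(iter_completion_text(sample)))
```

With the OpenAI SDK, passing `stream=True` to `client.completions.create` returns an iterator of chunks, so manual parsing like this is only needed when working with the raw HTTP response.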
Error Responses
Aqueduct maps backend errors to HTTP status codes similar to the OpenAI API:
Status Code | Description |
---|---|
200 | OK |
400 | Bad request (invalid parameters) |
401 | Unauthorized (invalid API token) |
403 | Forbidden (permission denied) |
404 | Not found (model or endpoint not available) |
422 | Unprocessable entity |
429 | Rate limit exceeded |
500 | Internal server error (upstream or gateway error) |
503 | Service unavailable |
504 | Gateway timeout |
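One way a client might act on these codes is sketched below. The choice of which statuses count as transient (429, 500, 503, 504) and the backoff parameters are assumptions made for this example, not behavior documented by Aqueduct; `should_retry` and `backoff_delay` are hypothetical helpers.

```python
import random

# Assumption: these statuses are treated as transient and worth retrying.
RETRYABLE_STATUSES = {429, 500, 503, 504}


def should_retry(status_code):
    """Return True if a request with this status may succeed on retry."""
    return status_code in RETRYABLE_STATUSES


def backoff_delay(attempt, base=0.5, cap=8.0):
    """Seconds to wait before retry number `attempt` (0-based).

    Exponential backoff with full jitter: a random value between 0 and
    min(cap, base * 2**attempt).
    """
    return random.uniform(0, min(cap, base * 2 ** attempt))


for status in (401, 429, 503):
    print(status, "retry" if should_retry(status) else "fail fast")
```

Client errors such as 400, 401, 403, 404, and 422 indicate a problem with the request itself, so this sketch does not retry them.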