Security enhancements to IP Vault and new Gemini 3.1 Flash Live STS model for Convo mode
curl --request POST \
--url https://pria.praxislxp.com/api/ai/chat/completions \
--header 'Content-Type: application/json' \
--header 'x-access-token: <x-access-token>' \
--data '
{
"messages": [
{
"content": "<string>",
"name": "<string>",
"tool_call_id": "<string>",
"tool_calls": [
{}
]
}
],
"model": "pria",
"stream": true
}
'"data: {\"id\":\"chatcmpl-1735...\",\"object\":\"chat.completion.chunk\",\"created\":1735000000,\"model\":\"pria\",\"choices\":[{\"index\":0,\"delta\":{\"role\":\"assistant\",\"content\":\"\"},\"finish_reason\":null,\"logprobs\":null}]}\n\ndata: {\"id\":\"chatcmpl-1735...\",\"object\":\"chat.completion.chunk\",\"created\":1735000000,\"model\":\"pria\",\"choices\":[{\"index\":0,\"delta\":{\"content\":\"Hello\"},\"finish_reason\":null,\"logprobs\":null}]}\n\ndata: [DONE]\n"OpenAI-compatible streaming chat completions endpoint. Accepts a messages[]
array, extracts the last user message as the active turn, replays any prior
user/assistant messages as conversation history (sanitized for Bedrock’s
alternating-role requirement), runs the active turn through Praxis’s
RAG/tool pipeline, and streams back OpenAI-format SSE chunks.
Today’s primary consumer: the ElevenLabs Voice Agent in Convo (Direct) mode — its custom-LLM webhook points at this endpoint.
OpenAI compatibility surface (what Pria reads vs. ignores):
Pria accepts the OpenAI shape but only reads messages[] and model.
model is informational — the effective model is determined by the
Praxis cascade (assistant > chatCompletionModel > institution
conversationModel). Fields commonly seen on OpenAI clients but
silently ignored here: tools, tool_choice, temperature,
max_tokens, stream, stream_options, top_p, n,
frequency_penalty, presence_penalty, response_format, seed,
logit_bias, user. The response is always SSE (server forces
streaming mode regardless of stream). Tool calls are executed
server-side by Pria’s tool runtime — they are not surfaced as OpenAI
tool_calls deltas; tool acknowledgements appear inline as spoken
phrases in the content stream.
Per-institution gate. Disabled by default. The administrator must
set institution.chatCompletionEnabled = true to allow inbound traffic.
Disabled institutions receive 403 chat_completion_disabled.
Override fields (institution-level, all optional):
chatCompletionModel — overrides the conversation model and provider
routing for inbound requests. Priority: assistant.conversationModel
chatCompletionModel>institution.conversationModel. Assistant always wins — the override only applies when no assistant has overridden the conversation model. Empty/unset = inherit from existing cascade.
chatCompletionMaxCompletionTokens — overrides maxCompletionTokens.
Sentinel: -1 = inherit, 0 = Auto (catalog cap), >0 = explicit.chatCompletionReasoningEffort — overrides reasoningEffort. Empty
string = inherit. Common voice-mode value: 'none'.Detection of “this is a chat-completion inbound request” is path-based —
any request landing here sets requestArgs.isChatCompletion = true,
which the override helpers in rag.js and reasoning_effort_utils.js
read to apply the cascade above.
curl --request POST \
--url https://pria.praxislxp.com/api/ai/chat/completions \
--header 'Content-Type: application/json' \
--header 'x-access-token: <x-access-token>' \
--data '
{
"messages": [
{
"content": "<string>",
"name": "<string>",
"tool_call_id": "<string>",
"tool_calls": [
{}
]
}
],
"model": "pria",
"stream": true
}
'"data: {\"id\":\"chatcmpl-1735...\",\"object\":\"chat.completion.chunk\",\"created\":1735000000,\"model\":\"pria\",\"choices\":[{\"index\":0,\"delta\":{\"role\":\"assistant\",\"content\":\"\"},\"finish_reason\":null,\"logprobs\":null}]}\n\ndata: {\"id\":\"chatcmpl-1735...\",\"object\":\"chat.completion.chunk\",\"created\":1735000000,\"model\":\"pria\",\"choices\":[{\"index\":0,\"delta\":{\"content\":\"Hello\"},\"finish_reason\":null,\"logprobs\":null}]}\n\ndata: [DONE]\n"Documentation Index
Fetch the complete documentation index at: https://docs.praxis-ai.com/llms.txt
Use this file to discover all available pages before exploring further.
Praxis JWT
Public ID of the institution context (optional, defaults to user's primary institution)
Conversation/course ID
Assistant ObjectId (24-char hex)
IANA timezone string (e.g. "America/New_York") for date-aware prompts
OpenAI-format messages array. The last user message is the
active turn; earlier user and assistant entries are
replayed as conversation history. system and tool
messages are accepted in the shape but ignored — Pria
builds its own system prompt from assistant + institution
settings, and tool execution is server-managed.
Show child attributes
Informational only — echoed back in SSE chunks as
choices[].delta.model. The actual model dispatched is
determined by the Praxis cascade
(assistant.conversationModel >
institution.chatCompletionModel >
institution.conversationModel).
"pria"
Ignored — the endpoint always returns text/event-stream.
Accepted for OpenAI shape compatibility.
true
SSE stream of OpenAI-format completion chunks. Terminated with a final data: [DONE] line.
The response is of type string.
"data: {\"id\":\"chatcmpl-1735...\",\"object\":\"chat.completion.chunk\",\"created\":1735000000,\"model\":\"pria\",\"choices\":[{\"index\":0,\"delta\":{\"role\":\"assistant\",\"content\":\"\"},\"finish_reason\":null,\"logprobs\":null}]}\n\ndata: {\"id\":\"chatcmpl-1735...\",\"object\":\"chat.completion.chunk\",\"created\":1735000000,\"model\":\"pria\",\"choices\":[{\"index\":0,\"delta\":{\"content\":\"Hello\"},\"finish_reason\":null,\"logprobs\":null}]}\n\ndata: [DONE]\n"
Was this page helpful?