
Ophraxx Lite: how we built our model lineup and routing system

An inside look at the three-model Ophraxx Lite lineup — what each model is designed for, how the prompt classifier automatically routes queries between them, how the five personality modes work, and what's coming next.


The three-model architecture

Ophraxx Lite is the name of our Discord bot model family. The lineup currently has three tiers. Ophraxx Lite • OX-L1 is the live, free-tier model — optimized for speed, reliability, and everyday use: casual chat, quick factual lookups, Discord moderation help, and short-form tasks. It targets a 1,024-token output ceiling to keep responses snappy and avoid overwhelming users in a chat environment.

Ophraxx Lite • OX-B12 is the upcoming premium model, designed for upgraded reasoning — complex prompts, coding help, technical deep-dives, and longer structured answers. It targets a 2,048-token ceiling and is tuned for problems where getting the answer right matters more than getting it fast. Ophraxx Lite • OX-U45 is the future flagship — reserved for the highest-tier use cases, also at a 2,048-token ceiling, representing the most capable version of the platform we plan to deploy.

All three models run on our own AI infrastructure, optimized for extremely low inference latency — essential for a Discord bot where users expect near-instant replies. The underlying model details are intentionally abstracted behind the Ophraxx Lite branding so we can upgrade them over time without disrupting user expectations or requiring any changes on the server side.
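The tier table above can be sketched as a small config map. This is a hypothetical illustration — the field names and the `output_ceiling` helper are ours, not the production schema; only the tier names and token ceilings come from the lineup described above.

```python
# Hypothetical tier table; field names are illustrative, not the
# production schema. Tier names and ceilings are from the lineup above.
MODEL_TIERS = {
    "OX-L1":  {"max_output_tokens": 1024, "live": True,  "role": "fast base"},
    "OX-B12": {"max_output_tokens": 2048, "live": False, "role": "reasoning"},
    "OX-U45": {"max_output_tokens": 2048, "live": False, "role": "flagship"},
}

def output_ceiling(tier: str) -> int:
    """Return the output-token ceiling for a given tier."""
    return MODEL_TIERS[tier]["max_output_tokens"]
```

Keeping the ceilings in one table like this is what makes it cheap to swap the underlying models without touching the product layer.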

Automatic query routing with the prompt classifier

One of the core engineering decisions we made early was that users should never have to manually pick which model to use. The prompt classifier handles this automatically on every message. It evaluates each query against two sets of keyword patterns: a complex set covering words like analyze, algorithm, compare, write a, explain, step-by-step, essay, and academic subject names like mathematics, philosophy, and chemistry; and a simple set covering greetings, identity questions, and one-word replies.

The classifier also weighs message length and question count. A message over 180 characters is automatically treated as complex. A message between 80 and 180 characters with complex keywords or multiple questions scores higher. A high enough complexity score routes the query to the more capable model. This means a user typing 'write me a Python class that implements a binary search tree' gets the reasoning model automatically, while 'what time is it in Tokyo' goes to the fast base model — with no configuration required.
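The scoring logic can be sketched roughly as follows. The keyword lists here are trimmed samples from the sets named above, and the weights and threshold are illustrative guesses, not the production values.

```python
# A minimal sketch of the classifier's scoring. Keyword lists are
# trimmed samples; weights and the threshold are illustrative.
COMPLEX_KEYWORDS = ("analyze", "algorithm", "compare", "write a", "explain",
                    "step-by-step", "essay", "mathematics", "philosophy",
                    "chemistry")
SIMPLE_OPENERS = ("hi", "hello", "hey", "thanks", "who are you")

def classify(message: str) -> str:
    """Return 'complex' or 'simple' for routing purposes."""
    text = message.lower().strip()
    score = 0
    if len(text) > 180:            # long messages are treated as complex
        score += 2
    elif len(text) > 80:           # mid-length messages need extra signals
        score += 1
    if any(kw in text for kw in COMPLEX_KEYWORDS):
        score += 2                 # complex-set keyword match
    if text.count("?") > 1:        # multiple questions raise the score
        score += 1
    if any(text.startswith(op) for op in SIMPLE_OPENERS):
        score -= 2                 # greetings, identity questions, one-worders
    return "complex" if score >= 2 else "simple"
```

With these weights, a message over 180 characters clears the threshold on length alone, while a mid-length message needs a keyword match or multiple questions on top — matching the rules described above.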

If the reasoning model is not yet live, the router falls back gracefully to OX-L1. This fallback is intentional: the system always gives the user a response even if their preferred tier is unavailable, and the transition to the better model happens transparently once it goes live.

The five personality modes

Beyond which model is used, we built a personality system that lets communities tune how the AI communicates. There are five modes. Default is the balanced baseline — helpful, intelligent, and adaptive to the tone of the conversation. Professional is formal and business-focused, avoiding slang, leading with the most important information, and using precise structured language.

Friendly is the polar opposite — warm, casual, encouraging, and comfortable with light humor, like a knowledgeable friend. Tutor is patient and educational, breaking down complex topics step by step with analogies and real-world examples, and actively inviting follow-up questions. Concise is the most minimal mode — one to three sentences unless complexity demands more, bullet points over paragraphs, no preamble whatsoever.

Each personality mode appends a behavioral addendum to the system prompt, layered on top of the core identity and safety rules. The safety rules are always non-negotiable regardless of mode — personality only changes tone and style, never constraints. Server admins configure the active personality through the bot's setup flow and can change it at any time.
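The layering described above can be sketched like this. The identity text, safety rules, and addenda below are placeholder paraphrases of the mode descriptions — the real prompt text is not public.

```python
# Hypothetical sketch of system-prompt layering. All strings are
# placeholder paraphrases; the real prompt text is not public.
CORE_IDENTITY = "You are Ophraxx Lite, a helpful Discord assistant."
SAFETY_RULES = "Safety rules apply in every mode and are non-negotiable."

PERSONALITY_ADDENDA = {
    "default": "Be balanced, helpful, and adaptive to the conversation's tone.",
    "professional": "Be formal and business-focused; no slang; lead with key facts.",
    "friendly": "Be warm, casual, and encouraging; light humor is welcome.",
    "tutor": "Teach step by step with analogies; invite follow-up questions.",
    "concise": "Answer in one to three sentences; prefer bullets; no preamble.",
}

def build_system_prompt(mode: str = "default") -> str:
    """Layer the mode's addendum on top of identity and safety rules.

    Safety rules are always included, regardless of mode — personality
    only changes tone and style, never constraints.
    """
    addendum = PERSONALITY_ADDENDA.get(mode, PERSONALITY_ADDENDA["default"])
    return "\n\n".join([CORE_IDENTITY, SAFETY_RULES, addendum])
```

The key property is the ordering: the addendum is appended after the core identity and safety layers, so a mode can shape tone but can never override the rules above it.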

What's coming next

Ophraxx Lite • OX-B12 is the most immediate priority — bringing the reasoning-capable model to premium servers. We are also exploring how to expose more of the model routing logic to admins through server settings, so communities with specific use cases, like a coding-focused server, can bias the router toward the more capable model by default.

Further out, Ophraxx Lite • OX-U45 represents the future of the platform — a model tier capable of handling the most demanding long-form tasks. The architecture we built, with the classifier and personality layers sitting above the model abstraction, means we can roll in new models underneath without rebuilding the product layer. That was a deliberate architectural decision from day one.