Definition

LLM (Large Language Model)

An AI model trained on large text datasets, used as the reasoning engine in AI voice agents.

A Large Language Model (LLM) is an AI model trained on massive text datasets to predict and generate natural language. LLMs like GPT-4o, Claude, and Gemini are the reasoning engine at the heart of modern AI voice agents — they interpret caller intent, generate responses, decide what actions to take, and maintain conversation context across multiple turns.

How LLMs work in voice agents

  1. The caller's speech is transcribed to text by the STT engine
  2. The text is sent to the LLM along with a system prompt defining the agent's persona, knowledge base, and allowed actions
  3. The LLM generates a text response and, if appropriate, a structured action (book appointment, look up order, transfer call)
  4. The response is sent to the TTS engine to produce audio

System prompts are the instructions that define how the LLM behaves during a call. They typically include the agent's name and personality, the business's policies and FAQs, instructions for specific scenarios (angry caller, HIPAA-sensitive topics), and the available integrations and actions.

Token context window: LLMs process text in chunks called tokens. A longer context window means the agent can remember more of the conversation history and respond consistently across a long call.

Grounding: For domain-specific knowledge (product catalog, appointment availability, order status), LLMs are grounded with real-time data retrieved from external systems via function calls or RAG (retrieval-augmented generation).