The LLMEngine is the core component of the LetheAISharp library, providing a high-level interface for communicating with Large Language Models (LLMs). It acts as a middleware layer that handles connections to various backends, manages chat history and personas, and offers both simple query methods and full conversation management.
🚀 New to LetheAISharp? Start with the Quick Start Guide for a 5-minute setup!
- Backend Compatibility: Supports the KoboldCpp API and OpenAI-compatible APIs, and also ships with its own internal backend
- Simple Queries: Direct text queries with streaming and non-streaming options
- Full Communication: Complete chat system with personas, history, and context
- Persona Management: Bot and user personas with customizable attributes
- Chat History: Automatic session management and message logging
- RAG Integration: Retrieval-Augmented Generation support
- Event System: Real-time streaming and status updates
- Quick Start Guide - Get running in 5 minutes!
- Basic Setup and Initialization
- Settings Configuration
- Author Roles and Message Types
- Simple Queries
- Full Communication Mode
- Personas and Chat Management
- Instruction Format (LLMEngine.Instruct)
- Sampling Settings (LLMEngine.Sampler)
- System Prompt (LLMEngine.SystemPrompt)
- Macros System
- Events and Streaming
- Advanced Features
- Examples
The first step is to configure and connect to your LLM backend:
using LetheAISharp.LLM;
using LetheAISharp.Files;
// Setup connection to KoboldCpp (recommended)
LLMEngine.Setup("http://localhost:5001", BackendAPI.KoboldAPI);
// Or setup connection to OpenAI-compatible API
LLMEngine.Setup("http://localhost:1234", BackendAPI.OpenAI, "your-api-key");
// Or, for a local GGUF file, use the internal LlamaSharp backend (your app is then 100% self-contained and needs no external program)
LLMEngine.Setup("C:\\Path\\To\\mymodel.gguf", BackendAPI.LlamaSharp);

After setting up the connection parameters, establish the connection:
// Connect and retrieve model information
await LLMEngine.Connect();
// Check if connection was successful
if (LLMEngine.Status == SystemStatus.Ready)
{
Console.WriteLine($"Connected to: {LLMEngine.CurrentModel}");
Console.WriteLine($"Backend: {LLMEngine.Backend}");
Console.WriteLine($"Max Context: {LLMEngine.MaxContextLength} tokens");
}

You can monitor the engine status:
// Subscribe to status changes
LLMEngine.OnStatusChanged += (sender, status) =>
{
Console.WriteLine($"LLMEngine status changed to: {status}");
};
// Check current status
switch (LLMEngine.Status)
{
case SystemStatus.NotInit:
Console.WriteLine("Engine not initialized");
break;
case SystemStatus.Ready:
Console.WriteLine("Engine ready for queries");
break;
case SystemStatus.Busy:
Console.WriteLine("Engine is processing a request");
break;
}

The LLMEngine.Settings object provides extensive configuration options:
// Access current settings
var settings = LLMEngine.Settings;
// Basic backend settings
settings.BackendUrl = "http://localhost:5001";
settings.BackendAPI = BackendAPI.KoboldAPI;
settings.OpenAIKey = "your-api-key";
// Generation settings
settings.MaxReplyLength = 512;
settings.StopGenerationOnFirstParagraph = false;
// RAG and memory settings
settings.AllowWorldInfo = true;
settings.RAGMaxEntries = 3;
settings.RAGIndex = 3;
// Web search settings
settings.WebSearchAPI = BackendSearchAPI.DuckDuckGo;
settings.WebSearchDetailedResults = true;
// Save settings to file
settings.SaveToFile("path/to/settings.json");
// Load settings from file
settings = LLMSettings.LoadFromFile("path/to/settings.json");
LLMEngine.Settings = settings;

LetheAISharp uses specific author roles to distinguish between different types of messages. Understanding these roles is crucial for proper prompt construction:
public enum AuthorRole
{
System, // System messages within conversation
User, // User/human messages
Assistant, // AI assistant responses
Unknown, // Unknown/unspecified role
Tool // Used for tool calls in agent mode, not actual messages (internal)
}

SingleMessage is the primary way messages are passed throughout the library — to SendMessageToBot, LogMessage, prompt builders, StartGeneration, and more.
Constructors:
// Full constructor — specify all metadata explicitly
new SingleMessage(AuthorRole role, DateTime date, string mess, string charID, string userID,
bool hidden = false, string imagePath = "")
// Convenience constructor — auto-fills DateTime.Now and current Bot/User identifiers
new SingleMessage(AuthorRole role, string mess, string img = "")

Properties:
| Property | Type | Description |
|---|---|---|
| Guid | Guid | Unique identifier for the message |
| Role | AuthorRole | The author role (User, Assistant, etc.) |
| Message | string | The message text |
| Date | DateTime | Timestamp of the message |
| CharID | string | Character ID of the bot persona |
| UserID | string | ID of the user persona |
| ImagePath | string | Path to an attached image (VLM) |
| Hidden | bool | Whether the message is hidden from standard views |
| Note | string | Optional annotation for the message |
| User | BasePersona | Resolved user persona from LLMEngine.LoadedPersonas |
| Bot | BasePersona | Resolved bot persona from LLMEngine.LoadedPersonas |
| Sender | BasePersona? | The sending persona (User or Bot based on Role) |
Methods: ToTextCompletion(), ToChatCompletion()
Example usage:
// Simple message using convenience constructor
var msg = new SingleMessage(AuthorRole.User, "Hello!");
// Message with an attached image
var imgMsg = new SingleMessage(AuthorRole.User, "What is in this image?", "path/to/image.png");
// Full constructor for a hidden system note with explicit metadata
var hidden = new SingleMessage(AuthorRole.System, DateTime.Now, "Internal note", botId, userId, hidden: true);

For basic text generation without conversation management, use the simple query methods. These methods require using the IPromptBuilder interface to construct backend-appropriate prompts.
// Get a prompt builder for the current backend
var builder = LLMEngine.GetPromptBuilder();
// Add your prompt content
builder.AddMessage(AuthorRole.User, "What is the capital of France?");
// Convert to query and execute
var query = builder.PromptToQuery(AuthorRole.Assistant);
var response = await LLMEngine.SimpleQuery(query);
Console.WriteLine($"Response: {response}");

For real-time text generation with streaming:
// Subscribe to streaming events
LLMEngine.OnInferenceStreamed += (sender, token) =>
{
Console.Write(token); // Print each token as it arrives
};
LLMEngine.OnInferenceEnded += (sender, fullResponse) =>
{
Console.WriteLine($"\nComplete response: {fullResponse}");
};
// Build the prompt
var builder = LLMEngine.GetPromptBuilder();
builder.AddMessage(AuthorRole.User, "Write a short story about a robot.");
// Convert to query and start streaming
var query = builder.PromptToQuery(AuthorRole.Assistant);
await LLMEngine.SimpleQueryStreaming(query);

The PromptBuilder allows you to create complex, multi-role conversations:
// Example with system prompt and user message
var builder = LLMEngine.GetPromptBuilder();
// Add messages with different roles
builder.AddMessage(AuthorRole.System, "You are a helpful assistant.");
builder.AddMessage(AuthorRole.User, "Explain quantum physics in simple terms.");
// Convert to query and execute
var query = builder.PromptToQuery(AuthorRole.Assistant);
var response = await LLMEngine.SimpleQuery(query);

Both AddMessage and InsertMessage also accept a SingleMessage directly, enabling richer metadata such as image paths, custom char/user IDs, timestamps, and hidden flags:
// Add a SingleMessage (preserves all metadata)
builder.AddMessage(new SingleMessage(AuthorRole.User, "Explain quantum physics."));
// Insert a SingleMessage at a specific index
builder.InsertMessage(0, new SingleMessage(AuthorRole.System, "You are a helpful assistant."));

The full communication mode provides complete conversation management with personas, chat history, and context awareness.
// Create a bot persona that can search information about the current chat topic while the user is AFK.
var bot = new BasePersona
{
Name = "Alice",
Bio = "A knowledgeable AI assistant with expertise in science and technology.",
IsUser = false,
Scenario = "You are Alice, a helpful AI assistant in a research lab.",
FirstMessage = new List<string> { "Hello! I'm Alice, how can I help you today?" },
AgentMode = true,
AgentTasks = [ "ActiveResearchTask" ],
SenseOfTime = true,
DatesInSessionSummaries = true
};
// Create a user persona
var user = new BasePersona
{
Name = "John",
Bio = "A curious researcher working on AI projects.",
IsUser = true
};
// Set the personas
LLMEngine.Bot = bot;
LLMEngine.User = user;

// Send a user message and get bot response
await LLMEngine.SendMessageToBot(new SingleMessage(AuthorRole.User, "What is machine learning?"));
// The response will be streamed through events
LLMEngine.OnInferenceStreamed += (sender, token) =>
{
Console.Write(token);
};
LLMEngine.OnInferenceEnded += (sender, response) =>
{
Console.WriteLine($"\nBot response complete: {response}");
// The response is automatically logged to chat history
// Access it via LLMEngine.History
};

// Access the chat history
var history = LLMEngine.History;
// Get the last message
var lastMessage = history.LastMessage();
Console.WriteLine($"Last message: {lastMessage?.Message}");
// Get all messages in current session
var messages = history.CurrentSession.Messages;
foreach (var msg in messages)
{
Console.WriteLine($"{msg.Role}: {msg.Message}");
}
// Clear full chat history (you probably don't want to do this often, the library is meant to use chat sessions instead)
history.Clear();
// Start a new session (archiving the previous one if it has content)
await history.StartNewChatSession();
// Save history to file, done automatically via LLMEngine.Bot.EndChat()
history.SaveToFile("path/to/chat.json");
// Load history from file, done automatically via LLMEngine.Bot.BeginChat()
var loadedHistory = Chatlog.LoadFromFile("path/to/chat.json");

// Let the bot generate a message based on current context
await LLMEngine.AddBotMessage();
// Reroll the last bot response
await LLMEngine.RerollLastMessage();
// Have the bot impersonate the user
await LLMEngine.ImpersonateUser();

var persona = new BasePersona
{
Name = "Dr. Sarah",
Bio = "A medical researcher specializing in genetics",
Scenario = "You are Dr. Sarah, working in a cutting-edge genetics lab.",
// Multiple possible first messages (one chosen randomly)
FirstMessage = new List<string>
{
"Welcome to the genetics lab! How can I assist you today?",
"Hello! I'm Dr. Sarah. What would you like to know about genetics?",
"Greetings! Ready to explore the world of genetic research?"
},
// Example dialog style
ExampleDialogs = new List<string>
{
"Dr. Sarah: *adjusts lab coat* Let me explain this concept clearly...",
"Dr. Sarah: That's a fascinating question about DNA structure!",
"Dr. Sarah: *points to molecular diagram* As you can see here..."
},
// Enable agent mode for autonomous behavior
AgentMode = true,
AgentTasks = new List<string> { "ActiveResearchTask" },
// Plugin integration
Plugins = new List<string> { "WebSearchPlugin", "MemoryPlugin" }
};
// Load the persona
LLMEngine.Bot = persona; // Initializes plugins and loads context; this *must* be done before chatting with a given persona

// Start a new chat session; this archives the previous chat and prepares it for RAG retrieval
await LLMEngine.History.StartNewChatSession();
// Add a welcome message
var welcomeMessage = LLMEngine.Bot.GetWelcomeLine(LLMEngine.User.Name);
if (!string.IsNullOrEmpty(welcomeMessage))
{
LLMEngine.History.LogMessage(AuthorRole.Assistant, welcomeMessage,
LLMEngine.User, LLMEngine.Bot);
}
// Alternatively, log a pre-constructed SingleMessage directly
// LLMEngine.History.LogMessage(new SingleMessage(AuthorRole.Assistant, welcomeMessage));
// Check session statistics
var session = LLMEngine.History.CurrentSession;
Console.WriteLine($"Session started: {session.StartTime}");
Console.WriteLine($"Message count: {session.Messages.Count}");

The LLMEngine.Instruct property controls how messages are formatted for the underlying language model. This is crucial for proper model communication, especially with text completion backends like KoboldAPI.
Different models expect different formatting. For example:
- ChatML: `<|im_start|>user\nHello<|im_end|>`
- Alpaca: `### Instruction:\nHello\n### Response:`
- Vicuna: `USER: Hello\nASSISTANT:`
// Access current instruction format
var instruct = LLMEngine.Instruct;
// Key properties for message formatting
Console.WriteLine($"User start: '{instruct.UserStart}'");
Console.WriteLine($"User end: '{instruct.UserEnd}'");
Console.WriteLine($"Bot start: '{instruct.BotStart}'");
Console.WriteLine($"Bot end: '{instruct.BotEnd}'");
Console.WriteLine($"System start: '{instruct.SystemStart}'");
Console.WriteLine($"System end: '{instruct.SystemEnd}'");

// Configure for ChatML format
LLMEngine.Instruct.SystemStart = "<|im_start|>system\n";
LLMEngine.Instruct.SystemEnd = "<|im_end|>\n";
LLMEngine.Instruct.UserStart = "<|im_start|>user\n";
LLMEngine.Instruct.UserEnd = "<|im_end|>\n";
LLMEngine.Instruct.BotStart = "<|im_start|>assistant\n";
LLMEngine.Instruct.BotEnd = "<|im_end|>\n";

Other useful properties:
- NewLinesBetweenMessages: Add newlines between message blocks
- ThinkingStart/End: For chain-of-thought models
- StopSequence: When to stop generation
Note: OpenAI-compatible backends handle formatting internally, while KoboldAPI backends use these settings to properly format prompts.
For detailed instruction format documentation, see InstructFormat.md.
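To make the distinction concrete, here is a standalone sketch (not the library's actual prompt builder) of the kind of string a text-completion backend assembles from the ChatML-style tags configured above:

```csharp
using System;
using System.Text;

public static class InstructDemo
{
    // Mimics what a text-completion backend does with instruct tags:
    // wrap each message in its role markers, then open the assistant
    // block so the model continues generating from there.
    public static string FormatChatML(params (string Role, string Text)[] messages)
    {
        var sb = new StringBuilder();
        foreach (var (role, text) in messages)
            sb.Append($"<|im_start|>{role}\n{text}<|im_end|>\n");
        sb.Append("<|im_start|>assistant\n");
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(FormatChatML(
            ("system", "You are a helpful assistant."),
            ("user", "Hello")));
    }
}
```

An OpenAI-compatible backend receives the role/message pairs directly and applies its own template server-side, which is why the instruct tags only matter for text-completion backends.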
The LLMEngine.Sampler controls how the model generates text, affecting creativity, randomness, and quality of responses.
// Access current sampler settings
var sampler = LLMEngine.Sampler;
// Common settings
sampler.Temperature = 0.7; // Creativity (0.0-2.0, higher = more creative)
sampler.Top_p = 0.9; // Nucleus sampling (0.0-1.0)
sampler.Top_k = 40; // Top-k sampling (0 = disabled)
sampler.Max_length = 512; // Maximum response length in tokens
sampler.Rep_pen = 1.1; // Repetition penalty (1.0 = no penalty)
// Advanced settings
sampler.Min_p = 0.05; // Minimum probability threshold
sampler.Typical = 1.0; // Typical sampling
sampler.Tfs = 1.0; // Tail-free sampling

// Creative writing
sampler.Temperature = 1.2;
sampler.Top_p = 0.95;
// Factual/precise responses
sampler.Temperature = 0.3;
sampler.Top_p = 0.85;
// Balanced conversation
sampler.Temperature = 0.7;
sampler.Top_p = 0.9;

The sampler settings are automatically applied to all generation methods (SimpleQuery, full communication mode, etc.).
The LLMEngine.SystemPrompt is only relevant for full communication mode and defines the structure and content of the system prompt that's sent to the model.
// Access system prompt settings
var sysPrompt = LLMEngine.SystemPrompt;
// Configure the main prompt template
sysPrompt.Prompt = @"You are {{char}}, interacting with {{user}}.
# Character Information
Name: {{char}}
Bio: {{charbio}}
# User Information
Name: {{user}}
Bio: {{userbio}}
# Instructions
Stay in character and respond naturally to {{user}}.";
// Configure section titles
sysPrompt.ScenarioTitle = "# Current Scenario";
sysPrompt.DialogsTitle = "# Character's Writing Style";
sysPrompt.WorldInfoTitle = "# Important Context";

The system prompt automatically integrates:
- Character bio and personality from LLMEngine.Bot
- Scenario information from the loaded persona
- Example dialogs to guide response style
- RAG/WorldInfo retrieved content
- Previous session summaries when enabled
The system prompt fully supports the macro system (see Macros System) for dynamic content replacement.
LetheAISharp includes a powerful macro system that allows dynamic text replacement throughout the library. Macros can be used in system prompts, personas, and even simple queries.
- {{char}}: Current bot character's name
- {{charbio}}: Current bot character's biography
- {{mchar}}: In groups this points to the main character; elsewhere it's the same as {{char}}
- {{mcharbio}}: In groups this points to the main character's biography; elsewhere it's the same as {{charbio}}
- {{user}}: Current user's name
- {{userbio}}: Current user's biography
- {{examples}}: Character's example dialogs
- {{scenario}}: Current scenario description
- {{selfedit}}: Character's self-editable field
- {{group}}: Formatted list of all group members (names and bios); if not in a group, it covers just the current bot
- {{memory:<title>}}: Retrieves a specific entry by title from the bot's memory banks
- {{time}}: Current time (e.g., "02:30 PM")
- {{date}}: Current date in human-readable format
- {{day}}: Current day of the week (e.g., "Monday")
LLMEngine.SystemPrompt.Prompt = @"You are {{char}}, a {{charbio}}.
Today is {{day}}, {{date}} at {{time}}.
You are talking with {{user}}.
{{scenario}}";

var builder = LLMEngine.GetPromptBuilder();
builder.AddMessage(AuthorRole.System, "You are {{char}}, interacting with {{user}} on {{day}}.");
builder.AddMessage(AuthorRole.User, "Hello {{char}}!");
var query = builder.PromptToQuery(AuthorRole.Assistant);
var response = await LLMEngine.SimpleQuery(query);

var persona = new BasePersona
{
Name = "Alice",
Bio = "I am {{char}}, a helpful assistant created to help {{user}}.",
Scenario = "{{char}} and {{user}} are working together on {{day}} morning.",
FirstMessage = new() { "Good morning {{user}}! It's {{time}} on {{day}}. How can I help?" }
};

Macros are automatically processed in:
- Full communication mode: All system prompts, character bios, scenarios
- Simple queries: When using character context
- Chat history: Welcome messages and dialog examples
- RAG content: Retrieved information with character context
You can also manually process macros:
// Process macros for current bot and user
string processedText = LLMEngine.Bot.ReplaceMacros("Hello {{user}}, I'm {{char}}!");
// Process macros with specific user
string customText = LLMEngine.Bot.ReplaceMacros("{{user}} is talking to {{char}} at {{time}}", specificUser);

This macro system ensures dynamic, contextual content that adapts to your current characters, time, and conversation state.
The LLMEngine provides several events for real-time updates:
The new channel-aware events provide richer information and cleanly separate different types of inference content:
// Channel-aware streaming — receives typed segments (Text, Thinking, ToolCall, etc.)
LLMEngine.OnInferenceSegment += (sender, segment) =>
{
switch (segment.Channel)
{
case InferenceChannel.Text:
Console.Write(segment.Text); // Normal visible response text
break;
case InferenceChannel.Thinking:
// Chain-of-thought / thinking block content (hidden from users)
break;
case InferenceChannel.ToolCall when segment.IsComplete:
Console.WriteLine("LLM requested a tool call");
break;
}
};
// Structured completion — receives the full result with separated channels
LLMEngine.OnInferenceCompleted += (sender, result) =>
{
Console.WriteLine($"\nResponse: {result.Response}");
if (result.ThinkingContent != null)
Console.WriteLine($"[Thinking: {result.ThinkingContent}]");
Console.WriteLine($"Finish reason: {result.FinishReason}");
// Log only the visible response to history
LLMEngine.History.LogMessage(AuthorRole.Assistant, result.Response,
LLMEngine.User, LLMEngine.Bot);
};

The following events are still supported for backward compatibility but are deprecated. Migrate to the structured events above for new code.
// [Obsolete] Streaming text generation — fires for every token
LLMEngine.OnInferenceStreamed += (sender, token) =>
{
Console.Write(token); // Real-time text output
};
// [Obsolete] Generation completion — returns the full raw response string
LLMEngine.OnInferenceEnded += (sender, fullResponse) =>
{
Console.WriteLine($"\nGeneration complete: {fullResponse}");
// Log the response to history if needed
LLMEngine.History.LogMessage(AuthorRole.Assistant, fullResponse,
LLMEngine.User, LLMEngine.Bot);
};

// Full prompt generation
LLMEngine.OnFullPromptReady += (sender, prompt) =>
{
Console.WriteLine($"Generated prompt: {prompt}");
};
// Quick inference (non-streaming) completion
LLMEngine.OnQuickInferenceEnded += (sender, response) =>
{
Console.WriteLine($"Quick inference result: {response}");
};
// Status changes
LLMEngine.OnStatusChanged += (sender, status) =>
{
Console.WriteLine($"Status: {status}");
};
// Bot persona changes
LLMEngine.OnBotChanged += (sender, newBot) =>
{
Console.WriteLine($"Bot changed to: {newBot.Name}");
};

RAG automatically retrieves summaries of previous chat sessions with the current bot based on the conversation, and inserts them into the prompt at the specified index. This is extremely effective for maintaining context in long-term conversations. The same mechanism powers other advanced features such as WorldInfo and the agent mode's tasks.
// Enable RAG in settings
LLMEngine.Settings.AllowWorldInfo = true;
LLMEngine.Settings.RAGMaxEntries = 5;
LLMEngine.Settings.RAGIndex = 3; // Insert at message index 3

Web search is automatically triggered by some agent tasks when the bot needs additional information from the web.
// Configure web search
LLMEngine.Settings.WebSearchAPI = BackendSearchAPI.DuckDuckGo;
LLMEngine.Settings.WebSearchDetailedResults = true;

// Count tokens in text
var tokenCount = LLMEngine.GetTokenCount("Hello, world!");
Console.WriteLine($"Token count: {tokenCount}");
// Check context limits
var maxTokens = LLMEngine.MaxContextLength;
var currentUsage = LLMEngine.History.GetCurrentTokenUsage();
var remaining = maxTokens - currentUsage;
Console.WriteLine($"Tokens remaining: {remaining}");

One of the most powerful ways to use a model is through structured output. This forces the LLM to respond in a specific format, which can then be parsed easily by code. This is especially useful for applications like Q&A bots, data extraction, or any scenario where you want the model to return data in a predictable structure.
// Define a class for structured output (keep it simple; see the official Kobold and OpenAI docs)
public class StructuredAnswer
{
public string answer { get; set; }
public int confidence { get; set; }
public List<string> sources { get; set; }
}

if (LLMEngine.SupportsSchema)
{
var structuredAnswer = new StructuredAnswer();
// Build a basic prompt (this is obviously a silly example):
var builder = LLMEngine.GetPromptBuilder();
builder.AddMessage(AuthorRole.System, "You are a useful bot, you respond in JSON.");
builder.AddMessage(AuthorRole.User, "Hello, what is the weather like today?");
// Force the bot to respond in the specified structured format
await builder.SetStructuredOutput(structuredAnswer);
// Convert to query
var query = builder.PromptToQuery(AuthorRole.Assistant);
// Await the response
var response = await LLMEngine.SimpleQuery(query).ConfigureAwait(false);
try
{
// Parse the structured response
structuredAnswer = JsonConvert.DeserializeObject<StructuredAnswer>(response);
}
catch
{
// Something went wrong. LLMs can be finicky and may, on extremely rare occasions,
// fail to respond in the correct format, so this should be handled properly.
Console.WriteLine("Failed to parse structured response.");
}
// Use the result
...
}

The types supported for structured output depend on the backend, but generally speaking, safe types are:
- bool
- int, float, double
- string
- List<T> where T is a safe type
Beyond that, you'll need to experiment and consult the backend documentation.
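As an illustration, here is a hypothetical schema class (WeatherReport is an invented name, not part of the library) that sticks to those safe types, with a System.Text.Json round-trip showing that a response shaped this way parses cleanly:

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical schema restricted to the "safe" types listed above
public class WeatherReport
{
    public string summary { get; set; }
    public double temperature { get; set; }
    public bool precipitation { get; set; }
    public List<string> warnings { get; set; }
}

public static class SchemaDemo
{
    public static void Main()
    {
        // A response shaped like what the LLM would return under this schema
        var json = "{\"summary\":\"sunny\",\"temperature\":21.5,\"precipitation\":false,\"warnings\":[]}";
        var report = JsonSerializer.Deserialize<WeatherReport>(json);
        Console.WriteLine(report.summary); // sunny
    }
}
```

Keeping the class flat like this, with no nested custom objects or dictionaries, stays well within what both backends' schema support can express.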
using LetheAISharp.LLM;
using LetheAISharp.Files;
class SimpleBot
{
public static async Task Main()
{
// Setup and connect
LLMEngine.Setup("http://localhost:5001", BackendAPI.KoboldAPI);
await LLMEngine.Connect();
// Configure streaming
LLMEngine.OnInferenceStreamed += (_, token) => Console.Write(token);
LLMEngine.OnInferenceEnded += (_, response) => Console.WriteLine("\n");
// Interactive loop (this has no brain, it'll just recall the system prompt and user input)
while (true)
{
Console.Write("You: ");
var input = Console.ReadLine();
if (input == "quit") break;
Console.Write("Bot: ");
// Build prompt properly
var builder = LLMEngine.GetPromptBuilder();
builder.AddMessage(AuthorRole.System, "You are a helpful assistant.");
builder.AddMessage(AuthorRole.User, input);
var query = builder.PromptToQuery(AuthorRole.Assistant);
await LLMEngine.SimpleQueryStreaming(query);
}
}
}

using LetheAISharp.LLM;
using LetheAISharp.Files;
class CharacterChat
{
public static async Task Main()
{
// Setup
LLMEngine.Setup("http://localhost:5001", BackendAPI.KoboldAPI);
await LLMEngine.Connect();
// Create personas
var bot = new BasePersona
{
Name = "Einstein",
Bio = "Albert Einstein, the famous physicist",
Scenario = "You are Albert Einstein. Speak with wisdom and curiosity about science.",
FirstMessage = new() { "Guten Tag! I am Albert Einstein. What scientific mysteries shall we explore today?" }
};
var user = new BasePersona
{
Name = "Student",
IsUser = true
};
// Setup events
LLMEngine.OnInferenceStreamed += (_, token) => Console.Write(token);
LLMEngine.OnInferenceEnded += (_, response) =>
{
Console.WriteLine("\n");
LLMEngine.History.LogMessage(AuthorRole.Assistant, response, user, bot);
};
LLMEngine.Bot = bot; // Initialize plugins and context, and load history if exists
LLMEngine.User = user;
// Start conversation
var welcome = bot.GetWelcomeLine(user.Name);
Console.WriteLine($"Einstein: {welcome}");
LLMEngine.History.LogMessage(AuthorRole.Assistant, welcome, user, bot);
// Chat loop
while (true)
{
Console.Write("Student: ");
var input = Console.ReadLine();
if (input == "quit") break;
Console.Write("Einstein: ");
await LLMEngine.SendMessageToBot(new SingleMessage(AuthorRole.User, input));
}
// Save history and exit program
LLMEngine.Bot.EndChat();
}
}

- Check backend connectivity before making queries
- Monitor token usage to avoid context overflow in simple query mode (full communication mode handles that automatically)
- Save chat history regularly to preserve conversations
- Use personas to create more engaging and consistent character behavior
- Leverage events for real-time UI updates
- Configure settings appropriately for your use case
- Connection Failed: Ensure the backend URL is correct and the service is running
- Empty Responses: Check if the model is loaded correctly in your backend
- Token Overflow: Monitor context length and implement history trimming
- Slow Responses: Consider using streaming for better UX, and make sure you're running a model that fits on your machine
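For the token-overflow case, one approach is to rotate to a new chat session before the context window fills. The threshold helper below is a sketch (ContextGuard is not a library type); the commented wiring uses the LLMEngine APIs shown earlier in this document:

```csharp
using System;

public static class ContextGuard
{
    // Returns true once usage crosses the given fraction of the context window
    public static bool ShouldRotate(int currentUsage, int maxContext, double threshold = 0.9)
        => currentUsage >= (int)(maxContext * threshold);
}

// Wiring sketch (session summaries remain reachable through RAG after rotation):
//
// var usage = LLMEngine.History.GetCurrentTokenUsage();
// if (ContextGuard.ShouldRotate(usage, LLMEngine.MaxContextLength))
// {
//     await LLMEngine.History.StartNewChatSession();
// }
```

Rotating a session rather than trimming messages preserves the old conversation for RAG retrieval instead of discarding it.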
// Check engine status
Console.WriteLine($"Status: {LLMEngine.Status}");
Console.WriteLine($"Backend: {LLMEngine.Backend}");
Console.WriteLine($"Model: {LLMEngine.CurrentModel}");
Console.WriteLine($"Max Context: {LLMEngine.MaxContextLength}");
// Check current token usage
var usage = LLMEngine.History.GetCurrentTokenUsage();
Console.WriteLine($"Current token usage: {usage}");
// Verify backend connection
var isConnected = await LLMEngine.CheckBackend();
Console.WriteLine($"Backend connected: {isConnected}");

This documentation provides a comprehensive guide to using the LLMEngine. For more advanced features refer to:
- Agent Tasks: See AGENTS.md for the autonomous agent system
- RAG and Memory: See MEMORY.md for memory management and retrieval
- Personas: See PERSONAS.md for details about setting up (and overriding) the BasePersona class