A minimal transformer-based chatbot built from scratch using only NumPy. This guide covers installation, training, and usage.
- Python 3.6+
- NumPy library
# Install dependencies from requirements.txt
pip install -r requirements.txt
# Or install manually
pip install numpy
# Make scripts executable (optional)
chmod +x train.py launch.py

# Train with default output path (chatbot_model.pkl)
python train.py
# Or specify a custom output path
python train.py my_model.pkl

# Launch with default model (chatbot_model.pkl in current directory)
python launch.py
# Or specify a custom model path
python launch.py my_model.pkl

Once you launch with python launch.py, you'll see:
==================================================
Chatbot Ready! (type 'quit' to exit)
==================================================
You:
Type your messages and press Enter. The bot will generate responses based on the trained model.
Available commands:
- quit - Exit the chatbot
- exit - Exit the chatbot
- q - Exit the chatbot
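Under the hood, the chat loop in launch.py behaves roughly like the following sketch (it assumes the TinyChatbot.load and generate API shown in the Python usage section below; the actual script may differ in details):

```python
from tiny_chatbot import TinyChatbot

# Load the trained model (the path may also be given on the command line)
model = TinyChatbot.load('chatbot_model.pkl')

print("Chatbot Ready! (type 'quit' to exit)")
while True:
    user_input = input("You: ")
    # Any of the exit commands ends the session
    if user_input.strip().lower() in ('quit', 'exit', 'q'):
        break
    # Generate a reply conditioned on the user's message
    reply = model.generate(user_input + " ", max_new_tokens=40, temperature=0.8)
    print("Bot:", reply)
```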
from tiny_chatbot import TinyChatbot
# Load saved model
model = TinyChatbot.load('chatbot_model.pkl')
# Generate response
response = model.generate("hello ", max_new_tokens=40, temperature=0.8)
print(response)

from tiny_chatbot import TinyChatbot, train
# Create model
model = TinyChatbot(
vocab_size=128,
embed_dim=64,
num_heads=4,
ff_dim=128,
num_layers=2,
max_len=64
)
# Prepare your data (list of token sequences)
data = [
[ord(c) for c in "hello Hi there!"],
[ord(c) for c in "how are you I'm great!"],
# Add more conversations...
]
# Train
train(model, data, epochs=100, batch_size=4)
# Save
model.save('my_model.pkl')

# Temperature controls randomness (0.1 = conservative, 1.5 = creative)
response = model.generate(
prompt="hello",
max_new_tokens=50, # Maximum tokens to generate
temperature=0.7 # Sampling temperature
)

Edit hyperparameters at the top of tiny_chatbot.py:
VOCAB_SIZE = 128 # ASCII character set
EMBED_DIM = 64 # Embedding dimension
NUM_HEADS = 4 # Number of attention heads
FF_DIM = 128 # Feed-forward dimension
NUM_LAYERS = 2 # Number of transformer layers
MAX_LEN = 64 # Maximum sequence length
LEARNING_RATE = 0.001 # Learning rate (not used in current version)

Recommended configurations:
| Use Case | EMBED_DIM | NUM_HEADS | NUM_LAYERS | Notes |
|---|---|---|---|---|
| Tiny (demo) | 32 | 2 | 1 | Very fast, limited capability |
| Small | 64 | 4 | 2 | Default, good for testing |
| Medium | 128 | 8 | 4 | Better quality, slower |
| Large | 256 | 8 | 6 | Best quality, much slower |
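For example, switching to the Medium row of the table means editing the constants at the top of tiny_chatbot.py roughly as follows. FF_DIM is not specified by the table, so scaling it with EMBED_DIM is an assumption here:

```python
# Medium configuration from the table above (better quality, slower)
EMBED_DIM = 128    # Embedding dimension
NUM_HEADS = 8      # Number of attention heads
NUM_LAYERS = 4     # Number of transformer layers
FF_DIM = 256       # Assumption: scaled with EMBED_DIM; not given in the table
# VOCAB_SIZE and MAX_LEN keep their defaults (128 and 64)
```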
The model implements a simplified GPT-style transformer:
Input Text
↓
Token Embedding + Positional Embedding
↓
Transformer Block 1
├─ Multi-Head Attention
├─ Layer Normalization
├─ Feed-Forward Network
└─ Layer Normalization
↓
Transformer Block 2
└─ (same structure)
↓
Output Linear Layer
↓
Generated Text
- Multi-Head Attention: Allows the model to focus on different parts of the input
- Feed-Forward Networks: Process attention outputs
- Layer Normalization: Stabilizes training
- Positional Embeddings: Encodes token positions
- Causal Masking: Ensures autoregressive generation by letting each position attend only to earlier tokens (see the sketch below)
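To make the attention and causal-masking ideas concrete, here is a minimal NumPy sketch of single-head scaled dot-product attention with a causal mask. This is a conceptual illustration only; the function and variable names are not taken from tiny_chatbot.py:

```python
import numpy as np

def causal_self_attention(x, W_q, W_k, W_v):
    """Single-head scaled dot-product attention with a causal mask.

    x: (seq_len, embed_dim) token embeddings
    W_q, W_k, W_v: (embed_dim, head_dim) projection matrices
    """
    q, k, v = x @ W_q, x @ W_k, x @ W_v            # project to queries/keys/values
    scores = q @ k.T / np.sqrt(k.shape[-1])        # (seq_len, seq_len) similarity scores
    # Causal mask: position i may only attend to positions <= i
    mask = np.triu(np.ones(scores.shape, dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)
    # Softmax over the key dimension
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v                              # weighted sum of values

# Tiny usage example with random weights
rng = np.random.default_rng(0)
x = rng.normal(size=(5, 64))                        # 5 tokens, embed_dim 64
W = [rng.normal(size=(64, 16)) for _ in range(3)]   # head_dim 16
print(causal_self_attention(x, *W).shape)           # (5, 16)
```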
The default training includes 10 conversation pairs:
conversations = [
("hello", "Hi! How can I help you today?"),
("hi", "Hello! What's on your mind?"),
("how are you", "I'm doing well, thanks for asking!"),
# ...
]

To add your own:
- Edit the prepare_data() function in tiny_chatbot.py
- Add conversation tuples: (user_input, bot_response) (encoding sketched below)
- Re-run the script to train with new data
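Character-level encoding of the pairs follows the pattern from the training example above, roughly like this (a sketch; the exact formatting inside prepare_data() may differ, and the second pair is just an example of one you might add):

```python
conversations = [
    ("hello", "Hi! How can I help you today?"),
    ("what's your name", "I'm a tiny NumPy chatbot!"),  # hypothetical added pair
]

# Concatenate each (user, bot) pair into one string and encode it
# character by character as ASCII codes, as in the earlier data example
data = [[ord(c) for c in f"{user} {bot}"] for user, bot in conversations]
```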
Training output shows progress:
Training for 50 epochs...
Epoch 10/50, Loss: -0.11
Epoch 20/50, Loss: -0.11
Epoch 30/50, Loss: -0.11
Epoch 40/50, Loss: -0.11
Epoch 50/50, Loss: -0.11
Training complete!
Note: The loss is a simplified approximation. Real implementations use proper cross-entropy loss with backpropagation.
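For reference, the cross-entropy loss a real implementation would minimize looks roughly like this in NumPy (a conceptual sketch, not code from this repository):

```python
import numpy as np

def cross_entropy_loss(logits, targets):
    """Mean next-token cross-entropy.

    logits:  (seq_len, vocab_size) unnormalized scores for each position
    targets: (seq_len,) integer token IDs the model should predict
    """
    # Numerically stable log-softmax over the vocabulary
    shifted = logits - logits.max(axis=-1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=-1, keepdims=True))
    # Negative log-probability of each correct token, averaged over positions
    return -log_probs[np.arange(len(targets)), targets].mean()
```

True cross-entropy is always non-negative, which is one way to see that the negative values printed above are only a rough stand-in.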
This is an educational implementation demonstrating transformer concepts:
- No backpropagation: Weights aren't actually updated during training
- Small vocabulary: Only supports ASCII characters (128 tokens)
- Limited data: Only 10 training examples by default
- Simple generation: May produce random outputs without proper training
- No GPU support: Uses only NumPy (CPU-based)
If the bot's responses look random or repetitive:
- This is expected with the minimal training data
- Add more conversation pairs to prepare_data()
- Increase training epochs
- Adjust temperature (lower = more deterministic)

If training or generation is slow or runs out of memory:
- Reduce EMBED_DIM, NUM_LAYERS, or MAX_LEN
- Process smaller batches
- Limit max_new_tokens during generation
If you see an import error for NumPy:

# Make sure NumPy is installed
pip install numpy

To build a production-ready chatbot:
- Use PyTorch or TensorFlow for automatic differentiation
- Implement proper backpropagation with optimizers (Adam, SGD)
- Use larger datasets (thousands or millions of examples)
- Add tokenization (BPE, WordPiece) for better vocabulary
- Implement beam search for better generation
- Add temperature scaling and top-k/top-p sampling (see the sketch after this list)
- Use pre-trained models (GPT-2, BERT) and fine-tune
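As an illustration of the temperature scaling and top-k sampling mentioned above, a minimal NumPy version looks roughly like this (a sketch; top-p filtering and beam search follow the same pattern of reshaping the next-token distribution):

```python
import numpy as np

def sample_next_token(logits, temperature=0.7, top_k=10, rng=None):
    """Sample one token ID from next-token logits with temperature and top-k filtering."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=float) / temperature   # temperature scaling
    # Keep only the top_k highest-scoring tokens
    cutoff = np.sort(logits)[-top_k]
    logits = np.where(logits < cutoff, -np.inf, logits)
    # Softmax over the remaining candidates
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))
```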
Educational implementation - use freely for learning purposes!
Feel free to extend this implementation:
- Add proper backpropagation
- Implement different attention mechanisms
- Add more training data
- Optimize performance
- Create a web interface
Built with ❤️ using only NumPy and Python