MLX Code v6.3.0

Local AI coding assistant for macOS, powered by Apple MLX (Machine Learning eXtensions).

MLX Code runs language models directly on your Mac using Apple Silicon. No cloud inference, no subscriptions, no data leaving your machine. Integrates directly into Xcode via a Source Editor Extension.

What It Does

MLX Code is a chat-based coding assistant with tool calling and Xcode integration. You describe what you need, and the model reads files, searches code, runs commands, and builds your project — all locally.

14 built-in tools:

Tool	What it does
File Operations	Read, write, edit, list, delete files
Bash	Run shell commands
Grep	Search file contents with regex
Glob	Find files by pattern
Xcode	Build, test, clean, archive, full deploy pipeline
Git	Status, diff, commit, branch, log, push, pull
GitHub	Issues, PRs, branches, credential scanning (calls GitHub API)
Code Navigation	Jump to definitions, find symbols
Code Analysis	Metrics, dependencies, lint, symbols, full analysis
Error Diagnosis	Analyze and explain build errors
Test Generation	Create unit tests from source files
Diff Preview	Show before/after file changes
Context Analysis	Analyze project structure and dependencies
Help	List available commands and usage

Slash commands: /commit, /review, /test, /docs, /refactor, /explain, /optimize, /fix, /search, /plan, /help, /clear

How It Works

You type a message (e.g., "Find all TODO comments in the project")
The model generates a tool call: <tool>{"name": "grep", "args": {"pattern": "TODO", "path": "."}}</tool>
MLX Code executes the tool and feeds results back to the model
The model responds with findings or takes the next action

Read-only tools (grep, glob, file read, code navigation) auto-approve. Write/execute tools (bash, file write, xcode build) ask for confirmation.

What's New in v6.3.0 (March 2026)

Xcode Extension · Native Downloads · Polished UI

Xcode Source Editor Extension — MLX Code now lives inside Xcode. Select any code and invoke from the Editor > MLX Code menu:

Explain Selection — understand what code does
Refactor Selection — get an improved version
Generate Tests — write unit tests for selected code
Fix Issues — find and fix bugs
Ask MLX Code — open with code pre-loaded, ask anything

No Python required — model downloads use the native Hub Swift API. Python is fully eliminated.

Syntax highlighting — Code blocks now highlight Swift, Python, JavaScript/TypeScript, Bash, JSON, and Objective-C (keywords, types, strings, comments, numbers).

Collapsed tool calls — The raw <tool>JSON</tool> message is now a clean "🔧 Called: tool_name" chip. Expand it if you want to see the JSON.

Accurate context window bar — Token bar shows real conversation usage against the actual model context window (e.g. 32,768 for Mistral, not a hardcoded 8,192).

Resume generation — A Continue button appears on the last assistant message after stopping. Click to pick up where it left off.

Smarter tool calling — JSON auto-repair fixes common model mistakes. Malformed tool calls retry with a correction prompt rather than silently failing.

Lower default temperature — Changed from 0.7 → 0.2 to significantly reduce hallucinations in code analysis tasks.

Features

Xcode Integration

MLX Code integrates with Xcode at two levels:

Chat-based tools (built-in):

Build, test, clean, archive from chat
Full deploy pipeline: version bump, build, archive, DMG, install
Error diagnosis with context-aware analysis
Code analysis: metrics, dependencies, linting, symbol inspection

Xcode Source Editor Extension:

Select code in Xcode → Editor > MLX Code → choose a command
Five commands: Explain Selection, Refactor Selection, Generate Tests, Fix Issues, Ask MLX Code
Enable once in System Settings → Privacy & Security → Extensions → Xcode Source Editor

GitHub Integration

The GitHub tool connects to the GitHub API to:

View and create issues
List and create pull requests
Manage branches
Scan for exposed credentials before pushing

User Memories

Persistent preferences that shape assistant behavior
50+ built-in coding standards across 8 categories
Custom memories stored locally
Categories: personality, code quality, security, Xcode, git, testing, docs, deployment
User-specific settings injected at runtime — never hardcoded

Context Management

Token budgeting with automatic message compaction
Real-time context window usage bar (synced to actual model context size)
Project context auto-include when workspace is open
Two tool tiers: core (always available) and development (when project is open)

Models

MLX Code uses mlx-community models from Hugging Face, quantized for Apple Silicon.

Recommended:

Model	Size	Context	Best for
Qwen 2.5 7B (default)	~4 GB	32K	General coding, tool calling
Mistral 7B v0.3	~4 GB	32K	Versatile, good at instructions
DeepSeek Coder 6.7B	~4 GB	16K	Code-specific tasks
Qwen 2.5 14B	~8 GB	32K	Best quality (needs 16GB+ RAM)

Models download automatically on first use via the native Hub Swift API. You can also add custom models from any mlx-community repo.

Requirements

macOS 14.0 (Sonoma) or later
Apple Silicon (M1, M2, M3, M4)
8 GB RAM minimum (16 GB recommended for 7B models)
No Python required — inference and downloads are pure Swift
Xcode 15+ — only required for the Source Editor Extension feature

Installation

See INSTALLATION.md for the full setup guide, including model downloads, enabling the Xcode extension, and troubleshooting.

Quick Start

Download MLXCode-vX.Y.Z.dmg from Releases
Drag MLX Code to /Applications
Launch the app → Settings → Models → download a model
Load the model and start chatting

Architecture

MLX Code (SwiftUI macOS app)
  |
  |-- ChatViewModel         # Conversation management, tool execution loop
  |-- MLXService            # Native MLX Swift inference (mlx-swift-lm actor)
  |-- ContextManager        # Token budgeting, message compaction
  |-- ToolRegistry          # 14 registered tools (2 tiers)
  |-- SystemPrompts         # Compact prompt with tool descriptions + user memories
  |-- XcodeActionHandler    # Handles incoming requests from the Xcode extension
  |
  |-- Services/
  |   |-- GitHubService     # GitHub API: issues, PRs, branches, credential scan
  |   |-- ContextAnalysis   # Project structure and dependency analysis
  |   `-- UserMemories      # Built-in standards + custom memory persistence
  |
  |-- Views/
  |   |-- ChatView          # Main chat UI, input area, context bar
  |   |-- MessageRowView    # Per-message rendering with syntax highlighting
  |   |-- CodeBlockView     # Syntax-highlighted code blocks with copy button
  |   `-- CollapsibleToolResultView  # Collapsed tool call/result chips
  |
  |-- ViewModels/
  |   |-- ProjectViewModel  # Build operations and project management
  |   |-- GitHubViewModel   # GitHub panel state
  |   `-- CodeAnalysisViewModel  # Code metrics and analysis state
  |
  `-- MLX Code Extension/   # Xcode Source Editor Extension (5 commands)

Key design decisions:

Inference via mlx-swift-lm Swift framework — no Python required
Chat templates applied natively by the tokenizer; falls back to flat format for unsupported models
Tool prompt is ~500 tokens (not 4000) — leaves room for actual conversation
Context budget system allocates tokens: system prompt, messages, project context, output
Two tool tiers: core (always available) and development (when project is open)
User memories injected at runtime from AppSettings — no personal data in source code
Xcode extension communicates via shared App Group container + mlxcode:// URL scheme

Security

Shell Execution Safety

Command Validation: All bash commands pass through CommandValidator with regex word-boundary matching before execution, blocking dangerous patterns (rm -rf /, fork bombs, etc.)
No Shell Interpolation: Git and build tools use process.currentDirectoryURL instead of string interpolation, preventing directory traversal and injection attacks
Tool Approval Flow: Write and execute tools (bash, file write, xcode build) require user confirmation before running
Read-Only Auto-Approve: Only safe, read-only tools (grep, glob, file read) auto-approve without user interaction

Data Privacy

100% Local Inference: All model inference runs on-device via Apple MLX — no prompts or responses leave your machine
No Telemetry: No analytics, crash reporting, or usage tracking of any kind
No Cloud AI: No OpenAI, Anthropic, or other cloud AI services — the model runs on your GPU
GitHub API only: The only external network calls are to the GitHub API (via the GitHub tool), which you explicitly invoke
Local Memory Storage: User memories stored locally, never transmitted

Thread Safety

Actor isolation: MLXService is a Swift actor — all model state is automatically serialized
Streaming via AsyncStream: Token generation uses AsyncStream<Generation>, delivered to @MainActor via MainActor.run
Task Cancellation: All background loops use while !Task.isCancelled for clean shutdown

Roadmap

Deeper Xcode integration — write responses back into the editor buffer without switching apps
Structured output — grammar-constrained generation to guarantee well-formed tool calls from smaller models
Streaming download progress — real-time progress bar for model downloads

What It Doesn't Do

Being honest about limitations:

No web browsing — can't fetch arbitrary URLs or browse the internet
No image/video/audio generation — this is a code assistant, not a media tool
Small model constraints — 3-14B parameter models make mistakes, especially with complex multi-step reasoning
Tool calling is imperfect — local models sometimes format tool calls incorrectly (auto-retry helps but isn't perfect)
Extension requires app switch — the Xcode extension opens MLX Code in a separate window rather than responding inline

Version History

v6.3.0 build 7 (March 4, 2026) — Current

Xcode Source Editor Extension — 5 commands in Editor > MLX Code menu (Explain, Refactor, Generate Tests, Fix Issues, Ask). Communicates via shared App Group + mlxcode:// URL scheme
Native model downloads — replaced Python downloader with Hub.HubApi.snapshot(). Python fully eliminated
Syntax highlighting — Swift, Python, JS/TS, Bash, JSON, Objective-C in all code blocks
Collapsed tool calls — raw <tool> assistant messages show as a compact chip; expand to inspect
Accurate context bar — syncs to model's actual context window on load; tracks conversation totals
Resume generation — Continue button on last assistant message after stopping
Tool call reliability — JSON auto-repair, retry-on-failure loop, stricter system prompt rule
Default temperature 0.2 — reduced from 0.7 to cut hallucinations in code analysis
Jinja template fallback — models with unsupported chat templates fall back to flat prompt format
Fixed: agentic tool calling loop — resolved "inference already in progress" error that occurred when the model called a tool and the follow-up generation failed. Root cause was PythonService.terminate() being a no-op with native MLX, causing chatCompletion() to run until maxTokens before returning. Stream now exits immediately when </tool> is detected.
Security: SafeTensors-only model loading — .bin and .pt (PyTorch pickle) model files are rejected at both discovery and load time. Only .safetensors format is permitted.

v6.2.0 (March 4, 2026)

Replaced Python subprocess daemon with native mlx-swift-lm framework for inference
Model loading via LLMModelFactory + ModelContainer — no Python process
Chat generation via MLXLMCommon.UserInput + AsyncStream<Generation>
Removed 2,726 lines of dead code (EthicalAIGuardian, AIBackendStatusMenu, all AIBackendManager files)

v6.1.x (March 4, 2026)

Comprehensive security audit: 31 findings resolved
API key storage migrated to macOS Keychain
Dead code removal, debug artifact cleanup, force unwrap fixes
Consistent logging via SecureLogger throughout

v6.0.0 (February 20, 2026)

GitHub integration: issues, PRs, branches, credential scanning
Code analysis: metrics, dependencies, lint, symbols
Xcode full deploy pipeline: build, archive, DMG, install
User memories system — persistent coding standards and preferences
14 tools (up from 11)

v5.0.0 (February 2026)

Major simplification: deleted 41 files (~16,000 lines) of unused features
Rewrote system prompt to be compact and honest
Default model: Qwen 2.5 7B

v4.0.0 (February 2026)

Chat template support, structured message passing, tool tier system
Context budget system, smart token estimation, project context auto-include
Tool approval flow with auto-approve for read-only operations

v1.x (January–February 2026)

Initial release with MLX backend
Desktop widget extension
Basic chat interface

License

See LICENSE for details.

Disclaimer: This is a personal project created on my own time. It is not affiliated with, endorsed by, or representative of my employer.

Name		Name	Last commit message	Last commit date
Latest commit History 122 Commits
.github		.github
Documentation/screenshots		Documentation/screenshots
MLX Code Extension		MLX Code Extension
MLX Code Tests		MLX Code Tests
MLX Code Widget		MLX Code Widget
MLX Code.xcodeproj		MLX Code.xcodeproj
MLX Code		MLX Code
MLX CodeTests		MLX CodeTests
Python		Python
Screenshots		Screenshots
.gitignore		.gitignore
CHANGELOG.md		CHANGELOG.md
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
CONTRIBUTING.md		CONTRIBUTING.md
ExportOptions.plist		ExportOptions.plist
INSTALLATION.md		INSTALLATION.md
LICENSE		LICENSE
README.md		README.md
SECURITY.md		SECURITY.md
_config.yml		_config.yml
add_files.rb		add_files.rb
add_gauge_view.rb		add_gauge_view.rb
add_python_scripts.rb		add_python_scripts.rb
add_thinking_view.rb		add_thinking_view.rb
add_tool_files.rb		add_tool_files.rb
add_tools_extension.rb		add_tools_extension.rb
add_ux_panels.rb		add_ux_panels.rb
create_project.rb		create_project.rb
fix_tool_paths.rb		fix_tool_paths.rb
fix_ux_panels.rb		fix_ux_panels.rb
remove_duplicates.rb		remove_duplicates.rb
resign_and_install.sh		resign_and_install.sh
setup_mlx_models.sh		setup_mlx_models.sh
watch_logs.sh		watch_logs.sh

Folders and files

Latest commit

History

Repository files navigation

MLX Code v6.3.0

What It Does

How It Works

What's New in v6.3.0 (March 2026)

Xcode Extension · Native Downloads · Polished UI

Features

Xcode Integration

GitHub Integration

User Memories

Context Management

Models

Requirements

Installation

Quick Start

Architecture

Security

Shell Execution Safety

Data Privacy

Thread Safety

Roadmap

What It Doesn't Do

Version History

v6.3.0 build 7 (March 4, 2026) — Current

v6.2.0 (March 4, 2026)

v6.1.x (March 4, 2026)

v6.0.0 (February 20, 2026)

v5.0.0 (February 2026)

v4.0.0 (February 2026)

v1.x (January–February 2026)

License

About

Topics

Resources

License

Code of conduct

Contributing

Security policy

Uh oh!

Stars

Watchers

Forks

Releases 3

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages