| 1 | +# Code Execution MCP Server & Pre-Hook Implementation |
| 2 | + |
| 3 | +## Overview |
| 4 | + |
| 5 | +This implementation provides controlled code execution capabilities for StackMemory, inspired by the `execute_code_py` project. It includes: |
| 6 | + |
| 7 | +1. **MCP Code Execution Handler** - Safe Python/JavaScript/TypeScript execution |
| 8 | +2. **Pre-Tool-Use Hook** - Controls and restricts tool usage |
| 9 | +3. **Multiple Operation Modes** - Permissive, Restrictive, and Code-Only modes |
| 10 | + |
| 11 | +## Features |
| 12 | + |
| 13 | +### Code Execution Handler |
| 14 | +- **Supported Languages**: Python, JavaScript, TypeScript |
| 15 | +- **Sandboxed Environment**: Executes code in an isolated temporary directory |
| 16 | +- **Timeout Protection**: Configurable timeout (default 30s) |
| 17 | +- **Output Truncation**: Truncates large outputs to a size limit |
| 18 | +- **Security Validation**: Checks for dangerous patterns before execution |
| 19 | + |
| 20 | +### Pre-Tool-Use Hook |
| 21 | +- **Three Modes**: |
| 22 | + - `permissive` - All tools allowed, dangerous ones logged |
| 23 | + - `restrictive` - Blocks potentially dangerous tools (Bash, Write, etc.) |
| 24 | + - `code_only` - Only allows code execution (pure computational environment) |
| 25 | +- **Audit Logging**: Tracks all tool usage attempts |
| 26 | +- **Always-Allowed Tools**: Context saving/loading, TodoWrite/TodoRead |
| 27 | + |
| 28 | +## Installation |
| 29 | + |
| 30 | +```bash |
| 31 | +# Install the hooks and handlers |
| 32 | +./scripts/install-code-execution-hooks.sh |
| 33 | + |
| 34 | +# Or manually: |
| 35 | +mkdir -p ~/.claude/hooks && cp templates/claude-hooks/pre-tool-use ~/.claude/hooks/ |
| 36 | +chmod +x ~/.claude/hooks/pre-tool-use |
| 37 | +``` |
| 38 | + |
| 39 | +## Configuration |
| 40 | + |
| 41 | +### Setting the Mode |
| 42 | + |
| 43 | +```bash |
| 44 | +# Option 1: Environment variable |
| 45 | +export STACKMEMORY_TOOL_MODE=code_only # or permissive, restrictive |
| 46 | + |
| 47 | +# Option 2: Configuration file |
| 48 | +mkdir -p ~/.stackmemory && echo "STACKMEMORY_TOOL_MODE=code_only" > ~/.stackmemory/tool-mode.conf |
| 49 | +``` |
| 50 | + |
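The two options above leave open how a hook might combine them. The sketch below shows one plausible resolution order: environment variable first, then the configuration file, then the documented default of `permissive`. The precedence, the `KEY=VALUE` parsing, and the function names `read_mode_from_file` and `resolve_mode` are assumptions made for illustration, not the actual hook logic.

```python
from pathlib import Path
from typing import Mapping, Optional

VALID_MODES = {"permissive", "restrictive", "code_only"}
DEFAULT_MODE = "permissive"  # documented default


def read_mode_from_file(path: Path) -> Optional[str]:
    """Return the STACKMEMORY_TOOL_MODE value from a KEY=VALUE file, if present."""
    if not path.is_file():
        return None
    for line in path.read_text().splitlines():
        key, sep, value = line.strip().partition("=")
        if sep and key.strip() == "STACKMEMORY_TOOL_MODE":
            return value.strip()
    return None


def resolve_mode(env: Mapping[str, str], conf_path: Path) -> str:
    """Assumed precedence: environment, then config file, then the default."""
    candidate = env.get("STACKMEMORY_TOOL_MODE") or read_mode_from_file(conf_path)
    # Missing or unrecognised values fall back to the default mode.
    return candidate if candidate in VALID_MODES else DEFAULT_MODE
```

Calling `resolve_mode(os.environ, Path.home() / ".stackmemory" / "tool-mode.conf")` would return the mode for the current shell under these assumptions.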
| 51 | +### Mode Descriptions |
| 52 | + |
| 53 | +#### Permissive Mode (Default) |
| 54 | +- All tools are allowed |
| 55 | +- Dangerous operations are logged |
| 56 | +- Best for general development |
| 57 | + |
| 58 | +#### Restrictive Mode |
| 59 | +- Blocks: Bash, Write, Edit, Delete, WebFetch |
| 60 | +- Allows: Read, Grep, LS, TodoWrite, TodoRead |
| 61 | +- Suited to read-mostly work such as code review and exploration |
| 62 | + |
| 63 | +#### Code-Only Mode |
| 64 | +- **Only** code execution tools are allowed |
| 65 | +- Creates a pure computational environment |
| 66 | +- Similar to the behavior of `execute_code_py` |
| 67 | +- Ideal for: |
| 68 | + - Algorithm development |
| 69 | + - Data analysis |
| 70 | + - Mathematical computations |
| 71 | + - Problem solving without side effects |
| 72 | + |
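The mode descriptions above can be read as a small decision function. The following is a minimal sketch of that reading, not the hook's real code: the blocked list comes from the Restrictive Mode bullets, while the `code.execute` suffix match, the `decide` function name, and the choice to let always-allowed tools through in every mode are assumptions. The context saving/loading tools are omitted because their names are not given here.

```python
from typing import Tuple

# Taken from the Restrictive Mode description.
RESTRICTIVE_BLOCKED = {"Bash", "Write", "Edit", "Delete", "WebFetch"}
# Subset of the always-allowed tools whose names are documented.
ALWAYS_ALLOWED = {"TodoWrite", "TodoRead"}
# Assumption: code execution tools are recognised by this name suffix.
CODE_TOOL_SUFFIX = "code.execute"


def decide(tool: str, mode: str) -> Tuple[bool, str]:
    """Return (allowed, reason) for a tool call under the given mode."""
    if tool in ALWAYS_ALLOWED:
        return True, "Always-allowed tool"
    if mode == "code_only":
        if tool.endswith(CODE_TOOL_SUFFIX):
            return True, "Code execution tool in code_only mode"
        return False, "Blocked in code_only mode"
    if mode == "restrictive" and tool in RESTRICTIVE_BLOCKED:
        return False, "Blocked in restrictive mode"
    # Permissive mode, or a tool that restrictive mode does not block.
    return True, f"Allowed in {mode} mode"
```

For example, `decide("Bash", "code_only")` yields a denial with the reason `Blocked in code_only mode`, while the same tool passes in `permissive` mode.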
| 73 | +## Usage Examples |
| 74 | + |
| 75 | +### Python Code Execution |
| 76 | + |
| 77 | +```python |
| 78 | +# Via MCP tool (inside an async function; `mcp` is an assumed MCP client object) |
| 79 | +result = await mcp.call('code.execute', { |
| 80 | +    'language': 'python', |
| 81 | +    'code': ''' |
| 82 | +import numpy as np |
| 83 | +import matplotlib.pyplot as plt |
| 84 | + |
| 85 | +# Generate and analyze data |
| 86 | +data = np.random.normal(0, 1, 1000) |
| 87 | +mean = np.mean(data) |
| 88 | +std = np.std(data) |
| 89 | + |
| 90 | +print(f"Mean: {mean:.4f}") |
| 91 | +print(f"Std Dev: {std:.4f}") |
| 92 | +''' |
| 93 | +}) |
| 94 | +``` |
| 95 | + |
| 96 | +### JavaScript Execution |
| 97 | + |
| 98 | +```javascript |
| 99 | +// Via MCP tool |
| 100 | +const result = await mcp.call('code.execute', { |
| 101 | +  language: 'javascript', |
| 102 | +  code: ` |
| 103 | +// Fibonacci calculation |
| 104 | +function fib(n) { |
| 105 | +  if (n <= 1) return n; |
| 106 | +  return fib(n-1) + fib(n-2); |
| 107 | +} |
| 108 | + |
| 109 | +for (let i = 0; i < 10; i++) { |
| 110 | +  console.log(\`fib(\${i}) = \${fib(i)}\`); |
| 111 | +} |
| 112 | +` |
| 113 | +}); |
| 114 | +``` |
| 115 | + |
| 116 | +## Testing |
| 117 | + |
| 118 | +```bash |
| 119 | +# Test code execution handler |
| 120 | +node scripts/test-code-execution.js |
| 121 | + |
| 122 | +# View tool usage logs |
| 123 | +tail -f ~/.stackmemory/tool-use.log |
| 124 | + |
| 125 | +# Check sandbox status |
| 126 | +node --input-type=module -e " |
| 127 | +import { CodeExecutionHandler } from './dist/integrations/mcp/handlers/code-execution-handlers.js'; |
| 128 | +const h = new CodeExecutionHandler(); |
| 129 | +console.log(await h.getSandboxStatus()); |
| 130 | +" |
| 131 | +``` |
| 132 | + |
| 133 | +## Security Features |
| 134 | + |
| 135 | +### Code Validation |
| 136 | +Before execution, the handler scans code for dangerous patterns: |
| 137 | +- OS module imports |
| 138 | +- Subprocess execution |
| 139 | +- eval/exec usage |
| 140 | +- File system access |
| 141 | +- Network operations |
| 142 | + |
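To make the pattern categories above concrete, here is a hedged sketch of a pre-execution check for Python source. The regular expressions and the `find_violations` name are illustrative assumptions; the handler's real rules are not shown in this document. Regex checks of this kind are heuristic and can be bypassed, so they complement isolation and do not replace it.

```python
import re
from typing import List

# Illustrative patterns only, one per category listed above.
DANGEROUS_PATTERNS = {
    "OS module import": re.compile(r"^\s*(import\s+os\b|from\s+os\b)", re.MULTILINE),
    "Subprocess execution": re.compile(r"\bsubprocess\b"),
    "eval/exec usage": re.compile(r"\b(eval|exec)\s*\("),
    "File system access": re.compile(r"\bopen\s*\("),
    "Network operations": re.compile(r"\b(socket|urllib|requests)\b"),
}


def find_violations(code: str) -> List[str]:
    """Return the labels of every pattern category found in the code."""
    return [label for label, pattern in DANGEROUS_PATTERNS.items() if pattern.search(code)]
```

A caller could refuse to run the code whenever `find_violations` returns a non-empty list and include the labels in the error message.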
| 143 | +### Sandboxing |
| 144 | +- Temporary directory isolation |
| 145 | +- Process timeout limits |
| 146 | +- Output size limits |
| 147 | +- No persistent state between executions |
| 148 | + |
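The sandboxing properties listed above can be sketched with the standard library alone. This is an assumption-laden illustration, not the handler's implementation: the `run_python` name, the result dictionary shape, and the 10,000-character output cap are hypothetical, while the 30-second default matches the documented timeout. A temporary working directory keeps files from persisting between runs but is not by itself a security boundary.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_python(code: str, timeout_s: float = 30.0, max_output: int = 10_000) -> dict:
    """Run Python code in a fresh temp directory with a timeout and an output cap."""
    # The directory and everything in it are deleted when the block exits,
    # so no state persists between executions.
    with tempfile.TemporaryDirectory(prefix="stackmemory-sandbox-") as workdir:
        script = Path(workdir) / "main.py"
        script.write_text(code)
        try:
            proc = subprocess.run(
                [sys.executable, str(script)],
                cwd=workdir,
                capture_output=True,
                text=True,
                timeout=timeout_s,
            )
        except subprocess.TimeoutExpired:
            # subprocess.run kills the child before raising.
            return {"success": False, "stdout": "", "stderr": "timed out", "truncated": False}
    stdout = proc.stdout
    return {
        "success": proc.returncode == 0,
        "stdout": stdout[:max_output],
        "stderr": proc.stderr[:max_output],
        "truncated": len(stdout) > max_output,
    }
```

Calling `run_python("print('hi')")` returns a dictionary whose `stdout` holds the program output; a script that sleeps past `timeout_s` comes back with `success` set to `False`.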
| 149 | +### Audit Trail |
| 150 | +All tool usage is logged to `~/.stackmemory/tool-use.log`: |
| 151 | +```json |
| 152 | +{"timestamp":"2024-01-19T08:00:00Z","tool":"Bash","allowed":false,"reason":"Blocked in code_only mode","mode":"code_only"} |
| 153 | +{"timestamp":"2024-01-19T08:00:01Z","tool":"mcp__stackmemory__code.execute","allowed":true,"reason":"Code execution tool in code_only mode","mode":"code_only"} |
| 154 | +``` |
| 155 | + |
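Because each log entry is a single JSON object per line, the audit trail is easy to summarise. The sketch below counts blocked attempts per tool using only the fields shown in the sample entries above (`tool`, `allowed`); the `blocked_counts` name is hypothetical.

```python
import json
from collections import Counter
from typing import Iterable


def blocked_counts(lines: Iterable[str]) -> Counter:
    """Count denied tool calls per tool name from JSON Lines log entries."""
    counts: Counter = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        entry = json.loads(line)
        if not entry["allowed"]:
            counts[entry["tool"]] += 1
    return counts
```

Passing an open file handle for `~/.stackmemory/tool-use.log` works directly, since file objects iterate line by line.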
| 156 | +## Comparison with execute_code_py |
| 157 | + |
| 158 | +| Feature | execute_code_py | StackMemory Implementation | |
| 159 | +|---------|-----------------|---------------------------| |
| 160 | +| Language Support | Python only | Python, JavaScript, TypeScript | |
| 161 | +| Tool Restriction | All tools blocked | Configurable modes | |
| 162 | +| Integration | Separate MCP server | Integrated with StackMemory | |
| 163 | +| Context Persistence | None | Full StackMemory integration | |
| 164 | +| Audit Logging | Basic | Comprehensive logging | |
| 165 | +| Security Validation | Runtime only | Pre-execution validation | |
| 166 | + |
| 167 | +## Architecture |
| 168 | + |
| 169 | +``` |
| 170 | +Claude Code |
| 171 | + ↓ |
| 172 | +Pre-Tool-Use Hook (filters based on mode) |
| 173 | + ↓ |
| 174 | +Allowed Tools Only |
| 175 | + ↓ |
| 176 | +MCP Server |
| 177 | + ↓ |
| 178 | +Code Execution Handler |
| 179 | + ↓ |
| 180 | +Sandboxed Process |
| 181 | + ↓ |
| 182 | +Results |
| 183 | +``` |
| 184 | + |
| 185 | +## Benefits |
| 186 | + |
| 187 | +1. **Safety**: Controlled execution environment |
| 188 | +2. **Flexibility**: Multiple modes for different use cases |
| 189 | +3. **Integration**: Works with existing StackMemory features |
| 190 | +4. **Auditability**: Complete tool usage tracking |
| 191 | +5. **Pure Computation**: Code-only mode for algorithm focus |
| 192 | + |
| 193 | +## Troubleshooting |
| 194 | + |
| 195 | +### Hook Not Working |
| 196 | +```bash |
| 197 | +# Check if hook is installed |
| 198 | +ls -la ~/.claude/hooks/pre-tool-use |
| 199 | + |
| 200 | +# Check mode setting |
| 201 | +echo $STACKMEMORY_TOOL_MODE |
| 202 | +cat ~/.stackmemory/tool-mode.conf |
| 203 | + |
| 204 | +# View logs |
| 205 | +tail -f ~/.stackmemory/tool-use.log |
| 206 | +``` |
| 207 | + |
| 208 | +### Code Execution Fails |
| 209 | +```bash |
| 210 | +# Check Python/Node installation |
| 211 | +python3 --version |
| 212 | +node --version |
| 213 | + |
| 214 | +# Test handler directly |
| 215 | +node scripts/test-code-execution.js |
| 216 | + |
| 217 | +# Check sandbox permissions |
| 218 | +ls -la /tmp/stackmemory-sandbox |
| 219 | +``` |
| 220 | + |
| 221 | +## Future Enhancements |
| 222 | + |
| 223 | +- [ ] Support for more languages (Rust, Go, etc.) |
| 224 | +- [ ] Container-based isolation |
| 225 | +- [ ] Resource limits (CPU, memory) |
| 226 | +- [ ] Persistent workspace option |
| 227 | +- [ ] Code snippet library |
| 228 | +- [ ] Integration with Jupyter notebooks |
| 229 | +- [ ] Real-time collaboration features |
| 230 | + |
| 231 | +## License |
| 232 | + |
| 233 | +MIT. Part of the StackMemory project. |