This guide provides comprehensive documentation for integrating machine learning pipelines with the VISTA application.
- Overview
- Prerequisites
- Authentication
- ML Analysis Workflow
- API Endpoints
- Data Formats
- Example Integration
- Testing
- Best Practices
- Troubleshooting
The ML Analysis feature allows external machine learning pipelines to submit analysis results for visualization in the web interface. This enables teams to:
- View object detection bounding boxes
- Visualize segmentation heatmaps
- Access classification scores and feature detections
- Export analysis results as JSON or CSV
- Compare multiple ML models on the same images
Key Concepts:
- ML Analysis - A complete analysis run on an image by a specific model
- ML Annotation - Individual detection/feature within an analysis (bounding box, point, etc.)
- ML Artifact - Binary output files stored in S3 (visualizations, processed images)
- HMAC Authentication - Secure authentication for pipeline endpoints
The ML Analysis feature must be enabled in the application configuration:
# Required settings in .env
ML_ANALYSIS_ENABLED=true
ML_CALLBACK_HMAC_SECRET=<your-secure-secret>
ML_ALLOWED_MODELS=yolo_v8,resnet50,custom_model# Generate a secure secret (32 bytes recommended)
openssl rand -hex 32
# Example output:
# a7f3d9e2c4b8f1a6e9d4c2b7f5a8e3d1c9b6f4a2e7d5c3b1f8a6e4d2c7b5f3a9Store this secret securely:
- In environment variables
- In a secrets manager (AWS Secrets Manager, HashiCorp Vault, etc.)
- Never commit to version control
For Python implementations:
pip install requests boto3For Node.js implementations:
npm install axios aws-sdk cryptoThe ML Analysis API uses two authentication mechanisms:
The initial analysis creation uses standard user authentication:
Headers Required:
X-User-Email: User's email addressX-Proxy-Secret: Shared secret (production)- OR
X-API-Key: User's API key
All pipeline callback endpoints require HMAC-SHA256 authentication to prevent unauthorized submissions.
Python Example:
import hmac
import hashlib
import time
import json
def create_hmac_headers(payload, secret):
"""Create HMAC authentication headers."""
timestamp = str(int(time.time()))
# Create message to sign
message = json.dumps(payload, separators=(',', ':'), sort_keys=True)
# Generate HMAC signature
signature = hmac.new(
secret.encode('utf-8'),
message.encode('utf-8'),
hashlib.sha256
).hexdigest()
return {
'Content-Type': 'application/json',
'X-ML-Signature': signature,
'X-ML-Timestamp': timestamp
}
# Usage
headers = create_hmac_headers(payload, ML_CALLBACK_HMAC_SECRET)
response = requests.post(url, json=payload, headers=headers)Node.js Example:
const crypto = require('crypto');
function createHMACHeaders(payload, secret) {
const timestamp = Math.floor(Date.now() / 1000).toString();
// Create message to sign
const message = JSON.stringify(payload);
// Generate HMAC signature
const signature = crypto
.createHmac('sha256', secret)
.update(message)
.digest('hex');
return {
'Content-Type': 'application/json',
'X-ML-Signature': signature,
'X-ML-Timestamp': timestamp
};
}
// Usage
const headers = createHMACHeaders(payload, process.env.ML_CALLBACK_HMAC_SECRET);
const response = await axios.post(url, payload, { headers });Security Notes:
- Signatures are time-sensitive (timestamp validated to prevent replay attacks)
- Message must be JSON string with consistent formatting
- Use constant-time comparison on server side
- Rotate secrets periodically
The complete workflow consists of 6 steps:
1. Create Analysis (queued)
↓
2. Update Status (processing)
↓
3. Request Presigned URLs
↓
4. Upload Artifacts to S3
↓
5. Submit Annotations
↓
6. Finalize Analysis (completed)
Step 1: Create Analysis
Initiates a new ML analysis for an image.
POST /api/images/{image_id}/analysesStep 2: Update Status to Processing
Marks the analysis as actively processing.
PATCH /api/analyses/{analysis_id}/statusStep 3: Request Presigned URLs
Gets secure upload URLs for artifacts.
POST /api/analyses/{analysis_id}/artifacts/presignStep 4: Upload Artifacts
Uploads binary files (images, visualizations) to S3.
PUT <presigned_url>Step 5: Submit Annotations
Submits detected features (bounding boxes, points, etc.).
POST /api/analyses/{analysis_id}/annotations:bulkStep 6: Finalize Analysis
Marks analysis as completed.
POST /api/analyses/{analysis_id}/finalizeEndpoint: POST /api/images/{image_id}/analyses
Authentication: User authentication (API key or headers)
Request Body:
{
"model_name": "yolo_v8",
"model_version": "v8.0.0",
"parameters": {
"confidence_threshold": 0.5,
"iou_threshold": 0.45
},
"description": "Object detection using YOLOv8"
}Response (201 Created):
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"image_id": "660e8400-e29b-41d4-a716-446655440001",
"model_name": "yolo_v8",
"model_version": "v8.0.0",
"status": "queued",
"parameters": {...},
"created_at": "2024-01-15T10:30:00Z"
}Endpoint: PATCH /api/analyses/{analysis_id}/status
Authentication: HMAC required
Request Body:
{
"status": "processing"
}Valid Status Values:
queued- Initial stateprocessing- Analysis in progresscompleted- Successfully completedfailed- Analysis failedcancelled- Cancelled by user/system
Response (200 OK):
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "processing",
"updated_at": "2024-01-15T10:31:00Z"
}Endpoint: POST /api/analyses/{analysis_id}/artifacts/presign
Authentication: HMAC required
Request Body:
{
"artifacts": [
{
"artifact_name": "bounding_boxes.png",
"content_type": "image/png"
},
{
"artifact_name": "heatmap.png",
"content_type": "image/png"
}
]
}Response (200 OK):
{
"presigned_urls": [
{
"artifact_name": "bounding_boxes.png",
"upload_url": "https://s3.amazonaws.com/bucket/path?signature=...",
"expires_in": 3600
},
{
"artifact_name": "heatmap.png",
"upload_url": "https://s3.amazonaws.com/bucket/path?signature=...",
"expires_in": 3600
}
]
}Endpoint: Presigned URL from previous step
Method: PUT
Headers:
Content-Type: Must match the content type from presign request
Body: Binary file content
Example with Python:
import requests
# Upload file
with open('bounding_boxes.png', 'rb') as f:
response = requests.put(
presigned_url,
data=f,
headers={'Content-Type': 'image/png'}
)
print(f"Upload status: {response.status_code}")Endpoint: POST /api/analyses/{analysis_id}/annotations:bulk
Authentication: HMAC required
Request Body:
{
"annotations": [
{
"annotation_type": "bounding_box",
"class_name": "person",
"confidence": 0.95,
"data": {
"bbox": [100, 150, 50, 100],
"format": "xywh"
}
},
{
"annotation_type": "bounding_box",
"class_name": "car",
"confidence": 0.87,
"data": {
"bbox": [300, 200, 120, 80],
"format": "xywh"
}
}
]
}Annotation Types:
bounding_box- Object detection boxessegmentation- Polygon or mask segmentationkeypoint- Individual point detectionsclassification- Image-level classificationheatmap- Attention or saliency maps
Response (201 Created):
{
"created_count": 2,
"annotations": [
{
"id": "770e8400-e29b-41d4-a716-446655440000",
"annotation_type": "bounding_box",
"class_name": "person",
"confidence": 0.95
},
{
"id": "880e8400-e29b-41d4-a716-446655440000",
"annotation_type": "bounding_box",
"class_name": "car",
"confidence": 0.87
}
]
}Endpoint: POST /api/analyses/{analysis_id}/finalize
Authentication: HMAC required
Request Body:
{
"status": "completed",
"summary": {
"total_detections": 25,
"processing_time_seconds": 3.45,
"classes_detected": ["person", "car", "bicycle"]
}
}Response (200 OK):
{
"id": "550e8400-e29b-41d4-a716-446655440000",
"status": "completed",
"completed_at": "2024-01-15T10:35:00Z"
}XYWH (X, Y, Width, Height):
{
"annotation_type": "bounding_box",
"class_name": "person",
"confidence": 0.95,
"data": {
"bbox": [100, 150, 50, 100],
"format": "xywh"
}
}Where:
bbox[0](100) - X coordinate of top-left cornerbbox[1](150) - Y coordinate of top-left cornerbbox[2](50) - Width of boxbbox[3](100) - Height of box
XYXY (X1, Y1, X2, Y2):
{
"data": {
"bbox": [100, 150, 150, 250],
"format": "xyxy"
}
}Polygon:
{
"annotation_type": "segmentation",
"class_name": "person",
"confidence": 0.92,
"data": {
"points": [
[100, 150],
[120, 140],
[150, 160],
[140, 180],
[100, 175]
],
"format": "polygon"
}
}RLE (Run-Length Encoding):
{
"annotation_type": "segmentation",
"class_name": "car",
"confidence": 0.89,
"data": {
"rle": {
"counts": [12, 5, 8, 3, 15, 2],
"size": [480, 640]
},
"format": "rle"
}
}{
"annotation_type": "keypoint",
"class_name": "person",
"confidence": 0.93,
"data": {
"keypoints": [
{"name": "nose", "x": 125, "y": 160, "confidence": 0.95},
{"name": "left_eye", "x": 120, "y": 155, "confidence": 0.92},
{"name": "right_eye", "x": 130, "y": 155, "confidence": 0.91}
]
}
}{
"annotation_type": "classification",
"class_name": "dog",
"confidence": 0.96,
"data": {
"scores": {
"dog": 0.96,
"cat": 0.03,
"bird": 0.01
}
}
}import requests
import hmac
import hashlib
import time
import json
from pathlib import Path
class MLPipelineClient:
def __init__(self, api_url, hmac_secret, api_key=None):
self.api_url = api_url.rstrip('/')
self.hmac_secret = hmac_secret
self.api_key = api_key
def _create_hmac_headers(self, payload):
"""Create HMAC authentication headers."""
timestamp = str(int(time.time()))
message = json.dumps(payload, separators=(',', ':'), sort_keys=True)
signature = hmac.new(
self.hmac_secret.encode('utf-8'),
message.encode('utf-8'),
hashlib.sha256
).hexdigest()
return {
'Content-Type': 'application/json',
'X-ML-Signature': signature,
'X-ML-Timestamp': timestamp
}
def _get_user_headers(self):
"""Get user authentication headers."""
if self.api_key:
return {'X-API-Key': self.api_key}
return {}
def create_analysis(self, image_id, model_name, model_version, parameters=None):
"""Step 1: Create a new ML analysis."""
url = f"{self.api_url}/api/images/{image_id}/analyses"
payload = {
'model_name': model_name,
'model_version': model_version,
'parameters': parameters or {}
}
response = requests.post(
url,
json=payload,
headers=self._get_user_headers()
)
response.raise_for_status()
return response.json()
def update_status(self, analysis_id, status):
"""Step 2: Update analysis status."""
url = f"{self.api_url}/api/analyses/{analysis_id}/status"
payload = {'status': status}
response = requests.patch(
url,
json=payload,
headers=self._create_hmac_headers(payload)
)
response.raise_for_status()
return response.json()
def request_presigned_urls(self, analysis_id, artifacts):
"""Step 3: Request presigned URLs for artifact uploads."""
url = f"{self.api_url}/api/analyses/{analysis_id}/artifacts/presign"
payload = {'artifacts': artifacts}
response = requests.post(
url,
json=payload,
headers=self._create_hmac_headers(payload)
)
response.raise_for_status()
return response.json()
def upload_artifact(self, presigned_url, file_path, content_type):
"""Step 4: Upload artifact to S3 using presigned URL."""
with open(file_path, 'rb') as f:
response = requests.put(
presigned_url,
data=f,
headers={'Content-Type': content_type}
)
response.raise_for_status()
def submit_annotations(self, analysis_id, annotations):
"""Step 5: Submit annotations in bulk."""
url = f"{self.api_url}/api/analyses/{analysis_id}/annotations:bulk"
payload = {'annotations': annotations}
response = requests.post(
url,
json=payload,
headers=self._create_hmac_headers(payload)
)
response.raise_for_status()
return response.json()
def finalize_analysis(self, analysis_id, status='completed', summary=None):
"""Step 6: Finalize the analysis."""
url = f"{self.api_url}/api/analyses/{analysis_id}/finalize"
payload = {
'status': status,
'summary': summary or {}
}
response = requests.post(
url,
json=payload,
headers=self._create_hmac_headers(payload)
)
response.raise_for_status()
return response.json()
def run_ml_analysis(image_id, image_path):
"""Complete ML analysis workflow example."""
client = MLPipelineClient(
api_url='http://localhost:8000',
hmac_secret='your-hmac-secret',
api_key='your-api-key'
)
# Step 1: Create analysis
print("Creating analysis...")
analysis = client.create_analysis(
image_id=image_id,
model_name='yolo_v8',
model_version='v8.0.0',
parameters={'confidence_threshold': 0.5}
)
analysis_id = analysis['id']
print(f"Analysis created: {analysis_id}")
# Step 2: Update to processing
print("Updating status to processing...")
client.update_status(analysis_id, 'processing')
# Run your ML model here
# results = run_yolo_model(image_path)
# Step 3 & 4: Upload artifacts
print("Uploading artifacts...")
artifacts = [
{'artifact_name': 'detections.png', 'content_type': 'image/png'}
]
presigned = client.request_presigned_urls(analysis_id, artifacts)
for url_info in presigned['presigned_urls']:
# Generate visualization
# create_detection_visualization(results, 'detections.png')
client.upload_artifact(
url_info['upload_url'],
'detections.png',
'image/png'
)
# Step 5: Submit annotations
print("Submitting annotations...")
annotations = [
{
'annotation_type': 'bounding_box',
'class_name': 'person',
'confidence': 0.95,
'data': {'bbox': [100, 150, 50, 100], 'format': 'xywh'}
}
]
client.submit_annotations(analysis_id, annotations)
# Step 6: Finalize
print("Finalizing analysis...")
client.finalize_analysis(
analysis_id,
status='completed',
summary={'total_detections': len(annotations)}
)
print("Analysis complete!")
# Run the pipeline
if __name__ == '__main__':
run_ml_analysis(
image_id='your-image-id',
image_path='/path/to/image.jpg'
)The repository includes a test script for quick validation:
# Set environment variables
export ML_CALLBACK_HMAC_SECRET='your-secret'
export API_KEY='your-api-key'
# Run test pipeline
python scripts/test_ml_pipeline.py --image-id <image_uuid>For testing with real object detection:
cd scripts
# Set HMAC secret
source ../.env
# Run YOLOv8 pipeline
./run_yolov8_pipeline.sh <project_id> --install-deps --limit 5See scripts/README_YOLOV8.md for detailed documentation.
Create analysis:
curl -X POST http://localhost:8000/api/images/{image_id}/analyses \
-H "X-API-Key: your-api-key" \
-H "Content-Type: application/json" \
-d '{
"model_name": "test_model",
"model_version": "1.0.0"
}'Update status (with HMAC):
# Generate signature in Python:
# python -c "import hmac, hashlib, json; \
# payload = '{\"status\":\"processing\"}'; \
# print(hmac.new(b'secret', payload.encode(), hashlib.sha256).hexdigest())"
curl -X PATCH http://localhost:8000/api/analyses/{analysis_id}/status \
-H "Content-Type: application/json" \
-H "X-ML-Signature: <generated_signature>" \
-H "X-ML-Timestamp: $(date +%s)" \
-d '{"status":"processing"}'Always implement robust error handling:
def safe_api_call(func, *args, **kwargs):
"""Wrapper for safe API calls with retry logic."""
max_retries = 3
retry_delay = 5
for attempt in range(max_retries):
try:
return func(*args, **kwargs)
except requests.exceptions.RequestException as e:
if attempt < max_retries - 1:
print(f"Attempt {attempt + 1} failed: {e}")
print(f"Retrying in {retry_delay} seconds...")
time.sleep(retry_delay)
else:
print(f"All {max_retries} attempts failed")
raiseLog all important steps for debugging:
import logging
logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)
logger.info(f"Creating analysis for image {image_id}")
logger.info(f"Analysis {analysis_id} created successfully")
logger.warning(f"Low confidence detection: {confidence}")
logger.error(f"Failed to upload artifact: {error}")Batch Processing:
# Process multiple images in parallel
from concurrent.futures import ThreadPoolExecutor
def process_image(image_id):
# Run analysis pipeline
pass
with ThreadPoolExecutor(max_workers=4) as executor:
executor.map(process_image, image_ids)Artifact Compression:
from PIL import Image
# Compress images before upload
img = Image.open('detection.png')
img.save('detection_compressed.png', optimize=True, quality=85)- Never log HMAC secrets or API keys
- Use environment variables for sensitive data
- Validate all inputs before processing
- Implement rate limiting
- Monitor for unusual activity
Track model versions for reproducibility:
analysis = client.create_analysis(
image_id=image_id,
model_name='yolo_v8',
model_version='v8.0.0',
parameters={
'confidence_threshold': 0.5,
'iou_threshold': 0.45,
'model_checkpoint': 'epoch_100',
'training_date': '2024-01-15'
}
)Error: 401 Unauthorized - Invalid HMAC signature
Solutions:
- Verify HMAC secret matches backend configuration
- Ensure JSON payload formatting is consistent
- Check timestamp is recent (not expired)
- Verify signature generation algorithm
Debug:
# Print payload being signed
print("Payload:", json.dumps(payload, separators=(',', ':'), sort_keys=True))
# Print signature
print("Signature:", signature)
print("Timestamp:", timestamp)Error: 403 Forbidden when uploading to S3
Solutions:
- Upload immediately after receiving URL
- Request new presigned URL if expired
- Check URL expiration time
Error: 400 Bad Request - Invalid annotation format
Solutions:
- Verify annotation_type is supported
- Check bbox format matches specification
- Ensure confidence is between 0 and 1
- Validate class_name is provided
Error: Upload to S3 fails
Solutions:
- Check Content-Type header matches presign request
- Verify file exists and is readable
- Check network connectivity
- Confirm S3 credentials are valid
Error: 400 Bad Request - Model not in allowed list
Solution: Add model to ML_ALLOWED_MODELS in backend configuration:
ML_ALLOWED_MODELS=yolo_v8,resnet50,your_model_name- API Documentation: http://localhost:8000/docs (Swagger UI)
- Test Scripts:
/scripts/directory - Example Pipelines:
scripts/yolov8_ml_pipeline.py - User Guide:
/docs/user-guide.md(viewing ML results in UI) - Developer Guide:
/docs/developer-guide.md
For issues or questions:
- Check this guide and API documentation
- Review test scripts for working examples
- Check application logs for detailed error messages
- Contact system administrator or team lead