Contributing to excel-to-sql

Thank you for your interest in contributing to excel-to-sql! This document provides guidelines and instructions for contributing to the project.

Code of Conduct
Getting Started
Development Setup
Coding Standards
Testing Guidelines
Commit Messages
Pull Request Process
Reporting Issues

Code of Conduct

Our Pledge

We are committed to providing a welcoming and inclusive environment for all contributors. Please be respectful and constructive in all interactions.

Standards

Use welcoming and inclusive language
Be respectful of differing viewpoints and experiences
Gracefully accept constructive criticism
Focus on what is best for the community
Show empathy towards other community members

Getting Started

Prerequisites

Python 3.10 or higher
Git
GitHub account
Basic understanding of Python, Git, and CLI tools

First Time Setup

# 1. Fork the repository on GitHub
# Click the "Fork" button in the top-right corner

# 2. Clone your fork locally
git clone https://github.com/YOUR_USERNAME/excel-to-sql.git
cd excel-to-sql

# 3. Add the original repository as upstream
git remote add upstream https://github.com/wareflowx/excel-to-sql.git

# 4. Install uv (Python package manager)
# On Windows:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# On Linux/macOS:
curl -LsSf https://astral.sh/uv/install.sh | sh

# 5. Install dependencies
uv sync

# 6. Install development dependencies
# (uv sync already includes the dev group by default; --dev makes this explicit)
uv sync --dev

Development Setup

Creating a Feature Branch

# 1. Ensure your main branch is up-to-date
git checkout main
git fetch upstream
git rebase upstream/main

# 2. Create a new feature branch
git checkout -b feature/your-feature-name

Branch Naming Conventions

feature/feature-name - New features
fix/bug-name - Bug fixes
docs/documentation-name - Documentation updates
refactor/component-name - Code refactoring
test/test-name - Test additions or updates
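
The prefixes above can be checked mechanically, for example in a local Git hook. The following is a minimal sketch, not part of the project: the helper name is_valid_branch_name is hypothetical, and the lowercase kebab-case rule for the part after the slash is an assumption inferred from the examples rather than a stated requirement.

```python
import re

# Prefixes taken from the naming conventions listed above.
BRANCH_PREFIXES = ("feature", "fix", "docs", "refactor", "test")

# Assumption: the part after the slash is lowercase kebab-case, as in the
# examples (e.g. "feature/your-feature-name"). The guide does not state
# this as a rule.
BRANCH_PATTERN = re.compile(
    r"^(?:" + "|".join(BRANCH_PREFIXES) + r")/[a-z0-9]+(?:-[a-z0-9]+)*$"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if the name uses a documented prefix and a kebab-case suffix."""
    return BRANCH_PATTERN.match(name) is not None
```

A name such as feature/your-feature-name passes, while feat/new-thing is rejected because feat is a commit type, not a branch prefix.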

Development Workflow

# 1. Make your changes
# Edit files, add features, fix bugs

# 2. Run tests locally
uv run pytest

# 3. Run with coverage
uv run pytest --cov=excel_to_sql --cov-report=html

# 4. Format code
uv run ruff format excel_to_sql/ tests/

# 5. Lint code
uv run ruff check excel_to_sql/ tests/

# 6. Commit your changes
git add .
git commit -m "feat: add your feature description"

# 7. Push to your fork
git push origin feature/your-feature-name

Coding Standards

Python Style Guide

We follow PEP 8 style guidelines with the following tools:

Formatting - Ruff

# Format code
uv run ruff format excel_to_sql/ tests/

# Check formatting
uv run ruff format --check excel_to_sql/ tests/

Linting - Ruff

# Lint code
uv run ruff check excel_to_sql/ tests/

# Auto-fix linting issues
uv run ruff check --fix excel_to_sql/ tests/

Code Organization

excel_to_sql/
├── cli.py                # CLI interface entry point
├── __init__.py           # Package exports
├── __version__.py        # Version information
├── sdk/                  # Python SDK implementation
├── entities/             # Domain entities (Project, Database, Table, etc.)
├── transformations/      # Data transformation logic
├── validators/           # Validation framework
├── profiling/            # Data quality profiling
├── auto_pilot/           # Auto-Pilot mode components
└── ui/                   # Interactive wizard UI

Import Style

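Organize imports in isort style. Ruff provides this through its isort-compatible rules (rule prefix I), provided they are selected in the Ruff configuration: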

# Standard library imports
import os
from pathlib import Path

# Third-party imports
import pandas as pd
from rich.console import Console

# Local imports
from excel_to_sql.entities import Project
from excel_to_sql.validators import ValidationRule

Docstrings

Use Google-style docstrings:

def process_file(file_path: Path, patterns: dict) -> dict:
    """Process a single Excel file and detect patterns.

    Args:
        file_path: Path to the Excel file to process.
        patterns: Dictionary of detected patterns.

    Returns:
        Dictionary containing processing results with keys:
            - 'file_path': str - Path to processed file
            - 'table_name': str - Detected table name
            - 'patterns': dict - Detected patterns

    Raises:
        FileNotFoundError: If the file does not exist.
        ValueError: If the file format is invalid.
    """

Type Hints

All functions should include type hints:

from typing import Any

import pandas as pd

def detect_patterns(
    df: pd.DataFrame,
    table_name: str,
    confidence_threshold: float = 0.7,
) -> dict[str, Any]:
    """Detect patterns in DataFrame."""
    pass

Error Handling

import logging

import pandas as pd

logger = logging.getLogger(__name__)

# Use specific exceptions and chain the original cause with "from"
try:
    df = pd.read_excel(file_path)
except FileNotFoundError as e:
    raise FileNotFoundError(f"Excel file not found: {file_path}") from e
except Exception as e:
    # Log errors appropriately before re-raising
    logger.error("Error processing file %s: %s", file_path, e)
    raise ValueError(f"Failed to read Excel file: {e}") from e

Testing Guidelines

Test Structure

tests/
├── test_cli.py              # CLI command tests
├── test_sdk.py              # SDK functionality tests
├── test_transformations/    # Transformation tests
├── test_validators/         # Validator tests
├── test_auto_pilot/         # Auto-Pilot component tests
│   ├── test_detector.py     # PatternDetector tests
│   ├── test_quality.py      # QualityScorer tests
│   ├── test_recommender.py  # RecommendationEngine tests
│   ├── test_auto_fix.py     # AutoFixer tests
│   └── test_auto_fix_integration.py  # Integration tests
├── test_ui/                 # UI component tests
└── fixtures/                # Test data and fixtures
    └── auto_pilot/          # Auto-Pilot test Excel files

Writing Tests

import pytest
import pandas as pd
from pathlib import Path

class TestPatternDetector:
    """Unit tests for PatternDetector class."""

    def test_initialization(self) -> None:
        """Test that PatternDetector initializes correctly."""
        from excel_to_sql.auto_pilot.detector import PatternDetector

        detector = PatternDetector()
        assert detector is not None

    def test_detect_primary_key(self) -> None:
        """Test primary key detection."""
        from excel_to_sql.auto_pilot.detector import PatternDetector

        detector = PatternDetector()
        df = pd.DataFrame({
            "id": [1, 2, 3],
            "name": ["A", "B", "C"]
        })

        patterns = detector.detect_patterns(df, "test")
        assert patterns["primary_key"] == "id"

Test Coverage Requirements

New features must have test coverage > 80%
Critical paths must have 100% coverage
Integration tests for complex workflows

# Run tests with coverage
uv run pytest --cov=excel_to_sql --cov-report=html

# Check coverage report (macOS; use xdg-open on Linux or start on Windows)
open htmlcov/index.html
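
The 80% threshold can also be checked in a script instead of by reading the HTML report. This is a minimal sketch under stated assumptions: it expects a coverage.py JSON report (as written by the coverage json command, which stores the overall figure under totals.percent_covered), and the helper name coverage_ok is hypothetical, not part of excel-to-sql.

```python
from typing import Any

# Threshold from the coverage requirements above.
MIN_COVERAGE = 80.0


def coverage_ok(report: dict[str, Any], minimum: float = MIN_COVERAGE) -> bool:
    """Return True if total coverage in a coverage.py JSON report exceeds `minimum`."""
    # The comparison is strict because the guideline asks for "> 80%".
    percent = float(report["totals"]["percent_covered"])
    return percent > minimum
```

In practice the report dictionary would come from json.load on the generated coverage.json file; how that file is produced in this project's CI is not specified here.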

Fixtures

Place test data in tests/fixtures/:

tests/fixtures/
├── auto_pilot/
│   ├── commandes.xlsx
│   ├── mouvements.xlsx
│   └── produits.xlsx
└── transformations/
    └── test_data.xlsx
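
Tests can resolve these files through one small helper so that paths are not repeated across test modules. The sketch below is illustrative only: fixture_path is a hypothetical name, and the default root assumes tests are run from the repository root, which may not match how the existing tests locate their data.

```python
from pathlib import Path

# Assumption: pytest is invoked from the repository root, so the fixtures
# directory shown above is reachable as a relative path.
FIXTURES_ROOT = Path("tests/fixtures")


def fixture_path(*parts: str, root: Path = FIXTURES_ROOT) -> Path:
    """Build the path to a fixture file, e.g. fixture_path("auto_pilot", "produits.xlsx")."""
    return root.joinpath(*parts)
```

A test would then call pd.read_excel(fixture_path("auto_pilot", "produits.xlsx")) rather than hard-coding the string, and the root argument can be overridden if the layout differs.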

Commit Messages

Follow conventional commit format:

<type>(<scope>): <subject>

<body>

<footer>

Types

feat - New feature
fix - Bug fix
docs - Documentation changes
style - Code style changes (formatting, etc.)
refactor - Code refactoring
test - Adding or updating tests
chore - Maintenance tasks
perf - Performance improvements

Examples

feat(auto_pilot): add pattern detection for foreign keys

Implement foreign key detection based on column name patterns
and value overlap analysis with existing tables.

Closes #14

Co-Authored-By: Claude Sonnet <noreply@anthropic.com>

fix(cli): handle Windows path separators correctly

Fix issue where Windows backslashes in paths caused errors.
Use pathlib.Path for cross-platform compatibility.

Fixes #42

docs: update README with Auto-Pilot documentation

Add comprehensive documentation for Auto-Pilot mode including:
- Pattern detection overview
- Quality scoring explanation
- Interactive wizard usage
- Code examples

Co-Authored-By: Claude Sonnet <noreply@anthropic.com>
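
The header line of each example follows the <type>(<scope>): <subject> shape, which can be validated with a regular expression. This is a rough sketch, not a tool the project ships: parse_commit_header is a hypothetical helper, and treating the scope as optional is inferred from the "docs:" example above.

```python
import re
from typing import Optional

# Types listed in the "Types" section above.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore", "perf")

# <type>(<scope>): <subject>, with the scope optional as in the "docs:" example.
HEADER_PATTERN = re.compile(
    r"^(?P<type>" + "|".join(COMMIT_TYPES) + r")"
    r"(?:\((?P<scope>[^()\s]+)\))?"
    r": (?P<subject>\S.*)$"
)


def parse_commit_header(header: str) -> Optional[dict[str, Optional[str]]]:
    """Split a commit header into type, scope and subject, or return None if invalid."""
    match = HEADER_PATTERN.match(header)
    return match.groupdict() if match else None
```

Only the first line of a commit message should be passed in; the body and footer are free-form and are not checked by this sketch.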

Pull Request Process

Before Submitting

Tests Pass - All tests must pass locally
Code Formatted - Run uv run ruff format .
Code Linted - Run uv run ruff check .
Coverage Adequate - New code has >80% test coverage
Documentation Updated - Update relevant docs if needed

Creating a Pull Request

# 1. Push your feature branch
git push origin feature/your-feature-name

# 2. Create pull request on GitHub
# Visit: https://github.com/wareflowx/excel-to-sql/compare/main...YOUR_USERNAME:excel-to-sql:feature/your-feature-name

Pull Request Template

## Description
Brief description of the changes

## Type of Change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation update

## Testing
- [ ] Tests added/updated
- [ ] All tests pass locally
- [ ] Coverage maintained above 80%

## Checklist
- [ ] My code follows the style guidelines
- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings

Review Process

Automated Checks - CI runs tests and linting
Code Review - Maintainers review your code
Feedback - Address review comments
Approval - PR approved and merged
Cleanup - Delete your feature branch after merge

Merging

Maintainers will squash and merge commits
Maintainers will update CHANGELOG.md
Maintainers will create a release if appropriate

Reporting Issues

Bug Reports

Report bugs using GitHub Issues with the following template:

### Description
Clear description of the bug

### Reproduction Steps
1. Step 1
2. Step 2
3. ...

### Expected Behavior
What should happen

### Actual Behavior
What actually happens

### Environment
- OS: [e.g. Windows 11, macOS 14, Ubuntu 22.04]
- Python Version: [e.g. 3.11.5]
- excel-to-sql Version: [e.g. 0.3.0]

### Additional Context
Stack traces, screenshots, etc.

Feature Requests

Request features using GitHub Issues:

### Problem Description
What problem does this solve?

### Proposed Solution
How should it work?

### Alternatives Considered
What other approaches did you consider?

### Additional Context
Examples, mockups, etc.

Development Resources

Documentation

README - Main documentation
CHANGELOG - Version history
API Reference - API documentation (planned)
Examples - Usage examples (planned)

Tools Used

uv - Python package manager
pytest - Testing framework
ruff - Linter and formatter
pandas - Data manipulation
Rich - Terminal output
Typer - CLI framework

Getting Help

Documentation - Start with the README and existing code
Issues - Search GitHub Issues for similar problems
Discussions - Use GitHub Discussions for questions
Contact - Open an issue for bugs or feature requests

License

By contributing, you agree that your contributions will be licensed under the MIT License.

Thank you for contributing to excel-to-sql! Your contributions are greatly appreciated.
