Thank you for your interest in contributing to excel-to-sql! This document provides guidelines and instructions for contributing to the project.
- Code of Conduct
- Getting Started
- Development Setup
- Coding Standards
- Testing Guidelines
- Commit Messages
- Pull Request Process
- Reporting Issues
We are committed to providing a welcoming and inclusive environment for all contributors. Please be respectful and constructive in all interactions.
- Use welcoming and inclusive language
- Be respectful of differing viewpoints and experiences
- Gracefully accept constructive criticism
- Focus on what is best for the community
- Show empathy towards other community members
- Python 3.10 or higher
- Git
- GitHub account
- Basic understanding of Python, Git, and CLI tools
# 1. Fork the repository on GitHub
# Click the "Fork" button in the top-right corner
# 2. Clone your fork locally
git clone https://github.com/YOUR_USERNAME/excel-to-sql.git
cd excel-to-sql
# 3. Add the original repository as upstream
git remote add upstream https://github.com/wareflowx/excel-to-sql.git
# 4. Install uv (Python package manager)
# On Windows:
powershell -ExecutionPolicy ByPass -c "irm https://astral.sh/uv/install.ps1 | iex"
# On Linux/macOS:
curl -LsSf https://astral.sh/uv/install.sh | sh
# 5. Install dependencies
uv sync
# 6. Install development dependencies
uv sync --dev
# 1. Ensure your main branch is up-to-date
git checkout main
git fetch upstream
git rebase upstream/main
# 2. Create a new feature branch
git checkout -b feature/your-feature-name
- feature/feature-name - New features
- fix/bug-name - Bug fixes
- docs/documentation-name - Documentation updates
- refactor/component-name - Code refactoring
- test/test-name - Test additions or updates
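The branch naming convention can be checked mechanically before pushing. The following is a minimal sketch, not part of the project: `is_valid_branch_name` is a hypothetical helper, and the restriction to lowercase kebab-case after the prefix is an assumption inferred from the example names.

```python
import re

# Prefixes taken from the branch naming list in this guide.
BRANCH_PREFIXES = ("feature", "fix", "docs", "refactor", "test")

# Assumption: the part after the slash is lowercase kebab-case.
BRANCH_PATTERN = re.compile(
    "^(?:" + "|".join(BRANCH_PREFIXES) + ")/[a-z0-9]+(?:-[a-z0-9]+)*$"
)


def is_valid_branch_name(name: str) -> bool:
    """Return True if name looks like '<prefix>/<kebab-case-description>'."""
    return BRANCH_PATTERN.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("feature/your-feature-name", "main", "fix/"):
        print(candidate, is_valid_branch_name(candidate))
```

A check like this could run in a local Git hook, but nothing in this guide requires it.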
# 1. Make your changes
# Edit files, add features, fix bugs
# 2. Run tests locally
uv run pytest
# 3. Run with coverage
uv run pytest --cov=excel_to_sql --cov-report=html
# 4. Format code
uv run ruff format excel_to_sql/ tests/
# 5. Lint code
uv run ruff check excel_to_sql/ tests/
# 6. Commit your changes
git add .
git commit -m "feat: add your feature description"
# 7. Push to your fork
git push origin feature/your-feature-name
We follow PEP 8 style guidelines with the following tools:
# Format code
uv run ruff format excel_to_sql/ tests/
# Check formatting
uv run ruff format --check excel_to_sql/ tests/
# Lint code
uv run ruff check excel_to_sql/ tests/
# Auto-fix linting issues
uv run ruff check --fix excel_to_sql/ tests/
excel_to_sql/
├── cli.py # CLI interface entry point
├── __init__.py # Package exports
├── __version__.py # Version information
├── sdk/ # Python SDK implementation
├── entities/ # Domain entities (Project, Database, Table, etc.)
├── transformations/ # Data transformation logic
├── validators/ # Validation framework
├── profiling/ # Data quality profiling
├── auto_pilot/ # Auto-Pilot mode components
└── ui/ # Interactive wizard UI
Organize imports in isort order (ruff provides isort-compatible import sorting):
# Standard library imports
import os
from pathlib import Path
# Third-party imports
import pandas as pd
from rich.console import Console
# Local imports
from excel_to_sql.entities import Project
from excel_to_sql.validators import ValidationRule
Use Google-style docstrings:
def process_file(file_path: Path, patterns: dict) -> dict:
    """Process a single Excel file and detect patterns.

    Args:
        file_path: Path to the Excel file to process.
        patterns: Dictionary of detected patterns.

    Returns:
        Dictionary containing processing results with keys:
            - 'file_path': str - Path to processed file
            - 'table_name': str - Detected table name
            - 'patterns': dict - Detected patterns

    Raises:
        FileNotFoundError: If the file does not exist.
        ValueError: If the file format is invalid.
    """
All functions should include type hints:
from typing import Any

import pandas as pd

def detect_patterns(
    df: pd.DataFrame,
    table_name: str,
    confidence_threshold: float = 0.7,
) -> dict[str, Any]:
    """Detect patterns in DataFrame."""
    pass
# Use specific exceptions
try:
    df = pd.read_excel(file_path)
except FileNotFoundError as e:
    # Chain the original exception so the traceback is preserved
    raise FileNotFoundError(f"Excel file not found: {file_path}") from e
except Exception as e:
    raise ValueError(f"Failed to read Excel file: {e}") from e
# Log errors appropriately
import logging
logger = logging.getLogger(__name__)
# Inside an except block, where e is the caught exception
logger.error("Error processing file %s: %s", file_path, e)
tests/
├── test_cli.py # CLI command tests
├── test_sdk.py # SDK functionality tests
├── test_transformations/ # Transformation tests
├── test_validators/ # Validator tests
├── test_auto_pilot/ # Auto-Pilot component tests
│ ├── test_detector.py # PatternDetector tests
│ ├── test_quality.py # QualityScorer tests
│ ├── test_recommender.py # RecommendationEngine tests
│ ├── test_auto_fix.py # AutoFixer tests
│ └── test_auto_fix_integration.py # Integration tests
├── test_ui/ # UI component tests
└── fixtures/ # Test data and fixtures
└── auto_pilot/ # Auto-Pilot test Excel files
import pytest
import pandas as pd
from pathlib import Path
class TestPatternDetector:
    """Unit tests for PatternDetector class."""

    def test_initialization(self) -> None:
        """Test that PatternDetector initializes correctly."""
        from excel_to_sql.auto_pilot.detector import PatternDetector
        detector = PatternDetector()
        assert detector is not None

    def test_detect_primary_key(self) -> None:
        """Test primary key detection."""
        from excel_to_sql.auto_pilot.detector import PatternDetector
        detector = PatternDetector()
        df = pd.DataFrame({
            "id": [1, 2, 3],
            "name": ["A", "B", "C"]
        })
        patterns = detector.detect_patterns(df, "test")
        assert patterns["primary_key"] == "id"
- New features must have test coverage > 80%
- Critical paths must have 100% coverage
- Integration tests for complex workflows
# Run tests with coverage
uv run pytest --cov=excel_to_sql --cov-report=html
# Check coverage report
open htmlcov/index.html  # macOS; use xdg-open on Linux or start on Windows
Place test data in tests/fixtures/:
tests/fixtures/
├── auto_pilot/
│ ├── commandes.xlsx
│ ├── mouvements.xlsx
│ └── produits.xlsx
└── transformations/
└── test_data.xlsx
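Tests can resolve fixture files relative to this directory instead of hard-coding paths. The sketch below is illustrative only: `fixture_path` and `FIXTURES_DIR` are hypothetical names that do not appear in the project, and the relative `tests/fixtures` base assumes tests run from the repository root.

```python
from pathlib import Path

# Assumption: tests are run from the repository root.
FIXTURES_DIR = Path("tests") / "fixtures"


def fixture_path(*parts: str, base: Path = FIXTURES_DIR) -> Path:
    """Return the path to a fixture file, failing early if it is missing."""
    path = base.joinpath(*parts)
    if not path.is_file():
        raise FileNotFoundError(f"Fixture not found: {path}")
    return path
```

A test would then call, for example, `fixture_path("auto_pilot", "produits.xlsx")` and pass the result to the code under test. Failing early with a clear message makes a missing fixture easier to diagnose than a read error deeper in the test.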
Follow conventional commit format:
<type>(<scope>): <subject>
<body>
<footer>
- feat - New feature
- fix - Bug fix
- docs - Documentation changes
- style - Code style changes (formatting, etc.)
- refactor - Code refactoring
- test - Adding or updating tests
- chore - Maintenance tasks
- perf - Performance improvements
feat(auto_pilot): add pattern detection for foreign keys

Implement foreign key detection based on column name patterns
and value overlap analysis with existing tables.

Closes #14
Co-Authored-By: Claude Sonnet <noreply@anthropic.com>

fix(cli): handle Windows path separators correctly

Fix issue where Windows backslashes in paths caused errors.
Use pathlib.Path for cross-platform compatibility.

Fixes #42

docs: update README with Auto-Pilot documentation

Add comprehensive documentation for Auto-Pilot mode including:
- Pattern detection overview
- Quality scoring explanation
- Interactive wizard usage
- Code examples

Co-Authored-By: Claude Sonnet <noreply@anthropic.com>
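The header line of a commit message can be validated against the format and type list above. This is a rough sketch under stated assumptions: `parse_commit_header` is a hypothetical helper, the regular expression is one possible reading of `<type>(<scope>): <subject>` with an optional scope, and the project may enforce the format differently or not at all.

```python
import re

# Types taken from the list in this guide.
COMMIT_TYPES = ("feat", "fix", "docs", "style", "refactor", "test", "chore", "perf")

# Assumption: scope is optional and the subject follows ": " on the same line.
HEADER_PATTERN = re.compile(
    r"^(?P<type>[a-z]+)(?:\((?P<scope>[^()]+)\))?: (?P<subject>\S.*)$"
)


def parse_commit_header(header: str) -> dict:
    """Split a commit header into type, scope, and subject, or raise ValueError."""
    match = HEADER_PATTERN.match(header)
    if match is None:
        raise ValueError(f"Header does not match '<type>(<scope>): <subject>': {header!r}")
    if match["type"] not in COMMIT_TYPES:
        raise ValueError(f"Unknown commit type: {match['type']!r}")
    return match.groupdict()


if __name__ == "__main__":
    print(parse_commit_header("fix(cli): handle Windows path separators correctly"))
```

Only the first line is checked here; the body and footer are free-form text separated from the header by a blank line.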
- Tests Pass - All tests must pass locally
- Code Formatted - Run uv run ruff format .
- Code Linted - Run uv run ruff check .
- Coverage Adequate - New code has >80% test coverage
- Documentation Updated - Update relevant docs if needed
# 1. Push your feature branch
git push origin feature/your-feature-name
# 2. Create pull request on GitHub
# Visit: https://github.com/wareflowx/excel-to-sql/compare/main...YOUR_USERNAME:excel-to-sql:feature/your-feature-name
## Description
Brief description of the changes
## Type of Change
- [ ] Bug fix (non-breaking change which fixes an issue)
- [ ] New feature (non-breaking change which adds functionality)
- [ ] Breaking change (fix or feature that would cause existing functionality to not work as expected)
- [ ] Documentation update
## Testing
- [ ] Tests added/updated
- [ ] All tests pass locally
- [ ] Coverage maintained above 80%
## Checklist
- [ ] My code follows the style guidelines
- [ ] I have performed a self-review of my code
- [ ] I have commented my code, particularly in hard-to-understand areas
- [ ] I have made corresponding changes to the documentation
- [ ] My changes generate no new warnings
- Automated Checks - CI runs tests and linting
- Code Review - Maintainers review your code
- Feedback - Address review comments
- Approval - PR approved and merged
- Cleanup - Delete your feature branch after merge
- Maintainers will squash and merge commits
- Maintainers will update CHANGELOG.md
- Maintainers will create a release if appropriate
Report bugs using GitHub Issues with the following template:
### Description
Clear description of the bug
### Reproduction Steps
1. Step 1
2. Step 2
3. ...
### Expected Behavior
What should happen
### Actual Behavior
What actually happens
### Environment
- OS: [e.g. Windows 11, macOS 14, Ubuntu 22.04]
- Python Version: [e.g. 3.11.5]
- excel-to-sql Version: [e.g. 0.3.0]
### Additional Context
Stack traces, screenshots, etc.
Request features using GitHub Issues:
### Problem Description
What problem does this solve?
### Proposed Solution
How should it work?
### Alternatives Considered
What other approaches did you consider?
### Additional Context
Examples, mockups, etc.
- README - Main documentation
- CHANGELOG - Version history
- API Reference - API documentation (planned)
- Examples - Usage examples (planned)
- uv - Python package manager
- pytest - Testing framework
- ruff - Linter and formatter
- pandas - Data manipulation
- Rich - Terminal output
- Typer - CLI framework
- Documentation - Start with the README and existing code
- Issues - Search GitHub Issues for similar problems
- Discussions - Use GitHub Discussions for questions
- Contact - Open an issue for bugs or feature requests
By contributing, you agree that your contributions will be licensed under the MIT License.
Thank you for contributing to excel-to-sql! Your contributions are greatly appreciated.