Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[1.0.0] - 2024-10-14

Major Reorganization

This release reorganizes the repository from a collection of standalone scripts into a single installable CLI tool.

Note: This entry documents the reorganization work. The actual release date will be set when the package is published to PyPI/Bioconda.

Added

  • Unified CLI Interface: All tools are now accessible through the bioinfo-tools command
  • Subcommands:
    • extract-cds - Filter CDS features from GenBank files (replaces cdsselector.py)
    • extract-proteins - Extract amino acid sequences (replaces take genes into aminoacid.py)
    • extract-genes - Extract nucleotide sequences (replaces take genes into nucleotides.py)
    • blast - Run BLAST searches with auto-formatting (replaces blastall.py)
    • convert-ab1 - Convert AB1 sequencing files to FASTQ (replaces ab1_to_blast.py)
    • process-blast-results - Create hit matrices from BLAST results (replaces blastall_result.py)
    • rename-fasta - Rename FASTA files based on headers (replaces fastarename.py)
    • compare-proteins - Compare protein sequences and identify mutations (replaces get_mutations.py)
    • download-pdb - Download PDB structure files (replaces pdb_downloader.py)
    • generate-pgap-files - Generate PGAP input files (replaces pepnucfunction.py)
    • extract-rrna - Extract ribosomal RNA sequences (replaces retrieve bacterial ribossomal rna.py)
  • Package Structure:
    • bioinfo_tools/ - Main package with modular architecture
    • bioinfo_tools/commands/ - Subcommand implementations
    • bioinfo_tools/utils/ - Shared utility functions
  • Packaging Files:
    • setup.py - Traditional Python packaging
    • pyproject.toml - Modern Python packaging (PEP 517/518)
    • conda_recipe/meta.yaml - Conda/Bioconda recipe
    • pixi.toml - Pixi configuration
    • MANIFEST.in - Package manifest
  • Documentation:
    • Updated README.md with comprehensive usage guide
    • CONTRIBUTING.md - Developer guide for adding new commands
    • DEPLOYMENT.md - Deployment instructions for PyPI/Bioconda/Pixi
    • MIGRATION.md - Migration guide from old scripts
    • COMMAND_REFERENCE.md - Complete CLI command reference
    • CHANGELOG.md - This file
  • Testing:
    • Updated test suite to work with new CLI
    • All 11 tests passing
    • Test documentation in tests/README.md
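
The unified interface described above can be pictured as one top-level parser that dispatches to per-subcommand handlers. The following is only a minimal sketch, assuming the CLI is built with argparse (the changelog does not say which library is used); the names SUBCOMMANDS, make_handler, build_parser, and main are hypothetical, and only the subcommand names and descriptions come from the list above.

```python
import argparse

# Subcommand names and descriptions are taken from the changelog list;
# everything else here (function names, placeholder handlers) is hypothetical.
SUBCOMMANDS = {
    "extract-cds": "Filter CDS features from GenBank files",
    "extract-proteins": "Extract amino acid sequences",
    "extract-genes": "Extract nucleotide sequences",
    "rename-fasta": "Rename FASTA files based on headers",
}


def make_handler(name):
    """Return a placeholder handler; a real one would do the actual work."""
    def handler(args):
        return f"would run {name}"
    return handler


def build_parser():
    """Create the top-level parser and register one sub-parser per command."""
    parser = argparse.ArgumentParser(prog="bioinfo-tools")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name, help_text in SUBCOMMANDS.items():
        sub = subparsers.add_parser(name, help=help_text)
        # Attach the handler so main() can dispatch without an if/elif chain.
        sub.set_defaults(func=make_handler(name))
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return args.func(args)


print(main(["extract-cds"]))  # would run extract-cds
```

With this layout, adding a subcommand means adding one registry entry and one handler, which is the kind of modularity the package structure above aims for.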

Changed

  • Harmonized Arguments: All commands now use consistent argument patterns:
    • -i/--input-folder for input directories
    • -o/--output-folder for output directories
    • -g/--genes-list for gene list files
    • Short and long options are available for all arguments
  • Better Help System: Each command has detailed help with examples
  • Improved Error Handling: Better error messages and logging
  • Logging: All commands use consistent INFO-level logging format
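
One way to keep argument names and logging consistent across commands is to define them once and reuse them. The sketch below assumes argparse and the standard logging module; the helper names add_common_arguments and configure_logging, the required flags, and the log format string are all assumptions for illustration. Only the option names (-i/--input-folder, -o/--output-folder, -g/--genes-list) and the INFO level come from the list above.

```python
import argparse
import logging


def add_common_arguments(parser, with_genes_list=False):
    """Attach the shared options so every subcommand spells them the same way."""
    parser.add_argument("-i", "--input-folder", required=True,
                        help="Input directory")
    parser.add_argument("-o", "--output-folder", required=True,
                        help="Output directory")
    if with_genes_list:
        parser.add_argument("-g", "--genes-list", required=True,
                            help="Gene list file")
    return parser


def configure_logging():
    """Use one INFO-level format everywhere (the format string is assumed)."""
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s - %(levelname)s - %(message)s",
    )


parser = add_common_arguments(
    argparse.ArgumentParser(prog="bioinfo-tools extract-cds"),
    with_genes_list=True,
)
args = parser.parse_args(["-i", "input/", "-g", "genes.txt", "-o", "output/"])

configure_logging()
logging.info("Reading %s, writing %s", args.input_folder, args.output_folder)
```

Because the short and long forms are registered together, `--input-folder input/` and `-i input/` populate the same attribute, `args.input_folder`.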

Improved

  • BLAST Command:
    • Named arguments instead of positional arguments
    • Optional output folder specification
    • Optional output format selection
    • Better database auto-formatting
  • Modular Architecture: Easy to add new subcommands (documented in CONTRIBUTING.md)
  • Code Quality:
    • Proper Python package structure
    • Docstrings for all functions
    • Type hints where appropriate
    • PEP 8 compliant

Deprecated

The following standalone scripts have been superseded by CLI subcommands and removed:

  • cdsselector.py → Use bioinfo-tools extract-cds
  • take genes into aminoacid.py → Use bioinfo-tools extract-proteins
  • take genes into nucleotides.py → Use bioinfo-tools extract-genes
  • blastall.py → Use bioinfo-tools blast
  • ab1_to_blast.py → Use bioinfo-tools convert-ab1
  • blastall_result.py → Use bioinfo-tools process-blast-results
  • fastarename.py → Use bioinfo-tools rename-fasta
  • get_mutations.py → Use bioinfo-tools compare-proteins
  • pdb_downloader.py → Use bioinfo-tools download-pdb
  • pepnucfunction.py → Use bioinfo-tools generate-pgap-files
  • retrieve bacterial ribossomal rna.py → Use bioinfo-tools extract-rrna

These scripts are no longer included in the repository. See MIGRATION.md for the transition guide.
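
When updating pipelines that still call the old scripts, the mapping above can be kept as a small lookup table. This is an illustrative sketch, not part of the package: LEGACY_TO_SUBCOMMAND and suggest_replacement are hypothetical names, and the entries are copied from the list above (including the original spelling of the script filenames).

```python
# Mapping copied from the Deprecated list; the dict and function names
# are hypothetical helpers, not part of bioinfo-tools itself.
LEGACY_TO_SUBCOMMAND = {
    "cdsselector.py": "extract-cds",
    "take genes into aminoacid.py": "extract-proteins",
    "take genes into nucleotides.py": "extract-genes",
    "blastall.py": "blast",
    "ab1_to_blast.py": "convert-ab1",
    "blastall_result.py": "process-blast-results",
    "fastarename.py": "rename-fasta",
    "get_mutations.py": "compare-proteins",
    "pdb_downloader.py": "download-pdb",
    "pepnucfunction.py": "generate-pgap-files",
    "retrieve bacterial ribossomal rna.py": "extract-rrna",
}


def suggest_replacement(script_name):
    """Return the replacement command for a legacy script, or None if unknown."""
    subcommand = LEGACY_TO_SUBCOMMAND.get(script_name)
    if subcommand is None:
        return None
    return f"bioinfo-tools {subcommand}"


print(suggest_replacement("blastall.py"))  # bioinfo-tools blast
```

Arguments are not translated here, because several commands changed their argument style (for example, BLAST moved from positional to named arguments); MIGRATION.md remains the reference for those details.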

Fixed

  • Consistent error handling across all commands
  • Better validation of input files and folders
  • Proper exit codes (0 for success, non-zero for errors)
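
The validation and exit-code behavior listed above might look roughly like the following. This is a sketch under assumptions: validate_input_folder and run are hypothetical names, and the specific exit code 1 is chosen only for illustration (the changelog says only "non-zero for errors").

```python
import sys
import tempfile
from pathlib import Path


def validate_input_folder(path_str):
    """Raise a clear error when the input folder is missing or not a directory."""
    path = Path(path_str)
    if not path.is_dir():
        raise FileNotFoundError(f"Input folder not found: {path}")
    return path


def run(input_folder):
    """Return 0 on success and a non-zero code on error (1 is an assumed value)."""
    try:
        validate_input_folder(input_folder)
    except FileNotFoundError as exc:
        print(f"error: {exc}", file=sys.stderr)
        return 1
    return 0


# Demonstrate both outcomes with a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    ok_code = run(tmp)
missing_code = run(str(Path(tmp) / "does-not-exist"))

print(ok_code, missing_code)  # 0 1
```

A real entry point would typically finish with `sys.exit(run(...))` so that shell scripts and workflow managers can detect failures.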

Removed

  • All standalone Python scripts have been removed (converted to CLI subcommands)
  • Old scripts: cdsselector.py, take genes into aminoacid.py, take genes into nucleotides.py, blastall.py, ab1_to_blast.py, blastall_result.py, fastarename.py, get_mutations.py, pdb_downloader.py, pepnucfunction.py, retrieve bacterial ribossomal rna.py
  • All functionality is now available through the unified bioinfo-tools CLI

[0.x.x] - Previous Versions

Previous versions consisted of individual Python scripts without unified versioning.

Legacy Scripts

  • cdsselector.py - CDS filtering from GenBank files
  • take genes into aminoacid.py - Amino acid extraction
  • take genes into nucleotides.py - Nucleotide extraction
  • blastall.py - BLAST automation
  • ab1_to_blast.py - AB1 to FASTQ conversion
  • blastall_result.py - BLAST result processing
  • fastarename.py - FASTA file renaming utility
  • get_mutations.py - Mutation analysis
  • pdb_downloader.py - PDB file downloader
  • pepnucfunction.py - Peptide/nucleotide/function file generator
  • retrieve bacterial ribossomal rna.py - rRNA extraction

Migration Notes

For users upgrading from previous versions:

  1. Install the package: pip install -e .
  2. Update scripts: Replace old script calls with new CLI commands
  3. Refer to MIGRATION.md: Complete guide for transitioning

Example Migration

Before:

python cdsselector.py --input-folder input/ --genes-list genes.txt --output-folder output/

After:

bioinfo-tools extract-cds -i input/ -g genes.txt -o output/

Future Plans

  • Add remaining legacy scripts as subcommands
  • Publish to PyPI
  • Submit to Bioconda
  • Add more comprehensive examples
  • Consider GUI wrapper for common workflows
  • Add parallel processing options for large datasets

For more details on any changes, see the commit history.

For questions or issues, please open an issue or contact davijosuemarcon@gmail.com.