All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
This release represents a complete reorganization of the repository into a professional CLI tool.
Note: This version documents the major reorganization work. The actual release date will be determined when this is published to PyPI/Bioconda.
- Unified CLI Interface: All tools now accessible through bioinfo-tools command
- Subcommands:
  - extract-cds - Filter CDS features from GenBank files (replaces cdsselector.py)
  - extract-proteins - Extract amino acid sequences (replaces take genes into aminoacid.py)
  - extract-genes - Extract nucleotide sequences (replaces take genes into nucleotides.py)
  - blast - Run BLAST searches with auto-formatting (replaces blastall.py)
  - convert-ab1 - Convert AB1 sequencing files to FASTQ (replaces ab1_to_blast.py)
  - process-blast-results - Create hit matrices from BLAST results (replaces blastall_result.py)
  - rename-fasta - Rename FASTA files based on headers (replaces fastarename.py)
  - compare-proteins - Compare protein sequences and identify mutations (replaces get_mutations.py)
  - download-pdb - Download PDB structure files (replaces pdb_downloader.py)
  - generate-pgap-files - Generate PGAP input files (replaces pepnucfunction.py)
  - extract-rrna - Extract ribosomal RNA sequences (replaces retrieve bacterial ribossomal rna.py)
- Package Structure:
  - bioinfo_tools/ - Main package with modular architecture
  - bioinfo_tools/commands/ - Subcommand implementations
  - bioinfo_tools/utils/ - Shared utility functions
- Packaging Files:
  - setup.py - Traditional Python packaging
  - pyproject.toml - Modern Python packaging (PEP 517/518)
  - conda_recipe/meta.yaml - Conda/Bioconda recipe
  - pixi.toml - Pixi configuration
  - MANIFEST.in - Package manifest
- Documentation:
  - Updated README.md with comprehensive usage guide
  - CONTRIBUTING.md - Developer guide for adding new commands
  - DEPLOYMENT.md - Deployment instructions for PyPI/Bioconda/Pixi
  - MIGRATION.md - Migration guide from old scripts
  - COMMAND_REFERENCE.md - Complete CLI command reference
  - CHANGELOG.md - This file
- Testing:
  - Updated test suite to work with new CLI
  - All 11 tests passing
  - Test documentation in tests/README.md
- Harmonized Arguments: All commands now use consistent argument patterns:
  - -i/--input-folder for input directories
  - -o/--output-folder for output directories
  - -g/--genes-list for gene list files
  - Short and long options available for all arguments
- Better Help System: Each command has detailed help with examples
- Improved Error Handling: Better error messages and logging
- Logging: All commands use consistent INFO-level logging format
- BLAST Command:
  - Named arguments instead of positional arguments
  - Optional output folder specification
  - Optional output format selection
  - Better database auto-formatting
- Modular Architecture: Easy to add new subcommands (documented in CONTRIBUTING.md)
- Code Quality:
  - Proper Python package structure
  - Docstrings for all functions
  - Type hints where appropriate
  - PEP 8 compliant
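The harmonized arguments and modular subcommand layout described above could be wired together roughly as follows. This is a minimal sketch, assuming the CLI is built on argparse (the changelog does not say which library is used). The helper names add_common_arguments, run_extract_cds, build_parser and main are hypothetical, and which options are required is also an assumption.

```python
import argparse
import logging


def add_common_arguments(parser):
    """Attach the harmonized short/long options to a subcommand parser."""
    # Assumption: input is required, output and gene list are optional.
    parser.add_argument("-i", "--input-folder", required=True,
                        help="input directory")
    parser.add_argument("-o", "--output-folder", default="output",
                        help="output directory")
    parser.add_argument("-g", "--genes-list",
                        help="gene list file")


def run_extract_cds(args):
    """Hypothetical handler; the real command filters CDS features."""
    logging.info("extract-cds: %s -> %s", args.input_folder, args.output_folder)
    return 0  # exit code reported back to the shell


def build_parser():
    """Build the top-level parser and register one subcommand."""
    parser = argparse.ArgumentParser(prog="bioinfo-tools")
    subparsers = parser.add_subparsers(dest="command", required=True)

    extract_cds = subparsers.add_parser(
        "extract-cds", help="Filter CDS features from GenBank files")
    add_common_arguments(extract_cds)
    # Each subcommand stores its handler so main() can dispatch generically.
    extract_cds.set_defaults(func=run_extract_cds)
    return parser


def main(argv=None):
    """Parse arguments, configure INFO-level logging, run the handler."""
    logging.basicConfig(level=logging.INFO,
                        format="%(levelname)s: %(message)s")
    args = build_parser().parse_args(argv)
    return args.func(args)
```

Under this layout, adding a subcommand would mean writing one handler and registering one more parser in build_parser; see CONTRIBUTING.md for the project's actual procedure.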
The following standalone scripts have been removed as they are now available as CLI subcommands:
- cdsselector.py → Use bioinfo-tools extract-cds
- take genes into aminoacid.py → Use bioinfo-tools extract-proteins
- take genes into nucleotides.py → Use bioinfo-tools extract-genes
- blastall.py → Use bioinfo-tools blast
- ab1_to_blast.py → Use bioinfo-tools convert-ab1
- blastall_result.py → Use bioinfo-tools process-blast-results
- fastarename.py → Use bioinfo-tools rename-fasta
- get_mutations.py → Use bioinfo-tools compare-proteins
- pdb_downloader.py → Use bioinfo-tools download-pdb
- pepnucfunction.py → Use bioinfo-tools generate-pgap-files
- retrieve bacterial ribossomal rna.py → Use bioinfo-tools extract-rrna
These scripts are no longer included in the repository. See MIGRATION.md for the transition guide.
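For anyone updating wrapper scripts or pipelines, the script-to-subcommand mapping listed above can be kept as a small lookup table. The sketch below copies the names from this changelog; the dictionary name LEGACY_TO_SUBCOMMAND and the helper new_command are hypothetical and not part of the package.

```python
# Removed script name -> replacement subcommand, as listed in this changelog.
LEGACY_TO_SUBCOMMAND = {
    "cdsselector.py": "extract-cds",
    "take genes into aminoacid.py": "extract-proteins",
    "take genes into nucleotides.py": "extract-genes",
    "blastall.py": "blast",
    "ab1_to_blast.py": "convert-ab1",
    "blastall_result.py": "process-blast-results",
    "fastarename.py": "rename-fasta",
    "get_mutations.py": "compare-proteins",
    "pdb_downloader.py": "download-pdb",
    "pepnucfunction.py": "generate-pgap-files",
    "retrieve bacterial ribossomal rna.py": "extract-rrna",
}


def new_command(old_script):
    """Return the replacement CLI invocation for a removed script."""
    return "bioinfo-tools " + LEGACY_TO_SUBCOMMAND[old_script]
```

Note that arguments are not translated by this lookup; MIGRATION.md covers per-command option changes.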
- Consistent error handling across all commands
- Better validation of input files and folders
- Proper exit codes (0 for success, non-zero for errors)
- All standalone Python scripts have been removed (converted to CLI subcommands)
- Old scripts: cdsselector.py, take genes into aminoacid.py, take genes into nucleotides.py, blastall.py, ab1_to_blast.py, blastall_result.py, fastarename.py, get_mutations.py, pdb_downloader.py, pepnucfunction.py, retrieve bacterial ribossomal rna.py
- All functionality is now available through the unified bioinfo-tools CLI
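The input validation and exit-code behavior noted earlier in this list (0 for success, non-zero for errors) might look like the following. This is a sketch only: the function names validate_input_folder and run, the logger name, and the choice of 1 as the error code are assumptions, not taken from the package source.

```python
import logging
from pathlib import Path

logger = logging.getLogger("bioinfo_tools")


def validate_input_folder(path):
    """Return the folder as a Path, or raise ValueError if it is not a directory."""
    folder = Path(path)
    if not folder.is_dir():
        raise ValueError(f"input folder not found: {folder}")
    return folder


def run(input_folder):
    """Return 0 on success and 1 when validation fails."""
    try:
        folder = validate_input_folder(input_folder)
    except ValueError as exc:
        # Report the problem through logging instead of a raw traceback.
        logger.error("%s", exc)
        return 1
    logger.info("processing %s", folder)
    return 0
```

A command entry point could then pass the result to sys.exit so that shell scripts and workflow managers can detect failures.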
Previous versions consisted of individual Python scripts without unified versioning.
- cdsselector.py - CDS filtering from GenBank files
- take genes into aminoacid.py - Amino acid extraction
- take genes into nucleotides.py - Nucleotide extraction
- blastall.py - BLAST automation
- ab1_to_blast.py - AB1 to FASTQ conversion
- blastall_result.py - BLAST result processing
- fastarename.py - FASTA file renaming utility
- get_mutations.py - Mutation analysis
- pdb_downloader.py - PDB file downloader
- pepnucfunction.py - Peptide/nucleotide/function file generator
- retrieve bacterial ribossomal rna.py - rRNA extraction
For users upgrading from previous versions:
- Install the package: pip install -e .
- Update scripts: Replace old script calls with new CLI commands
- Refer to MIGRATION.md: Complete guide for transitioning

Before:

    python cdsselector.py --input-folder input/ --genes-list genes.txt --output-folder output/

After:

    bioinfo-tools extract-cds -i input/ -g genes.txt -o output/

- Add remaining legacy scripts as subcommands
- Publish to PyPI
- Submit to Bioconda
- Add more comprehensive examples
- Consider GUI wrapper for common workflows
- Add parallel processing options for large datasets
For more details on any changes, see the commit history.
For questions or issues, please open an issue or contact davijosuemarcon@gmail.com.