id-bioinfo/ggFlu

ggFlu: Global Genotyping for H5 Highly Pathogenic Avian Influenza Virus

What This Tool Does

Given full-genome influenza A H5 query sequences, the workflow assigns each query to a predefined major and minor genotype by placing the aligned query segments onto per-segment reference trees with TIPars.

Website

A web version is available at www.ggflu.org and www.ggflu.com.

Method Summary

  • Global H5 genotyping framework with a two-level nomenclature:
    • Major genotype: gene constellation across 8 segments.
    • Minor genotype: finer reassortment-level sub-lineage within a major genotype.
  • HA is used as the lineage anchor (WHO/WOAH/FAO clade context).
  • Internal and NA segment clusters are defined using distance/host/geography criteria.
  • Query assignment uses phylogenetic placement (TIPars) onto fixed segment reference trees.
  • Results are harmonized with external systems through mapping fields (e.g., GenoFLU and GenIn labels).
  • A Pango-style (SARS-CoV-2) naming convention is adopted (e.g., B.1) to simplify communication without losing evolutionary history.


Input Format

The input is a FASTA file in which each header includes a segment tag, e.g.:

>sample_001|HA
ATG...
>sample_001|NA
ATG...
>sample_001|PB2
ATG...

The splitter uses |HA, |NA, |PB2, |PB1, |PA, |NP, |MP, |NS tags in headers.
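As an illustration of this header convention, here is a minimal Python sketch that groups records by segment tag. It is not the tool's own splitter: the function names are hypothetical, and it assumes the tag is the last `|`-delimited field of the header, as in the example above.

```python
from collections import defaultdict

# Segment tags listed in this README.
SEGMENT_TAGS = ("HA", "NA", "PB2", "PB1", "PA", "NP", "MP", "NS")


def split_fasta_by_segment(fasta_text):
    """Return {segment: {sample_id: sequence}} for a tagged FASTA string.

    Assumption: the segment tag is the text after the last '|' in the header.
    """
    by_segment = defaultdict(dict)
    sample_id, segment = None, None
    for raw_line in fasta_text.splitlines():
        line = raw_line.strip()
        if not line:
            continue
        if line.startswith(">"):
            sample_id, sep, tag = line[1:].rpartition("|")
            if not sep or tag not in SEGMENT_TAGS:
                raise ValueError(f"header lacks a known segment tag: {line}")
            segment = tag
            by_segment[segment][sample_id] = ""
        elif segment is None:
            raise ValueError("sequence data found before the first header")
        else:
            # Multi-line sequences are concatenated.
            by_segment[segment][sample_id] += line
    return dict(by_segment)


def missing_segments(by_segment, sample_id):
    """List the segment tags for which a sample has no record."""
    return [t for t in SEGMENT_TAGS if sample_id not in by_segment.get(t, {})]
```

A check like `missing_segments` can be useful before submission, since the workflow expects full genomes.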

How To Interpret Results

Main file:

  • QueryGenotypeTable.txt

Key fields:

  • qid: query sequence ID.
  • HAclade: inferred HA clade/subtype context for assignment.
  • MajorGenotype_alias: major genotype alias label.
  • MajorGenotype_lineage: major genotype lineage label.
  • MinorGenotype: minor genotype assignment.
    • * means no minor genotype could be confidently mapped in the current reference/mapping set.
  • MinorGenotype_GenoFlu, MinorGenotype_Genin:
    • Cross-system mapping labels when available.
    • Not assigned means no mapping entry.
  • Segment cluster columns (PB2ClusterName, PB1ClusterName, PAClusterName, NPClusterName, NAClusterName, MPClusterName, NSClusterName):
    • Segment-level source cluster assignment.
    • Where available, includes mapped minor cluster IDs in parentheses.
  • Notes:
    • Quality warnings from alignment diagnostics, e.g., high gap/N percentage in specific segments.
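The fields above can be read programmatically. The sketch below assumes that QueryGenotypeTable.txt is tab-delimited with a header row using the column names listed above; the delimiter is an assumption, not something this README states, and the helper names are hypothetical.

```python
import csv
import io

# Cross-system mapping columns named in the key fields above.
MAPPING_COLUMNS = ("MinorGenotype_GenoFlu", "MinorGenotype_Genin")


def load_genotype_table(text, delimiter="\t"):
    """Parse the table text into a list of dicts keyed by column name.

    Assumption: tab-delimited with a header row.
    """
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


def external_labels(row):
    """Return the cross-system labels that are present for one row.

    'Not assigned' means no mapping entry, so those columns are skipped.
    """
    labels = {}
    for column in MAPPING_COLUMNS:
        value = (row.get(column) or "").strip()
        if value and value != "Not assigned":
            labels[column] = value
    return labels


def rows_with_notes(rows):
    """Return rows that carry a quality warning in the Notes column."""
    return [row for row in rows if (row.get("Notes") or "").strip()]
```

To use it on a real result file, read the file contents first, for example `load_genotype_table(open("QueryGenotypeTable.txt").read())`, and review the rows returned by `rows_with_notes` before trusting their assignments.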

Interpretation:

  1. Within each H5 clade, major genotype names start at A and extend to A.1, B.1, and so on. Because the same label can occur in more than one clade (for example, both 2.3.4.4b B.5 and 2.3.2.1a B.5 exist), always cite the clade together with the label, e.g., H5 2.3.4.4b B.4 or H5 2.3.4.4h A.1.
  2. Existing major + existing minor match -> known genotype.
  3. Existing major + unmatched minor (*) -> candidate new minor genotype under known major backbone.
  4. Unmatched major pattern (*) -> candidate new major genotype.
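The interpretation rules can be written as a small decision function. This is only a sketch under stated assumptions: that an unmatched major genotype appears as `*` in the `MajorGenotype_alias` column, that `HAclade` holds a clade string such as `2.3.4.4b`, and that the alias column holds the short label such as `B.4`. None of these is confirmed by this README, so the column is left as a parameter, and the function names are hypothetical.

```python
UNMATCHED = "*"


def classify_assignment(row, major_column="MajorGenotype_alias"):
    """Apply interpretation rules 2 to 4 to one table row (a dict).

    Assumption: '*' in `major_column` marks an unmatched major pattern.
    """
    major = (row.get(major_column) or "").strip()
    minor = (row.get("MinorGenotype") or "").strip()
    if major == UNMATCHED:
        return "candidate new major genotype"
    if minor == UNMATCHED:
        return "candidate new minor genotype"
    return "known genotype"


def full_major_name(row, major_column="MajorGenotype_alias"):
    """Build a clade-qualified name, since labels repeat across clades."""
    return f"H5 {row['HAclade']} {row[major_column]}"


# Illustrative row only; the minor genotype value is a placeholder.
example = {"HAclade": "2.3.4.4b", "MajorGenotype_alias": "B.4", "MinorGenotype": "*"}
print(full_major_name(example), "->", classify_assignment(example))
```

Checking the major genotype first means a row with an unmatched major is reported as a candidate new major genotype regardless of its minor field.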

Version Changes

v1.3.0 (current)

  • Genotypes for 2.3.2.1a, 2.3.2.1e, 2.3.2.1g, 2.3.4.4b, 2.3.4.4h.
