A CRISPR gRNA mapping tool with the following features:
- Gene & Transcript Retrieval via Ensembl
  - Fetches gene metadata, transcript lists, and cDNA sequences using the Ensembl REST API.
- Customizable PAM Pattern & Target Length
  - Supports user-defined PAM sequences (e.g., NGG, NNGRRT) and target site lengths (e.g., 15–25 bp).
- Strand-Aware Target Detection
  - Scans both the sense and antisense strands for target sites matching the criteria.
- Raw Sequence Caching
  - Stores fetched sequences as .txt files in /datalibrary/rawsequences to avoid repeated API calls.
- Variant Overlap Reporting
  - Displays the number of variants overlapping the selected gene, based on Ensembl variation data.
- Auto-Named SQLite Target Databases
  - Stores results in uniquely named .db files based on transcript ID, PAM, and target length.
- Smart Database Reuse
  - Checks whether the same parameter combination has already been processed and, if so, skips the redundant computation.
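The PAM matching and strand-aware scanning listed above can be sketched as follows. This is a minimal illustration, not the code in seq_scan: the names `find_targets` and `reverse_complement` are hypothetical, and it assumes the target sits immediately 5' of the PAM (as with NGG and NNGRRT) and that the input contains only A, C, G, and T. PAM letters are read as IUPAC codes and expanded into a regular expression.

```python
import re

# IUPAC nucleotide codes mapped to regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "S": "[CG]", "W": "[AT]",
    "K": "[GT]", "M": "[AC]", "B": "[CGT]", "D": "[AGT]",
    "H": "[ACT]", "V": "[ACG]", "N": "[ACGT]",
}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase A/C/G/T sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_targets(seq, pam="NGG", target_len=20):
    """Return (strand, start, target, pam) tuples for both strands.

    `start` is 0-based and counted on the strand that was scanned, so
    antisense hits are relative to the reverse complement.
    """
    seq = seq.upper()
    pam_re = "".join(IUPAC[base] for base in pam.upper())
    # A lookahead is used so that overlapping sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{target_len}}})({pam_re}))")
    hits = []
    for strand, strand_seq in (("+", seq), ("-", reverse_complement(seq))):
        for match in pattern.finditer(strand_seq):
            hits.append((strand, match.start(), match.group(1), match.group(2)))
    return hits
```

Reporting antisense coordinates relative to the reverse complement keeps the sketch short; converting them back to sense-strand coordinates would be a separate step.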
- Run grna_map.py
- Enter a gene, the PAM sequence, and the desired target length.
- If a database already exists for that parameter combination, it is accessed from datalibrary/sqlite_dbs.
- If not, a new DB is created as db_{ENSEMBL_TRANSCRIPT_ID}_{PAM}_{TARGET_LEN}_.db after scanning for matches.
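The reuse check in the last two steps could look roughly like this. It is a sketch under assumptions: the function names and the `targets` table schema are hypothetical and not taken from db_access. Only the file naming pattern and the datalibrary/sqlite_dbs location come from this README.

```python
import sqlite3
from pathlib import Path


def db_path_for(transcript_id, pam, target_len, db_dir="datalibrary/sqlite_dbs"):
    """Build the DB path from the parameter combination."""
    # The trailing underscore before ".db" follows the naming pattern above.
    return Path(db_dir) / f"db_{transcript_id}_{pam}_{target_len}_.db"


def open_or_create(transcript_id, pam, target_len, db_dir="datalibrary/sqlite_dbs"):
    """Return (connection, reused); `reused` is True if the DB already existed."""
    path = db_path_for(transcript_id, pam, target_len, db_dir)
    reused = path.exists()
    path.parent.mkdir(parents=True, exist_ok=True)
    conn = sqlite3.connect(str(path))
    if not reused:
        # Hypothetical schema, chosen only for illustration.
        conn.execute(
            "CREATE TABLE IF NOT EXISTS targets "
            "(strand TEXT, start INTEGER, target TEXT, pam TEXT)"
        )
        conn.commit()
    return conn, reused
```

A caller would run the scan and insert rows only when `reused` is False, and otherwise query the existing file directly.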
GeneSeq/
|--- datalibrary/
|------|--- rawsequences
|------|--- sqlite_dbs
|--- db_access/
|--- ensembl_fetch/
|--- raw_seq_access/
|--- seq_scan/
|--- grna_map.py
|--- README.md
rawsequences -> local store of gene sequences, saved as {GENE SYMBOL}_{ENSEMBL_TRANSCRIPT_ID}_seq.txt files.
sqlite_dbs -> one SQLite DB per parameter combination, named db_{ENSEMBL_TRANSCRIPT_ID}_{PAM}_{TARGET_LEN}_.db
db_access -> checks whether a DB exists and, if not, provides functions to create and access it.
ensembl_fetch -> fetches various genetic data from Ensembl.
raw_seq_access -> functions for direct access to the local datalibrary/rawsequences, bypassing the DB.
seq_scan -> scans the gene sequence for matches based on GENE, PAM and TARGET LENGTH.
grna_map -> main function
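As an illustration of how ensembl_fetch and the rawsequences cache might fit together, here is a minimal sketch. The function names `fetch_cdna` and `get_sequence` are hypothetical and not taken from the repository. The URL uses the Ensembl REST `sequence/id` endpoint with `type=cdna`, and the cache file name follows the pattern described above.

```python
import urllib.request
from pathlib import Path

ENSEMBL_REST = "https://rest.ensembl.org"


def fetch_cdna(transcript_id):
    """Fetch a transcript's cDNA sequence as plain text from Ensembl REST."""
    url = f"{ENSEMBL_REST}/sequence/id/{transcript_id}?type=cdna"
    request = urllib.request.Request(url, headers={"Content-Type": "text/plain"})
    with urllib.request.urlopen(request, timeout=30) as response:
        return response.read().decode().strip()


def get_sequence(gene_symbol, transcript_id,
                 cache_dir="datalibrary/rawsequences", fetch=fetch_cdna):
    """Return the sequence from the local cache, fetching it only on a miss."""
    path = Path(cache_dir) / f"{gene_symbol}_{transcript_id}_seq.txt"
    if path.exists():
        return path.read_text().strip()
    sequence = fetch(transcript_id)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(sequence)
    return sequence
```

Passing the fetcher as a parameter is a design choice made for this sketch: it lets the caching logic be exercised offline with a stub instead of a live API call.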