kpj/gene_map

gene_map

Tool for converting between various gene ids.

Installation

$ pip install gene_map

Usage

$ gene_map --help
Usage: gene_map [OPTIONS]

  Map gene ids between various formats.

Options:
  -i, --input TEXT                If it exists, treated as file with
                                  whitespace-separated gene ids. Otherwise
                                  treated as a gene id itself.  [required]
  --from TEXT                     Source ID type.  [required]
  --to TEXT                       Target ID type.  [required]
  -o, --output FILENAME           CSV-file to save result to.
  --organism [ARATH_3702|CAEEL_6239|CHICK_9031|DANRE_7955|DICDI_44689|DROME_7227|ECOLI_83333|HUMAN_9606|MOUSE_10090|RAT_10116|SCHPO_284812|YEAST_559292]
                                  Organism to convert IDs in.
  --cache-dir DIRECTORY           Folder to store ID-databases in.
  -q, --quiet                     Suppress logging of mapping-statistics.
  --force-download                Force download of mapping-database.
  --help                          Show this message and exit.

Getting started

Command-line usage

Each -i argument can be either a gene id or a file containing whitespace-separated gene ids, and the option can be given multiple times:

$ cat mygenes.txt
P63244 P08246
P68871
$ gene_map \
    -i P35222 -i InvalidID -i mygenes.txt -i P04637 \
    --from ACC --to Gene_Name \
    -o gene_mapping.csv
Mapped 5/6 genes.
$ cat gene_mapping.csv
ID_from,ID_to
P04637,TP53
P08246,ELANE
P35222,CTNNB1
P63244,RACK1
P68871,HBB
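
The CLI reports only how many genes were mapped, not which ones failed. As a minimal sketch, the unmapped inputs can be recovered afterwards by comparing the requested ids with the rows of the output file. The file contents below are copied from the example above and embedded as strings so the snippet runs on its own; the variable names are illustrative and not part of gene_map. The sketch assumes one target per source id, which holds for this example but is not guaranteed in general.

```python
import csv
import io

# Contents of mygenes.txt and gene_mapping.csv, copied from the example above.
genes_file_text = "P63244 P08246\nP68871\n"
mapping_csv_text = """ID_from,ID_to
P04637,TP53
P08246,ELANE
P35222,CTNNB1
P63244,RACK1
P68871,HBB
"""

# Ids passed directly via -i, plus the ids read from the file.
# str.split() without arguments splits on any run of whitespace, newlines included.
cli_ids = ["P35222", "InvalidID", "P04637"]
requested_ids = cli_ids + genes_file_text.split()

# Build a source -> target lookup from the CSV rows.
reader = csv.DictReader(io.StringIO(mapping_csv_text))
mapping = {row["ID_from"]: row["ID_to"] for row in reader}

# Any requested id without a row in the output was not mapped.
unmapped_ids = [gene_id for gene_id in requested_ids if gene_id not in mapping]

print(f"Mapped {len(mapping)}/{len(requested_ids)} genes.")
print("Unmapped:", unmapped_ids)
```

For the example above this prints the same 5/6 summary as the CLI and lists InvalidID as the only unmapped input.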

If the ID type of the inputs is unknown, pass --from auto to try converting every input regardless of its type:

$ gene_map \
    -i P35222 \
    -i TP53 \
    -i '9606.ENSP00000306407' \
    --from auto \
    --to GeneID
Mapped 3/3 genes.
ID_from,ID_to
9606.ENSP00000306407,79007
P35222,1499
TP53,7157

Attention: with --from auto, an ID that is valid for more than one ID type may be mapped in an unintended way. Furthermore, all IDs are treated as strings.
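
Because IDs are strings, numeric-looking identifiers such as GeneIDs need care when they are compared with integers from other sources. The following sketch reads the CSV output shown above (embedded as a string so it runs on its own) and normalises integer IDs to strings before comparing. The list entrez_ids_from_elsewhere is a hypothetical input made up for illustration.

```python
import csv
import io

# Output of the --from auto example above.
result_csv_text = """ID_from,ID_to
9606.ENSP00000306407,79007
P35222,1499
TP53,7157
"""

rows = list(csv.DictReader(io.StringIO(result_csv_text)))
gene_ids = {row["ID_from"]: row["ID_to"] for row in rows}

# The csv module yields strings, so a GeneID equals "7157" but not 7157.
print(gene_ids["TP53"] == "7157")
print(gene_ids["TP53"] == 7157)

# Hypothetical integer IDs from another source: convert to str before comparing.
entrez_ids_from_elsewhere = [7157, 1499, 42]
known_ids = set(gene_ids.values())
matches = [str(i) for i in entrez_ids_from_elsewhere if str(i) in known_ids]
print(matches)
```

Converting at the boundary, rather than casting the mapped IDs to integers, also keeps non-numeric ID types such as STRING or Gene_Name intact.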

API usage

>>> from gene_map import GeneMapper

>>> stringdb_ids = ['9606.ENSP00000306407', '9606.ENSP00000337461']
>>> gm = GeneMapper()  # defaults to HUMAN_9606
>>> gm.query(stringdb_ids, source_id_type='STRING', target_id_type='GeneID')
#                ID_from  ID_to
#0  9606.ENSP00000306407  79007
#1  9606.ENSP00000337461  90529
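
For downstream lookups it can be convenient to turn the two-column result into a dictionary. The helper below is a sketch and not part of gene_map: group_targets is a hypothetical name, and it works on plain (ID_from, ID_to) pairs so that it runs without downloading a mapping database. Assuming the query result is a pandas DataFrame, as the printed output suggests, such pairs could be obtained with zip(df["ID_from"], df["ID_to"]). Targets are collected in lists on the assumption that one source ID may map to several targets.

```python
from collections import defaultdict


def group_targets(pairs):
    """Collect all target IDs per source ID, keeping first-seen order."""
    grouped = defaultdict(list)
    for source_id, target_id in pairs:
        # Skip repeated (source, target) rows.
        if target_id not in grouped[source_id]:
            grouped[source_id].append(target_id)
    return dict(grouped)


# Rows copied from the printed result above, written as strings.
pairs = [
    ("9606.ENSP00000306407", "79007"),
    ("9606.ENSP00000337461", "90529"),
]

lookup = group_targets(pairs)
print(lookup["9606.ENSP00000306407"])
```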
