GitHub - rnasys/TranslationAI

TranslationAI: An advanced deep learning framework for precise identification of translation initiation and termination sites within mature mRNA sequences

This package predicts translation initiation and termination sites with a given mature mRNA sequence. As an application, it can predict effect of genetic variants or RNA editing on translation product.

License

TranslationAI source code is provided under the GPLv3 license. The trained models used by TranslationAI (located in this package at translationai/models) are provided under the CC BY NC 4.0 license for academic and non-commercial use.

Installation

TranslationAI can be installed from the github repository:

git clone https://github.com/rnasys/TranslationAI.git
cd TranslationAI
python setup.py install

TranslationAI requires tensorflow>=1.2.0, which is best installed separately via pip or conda (see the TensorFlow website for other installation options):

pip install tensorflow
# or
conda install tensorflow

Usage

TranslationAI can be run from the command line (e.g.):

translationai -I translationai/examples/query_seq.fa -t 0.5,0.5

Required parameters:

-I: Input .fa with query sequence(s).
-t: Threshold k values for TIS/TTS prediction. If k>1, output the top-k TISs and TTSs in each sequence; if k<1, output the score>=k TISs and TTSs. e.g. 0.5,0.5

Examples

A sample input file and the corresponding output file can be found at examples/query_seq.fa and examples/query_seq_predTIS_0.1.txt & examples/query_seq_predTTS_0.1.txt & examples/query_seq_predORFs_0.1_0.1.txt respectively.

The header line format of input fasta file is: >chrN:int-int(+/-)(annotation)(int, int,). Example: >chr3:28283123-28361264(+)(CMC1)(241, 568,)

In the file for predicted TISs (e.g. examples/query_seq_predTIS_0.1.txt), each line represents a sequence and contains the following information: the sequence identifier, the predicted TIS position, the corresponding score. The TIS position and score are separated by a comma character. If multiple TISs were predicted for a sequence, they are ranked by the predicted scores and separated by a tab character. The format of each line is as follows: sequence_identifier TIS_position_1,TIS_score_1 TIS_position_2,TIS_score_2 ...

The file for predicted TTSs (e.g. examples/query_seq_predTTS_0.1.txt) is in the same format as the file for predicted TISs.

In the file for predicted ORFs (e.g. examples/query_seq_predORFs_0.1_0.1.txt), each line represents a sequence and contains the following information: the sequence identifier, the predicted TIS position, the predicted TTS position, the corresponding TIS score, the corresponding TTS score. The positions and scores are separated by a comma character. If multiple ORFs were predicted for a sequence, they are separated by a tab character. The format of each line is as follows: sequence_identifier TIS_position_1,TTS_position_1,TIS_score_1,TTS_score_1 TIS_position_2,TTS_position_2,TIS_score_2,TTS_score_2 ...

Contact

Xiaojuan Fan: xiaojuanfan05@gmail.com

Name		Name	Last commit message	Last commit date
Latest commit History 9 Commits
translationai		translationai
LICENSE		LICENSE
README.md		README.md
setup.py		setup.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

TranslationAI: An advanced deep learning framework for precise identification of translation initiation and termination sites within mature mRNA sequences

License

Installation

Usage

Examples

Contact

About

Uh oh!

Releases

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

TranslationAI: An advanced deep learning framework for precise identification of translation initiation and termination sites within mature mRNA sequences

License

Installation

Usage

Examples

Contact

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages