About the Splicing Predictor

An interpretable deep learning model for predicting RNA alternative splicing outcomes

What is PSI?

PSI (Percent Spliced In) is the fraction of mature mRNA transcripts that include a given exon after alternative splicing. Despite the name, it is reported here as a value from 0 to 1:

  • PSI = 1.0: Exon always included
  • PSI = 0.5: 50/50 inclusion
  • PSI = 0.0: Exon always skipped
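
As a rough illustration of the definition above, PSI is commonly estimated from sequencing data as the share of reads that support exon inclusion. The function and parameter names below are hypothetical, and the sketch ignores refinements such as read-length normalization; it is not a description of how this tool's training labels were computed.

```python
def percent_spliced_in(inclusion_reads: int, skipping_reads: int) -> float:
    """Fraction of reads supporting exon inclusion, from 0.0 to 1.0."""
    if inclusion_reads < 0 or skipping_reads < 0:
        raise ValueError("read counts must be non-negative")
    total = inclusion_reads + skipping_reads
    if total == 0:
        # With no informative reads there is nothing to estimate.
        raise ValueError("PSI is undefined when there are no reads")
    return inclusion_reads / total


print(percent_spliced_in(30, 10))  # 0.75
```

An exon seen in 30 of 40 informative reads would therefore have a PSI of 0.75, meaning it is included more often than skipped.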

Alternative splicing is a key regulatory mechanism that allows a single gene to produce multiple protein variants. Understanding and predicting splicing outcomes is crucial for studying gene regulation and disease mechanisms.

How It Works

1. Input Your Sequence: Enter a 70-nucleotide exon sequence (A, C, G, T only)
2. Add Flanking Sequences: The model adds 10 nucleotides on each side from the original experimental context
3. Predict RNA Structure: ViennaRNA predicts the secondary structure and identifies wobble base pairs
4. Neural Network Prediction: The deep learning model predicts PSI based on sequence, structure, and wobble features
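
The preprocessing steps above can be sketched roughly as follows. This is a minimal sketch under several assumptions: the function names are hypothetical, the flanking sequences are passed in by the caller because the actual flanks are not listed here, and the per-position feature layout (one-hot base, paired flag, wobble flag) is an illustrative guess rather than the model's real input encoding. The dot-bracket structure is taken as an argument; in practice it would come from ViennaRNA, for example via `RNA.fold` in its Python bindings, which is not called here so the example stays self-contained.

```python
EXON_LENGTH = 70
FLANK_LENGTH = 10
WOBBLE_PAIRS = {("G", "U"), ("U", "G")}


def add_flanks(exon: str, upstream: str, downstream: str) -> str:
    """Step 2: surround the exon with caller-supplied flanks."""
    if len(exon) != EXON_LENGTH:
        raise ValueError(f"exon must be {EXON_LENGTH} nt")
    if len(upstream) != FLANK_LENGTH or len(downstream) != FLANK_LENGTH:
        raise ValueError(f"each flank must be {FLANK_LENGTH} nt")
    return upstream + exon + downstream


def base_pairs(structure: str) -> list:
    """Parse a dot-bracket string into (i, j) index pairs."""
    stack, pairs = [], []
    for j, symbol in enumerate(structure):
        if symbol == "(":
            stack.append(j)
        elif symbol == ")":
            if not stack:
                raise ValueError("unbalanced structure")
            pairs.append((stack.pop(), j))
    if stack:
        raise ValueError("unbalanced structure")
    return pairs


def wobble_flags(sequence: str, structure: str) -> list:
    """Step 3: mark positions that take part in a G-U wobble pair."""
    rna = sequence.upper().replace("T", "U")
    if len(rna) != len(structure):
        raise ValueError("sequence and structure lengths differ")
    flags = [0] * len(rna)
    for i, j in base_pairs(structure):
        if (rna[i], rna[j]) in WOBBLE_PAIRS:
            flags[i] = flags[j] = 1
    return flags


def encode(sequence: str, structure: str) -> list:
    """Step 4 input (assumed layout): [A, C, G, T, paired, wobble] per base."""
    wobble = wobble_flags(sequence, structure)
    rows = []
    for base, symbol, w in zip(sequence.upper(), structure, wobble):
        one_hot = [int(base == b) for b in "ACGT"]
        rows.append(one_hot + [int(symbol != "."), w])
    return rows


# Toy 9-nt hairpin, not a real 90-nt model input.
print(wobble_flags("GGCAAAGTT", "(((...)))"))
```

In the toy hairpin, the two outer G-U pairs are flagged as wobble while the inner C-G pair is not. A real input would be the 90-nucleotide string returned by `add_flanks`, paired with its predicted structure.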

Who Should Use This Tool?

  • Researchers: Studying alternative splicing mechanisms and regulation
  • Synthetic Biologists: Designing synthetic exons with specific splicing behavior
  • Clinicians: Investigating potential splicing effects of genetic variants
  • Educators & Students: Learning about splicing regulation and computational biology

Limitations

  • Fixed sequence length: Only accepts exactly 70-nucleotide exon sequences
  • Training data: Model was trained on HeLa cell data from the ES7 library
  • Cell type specificity: Predictions may not generalize to all cell types or tissues
  • Cis-regulatory only: Does not consider trans-acting factors or cellular context
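
Because of the fixed-length limitation, it can help to check a sequence before submitting it. The following is a minimal sketch of such a check; the function name is hypothetical, and the choice to strip whitespace and accept lowercase input is an assumption, not documented behavior of the tool.

```python
import re

EXON_LENGTH = 70
_DNA = re.compile(r"[ACGT]+")


def validate_exon(raw: str) -> str:
    """Return a cleaned, uppercase 70-nt exon or raise ValueError."""
    # Assumption: whitespace and letter case are not significant.
    seq = "".join(raw.split()).upper()
    if len(seq) != EXON_LENGTH:
        raise ValueError(f"expected {EXON_LENGTH} nucleotides, got {len(seq)}")
    if not _DNA.fullmatch(seq):
        bad = sorted(set(seq) - set("ACGT"))
        raise ValueError(f"invalid characters: {', '.join(bad)}")
    return seq


print(len(validate_exon("acgt" * 17 + "ac")))  # 70
```

RNA-style input containing U would be rejected by this check, since the tool expects A, C, G, and T only.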

Model Performance

Metric          Value       Description
Test R²         ~0.85       Variance explained on held-out test set
Correlation     ~0.92       Pearson correlation with experimental PSI
Test RMSE       ~0.12       Root mean squared error on test set
Training Data   ~150,000    Synthetic exon sequences from ES7_HeLa
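
For readers who want to compute the same three metrics on their own measurements, here is a minimal sketch using the standard definitions of R², Pearson correlation, and RMSE. The function name is hypothetical and the numbers in the usage line are made-up toy values, not results from this model.

```python
import math


def regression_metrics(measured: list, predicted: list) -> dict:
    """Compute R², Pearson correlation, and RMSE for paired PSI values."""
    n = len(measured)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 values")
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    ss_res = sum((m - p) ** 2 for m, p in zip(measured, predicted))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    if ss_tot == 0 or ss_pred == 0:
        raise ValueError("metrics are undefined for constant inputs")
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    return {
        "r2": 1 - ss_res / ss_tot,               # variance explained
        "pearson": cov / math.sqrt(ss_tot * ss_pred),
        "rmse": math.sqrt(ss_res / n),           # same units as PSI
    }


# Toy values for illustration only.
print(regression_metrics([0.1, 0.5, 0.9], [0.2, 0.4, 0.8]))
```

Since PSI lies between 0 and 1, an RMSE of about 0.12 corresponds to a typical error of roughly 12 percentage points of exon inclusion.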