-
These genes were used to generate the splice data set and to perform the comparison with genscan.
Each file contains one gene string per line, followed by two lines of positions:
gene_start intron_end+1 intron_end+1
intron_start+1 intron_start+1 gene_end+2
That is, gene_start is on atg, intron_start on gt, intron_end on agx, and gene_end on tagxx.
So the data looks like this:
tccgaatatcaatgtga...
571 738 1287 2018
683 939 1449 2144
tccgaatatcaatgtg...
571 695 868
648 818 1031
...
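As a minimal sketch of reading this layout, the Python function below groups a file into triples (one sequence line and two position lines) and pairs the two position lines column by column. The name `parse_gene_records` and the shortened sequence in the usage lines are hypothetical. Treating each column pair as the start and end of one segment is an assumption drawn from the example above; the files themselves do not state it, and the positions are returned unchanged without any index conversion.

```python
def parse_gene_records(lines):
    """Yield (sequence, [(start, end), ...]) for each three-line record."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    if len(rows) % 3 != 0:
        raise ValueError("expected records of three lines each")
    for i in range(0, len(rows), 3):
        sequence = rows[i]
        starts = [int(tok) for tok in rows[i + 1].split()]
        ends = [int(tok) for tok in rows[i + 2].split()]
        if len(starts) != len(ends):
            raise ValueError("position lines differ in length")
        # Assumption: column k of both lines describes the same segment.
        yield sequence, list(zip(starts, ends))


# Usage with a shortened, hypothetical sequence and the positions shown above.
example = ["tccgaatatcaatgtga", "571 738 1287 2018", "683 939 1449 2144"]
for sequence, segments in parse_gene_records(example):
    print(len(segments), segments[0])
```

Passing an open file object instead of a list works the same way, since the function only iterates over lines.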
Download:
-
The data looks like this:
-1 TTCTGAAGAAGACGATGACGAAGACGAAGGAGAAGCCGTTGCAGAACTTGTCACAAAGTG
-1 CCAACCTAATCGTTATACATATGTATTTACAGTCGCAAATGACAATTGAACAAATAAATG
....
+1 AATGTTTCAATTATAAAAATTGTTAATTACAGGGGGACACCTGTATCAGTGTGACATTTC
....
Here -1 means no splice site and +1 means splice site. The sequence follows the label after a space.
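A short Python sketch of reading this label-and-sequence layout follows. The function names `parse_splice_line` and `read_splice_data` are hypothetical, and the code assumes every data line holds exactly one label and one sequence.

```python
def parse_splice_line(line):
    """Split one data line into (label, sequence)."""
    label_text, sequence = line.split(None, 1)
    label = int(label_text)  # int() accepts both "-1" and "+1"
    if label not in (-1, 1):
        raise ValueError(f"unexpected label: {label_text!r}")
    return label, sequence.strip()


def read_splice_data(lines):
    """Return parallel lists of labels and sequences, skipping blank lines."""
    labels, sequences = [], []
    for line in lines:
        if line.strip():
            label, sequence = parse_splice_line(line)
            labels.append(label)
            sequences.append(sequence)
    return labels, sequences


# Usage with two lines taken from the example above.
labels, sequences = read_splice_data([
    "-1 TTCTGAAGAAGACGATGACGAAGACGAAGGAGAAGCCGTTGCAGAACTTGTCACAAAGTG",
    "+1 AATGTTTCAATTATAAAAATTGTTAATTACAGGGGGACACCTGTATCAGTGTGACATTTC",
])
print(labels, len(sequences[0]))
```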
Download:
-
(selected for largest validation ROC)
All result files named *.{tst|dat} contain a
line stating the validation or test error,
followed by the classifier outputs:
validation error = 0.014181
-12.143139
-10.286769
...
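A result file in this layout could be read as sketched below in Python. The function name `parse_result_file` is hypothetical, and the sketch assumes the first non-empty line always has the form `<name> = <value>`, as in the single example shown here; the exact wording of the test-error line is not shown in this document.

```python
def parse_result_file(lines):
    """Return (error_name, error_value, outputs) from a result file."""
    rows = [ln.strip() for ln in lines if ln.strip()]
    if not rows or "=" not in rows[0]:
        raise ValueError("first line should look like '<name> = <value>'")
    name, _, value = rows[0].partition("=")
    # Every remaining line is taken to be one classifier output.
    outputs = [float(row) for row in rows[1:]]
    return name.strip(), float(value), outputs


# Usage with the values from the example above.
name, error, outputs = parse_result_file([
    "validation error = 0.014181",
    "-12.143139",
    "-10.286769",
])
print(name, error, len(outputs))
```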
Trained SVMs are saved in the following
format:
b=-3.577909
alphas=[
2 -1.000000
13 +0.373805
57 +1.000000
68 -0.332549
85 -1.000000
...
]
Here b is the bias term and alphas contains pairs of index and value, where
index is the index of a support vector (an example with a nonzero multiplier)
and value is the product of the Lagrange multiplier and the label of that
support vector.
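The following Python sketch reads a model in this format and evaluates the usual SVM decision value, that is, b plus the sum over support vectors of value times kernel(support vector, x). The names `parse_svm_model` and `decision_value` are hypothetical. Whether the stored indices count from 0 or from 1 is not stated here, so the sketch simply uses each index as a key into whatever container of training examples the caller supplies. The matching-positions kernel in the usage lines is a toy stand-in, not one of the kernels evaluated on this page.

```python
def parse_svm_model(lines):
    """Return (bias, {index: value}) from a saved model."""
    bias = None
    alphas = {}
    in_alphas = False
    for raw in lines:
        line = raw.strip()
        if line.startswith("b="):
            bias = float(line[2:])
        elif line.startswith("alphas=["):
            in_alphas = True
        elif line == "]":
            in_alphas = False
        elif in_alphas and line:
            index, value = line.split()
            alphas[int(index)] = float(value)  # float() accepts "+1.000000"
    if bias is None:
        raise ValueError("no 'b=' line found")
    return bias, alphas


def decision_value(x, bias, alphas, train_examples, kernel):
    """Compute bias + sum_i value_i * kernel(train_examples[i], x)."""
    return bias + sum(v * kernel(train_examples[i], x) for i, v in alphas.items())


# Usage: toy kernel counting matching positions (an illustration only).
bias, alphas = parse_svm_model(["b=-3.577909", "alphas=[", "2 -1.000000", "13 +0.373805", "]"])
toy_train = {2: "AC", 13: "GT"}
toy_kernel = lambda a, b: float(sum(p == q for p, q in zip(a, b)))
print(decision_value("AT", bias, alphas, toy_train, toy_kernel))
```

Keeping the coefficients in a dictionary keyed by index mirrors the sparse file format: only support vectors contribute to the sum.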
Results:
-
Positional Weight Matrixes
         | pseudo_p | pseudo_n | order | RSE   | Err  |
acceptor | 1        | 1        | 2     | 98.88 | 1.54 |
donor    | 10       | 1e-4     | 2     | 98.23 | 1.85 |
Download result files:
-
Weighted Degree Kernel
         | C | degree | RSE   | Err  |
acceptor | 1 | 4      | 99.06 | 1.42 |
donor    | 1 | 3      | 98.47 | 1.78 |
Download result files:
-
Locality Improved Kernel
         | C    | degree | width | RSE   | Err  |
acceptor | 0.75 | 4      | 15    | 99.08 | 1.44 |
donor    | 1    | 3      | 10    | 98.48 | 1.80 |
Download result files:
-
TOP-Linear Kernel
         | C   | degree | RSE   | Err  |
acceptor | 0.5 | 3      | 98.88 | 1.52 |
donor    | 0.5 | 2      | 98.35 | 1.82 |
Download result files:
SVM-Pairwise with 500 reference examples (trained on 20k), evaluated only on the first 10k test examples
         | C  | gapcost | RSE   | Err  |
acceptor | 5  | 0.5     | 98.01 | 1.93 |
donor    | 50 | 0.5     | 97.60 | 2.03 |
Download result files:
-
Polynomial Kernel
         | C | degree | RSE   | Err  |
acceptor | 2 | 6      | 98.94 | 1.80 |
donor    | 2 | 5      | 98.31 | 2.08 |
Download result files:
-
Positional Weight Matrixes
sigmoid_a | 0.45  |
sigmoid_b | -0.9  |
alpha     | -3.75 |
used model parameters (may differ from above)
         | order | pseudo_p | pseudo_n |
acceptor | 3     | 1        | 1e-6     |
donor    | 3     | 10       | 100      |
-
Weighted Degree Kernel
sigmoid_a | 0.75    |
sigmoid_b | -0.9375 |
alpha     | 1.7     |
used model parameters (may differ from above)
         | C | degree |
acceptor | 2 | 3      |
donor    | 1 | 3      |
-
Locality Improved Kernel
sigmoid_a | 0.75  |
sigmoid_b | -0.75 |
alpha     | 1.0   |
used model parameters (may differ from above!)
         | degree | width | C |
acceptor | 4      | 15    | 2 |
donor    | 3      | 10    | 5 |
-
Download Implementation
wd_kernel.cpp
Please note that the Shogun toolbox contains an easy-to-use version of that kernel.
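For orientation, here is a small Python sketch of the weighted degree kernel as it is commonly defined in the literature: for each substring length d from 1 to the chosen degree D, count the positions where both sequences share the same length-d substring, and weight that count by beta_d = 2(D - d + 1) / (D(D + 1)). This is an illustration under that assumed definition, not a transcription of wd_kernel.cpp or of the Shogun implementation, and the function name `wd_kernel` is hypothetical.

```python
def wd_kernel(x, y, degree):
    """Weighted degree kernel between two equal-length sequences."""
    if len(x) != len(y):
        raise ValueError("sequences must have the same length")
    total = 0.0
    for d in range(1, degree + 1):
        # Shorter substrings receive larger weights.
        beta = 2.0 * (degree - d + 1) / (degree * (degree + 1))
        # Count positions where the length-d substrings agree.
        matches = sum(
            1 for i in range(len(x) - d + 1) if x[i:i + d] == y[i:i + d]
        )
        total += beta * matches
    return total


# Usage: compare two short, made-up sequences with degree 2.
print(wd_kernel("ACGT", "ACGA", 2))
```

This direct version costs on the order of degree times sequence length per kernel evaluation, which is fine for illustration but slower than an optimized implementation.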