faster than the 'Complete mode'. For some sequence types it completely fails,
e.g. if there are repetitive areas containing many 'AAAAA'

Relative and absolute scores will be approx. 1/4 (compared with complete mode)
absolute: returns the absolute number of hits
relative: returns the number of hits relative to the maximum possible hits

The score depends on several other parameters, such as the number of allowed
mismatches, search mode, oligo length and complement settings.
The maximum absolute score is

(length of the relative's sequence) minus (oligo length)
Practically this score is rarely reached, because several possible oligos are
skipped:

- all oligos starting with 2 identical nucleotides
- all oligos containing IUPAC codes (or N's)
If mismatches are used, each oligo may hit at several positions. Thus the
maximum relative score may exceed 100% (and the maximum absolute score may
exceed its theoretical maximum).

If you use 'Quick mode' the mean relative score will be approx. 25% (assuming
that 25% of the possible oligos start with an 'A').
That means.. only if you use
- your sequences contain no IUPACs and no repetitions,
you will get a score of 100% (or sequenceLength-oligoLength) for
the selected species itself and its duplicates.
relative: returns the number of hits relative to some maximum (see score-scaling)

Absolute hits are the number of oligos which occur in the source sequence
and in the targeted sequences (i.e. in the relatives of the source sequence).

If an oligo occurs multiple times in source or target sequence, it only
creates the minimum number of hits (e.g. if it occurs twice in source and
three times in a target, only two hits will be counted for that target).
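
The counting rule above can be sketched as follows. This is a minimal illustration, not the PT-server's implementation: the function names `oligos` and `absolute_hits` are hypothetical, sequences are assumed to be plain ungapped strings, and mismatches, IUPAC handling and strand options are left out.

```python
from collections import Counter

def oligos(seq, oligo_len):
    """Return all overlapping oligos of length oligo_len found in seq."""
    return [seq[i:i + oligo_len] for i in range(len(seq) - oligo_len + 1)]

def absolute_hits(source, target, oligo_len):
    """Count hits: each distinct oligo contributes the smaller of its
    occurrence counts in source and target."""
    source_counts = Counter(oligos(source, oligo_len))
    target_counts = Counter(oligos(target, oligo_len))
    # Counter returns 0 for oligos that are absent from the target.
    return sum(min(n, target_counts[o]) for o, n in source_counts.items())

# 'ACG' occurs twice in the source and three times in the target,
# so it contributes two hits; 'CGA' and 'GAC' contribute one hit each.
hits = absolute_hits("ACGACG", "ACGACGACG", 3)
```

Here `hits` is 4: two for 'ACG' plus one each for 'CGA' and 'GAC'.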
The theoretical maximum for absolute hits is

maxhits = minimumBasecount(source, target) - oligolen + 1

In practice that value is rarely or never reached because several oligos
are skipped, namely all oligos containing IUPAC codes, N's or dots.
Likewise, the PT-server will not report matches hitting ambiguous positions.
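
As a rough sketch of the formula and the skipping rule above: the helper names `max_hits` and `usable_oligos` are hypothetical, sequences are assumed to be ungapped strings (so the string length equals the base count), and any character other than A, C, G, T or U is treated as ambiguous.

```python
UNAMBIGUOUS = frozenset("ACGTU")  # assumption: only plain nucleotide letters count

def max_hits(source, target, oligo_len):
    """Theoretical maximum: minimumBasecount(source, target) - oligolen + 1."""
    min_base_count = min(len(source), len(target))
    # Clamping at 0 for sequences shorter than the oligo is an assumption.
    return max(min_base_count - oligo_len + 1, 0)

def usable_oligos(seq, oligo_len):
    """Return the oligos of seq that contain no IUPAC codes, N's or dots."""
    windows = (seq[i:i + oligo_len] for i in range(len(seq) - oligo_len + 1))
    return [w for w in windows if UNAMBIGUOUS.issuperset(w)]

# A single 'N' removes every oligo that overlaps it.
kept = usable_oligos("ACNGTA", 2)
```

In this example only 'AC', 'GT' and 'TA' remain out of five possible oligos, which is why the theoretical maximum is rarely reached on real data.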
The number of absolute hits is also affected by other parameters:

- quick search produces only around 25% of the hits of a complete
search (assuming that 25% of all oligos start with an 'A')
- searching for complement or reverse will double the number of possible
hits. Searching for all 4 reverse/complement-combinations will produce
4 times as many hits as a plain forward search.
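
To make the two effects above concrete, here is a small sketch. It assumes DNA letters only and assumes that quick search considers only oligos starting with 'A', as the 25% estimate suggests; `strand_variants` and `quick_fraction` are hypothetical names, not ARB functions.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")  # assumption: DNA alphabet only

def strand_variants(oligo):
    """Return the four orientations in which an oligo can be searched."""
    complement = oligo.translate(COMPLEMENT)
    return {
        "forward": oligo,
        "reverse": oligo[::-1],
        "complement": complement,
        "reverse-complement": complement[::-1],
    }

def quick_fraction(seq, oligo_len):
    """Fraction of oligos in seq that start with 'A'."""
    all_oligos = [seq[i:i + oligo_len] for i in range(len(seq) - oligo_len + 1)]
    if not all_oligos:
        return 0.0
    return sum(o.startswith("A") for o in all_oligos) / len(all_oligos)

variants = strand_variants("AACG")
fraction = quick_fraction("ACGTACGTA", 2)
```

For the evenly mixed example sequence, `fraction` is 0.25; sequences with a skewed base composition will deviate from the 25% estimate accordingly.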
The relative score is absolute hits scaled versus a maximum POC (possible oligo count).
You can specify which maximum POC to use with the selection button next to
the score selection button:

to source POC     maximum possible oligos in source
to target POC     maximum possible oligos in target
to minimum POC    minimum possible oligos in source or target
to maximum POC    maximum possible oligos in source or target

'to source POC' will report ~100% score for partial source versus
all full sequences containing the part.

'to target POC' will report ~100% score for all partial target sequences
which are contained in the source sequence.

'to minimum POC' will report ~100% score if source is part of target or vice versa
(this was the default method in previous ARB versions).

'to maximum POC' will report ~100% score if source and target contain each other, i.e.
if they have an identical oligo distribution. If either source or target is missing
some bases, the score will be lower.
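
The four scaling choices can be illustrated with a short sketch. It assumes that the POC of a sequence is simply its length minus the oligo length plus one (ignoring skipped oligos), and the names `possible_oligo_count` and `relative_score` are hypothetical.

```python
def possible_oligo_count(seq, oligo_len):
    """POC under the simplifying assumption that no oligo is skipped."""
    return max(len(seq) - oligo_len + 1, 0)

def relative_score(abs_hits, source, target, oligo_len, scaling="minimum"):
    """Scale absolute hits to a percentage of the selected POC."""
    source_poc = possible_oligo_count(source, oligo_len)
    target_poc = possible_oligo_count(target, oligo_len)
    poc = {
        "source": source_poc,
        "target": target_poc,
        "minimum": min(source_poc, target_poc),
        "maximum": max(source_poc, target_poc),
    }[scaling]
    return 100.0 * abs_hits / poc if poc else 0.0

# Partial source 'ACGTAC' is fully contained in the longer target;
# all 4 source oligos hit, while the target has 8 possible oligos.
source, target = "ACGTAC", "ACGTACGTAC"
scores = {s: relative_score(4, source, target, 3, s)
          for s in ("source", "target", "minimum", "maximum")}
```

With a partial source inside a longer target, scaling to source or minimum POC gives 100%, while scaling to target or maximum POC gives 50%, matching the descriptions above.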
When using 'quick search mode' the max. relative score will be 25% (if 25% of
the oligos start with 'A').

When searching for forward and reverse-complement, the theoretical max. relative
score will be 200%. In practice it won't find many hits on the reverse-complement
strand. So you'll get similar scores as without reverse-complement, but especially
if you lower the oligo size, you'll probably reach scores above 100%.

The EDIT4 aligner currently always uses 'to minimum POC'.
reverse: Match only reverse oligos
complement: Match only complement oligos
reverse-complement: Match only reverse-complement oligos
and all combinations of these.

The combinations may affect the score as well!
The remaining options are combinations of the above.

The combinations will affect the score, especially for shorter oligos.
Please read the section about 'Relative score' above to avoid confusion.

Note: Not available for EDIT4 aligner.
Restrict the alignment range in which oligos may match.
Hits outside that range will not be considered.

NOTES Special effort is taken to eliminate multi-matches. Past versions did not
do this, which resulted in relative scores far beyond 100%, especially for small oligo lengths.

Now e.g. an oligo occurring 3 times in the source sequence will give at most 3 absolute
hitpoints to any target sequence - even if it occurs there far more often.

WARNINGS Use mismatches with care!

BUGS Relative score is not scaled to the maximum possible hits in the target range.