2059
* Find positive solution to sum_{i=low}^{high} exp(i lambda) = 1.
2061
* @param probs probabilities of a score occurring
2062
* @param d the gcd of the possible scores. This equals 1 if the scores
2064
* @param low the lowest possible score
2065
* @param high the highest possible score
2066
* @param lambda0 an initial value for lambda
2067
* @param tolx the tolerance to which lambda must be computed
2068
* @param itmax the maximum number of times the function may be
2070
* @param maxNewton the maximum permissible number of Newton
2071
* iteration. After that the computation will proceed by bisection.
2072
* @param itn a pointer to an integer that will receive the actually
2073
* number of iterations performed.
2075
* Let phi(lambda) = sum_{i=low}^{high} exp(i lambda) - 1. Then phi(lambda)
2078
* phi(lamdba) = exp(u lambda) p( exp(-lambda) )
2080
* where p(x) is a polynomial that has exactly two zeros, one at x = 1
2081
* and one at y = exp(-lamdba). It is simpler, more numerically
2082
* efficient and stable to apply Newton's method to p(x) than to
2085
* We define a safeguarded Newton iteration as follows. Let the
2086
* initial interval of uncertainty be [0,1]. If p'(x) >= 0, we bisect
2087
* the interval. Otherwise we try a Newton step. If the Newton iterate
2088
* lies in the current interval of uncertainty and it reduces the
2089
* value of | p(x) | by at least 10%, we accept the new
2090
* point. Otherwise, we bisect the current interval of uncertainty.
2091
* It is clear that this method converges to a zero of p(x). Since
2092
* p'(x) > 0 in an interval containing x = 1, the method cannot
2093
* converge to x = 1 and therefore converges to the only other zero,
2059
* Find positive solution to
2061
* sum_{i=low}^{high} exp(i lambda) * probs[i] = 1.
2063
* Note that this solution does not exist unless the average score is
2064
* negative and the largest score that occurs with nonzero probability
2067
* @param probs probabilities of a score occurring
2068
* @param d the gcd of the possible scores. This equals 1 if
2069
* the scores are not a lattice
2070
* @param low the lowest possible score that occurs with
2071
* nonzero probability
2072
* @param high the highest possible score that occurs with
2073
* nonzero probability.
2074
* @param lambda0 an initial guess for lambda
2075
* @param tolx the tolerance to which lambda must be computed
2076
* @param itmax the maximum number of times the function may be
2078
* @param maxNewton the maximum permissible number of Newton
2079
* iterations; after that the computation will proceed
2081
* @param *itn the number of iterations needed to compute Lambda,
2082
* or itmax if Lambda could not be computed.
2084
* Let phi(lambda) = sum_{i=low}^{high} exp(i lambda) - 1. Then
2085
* phi(lambda) may be written
2087
* phi(lamdba) = exp(u lambda) f( exp(-lambda) )
2089
* where f(x) is a polynomial that has exactly two zeros, one at x = 1
2090
* and one at x = exp(-lamdba). It is simpler to solve this problem
2091
* in x = exp(-lambda) than it is to solve it in lambda, because we
2092
* know that for x, a solution lies in [0,1], and because Newton's
2093
* method is generally more stable and efficient for polynomials than
2094
* it is for exponentials.
2096
* For the most part, this function is a standard safeguarded Newton
2097
* iteration: define an interval of uncertainty [a,b] with f(a) > 0
2098
* and f(b) < 0 (except for the initial value b = 1, where f(b) = 0);
2099
* evaluate the function and use the sign of that value to shrink the
2100
* interval of uncertainty; compute a Newton step; and if the Newton
2101
* step suggests a point outside the interval of uncertainty or fails
2102
* to decrease the function sufficiently, then bisect. There are
2103
* three further details needed to understand the algorithm:
2105
* 1) If y the unique solution in [0,1], then f is positive to the left of
2106
* y, and negative to the right. Therefore, we may determine whether
2107
* the Newton step -f(x)/f'(x) is moving toward, or away from, y by
2108
* examining the sign of f'(x). If f'(x) >= 0, we bisect instead
2109
* of taking the Newton step.
2110
* 2) There is a neighborhood around x = 1 for which f'(x) >= 0, so
2111
* (1) prevents convergence to x = 1 (and for a similar reason
2112
* prevents convergence to x = 0, if the function is incorrectly
2113
* called with probs[high] == 0).
2114
* 3) Conditions like fabs(p) < lambda_tolerance * x * (1-x) are used in
2115
* convergence criteria because these values translate to a bound
2116
* on the relative error in lambda. This is proved in the
2117
* "Blast Scoring Parameters" document that accompanies the BLAST
2120
* The iteration on f(x) is robust and doesn't overflow; defining a
2121
* robust safeguarded Newton iteration on phi(lambda) that cannot
2122
* converge to lambda = 0 and that is protected against overflow is
2123
* more difficult. So (despite the length of this comment) the Newton
2124
* iteration on f(x) is the simpler solution.
2098
NlmKarlinLambdaNR( double* probs, Int4 d, Int4 low, Int4 high, double lambda0, double tolx,
2099
Int4 itmax, Int4 maxNewton, Int4 * itn )
2127
NlmKarlinLambdaNR(double* probs, Int4 d, Int4 low, Int4 high, double lambda0,
2128
double tolx, Int4 itmax, Int4 maxNewton, Int4 * itn )
2102
2131
double x0, x, a = 0, b = 1;
2369
2402
Int4 query_length;
2370
2403
Uint1 *buffer; /* holds sequence */
2371
2404
Blast_KarlinBlk* kbp;
2373
/* For each query, check if forward strand is present */
2374
if ((query_length = query_info->contexts[context].query_length) < 0)
2377
context_offset = query_info->contexts[context].query_offset;
2405
Int2 loop_status; /* status flag for functions in this loop. */
2407
if ( !contexts[context].is_valid )
2410
query_length = contexts[context].query_length;
2411
context_offset = contexts[context].query_offset;
2378
2412
buffer = &query[context_offset];
2380
2414
Blast_ResFreqString(sbp, rfp, (char*)buffer, query_length);
2381
2415
sbp->sfp[context] = Blast_ScoreFreqNew(sbp->loscore, sbp->hiscore);
2382
2416
BlastScoreFreqCalc(sbp, sbp->sfp[context], rfp, stdrfp);
2383
2417
sbp->kbp_std[context] = kbp = Blast_KarlinBlkNew();
2384
status = Blast_KarlinBlkUngappedCalc(kbp, sbp->sfp[context]);
2418
loop_status = Blast_KarlinBlkUngappedCalc(kbp, sbp->sfp[context]);
2420
contexts[context].is_valid = FALSE;
2421
sbp->sfp[context] = Blast_ScoreFreqFree(sbp->sfp[context]);
2422
sbp->kbp_std[context] = Blast_KarlinBlkFree(sbp->kbp_std[context]);
2423
if (!Blast_QueryIsTranslated(program) ) {
2424
Blast_MessageWrite(blast_message, eBlastSevWarning, context,
2425
"Could not calculate ungapped Karlin-Altschul parameters due "
2426
"to an invalid query sequence or its translation. Please verify the "
2427
"query sequence(s) and/or filtering options");
2388
2431
/* For searches with translated queries, check whether ideal values
2389
2432
should be substituted instead of calculated values, so a more