// -*- mode: C++; tab-width: 2; -*-
// --------------------------------------------------------------------------
// OpenMS Mass Spectrometry Framework
// --------------------------------------------------------------------------
// Copyright (C) 2003-2011 -- Oliver Kohlbacher, Knut Reinert
// This library is free software; you can redistribute it and/or
// modify it under the terms of the GNU Lesser General Public
// License as published by the Free Software Foundation; either
// version 2.1 of the License, or (at your option) any later version.
// This library is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
// Lesser General Public License for more details.
// You should have received a copy of the GNU Lesser General Public
// License along with this library; if not, write to the Free Software
// Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA
// --------------------------------------------------------------------------
// $Maintainer: Clemens Groepl $
// --------------------------------------------------------------------------
#ifndef OPENMS_ANALYSIS_MAPMATCHING_STABLEPAIRFINDER_H
#define OPENMS_ANALYSIS_MAPMATCHING_STABLEPAIRFINDER_H
#include <OpenMS/ANALYSIS/MAPMATCHING/BaseGroupFinder.h>
@brief This class implements a pair finding algorithm for consensus features.
It offers a method to determine pairs across two consensus maps. The corresponding consensus features must be aligned, but may have small position deviations.
The distance measure is implemented in class @ref FeatureDistance - see there for details.
<B> Additional criteria for pairing </B>
Depending on parameter @p use_identifications, peptide identifications annotated to the features may have to be compatible (i.e. no annotation or the same annotation) for a pairing to occur.
Stability criterion: The distance to the nearest neighbor must be smaller than the distance to the second-nearest neighbor by a certain factor, see parameter @p second_nearest_gap. This parameter interacts non-trivially with the maximum allowed difference (in RT or m/z) of the distance measure: if @p second_nearest_gap is greater than one, lowering @p max_difference may in fact lead to more pairings rather than fewer. Lowering it increases the distance difference between the nearest and the second-nearest neighbor, so the constraint imposed by @p second_nearest_gap may be fulfilled more often.
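The stability criterion can be illustrated with a minimal sketch. The helper @c isStablePair below is hypothetical and not part of OpenMS. It assumes that both neighbor distances have already been computed, and it uses the non-strict comparison given in the quality section of this documentation:

```cpp
#include <cassert>

// Hypothetical helper (not an OpenMS API): returns true if the
// second-nearest neighbor is at least `second_nearest_gap` times
// farther away than the nearest neighbor.
bool isStablePair(double nearest, double second_nearest, double second_nearest_gap)
{
  // Assumed preconditions: distances are non-negative and ordered.
  assert(nearest >= 0.0 && second_nearest >= nearest);
  return second_nearest_gap * nearest <= second_nearest;
}
```

For example, with a gap factor of 2, a nearest distance of 0.1 and a second-nearest distance of 0.3 pass the check, whereas a nearest distance of 0.2 with the same second-nearest distance does not.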
<B> Quality calculation </B>
The quality of a pairing is computed from the distance between the paired elements (nearest neighbors) and the distances to the second-nearest neighbors of both elements, according to the formula:
@f[
q_{i,j} = \big( 1 - d_{i,j} \big) \cdot
\big( 1 - \frac{g \cdot d_{i,j}}{d_{2,i}} \big) \cdot
\big( 1 - \frac{g \cdot d_{i,j}}{d_{2,j}} \big)
@f]
@f$ q_{i,j} @f$ is the quality of the pairing of elements @em i and @em j, @f$ d_{i,j} @f$ is the distance between the two, @f$ d_{2,i} @f$ and @f$ d_{2,j} @f$ are the distances to the second-nearest neighbors of @em i and @em j, respectively, and @em g is the factor defined by parameter @p second_nearest_gap.
Note that by the definition of the distance measure, @f$ 0 \leq d_{i,j} \leq 1 @f$ if @em i and @em j are to form a pair. The criteria for pairing further require that @f$ g \cdot d_{i,j} \leq d_{2,i} @f$ and @f$ g \cdot d_{i,j} \leq d_{2,j} @f$. This ensures that the resulting quality is always between one (best) and zero (worst).
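As an illustration, the pairing quality can be evaluated by a small free function. This is a sketch under the preconditions stated above, not the code used by this class. The name @c pairingQuality and the requirement of strictly positive second-nearest distances are assumptions made for the example:

```cpp
#include <cassert>

// Hypothetical free function (not an OpenMS API).
// d_ij: distance between the paired elements i and j
// d_2i: distance from i to its second-nearest neighbor
// d_2j: distance from j to its second-nearest neighbor
// g:    value of parameter second_nearest_gap
double pairingQuality(double d_ij, double d_2i, double d_2j, double g)
{
  // Preconditions from the text: 0 <= d_ij <= 1 and g * d_ij <= d_2i, d_2j.
  assert(d_ij >= 0.0 && d_ij <= 1.0);
  // Assumption of this sketch: avoid division by zero.
  assert(d_2i > 0.0 && d_2j > 0.0);
  assert(g * d_ij <= d_2i && g * d_ij <= d_2j);

  return (1.0 - d_ij) * (1.0 - g * d_ij / d_2i) * (1.0 - g * d_ij / d_2j);
}
```

With a pair distance of 0.25, both second-nearest distances equal to 1 and a gap factor of 2, the three factors are 0.75, 0.5 and 0.5, which gives a quality of 0.1875. A pair distance of zero always yields the best quality, one.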
For the final quality @em q of the consensus feature produced by merging two paired elements (@em i and @em j), the existing quality values of the two elements are taken into account. The final quality is a weighted average of the existing qualities (@f$ q_i @f$ and @f$ q_j @f$) and the quality of the pairing (@f$ q_{i,j} @f$, see above):
@f[
q = \frac{q_{i,j} + (s_i - 1) \cdot q_i + (s_j - 1) \cdot q_j}{s_i + s_j - 1}
@f]
The weighting factors @f$ s_i @f$ and @f$ s_j @f$ are the sizes (i.e. numbers of subelements) of the two consensus features @em i and @em j. That way, it is possible to link several feature maps to a growing consensus map in a stepwise fashion (as done by @ref FeatureGroupingAlgorithmUnlabeled), and in the end obtain quality values that incorporate the qualities of all pairings that occurred during the generation of a consensus feature. Note that "missing" elements (if a consensus feature does not contain sub-features from all input maps) are not penalized in this definition of quality.
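The weighted average can be sketched as follows. The function @c mergedQuality is hypothetical and not part of OpenMS. It assumes that both consensus features contain at least one subelement:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical free function (not an OpenMS API).
// q_ij:     quality of the pairing of i and j
// q_i, q_j: existing qualities of the two consensus features
// s_i, s_j: sizes (numbers of subelements) of the two consensus features
double mergedQuality(double q_ij, double q_i, std::size_t s_i, double q_j, std::size_t s_j)
{
  // Assumption of this sketch: sizes are at least one, so (s - 1) cannot underflow.
  assert(s_i >= 1 && s_j >= 1);

  const double numerator = q_ij
                           + static_cast<double>(s_i - 1) * q_i
                           + static_cast<double>(s_j - 1) * q_j;
  return numerator / static_cast<double>(s_i + s_j - 1);
}
```

For two single features (both sizes equal to 1) the existing qualities carry zero weight and the result equals the pairing quality. Merging a consensus feature of size 2 and quality 1.0 with a single feature at a pairing quality of 0.5 gives (0.5 + 1.0) / 2 = 0.75.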
@htmlinclude OpenMS_StablePairFinder.parameters
@ingroup FeatureGrouping
class OPENMS_DLLAPI StablePairFinder : public BaseGroupFinder
typedef BaseGroupFinder Base;
/// Returns an instance of this class
static BaseGroupFinder*
return new StablePairFinder();
/// Returns the name of this module
@brief Run the algorithm
@note Exactly two @em input maps must be provided.
@exception Exception::IllegalArgument is thrown if the input data is not valid.
void run(const std::vector<ConsensusMap>& input_maps,
ConsensusMap& result_map);
///@name Internal helper classes and enums
@brief Checks if the peptide IDs of two features are compatible.
A feature without identification is always compatible. Otherwise, two features are compatible if the best peptide hits of their identifications have the same sequences.
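The compatibility rule can be illustrated with a simplified stand-in. The function @c compatibleSequences below is hypothetical and does not use the OpenMS identification classes. It assumes that each feature has been reduced to the sequence of its best peptide hit, with an empty string meaning "no identification":

```cpp
#include <string>

// Hypothetical stand-in for the check described above (not the OpenMS API).
// Each argument is the sequence of a feature's best peptide hit; an empty
// string is used in this sketch to mean "no identification".
bool compatibleSequences(const std::string& seq1, const std::string& seq2)
{
  // A feature without identification is compatible with anything.
  if (seq1.empty() || seq2.empty())
  {
    return true;
  }
  // Otherwise the best-hit sequences must match exactly.
  return seq1 == seq2;
}
```

Under these assumptions, an unannotated feature pairs with any other feature, and two annotated features pair only if their best-hit sequences are identical.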
bool compatibleIDs_(const ConsensusFeature& feat1,
const ConsensusFeature& feat2) const;
/// The distance to the second-nearest neighbor must be larger than the distance to the matched element by at least this factor.
DoubleReal second_nearest_gap_;
/// Only match if peptide IDs are compatible?
} // namespace OpenMS
#endif // OPENMS_ANALYSIS_MAPMATCHING_STABLEPAIRFINDER_H
gnuplot history - how the plot was created - please do not delete this receipt
f(x,intercept,exponent)=1/(1+(abs(x)*intercept)**exponent)
set terminal postscript enhanced color
set output "choosingstablepairfinderparams.ps"
plot [-3:3] [0:1] f(x,1,1), f(x,2,1), f(x,1,2), f(x,2,2)