/*
 * This file is licensed to You under the "Simplified BSD License".
 * You may not use this software except in compliance with the License.
 * You may obtain a copy of the License at
 *
 * http://www.opensource.org/licenses/bsd-license.php
 *
 * See the COPYRIGHT file distributed with this work for information
 * regarding copyright ownership.
 */

package ch.usi.inf.sape.hac.agglomeration;

/**
 * The "average" linkage method, also called "group average",
 * "unweighted average", or "Unweighted Pair Group Method using
 * Arithmetic averages" (UPGMA), is a graph-based approach.
 *
 * The distance between two clusters is calculated as the average
 * of the distances between all pairs of objects in opposite clusters.
 * This method tends to produce small clusters of outliers,
 * but does not deform the cluster space.
 * [The data analysis handbook. By Ildiko E. Frank, Roberto Todeschini]
 *
 * The general form of the Lance-Williams matrix-update formula:
 * d[(i,j),k] = ai*d[i,k] + aj*d[j,k] + b*d[i,j] + g*|d[i,k]-d[j,k]|
 *
 * For the "group average" method, where ci and cj are the numbers
 * of objects in clusters i and j:
 * ai = ci/(ci+cj)
 * aj = cj/(ci+cj)
 * b  = 0
 * g  = 0
 *
 * Thus:
 * d[(i,j),k] = ci/(ci+cj)*d[i,k] + cj/(ci+cj)*d[j,k]
 *            = ( ci*d[i,k] + cj*d[j,k] ) / (ci+cj)
 *
 * @author Matthias.Hauswirth@usi.ch
 */
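// A minimal usage sketch, assuming the method is called directly rather than
// from the clustering driver; the input values are illustrative only.
// Merging cluster i (ci = 2 objects) with cluster j (cj = 3 objects), where
// d[i,k] = 4.0 and d[j,k] = 9.0:
//
//   double d = new AverageLinkage().computeDissimilarity(4.0, 9.0, 1.0, 2, 3, 1);
//   // d == (2*4.0 + 3*9.0) / (2+3) == 7.0
//
// The values passed for dij (1.0) and ck (1) do not influence the result.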
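public final class AverageLinkage implements AgglomerationMethod {

    // dij and ck are accepted for the general update signature but are not
    // needed here: the group-average formula has no d[i,j] term and does
    // not depend on the size of cluster k.
    public double computeDissimilarity(final double dik, final double djk, final double dij, final int ci, final int cj, final int ck) {
        return (ci*dik+cj*djk)/(ci+cj);
    }

    public String toString() {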