Licensed to the Apache Software Foundation (ASF) under one or more
contributor license agreements. See the NOTICE file distributed with
this work for additional information regarding copyright ownership.
The ASF licenses this file to You under the Apache License, Version 2.0
(the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
<?xml-stylesheet type="text/xsl" href="./xdoc.xsl"?>
<!-- $Revision: 620293 $ $Date: 2008-02-10 09:44:50 -0700 (Sun, 10 Feb 2008) $ -->
<document url="stat.html">
<title>The Commons Math User Guide - Statistics</title>
<section name="1 Statistics">
<subsection name="1.1 Overview" href="overview">
The statistics package provides frameworks and implementations for
basic descriptive statistics, frequency distributions, bivariate regression,
and t-, chi-square and ANOVA test statistics.

<a href="#1.2 Descriptive statistics">Descriptive statistics</a><br></br>
<a href="#1.3 Frequency distributions">Frequency distributions</a><br></br>
<a href="#1.4 Simple regression">Simple Regression</a><br></br>
<a href="#1.5 Statistical tests">Statistical Tests</a><br></br>
<subsection name="1.2 Descriptive statistics" href="univariate">
The stat package includes a framework and default implementations for
the following descriptive statistics:
<li>arithmetic and geometric means</li>
<li>variance and standard deviation</li>
<li>sum, product, log sum, sum of squared values</li>
<li>minimum, maximum, median, and percentiles</li>
<li>skewness and kurtosis</li>
<li>first, second, third and fourth moments</li>

With the exception of percentiles and the median, all of these
statistics can be computed without maintaining the full list of input
data values in memory. The stat package provides interfaces and
implementations that do not require value storage as well as
implementations that operate on arrays of stored values.
The top level interface is
<a href="../apidocs/org/apache/commons/math/stat/descriptive/UnivariateStatistic.html">
org.apache.commons.math.stat.descriptive.UnivariateStatistic</a>.
This interface, implemented by all statistics, consists of
<code>evaluate()</code> methods that take double[] arrays as arguments
and return the value of the statistic. This interface is extended by
<a href="../apidocs/org/apache/commons/math/stat/descriptive/StorelessUnivariateStatistic.html">
StorelessUnivariateStatistic</a>, which adds <code>increment()</code>,
<code>getResult()</code> and associated methods to support
"storageless" implementations that maintain counters, sums or other
state information as values are added using the <code>increment()</code>
method.
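To make the "storageless" idea concrete, here is a minimal sketch in plain Java of a statistic that keeps only a counter and two running values. It is not the Commons Math implementation: the class name <code>RunningStats</code> and its methods are hypothetical, and the update rule used (Welford's one-pass update) is an assumption chosen for illustration.

```java
// Illustrative sketch only; RunningStats is a hypothetical class,
// not part of Commons Math.
final class RunningStats {
    private long n;       // number of values seen so far
    private double mean;  // running mean of the values
    private double m2;    // running sum of squared deviations from the mean

    /** Updates the internal state with one new value (Welford's update). */
    void increment(double x) {
        n++;
        double delta = x - mean;
        mean += delta / n;
        m2 += delta * (x - mean);
    }

    /** Returns the number of values added so far. */
    long getN() {
        return n;
    }

    /** Returns the mean of the values added so far, or NaN if there are none. */
    double getMean() {
        return n == 0 ? Double.NaN : mean;
    }

    /** Returns the bias-corrected sample variance (divisor n - 1). */
    double getVariance() {
        if (n == 0) {
            return Double.NaN;
        }
        if (n == 1) {
            return 0.0;
        }
        return m2 / (n - 1);
    }

    public static void main(String[] args) {
        RunningStats stats = new RunningStats();
        for (double x : new double[] {1.0, 2.0, 3.0, 4.0}) {
            stats.increment(x);
        }
        System.out.println(stats.getMean());
        System.out.println(stats.getVariance());
    }
}
```

No input value is retained after <code>increment()</code> returns, which is why statistics of this kind can be maintained over arbitrarily long streams, while the median and percentiles cannot.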
Abstract implementations of the top level interfaces are provided in
<a href="../apidocs/org/apache/commons/math/stat/descriptive/AbstractUnivariateStatistic.html">
AbstractUnivariateStatistic</a> and
<a href="../apidocs/org/apache/commons/math/stat/descriptive/AbstractStorelessUnivariateStatistic.html">
AbstractStorelessUnivariateStatistic</a> respectively.

Each statistic is implemented as a separate class, in one of the
subpackages (moment, rank, summary) and each extends one of the abstract
classes above (depending on whether or not value storage is required to
compute the statistic). There are several ways to instantiate and use statistics.
Statistics can be instantiated and used directly, but it is generally more convenient
(and efficient) to access them using the provided aggregates,
<a href="../apidocs/org/apache/commons/math/stat/descriptive/DescriptiveStatistics.html">
DescriptiveStatistics</a> and
<a href="../apidocs/org/apache/commons/math/stat/descriptive/SummaryStatistics.html">
SummaryStatistics</a>.
<code>DescriptiveStatistics</code> maintains the input data in memory
and has the capability of producing "rolling" statistics computed from a
"window" consisting of the most recently added values.
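As a rough illustration of what a "rolling" statistic over a window means, the following plain-Java sketch keeps only the most recent values and reports their mean. The class <code>RollingMean</code> and its methods are hypothetical names used for this example; how <code>DescriptiveStatistics</code> manages its window internally is not shown here.

```java
import java.util.ArrayDeque;

// Illustrative sketch only; RollingMean is a hypothetical class,
// not part of Commons Math.
final class RollingMean {
    private final int windowSize;
    private final ArrayDeque<Double> window = new ArrayDeque<Double>();
    private double sum; // sum of the values currently in the window

    RollingMean(int windowSize) {
        if (windowSize <= 0) {
            throw new IllegalArgumentException("windowSize must be positive");
        }
        this.windowSize = windowSize;
    }

    /** Adds a value, discarding the oldest one once the window is full. */
    void addValue(double x) {
        if (window.size() == windowSize) {
            sum -= window.removeFirst();
        }
        window.addLast(x);
        sum += x;
    }

    /** Returns the mean of the values currently in the window, or NaN if empty. */
    double getMean() {
        return window.isEmpty() ? Double.NaN : sum / window.size();
    }
}
```

With a window of size 3, adding 1, 2, 3 and then 4 leaves 2, 3 and 4 in the window, so only those three values contribute to the mean.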
<code>SummaryStatistics</code> does not store the input data values
in memory, so the statistics included in this aggregate are limited to those
that can be computed in one pass through the data without access to
the full array of values.

<code>MultivariateSummaryStatistics</code> is similar to <code>SummaryStatistics</code>
but handles n-tuple values instead of scalar values. It can also compute the
full covariance matrix for the input data.
<tr><th>Aggregate</th><th>Statistics Included</th><th>Values stored?</th>
<th>"Rolling" capability?</th></tr>
<tr><td>
<a href="../apidocs/org/apache/commons/math/stat/descriptive/DescriptiveStatistics.html">
DescriptiveStatistics</a></td>
<td>min, max, mean, geometric mean, n,
sum, sum of squares, standard deviation, variance, percentiles, skewness,
kurtosis, median</td>
<td>Yes</td><td>Yes</td></tr>
<tr><td>
<a href="../apidocs/org/apache/commons/math/stat/descriptive/SummaryStatistics.html">
SummaryStatistics</a></td>
<td>min, max, mean, geometric mean, n,
sum, sum of squares, standard deviation, variance</td>
<td>No</td><td>No</td></tr>
There is also a utility class,
<a href="../apidocs/org/apache/commons/math/stat/StatUtils.html">
StatUtils</a>, that provides static methods for computing statistics
directly from double[] arrays.

Here are some examples showing how to compute descriptive statistics.
<dt>Compute summary statistics for a list of double values</dt>
<dd>Using the <code>DescriptiveStatistics</code> aggregate
(values are stored in memory):
// Get a DescriptiveStatistics instance using factory method
DescriptiveStatistics stats = DescriptiveStatistics.newInstance();

// Add the data from the array (inputArray is assumed to be a double[])
for (int i = 0; i < inputArray.length; i++) {
    stats.addValue(inputArray[i]);
}

// Compute some statistics
double mean = stats.getMean();
double std = stats.getStandardDeviation();
double median = stats.getMedian();
<dd>Using the <code>SummaryStatistics</code> aggregate (values are
<strong>not</strong> stored in memory):
// Get a SummaryStatistics instance using factory method
SummaryStatistics stats = SummaryStatistics.newInstance();

// Read data from an input stream (in is assumed to be a BufferedReader),
// adding values and updating sums, counters, etc.
String line;
while ((line = in.readLine()) != null) {
    stats.addValue(Double.parseDouble(line.trim()));
}

// Compute the statistics
double mean = stats.getMean();
double std = stats.getStandardDeviation();
//double median = stats.getMedian(); <-- NOT AVAILABLE
<dd>Using the <code>StatUtils</code> utility class:
// Compute statistics directly from the array
// assume values is a double[] array
double mean = StatUtils.mean(values);
double std = Math.sqrt(StatUtils.variance(values));
double median = StatUtils.percentile(values, 50);

// Compute the mean of the first three values in the array
mean = StatUtils.mean(values, 0, 3);
<dt>Maintain a "rolling mean" of the most recent 100 values from
an input stream</dt>
<dd>Use a <code>DescriptiveStatistics</code> instance with
window size set to 100
// Create a DescriptiveStats instance and set the window size to 100
DescriptiveStatistics stats = DescriptiveStatistics.newInstance();
stats.setWindowSize(100);

// Read data from an input stream (in is assumed to be a BufferedReader),
// displaying the mean of the most recent 100 observations
// after every 100 observations
long nLines = 0;
String line;
while ((line = in.readLine()) != null) {
    stats.addValue(Double.parseDouble(line.trim()));
    if (++nLines == 100) {
        nLines = 0;
        System.out.println(stats.getMean());
    }
}
<dt>Compute statistics in a thread-safe manner</dt>
<dd>Use a <code>SynchronizedDescriptiveStatistics</code> instance
// Create a SynchronizedDescriptiveStatistics instance and
// use as any other DescriptiveStatistics instance
DescriptiveStatistics stats = DescriptiveStatistics.newInstance(SynchronizedDescriptiveStatistics.class);
<subsection name="1.3 Frequency distributions" href="frequency">
<a href="../apidocs/org/apache/commons/math/stat/Frequency.html">
org.apache.commons.math.stat.Frequency</a>
provides a simple interface for maintaining counts and percentages of discrete
values.

Strings, integers, longs and chars are all supported as value types,
as well as instances of any class that implements <code>Comparable</code>.
The ordering of values used in computing cumulative frequencies is by
default the <i>natural ordering</i>, but this can be overridden by supplying a
<code>Comparator</code> to the constructor. Adding values that are not
comparable to those that have already been added results in an
<code>IllegalArgumentException</code>.
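The relationship between counts, cumulative percentages and the ordering of values can be sketched with a sorted map. The code below is a simplified stand-in written for this guide, not the <code>Frequency</code> source: the class <code>FrequencySketch</code> is hypothetical, and unlike <code>Frequency</code> it does not convert between numeric types.

```java
import java.util.Comparator;
import java.util.TreeMap;

// Illustrative sketch only; FrequencySketch is a hypothetical class,
// not part of Commons Math.
final class FrequencySketch<T> {
    private final TreeMap<T, Long> counts; // value -> number of occurrences
    private long total;                    // total number of values added

    /** Uses the natural ordering of the values (T must be Comparable). */
    FrequencySketch() {
        counts = new TreeMap<T, Long>();
    }

    /** Uses the supplied comparator to order (and identify) values. */
    FrequencySketch(Comparator<? super T> comparator) {
        counts = new TreeMap<T, Long>(comparator);
    }

    void addValue(T v) {
        Long c = counts.get(v);
        counts.put(v, c == null ? 1L : c + 1L);
        total++;
    }

    long getCount(T v) {
        Long c = counts.get(v);
        return c == null ? 0L : c;
    }

    /** Proportion of added values that are less than or equal to v. */
    double getCumPct(T v) {
        if (total == 0) {
            return Double.NaN;
        }
        long cumulative = 0;
        for (long c : counts.headMap(v, true).values()) {
            cumulative += c;
        }
        return (double) cumulative / total;
    }
}
```

Because the comparator decides both the sort order and which values count as equal, constructing the sketch with <code>String.CASE_INSENSITIVE_ORDER</code> makes "one", "One" and "oNe" share a single count.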
Here are some examples.
<dt>Compute a frequency distribution based on integer values</dt>
<dd>Mixing integers, longs, Integers and Longs:
Frequency f = new Frequency();
f.addValue(1);
f.addValue(new Integer(1));
f.addValue(new Long(1));
f.addValue(2);
f.addValue(new Integer(-1));
System.out.println(f.getCount(1)); // displays 3
System.out.println(f.getCumPct(0)); // displays 0.2
System.out.println(f.getPct(new Integer(1))); // displays 0.6
System.out.println(f.getCumPct(-2)); // displays 0
System.out.println(f.getCumPct(10)); // displays 1
<dt>Count string frequencies</dt>
<dd>Using case-sensitive comparison, alpha sort order (natural comparator):
Frequency f = new Frequency();
f.addValue("one"); f.addValue("One");
f.addValue("oNe"); f.addValue("Z");
System.out.println(f.getCount("one")); // displays 1
System.out.println(f.getCumPct("Z")); // displays 0.5
System.out.println(f.getCumPct("Ot")); // displays 0.25
<dd>Using case-insensitive comparator:
Frequency f = new Frequency(String.CASE_INSENSITIVE_ORDER);
f.addValue("one"); f.addValue("One");
f.addValue("oNe"); f.addValue("Z");
System.out.println(f.getCount("one")); // displays 3
System.out.println(f.getCumPct("z")); // displays 1
<subsection name="1.4 Simple regression" href="regression">
<a href="../apidocs/org/apache/commons/math/stat/regression/SimpleRegression.html">
org.apache.commons.math.stat.regression.SimpleRegression</a>
provides ordinary least squares regression with one independent variable,
estimating the linear model:

<code> y = intercept + slope * x </code>

Standard errors for <code>intercept</code> and <code>slope</code> are
available as well as ANOVA, r-square and Pearson's r statistics.

Observations (x,y pairs) can be added to the model one at a time or they
can be provided in a 2-dimensional array. The observations are not stored
in memory, so there is no limit to the number of observations that can be
added to the model.
<strong>Usage Notes</strong>: <ul>
<li> When there are fewer than two observations in the model, or when
there is no variation in the x values (i.e. all x values are the same)
all statistics return <code>NaN</code>. At least two observations with
different x coordinates are required to estimate a bivariate regression
model.</li>
<li> Getters for the statistics always compute values based on the current
set of observations -- i.e., you can get statistics, then add more data
and get updated statistics without using a new instance. There is no
"compute" method that updates all statistics. Each of the getters performs
the necessary computations to return the requested statistic.</li>
<strong>Implementation Notes</strong>: <ul>
<li> As observations are added to the model, the sum of x values, y values,
cross products (x times y), and squared deviations of x and y from their
respective means are updated using updating formulas defined in
"Algorithms for Computing the Sample Variance: Analysis and
Recommendations", Chan, T.F., Golub, G.H., and LeVeque, R.J.
1983, American Statistician, vol. 37, pp. 242-247, referenced in
Weisberg, S. "Applied Linear Regression". 2nd Ed. 1985. All regression
statistics are computed from these sums.</li>
<li> Inference statistics (confidence intervals, parameter significance levels)
are based on the assumption that the observations included in the model are
drawn from a <a href="http://mathworld.wolfram.com/BivariateNormalDistribution.html">
Bivariate Normal Distribution</a>.</li>
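The following plain-Java sketch shows how slope and intercept can be obtained from running sums that are updated one observation at a time, so that no observation needs to be stored. It is written for illustration under the assumption of a standard one-pass update for sums of squared deviations and cross products; the class <code>RegressionSketch</code> is hypothetical and is not the <code>SimpleRegression</code> source.

```java
// Illustrative sketch only; RegressionSketch is a hypothetical class,
// not part of Commons Math.
final class RegressionSketch {
    private long n;        // number of observations
    private double xbar;   // running mean of x
    private double ybar;   // running mean of y
    private double sumXX;  // sum of squared deviations of x from its mean
    private double sumXY;  // sum of cross products of deviations of x and y

    /** Adds one (x, y) observation, updating the running sums in place. */
    void addData(double x, double y) {
        if (n == 0) {
            xbar = x;
            ybar = y;
        } else {
            double dx = x - xbar;
            double dy = y - ybar;
            double factor = (double) n / (n + 1);
            sumXX += dx * dx * factor;
            sumXY += dx * dy * factor;
            xbar += dx / (n + 1);
            ybar += dy / (n + 1);
        }
        n++;
    }

    /** Least squares slope, or NaN with fewer than two distinct x values. */
    double getSlope() {
        if (n < 2 || sumXX == 0.0) {
            return Double.NaN;
        }
        return sumXY / sumXX;
    }

    /** Least squares intercept: the fitted line passes through (xbar, ybar). */
    double getIntercept() {
        return ybar - getSlope() * xbar;
    }

    /** Predicted y for a given x under the fitted model. */
    double predict(double x) {
        return getIntercept() + getSlope() * x;
    }
}
```

Each getter recomputes its result from the current sums, which mirrors the usage note above: statistics can be requested, more data added, and updated statistics requested again from the same instance.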
Here are some examples.
<dt>Estimate a model based on observations added one at a time</dt>
<dd>Instantiate a regression instance and add data points
SimpleRegression regression = new SimpleRegression();
regression.addData(1d, 2d);
// At this point, with only one observation,
// all regression statistics will return NaN

regression.addData(3d, 3d);
// With only two observations,
// slope and intercept can be computed
// but inference statistics will return NaN

regression.addData(3d, 3d);
// Now all statistics are defined.
<dd>Compute some statistics based on observations added so far
System.out.println(regression.getIntercept());
// displays intercept of regression line

System.out.println(regression.getSlope());
// displays slope of regression line

System.out.println(regression.getSlopeStdErr());
// displays slope standard error

<dd>Use the regression model to predict the y value for a new x value
System.out.println(regression.predict(1.5d));
// displays predicted y value for x = 1.5

More data points can be added and subsequent getXxx calls will incorporate
additional data in statistics.
<dt>Estimate a model from a double[][] array of data points</dt>
<dd>Instantiate a regression object and load dataset
double[][] data = { {1, 3}, {2, 5}, {3, 7}, {4, 14}, {5, 11} };
SimpleRegression regression = new SimpleRegression();
regression.addData(data);

<dd>Estimate regression model based on data
System.out.println(regression.getIntercept());
// displays intercept of regression line

System.out.println(regression.getSlope());
// displays slope of regression line

System.out.println(regression.getSlopeStdErr());
// displays slope standard error

More data points -- even another double[][] array -- can be added and subsequent
getXxx calls will incorporate additional data in statistics.
<subsection name="1.5 Statistical tests" href="tests">
The interfaces and implementations in the
<a href="../apidocs/org/apache/commons/math/stat/inference/">
org.apache.commons.math.stat.inference</a> package provide
<a href="http://www.itl.nist.gov/div898/handbook/prc/section2/prc22.htm">
Student's t</a>,
<a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda35f.htm">
Chi-Square</a> and
<a href="http://www.itl.nist.gov/div898/handbook/prc/section4/prc43.htm">
One-Way ANOVA</a> test statistics as well as
<a href="http://www.cas.lancs.ac.uk/glossary_v1.1/hyptest.html#pvalue">
p-values</a> associated with <code>t-</code>,
<code>Chi-Square</code> and <code>One-Way ANOVA</code> tests. The
interfaces are
<a href="../apidocs/org/apache/commons/math/stat/inference/TTest.html">
TTest</a>,
<a href="../apidocs/org/apache/commons/math/stat/inference/ChiSquareTest.html">
ChiSquareTest</a>, and
<a href="../apidocs/org/apache/commons/math/stat/inference/OneWayAnova.html">
OneWayAnova</a> with provided implementations
<a href="../apidocs/org/apache/commons/math/stat/inference/TTestImpl.html">
TTestImpl</a>,
<a href="../apidocs/org/apache/commons/math/stat/inference/ChiSquareTestImpl.html">
ChiSquareTestImpl</a> and
<a href="../apidocs/org/apache/commons/math/stat/inference/OneWayAnovaImpl.html">
OneWayAnovaImpl</a>, respectively.
The
<a href="../apidocs/org/apache/commons/math/stat/inference/TestUtils.html">
TestUtils</a> class provides static methods to get test instances or
to compute test statistics directly. The examples below all use the
static methods in <code>TestUtils</code> to execute tests. To get
test object instances, either use e.g.,
<code>TestUtils.getTTest()</code> or use the implementation constructors
directly, e.g.,
<code>new TTestImpl()</code>.
<strong>Implementation Notes</strong>
<li>Both one- and two-sample t-tests are supported. Two-sample tests
can be either paired or unpaired and the unpaired two-sample tests can
be conducted under the assumption of equal subpopulation variances or
without this assumption. When equal variances are assumed, a pooled
variance estimate is used to compute the t-statistic and the degrees
of freedom used in the t-test equals the sum of the sample sizes minus 2.
When equal variances are not assumed, the t-statistic uses both sample
variances and the
<a href="http://www.itl.nist.gov/div898/handbook/prc/section3/gifs/nu3.gif">
Welch-Satterthwaite approximation</a> is used to compute the degrees
of freedom. Methods to return t-statistics and p-values are provided in each
case, as well as boolean-valued methods to perform fixed significance
level tests. The names of tests or test-statistic methods that assume equal
subpopulation variances always start with "homoscedastic". Test or
test-statistic methods that just start with "t" do not assume equal
variances. See the examples below and the API documentation for
details.</li>
<li>The validity of the p-values returned by the t-test depends on the
assumptions of the parametric t-test procedure, as discussed
<a href="http://www.basic.nwu.edu/statguidefiles/ttest_unpaired_ass_viol.html">
here</a>.</li>
<li>p-values returned by t-, chi-square and ANOVA tests are exact, in the
sense that they are computed from the t-, chi-square and F distributions
themselves, using the numerical approximations to those distributions in the
<code>distribution</code> package. </li>
<li>p-values returned by t-tests are for two-sided tests and the boolean-valued
methods supporting fixed significance level tests assume that the hypotheses
are two-sided. One-sided tests can be performed by dividing the returned p-values
by 2, provided the sign of the t-statistic is consistent with the one-sided
alternative; with the boolean-valued methods, the equivalent is to pass twice
the desired significance level.</li>
<li>Degrees of freedom for chi-square tests are integral values, based on the
number of observed or expected counts (number of observed counts - 1)
for the goodness-of-fit tests and (number of columns - 1) * (number of rows - 1)
for independence tests.</li>
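The difference between the two unpaired variants described in the notes above can be written out directly from summary values. The sketch below is plain Java using the textbook formulas for the pooled-variance statistic and for the Welch-Satterthwaite degrees of freedom; the class <code>TwoSampleTSketch</code> and its method names are hypothetical, and p-values are deliberately left out because they require a t distribution implementation.

```java
// Illustrative sketch only; TwoSampleTSketch is a hypothetical class,
// not part of Commons Math. m = sample mean, v = sample variance, n = sample size.
final class TwoSampleTSketch {

    /** t-statistic that does not assume equal subpopulation variances. */
    static double t(double m1, double m2, double v1, double v2, double n1, double n2) {
        return (m1 - m2) / Math.sqrt(v1 / n1 + v2 / n2);
    }

    /** Welch-Satterthwaite approximate degrees of freedom (generally not an integer). */
    static double welchDf(double v1, double v2, double n1, double n2) {
        double a = v1 / n1;
        double b = v2 / n2;
        return (a + b) * (a + b) / (a * a / (n1 - 1) + b * b / (n2 - 1));
    }

    /** t-statistic based on a pooled variance estimate (equal variances assumed). */
    static double homoscedasticT(double m1, double m2, double v1, double v2,
                                 double n1, double n2) {
        double pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / (n1 + n2 - 2);
        return (m1 - m2) / Math.sqrt(pooled * (1 / n1 + 1 / n2));
    }

    /** Degrees of freedom when equal variances are assumed. */
    static double homoscedasticDf(double n1, double n2) {
        return n1 + n2 - 2;
    }
}
```

When the two sample variances and sample sizes are equal, the two statistics coincide and the Welch-Satterthwaite value reduces to the sum of the sample sizes minus 2; the variants differ only when the samples are unbalanced or their variances differ.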
<strong>Examples:</strong>
<dt><strong>One-sample <code>t</code> tests</strong></dt>
<dd>To compare the mean of a double[] array to a fixed value:
double[] observed = {1d, 2d, 3d};
double mu = 2.5d; // hypothesized mean; the value is illustrative
System.out.println(TestUtils.t(mu, observed));

The code above will display the t-statistic associated with a one-sample
t-test comparing the mean of the <code>observed</code> values against
<code>mu</code>.
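For reference, the one-sample t-statistic is the difference between the sample mean and the hypothesized mean, divided by the standard error of the mean. The plain-Java sketch below computes it from an array; the class <code>OneSampleTSketch</code> is a hypothetical name used only for this illustration and is not the library code behind <code>TestUtils.t</code>.

```java
// Illustrative sketch only; OneSampleTSketch is a hypothetical class,
// not part of Commons Math.
final class OneSampleTSketch {

    /** Returns (mean - mu) / (s / sqrt(n)), where s is the sample standard deviation. */
    static double t(double mu, double[] observed) {
        int n = observed.length;
        if (n < 2) {
            throw new IllegalArgumentException("at least 2 observations are required");
        }
        double sum = 0.0;
        for (double x : observed) {
            sum += x;
        }
        double mean = sum / n;

        double sumSq = 0.0; // sum of squared deviations from the mean
        for (double x : observed) {
            sumSq += (x - mean) * (x - mean);
        }
        double sd = Math.sqrt(sumSq / (n - 1));

        return (mean - mu) / (sd / Math.sqrt(n));
    }
}
```

For the values 1, 2, 3 and a hypothesized mean of 2.5, the sample mean is 2 and the sample standard deviation is 1, so the statistic is -0.5 times the square root of 3.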
<dd>To compare the mean of a dataset described by a
<a href="../apidocs/org/apache/commons/math/stat/descriptive/StatisticalSummary.html">
org.apache.commons.math.stat.descriptive.StatisticalSummary</a> to a fixed value:
double[] observed = {1d, 2d, 3d};
double mu = 2.5d; // hypothesized mean; the value is illustrative
SummaryStatistics sampleStats = SummaryStatistics.newInstance();
for (int i = 0; i < observed.length; i++) {
    sampleStats.addValue(observed[i]);
}
System.out.println(TestUtils.t(mu, sampleStats));
<dd>To compute the p-value associated with the null hypothesis that the mean
of a set of values equals a point estimate, against the two-sided alternative that
the mean is different from the target value:
double[] observed = {1d, 2d, 3d};
double mu = 2.5d; // hypothesized mean; the value is illustrative
System.out.println(TestUtils.tTest(mu, observed));

The snippet above will display the p-value associated with the null
hypothesis that the mean of the population from which the
<code>observed</code> values are drawn equals <code>mu</code>.

<dd>To perform the test using a fixed significance level, use:
TestUtils.tTest(mu, observed, alpha);

where <code>0 < alpha < 0.5</code> is the significance level of
the test. The boolean value returned will be <code>true</code> iff the
null hypothesis can be rejected with confidence <code>1 - alpha</code>.
To test, for example, at the 95% level of confidence, use
<code>alpha = 0.05</code>.
<dt><strong>Two-Sample t-tests</strong></dt>
<dd><strong>Example 1:</strong> Paired test evaluating
the null hypothesis that the mean difference between corresponding
(paired) elements of the <code>double[]</code> arrays
<code>sample1</code> and <code>sample2</code> is zero.

To compute the t-statistic:
TestUtils.pairedT(sample1, sample2);

To compute the p-value:
TestUtils.pairedTTest(sample1, sample2);

To perform a fixed significance level test with alpha = .05:
TestUtils.pairedTTest(sample1, sample2, .05);

The last example will return <code>true</code> iff the p-value
returned by <code>TestUtils.pairedTTest(sample1, sample2)</code>
is less than <code>.05</code>.
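A paired test is a one-sample test applied to the element-wise differences between the two arrays. The following plain-Java sketch computes the paired t-statistic that way; the class <code>PairedTSketch</code> is hypothetical and is shown only to make the definition concrete, not as the library's implementation.

```java
// Illustrative sketch only; PairedTSketch is a hypothetical class,
// not part of Commons Math.
final class PairedTSketch {

    /** t-statistic for the null hypothesis that the mean paired difference is zero. */
    static double pairedT(double[] sample1, double[] sample2) {
        int n = sample1.length;
        if (n != sample2.length || n < 2) {
            throw new IllegalArgumentException(
                "samples must have the same length, with at least 2 elements");
        }
        // Mean of the paired differences
        double meanDiff = 0.0;
        for (int i = 0; i < n; i++) {
            meanDiff += sample1[i] - sample2[i];
        }
        meanDiff /= n;

        // Sample standard deviation of the paired differences
        double sumSq = 0.0;
        for (int i = 0; i < n; i++) {
            double dev = (sample1[i] - sample2[i]) - meanDiff;
            sumSq += dev * dev;
        }
        double sd = Math.sqrt(sumSq / (n - 1));

        return meanDiff / (sd / Math.sqrt(n));
    }
}
```

Because the statistic depends only on the differences, both arrays must have the same length and the pairing of elements by index matters.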
<dd><strong>Example 2: </strong> unpaired, two-sided, two-sample t-test using
<code>StatisticalSummary</code> instances, without assuming that
subpopulation variances are equal.

First create the <code>StatisticalSummary</code> instances. Both
<code>DescriptiveStatistics</code> and <code>SummaryStatistics</code>
implement this interface. Assume that <code>summary1</code> and
<code>summary2</code> are <code>SummaryStatistics</code> instances,
each of which has had at least 2 values added to the (virtual) dataset that
it describes. The sample sizes do not have to be the same -- all that is required
is that both samples have at least 2 elements.

<p><strong>Note:</strong> The <code>SummaryStatistics</code> class does
not store the dataset that it describes in memory, but it does compute all
statistics necessary to perform t-tests, so this method can be used to
conduct t-tests with very large samples. One-sample tests can also be
performed this way.
(See <a href="#1.2 Descriptive statistics">Descriptive statistics</a> for details
on the <code>SummaryStatistics</code> class.)
To compute the t-statistic:
TestUtils.t(summary1, summary2);

To compute the p-value:
TestUtils.tTest(summary1, summary2);

To perform a fixed significance level test with alpha = .05:
TestUtils.tTest(summary1, summary2, .05);

In each case above, the test does not assume that the subpopulation
variances are equal. To perform the tests under this assumption,
replace "t" at the beginning of the method name with "homoscedasticT".
<dt><strong>Chi-square tests</strong></dt>
<dd>To compute a chi-square statistic measuring the agreement between a
<code>long[]</code> array of observed counts and a <code>double[]</code>
array of expected counts, use:
long[] observed = {10, 9, 11};
double[] expected = {10.1, 9.8, 10.3};
System.out.println(TestUtils.chiSquare(expected, observed));

The value displayed will be
<code>sum((expected[i] - observed[i])^2 / expected[i])</code>
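The formula above translates directly into a short loop. The sketch below is plain Java written for illustration, with a hypothetical class name <code>ChiSquareSketch</code>; it evaluates only the sum as stated and makes no claim about any additional validation or rescaling a library implementation might perform.

```java
// Illustrative sketch only; ChiSquareSketch is a hypothetical class,
// not part of Commons Math.
final class ChiSquareSketch {

    /** Returns sum((expected[i] - observed[i])^2 / expected[i]). */
    static double chiSquare(double[] expected, long[] observed) {
        if (expected.length != observed.length || expected.length < 2) {
            throw new IllegalArgumentException(
                "arrays must have the same length, with at least 2 elements");
        }
        double sum = 0.0;
        for (int i = 0; i < expected.length; i++) {
            if (expected[i] <= 0.0 || observed[i] < 0) {
                throw new IllegalArgumentException(
                    "expected counts must be positive and observed counts non-negative");
            }
            double dev = expected[i] - observed[i];
            sum += dev * dev / expected[i];
        }
        return sum;
    }

    /** Degrees of freedom for the goodness-of-fit test: number of counts - 1. */
    static int degreesOfFreedom(long[] observed) {
        return observed.length - 1;
    }
}
```

Each term is a squared deviation scaled by the expected count, so cells with small expected counts contribute disproportionately when they deviate.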
<dd> To get the p-value associated with the null hypothesis that
<code>observed</code> conforms to <code>expected</code> use:
TestUtils.chiSquareTest(expected, observed);

<dd> To test the null hypothesis that <code>observed</code> conforms to
<code>expected</code> with <code>alpha</code> significance level
(equiv. <code>100 * (1-alpha)%</code> confidence) where <code>
0 < alpha < 0.5 </code> use:
TestUtils.chiSquareTest(expected, observed, alpha);

The boolean value returned will be <code>true</code> iff the null hypothesis
can be rejected with confidence <code>1 - alpha</code>.
<dd>To compute a chi-square statistic associated with a
<a href="http://www.itl.nist.gov/div898/handbook/prc/section4/prc45.htm">
chi-square test of independence</a> based on a two-dimensional (long[][])
<code>counts</code> array viewed as a two-way table, use:
TestUtils.chiSquare(counts);

The rows of the 2-way table are
<code>counts[0], ... , counts[counts.length - 1]. </code><br></br>
The chi-square statistic returned is
<code>sum((counts[i][j] - expected[i][j])^2/expected[i][j])</code>
where the sum is taken over all table entries and
<code>expected[i][j]</code> is the product of the row and column sums at
row <code>i</code>, column <code>j</code> divided by the total count.
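The definition of <code>expected[i][j]</code> above can be sketched as follows in plain Java. The class <code>IndependenceSketch</code> is hypothetical and assumes a rectangular table with positive row and column sums; it illustrates the formula rather than reproducing the library code.

```java
// Illustrative sketch only; IndependenceSketch is a hypothetical class,
// not part of Commons Math. Assumes a rectangular table whose row and
// column sums are all positive.
final class IndependenceSketch {

    /** Chi-square statistic for a two-way table of counts. */
    static double chiSquare(long[][] counts) {
        int rows = counts.length;
        int cols = counts[0].length;
        double[] rowSum = new double[rows];
        double[] colSum = new double[cols];
        double total = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                rowSum[i] += counts[i][j];
                colSum[j] += counts[i][j];
                total += counts[i][j];
            }
        }
        double sum = 0.0;
        for (int i = 0; i < rows; i++) {
            for (int j = 0; j < cols; j++) {
                // Expected count under independence of rows and columns
                double expected = rowSum[i] * colSum[j] / total;
                double dev = counts[i][j] - expected;
                sum += dev * dev / expected;
            }
        }
        return sum;
    }

    /** Degrees of freedom: (number of rows - 1) * (number of columns - 1). */
    static int degreesOfFreedom(long[][] counts) {
        return (counts.length - 1) * (counts[0].length - 1);
    }
}
```

For the table with rows {10, 20} and {20, 10}, every expected count is 15 and every deviation is 5, so the statistic is 4 * 25 / 15.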
<dd>To compute the p-value associated with the null hypothesis that
the classifications represented by the counts in the columns of the input 2-way
table are independent of the rows, use:
TestUtils.chiSquareTest(counts);

<dd>To perform a chi-square test of independence with <code>alpha</code>
significance level (equiv. <code>100 * (1-alpha)%</code> confidence)
where <code>0 < alpha < 0.5 </code> use:
TestUtils.chiSquareTest(counts, alpha);

The boolean value returned will be <code>true</code> iff the null
hypothesis can be rejected with confidence <code>1 - alpha</code>.
<dt><strong>One-Way ANOVA tests</strong></dt>
<dd>To conduct a One-Way Analysis of Variance (ANOVA) to evaluate the
null hypothesis that the means of a collection of univariate datasets
are the same, start by loading the datasets into a collection, e.g.
double[] classA = {93.0, 103.0, 95.0, 101.0, 91.0, 105.0, 96.0, 94.0, 101.0 };
double[] classB = {99.0, 92.0, 102.0, 100.0, 102.0, 89.0 };
double[] classC = {110.0, 115.0, 111.0, 117.0, 128.0, 117.0 };
List classes = new ArrayList();
classes.add(classA); // the array names are illustrative
classes.add(classB);
classes.add(classC);
Then you can compute ANOVA F- or p-values associated with the
null hypothesis that the class means are all the same
using a <code>OneWayAnova</code> instance or <code>TestUtils</code>
methods:
double fStatistic = TestUtils.oneWayAnovaFValue(classes); // F-value
double pValue = TestUtils.oneWayAnovaPValue(classes); // P-value

To perform a One-Way ANOVA test with significance level set at 0.01
(so the test will, assuming assumptions are met, reject the null
hypothesis incorrectly only about one in 100 times), use
TestUtils.oneWayAnovaTest(classes, 0.01); // returns a boolean
// true means reject null hypothesis
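For readers who want to see what the F-value measures, the plain-Java sketch below computes it as the ratio of the between-class mean square to the within-class mean square. The class <code>AnovaSketch</code> is a hypothetical name; this is the textbook one-way ANOVA computation written for illustration, not the <code>OneWayAnovaImpl</code> source, and it does not compute p-values.

```java
import java.util.List;

// Illustrative sketch only; AnovaSketch is a hypothetical class,
// not part of Commons Math.
final class AnovaSketch {

    /** F = (between-class mean square) / (within-class mean square). */
    static double fValue(List<double[]> classes) {
        int k = classes.size(); // number of classes
        long n = 0;             // total number of observations
        double total = 0.0;
        for (double[] c : classes) {
            for (double x : c) {
                total += x;
                n++;
            }
        }
        if (k < 2 || n <= k) {
            throw new IllegalArgumentException(
                "need at least 2 classes and more observations than classes");
        }
        double grandMean = total / n;

        double ssBetween = 0.0; // variation of class means around the grand mean
        double ssWithin = 0.0;  // variation of observations around their class mean
        for (double[] c : classes) {
            double sum = 0.0;
            for (double x : c) {
                sum += x;
            }
            double mean = sum / c.length;
            ssBetween += c.length * (mean - grandMean) * (mean - grandMean);
            for (double x : c) {
                ssWithin += (x - mean) * (x - mean);
            }
        }
        return (ssBetween / (k - 1)) / (ssWithin / (n - k));
    }
}
```

A large F-value indicates that the class means differ by more than the scatter within the classes would suggest, which is what leads the fixed significance level test to reject the null hypothesis.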