org.geotoolkit.math
Class Statistics

Object
  extended by Statistics
All Implemented Interfaces:
Serializable, Cloneable
Direct Known Subclasses:
Statistics.Delta

public class Statistics
extends Object
implements Cloneable, Serializable

Holds statistics about a series of sample values. Given a series of sample values s0, s1, s2, s3..., this class computes the minimum, maximum, mean, root mean square and standard deviation. Statistics are computed on the fly, using the Kahan summation algorithm to reduce numerical errors; the sample values are never stored in memory.
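
The Kahan summation mentioned above can be illustrated with a minimal sketch. The class name KahanSketch and its members are hypothetical and are not taken from the Geotk source; the sketch only shows the general technique of carrying a compensation term next to the running sum.

```java
/** Illustrative compensated (Kahan) summation; not the Geotk implementation. */
final class KahanSketch {
    private double sum;          // running total
    private double compensation; // low-order bits lost by earlier additions

    /** Adds one sample while tracking the rounding error of the addition. */
    void add(double sample) {
        double y = sample - compensation; // re-inject the error of the previous step
        double t = sum + y;               // low-order bits of y may be lost here
        compensation = (t - sum) - y;     // recover what was lost (zero in exact arithmetic)
        sum = t;
    }

    double sum() {
        return sum;
    }

    /** Plain summation of the same value n times, for comparison only. */
    static double naiveSum(double value, int n) {
        double s = 0;
        for (int i = 0; i < n; i++) {
            s += value;
        }
        return s;
    }

    public static void main(String[] args) {
        KahanSketch kahan = new KahanSketch();
        for (int i = 0; i < 1_000_000; i++) {
            kahan.add(0.1);
        }
        System.out.println("naive: " + naiveSum(0.1, 1_000_000));
        System.out.println("kahan: " + kahan.sum());
    }
}
```

Running main prints both totals, so the accumulated drift of the plain loop can be compared with the compensated result.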

An instance of Statistics is initially empty (i.e. all statistical values are set to NaN). The statistics are updated every time an add(double) method is invoked with a non-NaN value. A typical usage of this class is:

double[] data = new double[1000];
// (Compute some data values here...)

Statistics stats = new Statistics();
for (int i=0; i<data.length; i++) {
    stats.add(data[i]);
}
System.out.println(stats);

Since:
1.0
Version:
3.20
Author:
Martin Desruisseaux (MPO, IRD, Geomatys)
See Also:
Serialized Form
Module:
utility/geotk-utility

Nested Class Summary
static class Statistics.Delta
          Holds some statistics about a series of sample values and the difference between them.
 
Constructor Summary
Statistics()
          Constructs an initially empty set of statistics.
 
Method Summary
 void add(double sample)
          Updates statistics for the specified sample.
 void add(long sample)
          Updates statistics for the specified sample.
 void add(Statistics stats)
          Updates statistics with all samples from the specified stats.
 Statistics clone()
          Returns a clone of this statistics.
 void configure(NumberFormat format)
          Configures the given formatter for writing a set of data described by this statistics.
 int count()
          Returns the number of samples, excluding NaN values.
 int countNaN()
          Returns the number of NaN samples.
 boolean equals(Object object)
          Compares this statistics with the specified object for equality.
 NumberFormat getNumberFormat(Locale locale)
          Suggests a formatter for writing a set of data described by this statistics.
 int hashCode()
          Returns a hash code value for this statistics.
 double maximum()
          Returns the maximum sample value, or NaN if none.
 double mean()
          Returns the mean value, or NaN if none.
 double minimum()
          Returns the minimum sample value, or NaN if none.
static void printTable(CharSequence[] header, Statistics[] statistics, Locale locale)
          Prints to the standard output stream the given array of statistics as a table.
 double range()
          Returns the range of sample values.
 void reset()
          Resets the statistics to their initial NaN values.
 double rms()
          Returns the root mean square, or NaN if none.
 double standardDeviation(boolean allPopulation)
          Returns the standard deviation.
 double sum()
          Returns the sum, or 0 if none.
 String toString()
          Returns a string representation of this statistics.
 String toString(Locale locale, boolean tabulations)
          Returns a localized string representation of this statistics.
static void writeTable(Writer out, CharSequence[] header, Statistics[] statistics, Locale locale)
          Formats the given array of statistics as a table.
 
Methods inherited from class Object
finalize, getClass, notify, notifyAll, wait, wait, wait
 

Constructor Detail

Statistics

public Statistics()
Constructs an initially empty set of statistics. All statistical values are initialized to Double.NaN.

Method Detail

reset

public void reset()
Resets the statistics to their initial NaN values. This method resets the state of this object as if it had just been created.


add

public void add(double sample)
Updates statistics for the specified sample. This add method is usually invoked inside a for loop.

Parameters:
sample - The sample value. NaN values are ignored.
See Also:
add(long), add(Statistics)

add

public void add(long sample)
Updates statistics for the specified sample. This add method is usually invoked inside a for loop.

Parameters:
sample - The sample value.
See Also:
add(double), add(Statistics)

add

public void add(Statistics stats)
Updates statistics with all samples from the specified stats. Invoking this method is equivalent (except for rounding errors) to invoking add for all samples that were added to stats.

Parameters:
stats - The statistics to be added to this, or null if none.
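
As an illustration of why merging two sets of statistics can be equivalent to adding every sample individually, here is a minimal sketch. MergeSketch is a hypothetical class that tracks only count, sum, minimum and maximum; it is not the Geotk implementation and it ignores the Kahan compensation.

```java
/** Hypothetical mergeable summary (count, sum, minimum, maximum); not the Geotk class. */
final class MergeSketch {
    int count;
    double sum;
    double min = Double.NaN;
    double max = Double.NaN;

    /** Adds a single sample; NaN values are ignored. */
    void add(double sample) {
        if (Double.isNaN(sample)) {
            return;
        }
        min = (count == 0) ? sample : Math.min(min, sample);
        max = (count == 0) ? sample : Math.max(max, sample);
        sum += sample;
        count++;
    }

    /** Folds another summary into this one; null or empty means nothing to add. */
    void add(MergeSketch other) {
        if (other == null || other.count == 0) {
            return;
        }
        min = (count == 0) ? other.min : Math.min(min, other.min);
        max = (count == 0) ? other.max : Math.max(max, other.max);
        sum += other.sum;
        count += other.count;
    }
}
```

With this sketch, accumulating two halves of a data set separately and then merging them yields the same count, minimum and maximum as a single pass, and the same sum up to rounding.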

countNaN

public int countNaN()
Returns the number of NaN samples. NaN samples are ignored in all other statistical computations. This method counts them for information purposes only.

Returns:
The number of NaN values.

count

public int count()
Returns the number of samples, excluding NaN values.

Returns:
The number of sample values, excluding NaN.

minimum

public double minimum()
Returns the minimum sample value, or NaN if none.

Returns:
The minimum sample value.
See Also:
maximum()

maximum

public double maximum()
Returns the maximum sample value, or NaN if none.

Returns:
The maximum sample value.
See Also:
minimum()

range

public double range()
Returns the range of sample values. This is equivalent to maximum - minimum, except for rounding error. If no samples were added, then returns NaN.

Returns:
The range of values.
See Also:
minimum(), maximum()
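
One possible way to obtain the "NaN if none" behavior of minimum(), maximum() and range() without a separate "empty" flag is sketched below. RangeSketch is a hypothetical class, and the negated comparisons are an assumption about how such a class could be written, not a statement about the Geotk source.

```java
/** Hypothetical minimum/maximum/range tracker whose empty state is NaN. */
final class RangeSketch {
    private double min = Double.NaN;
    private double max = Double.NaN;

    void add(double sample) {
        if (Double.isNaN(sample)) {
            return; // NaN samples are ignored
        }
        // Any comparison with NaN is false, so the negated tests also
        // succeed for the first sample, while min and max are still NaN.
        if (!(min <= sample)) min = sample;
        if (!(max >= sample)) max = sample;
    }

    double minimum() { return min; }
    double maximum() { return max; }

    /** NaN - NaN is NaN, so an empty tracker reports a NaN range. */
    double range() { return max - min; }
}
```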

sum

public double sum()
Returns the sum, or 0 if none.

Returns:
The sum.
Since:
3.00

mean

public double mean()
Returns the mean value, or NaN if none.

Returns:
The mean value.

rms

public double rms()
Returns the root mean square, or NaN if none.

Returns:
The root mean square.
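
The mean and the root mean square can both be derived from running totals, which is why the samples never need to be stored. The sketch below assumes a hypothetical class MeanRmsSketch that keeps the count, the sum and the sum of squares; it is not the Geotk implementation.

```java
/** Hypothetical accumulator deriving mean and RMS from running totals. */
final class MeanRmsSketch {
    private int count;
    private double sum;        // sum of the samples
    private double squareSum;  // sum of the squared samples

    void add(double sample) {
        if (Double.isNaN(sample)) {
            return; // NaN samples are ignored
        }
        sum += sample;
        squareSum += sample * sample;
        count++;
    }

    /** Arithmetic mean; 0.0 / 0 evaluates to NaN when no sample was added. */
    double mean() {
        return sum / count;
    }

    /** Root mean square: square root of the mean of the squared samples. */
    double rms() {
        return Math.sqrt(squareSum / count);
    }
}
```

For the two samples -2 and 2, this sketch gives a mean of 0 and a root mean square of 2.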

standardDeviation

public double standardDeviation(boolean allPopulation)
Returns the standard deviation. If the sample values given to the add(...) methods have a uniform distribution, then the returned value should be close to sqrt(range² / 12). If they have a gaussian distribution (which is the most common case), then the returned value is related to the error function.

As a reminder, the table below gives the probability for a sample value to be inside the mean ± n×deviation range, assuming that the distribution is gaussian (first column) or assuming that the distribution is uniform (second column).

  n    gaussian   uniform
 0.5    38.3%      28.9%
 1.0    68.3%      57.7%
 1.5    86.6%      86.6%
 2.0    95.4%       100%
 3.0    99.7%       100%

Parameters:
allPopulation - true if sample values given to add methods are the totality of the population under study, or false if they are only a sampling.
Returns:
The standard deviation.
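
The effect of the allPopulation argument can be sketched with the textbook formulas: the sum of squared deviations is divided by n for a whole population, or by n - 1 for a sampling. StdDevSketch is a hypothetical class, not the Geotk implementation, and the sum-of-squares shortcut it uses can lose precision when the mean is large compared to the spread.

```java
/** Hypothetical standard deviation computed from running totals. */
final class StdDevSketch {
    private int count;
    private double sum;
    private double squareSum;

    void add(double sample) {
        if (Double.isNaN(sample)) {
            return; // NaN samples are ignored
        }
        sum += sample;
        squareSum += sample * sample;
        count++;
    }

    /** Divides by n for a whole population, or by n - 1 for a sampling. */
    double standardDeviation(boolean allPopulation) {
        double n = count;
        double divisor = allPopulation ? n : n - 1;
        // Sum of squared deviations from the mean, rewritten with running totals.
        double squaredDeviations = squareSum - sum * sum / n;
        return Math.sqrt(squaredDeviations / divisor);
    }

    public static void main(String[] args) {
        // Evenly spaced values 0..999 approximate a uniform distribution.
        StdDevSketch uniform = new StdDevSketch();
        for (int i = 0; i < 1000; i++) {
            uniform.add(i);
        }
        double range = 999;
        System.out.println("standard deviation:  " + uniform.standardDeviation(true));
        System.out.println("sqrt(range^2 / 12):  " + Math.sqrt(range * range / 12));
    }
}
```

Running main prints the standard deviation of the evenly spaced values next to sqrt(range² / 12), which should agree to within about one percent.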

getNumberFormat

public NumberFormat getNumberFormat(Locale locale)
Suggests a formatter for writing a set of data described by this statistics. This method configures the formatter using heuristic rules based on the range of values and their standard deviation. It can be used for reasonable default formatting when the user didn't specify an explicit one.

Parameters:
locale - The locale for the formatter, or null for the default.
Returns:
A proposed formatter for data described by this statistics.
Since:
3.00
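
The actual heuristic rules are not described on this page. Purely as an illustration of the idea, the sketch below picks a number of fraction digits from the standard deviation alone. FormatSketch and the rule "one digit beyond the magnitude of the standard deviation" are assumptions made for this example; only the java.text.NumberFormat calls are real API.

```java
import java.text.NumberFormat;
import java.util.Locale;

/** Hypothetical formatter heuristic; the real Geotk rules may differ. */
final class FormatSketch {
    /** Assumed rule: show one digit beyond the magnitude of the standard deviation. */
    static int suggestedFractionDigits(double standardDeviation) {
        if (!(standardDeviation > 0) || Double.isInfinite(standardDeviation)) {
            return 0; // NaN, zero, negative or infinite: no fraction digits
        }
        int digits = (int) Math.ceil(-Math.log10(standardDeviation)) + 1;
        return Math.max(0, Math.min(digits, 15));
    }

    /** Applies the assumed rule to an existing formatter. */
    static void configure(NumberFormat format, double standardDeviation) {
        int digits = suggestedFractionDigits(standardDeviation);
        format.setMinimumFractionDigits(digits);
        format.setMaximumFractionDigits(digits);
    }

    /** Creates a formatter for the given locale (null for the default) and configures it. */
    static NumberFormat suggest(Locale locale, double standardDeviation) {
        NumberFormat format = NumberFormat.getNumberInstance(
                locale != null ? locale : Locale.getDefault());
        configure(format, standardDeviation);
        return format;
    }
}
```

Under this assumed rule, a standard deviation of 0.05 leads to three fraction digits, while a standard deviation of 250 leads to none.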

configure

public void configure(NumberFormat format)
Configures the given formatter for writing a set of data described by this statistics. This method applies the same heuristic rules as getNumberFormat(Locale).

Parameters:
format - The format to configure.
Since:
3.20

toString

public final String toString()
Returns a string representation of this statistics. This method invokes toString(Locale, boolean) using the default locale and spaces as separators.

Overrides:
toString in class Object

toString

public String toString(Locale locale,
                       boolean tabulations)
Returns a localized string representation of this statistics. This string will span multiple lines, one for each statistical value. For example:
Compte:      8726
Minimum:    6.853
Maximum:    8.259
Moyenne:    7.421
RMS:        7.846
Écart-type: 6.489

Parameters:
locale - The locale to use for formatting the string representation, or null for the default one.
tabulations - If true, then labels (e.g. "Minimum") and values (e.g. "6.853") are separated by tabulations. Otherwise, they are separated by spaces.
Returns:
A string representation of this statistics object.

printTable

public static void printTable(CharSequence[] header,
                              Statistics[] statistics,
                              Locale locale)
Prints to the standard output stream the given array of statistics as a table. This is mostly a convenience method for debugging.

Parameters:
header - The column headers in the table, or null if none.
statistics - The statistics to format.
locale - The locale, or null for the default locale.
Since:
3.00

writeTable

public static void writeTable(Writer out,
                              CharSequence[] header,
                              Statistics[] statistics,
                              Locale locale)
                       throws IOException
Formats the given array of statistics as a table.

Parameters:
out - Where to format the statistics table.
header - The column headers in the table, or null if none.
statistics - The statistics to format.
locale - The locale, or null for the default locale.
Throws:
IOException - if an error occurred while writing to out.
Since:
3.00

clone

public Statistics clone()
Returns a clone of this statistics.

Overrides:
clone in class Object
Returns:
A clone of this statistics.
See Also:
Object.clone()

hashCode

public int hashCode()
Returns a hash code value for this statistics.

Overrides:
hashCode in class Object

equals

public boolean equals(Object object)
Compares this statistics with the specified object for equality.

Overrides:
equals in class Object
Parameters:
object - The object to compare with.
Returns:
true if both objects are equal.


Copyright © 2009-2012 Geotoolkit.org. All Rights Reserved.