java.lang.Object
org.apache.commons.math.stat.descriptive.AbstractUnivariateStatistic
org.apache.commons.math.stat.descriptive.rank.Percentile
All Implemented Interfaces:
Serializable, UnivariateStatistic
Direct Known Subclasses:
Median

public class Percentile extends AbstractUnivariateStatistic implements Serializable
Provides percentile computation.

There are several commonly used methods for estimating percentiles (a.k.a. quantiles) based on sample data. For large samples, the different methods agree closely, but when sample sizes are small, different methods will give significantly different results. The algorithm implemented here works as follows:

  1. Let n be the length of the (sorted) array and 0 < p <= 100 be the desired percentile.
  2. If n = 1 return the unique array element (regardless of the value of p); otherwise
  3. Compute the estimated percentile position pos = p * (n + 1) / 100 and the difference d between pos and floor(pos) (i.e. the fractional part of pos). If pos < 1 return the smallest element in the array; if pos >= n return the largest element in the array; otherwise
  4. Let lower be the element in position floor(pos) in the sorted array (positions are counted from 1) and let upper be the next element in the array. Return lower + d * (upper - lower).
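
The steps above can be sketched in plain Java as follows. This is a minimal illustration, not the library's code: the class name PercentileSketch and the method estimate are hypothetical, a full sort replaces the selection algorithm for readability, and positions below 1 are clamped to the smallest element so that small values of p never read before the start of the array.

```java
import java.util.Arrays;

final class PercentileSketch {

    /** Estimates the p-th percentile (0 < p <= 100) of values using the listed steps. */
    static double estimate(double[] values, double p) {
        if (values == null || !(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("values must be non-null and 0 < p <= 100");
        }
        int n = values.length;
        if (n == 0) {
            return Double.NaN;              // documented result for an empty array
        }
        if (n == 1) {
            return values[0];               // step 2: single element, p is irrelevant
        }
        double[] sorted = values.clone();   // never reorder the caller's array
        Arrays.sort(sorted);                // full sort for clarity (assumption, see lead-in)

        double pos = p * (n + 1) / 100.0;   // step 3: 1-based estimated position
        double floorPos = Math.floor(pos);
        double d = pos - floorPos;          // fractional part of pos
        if (pos < 1) {
            return sorted[0];               // clamp to the smallest element
        }
        if (pos >= n) {
            return sorted[n - 1];           // clamp to the largest element
        }
        int i = (int) floorPos;             // 1-based position of lower
        double lower = sorted[i - 1];
        double upper = sorted[i];
        return lower + d * (upper - lower); // step 4: linear interpolation
    }

    public static void main(String[] args) {
        double[] data = {4, 0, 3, 1, 2};
        System.out.println(estimate(data, 25)); // pos = 1.5, between 0 and 1
        System.out.println(estimate(data, 50)); // pos = 3.0, exactly the third element
        System.out.println(estimate(data, 90)); // pos = 5.4 >= n, largest element
    }
}
```

With five elements, p = 25 gives pos = 1.5 and interpolates halfway between the two smallest values, while p = 90 gives pos = 5.4, which is clamped to the maximum.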

To compute percentiles, the data must be at least partially ordered. Input arrays are copied and recursively partitioned using an ordering definition. The ordering used by Arrays.sort(double[]) is the one determined by Double.compareTo(Double). This ordering makes Double.NaN larger than any other value (including Double.POSITIVE_INFINITY). Therefore, for example, the median (50th percentile) of {0, 1, 2, 3, 4, Double.NaN} evaluates to 2.5.
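
The NaN ordering and the median example above can be reproduced with the JDK alone. The sketch below is illustrative only; the class NaNOrderingDemo and its methods are hypothetical names, and the median is computed by hand from pos = 50 * (6 + 1) / 100 = 3.5 rather than through the Percentile class.

```java
import java.util.Arrays;

final class NaNOrderingDemo {

    /** Returns a sorted copy using the ordering of Arrays.sort(double[]). */
    static double[] sortedCopy(double[] values) {
        double[] copy = values.clone();
        Arrays.sort(copy);
        return copy;
    }

    /** Median of exactly six values: pos = 3.5, so interpolate positions 3 and 4. */
    static double medianOfSix(double[] values) {
        if (values.length != 6) {
            throw new IllegalArgumentException("this sketch handles six values only");
        }
        double[] s = sortedCopy(values);
        // 1-based positions 3 and 4 are indices 2 and 3
        return s[2] + 0.5 * (s[3] - s[2]);
    }

    public static void main(String[] args) {
        double[] data = {Double.NaN, 3, 0, 4, 1, 2};
        // NaN is placed after every other value, including positive infinity
        System.out.println(Arrays.toString(sortedCopy(data)));
        System.out.println(Double.compare(Double.NaN, Double.POSITIVE_INFINITY) > 0);
        System.out.println(medianOfSix(data)); // 2.5
    }
}
```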

Since percentile estimation usually involves interpolation between array elements, arrays containing NaN or infinite values will often result in NaN or infinite values returned.
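
A short sketch of why this happens, using only the interpolation formula lower + d * (upper - lower) from the algorithm description. The class InterpolationDemo and the method interpolate are hypothetical names used for illustration.

```java
final class InterpolationDemo {

    /** Linear interpolation as used in the last step of the estimation algorithm. */
    static double interpolate(double lower, double upper, double d) {
        return lower + d * (upper - lower);
    }

    public static void main(String[] args) {
        double inf = Double.POSITIVE_INFINITY;
        // A NaN neighbour makes the result NaN
        System.out.println(interpolate(4.0, Double.NaN, 0.5));
        // An infinite neighbour makes the result infinite
        System.out.println(interpolate(4.0, inf, 0.5));
        // Interpolating between -infinity and +infinity gives NaN
        System.out.println(interpolate(-inf, inf, 0.5));
        // Even with d = 0 the result is NaN, because 0 * infinity is NaN
        System.out.println(interpolate(4.0, inf, 0.0));
    }
}
```

The last case shows that a non-finite upper neighbour can affect the result even when the estimated position falls exactly on a finite element.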

Since 2.2, the Percentile implementation uses selection instead of a complete sort, and caches the selection algorithm state between calls to the various evaluate methods when several percentiles are computed on the same data. This greatly improves efficiency for both single-percentile and multiple-percentile computations. However, the cached state is only valid if the data passed to one evaluate call is the same as the data the state was built from in previous calls. By default, Percentile verifies this by checking the array reference itself and a checksum of its content. A caller who already knows that the array does not change between calls can avoid the checking cost by calling the evaluate methods that do not take an array argument, which operate on the data stored with setData.
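
The kind of check described above (same array reference plus a content checksum) might look like the following sketch. Everything here is an assumption made for illustration: the class CachedDataGuard is hypothetical, and Arrays.hashCode stands in for whatever checksum the library actually computes.

```java
import java.util.Arrays;

final class CachedDataGuard {

    private double[] dataRef;  // array object the cached state was built from
    private int checksum;      // content checksum recorded at that time

    /** Records the array whose selection state is about to be cached. */
    void remember(double[] values) {
        dataRef = values;
        checksum = Arrays.hashCode(values);
    }

    /** True only if values is the same array object and its content is unchanged. */
    boolean matches(double[] values) {
        return values != null
            && values == dataRef
            && Arrays.hashCode(values) == checksum;
    }

    public static void main(String[] args) {
        double[] data = {1.0, 2.0, 3.0};
        CachedDataGuard guard = new CachedDataGuard();
        guard.remember(data);
        System.out.println(guard.matches(data));         // same reference, same content
        System.out.println(guard.matches(data.clone())); // equal content, different reference
        data[0] = 9.0;
        System.out.println(guard.matches(data));         // same reference, content changed
    }
}
```

The checksum pass costs time proportional to the array length on every call, which is the overhead a caller avoids by storing the data once and evaluating without an array argument.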

Note that this implementation is not synchronized. If multiple threads access an instance of this class concurrently, and at least one of the threads modifies it (for example by calling setData or setQuantile, or by calling evaluate, which updates the cached selection state), it must be synchronized externally.

Version:
$Revision: 1006299 $ $Date: 2010-10-10 16:47:17 +0200 (dim. 10 oct. 2010) $
  • Constructor Summary

    Constructors
    Constructor
    Description
    Percentile()
    Constructs a Percentile with a default quantile value of 50.0.
    Percentile(double p)
    Constructs a Percentile with the specific quantile value.
    Percentile(Percentile original)
    Copy constructor; creates a new Percentile identical to the original.
  • Method Summary

    Modifier and Type
    Method
    Description
    Percentile
    copy()
    Returns a copy of the statistic with the same internal state.
    static void
    copy(Percentile source, Percentile dest)
    Copies source to dest.
    double
    evaluate(double p)
    Returns the result of evaluating the statistic over the stored data.
    double
    evaluate(double[] values, double p)
    Returns an estimate of the pth percentile of the values in the values array.
    double
    evaluate(double[] values, int start, int length)
    Returns an estimate of the quantileth percentile of the designated values in the values array.
    double
    evaluate(double[] values, int begin, int length, double p)
    Returns an estimate of the pth percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values.
    double
    getQuantile()
    Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
    void
    setData(double[] values)
    Set the data array.
    void
    setData(double[] values, int begin, int length)
    Set the data array.
    void
    setQuantile(double p)
    Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).

    Methods inherited from class org.apache.commons.math.stat.descriptive.AbstractUnivariateStatistic

    evaluate, evaluate, getData, getDataRef, test, test

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

    • Percentile

      public Percentile()
      Constructs a Percentile with a default quantile value of 50.0.
    • Percentile

      public Percentile(double p)
      Constructs a Percentile with the specific quantile value.
      Parameters:
      p - the quantile
      Throws:
      IllegalArgumentException - if p does not satisfy 0 < p <= 100
    • Percentile

      public Percentile(Percentile original)
      Copy constructor; creates a new Percentile identical to the original.
      Parameters:
      original - the Percentile instance to copy
  • Method Details

    • setData

      public void setData(double[] values)
      Set the data array.

      The stored value is a copy of the parameter array, not the array itself.

      Overrides:
      setData in class AbstractUnivariateStatistic
      Parameters:
      values - data array to store (may be null to remove stored data)
    • setData

      public void setData(double[] values, int begin, int length)
      Set the data array.
      Overrides:
      setData in class AbstractUnivariateStatistic
      Parameters:
      values - data array to store
      begin - the index of the first element to include
      length - the number of elements to include
    • evaluate

      public double evaluate(double p)
      Returns the result of evaluating the statistic over the stored data.

      The stored array is the one which was set by previous calls to the setData methods.

      Parameters:
      p - the percentile value to compute
      Returns:
      the value of the statistic applied to the stored data
    • evaluate

      public double evaluate(double[] values, double p)
      Returns an estimate of the pth percentile of the values in the values array.

      Calls to this method do not modify the internal quantile state of this statistic.

      • Returns Double.NaN if values has length 0
      • Returns (for any value of p) values[0] if values has length 1
      • Throws IllegalArgumentException if values is null or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)

      See Percentile for a description of the percentile estimation algorithm used.

      Parameters:
      values - input array of values
      p - the percentile value to compute
      Returns:
      the percentile value or Double.NaN if the array is empty
      Throws:
      IllegalArgumentException - if values is null or p is invalid
    • evaluate

      public double evaluate(double[] values, int start, int length)
      Returns an estimate of the quantileth percentile of the designated values in the values array. The quantile estimated is determined by the quantile property.

      • Returns Double.NaN if length = 0
      • Returns (for any value of quantile) values[start] if length = 1
      • Throws IllegalArgumentException if values is null, or start or length is invalid

      See Percentile for a description of the percentile estimation algorithm used.

      Specified by:
      evaluate in interface UnivariateStatistic
      Specified by:
      evaluate in class AbstractUnivariateStatistic
      Parameters:
      values - the input array
      start - index of the first array element to include
      length - the number of elements to include
      Returns:
      the percentile value
      Throws:
      IllegalArgumentException - if the parameters are not valid
    • evaluate

      public double evaluate(double[] values, int begin, int length, double p)
      Returns an estimate of the pth percentile of the values in the values array, starting with the element in (0-based) position begin in the array and including length values.

      Calls to this method do not modify the internal quantile state of this statistic.

      • Returns Double.NaN if length = 0
      • Returns (for any value of p) values[begin] if length = 1
      • Throws IllegalArgumentException if values is null, begin or length is invalid, or p is not a valid quantile value (p must be greater than 0 and less than or equal to 100)

      See Percentile for a description of the percentile estimation algorithm used.

      Parameters:
      values - array of input values
      begin - the first (0-based) element to include in the computation
      length - the number of array elements to include
      p - the percentile to compute
      Returns:
      the percentile value
      Throws:
      IllegalArgumentException - if the parameters are not valid or the input array is null
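
The begin and length semantics can be sketched with a small range check followed by a copy of the designated slice. This is a hedged illustration: SubrangeSketch, checkRange and sortedSlice are hypothetical names, and the exact validation rules of the library may differ from the ones assumed here.

```java
import java.util.Arrays;

final class SubrangeSketch {

    /** Assumed validation: non-null array, non-negative bounds, slice inside the array. */
    static void checkRange(double[] values, int begin, int length) {
        if (values == null) {
            throw new IllegalArgumentException("values is null");
        }
        // written as a subtraction so that begin + length cannot overflow
        if (begin < 0 || length < 0 || begin > values.length - length) {
            throw new IllegalArgumentException("invalid begin or length");
        }
    }

    /** Returns a sorted copy of values[begin .. begin + length - 1]. */
    static double[] sortedSlice(double[] values, int begin, int length) {
        checkRange(values, begin, length);
        double[] work = Arrays.copyOfRange(values, begin, begin + length);
        Arrays.sort(work);
        return work;
    }

    public static void main(String[] args) {
        double[] data = {100, 5, 3, 4, -100};
        // Only the three middle elements take part in the computation
        System.out.println(Arrays.toString(sortedSlice(data, 1, 3)));
        // The caller's array keeps its original order
        System.out.println(Arrays.toString(data));
    }
}
```

Elements outside the designated range (100 and -100 in the usage above) have no influence on the result, and a length of 0 yields an empty slice, which corresponds to the documented Double.NaN case.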
    • getQuantile

      public double getQuantile()
      Returns the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
      Returns:
      quantile
    • setQuantile

      public void setQuantile(double p)
      Sets the value of the quantile field (determines what percentile is computed when evaluate() is called with no quantile argument).
      Parameters:
      p - the quantile, satisfying 0 < p <= 100
      Throws:
      IllegalArgumentException - if p does not satisfy 0 < p <= 100
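
The documented range constraint on the quantile can be illustrated with a small stand-alone holder. QuantileHolder is a hypothetical class written for this sketch; it also rejects NaN, which is an assumption about how the constraint is read rather than a statement about the library.

```java
final class QuantileHolder {

    private double quantile = 50.0;  // same default as the no-argument constructor

    double getQuantile() {
        return quantile;
    }

    /** Accepts p only if 0 < p <= 100; the stored value is untouched on failure. */
    void setQuantile(double p) {
        // the negated form also rejects NaN (assumption, see lead-in)
        if (!(p > 0 && p <= 100)) {
            throw new IllegalArgumentException("quantile must satisfy 0 < p <= 100: " + p);
        }
        quantile = p;
    }

    public static void main(String[] args) {
        QuantileHolder holder = new QuantileHolder();
        holder.setQuantile(100);                  // upper bound is inclusive
        System.out.println(holder.getQuantile());
        try {
            holder.setQuantile(0);                // lower bound is exclusive
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```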
    • copy

      public Percentile copy()
      Returns a copy of the statistic with the same internal state.
      Specified by:
      copy in interface UnivariateStatistic
      Specified by:
      copy in class AbstractUnivariateStatistic
      Returns:
      a copy of the statistic
    • copy

      public static void copy(Percentile source, Percentile dest)
      Copies source to dest.

      Neither source nor dest can be null.

      Parameters:
      source - Percentile to copy
      dest - Percentile to copy to
      Throws:
      NullPointerException - if either source or dest is null