Interface EmpiricalDistribution

All Known Implementing Classes:
EmpiricalDistributionImpl

public interface EmpiricalDistribution
Represents an empirical probability distribution -- a probability distribution derived from observed data without making any assumptions about the functional form of the population distribution that the data come from.

Implementations of this interface maintain data structures, called distribution digests, that describe empirical distributions and support the following operations:

  • loading the distribution from a file of observed data values
  • dividing the input data into "bin ranges" and reporting bin frequency counts (data for histogram)
  • reporting univariate statistics describing the full set of data values as well as the observations within each bin
  • generating random values from the distribution

Applications can use EmpiricalDistribution implementations to build grouped frequency histograms representing the input data or to generate random values "like" those in the input file -- i.e., the values generated will follow the distribution of the values in the file.
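
The binning operation described above can be illustrated with a minimal, self-contained sketch. It assumes equal-width bins spanning the observed minimum and maximum, which this interface does not mandate. `BinCountSketch` and `binCounts` are hypothetical names used only for illustration and are not part of the documented API.

```java
import java.util.Arrays;

/** Hypothetical helper for illustration only; not part of the documented API. */
final class BinCountSketch {

    /**
     * Counts how many values fall into each of binCount bins.
     * Assumption: bins have equal width and span [min, max] of the data.
     */
    static int[] binCounts(double[] data, int binCount) {
        if (data.length == 0 || binCount < 1) {
            throw new IllegalArgumentException("need data and at least one bin");
        }
        double min = Arrays.stream(data).min().getAsDouble();
        double max = Arrays.stream(data).max().getAsDouble();
        double width = (max - min) / binCount;
        int[] counts = new int[binCount];
        for (double x : data) {
            // Bins are closed on the right, (lower, upper]; the minimum itself goes into bin 0.
            int bin = width == 0.0 ? 0 : (int) Math.ceil((x - min) / width) - 1;
            // Clamp to guard against the minimum (index -1) and rounding at the maximum.
            bin = Math.max(0, Math.min(binCount - 1, bin));
            counts[bin]++;
        }
        return counts;
    }
}
```

The resulting counts are the raw material for a grouped frequency histogram. The counts always sum to the number of input values because every value is assigned to exactly one bin.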

Version:
$Revision: 817128 $ $Date: 2009-09-21 03:30:53 +0200 (Mon, 21 Sep 2009) $
  • Method Summary

    Modifier and Type
    Method
    Description
    int
    getBinCount()
    Returns the number of bins.
    List<SummaryStatistics>
    getBinStats()
    Returns a list of SummaryStatistics containing statistics describing the values in each of the bins.
    double
    getNextValue()
    Generates a random value from this distribution.
    StatisticalSummary
    getSampleStats()
    Returns a StatisticalSummary describing this distribution.
    double[]
    getUpperBounds()
    Returns the array of upper bounds for the bins.
    boolean
    isLoaded()
    Property indicating whether or not the distribution has been loaded.
    void
    load(double[] dataArray)
    Computes the empirical distribution from the provided array of numbers.
    void
    load(File file)
    Computes the empirical distribution from the input file.
    void
    load(URL url)
    Computes the empirical distribution using data read from a URL.
  • Method Details

    • load

      void load(double[] dataArray)
      Computes the empirical distribution from the provided array of numbers.
      Parameters:
      dataArray - the data array
    • load

      void load(File file) throws IOException
      Computes the empirical distribution from the input file.
      Parameters:
      file - the input file
      Throws:
      IOException - if an IO error occurs
    • load

      void load(URL url) throws IOException
      Computes the empirical distribution using data read from a URL.
      Parameters:
      url - URL of the input file
      Throws:
      IOException - if an IO error occurs
    • getNextValue

      double getNextValue() throws IllegalStateException
      Generates a random value from this distribution.
      Preconditions:
      • the distribution must be loaded before invoking this method
      Returns:
      the random value.
      Throws:
      IllegalStateException - if the distribution has not been loaded
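
The loaded-before-use precondition and the idea of drawing values that follow the binned data can be sketched as follows. This is a toy stand-in written against the JDK only, not the behavior of any particular implementation: it assumes equal-width bins, picks a bin with probability proportional to its count, and then draws uniformly inside that bin. The class `SamplerSketch`, its constructor arguments, and the uniform within-bin draw are all assumptions made for illustration.

```java
import java.util.Arrays;
import java.util.Random;

/** Hypothetical toy sampler for illustration only; not an implementation of this interface. */
final class SamplerSketch {
    private final int binCount;
    private final Random random;
    private double min;
    private double[] upperBounds;
    private int[] cumulativeCounts;
    private boolean loaded = false;

    SamplerSketch(int binCount, long seed) {
        if (binCount < 1) {
            throw new IllegalArgumentException("binCount must be positive");
        }
        this.binCount = binCount;
        this.random = new Random(seed);
    }

    /** Builds equal-width bins (an assumption) and cumulative frequency counts. */
    void load(double[] data) {
        if (data.length == 0) {
            throw new IllegalArgumentException("empty data");
        }
        min = Arrays.stream(data).min().getAsDouble();
        double max = Arrays.stream(data).max().getAsDouble();
        double width = (max - min) / binCount;
        upperBounds = new double[binCount];
        for (int i = 0; i < binCount - 1; i++) {
            upperBounds[i] = min + (i + 1) * width;
        }
        upperBounds[binCount - 1] = max; // last upper bound equals the maximum
        int[] counts = new int[binCount];
        for (double x : data) {
            int bin = 0;
            while (bin < binCount - 1 && x > upperBounds[bin]) {
                bin++;
            }
            counts[bin]++;
        }
        cumulativeCounts = new int[binCount];
        int running = 0;
        for (int i = 0; i < binCount; i++) {
            running += counts[i];
            cumulativeCounts[i] = running;
        }
        loaded = true;
    }

    boolean isLoaded() {
        return loaded;
    }

    /** Mirrors the documented precondition: fails if nothing has been loaded. */
    double getNextValue() {
        if (!loaded) {
            throw new IllegalStateException("distribution not loaded");
        }
        int total = cumulativeCounts[binCount - 1];
        int target = random.nextInt(total); // uniform in 0 .. total-1
        int bin = 0;
        while (cumulativeCounts[bin] <= target) {
            bin++; // empty bins are skipped, so they are never selected
        }
        double lower = bin == 0 ? min : upperBounds[bin - 1];
        return lower + random.nextDouble() * (upperBounds[bin] - lower);
    }
}
```

Callers that cannot guarantee the load order can check `isLoaded()` first rather than catching the exception.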
    • getSampleStats

      StatisticalSummary getSampleStats() throws IllegalStateException
      Returns a StatisticalSummary describing this distribution.
      Preconditions:
      • the distribution must be loaded before invoking this method
      Returns:
      the sample statistics
      Throws:
      IllegalStateException - if the distribution has not been loaded
    • isLoaded

      boolean isLoaded()
      Property indicating whether or not the distribution has been loaded.
      Returns:
      true if the distribution has been loaded
    • getBinCount

      int getBinCount()
      Returns the number of bins.
      Returns:
      the number of bins
    • getBinStats

      List<SummaryStatistics> getBinStats()
      Returns a list of SummaryStatistics containing statistics describing the values in each of the bins. The List is indexed on the bin number.
      Returns:
      List of bin statistics
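
To illustrate what a list of per-bin statistics indexed by bin number looks like, the sketch below uses the JDK's `java.util.DoubleSummaryStatistics` as a stand-in for `SummaryStatistics`. The helper `BinStatsSketch.binStats` is hypothetical, and the bin upper bounds are passed in by the caller rather than computed from the data.

```java
import java.util.ArrayList;
import java.util.DoubleSummaryStatistics;
import java.util.List;

/** Hypothetical helper for illustration only; uses a JDK class in place of SummaryStatistics. */
final class BinStatsSketch {

    /**
     * Returns one statistics object per bin; element i describes the values in bin i.
     * Values are assigned to the first bin whose upper bound is >= the value;
     * anything above the last bound is placed in the last bin.
     */
    static List<DoubleSummaryStatistics> binStats(double[] data, double[] upperBounds) {
        List<DoubleSummaryStatistics> stats = new ArrayList<>();
        for (int i = 0; i < upperBounds.length; i++) {
            stats.add(new DoubleSummaryStatistics());
        }
        for (double x : data) {
            int bin = 0;
            while (bin < upperBounds.length - 1 && x > upperBounds[bin]) {
                bin++;
            }
            stats.get(bin).accept(x);
        }
        return stats;
    }
}
```

With this layout, `stats.get(i).getCount()` gives the frequency of bin `i`, and `getAverage()`, `getMin()`, and `getMax()` describe the observations that fell into it.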
    • getUpperBounds

      double[] getUpperBounds()
      Returns the array of upper bounds for the bins. Bins are:
      [min, upperBounds[0]], (upperBounds[0], upperBounds[1]], ..., (upperBounds[binCount-2], upperBounds[binCount-1] = max].
      Returns:
      array of bin upper bounds
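
The interval convention above (first bin closed on both ends, every later bin open on the left and closed on the right) can be made concrete with a short sketch. The equal-width construction in `upperBounds` is an assumption, since the interface only specifies how the bounds are interpreted. `UpperBoundsSketch`, `upperBounds`, and `findBin` are hypothetical names.

```java
/** Hypothetical helper for illustration only; not part of the documented API. */
final class UpperBoundsSketch {

    /** Builds upper bounds for equal-width bins (an assumption) over [min, max]. */
    static double[] upperBounds(double min, double max, int binCount) {
        if (binCount < 1 || max < min) {
            throw new IllegalArgumentException("invalid range or bin count");
        }
        double[] bounds = new double[binCount];
        double width = (max - min) / binCount;
        for (int i = 0; i < binCount - 1; i++) {
            bounds[i] = min + (i + 1) * width;
        }
        bounds[binCount - 1] = max; // upperBounds[binCount-1] = max, as documented
        return bounds;
    }

    /**
     * Returns the index of the bin containing value: the first bin whose upper bound is >= value.
     * A value equal to a bound therefore belongs to the lower bin, matching the (a, b] convention.
     * This sketch does not check the lower limit, so values below min also map to bin 0.
     */
    static int findBin(double[] upperBounds, double value) {
        for (int i = 0; i < upperBounds.length; i++) {
            if (value <= upperBounds[i]) {
                return i;
            }
        }
        throw new IllegalArgumentException("value exceeds the largest upper bound");
    }
}
```

For example, with bounds built over the range 0 to 10 using 5 bins, the value 2.0 falls in bin 0 and 2.5 falls in bin 1.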