Package org.apache.commons.math.random
Class EmpiricalDistributionImpl
java.lang.Object
org.apache.commons.math.random.EmpiricalDistributionImpl
- All Implemented Interfaces:
Serializable, EmpiricalDistribution
public class EmpiricalDistributionImpl
extends Object
implements Serializable, EmpiricalDistribution
Implements the EmpiricalDistribution interface. This implementation uses what amounts to the Variable Kernel Method with Gaussian smoothing:
Digesting the input file
- Pass the file once to compute min and max.
- Divide the range from min to max into binCount "bins."
- Pass the data file again, computing bin counts and univariate statistics (mean, std dev.) for each of the bins.
- Divide the interval (0,1) into subintervals associated with the bins, with the length of a bin's subinterval proportional to its count.
Generating random values from the distribution
- Generate a uniformly distributed value in (0,1).
- Select the subinterval to which the value belongs.
- Generate a random Gaussian value with mean = mean of the associated bin and std dev = std dev of associated bin.
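The digest and generation steps described above can be sketched in plain Java. This is a minimal illustration under stated assumptions, not the Commons Math source: the class name EmpiricalSketch, its fields, and the helper binOf are hypothetical, the bins are assumed to be of equal width, the input is assumed to be non-empty, and the data comes from an in-memory array rather than a file.

```java
import java.util.Random;

/** Illustrative sketch only; all names here are hypothetical, not library API. */
class EmpiricalSketch {
    final double min, max, delta;
    final long[] counts;       // number of observations in each bin
    final double[] means;      // mean of the observations in each bin
    final double[] stdDevs;    // sample standard deviation in each bin
    final double[] genBounds;  // cumulative upper bounds of the (0,1) subintervals

    EmpiricalSketch(double[] data, int binCount) {
        // Pass 1: find min and max (data is assumed non-empty).
        double lo = Double.POSITIVE_INFINITY, hi = Double.NEGATIVE_INFINITY;
        for (double x : data) { lo = Math.min(lo, x); hi = Math.max(hi, x); }
        min = lo;
        max = hi;
        delta = (max - min) / binCount;  // assumed equal-width bins

        // Pass 2: accumulate per-bin counts, sums and sums of squares.
        counts = new long[binCount];
        double[] sum = new double[binCount];
        double[] sumSq = new double[binCount];
        for (double x : data) {
            int b = binOf(x);
            counts[b]++;
            sum[b] += x;
            sumSq[b] += x * x;
        }

        // Per-bin statistics, plus subinterval lengths proportional to bin counts.
        means = new double[binCount];
        stdDevs = new double[binCount];
        genBounds = new double[binCount];
        long seen = 0;
        for (int b = 0; b < binCount; b++) {
            if (counts[b] > 0) means[b] = sum[b] / counts[b];
            if (counts[b] > 1) {
                double var = (sumSq[b] - counts[b] * means[b] * means[b]) / (counts[b] - 1);
                stdDevs[b] = Math.sqrt(Math.max(var, 0.0));
            }
            seen += counts[b];
            genBounds[b] = (double) seen / data.length;
        }
    }

    /** Bins are [min, ub0], (ub0, ub1], ..., so min itself falls in bin 0. */
    int binOf(double x) {
        if (delta == 0) return 0;
        int b = (int) Math.ceil((x - min) / delta) - 1;
        return Math.max(0, Math.min(b, counts.length - 1));
    }

    /** Uniform draw selects a bin; a Gaussian draw uses that bin's mean and std dev. */
    double next(Random rng) {
        double u = rng.nextDouble();
        for (int b = 0; b < genBounds.length; b++) {
            if (u <= genBounds[b]) {
                return means[b] + stdDevs[b] * rng.nextGaussian();
            }
        }
        throw new IllegalStateException("no bin selected");
    }
}
```

A bin holding a single observation has a standard deviation of zero in this sketch, so values generated from it equal that observation exactly.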
USAGE NOTES:
- The binCount is set by default to 1000. A good rule of thumb is to set the bin count to approximately the number of entries in the input file divided by 10.
- The input file must be a plain text file containing one valid numeric entry per line.
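The two usage notes can be illustrated with a short stdlib-only sketch. The class InputFileSketch and both helper methods are hypothetical names introduced here, not part of the library; the floor of one bin for very small inputs is also an assumption of this sketch.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;

/** Illustrative helpers only; not part of Commons Math. */
class InputFileSketch {
    /** Rule of thumb from the usage notes: about one bin per ten entries (at least one). */
    static int suggestedBinCount(long entryCount) {
        return (int) Math.max(1, entryCount / 10);
    }

    /** Writes one numeric entry per line, the layout the usage notes require. */
    static Path writeOnePerLine(double[] values) throws IOException {
        List<String> lines = new ArrayList<>();
        for (double v : values) {
            lines.add(Double.toString(v));
        }
        Path file = Files.createTempFile("empirical-input", ".txt");
        Files.write(file, lines, StandardCharsets.UTF_8);
        return file;
    }

    public static void main(String[] args) throws IOException {
        Path file = writeOnePerLine(new double[] {1.5, 2.0, 3.25});
        long entries = Files.readAllLines(file, StandardCharsets.UTF_8).size();
        System.out.println(entries + " entries, suggested binCount " + suggestedBinCount(entries));
        Files.delete(file);
    }
}
```

The suggested value would then be passed to the EmpiricalDistributionImpl(int binCount) constructor in place of the default of 1000.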
- Version:
- $Revision: 1003886 $ $Date: 2010-10-02 23:04:44 +0200 (sam. 02 oct. 2010) $
Constructor Summary
Constructors (constructor, then description):
- EmpiricalDistributionImpl(): Creates a new EmpiricalDistribution with the default bin count.
- EmpiricalDistributionImpl(int binCount): Creates a new EmpiricalDistribution with the specified bin count.
-
Method Summary
Methods (modifier and type, method, then description):
- int getBinCount(): Returns the number of bins.
- List<SummaryStatistics> getBinStats(): Returns a List of SummaryStatistics instances containing statistics describing the values in each of the bins.
- double[] getGeneratorUpperBounds(): Returns a fresh copy of the array of upper bounds of the subintervals of [0,1] used in generating data from the empirical distribution.
- double getNextValue(): Generates a random value from this distribution.
- StatisticalSummary getSampleStats(): Returns a StatisticalSummary describing this distribution.
- double[] getUpperBounds(): Returns a fresh copy of the array of upper bounds for the bins.
- boolean isLoaded(): Property indicating whether or not the distribution has been loaded.
- void load(double[] in): Computes the empirical distribution from the provided array of numbers.
- void load(File file): Computes the empirical distribution from the input file.
- void load(URL url): Computes the empirical distribution using data read from a URL.
-
Constructor Details
-
EmpiricalDistributionImpl
public EmpiricalDistributionImpl()
Creates a new EmpiricalDistribution with the default bin count.
-
EmpiricalDistributionImpl
public EmpiricalDistributionImpl(int binCount)
Creates a new EmpiricalDistribution with the specified bin count.
- Parameters:
binCount - number of bins
-
-
Method Details
-
load
public void load(double[] in)
Computes the empirical distribution from the provided array of numbers.
- Specified by:
load in interface EmpiricalDistribution
- Parameters:
in - the input data array
-
load
public void load(URL url) throws IOException
Computes the empirical distribution using data read from a URL.
- Specified by:
load in interface EmpiricalDistribution
- Parameters:
url - url of the input file
- Throws:
IOException - if an IO error occurs
-
load
public void load(File file) throws IOException
Computes the empirical distribution from the input file.
- Specified by:
load in interface EmpiricalDistribution
- Parameters:
file - the input file
- Throws:
IOException - if an IO error occurs
-
getNextValue
public double getNextValue() throws IllegalStateException
Generates a random value from this distribution.
- Specified by:
getNextValue in interface EmpiricalDistribution
- Returns:
the random value.
- Throws:
IllegalStateException - if the distribution has not been loaded
-
getSampleStats
public StatisticalSummary getSampleStats()
Returns a StatisticalSummary describing this distribution.
Preconditions:
- the distribution must be loaded before invoking this method
- Specified by:
getSampleStats in interface EmpiricalDistribution
- Returns:
the sample statistics
- Throws:
IllegalStateException - if the distribution has not been loaded
-
getBinCount
public int getBinCount()
Returns the number of bins.
- Specified by:
getBinCount in interface EmpiricalDistribution
- Returns:
the number of bins.
-
getBinStats
public List<SummaryStatistics> getBinStats()
Returns a List of SummaryStatistics instances containing statistics describing the values in each of the bins. The list is indexed on the bin number.
- Specified by:
getBinStats in interface EmpiricalDistribution
- Returns:
List of bin statistics.
-
getUpperBounds
public double[] getUpperBounds()
Returns a fresh copy of the array of upper bounds for the bins. Bins are:
[min, upperBounds[0]], (upperBounds[0], upperBounds[1]], ..., (upperBounds[binCount-2], upperBounds[binCount-1] = max].
Note: In versions 1.0-2.0 of commons-math, this method incorrectly returned the array of probability generator upper bounds now returned by getGeneratorUpperBounds().
- Specified by:
getUpperBounds in interface EmpiricalDistribution
- Returns:
array of bin upper bounds
- Since:
2.1
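The bin layout described for getUpperBounds() can be sketched as follows. This is an illustration only: BinBoundsSketch, upperBounds and binIndex are hypothetical names, and the sketch assumes equal-width bins, which the class description suggests but does not state outright.

```java
/** Illustrative sketch only; not the Commons Math implementation. */
class BinBoundsSketch {
    /** Upper bounds of binCount equal-width bins over [min, max]; the last bound is pinned to max. */
    static double[] upperBounds(double min, double max, int binCount) {
        double delta = (max - min) / binCount;
        double[] bounds = new double[binCount];
        for (int i = 0; i < binCount - 1; i++) {
            bounds[i] = min + delta * (i + 1);
        }
        bounds[binCount - 1] = max;
        return bounds;
    }

    /** Index of the bin holding x: the first bin whose upper bound is at least x. */
    static int binIndex(double[] bounds, double x) {
        for (int i = 0; i < bounds.length; i++) {
            if (x <= bounds[i]) {
                return i;
            }
        }
        throw new IllegalArgumentException("x is above max");
    }
}
```

Because every bin after the first is open on the left and closed on the right, a value equal to an upper bound belongs to the bin that ends there, not to the one that starts there.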
-
getGeneratorUpperBounds
public double[] getGeneratorUpperBounds()
Returns a fresh copy of the array of upper bounds of the subintervals of [0,1] used in generating data from the empirical distribution. Subintervals correspond to bins with lengths proportional to bin counts.
In versions 1.0-2.0 of commons-math, this array was (incorrectly) returned by getUpperBounds().
- Returns:
array of upper bounds of subintervals used in data generation
- Since:
2.1
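To make the difference from getUpperBounds() concrete, the generator bounds can be thought of as cumulative relative frequencies of the bins. The sketch below is an assumption-labelled illustration, not the library code: GeneratorBoundsSketch and its methods are hypothetical names, and at least one observation is assumed.

```java
/** Illustrative sketch only; not the Commons Math implementation. */
class GeneratorBoundsSketch {
    /** Entry i is the upper bound of bin i's subinterval of [0,1]; the last entry is 1. */
    static double[] generatorUpperBounds(long[] binCounts) {
        long total = 0;
        for (long c : binCounts) {
            total += c;
        }
        double[] bounds = new double[binCounts.length];
        long running = 0;
        for (int i = 0; i < binCounts.length; i++) {
            running += binCounts[i];
            bounds[i] = (double) running / total;  // subinterval length = count / total
        }
        return bounds;
    }

    /** Maps a uniform draw u in (0,1) to the bin whose subinterval contains it. */
    static int selectBin(double[] bounds, double u) {
        for (int i = 0; i < bounds.length; i++) {
            if (u <= bounds[i]) {
                return i;
            }
        }
        return bounds.length - 1;
    }
}
```

With bin counts of 2, 3 and 5, the subintervals have lengths 0.2, 0.3 and 0.5, so the bin with the most observations is selected most often.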
-
isLoaded
public boolean isLoaded()
Property indicating whether or not the distribution has been loaded.
- Specified by:
isLoaded in interface EmpiricalDistribution
- Returns:
true if the distribution has been loaded
-