Chris Beaumont's IDL Library


edf_stats.pro

edf_stats

`result = edf_stats(data, model [, ks=ks] [, ad=ad] [, ky=ky] [, mad=mad], _extra=_extra)`

This function calculates a variety of statistics to characterize the discrepancy between 1D data and a model for the distribution from which the data were drawn. Each statistic is based on the empirical distribution function (i.e. the cdf of the data). Such statistics can be used to evaluate whether a model distribution is consistent with the data.

In what follows, Fn(x) is the empirical distribution function (edf) and F(x) is the model cdf. Currently, four statistics are implemented:

1) The Kolmogorov-Smirnov statistic: max(|F(x) - Fn(x)|)
2) An Anderson-Darling style statistic: Mean( (Fn(x) - F(x))^2 / (F(x) * (1 - F(x))) )
3) The Kuiper statistic: max(F(x) - Fn(x)) + max(Fn(x) - F(x))
4) The mean absolute deviation: Mean( |Fn(x) - F(x)| )

Note that statistic 2 is designed to be more sensitive than the KS statistic to discrepancies at low and high values of x, where F(x) is close to 0 or 1 and the weight 1 / (F(x) * (1 - F(x))) is large.

Note also that the Kuiper statistic is meant to be used for values of x wrapped onto a circle. See Numerical Recipes.
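The four formulas above can be sketched in a few lines of IDL. This is an illustration only, not the source of edf_stats: the function name edf_stats_sketch is hypothetical, and the choice of where Fn(x) is evaluated (on both sides of each step for the KS and Kuiper statistics, at the data points for the two means) is an assumption that the documentation does not specify.

```idl
; Illustrative sketch only; NOT the edf_stats source code.
; edf_stats_sketch is a hypothetical name used for this example.
function edf_stats_sketch, data, model, ad=ad, ky=ky, mad=mad, _extra=extra
  compile_opt idl2

  n = n_elements(data)
  x = data[sort(data)]                       ; data values in ascending order
  f = call_function(model, x, _extra=extra)  ; model cdf F(x) at each point

  ; The edf is a step function. Just below the i-th sorted point it
  ; equals i/n, and at the point it jumps to (i+1)/n (i = 0 ... n-1).
  fn_lo = dindgen(n) / n
  fn_hi = (dindgen(n) + 1) / n

  d_plus  = max(fn_hi - f)   ; max(Fn(x) - F(x))
  d_minus = max(f - fn_lo)   ; max(F(x) - Fn(x))

  ; Kolmogorov-Smirnov: the larger of the two one-sided gaps
  ks = d_plus > d_minus

  ; Kuiper: the sum of the two one-sided gaps
  ky = d_plus + d_minus

  ; Assumption: both means are taken over the data points, with
  ; Fn(x) = (i+1)/n there. A point where F(x) is exactly 0 or 1
  ; would divide by zero in the Anderson-Darling style statistic.
  ad  = mean((fn_hi - f)^2 / (f * (1 - f)))
  mad = mean(abs(fn_hi - f))

  return, ks
end
```

The sketch returns the KS statistic and hands the other three back through output keywords. Because the Kuiper statistic adds the two one-sided gaps rather than taking the larger one, it is never smaller than the KS statistic.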

CATEGORY Statistics

Return value

If exactly one of ks, ad, ky, or mad is set, then the return value is that particular statistic. Otherwise, the KS statistic is returned.
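A short example may make the return-value rule concrete. It assumes edf_stats is compiled and on the IDL path. The model function unit_uniform_cdf is hypothetical and written only for this example.

```idl
; Hypothetical model cdf for the uniform distribution on [0, 1].
; The name unit_uniform_cdf is an assumption made for this example.
function unit_uniform_cdf, x, _extra=extra
  compile_opt idl2
  return, (x > 0d) < 1d   ; clip x to the range [0, 1]
end

; Main-level program
seed = 7L
data = randomu(seed, 200)

; Several statistics requested through named variables:
; the return value is the KS statistic.
d = edf_stats(data, 'unit_uniform_cdf', ks=ks, ad=ad, ky=ky)
print, 'KS, AD, Kuiper: ', ks, ad, ky

; Exactly one statistic keyword set:
; the return value is that statistic.
ky_only = edf_stats(data, 'unit_uniform_cdf', /ky)
print, 'Kuiper (return value): ', ky_only
end
```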

MODIFICATION HISTORY

June 2009: Written by Chris Beaumont
July 2009: Added mad statistic

Parameters

data in required

A vector of data values

model in required

The string name of a function which calculates the cdf of the model distribution. The function must have the calling sequence result = model(x, _extra = extra), and must be written to handle x as a scalar or vector. Extra keywords to edf_stats will be passed to this function.
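As an illustration of the required calling sequence, a model cdf might look like the sketch below. The function name gauss_model_cdf and its mu and sigma keywords are hypothetical and are not part of the library. They are forwarded from the edf_stats call through _extra, as described above.

```idl
; Hypothetical model cdf for a Gaussian distribution. The name
; gauss_model_cdf and the mu / sigma keywords are assumptions
; made for this example.
function gauss_model_cdf, x, mu=mu, sigma=sigma, _extra=extra
  compile_opt idl2
  m = 0d
  s = 1d
  if n_elements(mu) ne 0 then m = mu
  if n_elements(sigma) ne 0 then s = sigma
  ; Gaussian cdf written with the error function. ERF accepts
  ; scalars and arrays, so x may be either.
  return, 0.5d * (1d + erf((x - m) / (s * sqrt(2d))))
end

; Main-level program: draw 500 samples from a Gaussian with
; mean 1 and standard deviation 2, then compare them to the
; same model.
seed = 42L
data = 1d + 2d * randomn(seed, 500)

; mu and sigma are not edf_stats keywords, so they are passed
; on to gauss_model_cdf.
d = edf_stats(data, 'gauss_model_cdf', mu=1d, sigma=2d)
print, 'KS statistic: ', d
end
```

Giving the model parameters default values lets the same function be called with or without extra keywords.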

Keywords

ks in optional

If nonzero or set to a named variable, the KS statistic is calculated and returned in that variable.

ad in optional

Same as above, for the Anderson-Darling style statistic.

ky in optional

Same as above, for the Kuiper statistic.