IGLib  1.7.2
The IGLib base library EXTENDED - with other libraries and applications.
Meta.Numerics.Statistics.BivariateSample Class Reference

Represents a set of data points, where each data point is described by a pair of real numbers.


Public Member Functions

 BivariateSample ()
 Initializes a new bivariate sample.

 BivariateSample (string xName, string yName)
 Initializes a new bivariate sample with the given variable names.

void Add (double x, double y)
 Adds a data point to the sample.

void Add (XY point)
 Adds a data point to the sample.

void Add (IEnumerable< XY > points)
 Adds multiple data points to the sample.

void Add (IList< double > x, IList< double > y)
 Adds points from two lists to the sample.

bool Remove (double x, double y)
 Removes a data point from the sample.

bool Remove (XY point)
 Removes a data point from the sample.

void Clear ()
 Removes all data points from the sample.

bool Contains (double x, double y)
 Determines whether the sample contains a given data point.

bool Contains (XY xy)
 Determines whether the sample contains a given data point.

BivariateSample Copy ()
 Copies the bivariate sample.

IEnumerator< XY > GetEnumerator ()
 Gets an enumerator of sample values.

void TransposeXY ()
 Swaps the X and Y variables in the bivariate sample.

TestResult PearsonRTest ()
 Performs a Pearson correlation test for association.

TestResult SpearmanRhoTest ()
 Performs a Spearman rank-order test of association between the two variables.

TestResult KendallTauTest ()
 Performs a Kendall concordance test for association.

TestResult PairedStudentTTest ()
 Performs a paired Student t-test.

FitResult LinearRegression ()
 Computes the best-fit linear regression from the data.

FitResult PolynomialRegression (int m)
 Computes the polynomial of given degree which best fits the data.

FitResult LinearLogisticRegression ()
 Computes the best-fit linear logistic regression from the data.

void Load (IDataReader reader, int xIndex, int yIndex)
 Loads values from a data reader.
 

Properties

bool IsReadOnly [get]
 Gets a value indicating whether the bivariate sample is read-only.

Sample X [get]
 Gets a read-only univariate sample consisting of the x-values of the data points.

Sample Y [get]
 Gets a read-only univariate sample consisting of the y-values of the data points.

int Count [get]
 Gets the number of data points.

double Covariance [get]
 Gets the covariance of the two variables.

double CorrelationCoefficient [get]
 Gets the correlation coefficient between the two variables.

UncertainValue PopulationCovariance [get]
 Estimates the population covariance of two variables.
 

Private Member Functions

int IndexOf (double x, double y)
 
IEnumerator IEnumerable. GetEnumerator ()
 
void ICollection< XY >. CopyTo (XY[] array, int start)
 
double MomentAboutMean (int nx, int ny)
 

Private Attributes

SampleStorage xData
 
SampleStorage yData
 
bool isReadOnly
 

Detailed Description

Represents a set of data points, where each data point is described by a pair of real numbers.

A bivariate sample consists of pairs of real numbers, where each pair is an independent measurement. For example, if you measure the height and weight of a sample of people, the data could be stored as a bivariate sample. The class can compute various descriptive statistics for the sample, perform appropriate statistical tests on the sample data, and fit the sample data to various models.
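As a rough illustration of the height-and-weight example above, the following sketch builds a small sample and reads a few descriptive statistics. It uses only members listed on this page (`Add`, `Contains`, `Remove`, `Count`, `Covariance`, `CorrelationCoefficient`). The data values are invented, and the program assumes the Meta.Numerics assembly is referenced by the project.

```csharp
using System;
using Meta.Numerics.Statistics;

class BivariateSampleDemo
{
    static void Main()
    {
        // Hypothetical height (cm) and weight (kg) measurements.
        BivariateSample sample = new BivariateSample("height", "weight");
        sample.Add(160.0, 55.0);
        sample.Add(170.0, 68.0);
        sample.Add(180.0, 80.0);
        sample.Add(175.0, 72.0);

        Console.WriteLine(sample.Count);                  // number of data points
        Console.WriteLine(sample.Contains(170.0, 68.0));  // membership check
        Console.WriteLine(sample.Covariance);
        Console.WriteLine(sample.CorrelationCoefficient);

        // Remove returns true only if the point was found and removed.
        bool removed = sample.Remove(175.0, 72.0);
        Console.WriteLine(removed);
    }
}
```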

Constructor & Destructor Documentation

Meta.Numerics.Statistics.BivariateSample.BivariateSample ( )
inline

Initializes a new bivariate sample.

Meta.Numerics.Statistics.BivariateSample.BivariateSample ( string  xName,
string  yName 
)
inline

Initializes a new bivariate sample with the given variable names.

Parameters
xName  The name of the x-variable.
yName  The name of the y-variable.

Member Function Documentation

void Meta.Numerics.Statistics.BivariateSample.Add ( XY  point)
inline

Adds a data point to the sample.

Parameters
point  The data point.

References Meta.Numerics.XY.X, and Meta.Numerics.XY.Y.

void Meta.Numerics.Statistics.BivariateSample.Add ( IEnumerable< XY >  points)
inline

Adds multiple data points to the sample.

Parameters
points  The data points.

void Meta.Numerics.Statistics.BivariateSample.Add ( IList< double >  x,
IList< double >  y 
)
inline

Adds points from two lists to the sample.

Parameters
x  The x values of the data points.
y  The y values of the data points.
Exceptions
DimensionMismatchException  The lengths of the two lists are not equal.

int Meta.Numerics.Statistics.BivariateSample.IndexOf ( double  x,
double  y
)
inline private

bool Meta.Numerics.Statistics.BivariateSample.Remove ( double  x,
double  y 
)
inline

Removes a data point from the sample.

Parameters
x  The x-value of the data point.
y  The y-value of the data point.
Returns
True if the given data point was found and removed, otherwise false.

Referenced by Test.BivariateSampleTest.BivariateSampleManipulations().

bool Meta.Numerics.Statistics.BivariateSample.Remove ( XY  point)
inline

Removes a data point from the sample.

Parameters
point  The point to remove.
Returns
True if the given data point was found and removed, otherwise false.

References Meta.Numerics.XY.X, and Meta.Numerics.XY.Y.

void Meta.Numerics.Statistics.BivariateSample.Clear ( )
inline

Removes all data points from the sample.

Referenced by Test.BivariateSampleTest.BivariateSampleManipulations().

bool Meta.Numerics.Statistics.BivariateSample.Contains ( double  x,
double  y 
)
inline

Determines whether the sample contains a given data point.

Parameters
x  The x-value of the data point.
y  The y-value of the data point.
Returns
True if the sample contains the given data point, otherwise false.

Referenced by Test.BivariateSampleTest.BivariateSampleManipulations().

bool Meta.Numerics.Statistics.BivariateSample.Contains ( XY  xy)
inline

Determines whether the sample contains a given data point.

Parameters
xy  The data point.
Returns
True if the sample contains the given data point, otherwise false.

References Meta.Numerics.XY.X, and Meta.Numerics.XY.Y.

BivariateSample Meta.Numerics.Statistics.BivariateSample.Copy ( )
inline

Copies the bivariate sample.

Returns
An independent copy of the bivariate sample.

Referenced by Test.BivariateSampleTest.BivariateSampleCopy().

IEnumerator<XY> Meta.Numerics.Statistics.BivariateSample.GetEnumerator ( )
inline

Gets an enumerator of sample values.

Returns
An enumerator of sample values.

IEnumerator IEnumerable. Meta.Numerics.Statistics.BivariateSample.GetEnumerator ( )
inline private

void ICollection<XY>. Meta.Numerics.Statistics.BivariateSample.CopyTo ( XY[]  array,
int  start
)
inline private

void Meta.Numerics.Statistics.BivariateSample.TransposeXY ( )
inline

Swaps the X and Y variables in the bivariate sample.

double Meta.Numerics.Statistics.BivariateSample.MomentAboutMean ( int  nx,
int  ny
)
inline private

TestResult Meta.Numerics.Statistics.BivariateSample.PearsonRTest ( )
inline

Performs a Pearson correlation test for association.

Returns
The result of the test.

This test measures the strength of the linear correlation between two variables. The test statistic r is simply the covariance of the two variables, scaled by their respective standard deviations so as to obtain a number between -1 (perfect linear anti-correlation) and +1 (perfect linear correlation).

The Pearson test cannot reliably detect or rule out non-linear associations. For example, variables with a perfect quadratic association may have only a weak linear correlation. If you wish to test for associations that may not be linear, consider using the Spearman or Kendall tests instead.

The Pearson correlation test requires O(N) operations.

The Pearson test requires at least three bivariate values.

Exceptions
InsufficientDataException  Count is less than three.
See also
SpearmanRhoTest, KendallTauTest

Referenced by Test.MultivariateSampleTest.BivariateNullAssociation(), and Test.BivariateSampleTest.PearsonRDistribution().
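The statistic described above can be sketched independently of the library. The following is a minimal, self-contained illustration of r as the covariance scaled by the two standard deviations, computed in two O(N) passes. The class and method names (`PearsonSketch`, `R`) are hypothetical and this is not the library's implementation. The second data set shows the limitation noted above: a perfect quadratic association that is symmetric about the mean of x has no linear correlation at all.

```csharp
using System;

static class PearsonSketch
{
    // r = cov(x, y) / (sd(x) * sd(y)); the common 1/N factors cancel.
    public static double R(double[] x, double[] y)
    {
        if (x.Length != y.Length) throw new ArgumentException("Lengths differ.");
        int n = x.Length;
        if (n < 3) throw new InvalidOperationException("Need at least three points.");

        // First pass: means.
        double mx = 0.0, my = 0.0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n; my /= n;

        // Second pass: sums of products of deviations.
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++)
        {
            double dx = x[i] - mx, dy = y[i] - my;
            sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
        }
        return sxy / Math.Sqrt(sxx * syy);
    }

    static void Main()
    {
        double[] x = { -2, -1, 0, 1, 2 };
        double[] yLinear = { -3, -1, 1, 3, 5 };   // y = 2x + 1
        double[] yQuad = { 4, 1, 0, 1, 4 };       // y = x^2
        Console.WriteLine(R(x, yLinear));         // r = 1: perfect linear correlation
        Console.WriteLine(R(x, yQuad));           // r = 0: no linear correlation
    }
}
```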

TestResult Meta.Numerics.Statistics.BivariateSample.SpearmanRhoTest ( )
inline

Performs a Spearman rank-order test of association between the two variables.

Returns
The result of the test.

The Spearman rank-order test of association is a non-parametric test for association between two variables. The test statistic rho is the correlation coefficient of the rank of each entry in the sample. It is thus invariant over monotonic reparameterizations of the data, and will, for example, detect a quadratic or exponential association just as well as a linear association.

The Spearman rank-order test requires O(N log N) operations.

Exceptions
InsufficientDataException  There are fewer than three data points.
See also
PearsonRTest, KendallTauTest

References Meta.Numerics.Interval.FromEndpoints().

Referenced by Test.MultivariateSampleTest.BivariateNullAssociation(), and Test.NullDistributionTests.SpearmanNullDistributionTest().
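To make the rank-based definition concrete, here is a small self-contained sketch that ranks each variable and then takes the correlation coefficient of the ranks. The names `SpearmanSketch`, `Ranks`, and `Rho` are hypothetical, and the sketch assumes there are no tied values; how the library handles ties is not stated on this page. The sort is the O(N log N) step.

```csharp
using System;
using System.Linq;

static class SpearmanSketch
{
    // Returns the 1-based rank of each value; assumes no ties for simplicity.
    static double[] Ranks(double[] values)
    {
        int n = values.Length;
        int[] order = Enumerable.Range(0, n).ToArray();
        Array.Sort(order, (a, b) => values[a].CompareTo(values[b])); // O(N log N)
        double[] ranks = new double[n];
        for (int r = 0; r < n; r++) ranks[order[r]] = r + 1;
        return ranks;
    }

    // rho is the correlation coefficient of the ranks.
    public static double Rho(double[] x, double[] y)
    {
        if (x.Length != y.Length) throw new ArgumentException("Lengths differ.");
        double[] rx = Ranks(x), ry = Ranks(y);
        int n = rx.Length;
        double mean = (n + 1) / 2.0;   // mean of the ranks 1..n
        double sxy = 0.0, sxx = 0.0, syy = 0.0;
        for (int i = 0; i < n; i++)
        {
            double dx = rx[i] - mean, dy = ry[i] - mean;
            sxy += dx * dy; sxx += dx * dx; syy += dy * dy;
        }
        return sxy / Math.Sqrt(sxx * syy);
    }

    static void Main()
    {
        double[] x = { 0.5, 1.0, 2.0, 3.0, 4.5 };
        double[] y = x.Select(v => Math.Exp(v)).ToArray(); // monotonic but non-linear
        Console.WriteLine(Rho(x, y));   // rho = 1: the ranks agree exactly
    }
}
```

Because only the ranks enter the calculation, replacing y by any increasing function of y leaves rho unchanged.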

TestResult Meta.Numerics.Statistics.BivariateSample.KendallTauTest ( )
inline

Performs a Kendall concordance test for association.

Returns
The result of the test.

Kendall's τ is a non-parametric and robust test of association between two variables. It simply measures the number of cases where an increase in one variable is associated with an increase in the other (concordant pairs), compared with the number of cases where an increase in one variable is associated with a decrease in the other (discordant pairs).

Because τ depends only on the sign of a change and not its magnitude, it is not skewed by outliers exhibiting very large changes, nor by cases where the degree of change in one variable associated with a given change in the other varies over the range of the variables. Of course, it may still miss an association whose sign changes over the range of the variables. For example, if data points lie along a semi-circle in the plane, an increase in the first variable is associated with an increase in the second variable along the rising arc and a decrease in the second variable along the falling arc. No test that looks for single-signed correlation will catch this association.

Because it examines all pairs of data points, the Kendall test requires O(N²) operations. It is thus impractical for very large data sets. While not quite as robust as the Kendall test, the Spearman test is a good fall-back in such cases.

Exceptions
InsufficientDataException  Count is less than two.
See also
PearsonRTest, SpearmanRhoTest

References Meta.Numerics.Interval.FromEndpoints().

Referenced by Test.MultivariateSampleTest.BivariateNullAssociation(), and Test.NullDistributionTests.KendallNullDistributionTest().
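The pair counting described above can be sketched as a double loop over all N(N-1)/2 pairs, which is where the quadratic cost comes from. This is a self-contained illustration with hypothetical names (`KendallSketch`, `Tau`). It assumes no ties and normalizes by the total number of pairs, which is one common convention and not necessarily the one the library uses.

```csharp
using System;

static class KendallSketch
{
    // tau = (concordant - discordant) / (number of pairs); assumes no ties.
    public static double Tau(double[] x, double[] y)
    {
        if (x.Length != y.Length) throw new ArgumentException("Lengths differ.");
        int n = x.Length;
        if (n < 2) throw new InvalidOperationException("Need at least two points.");

        long concordant = 0, discordant = 0;
        for (int i = 0; i < n; i++)
        {
            for (int j = i + 1; j < n; j++)   // each pair is visited once
            {
                // Only the signs of the changes matter, not their magnitudes.
                int s = Math.Sign(x[j] - x[i]) * Math.Sign(y[j] - y[i]);
                if (s > 0) concordant++;
                else if (s < 0) discordant++;
            }
        }
        return (concordant - discordant) / (0.5 * n * (n - 1));
    }

    static void Main()
    {
        double[] x = { 1, 2, 3, 4, 5 };
        double[] y = { 1, 4, 9, 16, 1000 };   // increasing, with one extreme value
        Console.WriteLine(Tau(x, y));          // tau = 1: every pair is concordant
    }
}
```

The extreme final y-value has no effect on the result, since it changes no pair's sign.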

TestResult Meta.Numerics.Statistics.BivariateSample.PairedStudentTTest ( )
inline

Performs a paired Student t-test.

Returns
The result of the test.

Like a two-sample, unpaired t-test (Sample.StudentTTest(Sample,Sample)), a paired t-test compares two samples to detect a difference in means. Unlike the unpaired version, the paired version assumes that each

Exceptions
InsufficientDataException  There are fewer than two data points.

Referenced by Test.MultivariateSampleTest.PairedStudentTTest().
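For orientation, the textbook paired t statistic is the mean of the per-pair differences divided by its standard error. The following self-contained sketch computes it with hypothetical names (`PairedTSketch`, `T`) and invented data. Taking the difference as x minus y is an assumption, and the library's own sign convention and result object are not described on this page.

```csharp
using System;

static class PairedTSketch
{
    // t = mean(d) / (s_d / sqrt(n)), where d[i] = x[i] - y[i]
    // and s_d is the sample standard deviation of the differences.
    public static double T(double[] x, double[] y)
    {
        if (x.Length != y.Length) throw new ArgumentException("Lengths differ.");
        int n = x.Length;
        if (n < 2) throw new InvalidOperationException("Need at least two pairs.");

        double mean = 0.0;
        for (int i = 0; i < n; i++) mean += x[i] - y[i];
        mean /= n;

        double ss = 0.0;
        for (int i = 0; i < n; i++)
        {
            double dev = (x[i] - y[i]) - mean;
            ss += dev * dev;
        }
        double sd = Math.Sqrt(ss / (n - 1));   // n - 1 gives the sample standard deviation
        return mean / (sd / Math.Sqrt(n));
    }

    static void Main()
    {
        // Hypothetical before/after measurements on the same four subjects.
        double[] before = { 10, 12, 9, 11 };
        double[] after = { 8, 11, 8, 8 };
        Console.WriteLine(T(before, after));
    }
}
```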

FitResult Meta.Numerics.Statistics.BivariateSample.LinearRegression ( )
inline

Computes the best-fit linear regression from the data.

Returns
The result of the fit.

Linear regression assumes that the data have been generated by a function y = a + b x + e, where e is normally distributed noise, and determines the values of a and b that best fit the data. It also determines an error matrix on the parameters a and b, and does an F-test to

The fit result is two-dimensional. The first parameter is the intercept a, the second is the slope b. The goodness-of-fit test is an F-test comparing the variance accounted for by the model to the remaining, unexplained variance.

Exceptions
InsufficientDataException  There are fewer than three data points.

Referenced by Test.BivariateSampleTest.BivariateLinearPolynomialRegressionAgreement(), Test.BivariateSampleTest.BivariateLinearRegression(), Test.BivariateSampleTest.BivariateLinearRegressionGoodnessOfFitDistribution(), and Test.MultivariateSampleTest.MultivariateLinearRegressionAgreement().
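The point estimates of a and b for the model y = a + b x can be sketched with the usual least-squares formulas. This self-contained illustration uses hypothetical names (`LinearFitSketch`, `Fit`) and returns the parameters in the same order as described above, intercept first and slope second. It does not attempt the error matrix or the F-test, and it is not the library's implementation.

```csharp
using System;

static class LinearFitSketch
{
    // Ordinary least squares for y = a + b x.
    // Returns { a, b }: intercept first, slope second.
    public static double[] Fit(double[] x, double[] y)
    {
        if (x.Length != y.Length) throw new ArgumentException("Lengths differ.");
        int n = x.Length;
        if (n < 3) throw new InvalidOperationException("Need at least three points.");

        double mx = 0.0, my = 0.0;
        for (int i = 0; i < n; i++) { mx += x[i]; my += y[i]; }
        mx /= n; my /= n;

        double sxy = 0.0, sxx = 0.0;
        for (int i = 0; i < n; i++)
        {
            double dx = x[i] - mx;
            sxy += dx * (y[i] - my);
            sxx += dx * dx;
        }

        double b = sxy / sxx;      // slope
        double a = my - b * mx;    // intercept
        return new double[] { a, b };
    }

    static void Main()
    {
        double[] x = { 0, 1, 2, 3 };
        double[] y = { 1, 3, 5, 7 };            // exactly y = 1 + 2x
        double[] p = Fit(x, y);
        Console.WriteLine(p[0] + ", " + p[1]);  // prints "1, 2"
    }
}
```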

FitResult Meta.Numerics.Statistics.BivariateSample.PolynomialRegression ( int  m)
inline

Computes the polynomial of given degree which best fits the data.

Parameters
m  The degree, which must be non-negative.
Returns
The fit result.
Exceptions
ArgumentOutOfRangeException  m is negative.
InsufficientDataException  There are fewer data points than coefficients to be fit.

References Meta.Numerics.Matrices.ColumnVector.Transpose().

Referenced by Test.BivariateSampleTest.BivariateLinearPolynomialRegressionAgreement(), Test.BivariateSampleTest.BivariatePolynomialRegression(), and Test.BugTests.Bug6162().

FitResult Meta.Numerics.Statistics.BivariateSample.LinearLogisticRegression ( )
inline

Computes the best-fit linear logistic regression from the data.

Returns
The fit result.

Linear logistic regression is a way to fit binary outcome data to a linear model.

The method assumes that binary outcomes are encoded as 0 and 1. If any y-values other than 0 and 1 are encountered, it throws an InvalidOperationException.

The fit result is two-dimensional. The first parameter is a, the second b.

Exceptions
InsufficientDataException  There are fewer than three data points.
InvalidOperationException  There is a y-value other than 0 or 1.

References Meta.Numerics.Analysis.MultiFunctionMath.FindMinimum(), Meta.Numerics.Analysis.MultiExtremum.HessianMatrix, Meta.Numerics.Matrices.SymmetricMatrix.Inverse(), and Meta.Numerics.Analysis.MultiExtremum.Location.

Referenced by Test.BivariateSampleTest.LinearLogisticRegression().
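As a hedged sketch of what such a fit optimizes, the code below assumes the standard logistic form p(y = 1 | x) = 1 / (1 + exp(-(a + b x))) and evaluates the negative log-likelihood that a minimizer would search over (a, b). The exact parameterization used by the library is not given on this page, so the model form and the names `LogisticSketch`, `Probability`, and `NegativeLogLikelihood` are assumptions. The 0/1 check mirrors the documented InvalidOperationException.

```csharp
using System;

static class LogisticSketch
{
    // Assumed model: probability that y = 1 at x is 1 / (1 + exp(-(a + b x))).
    public static double Probability(double a, double b, double x)
    {
        return 1.0 / (1.0 + Math.Exp(-(a + b * x)));
    }

    // Negative log-likelihood of binary outcomes; a fit would minimize this over (a, b).
    public static double NegativeLogLikelihood(double a, double b, double[] x, double[] y)
    {
        if (x.Length != y.Length) throw new ArgumentException("Lengths differ.");
        double sum = 0.0;
        for (int i = 0; i < x.Length; i++)
        {
            double p = Probability(a, b, x[i]);
            if (y[i] == 1.0) sum -= Math.Log(p);
            else if (y[i] == 0.0) sum -= Math.Log(1.0 - p);
            else throw new InvalidOperationException("y-values must be 0 or 1.");
        }
        return sum;
    }

    static void Main()
    {
        double[] x = { 1.0, 2.0, 3.0, 4.0 };
        double[] y = { 0.0, 0.0, 1.0, 1.0 };
        // With a = b = 0 every point has p = 0.5, so the value is 4 * ln 2.
        Console.WriteLine(NegativeLogLikelihood(0.0, 0.0, x, y));
        // A positive slope centred between the two groups fits these data better.
        Console.WriteLine(NegativeLogLikelihood(-5.0, 2.0, x, y));
    }
}
```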

void Meta.Numerics.Statistics.BivariateSample.Load ( IDataReader  reader,
int  xIndex,
int  yIndex 
)
inline

Loads values from a data reader.

Parameters
reader  The data reader.
xIndex  The column number of the x-variable.
yIndex  The column number of the y-variable.

Member Data Documentation

SampleStorage Meta.Numerics.Statistics.BivariateSample.xData
private

SampleStorage Meta.Numerics.Statistics.BivariateSample.yData
private

bool Meta.Numerics.Statistics.BivariateSample.isReadOnly
private

Property Documentation

bool Meta.Numerics.Statistics.BivariateSample.IsReadOnly
get

Gets a value indicating whether the bivariate sample is read-only.

Sample Meta.Numerics.Statistics.BivariateSample.X
get

Gets a read-only univariate sample consisting of the x-values of the data points.

Use this property to obtain information specific to the x-values, such as their Sample.Median or Sample.Variance.

Note that this is a fast, O(1) operation, which does not create an independent copy of the data. The advantage of this is that you can access x-data as a Sample as often as you like without worrying about performance. The disadvantage of this is that the returned sample cannot be altered. If you need to alter x-data independent of the bivariate sample, use the Sample.Copy method to obtain an independent copy.

Referenced by Test.SampleTest.BetaFitUncertainty(), Test.BivariateSampleTest.BivariateLinearRegression(), Test.ContingencyTableTest.ContingencyTableProbabilitiesAndUncertainties(), Test.SampleTest.GammaFitUncertainty(), Test.BivariateSampleTest.LinearLogisticRegression(), Test.SampleTest.NormalFitUncertainties(), Test.SampleTest.WaldFitUncertainties(), and Test.SampleTest.WeibullFitUncertainties().
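The view-versus-copy distinction described above might be used as follows. This sketch relies on `Sample.Median`, `Sample.Variance`, and `Sample.Copy` as named in the text; the call to `Sample.Add(double)` on the copy is an assumption about the Sample class, which is not documented on this page. The program assumes the Meta.Numerics assembly is referenced.

```csharp
using System;
using Meta.Numerics.Statistics;

class XViewDemo
{
    static void Main()
    {
        BivariateSample sample = new BivariateSample();
        sample.Add(1.0, 2.0);
        sample.Add(2.0, 4.1);
        sample.Add(3.0, 5.9);

        // Read-only view of the x-values; no data is copied.
        Sample xs = sample.X;
        Console.WriteLine(xs.Median);
        Console.WriteLine(xs.Variance);

        // An independent copy can be altered without touching the bivariate sample.
        Sample editable = xs.Copy();
        editable.Add(10.0);                // assumed: Sample.Add(double)
        Console.WriteLine(sample.Count);   // the bivariate sample itself is unchanged
    }
}
```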

Sample Meta.Numerics.Statistics.BivariateSample.Y
get

Gets a read-only univariate sample consisting of the y-values of the data points.

Use this property to obtain information specific to the y-values, such as their Sample.Median or Sample.Variance.

Note that this is a fast, O(1) operation, which does not create an independent copy of the data. The advantage of this is that you can access y-data as a Sample as often as you like without worrying about performance. The disadvantage of this is that the returned sample cannot be altered. If you need to alter y-data independent of the bivariate sample, use the Sample.Copy method to obtain an independent copy.

Referenced by Test.SampleTest.BetaFitUncertainty(), Test.BivariateSampleTest.BivariateLinearRegression(), Test.ContingencyTableTest.ContingencyTableProbabilitiesAndUncertainties(), Test.SampleTest.GammaFitUncertainty(), Test.BivariateSampleTest.LinearLogisticRegression(), Test.SampleTest.NormalFitUncertainties(), Test.SampleTest.WaldFitUncertainties(), and Test.SampleTest.WeibullFitUncertainties().

double Meta.Numerics.Statistics.BivariateSample.Covariance
get

Gets the covariance of the two variables.

Referenced by Test.MultivariateSampleTest.MultivariateMoments().

double Meta.Numerics.Statistics.BivariateSample.CorrelationCoefficient
get

Gets the correlation coefficient between the two variables.

UncertainValue Meta.Numerics.Statistics.BivariateSample.PopulationCovariance
get

Estimates the population covariance of two variables.

Returns
An estimate, with associated uncertainty, of the population covariance.

Referenced by Test.BivariateSampleTest.BivariateLinearRegression(), Test.BivariateSampleTest.BivariatePolynomialRegression(), Test.BivariateSampleTest.LinearLogisticRegression(), Test.SampleTest.NormalFitUncertainties(), and Test.SampleTest.WaldFitUncertainties().

