IGLib
1.7.2
The IGLib base library EXTENDED - with other libraries and applications.
BivariateSample Class Reference
Represents a set of data points, where each data point is described by a pair of real numbers.
Public Member Functions
BivariateSample () | Initializes a new bivariate sample. |
BivariateSample (string xName, string yName) | Initializes a new bivariate sample with the given variable names. |
void Add (double x, double y) | Adds a data point to the sample. |
void Add (XY point) | Adds a data point to the sample. |
void Add (IEnumerable< XY > points) | Adds multiple data points to the sample. |
void Add (IList< double > x, IList< double > y) | Adds points from two lists to the sample. |
bool Remove (double x, double y) | Removes a data point from the sample. |
bool Remove (XY point) | Removes a data point from the sample. |
void Clear () | Removes all data points from the sample. |
bool Contains (double x, double y) | Determines whether the sample contains a given data point. |
bool Contains (XY xy) | Determines whether the sample contains a given data point. |
BivariateSample Copy () | Copies the bivariate sample. |
IEnumerator< XY > GetEnumerator () | Gets an enumerator of sample values. |
void TransposeXY () | Swaps the X and Y variables in the bivariate sample. |
TestResult PearsonRTest () | Performs a Pearson correlation test for association. |
TestResult SpearmanRhoTest () | Performs a Spearman rank-order test of association between the two variables. |
TestResult KendallTauTest () | Performs a Kendall concordance test for association. |
TestResult PairedStudentTTest () | Performs a paired Student t-test. |
FitResult LinearRegression () | Computes the best-fit linear regression from the data. |
FitResult PolynomialRegression (int m) | Computes the polynomial of given degree which best fits the data. |
FitResult LinearLogisticRegression () | Computes the best-fit linear logistic regression from the data. |
void Load (IDataReader reader, int xIndex, int yIndex) | Loads values from a data reader. |
Properties
bool IsReadOnly [get] | Gets a value indicating whether the bivariate sample is read-only. |
Sample X [get] | Gets a read-only univariate sample consisting of the x-values of the data points. |
Sample Y [get] | Gets a read-only univariate sample consisting of the y-values of the data points. |
int Count [get] | Gets the number of data points. |
double Covariance [get] | Gets the covariance of the two variables. |
double CorrelationCoefficient [get] | Gets the correlation coefficient between the two variables. |
UncertainValue PopulationCovariance [get] | Estimates the population covariance of the two variables. |
Private Member Functions
int IndexOf (double x, double y)
IEnumerator IEnumerable.GetEnumerator ()
void ICollection< XY >.CopyTo (XY[] array, int start)
double MomentAboutMean (int nx, int ny)
Private Attributes
SampleStorage xData
SampleStorage yData
bool isReadOnly
Represents a set of data points, where each data point is described by a pair of real numbers.
A bivariate sample consists of pairs of real numbers, where each pair is an independent measurement. For example, if you measure the height and weight of a sample of people, the data could be stored as a bivariate sample. The class can compute various descriptive statistics for the sample, perform appropriate statistical tests on the sample data, and fit the sample data to various models.
BivariateSample () | inline |
Initializes a new bivariate sample.
BivariateSample (string xName, string yName) | inline |
Initializes a new bivariate sample with the given variable names.
xName | The name of the x-variable. |
yName | The name of the y-variable. |
void Add (double x, double y) | inline |
Adds a data point to the sample.
x | The x-value of the data point. |
y | The y-value of the data point. |
Referenced by Test.SampleTest.BetaFitUncertainty(), Test.BivariateSampleTest.BivariateLinearPolynomialRegressionAgreement(), Test.BivariateSampleTest.BivariateLinearRegression(), Test.BivariateSampleTest.BivariateLinearRegressionGoodnessOfFitDistribution(), Test.MultivariateSampleTest.BivariateNullAssociation(), Test.BivariateSampleTest.BivariatePolynomialRegression(), Test.BivariateSampleTest.BivariateSampleCopy(), Test.BivariateSampleTest.BivariateSampleManipulations(), Test.BugTests.Bug6162(), Test.ContingencyTableTest.ContingencyTableProbabilitiesAndUncertainties(), Test.SampleTest.GammaFitUncertainty(), Test.NullDistributionTests.KendallNullDistributionTest(), Test.BivariateSampleTest.LinearLogisticRegression(), Test.SampleTest.NormalFitUncertainties(), Test.MultivariateSampleTest.PairedStudentTTest(), Test.BivariateSampleTest.PearsonRDistribution(), Test.NullDistributionTests.SpearmanNullDistributionTest(), Test.SampleTest.WaldFitUncertainties(), and Test.SampleTest.WeibullFitUncertainties().
void Add (XY point) | inline |
Adds a data point to the sample.
point | The data point. |
References Meta.Numerics.XY.X, and Meta.Numerics.XY.Y.
void Add (IEnumerable< XY > points) | inline |
Adds multiple data points to the sample.
points | The data points. |
void Add (IList< double > x, IList< double > y) | inline |
Adds points from two lists to the sample.
x | The x values of the data points. |
y | The y values of the data points. |
DimensionMismatchException | The lengths of the two lists are not equal. |
int IndexOf (double x, double y) | inline, private |
bool Remove (double x, double y) | inline |
Removes a data point from the sample.
x | The x-value of the data point. |
y | The y-value of the data point. |
Referenced by Test.BivariateSampleTest.BivariateSampleManipulations().
bool Remove (XY point) | inline |
Removes a data point from the sample.
point | The point to remove. |
References Meta.Numerics.XY.X, and Meta.Numerics.XY.Y.
void Clear () | inline |
Removes all data points from the sample.
Referenced by Test.BivariateSampleTest.BivariateSampleManipulations().
bool Contains (double x, double y) | inline |
Determines whether the sample contains a given data point.
x | The x-value of the data point. |
y | The y-value of the data point. |
Referenced by Test.BivariateSampleTest.BivariateSampleManipulations().
bool Contains (XY xy) | inline |
Determines whether the sample contains a given data point.
xy | The data point. |
References Meta.Numerics.XY.X, and Meta.Numerics.XY.Y.
BivariateSample Copy () | inline |
Copies the bivariate sample.
Referenced by Test.BivariateSampleTest.BivariateSampleCopy().
IEnumerator< XY > GetEnumerator () | inline |
Gets an enumerator of sample values.
IEnumerator IEnumerable.GetEnumerator () | inline, private |
void ICollection< XY >.CopyTo (XY[] array, int start) | inline, private |
void TransposeXY () | inline |
Swaps the X and Y variables in the bivariate sample.
double MomentAboutMean (int nx, int ny) | inline, private |
References Meta.Numerics.MoreMath.Pow().
TestResult PearsonRTest () | inline |
Performs a Pearson correlation test for association.
This test measures the strength of the linear correlation between two variables. The test statistic r is simply the covariance of the two variables, scaled by their respective standard deviations so as to obtain a number between -1 (perfect linear anti-correlation) and +1 (perfect linear correlation).
The Pearson test cannot reliably detect or rule out non-linear associations. For example, variables with a perfect quadratic association may have only a weak linear correlation. If you wish to test for associations that may not be linear, consider using the Spearman or Kendall tests instead.
The Pearson correlation test requires O(N) operations.
The Pearson test requires at least three bivariate values.
InsufficientDataException | Count is less than three. |
Referenced by Test.MultivariateSampleTest.BivariateNullAssociation(), and Test.BivariateSampleTest.PearsonRDistribution().
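The statistic described above can be sketched in a few lines. This is an illustrative Python sketch, not the library's code; the function name pearson_r is hypothetical, and the p-value computation that a full test would add is omitted. It shows the O(N) structure (a fixed number of passes over the data) and the quadratic-association caveat.

```python
import math

def pearson_r(xs, ys):
    """Covariance of the two variables scaled by their standard deviations."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 3:
        raise ValueError("at least three data points are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products about the means; the 1/n factors cancel in the ratio.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    # Note: a constant variable makes sxx or syy zero; not handled here.
    return sxy / math.sqrt(sxx * syy)

# Perfect linear association.
linear = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])
# Perfect quadratic association (y = x^2) on a symmetric range.
quadratic = pearson_r([-2, -1, 0, 1, 2], [4, 1, 0, 1, 4])
print(linear, quadratic)
```

In the second data set y is fully determined by x, yet the linear correlation vanishes because the rising and falling halves cancel.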
TestResult SpearmanRhoTest () | inline |
Performs a Spearman rank-order test of association between the two variables.
The Spearman rank-order test of association is a non-parametric test for association between two variables. The test statistic rho is the correlation coefficient of the rank of each entry in the sample. It is thus invariant over monotonic reparameterizations of the data, and will, for example, detect a quadratic or exponential association just as well as a linear association.
The Spearman rank-order test requires O(N log N) operations.
InsufficientDataException | There are fewer than three data points. |
References Meta.Numerics.Interval.FromEndpoints().
Referenced by Test.MultivariateSampleTest.BivariateNullAssociation(), and Test.NullDistributionTests.SpearmanNullDistributionTest().
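A rough Python sketch of the rank-correlation idea follows. It is not the library's implementation: the names ranks and spearman_rho are hypothetical, and averaging the ranks of tied values is an assumption, since the text above does not say how ties are treated. The sort dominates the cost, which is where the O(N log N) bound comes from.

```python
import math

def ranks(values):
    """Ranks 1..n in sorted order; tied values share their average rank (assumed)."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result

def correlation(us, vs):
    """Plain correlation coefficient of two equal-length lists."""
    n = len(us)
    mu, mv = sum(us) / n, sum(vs) / n
    suv = sum((u - mu) * (v - mv) for u, v in zip(us, vs))
    suu = sum((u - mu) ** 2 for u in us)
    svv = sum((v - mv) ** 2 for v in vs)
    return suv / math.sqrt(suu * svv)

def spearman_rho(xs, ys):
    """Correlation coefficient of the ranks of the two variables."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if len(xs) < 3:
        raise ValueError("at least three data points are required")
    return correlation(ranks(xs), ranks(ys))

xs = [1, 2, 3, 4, 5]
rho_exponential = spearman_rho(xs, [math.exp(x) for x in xs])
print(rho_exponential)
```

Because only the ordering of the values enters, the exponential relationship is scored exactly like a linear one.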
TestResult KendallTauTest () | inline |
Performs a Kendall concordance test for association.
Kendall's τ is a non-parametric and robust test of association between two variables. It simply measures the number of cases where an increase in one variable is associated with an increase in the other (concordant pairs), compared with the number of cases where an increase in one variable is associated with a decrease in the other (discordant pairs).
Because τ depends only on the sign of a change and not its magnitude, it is not skewed by outliers exhibiting very large changes, nor by cases where the degree of change in one variable associated with a given change in the other changes over the range of the variables. Of course, it may still miss an association whose sign changes over the range of the variables. For example, if data points lie along a semi-circle in the plane, an increase in the first variable is associated with an increase in the second variable along the rising arc and a decrease in the second variable along the falling arc. No test that looks for single-signed correlation will catch this association.
Because it examines all pairs of data points, the Kendall test requires O(N^2) operations. It is thus impractical for very large data sets. While not quite as robust as the Kendall test, the Spearman test is a good fall-back in such cases.
InsufficientDataException | Count is less than two. |
References Meta.Numerics.Interval.FromEndpoints().
Referenced by Test.MultivariateSampleTest.BivariateNullAssociation(), and Test.NullDistributionTests.KendallNullDistributionTest().
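The pair counting described above can be sketched as follows. This Python sketch is illustrative only: kendall_tau is a hypothetical name, the normalization by the total number of pairs (sometimes called tau-a) is an assumption about convention, and tied pairs are simply counted as neither concordant nor discordant. The double loop over pairs is the source of the O(N^2) cost.

```python
import math

def kendall_tau(xs, ys):
    """(concordant - discordant) / (number of pairs), examining every pair once."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("at least two data points are required")
    concordant = 0
    discordant = 0
    for i in range(n):
        for j in range(i + 1, n):
            # Only the sign of the joint change matters, not its size.
            sign = (xs[j] - xs[i]) * (ys[j] - ys[i])
            if sign > 0:
                concordant += 1
            elif sign < 0:
                discordant += 1
    return (concordant - discordant) / (n * (n - 1) // 2)

# A huge outlier does not change the result: every pair is still concordant.
tau_outlier = kendall_tau([1, 2, 3, 4], [1, 2, 3, 1000])

# Points on a semi-circle: rising and falling arcs cancel.
arc_x = [-1.0, -0.5, 0.0, 0.5, 1.0]
arc_y = [math.sqrt(1.0 - x * x) for x in arc_x]
tau_arc = kendall_tau(arc_x, arc_y)
print(tau_outlier, tau_arc)
```

The semi-circle data give as many concordant as discordant pairs, so the association goes undetected, as the text warns.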
TestResult PairedStudentTTest () | inline |
Performs a paired Student t-test.
Like a two-sample, unpaired t-test (Sample.StudentTTest(Sample,Sample)), a paired t-test compares two samples to detect a difference in means. Unlike the unpaired version, the paired version assumes that each
InsufficientDataException | There are fewer than two data points. |
Referenced by Test.MultivariateSampleTest.PairedStudentTTest().
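For orientation, the textbook paired t statistic is computed from the per-pair differences. The Python sketch below follows that textbook form and is not taken from the library: the name paired_t_statistic is hypothetical, the sign convention (x minus y) is an assumption, and the conversion of t into a p-value is omitted.

```python
import math

def paired_t_statistic(xs, ys):
    """Return (t, degrees of freedom) for the differences x_i - y_i."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 2:
        raise ValueError("at least two data points are required")
    diffs = [x - y for x, y in zip(xs, ys)]
    mean_d = sum(diffs) / n
    # Unbiased variance of the differences (divides by n - 1).
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)
    # Note: identical differences give var_d == 0; not handled here.
    t = mean_d / math.sqrt(var_d / n)
    return t, n - 1

# Hypothetical before/after measurements on the same four subjects.
before = [72, 80, 65, 90]
after = [70, 77, 66, 85]
t, dof = paired_t_statistic(before, after)
print(t, dof)
```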
FitResult LinearRegression () | inline |
Computes the best-fit linear regression from the data.
Linear regression assumes that the data have been generated by a function y = a + b x + e, where e is normally distributed noise, and determines the values of a and b that best fit the data. It also determines an error matrix on the parameters a and b, and does an F-test to
The fit result is two-dimensional. The first parameter is the intercept a, the second is the slope b. The goodness-of-fit test is an F-test comparing the variance accounted for by the model to the remaining, unexplained variance.
InsufficientDataException | There are fewer than three data points. |
Referenced by Test.BivariateSampleTest.BivariateLinearPolynomialRegressionAgreement(), Test.BivariateSampleTest.BivariateLinearRegression(), Test.BivariateSampleTest.BivariateLinearRegressionGoodnessOfFitDistribution(), and Test.MultivariateSampleTest.MultivariateLinearRegressionAgreement().
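The least-squares estimates and the F statistic can be sketched in closed form. This Python sketch is a minimal illustration under the model y = a + b x + e, not the library's code: linear_fit is a hypothetical name, and the error matrix on a and b that the method also reports is left out.

```python
def linear_fit(xs, ys):
    """Return (a, b, f): intercept, slope, and the F statistic of the fit."""
    n = len(xs)
    if n != len(ys):
        raise ValueError("xs and ys must have the same length")
    if n < 3:
        raise ValueError("at least three data points are required")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx                 # slope
    a = mean_y - b * mean_x       # intercept
    # Variance explained by the model (1 degree of freedom) ...
    explained = sum((a + b * x - mean_y) ** 2 for x in xs)
    # ... versus the unexplained residual variance (n - 2 degrees of freedom).
    residual = sum((y - a - b * x) ** 2 for x, y in zip(xs, ys))
    f = explained / (residual / (n - 2))
    return a, b, f

a, b, f = linear_fit([0, 1, 2, 3], [1.1, 2.9, 5.2, 6.8])
print(a, b, f)
```

A large F value means the line accounts for far more variance than is left in the residuals. Data that lie exactly on a line give a zero residual, which this sketch does not guard against.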
FitResult PolynomialRegression (int m) | inline |
Computes the polynomial of given degree which best fits the data.
m | The degree, which must be non-negative. |
ArgumentOutOfRangeException | m is negative. |
InsufficientDataException | There are fewer data points than coefficients to be fit. |
References Meta.Numerics.Matrices.ColumnVector.Transpose().
Referenced by Test.BivariateSampleTest.BivariateLinearPolynomialRegressionAgreement(), Test.BivariateSampleTest.BivariatePolynomialRegression(), and Test.BugTests.Bug6162().
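One simple way to obtain a best-fit polynomial is to solve the least-squares normal equations. The Python sketch below does this with Gaussian elimination; it is an assumption-laden illustration rather than the library's algorithm, and polynomial_fit is a hypothetical name. It mirrors the two documented error conditions: a negative degree, and fewer data points than the m + 1 coefficients.

```python
def polynomial_fit(xs, ys, m):
    """Return coefficients [c0, c1, ..., cm] of the least-squares polynomial."""
    if m < 0:
        raise ValueError("the degree must be non-negative")
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    size = m + 1
    if len(xs) < size:
        raise ValueError("fewer data points than coefficients to be fit")
    # Normal equations A c = rhs with A[i][j] = sum x^(i+j), rhs[i] = sum y x^i.
    a = [[sum(x ** (i + j) for x in xs) for j in range(size)] for i in range(size)]
    rhs = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(size)]
    # Forward elimination with partial pivoting.
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        rhs[col], rhs[pivot] = rhs[pivot], rhs[col]
        for row in range(col + 1, size):
            factor = a[row][col] / a[col][col]
            for k in range(col, size):
                a[row][k] -= factor * a[col][k]
            rhs[row] -= factor * rhs[col]
    # Back substitution.
    coeffs = [0.0] * size
    for i in reversed(range(size)):
        tail = sum(a[i][k] * coeffs[k] for k in range(i + 1, size))
        coeffs[i] = (rhs[i] - tail) / a[i][i]
    return coeffs

# Data generated exactly from y = 1 + 2x + 3x^2.
xs = [-2, -1, 0, 1, 2]
ys = [1 + 2 * x + 3 * x * x for x in xs]
coeffs = polynomial_fit(xs, ys, 2)
print(coeffs)
```

Normal equations become ill-conditioned for high degrees, so this form is only suitable for small m.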
FitResult LinearLogisticRegression () | inline |
Computes the best-fit linear logistic regression from the data.
Linear logistic regression is a way to fit binary outcome data to a linear model.
The method assumes that binary outcomes are encoded as 0 and 1. If any y-values other than 0 and 1 are encountered, it throws an InvalidOperationException.
The fit result is two-dimensional. The first parameter is a, the second b.
InsufficientDataException | There are fewer than three data points. |
InvalidOperationException | There is a y-value other than 0 or 1. |
References Meta.Numerics.Analysis.MultiFunctionMath.FindMinimum(), Meta.Numerics.Analysis.MultiExtremum.HessianMatrix, Meta.Numerics.Matrices.SymmetricMatrix.Inverse(), and Meta.Numerics.Analysis.MultiExtremum.Location.
Referenced by Test.BivariateSampleTest.LinearLogisticRegression().
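To make the model concrete, the sketch below fits p(y = 1) = 1 / (1 + exp(-(a + b x))) by maximum likelihood. The references listed above suggest the library uses a general function minimizer and the Hessian at the minimum; this Python sketch instead uses plain Newton iterations, which is a swapped-in technique chosen for brevity. The name logistic_fit and the fixed iteration count are hypothetical, and data that are perfectly separable (where no finite optimum exists) are not handled.

```python
import math

def logistic_fit(xs, ys, iterations=25):
    """Return (a, b) minimizing the negative log-likelihood of 0/1 outcomes."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must have the same length")
    if len(xs) < 3:
        raise ValueError("at least three data points are required")
    if any(y not in (0, 1) for y in ys):
        raise ValueError("y-values must be 0 or 1")
    a = 0.0
    b = 0.0
    for _ in range(iterations):
        g0 = g1 = 0.0              # gradient of the negative log-likelihood
        h00 = h01 = h11 = 0.0      # its 2x2 symmetric Hessian
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(a + b * x)))
            g0 += p - y
            g1 += (p - y) * x
            w = p * (1.0 - p)
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: subtract (Hessian inverse) times gradient.
        a -= (h11 * g0 - h01 * g1) / det
        b -= (h00 * g1 - h01 * g0) / det
    return a, b

# Hypothetical outcomes that become more likely as x grows, with some overlap.
xs = [0, 1, 2, 3, 4, 5]
ys = [0, 0, 1, 0, 1, 1]
a, b = logistic_fit(xs, ys)
print(a, b)
```

The inverse of the Hessian at the optimum is also the usual estimate of the parameter covariance, which is consistent with the SymmetricMatrix.Inverse() reference listed above.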
void Load (IDataReader reader, int xIndex, int yIndex) | inline |
Loads values from a data reader.
reader | The data reader. |
xIndex | The column number of the x-variable. |
yIndex | The column number of the y-variable. |
SampleStorage xData | private |
SampleStorage yData | private |
bool isReadOnly | private |
bool IsReadOnly | get |
Gets a value indicating whether the bivariate sample is read-only.
Sample X | get |
Gets a read-only univariate sample consisting of the x-values of the data points.
Use this property to obtain information specific to the x-values, such as their Sample.Median or Sample.Variance.
Note that this is a fast, O(1) operation, which does not create an independent copy of the data. The advantage of this is that you can access x-data as a Sample as often as you like without worrying about performance. The disadvantage of this is that the returned sample cannot be altered. If you need to alter x-data independent of the bivariate sample, use the Sample.Copy method to obtain an independent copy.
Referenced by Test.SampleTest.BetaFitUncertainty(), Test.BivariateSampleTest.BivariateLinearRegression(), Test.ContingencyTableTest.ContingencyTableProbabilitiesAndUncertainties(), Test.SampleTest.GammaFitUncertainty(), Test.BivariateSampleTest.LinearLogisticRegression(), Test.SampleTest.NormalFitUncertainties(), Test.SampleTest.WaldFitUncertainties(), and Test.SampleTest.WeibullFitUncertainties().
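The trade-off between a read-only view and an independent copy can be illustrated with a small Python sketch. It models the design idea only and says nothing about how the library stores its data; ReadOnlyView and PairStore are hypothetical names.

```python
from collections.abc import Sequence

class ReadOnlyView(Sequence):
    """Window onto a list owned by someone else; creating it copies nothing."""
    def __init__(self, data):
        self._data = data          # shared reference, O(1)
    def __len__(self):
        return len(self._data)
    def __getitem__(self, index):
        return self._data[index]
    # No __setitem__ or append: the view cannot be used to alter the data.

class PairStore:
    """Holds paired values and exposes each column as a read-only view."""
    def __init__(self):
        self._x = []
        self._y = []
    def add(self, x, y):
        self._x.append(x)
        self._y.append(y)
    @property
    def x(self):
        return ReadOnlyView(self._x)

store = PairStore()
store.add(1.0, 10.0)
view = store.x                 # cheap: no data copied
store.add(2.0, 20.0)           # the view sees later additions
independent = list(view)       # explicit copy, safe to modify
independent.append(99.0)
print(len(view), len(independent))
```

Taking the view repeatedly costs nothing, while the explicit copy is the only object that can be changed without affecting the store.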
Sample Y | get |
Gets a read-only univariate sample consisting of the y-values of the data points.
Use this property to obtain information specific to the y-values, such as their Sample.Median or Sample.Variance.
Note that this is a fast, O(1) operation, which does not create an independent copy of the data. The advantage of this is that you can access y-data as a Sample as often as you like without worrying about performance. The disadvantage of this is that the returned sample cannot be altered. If you need to alter y-data independent of the bivariate sample, use the Sample.Copy method to obtain an independent copy.
Referenced by Test.SampleTest.BetaFitUncertainty(), Test.BivariateSampleTest.BivariateLinearRegression(), Test.ContingencyTableTest.ContingencyTableProbabilitiesAndUncertainties(), Test.SampleTest.GammaFitUncertainty(), Test.BivariateSampleTest.LinearLogisticRegression(), Test.SampleTest.NormalFitUncertainties(), Test.SampleTest.WaldFitUncertainties(), and Test.SampleTest.WeibullFitUncertainties().
int Count | get |
Gets the number of data points.
Referenced by Test.BivariateSampleTest.BivariateLinearRegression(), Test.BivariateSampleTest.BivariateSampleCopy(), Test.BivariateSampleTest.BivariateSampleManipulations(), Test.BivariateSampleTest.LinearLogisticRegression(), Test.SampleTest.NormalFitUncertainties(), Test.MultivariateSampleTest.PairedStudentTTest(), and Test.SampleTest.WaldFitUncertainties().
double Covariance | get |
Gets the covariance of the two variables.
Referenced by Test.MultivariateSampleTest.MultivariateMoments().
double CorrelationCoefficient | get |
Gets the correlation coefficient between the two variables.
UncertainValue PopulationCovariance | get |
Estimates the population covariance of the two variables.
Referenced by Test.BivariateSampleTest.BivariateLinearRegression(), Test.BivariateSampleTest.BivariatePolynomialRegression(), Test.BivariateSampleTest.LinearLogisticRegression(), Test.SampleTest.NormalFitUncertainties(), and Test.SampleTest.WaldFitUncertainties().
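The difference between a covariance that describes the sample itself and an estimate of the covariance of the underlying population is usually a matter of normalization. The Python sketch below assumes the common convention (divide by n for the sample, by n - 1 for the population estimate); this page does not state which convention the properties follow, so treat that as an assumption. The function names are hypothetical, and the uncertainty that an UncertainValue would carry is not computed.

```python
def sample_covariance(xs, ys):
    """Covariance of the data as given: average product of deviations (1/n)."""
    n = len(xs)
    if n != len(ys) or n == 0:
        raise ValueError("xs and ys must be non-empty and of equal length")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def population_covariance_estimate(xs, ys):
    """Bias-corrected estimate of the population covariance (1/(n - 1))."""
    n = len(xs)
    if n < 2:
        raise ValueError("at least two data points are required")
    return sample_covariance(xs, ys) * n / (n - 1)

xs = [1, 2, 3, 4]
ys = [2, 4, 6, 8]
cov = sample_covariance(xs, ys)
pop = population_covariance_estimate(xs, ys)
print(cov, pop)
```

The two values converge as the number of data points grows, so the distinction matters mainly for small samples.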