Library "DataCorrelation"
Implementation of functions for data correlation calculations. The formulas have been rearranged so that no loops are needed; instead, time series are used to gradually build up the values required for each calculation. This allows the calculations to run on unbounded series and/or on a larger number of samples.
The Chatterjee correlation and Spearman correlation functions use the BinaryInsertionSort library to speed up sorting. That library inserts each new value directly into its sorted position, which greatly reduces the sorting load and lets these functions work with larger sample sizes.
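The insertion idea described above can be sketched as follows. This is a Python illustration, not the BinaryInsertionSort library's Pine Script code; the class name `SortedWindow` and its `size` parameter are hypothetical names chosen for the example.

```python
from bisect import bisect_left, insort
from collections import deque


class SortedWindow:
    """Keeps the last `size` values in sorted order (hypothetical helper)."""

    def __init__(self, size):
        self.size = size
        self.arrival = deque()  # values in arrival order, oldest first
        self.ordered = []       # the same values, kept sorted

    def push(self, value):
        # Drop the oldest value once the window is full.
        if len(self.arrival) == self.size:
            oldest = self.arrival.popleft()
            del self.ordered[bisect_left(self.ordered, oldest)]
        self.arrival.append(value)
        # Binary search for the slot, then insert; no full re-sort is needed.
        insort(self.ordered, value)
        return self.ordered


window = SortedWindow(3)
for v in [5, 1, 4, 2]:
    window.push(v)
print(window.ordered)
```

Each new value costs one binary search plus one insertion, rather than a full sort of the window on every update.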
🎲 Function Documentation
chatterjeeCorrelation(x, y, sampleSize, plotSize)
  Calculates the Chatterjee correlation between two series. Formula is ξₙ(x, y) = 1 - (3 * ∑ |rᵢ₊₁ - rᵢ|) / (n² - 1)
  Parameters:
    x: First series for which the correlation needs to be calculated
    y: Second series for which the correlation needs to be calculated
    sampleSize: Number of samples to be considered for the calculation of the correlation. Default is 20000
    plotSize: How many historical values need to be plotted on the chart.
  Returns: float correlation - Chatterjee correlation value if it falls within plotSize, otherwise na
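As a minimal sketch of the formula only, the following Python function computes ξₙ for two equal-length lists. It assumes there are no tied values, uses a plain sort rather than the library's incremental approach, and the function name `chatterjee` is chosen for illustration.

```python
def chatterjee(x, y):
    """Chatterjee xi for equal-length sequences, assuming no tied values."""
    n = len(x)
    # Indices of the pairs, ordered by ascending x.
    order = sorted(range(n), key=lambda i: x[i])
    # Rank of each y value (1 = smallest); valid only without ties.
    rank = {value: k + 1 for k, value in enumerate(sorted(y))}
    r = [rank[y[i]] for i in order]
    # Sum of absolute differences between consecutive ranks.
    total = sum(abs(r[i + 1] - r[i]) for i in range(n - 1))
    return 1 - 3 * total / (n * n - 1)


# For a strictly increasing relation the sum equals n - 1,
# so the result is 1 - 3 / (n + 1).
print(chatterjee([1, 2, 3, 4, 5], [1, 4, 9, 16, 25]))
```

On small samples even a perfectly monotonic relation scores well below 1; the value approaches 1 only as n grows, which is one reason a large sampleSize matters for this measure.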
spearmanCorrelation(x, y, sampleSize, plotSize)
  Calculates the Spearman correlation between two series. Formula is ρ = 1 - (6 * ∑dᵢ²) / (n(n² - 1))
  Parameters:
    x: First series for which the correlation needs to be calculated
    y: Second series for which the correlation needs to be calculated
    sampleSize: Number of samples to be considered for the calculation of the correlation. Default is 20000
    plotSize: How many historical values need to be plotted on the chart.
  Returns: float correlation - Spearman correlation value if it falls within plotSize, otherwise na
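The Spearman formula can be illustrated with a short Python sketch, where dᵢ is the difference between the rank of xᵢ and the rank of yᵢ. It assumes no tied values (this form of the formula is exact only without ties), and the helper names `ranks` and `spearman` are chosen for the example.

```python
def ranks(values):
    """Rank of each element (1 = smallest), assuming no tied values."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0] * len(values)
    for position, index in enumerate(order, start=1):
        result[index] = position
    return result


def spearman(x, y):
    """Spearman rho computed from squared rank differences."""
    n = len(x)
    d_squared = sum((a - b) ** 2 for a, b in zip(ranks(x), ranks(y)))
    return 1 - 6 * d_squared / (n * (n * n - 1))


print(spearman([1, 2, 3, 4, 5], [10, 20, 30, 40, 50]))  # identical ordering
print(spearman([1, 2, 3, 4, 5], [50, 40, 30, 20, 10]))  # reversed ordering
```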
covariance(x, y, include, biased)
  Calculates the covariance between two series of unbounded length. Formula is Covₓᵧ = ∑ ((xᵢ - x̄)(yᵢ - ȳ)) / (n - 1) for sample and Covₓᵧ = ∑ ((xᵢ - x̄)(yᵢ - ȳ)) / n for population
  Parameters:
    x: First series for which the covariance needs to be calculated
    y: Second series for which the covariance needs to be calculated
    include: Boolean flag used to selectively include samples
    biased: Boolean flag; when true, population covariance is calculated instead of sample covariance
  Returns: float covariance - covariance of the selected samples of the two series x, y
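One way to compute covariance over a series of unbounded length without looping over history is to keep running sums and use the identity ∑(xᵢ - x̄)(yᵢ - ȳ) = ∑xᵢyᵢ - n·x̄·ȳ. The Python sketch below shows that idea under stated assumptions: the class name `RunningCovariance` is hypothetical, and this is not necessarily how the library accumulates its values.

```python
class RunningCovariance:
    """Accumulates sums one observation at a time (hypothetical helper)."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xy = 0.0

    def update(self, x, y, include=True):
        # Observations flagged include=False are skipped entirely.
        if include:
            self.n += 1
            self.sum_x += x
            self.sum_y += y
            self.sum_xy += x * y

    def value(self, biased=False):
        # biased=True divides by n (population); otherwise by n - 1 (sample).
        divisor = self.n if biased else self.n - 1
        if divisor <= 0:
            return None  # not enough data yet
        mean_x = self.sum_x / self.n
        mean_y = self.sum_y / self.n
        return (self.sum_xy - self.n * mean_x * mean_y) / divisor


cov = RunningCovariance()
for x, y in zip([1, 2, 3, 4], [2, 4, 6, 8]):
    cov.update(x, y)
print(cov.value(), cov.value(biased=True))
```

Only four numbers are stored regardless of how many observations have been seen. For very large values the sum-based identity can lose floating-point precision, so a production version might prefer a numerically stabler update.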
stddev(x, include, biased)
  Calculates the standard deviation of a series. Formula is σ = √( ∑(xᵢ - x̄)² / (n - 1) ) for sample and σ = √( ∑(xᵢ - x̄)² / n ) for population
  Parameters:
    x: Series for which the standard deviation needs to be calculated
    include: Boolean flag used to selectively include samples
    biased: Boolean flag; when true, population standard deviation is calculated instead of sample standard deviation
  Returns: float stddev - standard deviation of the selected samples of series x
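The effect of the include and biased flags can be seen in a small Python sketch, using the usual convention that the biased (population) estimate divides by n and the unbiased (sample) estimate divides by n - 1. The function name `std_dev` and the default values of its arguments are assumptions made for the example, not the library's defaults.

```python
import math


def std_dev(values, include=None, biased=True):
    """Standard deviation of the values whose include flag is true."""
    if include is None:
        include = [True] * len(values)
    selected = [v for v, keep in zip(values, include) if keep]
    n = len(selected)
    divisor = n if biased else n - 1
    if divisor <= 0:
        return None  # not enough selected samples
    mean = sum(selected) / n
    squared_deviations = sum((v - mean) ** 2 for v in selected)
    return math.sqrt(squared_deviations / divisor)


data = [2, 4, 4, 4, 5, 5, 7, 9]
print(std_dev(data, biased=True))   # divides by n
print(std_dev(data, biased=False))  # divides by n - 1
```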
correlation(x, y, include)
  Calculates the Pearson correlation between two series of unbounded length. Formula is r = Covₓᵧ / (σₓσᵧ)
  Parameters:
    x: First series for which the correlation needs to be calculated
    y: Second series for which the correlation needs to be calculated
    include: Boolean flag used to selectively include samples
  Returns: float correlation - correlation between the selected samples of the two series x, y
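Because the n (or n - 1) divisors cancel in Covₓᵧ / (σₓσᵧ), Pearson's r can be computed from five running sums. The Python sketch below illustrates that rearrangement; the function name `pearson` is chosen for the example, and the code is an illustration of the formula rather than the library's Pine Script implementation.

```python
import math


def pearson(xs, ys, include=None):
    """Pearson r over the pairs whose include flag is true."""
    if include is None:
        include = [True] * len(xs)
    n = 0
    sum_x = sum_y = sum_xy = sum_xx = sum_yy = 0.0
    # A single pass that only updates running sums.
    for x, y, keep in zip(xs, ys, include):
        if keep:
            n += 1
            sum_x += x
            sum_y += y
            sum_xy += x * y
            sum_xx += x * x
            sum_yy += y * y
    numerator = n * sum_xy - sum_x * sum_y
    denominator = math.sqrt(
        (n * sum_xx - sum_x ** 2) * (n * sum_yy - sum_y ** 2)
    )
    if denominator == 0:
        return None  # undefined when either series is constant or empty
    return numerator / denominator


print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))
print(pearson([1, 2, 3, 4], [8, 6, 4, 2]))
```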