Kolmogorov-Smirnov test: what is it and how is it used?

In this article we will analyze one of the most used tests in inferential statistics: the Kolmogorov-Smirnov test. What is it for and how can we use it? Stay and find out!

Parametric and non-parametric tests are widely used in the field of inferential statistics. Among the non-parametric tests we find the Kolmogorov-Smirnov test, which returns an indicator that helps us decide whether the data from a given sample fit a probability distribution, a decision that in turn affects how the data should be analyzed.

This test arises from the contributions of Andrey Nikolaevich Kolmogorov and Nikolai Vasilyevich Smirnov. Kolmogorov’s contribution addresses the one-sample problem, while Smirnov’s addresses the two-sample problem: testing the hypothesis that the two samples come from the same population.

What is the Kolmogorov-Smirnov test?

The Kolmogorov-Smirnov test is a non-parametric goodness-of-fit test used to obtain an indicator that gives the researcher an idea of whether two distributions are different or whether an underlying probability distribution differs from a hypothetical distribution (Dodge, 2008).

It is mostly used when a study has two samples drawn from two populations that may differ. Some of the characteristics of this type of non-parametric test are the following (Gómez-Gómez et al., 2003):

The observations are random and independent, except for paired data.
They make few assumptions about the distribution of the population.
The dependent variable is measured on a categorical scale.
The primary point is the ordering by ranks or by frequencies.
The hypotheses are made about ranks, medians or frequencies of the data.
The required sample size is smaller (20 or fewer).

What is it for?

This test helps us:

Verify whether or not the scores we have obtained from our sample follow a normal distribution.
Measure the degree of agreement between the distribution of a set of data and a specific theoretical distribution.
Evaluate which distribution best fits the data.
Test whether our observations come from a specific distribution.
Detect differences in the location and shape of distributions.
Test whether two distributions are sufficiently different from each other when we want to build prediction scenarios.

With the Kolmogorov-Smirnov test, we can compare the cumulative distribution of theoretical frequencies with the cumulative distribution of observed frequencies. The idea is to find the point of maximum divergence between the two and determine the probability that a difference of that magnitude occurs by chance.

How is it calculated?

To calculate it, we start from the largest difference (in absolute value) between the observed cumulative distribution of the sample and the theoretical distribution. If the fit is good, that is, if this largest difference is small enough, we can reasonably assume that the observations may correspond to the specified distribution (Gómez-Gómez et al., 2003).
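In symbols, the largest difference described above is usually written as follows. The notation is an assumption added here for illustration and does not appear in the cited sources: n is the sample size, F_n is the empirical (observed) cumulative distribution and F is the theoretical one.

```latex
F_n(x) = \frac{1}{n} \sum_{i=1}^{n} \mathbf{1}\{x_i \le x\},
\qquad
D_n = \sup_{x} \left| F_n(x) - F(x) \right|
```

A small value of D_n means the observed and theoretical curves stay close everywhere; a large value means they separate at some point.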

If what we intend is to compare the empirical distribution function of the observed data, with the cumulative distribution function associated with the null hypothesis, the steps are as follows (Kawwa, 2020):

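Arrange the observations in ascending order.
Calculate the empirical distribution function of the observations.
For each observation x_i, calculate F_exp(x_i) = P(Z ≤ x_i), the cumulative probability expected under the null hypothesis.
Calculate the absolute differences between the empirical and the expected values.
Record the maximum difference.
Find the critical value for the chosen significance level and sample size.
Reject or fail to reject the null hypothesis.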
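The one-sample steps can be sketched in a few lines of Python. This is a minimal illustration using only the standard library, not a reference implementation: the function names, the sample values and the choice of a standard normal as the null distribution are all assumptions made for the example. The critical value uses the common large-sample approximation 1.358 / sqrt(n) for a 5% significance level; for small samples, a published table of exact critical values is the safer choice.

```python
import math


def normal_cdf(x, mu=0.0, sigma=1.0):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)                    # ascending order
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        expected = cdf(x)                  # F_exp(x_i) under the null hypothesis
        # The empirical CDF jumps from (i - 1)/n to i/n at x_i,
        # so the gap is checked on both sides of the jump.
        d = max(d, i / n - expected, expected - (i - 1) / n)
    return d


def ks_critical_value(n, c_alpha=1.358):
    """Large-sample approximation; c_alpha = 1.358 corresponds to alpha = 0.05."""
    return c_alpha / math.sqrt(n)


# Hypothetical scores, invented for illustration only.
scores = [-1.2, -0.8, -0.3, 0.0, 0.1, 0.4, 0.7, 1.1, 1.5, 2.0]
d = ks_statistic(scores, normal_cdf)
reject = d > ks_critical_value(len(scores))
print(f"D = {d:.3f}, reject H0 at the 5% level: {reject}")
```

Checking the gap on both sides of each jump matters because the empirical distribution is a step function: the largest distance to the theoretical curve can occur just before an observation as well as at it.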

If we want to test whether two samples are drawn from the same distribution, the steps are as follows (Kawwa, 2020):

Sort each sample.
Concatenate them into an ordered array.
Calculate the observed cumulative distribution functions of the two samples.
Calculate their maximum absolute difference.
Compare the result with the critical value.
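A minimal sketch of the two-sample procedure, again using only the Python standard library. The function names and the two groups of values are hypothetical, and the critical value relies on the usual large-sample approximation c * sqrt((n + m) / (n * m)) with c = 1.358 for a 5% significance level, so it should be read as approximate for samples this small.

```python
import math
from bisect import bisect_right


def ks_two_sample(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)   # sort each sample
    pooled = sorted(a + b)                      # one ordered array
    d = 0.0
    for x in pooled:
        # Fraction of each sample that is less than or equal to x.
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d


def two_sample_critical_value(n, m, c_alpha=1.358):
    """Large-sample approximation; c_alpha = 1.358 corresponds to alpha = 0.05."""
    return c_alpha * math.sqrt((n + m) / (n * m))


# Hypothetical measurements, invented for illustration only.
group_a = [2.1, 2.4, 2.9, 3.3, 3.8, 4.0, 4.4, 5.1]
group_b = [3.9, 4.2, 4.8, 5.0, 5.5, 5.9, 6.3, 7.0]
d = ks_two_sample(group_a, group_b)
critical = two_sample_critical_value(len(group_a), len(group_b))
print(f"D = {d:.3f}, critical value = {critical:.3f}, reject H0: {d > critical}")
```

Both empirical distributions only change at observed values, so evaluating them at every point of the pooled array is enough to find the maximum difference.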

The test assumes that the parameters of the reference distribution have been specified in advance. Some software procedures instead estimate the parameters from the sample, for example by taking the sample mean and standard deviation as the parameters of a normal distribution. In that case the result should be interpreted with caution, for the reason described in the limitations below.

Limitations of the test

One of the limitations of the Kolmogorov-Smirnov test is that, for it to work, the location, scale and shape parameters of the distribution must be specified in advance. If these parameters are estimated from the data, the standard critical values no longer apply and the test is invalidated. Therefore, if we do not know what these parameters are, it is better to apply a less formal test.
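A small simulation can make this limitation concrete. The sketch below is an illustration under stated assumptions, not a formal proof: it draws samples that really are standard normal, applies the test once with the true parameters and once with the mean and standard deviation estimated from the same sample, and counts how often each version rejects at a nominal 5% level. The sample size, number of trials, seed and function names are arbitrary choices for the example, and the critical value is the large-sample approximation 1.358 / sqrt(n).

```python
import math
import random
import statistics


def normal_cdf(x, mu, sigma):
    """Cumulative distribution function of a normal distribution."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def ks_statistic(sample, cdf):
    """Largest absolute gap between the empirical CDF of `sample` and `cdf`."""
    xs = sorted(sample)
    n = len(xs)
    return max(
        max(i / n - cdf(x), cdf(x) - (i - 1) / n)
        for i, x in enumerate(xs, start=1)
    )


def rejection_rates(n=50, trials=500, seed=0):
    """Share of truly normal samples rejected with known vs. estimated parameters."""
    rng = random.Random(seed)
    critical = 1.358 / math.sqrt(n)      # approximate 5% critical value
    known = estimated = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        # Parameters fixed in advance, as the test assumes.
        if ks_statistic(sample, lambda x: normal_cdf(x, 0.0, 1.0)) > critical:
            known += 1
        # Parameters estimated from the same sample.
        mu, sigma = statistics.mean(sample), statistics.stdev(sample)
        if ks_statistic(sample, lambda x: normal_cdf(x, mu, sigma)) > critical:
            estimated += 1
    return known / trials, estimated / trials


rate_known, rate_estimated = rejection_rates()
print(f"known parameters: {rate_known:.3f}, estimated parameters: {rate_estimated:.3f}")
```

With known parameters the rejection rate should land near the nominal 5%. With estimated parameters the fitted curve is pulled toward the data, so the statistic shrinks and the test rejects far less often than its nominal level suggests, which is why the usual critical values cannot be trusted in that situation.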

Another limitation is that it generally cannot be used for discrete distributions, especially if you are using software, as most software packages do not have the necessary extensions for the Kolmogorov-Smirnov test and manual calculations are complicated.

All cited sources were reviewed in depth by our team to ensure their quality, reliability and validity. The bibliography in this article was considered reliable and of academic or scientific accuracy.

Dodge, Y. (2008). Kolmogorov–Smirnov Test. The concise encyclopedia of statistics (pp. 283-287). https://doi.org/10.1007/978-0-387-32833-1_214
Gómez-Gómez, M., Danglot-Banck, C., & Vega-Franco, L. (2003). Synopsis of non-parametric statistical tests. When to use them. Mexican journal of pediatrics, 70(2), 91-99. https://www.medigraphic.com/pdfs/pediat/sp-2003/sp032i.pdf
Kawwa, N. (2020, February 14). When to Use the Kolmogorov-Smirnov Test. Towards Data Science. https://towardsdatascience.com/when-to-use-the-kolmogorov-smirnov-test-dd0b2c8a8f61
