In statistics, a bimodal distribution is a continuous probability distribution with two different modes. These appear as distinct peaks (local maxima) in the probability density function, as shown in Figure 1.
More generally, a multimodal distribution is a continuous probability distribution with two or more modes, as illustrated in Figure 3.
When the two modes are unequal the larger mode is known as the major mode and the other as the minor mode. The least frequent value between the modes is known as the antimode. The difference between the major and minor modes is known as the amplitude. In time series the major mode is called the acrophase and the antimode the batiphase.
Galtung introduced a classification system (AJUS) for distributions.[2]
This classification has since been modified slightly:
Under this classification bimodal distributions are classified as type S or U.
Examples of variables with bimodal distributions include the time between eruptions of certain geysers, the color of galaxies, the size of worker weaver ants, the age of incidence of Hodgkin's lymphoma, the speed of inactivation of the drug isoniazid in US adults, the absolute magnitude of novae, and the circadian activity patterns of those crepuscular animals that are active both in morning and evening twilight. In fishery science, multimodal length distributions reflect the different year classes and can thus be used for age distribution and growth estimates of the fish population.[3] Sediments are usually distributed in a bimodal fashion.
Important bimodal distributions include the arcsine distribution and the beta distribution (the latter when both of its shape parameters are less than 1). Another bimodal distribution is the U-quadratic distribution.
The ratio of two normal distributions is also bimodally distributed. Let
$$R = \frac{a + x}{b + y}$$
where a and b are constants and x and y are distributed as normal variables with a mean of 0 and a standard deviation of 1. R has a known density that can be expressed as a confluent hypergeometric function.[4]
The distribution of the reciprocal of a t distributed random variable is bimodal when the degrees of freedom are more than one. Similarly the reciprocal of a normally distributed variable is also bimodally distributed.
A bimodal distribution most commonly arises as a mixture of two different unimodal distributions (i.e. distributions having only one mode). In other words, the bimodally distributed random variable X is defined as Y with probability α or Z with probability (1 − α), where Y and Z are unimodal random variables and 0 < α < 1 is a mixture coefficient. For example, the bimodal distribution of sizes of weaver ant workers shown in Figure 2 arises due to the existence of two distinct classes of workers, namely major workers and minor workers.[1] In this case, Y would be the size of a random major worker, Z the size of a random minor worker, and α the proportion of worker weaver ants that are major workers.
Mixtures with two distinct components need not be bimodal, and two component mixtures of unimodal component densities can have more than two modes. Therefore, there is no immediate connection between the number of components in a mixture and the number of modes of the resulting density.
A mixture of two normal distributions has five parameters to estimate: the two means, the two variances and the mixing parameter. A mixture of two normal distributions with equal standard deviations is bimodal only if their means differ by at least twice the common standard deviation.[5] Estimation of the parameters is simplified if the variances can be assumed to be equal (the homoscedastic case).
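The two-standard-deviation threshold can be checked numerically. The following is a minimal sketch, not a formal test: it evaluates the density of an equal-variance mixture of two normals on a fine grid and counts local maxima. The function names, the grid size and the choice of equal weights in the demonstration are assumptions made for illustration.

```python
import math

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def mixture_pdf(x, p, mu1, mu2, sigma):
    """Density of p*N(mu1, sigma^2) + (1 - p)*N(mu2, sigma^2)."""
    return p * normal_pdf(x, mu1, sigma) + (1.0 - p) * normal_pdf(x, mu2, sigma)

def count_modes(p, mu1, mu2, sigma, n_grid=4001):
    """Count local maxima of the mixture density on an evenly spaced grid."""
    lo = min(mu1, mu2) - 4.0 * sigma
    hi = max(mu1, mu2) + 4.0 * sigma
    step = (hi - lo) / (n_grid - 1)
    ys = [mixture_pdf(lo + i * step, p, mu1, mu2, sigma) for i in range(n_grid)]
    # A grid point is a peak if it rises from the left and does not rise to the right.
    return sum(1 for i in range(1, n_grid - 1) if ys[i - 1] < ys[i] >= ys[i + 1])

# Equal weights, unit standard deviation, means 1 and 3 standard deviations apart.
for separation in (1.0, 3.0):
    print(separation, count_modes(0.5, 0.0, separation, 1.0))
```

With unequal weights a larger separation can be needed before a second peak appears, so the threshold should be read as a necessary condition rather than a sufficient one.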
If the means of the two normal distributions are equal, the combined distribution is clearly unimodal. Conditions for unimodality of the combined distribution were derived by Eisenberger.[6] Necessary and sufficient conditions for a mixture of normal distributions to be bimodal have been identified by Ray and Lindsay.[7]
Mixtures of other distributions require additional parameters to be estimated.
A mixture of two unimodal distributions with differing means is not necessarily bimodal. The combined distribution of heights of men and women is sometimes used as an example of a bimodal distribution, but in fact the difference in mean heights of men and women is too small relative to their standard deviations to produce bimodality.[5]
Bimodal distributions have the peculiar property that - unlike the unimodal distributions - the mean may be a more robust sample estimator than the median.[8] This is clearly the case when the distribution is U shaped like the arcsine distribution. It may not be true when the distribution has one or more long tails.
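Let
$$f(x) = p\,g_1(x) + (1 - p)\,g_2(x)$$
where gi is a probability distribution and p is the mixing parameter.
The moments of f(x) are[9]
$$\mu = p\,\mu_1 + (1 - p)\,\mu_2$$
$$\nu_2 = p\left[\sigma_1^2 + \delta_1^2\right] + (1 - p)\left[\sigma_2^2 + \delta_2^2\right]$$
$$\nu_3 = p\left[S_1\sigma_1^3 + 3\delta_1\sigma_1^2 + \delta_1^3\right] + (1 - p)\left[S_2\sigma_2^3 + 3\delta_2\sigma_2^2 + \delta_2^3\right]$$
$$\nu_4 = p\left[K_1\sigma_1^4 + 4S_1\delta_1\sigma_1^3 + 6\delta_1^2\sigma_1^2 + \delta_1^4\right] + (1 - p)\left[K_2\sigma_2^4 + 4S_2\delta_2\sigma_2^3 + 6\delta_2^2\sigma_2^2 + \delta_2^4\right]$$
where
$$\mu = \int x f(x)\,dx, \qquad \delta_i = \mu_i - \mu, \qquad \nu_r = \int (x - \mu)^r f(x)\,dx$$
and μi, σi, Si and Ki are the mean, standard deviation, skewness and kurtosis of the ith distribution.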
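As an illustration, the central moments of a two-component mixture can be computed from the component moments by expanding each component about the mixture mean. The sketch below assumes that expansion, with the offset of each component mean from the mixture mean written as `d` and with K taken as the non-excess kurtosis (3 for a normal). The function name and the tuple layout are hypothetical.

```python
def mixture_central_moments(p, comp1, comp2):
    """Return (mean, nu2, nu3, nu4) of f = p*g1 + (1 - p)*g2.

    Each component is a tuple (mu, sigma, S, K): its mean, standard
    deviation, skewness and non-excess kurtosis.
    """
    mu = p * comp1[0] + (1.0 - p) * comp2[0]
    nu2 = nu3 = nu4 = 0.0
    for weight, (mu_i, sigma_i, s_i, k_i) in ((p, comp1), (1.0 - p, comp2)):
        d = mu_i - mu  # offset of the component mean from the mixture mean
        nu2 += weight * (sigma_i**2 + d**2)
        nu3 += weight * (s_i * sigma_i**3 + 3 * d * sigma_i**2 + d**3)
        nu4 += weight * (k_i * sigma_i**4 + 4 * s_i * d * sigma_i**3
                         + 6 * d**2 * sigma_i**2 + d**4)
    return mu, nu2, nu3, nu4

# Equal-weight mixture of N(-2, 1) and N(2, 1); a normal has S = 0 and K = 3.
mu, nu2, nu3, nu4 = mixture_central_moments(
    0.5, (-2.0, 1.0, 0.0, 3.0), (2.0, 1.0, 0.0, 3.0))
print(mu, nu2, nu3, nu4, nu4 / nu2**2)  # last value is the mixture kurtosis
```

In this example the mixture kurtosis is 43/25 = 1.72, well below the value of 3 for a single normal distribution.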
Bimodal distributions are a commonly used example of how summary statistics such as the mean, median, and standard deviation can be deceptive when used on an arbitrary distribution. For example, in the distribution in Figure 1, the mean and median would be about zero, even though zero is not a typical value. The standard deviation is also larger than the standard deviation of each normal distribution.
Although several have been suggested, there is at present no generally agreed summary statistic (or set of statistics) to quantify the parameters of a general bimodal distribution. For a mixture of two normal distributions, the means and standard deviations along with the mixing parameter (the weight for the combination) are usually used, a total of five parameters.
A statistic that may be useful is Ashman's D:[10]
$$D = \frac{\sqrt{2}\,\left|\mu_1 - \mu_2\right|}{\sqrt{\sigma_1^2 + \sigma_2^2}}$$
where μ1, μ2 are the means and σ1, σ2 are the standard deviations.
For a mixture of two normal distributions D > 2 is required for a clean separation of the distributions.
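A minimal sketch of the calculation, assuming the usual form D = √2 |μ1 − μ2| / √(σ1² + σ2²); the function name is hypothetical.

```python
import math

def ashman_d(mu1, sigma1, mu2, sigma2):
    """Ashman's D = sqrt(2) * |mu1 - mu2| / sqrt(sigma1^2 + sigma2^2)."""
    return math.sqrt(2.0) * abs(mu1 - mu2) / math.sqrt(sigma1**2 + sigma2**2)

# With equal standard deviations D reduces to |mu1 - mu2| / sigma.
d = ashman_d(0.0, 1.0, 3.0, 1.0)
print(d, d > 2.0)
```

Under this definition, two components with a common standard deviation are cleanly separated (D > 2) when their means are more than two standard deviations apart.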
The bimodality index assumes that the distribution is a sum of two normal distributions with equal variances but differing means.[11] It is defined as follows:
$$\delta = \frac{\left|\mu_1 - \mu_2\right|}{\sigma}$$
where μ1, μ2 are the means and σ is the common standard deviation.
$$BI = \delta \sqrt{p(1 - p)}$$
where p is the mixing parameter.
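The index is straightforward to compute once the mixture parameters have been estimated. The sketch below assumes the form BI = δ √(p(1 − p)) with δ = |μ1 − μ2| / σ; the function name is hypothetical.

```python
import math

def bimodality_index(mu1, mu2, sigma, p):
    """Bimodality index: standardised distance between the means, scaled by sqrt(p*(1 - p))."""
    delta = abs(mu1 - mu2) / sigma
    return delta * math.sqrt(p * (1.0 - p))

# The index is largest for balanced mixtures and shrinks as one component dominates.
print(bimodality_index(0.0, 4.0, 1.0, 0.5))
print(bimodality_index(0.0, 4.0, 1.0, 0.9))
```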
Sarle's bimodality coefficient b is[12]
$$b = \frac{\gamma^2 + 1}{\kappa}$$
where γ is the skewness and κ is the kurtosis. The kurtosis is here defined to be the standardised fourth moment around the mean. The value of b lies between 0 and 1.[13] The logic behind this coefficient is that a bimodal distribution will have very low kurtosis, an asymmetric character, or both, all of which increase this coefficient.
The formula for a finite sample is[14]
$$b = \frac{g^2 + 1}{k + \dfrac{3(n - 1)^2}{(n - 2)(n - 3)}}$$
where n is the number of items in the sample, g is the sample skewness and k is the sample excess kurtosis.
The value of b for the uniform distribution is 5/9. This is also its value for the exponential distribution. Values greater than 5/9 may indicate a bimodal or multimodal distribution. The maximum value (1.0) is reached only by a Bernoulli distribution with only two distinct values or the sum of two different Dirac delta functions.
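These reference values can be reproduced directly. The sketch below assumes the population form b = (γ² + 1) / κ and the finite-sample form b = (g² + 1) / (k + 3(n − 1)² / ((n − 2)(n − 3))). The function names are hypothetical, and the sample skewness and excess kurtosis are taken as inputs rather than estimated, because the choice of estimator is not specified here.

```python
def sarle_b(skewness, kurtosis):
    """Population bimodality coefficient; kurtosis is the non-excess fourth standardised moment."""
    return (skewness**2 + 1.0) / kurtosis

def sarle_b_sample(n, g, k_excess):
    """Finite-sample coefficient from sample skewness g and sample excess kurtosis k_excess (n > 3)."""
    correction = 3.0 * (n - 1) ** 2 / ((n - 2) * (n - 3))
    return (g**2 + 1.0) / (k_excess + correction)

print(sarle_b(0.0, 9.0 / 5.0))  # uniform: skewness 0, kurtosis 9/5
print(sarle_b(2.0, 9.0))        # exponential: skewness 2, kurtosis 9
print(sarle_b(0.0, 1.0))        # symmetric two-point distribution: kurtosis 1
```

As n grows the correction term tends to 3, so the finite-sample form approaches the population form written with excess kurtosis.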
The distribution of this statistic is unknown. It is related to a statistic proposed earlier by Pearson - the difference between the kurtosis and the square of the skewness (vide infra).
A study of data drawn from a mixture density of two normal distributions found that separation into the two normal distributions was difficult unless the means were separated by 4-6 standard deviations.[15]
A necessary but not sufficient condition for a symmetrical distribution to be bimodal is that the kurtosis be less than three.[16][17] Here the kurtosis is defined to be the standardised fourth moment around the mean. The reference given prefers to use the excess kurtosis - the kurtosis less 3.
Pearson in 1894 was the first to devise a procedure to test whether a distribution could be resolved into two normal distributions.[18] This method required the solution of a ninth order polynomial. In a subsequent paper Pearson reported that for any distribution skewness² + 1 < kurtosis.[13] Later Pearson showed that[19]
$$b_2 - b_1 \geq 1$$
where b2 is the kurtosis and b1 is the square of the skewness. Equality holds only for the two point Bernoulli distribution or the sum of two different Dirac delta functions. These are the most extreme cases of bimodality possible. The kurtosis in both these cases is 1. Since they are both symmetrical their skewness is 0 and the difference is 1.
Baker proposed a transformation to convert a bimodal to a unimodal distribution.[20]
Several tests of unimodality versus bimodality have been proposed: Haldane suggested one based on second central differences.[21] Larkin later introduced a test based on the F test;[22] Bennett created one based on the G test.[23] Tokeshi has proposed a fourth test.[24][25] A test based on a likelihood ratio has been proposed by Holzmann and Vollmer.[26]
Statistical tests for the antimode are known.[27]
To test if a distribution is other than unimodal, several additional tests have been devised: the bandwidth test,[28] the dip test,[29] the excess mass test,[30] the MAP test,[31] the mode existence test,[32] the runt test,[33][34] the span test,[35] and the saddle test.
The dip test is available for use in R.[1] The p-values for the dip statistic range between 0 and 1. P-values less than 0.05 indicate significant bimodality, and p-values greater than 0.05 but less than 0.10 suggest bimodality with marginal significance.
This distribution is bimodal for certain values of its parameters. A test for these values has been described.[36]
In the study of sediments particle size is frequently bimodal. Empirically it has been found useful to plot the frequency against the log( size ) of the particles. This usually gives a clear separation of the particles into a bimodal distribution.
An alternative method is to plot the log of the particle size against the cumulative frequency. This graph will usually consist of two reasonably straight lines with a connecting line corresponding to the antimode.
Silverman introduced a bootstrap method for testing the number of modes.[28] The test uses a fixed bandwidth, which reduces the power of the test and its interpretability. Undersmoothed densities may have an excessive number of modes whose count during bootstrapping is unstable.
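The dependence of the mode count on the bandwidth can be illustrated with a Gaussian kernel density estimate. The sketch below is not Silverman's bootstrap procedure itself, only the mode-counting step that such a procedure relies on. The function name, the grid size and the synthetic data (evenly spaced normal quantiles centred at 0 and 5) are assumptions made for illustration.

```python
import math
from statistics import NormalDist

def kde_mode_count(data, bandwidth, n_grid=1001):
    """Count local maxima of a Gaussian kernel density estimate on a grid."""
    lo = min(data) - 3.0 * bandwidth
    hi = max(data) + 3.0 * bandwidth
    step = (hi - lo) / (n_grid - 1)
    ys = []
    for i in range(n_grid):
        x = lo + i * step
        # The normalising constant is omitted; it does not change the number of peaks.
        ys.append(sum(math.exp(-0.5 * ((x - d) / bandwidth) ** 2) for d in data))
    return sum(1 for i in range(1, n_grid - 1) if ys[i - 1] < ys[i] >= ys[i + 1])

# Deterministic stand-in for a sample: 50 normal quantiles around 0 and 50 around 5.
base = [NormalDist().inv_cdf((i + 0.5) / 50) for i in range(50)]
data = base + [5.0 + x for x in base]

for h in (0.8, 3.0):
    print(h, kde_mode_count(data, h))
```

A large bandwidth smooths the two groups into a single peak, while a smaller one keeps them apart, which is why the choice of bandwidth affects both the power and the interpretation of the test.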
Assuming that the distribution is known to be bimodal or has been shown to be bimodal by one or more of the tests above, it is frequently desirable to fit a curve to the data. This may be difficult.
Bayesian methods may be useful in difficult cases.
A package for R is available for testing for bimodality.[2] This package assumes that the data are distributed as a sum of two normal distributions. If this assumption is not correct the results may not be reliable. It also includes functions for fitting a sum of two normal distributions to the data.
Assuming that the distribution is a mixture of two normal distributions then the expectation-maximization algorithm may be used to determine the parameters. Several programmes are available for this including Cluster.[37]
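A minimal sketch of expectation-maximization for a mixture of two normal distributions is given below. It is not the implementation used by any of the programmes mentioned here. The function names, the starting values (a split of the sorted data into halves), the fixed iteration count and the synthetic data are all assumptions made for illustration, and no safeguards against a collapsing variance are included.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of a normal distribution with mean mu and standard deviation sigma."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def em_two_normals(data, n_iter=200):
    """Fit p*N(mu1, s1^2) + (1 - p)*N(mu2, s2^2); returns (p, mu1, s1, mu2, s2)."""
    xs = sorted(data)
    n = len(xs)
    half = n // 2
    # Crude starting values: means of the lower and upper halves of the sorted data.
    mu1 = sum(xs[:half]) / half
    mu2 = sum(xs[half:]) / (n - half)
    s1 = s2 = (xs[-1] - xs[0]) / 4.0 or 1.0
    p = 0.5
    for _ in range(n_iter):
        # E-step: posterior probability that each observation came from component 1.
        r = []
        for x in xs:
            a = p * normal_pdf(x, mu1, s1)
            b = (1.0 - p) * normal_pdf(x, mu2, s2)
            r.append(a / (a + b))
        # M-step: weighted maximum-likelihood updates of all five parameters.
        n1 = sum(r)
        n2 = n - n1
        mu1 = sum(ri * x for ri, x in zip(r, xs)) / n1
        mu2 = sum((1.0 - ri) * x for ri, x in zip(r, xs)) / n2
        s1 = math.sqrt(sum(ri * (x - mu1) ** 2 for ri, x in zip(r, xs)) / n1)
        s2 = math.sqrt(sum((1.0 - ri) * (x - mu2) ** 2 for ri, x in zip(r, xs)) / n2)
        p = n1 / n
    return p, mu1, s1, mu2, s2

# Deterministic stand-in for a sample: 100 normal quantiles around 0 and 100 around 5.
base = [NormalDist().inv_cdf((i + 0.5) / 100) for i in range(100)]
sample = base + [5.0 + x for x in base]
print(em_two_normals(sample))
```

In practice the result depends on the starting values, so several initialisations are often tried and the fit with the highest likelihood is kept.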
The mixtools package, also available for R, can test for and estimate the parameters of a number of different distributions.[38]
Another package for a mixture of two right tailed gamma distributions is available.[39]
Two other packages for R are available to fit mixture models: flexmix[40] and mcclust.[41]
The statistical programme SAS can also fit a variety of mixed distributions with the command PROCFREQ.