
Fisher's Exact Test


Fisher's exact test is a statistical test used to determine if there are nonrandom associations between two categorical variables.

Let there exist two such variables X and Y, with m and n observed states, respectively. Now form an m×n matrix in which the entry a_(ij) is the number of observations in which X=i and Y=j. Calculate the row and column sums R_i and C_j, respectively, and the total sum

$$N = \sum_{i} R_i = \sum_{j} C_j \tag{1}$$

of the matrix. Then calculate the conditional probability of obtaining the observed matrix for the particular row and column sums, given by

$$P_{\text{cutoff}} = \frac{(R_1!\,R_2!\cdots R_m!)\,(C_1!\,C_2!\cdots C_n!)}{N!\,\prod_{i,j} a_{ij}!}, \tag{2}$$

which is a multivariate generalization of the hypergeometric probability function. Now find all possible matrices of nonnegative integers consistent with the row and column sums R_i and C_j. For each one, calculate the associated conditional probability using (2), where the sum of these probabilities must be 1.
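
The enumeration described above can be sketched in Python. This is a minimal brute-force illustration, practical only for small tables; the names `table_probability`, `enumerate_tables`, and `_compositions` are illustrative and not taken from any library, and the 2×3 margins at the end are hypothetical values chosen only to exercise the code.

```python
from fractions import Fraction
from math import factorial, prod


def table_probability(table):
    """Conditional probability of `table` given its margins, following Eq. (2)."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    numerator = prod(factorial(r) for r in row_sums) * prod(factorial(c) for c in col_sums)
    denominator = factorial(total) * prod(factorial(a) for row in table for a in row)
    return Fraction(numerator, denominator)  # exact rational arithmetic


def _compositions(total, caps):
    """Yield tuples x with sum(x) == total and 0 <= x[j] <= caps[j]."""
    if len(caps) == 1:
        if total <= caps[0]:
            yield (total,)
        return
    for first in range(min(total, caps[0]) + 1):
        for rest in _compositions(total - first, caps[1:]):
            yield (first,) + rest


def enumerate_tables(row_sums, col_sums):
    """Yield every nonnegative integer table with the given row and column sums."""
    if len(row_sums) == 1:
        # The last row is forced: it must use up the remaining column totals.
        if sum(col_sums) == row_sums[0]:
            yield (tuple(col_sums),)
        return
    for row in _compositions(row_sums[0], col_sums):
        remaining = [c - x for c, x in zip(col_sums, row)]
        for rest in enumerate_tables(row_sums[1:], remaining):
            yield (row,) + rest


# Hypothetical 2x3 margins, chosen only for illustration.
tables = list(enumerate_tables([3, 2], [2, 2, 1]))
total_probability = sum(table_probability(t) for t in tables)
print(len(tables), total_probability)
```

Using `Fraction` keeps the probabilities exact, so the check that they sum to 1 does not depend on floating-point rounding.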

To compute the P-value of the test, the tables must then be ordered by some criterion that measures dependence; the probabilities of those tables that deviate from independence at least as much as the observed table are then added together. A variety of criteria can be used to measure dependence. In the 2×2 case, which is the one Fisher considered when he developed the exact test, either the Pearson chi-square or the difference in proportions (which are equivalent) is typically used. Other measures of association, such as the likelihood-ratio statistic G-squared or any of the other measures typically used for association in contingency tables, can also be used.
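
As an illustration of one such ordering criterion, the following Python sketch computes the Pearson chi-square statistic of a table and uses it to sort a few tables that share the same margins. The helper name `pearson_chi_square` is illustrative, the candidate tables are hypothetical, and the sketch assumes every row and column sum is positive so that no expected count is zero.

```python
def pearson_chi_square(table):
    """Pearson statistic: sum over cells of (observed - expected)^2 / expected."""
    row_sums = [sum(row) for row in table]
    col_sums = [sum(col) for col in zip(*table)]
    total = sum(row_sums)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in cell (i, j) if the two variables were independent.
            expected = row_sums[i] * col_sums[j] / total
            statistic += (observed - expected) ** 2 / expected
    return statistic


# Hypothetical tables sharing the margins R = (4, 4), C = (4, 4).
candidates = [[[4, 0], [0, 4]], [[2, 2], [2, 2]], [[3, 1], [1, 3]]]
# Sort from least to greatest deviation from independence.
ordered = sorted(candidates, key=pearson_chi_square)
```

Tables whose statistic is at least as large as that of the observed table are the ones whose probabilities would be added to form the P-value under this criterion.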

The test is most commonly applied to 2×2 matrices, and is computationally unwieldy for large m or n. For tables larger than 2×2, the difference in proportions can no longer be used, but the other measures mentioned above remain applicable (and in practice, the Pearson statistic is most often used to order the tables). In the case of the 2×2 matrix, the P-value of the test can be computed simply by summing the probabilities of all tables whose probability is <=P_(cutoff).
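
For the 2×2 case, fixing the margins leaves a single free cell, so the sum can be sketched in a few lines of Python. This is a minimal sketch rather than a reference implementation: the function name `fisher_exact_2x2` is illustrative, the probability is written in the hypergeometric form equivalent to (2), and the small relative tolerance used to treat floating-point ties as equal is an assumption of this sketch, not part of the test's definition. The table in the usage line is hypothetical.

```python
from math import comb


def fisher_exact_2x2(a, b, c, d):
    """Two-sided P-value for the table [[a, b], [c, d]], ordering tables by probability."""
    r1, r2 = a + b, c + d   # row sums
    c1 = a + c              # first column sum
    n = r1 + r2             # total count

    def prob(x):
        # Probability of the table whose top-left cell is x, given the margins.
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_cutoff = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)  # feasible range of the top-left cell
    # Relative tolerance so that tables tied with the observed one are included.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_cutoff * (1 + 1e-9))


# Hypothetical table [[3, 1], [1, 3]].
p_value = fisher_exact_2x2(3, 1, 1, 3)
```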

For an example application of the 2×2 test, let X be a journal, say either Mathematics Magazine or Science, and let Y be the topic, either mathematics or biology, of an article appearing in a given issue of one of these journals. If Mathematics Magazine has five articles on math and one on biology, and Science has none on math and four on biology, then the relevant matrix would be

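$$\begin{array}{l|cc|l} & \text{Math. Mag.} & \text{Science} & \\ \hline \text{math} & 5 & 0 & R_1 = 5 \\ \text{biology} & 1 & 4 & R_2 = 5 \\ \hline & C_1 = 6 & C_2 = 4 & N = 10 \end{array}. \tag{3}$$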

Computing P_(cutoff) gives

$$P_{\text{cutoff}} = \frac{(5!)^2\,6!\,4!}{10!\,(5!\,0!\,1!\,4!)} = 0.0238, \tag{4}$$

and the other possible matrices and their probabilities are

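$$\begin{bmatrix} 4 & 1 \\ 2 & 3 \end{bmatrix} \quad P = 0.2381 \tag{5}$$
$$\begin{bmatrix} 3 & 2 \\ 3 & 2 \end{bmatrix} \quad P = 0.4762 \tag{6}$$
$$\begin{bmatrix} 2 & 3 \\ 4 & 1 \end{bmatrix} \quad P = 0.2381 \tag{7}$$
$$\begin{bmatrix} 1 & 4 \\ 5 & 0 \end{bmatrix} \quad P = 0.0238, \tag{8}$$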

which indeed sum to 1, as required. The sum of the probabilities less than or equal to P_(cutoff)=0.0238 is then 0.0476 which, because it is less than 0.05, is significant at the 5% level. Therefore, in this case, there would be a statistically significant association between the journal and the type of article appearing.
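
The final sum can also be written out exactly. Only the observed table and its mirror image, the table with rows (1, 4) and (5, 0), have probability no larger than the cutoff, and each has probability 5!·6!/10! = 1/42, so the two-term sum is:

```latex
P = \frac{5!\,6!}{10!} + \frac{5!\,6!}{10!}
  = \frac{1}{42} + \frac{1}{42}
  = \frac{1}{21} \approx 0.0476 .
```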



Cite this as:

Weisstein, Eric W. "Fisher's Exact Test." From MathWorld--A Wolfram Web Resource. https://mathworld.wolfram.com/FishersExactTest.html
