10.10.08

WARNING: 25% of the cells have expected counts less than 5. Chi-Square may not be a valid test.
Prepared by Rajeev V


The chi-square test is one of the tools in statistics for comparing observed data with the data we would expect to obtain under a specific hypothesis. In SAS, PROC FREQ with the CHISQ option helps us assess the evidence for an association between two categorical variables.

While performing a chi-square test with PROC FREQ in SAS, one may sometimes encounter a warning such as: WARNING: 25% of the cells have expected counts less than 5. Chi-Square may not be a valid test.

This warning appears in the output of the Pearson chi-square test when too many cells have small expected counts. PROC FREQ prints it when more than 20% of the cells have an expected count below 5, so in a 2×2 table a single such cell (25% of the cells) is enough. The expected count of a cell is (row total × column total) / grand total; for example, the expected count for row 2, column 1 in the example below is (16 × 235)/766 ≈ 4.9086. When this warning appears, do not rely on the p-value of the Pearson chi-square test; use the p-value of Fisher's exact test instead. This is illustrated below with an example:
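As a quick aside before the full example, the expected count quoted above can be checked with a short DATA step. This is only a sketch of the arithmetic: the variable names row_total, col_total and grand_total are illustrative, and the three totals are the ones quoted in the text.

```sas
/* Expected count for one cell = (row total * column total) / grand total */
data _null_;
  row_total   = 16;   /* total of row 2, as quoted in the text */
  col_total   = 235;  /* total of column 1, as quoted in the text */
  grand_total = 766;  /* overall number of observations */
  expected = row_total * col_total / grand_total;
  put expected= 8.4;  /* writes the value (about 4.9086) to the SAS log */
run;
```

Because the result is below 5, this cell is the one that triggers the warning.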

Data _cat_;
Input grp :$7. Cat1 :$4.;
Cards;
Case ABN
Control NRM
………………….
………………….
;
run;

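/* Keep only the Pearson chi-square row and the two-sided Fisher p-value */
ods output ChiSq=chitb_1(where=(statistic in ("Chi-Square")))
FishersExact=fishexctb_1(where=(name1="XP2_FISH"));
proc freq data=_cat_;
tables grp*cat1 / chisq expected nocol norow nopercent;
/* by _name_; use only if the data set contains a BY variable _name_ and is sorted by it */
run;
ods output close;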
The output will be,

The p-value from the Pearson chi-square test above may not be a reliable measure of association, because the expected count in cell (2, 1) is less than 5. When the CHISQ option is specified, SAS automatically performs Fisher's exact test for 2×2 tables; for tables larger than 2×2, the exact test must be requested with the EXACT (or FISHER) option on the TABLES statement, or with an EXACT FISHER statement after the TABLES statement.

Conceptually, the exact test considers every possible table that is compatible with the given marginal totals and calculates the exact probability of each table using Fisher's formula (1934). Summing the probabilities of the tables that are at least as extreme as the observed one gives a p-value, which is used in the usual way as a test of the null hypothesis. In our example we therefore report the two-sided p-value (1.00) from Fisher's exact test.
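For a table larger than 2×2, the exact test has to be requested explicitly. A minimal sketch follows. It assumes the same _cat_ data set and variables as above, but with cat1 taking more than two levels; that is an assumption made only for illustration, since the example in this post is a 2×2 table.

```sas
proc freq data=_cat_;
  /* chisq: chi-square tests; expected: print expected cell counts */
  tables grp*cat1 / chisq expected nocol norow nopercent;
  /* request Fisher's exact test explicitly (needed for tables larger than 2x2) */
  exact fisher;
run;
```

Exact computations can be slow and memory-hungry for large tables; in that case the MC option of the EXACT statement (exact fisher / mc;) requests a Monte Carlo estimate of the exact p-value instead.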
