Tukey's B method
Tukey's B method, also known as the Tukey-Kramer B procedure or Tukey's Wholly Significant Difference (WSD), is a post-hoc multiple comparison test used to identify which specific group means differ significantly from one another after an analysis of variance (ANOVA) has produced a statistically significant result.[1] It is considered a compromise between two other popular multiple comparison procedures: Tukey's range test and the Newman-Keuls method.[2]
The primary purpose of post-hoc tests like Tukey's B is to control the family-wise error rate (FWER) when performing multiple comparisons. Without such control, the probability of making at least one Type I error increases with the number of comparisons made.[3]
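The growth of the error rate can be illustrated with a short calculation. The sketch below assumes, purely for illustration, that the comparisons are independent tests each run at level alpha; real pairwise comparisons share data and are not independent, so the figures are approximate. The function names are hypothetical.

```python
from math import comb


def n_pairwise_comparisons(k: int) -> int:
    """Number of distinct pairs among k group means."""
    return comb(k, 2)


def familywise_error_rate(alpha: float, n_comparisons: int) -> float:
    """Probability of at least one Type I error across n_comparisons
    tests, ASSUMING the tests are independent (an idealisation)."""
    return 1.0 - (1.0 - alpha) ** n_comparisons


# Uncorrected FWER for 3, 5, and 10 groups at alpha = 0.05
for k in (3, 5, 10):
    m = n_pairwise_comparisons(k)
    print(f"k={k:2d}  comparisons={m:2d}  FWER~{familywise_error_rate(0.05, m):.3f}")
```

Under this idealisation, three groups (three comparisons) already give an error rate of about 0.14, and five groups (ten comparisons) about 0.40.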
History and context
The development of multiple comparison procedures stems from the work of Ronald Fisher, John Tukey, and others in the mid-20th century. Tukey's honestly significant difference (HSD) test is a conservative method that guarantees the FWER does not exceed the chosen significance level (e.g., $\alpha = 0.05$). Conversely, the Newman-Keuls (NK) method provides higher statistical power but is known to be anti-conservative; that is, it does not strictly control the FWER as the number of groups increases.[2]
Tukey's B method was introduced to provide an intermediate level of conservatism. It seeks to balance the strict error control of HSD with the greater sensitivity to differences offered by Newman-Keuls.[1]
Methodology
Tukey's B method operates by comparing all possible pairs of means. For each pair, it calculates a critical value based on the studentized range distribution.
Tukey's HSD uses a single critical value, $q_{\mathrm{HSD}}$, derived from the total number of groups ($k$), whereas Newman-Keuls uses critical values, $q_{\mathrm{NK}}$, that vary depending on the number of steps between the ordered means ($r$). Tukey's B calculates its critical value ($q_B$) as the simple arithmetic mean of the critical values obtained from those two procedures:[1]
$$q_B = \frac{q_{\mathrm{HSD}} + q_{\mathrm{NK}}}{2}$$
The absolute difference between two means, $|\bar{x}_i - \bar{x}_j|$, is then compared against a critical difference value:
$$CD = q_B \sqrt{\frac{MSE}{2}\left(\frac{1}{n_i} + \frac{1}{n_j}\right)}$$
where:
- $MSE$ is the mean squared error from the ANOVA, and
- $n_i$ and $n_j$ are the sample sizes of the groups being compared.
If $|\bar{x}_i - \bar{x}_j| > CD$, the difference is declared statistically significant.
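The procedure described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the function names are hypothetical, the two studentized range critical values are passed in as arguments rather than computed, and the group means, mean squared error, and sample sizes are made-up numbers. The quantiles 3.96 and 2.95 are approximate table entries for a 0.05 significance level with 20 error degrees of freedom (4 means and 2 means respectively) and should be checked against a studentized range table before any real use.

```python
from math import sqrt


def tukey_b_critical_value(q_hsd: float, q_nk: float) -> float:
    """Average the HSD and Newman-Keuls studentized range critical values."""
    return (q_hsd + q_nk) / 2.0


def critical_difference(q: float, mse: float, n_i: int, n_j: int) -> float:
    """Smallest absolute mean difference declared significant for critical
    value q, given the ANOVA mean squared error and the two group sizes."""
    return q * sqrt((mse / 2.0) * (1.0 / n_i + 1.0 / n_j))


def is_significant(mean_i: float, mean_j: float, q: float,
                   mse: float, n_i: int, n_j: int) -> bool:
    """Compare the absolute mean difference with the critical difference."""
    return abs(mean_i - mean_j) > critical_difference(q, mse, n_i, n_j)


# Hypothetical example: 4 groups of 6 observations (20 error df),
# comparing two adjacent ordered means (a range of 2 means).
Q_HSD = 3.96   # approximate table value for 4 means, 20 df, alpha = 0.05
Q_NK = 2.95    # approximate table value for 2 means, 20 df, alpha = 0.05
MSE = 4.0      # made-up mean squared error
q_b = tukey_b_critical_value(Q_HSD, Q_NK)

for label, q in (("HSD", Q_HSD), ("Newman-Keuls", Q_NK), ("Tukey's B", q_b)):
    cd = critical_difference(q, MSE, 6, 6)
    verdict = is_significant(10.0, 13.0, q, MSE, 6, 6)
    print(f"{label:13s} q={q:.3f}  CD={cd:.3f}  significant={verdict}")
```

With these illustrative inputs, a difference of 3.0 between the two means exceeds the Newman-Keuls and Tukey's B thresholds but not the HSD threshold, which shows how the averaged critical value sits between the other two.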
Characteristics and comparison with other methods
Tukey's B method is a standard post-hoc option in statistical packages such as SPSS,[1] and provides a middle ground for researchers:
- Error rate control: it offers better control over the family-wise error rate than the Newman-Keuls method, but is less conservative than Tukey's HSD.[2]
- Statistical power: it generally has greater statistical power than Tukey's HSD, making it more likely to detect true differences.[1]
Statistical criticism
In contemporary statistical practice, the procedure has largely fallen out of favor due to several factors:
- Theoretical grounding: unlike the Tukey HSD, which is rooted in the distribution of the studentized range, Tukey's B lacks a rigorous mathematical justification for its averaging approach.
- Error rate control: because it is a hybrid, it does not guarantee the same level of family-wise error rate protection as more modern, stepwise procedures.
- Availability of alternatives: the development of more powerful and theoretically sound procedures, such as the Ryan-Einot-Gabriel-Welsch (REGW) or the Fisher-Hayter test, has rendered Tukey's B largely obsolete in most modern statistical software packages.[4][5]
References
- [1] Larson-Hall, J. (2009). A guide to doing statistics in second language research using SPSS. Routledge. doi:10.4324/9780203875964. ISBN 978-1-135-59474-9.
- [2] McHugh, M. L. (2011). "Multiple comparison analysis testing in ANOVA". Biochemia Medica. 21 (3): 203–209. doi:10.11613/bm.2011.029. PMID 22420233.
- [3] Mishra, Prabhaker; Singh, Uttam; Pandey, Chandra M; Mishra, Priyadarshni; Pandey, Gaurav (2019). "Application of student's t-test, analysis of variance, and covariance". Annals of Cardiac Anaesthesia. 22 (4): 407–411. doi:10.4103/aca.aca_94_19. PMC 6813708. PMID 31621677.
- [4] Field, Andy (2013). Discovering Statistics Using IBM SPSS Statistics (4th ed.). SAGE Publications. p. 459. ISBN 978-1446249185. Quote: "Tukey's B is an ad hoc compromise... Generally, if you want a stepwise test, the REGWQ is the best choice."
- [5] De Muth, James E. (2014). Basic Statistics and Pharmaceutical Statistical Applications. Pharmacy Education Series (3rd ed.). CRC Press. pp. 250–251. ISBN 9781466596740. Quote: "Tukey's-b is a compromise between the HSD and SNK tests... it is generally considered less robust than modern stepwise procedures like REGWQ."