A note on estimating the proportion of True Null Hypotheses

M. W. Davy, I. Hunt, M. A. Black

ABSTRACT

Motivation: In microarray analysis, where m hypotheses are tested simultaneously, we consider issues surrounding estimation of the proportion of true null hypotheses [math]\pi_0[/math] from a vector of p-values. We review two recently published methods, using software available in the Bioconductor open source software project, and evaluate their performance in a simulation study derived from a publicly available microarray experiment. We illustrate that the algorithms can be optimised further based on evidence in the long run, and show that one estimator has significantly less variance, although both are likely to suffer bias due to unobserved covariates.

Results
Availability
Contact

INTRODUCTION

In the analysis of microarrays, many practitioners have used False Discovery Rate (FDR) control to determine a cutoff in a statistically ordered list of p-values. Finner and Roters (2001) showed that FDR control applied to a vector of p-values using the Benjamini and Hochberg procedure is non-adaptive: the expected proportion of false discoveries in the long run is controlled at [math]\pi_0 \times \alpha^*[/math], which depends on the unobserved proportion of truly null hypotheses. Estimating [math]\pi_0[/math], the proportion of true null hypotheses in multiple hypothesis testing, is therefore useful for maintaining adaptive control of the FDR across the range [math]\pi_0 \in (\alpha^*, 1)[/math]. However, this quantity is not straightforward to estimate, because latent variance and bias can arise in a large observational study [ref Efron].
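To make the distinction between non-adaptive and adaptive control concrete, the following is a minimal sketch in Python using NumPy. It implements the Benjamini and Hochberg step-up rule at level [math]\alpha^*[/math], and an adaptive variant that runs the same rule at level [math]\alpha^* / \hat{\pi}_0[/math]. The function names (`bh_reject`, `pi0_estimate`, `adaptive_bh_reject`), the tuning value `lam = 0.5`, and the simulated p-values are all assumptions made for illustration. In particular, the threshold-based estimator of [math]\pi_0[/math] used here is a generic stand-in and is not necessarily either of the two methods reviewed in this note.

```python
import numpy as np


def bh_reject(pvals, alpha):
    """Benjamini-Hochberg step-up rule; returns a boolean rejection mask."""
    p = np.asarray(pvals, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The i-th smallest p-value is compared with alpha * i / m.
    thresholds = alpha * np.arange(1, m + 1) / m
    below = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if below.any():
        k = np.nonzero(below)[0].max()  # largest rank meeting its threshold
        reject[order[: k + 1]] = True   # step-up: reject all smaller p-values too
    return reject


def pi0_estimate(pvals, lam=0.5):
    """Illustrative threshold estimator of pi_0 (an assumption, see lead-in).

    Counts p-values above lam, which should be mostly true nulls, and
    rescales by the width of (lam, 1]. The +1 keeps the estimate positive.
    """
    p = np.asarray(pvals, dtype=float)
    est = (np.sum(p > lam) + 1) / (p.size * (1.0 - lam))
    return min(1.0, float(est))


def adaptive_bh_reject(pvals, alpha, lam=0.5):
    """Run the step-up rule at level alpha / pi0_hat instead of alpha."""
    return bh_reject(pvals, alpha / pi0_estimate(pvals, lam))


if __name__ == "__main__":
    # Hypothetical simulation: 800 true nulls (uniform p-values) and
    # 200 non-nulls (p-values concentrated near zero).
    rng = np.random.default_rng(0)
    m, m0 = 1000, 800
    p = np.concatenate([rng.uniform(size=m0), rng.beta(0.1, 1.0, size=m - m0)])
    print("estimated pi0:", pi0_estimate(p))
    print("BH rejections:", int(bh_reject(p, 0.05).sum()))
    print("adaptive BH rejections:", int(adaptive_bh_reject(p, 0.05).sum()))
```

Because the estimate is capped at 1, the adaptive rule never rejects fewer hypotheses than the plain rule. How much it gains depends on how far [math]\pi_0[/math] lies below 1 and on the variance and bias of the estimator, which is the subject of this note.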

APPROACH

METHODS

DISCUSSION

CONCLUSION

ACKNOWLEDGEMENT

REFERENCES
