Online Public Access Catalogue (OPAC)
Library, Documentation and Information Science Division

“A research journal serves that narrow borderland which separates the known from the unknown”
- P.C. Mahalanobis



Some contributions to multiple hypotheses testing under dependence / Monitirtha Dey

Material type: Text
Publication details: Kolkata: Indian Statistical Institute, 2024
Description: xix, 105 pages
DDC classification:
  • 23 SA.1  M744
Contents:
Non-asymptotic Behaviors of FWER in Correlated Normal Distributions -- Non-asymptotic Behaviors of Generalized FWERs in Correlated Normal Distributions -- Asymptotic Behaviors of FWER and Generalized FWERs in Correlated Normal Distributions -- Asymptotic Behaviors of Stepwise Multiple Testing Procedures -- Asymptotically Optimal Sequential Multiple Testing Procedures for Correlated Normal
Production credits:
  • Guided by Prof. Subir Kumar Bhandari
Dissertation note: Thesis (Ph.D.) - Indian Statistical Institute, 2024
Summary:
The field of simultaneous statistical inference has attracted statisticians for decades for its interesting theory and paramount applications. A potpourri of different methodologies exists to control various error rates, e.g., the false discovery rate (FDR) or the family-wise error rate (FWER). Most of these classical procedures were proposed under independence or some form of weak dependence among the concerned variables. However, large-scale multiple testing problems in various scientific disciplines often study correlated variables simultaneously. For example, in microRNA expression data, several genes may cluster into groups through their transcription processes and possess high correlations. The data observed from different locations and periods in public health studies are generally spatially or serially correlated. fMRI studies and multistage clinical trials also involve variables with complex and unknown dependencies. Consequently, the study of the effect of correlation on dependent test statistics in simultaneous inference problems has attracted considerable attention recently.

However, the existing literature lacks studies of the performance of FWER or generalized FWER controlling procedures under dependent setups. For these reasons, this thesis concentrates mainly on FWER and generalized FWER controlling procedures. We consider the correlated Gaussian sequence model as our underlying framework. FWER has been a prominent error criterion in simultaneous inference for decades. The Bonferroni method is the earliest and one of the most popular methods for controlling FWER. However, we find little literature that illustrates the magnitude of the conservativeness of Bonferroni’s procedure in the correlated framework with small or moderate dimensions. We address this research gap in a unified manner by establishing upper bounds on Bonferroni FWER in equicorrelated and non-negatively correlated non-asymptotic Gaussian sequence model setups.

We also derive similar upper bounds for the generalized FWERs and propose an improved k-FWER controlling procedure. Towards this, we establish an inequality related to the probability that at least k among n events occur, which extends and sharpens the classical ones. The computation of this probability arises in various contexts, e.g., in reliability problems of communication networks. Our probabilistic results might be insightful in those areas, too.

We also study the limiting behavior of Bonferroni FWER as the number of hypotheses approaches infinity. We prove that in the equicorrelated Gaussian setup with positive equicorrelation, Bonferroni FWER tends to zero asymptotically. These results elucidate that Bonferroni’s procedure becomes extremely conservative for large-scale multiple-testing problems under correlated frameworks. We extend this result to generalized FWERs and to non-negatively correlated Normal frameworks where the limiting infimum of the correlations is strictly positive. Our proposed approximation of FWER also provides an estimate of the c.d.f. of the failure time of parallel systems.

We then move to the general class of stepwise multiple testing procedures (MTPs). The role of correlation in the limiting behavior of the FWER for stepwise procedures is less studied. Also, the existing literature lacks theoretical justifications for why FWER methods fail in large-scale problems. We address this problem by theoretically investigating the limiting FWER values of general step-down procedures under the correlated Gaussian setup. These results provide new insights into the behavior of step-down decision procedures. By establishing the limiting performances of commonly used step-up methods, e.g., the Benjamini-Hochberg (BH) and the Hochberg method, we have elucidated that the class of step-up procedures does not possess a similar universal asymptotic zero result as obtained in the case of step-down procedures. It is also noteworthy that most of our results are very general since they accommodate any combination of true and false null hypotheses. We have also obtained the limiting powers of the stepwise procedures.

Our results elucidate that, at least under the correlated Gaussian sequence model with many hypotheses, Holm’s MTP and Hochberg’s MTP do not have significantly different performances since they both asymptotically have zero FWER and zero power. It is also astonishing to note that, among all the procedures studied in this thesis, the BH method is the only one which can hold the FWER at a strictly positive level asymptotically under the equicorrelated Gaussian setup.

Finally, we consider the simultaneous inference problem in a sequential framework in Chapter 6. The mainstream sequential simultaneous inference literature has traditionally focused on the independent setup. However, there is little work studying the multiple inference problem in a sequential framework where the observations corresponding to the various streams are dependent. We consider the classical means-testing problem in an equicorrelated Gaussian and sequential framework. We focus on sequential test procedures that control the type I and type II familywise error probabilities at pre-specified levels. We establish that our proposed MTPs have the optimal average sample numbers under every possible signal configuration asymptotically, as the two types of familywise error probabilities approach zero at arbitrary rates. Towards this, we elucidate that the ratio of the expected sample size of our proposed rule and that of the classical SPRT goes to one asymptotically, thus illustrating their connection. Generalizing this, we show that our proposed procedures, with suitably modified cutoffs, are asymptotically optimal for controlling any multiple testing error criterion lying between multiples of FWER in a certain sense. This class of criteria includes FDR/FNR and pFDR/pFNR among others.

The results in this thesis illuminate that dependence might be a blessing or a curse, subject to the type of dependence or the underlying paradigm. Several popular and widely used procedures fail to hold the FWER at a positive level asymptotically under positively correlated Gaussian frameworks. On the contrary, the expected sample size of the asymptotically optimal sequential multiple testing rule is a decreasing function in the common correlation under the equicorrelated framework. Thus, correlation plays a dual role in the classical fixed-sample size and the sequential paradigms.
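
As a rough numerical illustration of the Bonferroni result described in the summary, the sketch below evaluates the Bonferroni FWER under an equicorrelated Gaussian model. It is not taken from the thesis. It assumes one-sided tests, the global null (all means zero), and the one-factor representation X_i = sqrt(rho)*Z0 + sqrt(1-rho)*Z_i. The function name `bonferroni_fwer` and the integration grid settings are hypothetical choices made for this example.

```python
import math
from statistics import NormalDist


def bonferroni_fwer(n, alpha, rho, grid=4001, z_max=8.0):
    """FWER of one-sided Bonferroni tests under the global null.

    Assumed model (an illustration, not the thesis's own code):
    X_i = sqrt(rho)*Z0 + sqrt(1-rho)*Z_i with Z0, Z_1..Z_n i.i.d. N(0,1),
    and H_0i is rejected when X_i > c with c = Phi^{-1}(1 - alpha/n).
    Requires 0 <= rho < 1.
    """
    c = NormalDist().inv_cdf(1.0 - alpha / n)
    step = 2.0 * z_max / (grid - 1)
    no_rejection = 0.0
    for j in range(grid):
        z = -z_max + j * step
        # Conditional on Z0 = z the X_i are independent, and each one
        # exceeds c with probability p = P(N(0,1) > t).
        t = (c - math.sqrt(rho) * z) / math.sqrt(1.0 - rho)
        p = 0.5 * math.erfc(t / math.sqrt(2.0))
        # Probability that none of the n statistics exceeds c, given z.
        keep_all = math.exp(n * math.log1p(-p)) if p < 1.0 else 0.0
        weight = 0.5 if j in (0, grid - 1) else 1.0  # trapezoid rule
        density = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        no_rejection += weight * density * keep_all * step
    return 1.0 - no_rejection


if __name__ == "__main__":
    # Columns: n, FWER at rho = 0, FWER at rho = 0.5 (alpha = 0.05).
    for n in (10, 1_000, 100_000, 10_000_000):
        print(n,
              round(bonferroni_fwer(n, 0.05, 0.0), 4),
              round(bonferroni_fwer(n, 0.05, 0.5), 4))
```

Under independence (rho = 0) the computed value reduces to 1 - (1 - alpha/n)^n, which stays close to alpha. For a positive rho the value is expected to fall as n grows, in line with the limiting behavior stated in the summary.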


Includes bibliography


Library, Documentation and Information Science Division, Indian Statistical Institute, 203 B T Road, Kolkata 700108, INDIA
Phone no. 91-33-2575 2100, Fax no. 91-33-2578 1412, ksatpathy@isical.ac.in