Multi-View Discriminant Canonical Correlation Analysis

Mandal, Ankita

URI: http://hdl.handle.net/10263/7469

Date: 2022-12

Abstract:

Multi-view learning is an emerging machine learning paradigm that focuses on discovering patterns in data represented by multiple distinct views. One of the important issues associated with real-life high-dimensional multi-view data is how to integrate relevant and complementary information from multiple views while generating discriminative subspaces for analysis. Although the integration of multi-view data is expected to provide an intrinsically more powerful model than its single-view counterpart, it poses its own set of challenges. The most important problems associated with multi-view data analysis are the presence of noisy, irrelevant and heterogeneous views, the high-dimension low-sample size nature of individual views, and the need to update databases with new views. In this regard, the thesis addresses the problem of multi-view data integration, for both static and dynamic data sets, in the presence of high-dimensional noisy and redundant views. The main contribution of the present work is to design novel algorithms, based on the theory of canonical correlation analysis (CCA), to extract informative subspaces for multi-view classification, and to theoretically analyze the important properties of these transformed spaces and new algorithms. The “curse of dimensionality” problem due to the “high-dimension low-sample size” characteristics of real-life data is addressed by judiciously integrating CCA with the ridge regression optimization technique. The relation between CCA and its regularized counterpart is established, which enables sequential extraction of relevant and significant features from bimodal data sets for classification and addresses the scalability issue of real-life high-dimensional data. To integrate multi-view data using multiset CCA (MCCA), a new block matrix representation is introduced. It facilitates generation of discriminative subspaces having maximum pairwise correlation, and makes the MCCA model scalable to high-dimensional multi-view data. Integration of MCCA with a multiset ridge regression model addresses the “curse of dimensionality” problem of individual views. In order to integrate dynamic multi-view data, a novel adaptive MCCA model is proposed, which incrementally updates canonical variables when new views become available for the analysis. The adaptive model ensures selection of relevant and complementary views during data integration, while discarding irrelevant and redundant ones. To make the adaptive framework scalable to high-dimensional data, a new model is introduced under a common latent representation. Finally, a graph-based approach is judiciously integrated with this adaptive model to utilize the underlying geometry of the data in different views.
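
For readers unfamiliar with the idea of combining CCA with ridge-style regularization, the following is a minimal textbook-style sketch of regularized two-view CCA in Python with NumPy. It is not the algorithm developed in the thesis. The function name regularized_cca and the parameters reg_x, reg_y and n_components are hypothetical names chosen for this illustration. The ridge terms added to the within-view covariance matrices keep them invertible when the number of features exceeds the number of samples.

```python
import numpy as np


def regularized_cca(X, Y, reg_x=1e-2, reg_y=1e-2, n_components=1):
    """Illustrative regularized CCA for two views (not the thesis's method).

    X: (n_samples, p) array, Y: (n_samples, q) array.
    reg_x, reg_y: assumed ridge parameters added to the covariance diagonals.
    Returns projection matrices Wx (p, k), Wy (q, k) and the k leading
    regularized canonical correlations.
    """
    # Center each view.
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    n = X.shape[0]

    # Ridge-regularized within-view covariances and the cross-covariance.
    Cxx = X.T @ X / (n - 1) + reg_x * np.eye(X.shape[1])
    Cyy = Y.T @ Y / (n - 1) + reg_y * np.eye(Y.shape[1])
    Cxy = X.T @ Y / (n - 1)

    # Cholesky factors: Cxx = Lx Lx^T, Cyy = Ly Ly^T.
    Lx = np.linalg.cholesky(Cxx)
    Ly = np.linalg.cholesky(Cyy)

    # Whitened cross-covariance T = Lx^{-1} Cxy Ly^{-T}.
    T = np.linalg.solve(Lx, Cxy)
    T = np.linalg.solve(Ly, T.T).T

    # Singular values of T are the regularized canonical correlations.
    U, s, Vt = np.linalg.svd(T, full_matrices=False)

    # Map the singular vectors back to the original feature spaces.
    Wx = np.linalg.solve(Lx.T, U[:, :n_components])
    Wy = np.linalg.solve(Ly.T, Vt.T[:, :n_components])
    return Wx, Wy, s[:n_components]
```

Calling regularized_cca(X, Y, n_components=2) on two views with 20 samples and 50 and 40 features respectively returns projection matrices of shape (50, 2) and (40, 2). Without the ridge terms, the covariance matrices in that setting would be singular and the Cholesky step would fail.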