Abstract:
Over the past few years, multi-view data analysis has emerged as an indispensable approach for
identifying sample categories. In the multi-view data classification problem, a joint subspace is
expected to be learned from the given input views in such a way that similarity in the latent space
implies similarity in the corresponding concepts. Since each view has different statistical
properties, the joint subspace should be able to reflect the intrinsic properties of each of the input
views. Another important aspect is the coherent knowledge of the multiple views. It is required that
the learning objective of the multi-view model efficiently captures the non-linear correlated
structures across different views. Cross-view dependency is also an essential attribute of multi-view
learning, in which the primary focus is to discover the dependency shared between pairs of input
views. If one or more input views correspond to images, then the joint subspace should be learned
in such a way that the topological properties of the image views are properly preserved along with
the inherent characteristics of the rest of the views.
In this regard, the thesis addresses the classification problem of multi-view data, where the primary
objective is to identify and analyze the inherent structures or patterns of the data that are relevant
to classifying the given observations into different categories. In order to evaluate the relevance of
a view in differentiating observations of a particular class from those belonging to the rest of the
classes, a novel framework is developed by judiciously integrating the theory of rough sets with the
Bayes decision theory. While rough set theory deals with the uncertainty due to incompleteness in
class definition, the probabilistic model addresses the uncertainty due to overlapping classes by
measuring the belongingness of an observation to a specific class.
In multi-view learning, it is essential that a joint subspace is learned from the given input views
which can efficiently encapsulate the underlying non-linear data distribution of the given
observations. In this regard, the thesis develops deep predictive models based on the framework of
the deep Boltzmann machine for discriminability, correlation, and dependency analysis. In
discriminability analysis, the class nodes are incorporated into the deep architecture where the
supervised information is clamped. Through proper learning of the weights associated with the class
nodes, the discriminative ability of the latent subspace is enhanced. In correlation analysis, the
learning objective of the deep architecture is judiciously integrated with canonical correlation
analysis such that given the input views, the joint subspace is learned from maximally correlated
subspaces. In dependency analysis, the relationship between each pair of views is assumed to be
unique. Hence, a view-pair specific approach is developed based on the Hilbert-Schmidt
independence criterion to efficiently encapsulate the cross-view dependency in terms of consensus
and/or complementary knowledge from the input pairs of views. Based on the Bayes error analysis,
an upper bound on the error probability of the proposed deep model is estimated in terms of the
model architecture. This bound facilitates determining the optimal architecture of the proposed
model for each database considered.
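
As an illustration of the dependency measure named above, the following is a minimal sketch of the
standard biased empirical estimate of the Hilbert-Schmidt independence criterion between two views,
trace(K H L H) / (n - 1)^2, where K and L are kernel Gram matrices and H is the centering matrix.
It is not the deep model developed in the thesis. The Gaussian kernel, the bandwidths, the synthetic
data, and all names (gaussian_gram, hsic, x, y_dep, y_ind) are assumptions made for illustration only.

```python
import numpy as np


def gaussian_gram(x, sigma=1.0):
    """Gram matrix of a Gaussian kernel over the rows of x (assumed kernel choice)."""
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T)
    sq_dists = np.maximum(sq_dists, 0.0)  # guard against tiny negative round-off
    return np.exp(-sq_dists / (2.0 * sigma ** 2))


def hsic(x, y, sigma_x=1.0, sigma_y=1.0):
    """Biased empirical HSIC between two views with paired rows."""
    n = x.shape[0]
    K = gaussian_gram(x, sigma_x)
    L = gaussian_gram(y, sigma_y)
    H = np.eye(n) - np.ones((n, n)) / n  # centering matrix
    return float(np.trace(K @ H @ L @ H)) / (n - 1) ** 2


# Hypothetical one-dimensional views: y_dep depends non-linearly on x, y_ind does not.
rng = np.random.default_rng(0)
x = rng.normal(size=(200, 1))
y_dep = x ** 2 + 0.1 * rng.normal(size=(200, 1))
y_ind = rng.normal(size=(200, 1))

hsic_dep = hsic(x, y_dep)
hsic_ind = hsic(x, y_ind)
print(f"HSIC(x, y_dep) = {hsic_dep:.4f}")
print(f"HSIC(x, y_ind) = {hsic_ind:.4f}")
```

In this toy setting the dependent pair is uncorrelated in the linear sense, yet it should still yield
a clearly larger HSIC value than the independent pair, which is why a kernel-based criterion is a
natural fit for view-pair dependency.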
Combining information from multiple views is particularly challenging when the input views
involve both image and non-image information. In multi-view data analysis, it is essential
that descriptive and comprehensive information is efficiently extracted from all the views of the
given input data. If one or more input views correspond to image information, then it should be
ensured that the innate topological properties of each of the input image views are appropriately
reflected in the joint subspace. In this regard, a geometrically motivated deep predictive model is
developed, which can process multiple image and non-image views simultaneously. In order to
recognize and represent the geometric structures of the image manifolds embedded in the
high-dimensional ambient space, the theory of Laplacian eigenmaps is judiciously integrated with
the learning objective of the deep predictive model. An approximate common eigenbasis of the
Laplacians is computed to consolidate the intrinsic geometric structures of the manifolds
corresponding to each of the input image views.
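
To make the geometric ingredient concrete, the following is a minimal sketch of a Laplacian
eigenmap for a single view: a symmetrized k-nearest-neighbour graph is built, and the embedding is
taken from the generalized eigenproblem L y = lambda D y, solved through the symmetrically
normalized Laplacian. It does not reproduce the approximate common eigenbasis across several
Laplacians or its integration with the deep model. The binary edge weights, the value of k, the
synthetic closed curve, and all names (knn_affinity, laplacian_eigenmap, points) are assumptions
made for illustration only, and the graph is assumed to be connected.

```python
import numpy as np


def knn_affinity(x, k=4):
    """Symmetric binary k-nearest-neighbour affinity matrix over the rows of x."""
    n = x.shape[0]
    sq_norms = np.sum(x ** 2, axis=1)
    sq_dists = np.maximum(sq_norms[:, None] + sq_norms[None, :] - 2.0 * (x @ x.T), 0.0)
    np.fill_diagonal(sq_dists, np.inf)  # exclude self-neighbours
    nearest = np.argsort(sq_dists, axis=1)[:, :k]
    W = np.zeros((n, n))
    W[np.repeat(np.arange(n), k), nearest.ravel()] = 1.0
    return np.maximum(W, W.T)  # keep an edge if either endpoint selects it


def laplacian_eigenmap(W, n_components=2):
    """Embedding from L y = lambda D y, assuming a connected graph."""
    deg = W.sum(axis=1)
    L = np.diag(deg) - W  # unnormalized graph Laplacian
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    L_sym = d_inv_sqrt[:, None] * L * d_inv_sqrt[None, :]
    eigvals, eigvecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    # Skip the trivial eigenvector (eigenvalue 0) and map back to generalized eigenvectors.
    embedding = d_inv_sqrt[:, None] * eigvecs[:, 1:n_components + 1]
    return L, eigvals, embedding


# Hypothetical data: a closed curve (one-dimensional manifold) embedded in 3-D space.
angles = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
points = np.column_stack([np.cos(angles), np.sin(angles), 0.1 * np.cos(3.0 * angles)])

W = knn_affinity(points, k=4)
L, eigvals, embedding = laplacian_eigenmap(W, n_components=2)
print(embedding.shape)
```

The eigenvectors associated with the smallest non-zero eigenvalues vary slowly along graph edges, so
neighbouring points on the manifold remain close in the embedding.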