Joint/Simultaneous Multiset Decompositions—Independent vector analysis (IVA) and multiset canonical correlation analysis (MCCA)

For a general review of simultaneous decomposition of multiple datasets, refer to
T. Adali, M. Anderson, and G.-S. Fu, "Diversity in independent component and vector analyses: Identifiability, algorithms, and applications in medical imaging," IEEE Signal Processing Magazine, vol. 31, no. 3, pp. 18-33, May 2014.

A few points on algorithm choice:

  • If second-order statistics are sufficient, i.e., multiset correlation analysis will yield the desired result, the algorithm of choice is IVA-G using Newton updates, which is the default option.
  • If the underlying distributions of the multivariate sources are Laplacian, or are known to have super-Gaussian marginals, the best algorithm to use is IVA-L-SOS.
  • Finally, IVA-A-GGD is the most flexible option when the source distributions are not known a priori, and IVA-GGD is the more computationally desirable option.
  • For complex-valued data, all algorithms accept and work with complex-valued input. Note, however, that only two algorithms take full second-order statistical information into account: IVA-GGD, which includes a flag for complex or real data, and IVA-CMGGD, which is written for complex-valued processing and is a fully adaptive algorithm like IVA-A-GGD. Also note that all other algorithms use a real-valued initialization for W, which can be changed to complex if desired.
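
The guidance above can be condensed into a small decision helper. This is only an illustrative sketch in Python: the function name suggest_iva_algorithm and all of its arguments are hypothetical and are not part of the toolbox. The returned strings are the algorithm names used in this document, and the choice between IVA-GGD and IVA-CMGGD for complex data is an interpretation of the last bullet rather than a stated rule.

```python
def suggest_iva_algorithm(second_order_sufficient=False,
                          super_gaussian=False,
                          complex_full_statistics=False,
                          prefer_speed=False):
    """Return the algorithm name suggested by the guidance above.

    Hypothetical helper for illustration only; every argument is an assumption.
    """
    if complex_full_statistics:
        # Per the text, only IVA-GGD (complex flag) and IVA-CMGGD use the full
        # second-order statistics of complex data; IVA-CMGGD is fully adaptive.
        return "IVA-GGD" if prefer_speed else "IVA-CMGGD"
    if second_order_sufficient:
        return "IVA-G"          # Newton updates are the default option
    if super_gaussian:
        return "IVA-L-SOS"      # Laplacian or super-Gaussian marginals
    # Source distributions unknown a priori.
    return "IVA-GGD" if prefer_speed else "IVA-A-GGD"


print(suggest_iva_algorithm(second_order_sufficient=True))
print(suggest_iva_algorithm())
```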

Algorithms:

  1. Independent vector analysis (IVA) using multivariate Gaussian distribution (IVA-G) [1,2,3]
  2. IVA using second-order uncorrelated multivariate Laplace distribution (IVA-L) [4,5,6], the decoupled version (IVA-L-Decp) [2]
  3. IVA using second-order correlated multivariate Laplace distribution (IVA-L-SOS) [7]
  4. Constrained IVA using second-order correlated multivariate Laplace distribution (Constrained-IVA-L-SOS) [17], Adaptive constrained IVA using second-order correlated multivariate Laplace distribution (Adaptive-constrained-IVA-L-SOS) [18]
  5. Joint diagonalization using second-order statistics for IVA (JDIAG-SOS) [8,9]
  6. Joint diagonalization using fourth-order cumulant for IVA (JDIAG-CUM4) [8,9]
  7. IVA using MGGD or multivariate power exponential distribution (IVA-GGD) [10], and the version with adaptively tuned shape parameter and scatter matrix (IVA-A-GGD) [11,12]
  8. IVA using multivariate generalized Gaussian distribution for complex-valued data (IVA-CMGGD) [13,14]
  9. Multiset canonical correlation analysis (MCCA) [15,16]

IVA-G [1,2,3]

This is an algorithm that uses second-order statistics (a multivariate Gaussian model) and a decoupling trick to estimate demixing matrices by minimizing mutual information, and does not constrain the demixing matrices to be orthogonal as in MCCA. If the data is complex-valued then the pseudo-covariance matrix is included in the mutual information measure. It ignores sample-to-sample dependence and higher-order statistics. It generalizes MCCA and has more flexible identification conditions compared with MCCA.
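
To make the second-order cost concrete, the following is a minimal NumPy sketch of the real-valued IVA-G objective, up to an additive constant: the sum over sources of 0.5 log det of the covariance of each source component vector, minus the sum over datasets of log|det W|. It is not the toolbox implementation; the function name iva_g_cost, the array layout (K datasets, N sources, T samples) and the synthetic data are assumptions made for illustration, and the complex-valued case with the pseudo-covariance is not covered.

```python
import numpy as np


def iva_g_cost(W, X):
    """Real-valued IVA-G cost, up to an additive constant.

    W: (K, N, N) demixing matrices, one per dataset.
    X: (K, N, T) datasets (assumed zero-mean).
    """
    K, N, T = X.shape
    Y = W @ X                                   # Y[k] = W[k] @ X[k]
    cost = 0.0
    for n in range(N):
        Yn = Y[:, n, :]                         # n-th source across datasets, (K, T)
        Sigma_n = Yn @ Yn.T / T                 # K x K covariance across datasets
        cost += 0.5 * np.linalg.slogdet(Sigma_n)[1]
    for k in range(K):
        cost -= np.linalg.slogdet(W[k])[1]      # log|det W[k]|
    return cost


# Synthetic example: Gaussian sources that are correlated across datasets.
rng = np.random.default_rng(0)
K, N, T = 3, 3, 5000
S = np.empty((K, N, T))
for n in range(N):
    L = rng.standard_normal((K, K))             # induces correlation across datasets
    S[:, n, :] = L @ rng.standard_normal((K, T))
A = rng.standard_normal((K, N, N))              # one mixing matrix per dataset
X = A @ S
W_true = np.linalg.inv(A)                       # ideal demixing matrices
W_identity = np.tile(np.eye(N), (K, 1, 1))      # no demixing at all
print(iva_g_cost(W_true, X), iva_g_cost(W_identity, X))
```

The cost is invariant to rescaling any row of any demixing matrix, which reflects the usual scaling ambiguity, and no orthogonality constraint appears anywhere in it.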

IVA-L [4,5,6]

This is an algorithm that uses higher-order statistics to estimate demixing matrices by minimizing mutual information assuming a second-order uncorrelated multivariate Laplace prior. It ignores sample-to-sample dependence and second-order statistics. IVA-L is implemented using a matrix relative gradient, and IVA-L-Decp using a vector gradient via a decoupling method. The demixing matrix is not constrained to be orthogonal.
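
As an illustration of a matrix relative-gradient update, here is a minimal NumPy sketch of one step for real-valued data under an uncorrelated multivariate Laplace prior, for which the score function is the source component vector divided by its Euclidean norm across datasets (up to the scaling convention of the prior). The names laplace_score and iva_l_step, the step size and the array layout are assumptions for illustration and do not describe the toolbox code.

```python
import numpy as np


def laplace_score(Y, eps=1e-12):
    """Score of an uncorrelated multivariate Laplace prior.

    Y: (K, N, T). For each source n and sample t, the K-vector across
    datasets is divided by its Euclidean norm.
    """
    norms = np.sqrt(np.sum(Y ** 2, axis=0)) + eps   # (N, T)
    return Y / norms


def iva_l_step(W, X, step=0.1):
    """One relative-gradient update W <- W + step * (I - E[phi(y) y^T]) W."""
    K, N, T = X.shape
    Y = W @ X                                        # (K, N, T)
    Phi = laplace_score(Y)
    G = np.eye(N) - Phi @ np.transpose(Y, (0, 2, 1)) / T   # (K, N, N)
    return W + step * (G @ W)


# Illustrative call on random data, starting from identity demixing matrices.
rng = np.random.default_rng(0)
K, N, T = 3, 2, 1000
X = rng.laplace(size=(K, N, T))
W = np.tile(np.eye(N), (K, 1, 1))
W_next = iva_l_step(W, X)
print(W_next.shape)
```

The score couples the datasets only through the norm taken across them, which is how higher-order dependence is exploited while second-order correlation is ignored.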

IVA-L-SOS [7]

Unlike the regular IVA-L algorithm introduced in [4], which assumes no second-order correlation, IVA-L-SOS takes both second-order and higher-order statistics into account and assumes the sources follow a multivariate Laplacian distribution [11]. Hence, it can effectively align sources across the datasets in cases where IVA-L typically fails. This algorithm uses a simplified formulation of the score function that significantly reduces the effect of a large number of datasets, as compared with IVA-GGD with the shape parameter set to 0.5, i.e., the Laplacian case in the GGD formulation.

Constrained IVA-L-SOS [17,18]

Constrained IVA-L-SOS incorporates prior information about the data into the IVA cost function by allowing a fixed, user-defined minimum degree of similarity between the estimate and reference signal [17]. Adaptive constrained IVA-L-SOS incorporates prior information about the data into the IVA cost function. Unlike constrained IVA-L-SOS, it uses an adaptive tuning mechanism to control the effect of reference information on the estimates, thus reducing the effects of incorrect priors [18].

JDIAG-SOS [8,9]

This is an IVA algorithm that uses symmetric orthogonal joint diagonalization of covariance matrices based on multiple datasets. It is the only IVA algorithm that can explicitly exploit sample-to-sample dependence to improve source separation. It ignores higher-order statistics.

JDIAG-CUM4 [8,9]

This is an IVA algorithm that uses symmetric orthogonal joint diagonalization of fourth-order cumulants. Sample-to-sample dependence and second-order statistical information are not taken into account.

IVA-GGD [10]

The multivariate generalized Gaussian distribution (MGGD) provides an effective yet simple density model. Statistical information of all orders is taken into account to estimate non-orthogonal (i.e., unconstrained) demixing matrices by minimizing mutual information. This algorithm was initially introduced as IVA with a multivariate power exponential prior (IVA-MPE). For efficiency, the algorithm evaluates a user-defined set of shape parameters and selects the best estimate rather than updating the shape parameter.
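
For reference, the zero-mean MGGD log-density with scatter matrix Sigma and shape parameter beta can be written in a few lines. The sketch below uses the common parameterization in which beta = 1 gives the multivariate Gaussian and beta = 0.5 the multivariate Laplacian case. The function name mggd_logpdf and the example values are assumptions for illustration, and the code is not taken from the toolbox.

```python
import math

import numpy as np


def mggd_logpdf(x, Sigma, beta):
    """Log-density of a zero-mean K-dimensional MGGD.

    x: (K,) sample, Sigma: (K, K) scatter matrix, beta: shape parameter > 0.
    """
    x = np.asarray(x, dtype=float)
    K = x.shape[0]
    q = x @ np.linalg.solve(Sigma, x)            # quadratic form x^T Sigma^{-1} x
    logdet = np.linalg.slogdet(Sigma)[1]
    log_norm = (math.log(beta) + math.lgamma(K / 2.0)
                - (K / 2.0) * math.log(math.pi)
                - math.lgamma(K / (2.0 * beta))
                - (K / (2.0 * beta)) * math.log(2.0)
                - 0.5 * logdet)
    return log_norm - 0.5 * q ** beta


# Evaluate one sample under a few candidate shape parameters (values are arbitrary).
x = np.array([0.3, -1.2, 0.5])
Sigma = np.array([[2.0, 0.3, 0.0],
                  [0.3, 1.0, 0.2],
                  [0.0, 0.2, 1.5]])
for beta in (0.5, 1.0, 2.0):
    print(beta, mggd_logpdf(x, Sigma, beta))
```

Note that Sigma is a scatter matrix and equals the covariance matrix only for beta = 1.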

IVA-A-GGD [11,12]

IVA-A-GGD is also based on the MGGD model but estimates both the shape parameter and the scatter matrix, taking both second-order statistics (SOS) and higher-order statistics (HOS) into account. Two versions are available: the first is based on a Fisher scoring (FS) approach [11] (IVA-A-GGD-MLFS), and the second on fixed-point (FP) updates [12] (IVA-A-GGD-RAFP).

IVA-CMGGD [13,14]

This is an algorithm that uses complex-valued statistics of all orders and a decoupling trick to estimate non-orthogonal demixing matrices by minimizing the mutual information among the estimated sources, using a CMGGD adapted to each source from the given data [13,14]. Sample-to-sample dependence is ignored. In addition to steepest descent, BFGS and limited-memory BFGS optimization are available to accelerate convergence for larger datasets.

MCCA [15,16]

These algorithms implement the generalization of canonical correlation analysis (CCA) to multiple datasets, multiset CCA (MCCA) [15]. The implementation estimates an orthogonal matrix using a deflationary approach. Sample-to-sample dependence and higher-order statistical information are ignored.
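
As a point of reference, the two-dataset special case that MCCA generalizes can be sketched as follows: each dataset is whitened, and the singular values of the cross-covariance of the whitened data are the canonical correlations. This covers only ordinary CCA with K = 2, not the multiset cost functions or the deflationary procedure of the toolbox; the function name canonical_correlations and the data layout (variables by samples) are assumptions for illustration.

```python
import numpy as np


def canonical_correlations(X1, X2):
    """Canonical correlations between two datasets, each of shape (N, T)."""
    def whiten(X):
        X = X - X.mean(axis=1, keepdims=True)     # remove the mean of each row
        C = X @ X.T / X.shape[1]                  # sample covariance
        d, E = np.linalg.eigh(C)
        return (E / np.sqrt(d)) @ E.T @ X         # C^{-1/2} X

    Z1, Z2 = whiten(X1), whiten(X2)
    C12 = Z1 @ Z2.T / Z1.shape[1]                 # cross-covariance of whitened data
    return np.linalg.svd(C12, compute_uv=False)   # sorted in descending order


# Two datasets related by an invertible linear map are perfectly correlated.
rng = np.random.default_rng(0)
X1 = rng.standard_normal((3, 2000))
M = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])
print(canonical_correlations(X1, M @ X1))
```

Only second-order statistics enter this computation, which is consistent with MCCA ignoring higher-order information and sample-to-sample dependence.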


References:

[1] M. Anderson, X.-L. Li, & T. Adali, "Nonorthogonal Independent Vector Analysis Using Multivariate Gaussian Model," International Conference on Latent Variable Analysis and Signal Separation, Springer Berlin / Heidelberg, 2010, 6365, 354-361
[2] M. Anderson, T. Adali, & X.-L. Li, "Joint Blind Source Separation with Multivariate Gaussian Model: Algorithms and Performance Analysis," IEEE Trans. Signal Process., 2012, 60, 1672-1683
[3] M. Anderson, X.-L. Li, & T. Adali, "Complex-valued Independent Vector Analysis: Application to Multivariate Gaussian Model," Signal Process., 2012, 1821-1831
[4] T. Kim, I. Lee, & T.-W. Lee, "Independent Vector Analysis: Definition and Algorithms," Proc. of 40th Asilomar Conference on Signals, Systems, and Computers, 2006, 1393-1396
[5] T. Kim, T. Eltoft, & T.-W. Lee, "Independent Vector Analysis: an extension of ICA to multivariate components," Lecture Notes in Computer Science: Independent Component Analysis and Blind Signal Separation, Springer Berlin / Heidelberg, 2006, 3889, 165-172
[6] T. Kim, H. T. Attias, S.-Y. Lee, & T.-W. Lee, "Blind Source Separation Exploiting Higher-Order Frequency Dependencies," IEEE Trans. Audio Speech Lang. Process., 2007, 15, 70-79
[7] S. Bhinge, R. Mowakeaa, V.D. Calhoun, T. Adalı, "Extraction of time-varying spatio-temporal networks using parameter-tuned constrained IVA." IEEE Transactions on Medical Imaging, 2019, vol. 38, no. 7, 1715-1725
[8] X.-L. Li, T. Adali, & M. Anderson, "Joint Blind Source Separation by Generalized Joint Diagonalization of Cumulant Matrices," Signal Process., 2011, 91, 2314-2322
[9] X.-L. Li, M. Anderson, & T. Adali, "Second and Higher-Order Correlation Analysis of Multiset Multidimensional Variables by Joint Diagonalization," Lecture Notes in Computer Science: Independent Component Analysis and Blind Signal Separation, Latent Variable Analysis and Signal Separation, Springer Berlin / Heidelberg, 2010, 6365, 197-204
[10] M. Anderson, G.-S. Fu, R. Phlypo, and T. Adali, "Independent Vector Analysis: Identification Conditions and Performance Bounds," IEEE Trans. Signal Processing, vol. 62, no. 17, pp. 4399-4410, Sep. 2014.
[11] Z. Boukouvalas, G.-S. Fu, and T. Adali, "An efficient multivariate generalized Gaussian distribution estimator: Application to IVA," in Proc. Conf. on Info. Sciences and Systems (CISS), Baltimore, MD, March 2015.
[12] Z. Boukouvalas, S. Said, L. Bombrun, Y. Berthoumieu, and T. Adali, "A new Riemannian averaged fixed-point algorithm for MGGD parameter estimation," IEEE Signal Proc. Letts., vol. 22, no. 12, pp. 2314-2318, Dec. 2015.
[13] R. Mowakeaa, Z. Boukouvalas, Q. Long, & T. Adali, "IVA using complex multivariate GGD: Application to fMRI analysis," Multidimensional Systems and Signal Processing, Springer, 2019, 1-20
[14] R. Mowakeaa, Z. Boukouvalas, T. Adali, & C. Cavalcante, "On the characterization, generation, and efficient estimation of the complex multivariate GGD," IEEE Sensor Array and Multichannel Signal Processing Workshop, 2016, 1-5
[15] J. R. Kettenring, "Canonical analysis of several sets of variables," Biometrika, 1971, 58, 433-451
[16] Y.-O. Li, T. Adali, W. Wang, V. D. Calhoun, "Joint Blind Source Separation by Multiset Canonical Correlation Analysis," IEEE Trans. Signal Process., 2009, 57, 3918-3929
[17] S. Bhinge, Q. Long, Y. Levin-Schwartz, Z. Boukouvalas, V. D. Calhoun, & T. Adali, "Non-orthogonal constrained independent vector analysis: Application to data fusion," In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2666-2670. IEEE, 2017
[18] S. Bhinge, R. Mowakeaa, V. D. Calhoun, & T. Adali, "Extraction of time-varying spatiotemporal networks using parameter-tuned constrained IVA," IEEE Transactions on Medical Imaging 38, no. 7 2019, 1715-1725.