Joint/Simultaneous Multiset Decompositions—Independent vector analysis (IVA) and multiset canonical correlation analysis (MCCA)

For a general review of simultaneous decomposition of multiple datasets, refer to
T. Adali, M. Anderson, and G.-S. Fu, "Diversity in independent component and vector analyses: Identifiability, algorithms, and applications in medical imaging," IEEE Signal Processing Magazine, vol. 31, no. 3, pp. 18-33, May 2014.

A few points on algorithm choice:

  • If second-order statistics are sufficient, i.e., multiset correlation analysis will yield the desired response, the algorithm of choice is IVA-G with Newton updates, which is the default option
  • If the underlying distributions of the multivariate sources are Laplacian, or are known to have super-Gaussian marginals, the best algorithm to use is IVA-L-SOS
  • IVA-A-GGD is the most flexible option when source distributions are not known a priori, while IVA-GGD is the more computationally desirable option
  • For complex-valued data, all algorithms accept and work with complex-valued data. Note, however, that only two algorithms take full second-order statistical information into account: IVA-GGD, which includes a flag for complex or real, and IVA-CMGGD, which is written for complex processing and is a fully adaptive algorithm like IVA-A-GGD. Also note that all other algorithms use a real-valued initialization for W, which can be changed to complex if desired.
  • If references are available (either for all sources or a subset of sources), constrained IVA-G algorithms can efficiently capture both second-order statistics (SOS) via multivariate Gaussian distribution and higher-order statistics (HOS) via the similarity between the sources and the references. They also offer significantly faster runtime than IVA using a multivariate Laplace density model, thanks to iteration complexity independent of the sample size.
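
The selection notes above can be summarized as a small decision helper. This is only an illustrative sketch in Python: the function name, its arguments, and the order in which the rules are checked are assumptions made for this example and are not part of the toolbox.

```python
def suggest_iva_algorithm(has_references=False,
                          needs_full_complex_sos=False,
                          second_order_sufficient=False,
                          super_gaussian=False,
                          prefer_speed=False):
    """Return an algorithm name following the selection notes above.

    Hypothetical helper: argument names and rule priority are
    illustrative assumptions, not toolbox options.
    """
    if has_references:
        # References for all or some sources: constrained IVA-G variants.
        return "constrained IVA-G"
    if needs_full_complex_sos:
        # Only these two use full second-order information for complex data.
        return "IVA-GGD or IVA-CMGGD"
    if second_order_sufficient:
        # Second-order statistics are enough: IVA-G with Newton updates.
        return "IVA-G"
    if super_gaussian:
        # Laplacian sources or super-Gaussian marginals.
        return "IVA-L-SOS"
    # Unknown source distributions: trade flexibility against cost.
    return "IVA-GGD" if prefer_speed else "IVA-A-GGD"


if __name__ == "__main__":
    print(suggest_iva_algorithm(second_order_sufficient=True))
    print(suggest_iva_algorithm(super_gaussian=True))
    print(suggest_iva_algorithm())
```

The helper returns names only; it does not call any toolbox function.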

Algorithms:

  1. Independent vector analysis (IVA) using multivariate Gaussian distribution (IVA-G) [1,2,3]
  2. IVA using second-order uncorrelated multivariate Laplace distribution (IVA-L) [4,5,6], the decoupled version (IVA-L-Decp) [2]
  3. IVA using second-order correlated multivariate Laplace distribution (IVA-L-SOS) [7]
  4. Constrained IVA using second-order correlated multivariate Laplace distribution (Constrained-IVA-L-SOS) [17], Adaptive constrained IVA using second-order correlated multivariate Laplace distribution (Adaptive-constrained-IVA-L-SOS) [18]
  5. Joint diagonalization using second-order statistics for IVA (JDIAG-SOS) [8,9]
  6. Joint diagonalization using fourth-order cumulant for IVA (JDIAG-CUM4) [8,9]
  7. IVA using MGGD or multivariate power exponential distribution (IVA-GGD) [10], and the version with adaptively tuned shape parameter and scatter matrix (IVA-A-GGD) [11,12]
  8. IVA using multivariate generalized Gaussian distribution for complex-valued data (IVA-CMGGD) [13,14]
  9. Multiset canonical correlation analysis (MCCA) [15,16]
  10. Adaptive-reverse constrained IVA using multivariate Gaussian distribution (ar-cIVA-G) [19]
  11. Threshold-free constrained IVA using multivariate Gaussian distribution (tf-cIVA-G) [19]
  12. IVA with the multivariate generalized Gaussian distribution (MGGD) (IVA-MGGD) [20]

IVA-G [1,2,3]

This is an algorithm that uses second-order statistics (a multivariate Gaussian model) and a decoupling trick to estimate demixing matrices by minimizing mutual information, and does not constrain the demixing matrices to be orthogonal as in MCCA. If the data is complex-valued then the pseudo-covariance matrix is included in the mutual information measure. It ignores sample-to-sample dependence and higher-order statistics. It generalizes MCCA and has more flexible identification conditions compared with MCCA.

IVA-L [4,5,6]

This is an algorithm that uses higher-order statistics to estimate demixing matrices by minimizing mutual information assuming a second-order uncorrelated multivariate Laplace prior. It ignores sample-to-sample dependence and second-order statistics. IVA-L is implemented using a matrix relative gradient and IVA-L-Decp using a vector gradient via a decoupling method. The demixing matrix is not constrained to be orthogonal.

IVA-L-SOS [7]

Unlike the regular IVA-L algorithm introduced in [4,5,6], which assumes no second-order correlation, IVA-L-SOS takes both second- and higher-order statistics into account and assumes the sources are multivariate Laplacian distributed [11]. Hence, it can effectively align sources across the datasets in cases where IVA-L typically fails. This algorithm uses a simplified formulation of the score function that significantly reduces the effect of a large number of datasets, as compared with IVA-GGD with the shape parameter set to 0.5, i.e., that of the Laplacian in the GGD formulation.

Constrained IVA-L-SOS [17,18]

Constrained IVA-L-SOS incorporates prior information about the data into the IVA cost function by allowing a fixed, user-defined minimum degree of similarity between the estimate and reference signal [17]. Adaptive constrained IVA-L-SOS incorporates prior information about the data into the IVA cost function. Unlike constrained IVA-L-SOS, it uses an adaptive tuning mechanism to control the effect of reference information on the estimates, thus reducing the effects of incorrect priors [18].
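
The idea of a user-defined minimum degree of similarity can be illustrated with a short Python sketch. It assumes that similarity is measured as the absolute Pearson correlation between an estimated source and its reference, and that the threshold is a scalar called rho; both the measure and the names are assumptions for illustration and may differ from what the toolbox implements.

```python
import math


def abs_correlation(estimate, reference):
    """Absolute Pearson correlation of two equal-length sequences."""
    n = len(estimate)
    if n != len(reference) or n < 2:
        raise ValueError("sequences must have equal length >= 2")
    mean_e = sum(estimate) / n
    mean_r = sum(reference) / n
    cov = sum((e - mean_e) * (r - mean_r)
              for e, r in zip(estimate, reference))
    var_e = sum((e - mean_e) ** 2 for e in estimate)
    var_r = sum((r - mean_r) ** 2 for r in reference)
    if var_e == 0 or var_r == 0:
        raise ValueError("sequences must not be constant")
    return abs(cov) / math.sqrt(var_e * var_r)


def satisfies_constraint(estimate, reference, rho):
    """True if the estimate is at least rho-similar to the reference.

    rho is the hypothetical fixed threshold chosen by the user.
    """
    return abs_correlation(estimate, reference) >= rho


if __name__ == "__main__":
    y = [1.0, 2.0, 3.0, 4.0]   # toy estimated source
    r = [1.0, 3.0, 2.0, 4.0]   # toy reference signal
    print(abs_correlation(y, r))
    print(satisfies_constraint(y, r, rho=0.7))
```

With a fixed rho, a threshold set too high for an inaccurate reference cannot be met by a good estimate, which is the situation the adaptive variant is meant to handle.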

JDIAG-SOS [8,9]

This is an IVA algorithm that uses symmetric orthogonal joint diagonalization of covariance matrices based on multiple datasets. It is the only IVA algorithm that can explicitly exploit sample-to-sample dependence to improve source separation. It ignores higher-order statistics.

JDIAG-CUM4 [8,9]

This is an IVA algorithm that uses symmetric orthogonal joint diagonalization of fourth-order cumulants. Sample-to-sample dependence and second-order statistical information are not taken into account.

IVA-GGD [10]

The multivariate generalized Gaussian distribution (MGGD) provides an effective yet simple density model. Statistical information of all orders is taken into account to estimate non-orthogonal (i.e., unconstrained) demixing matrices by minimizing mutual information. This algorithm was initially introduced as IVA with a multivariate power exponential prior (IVA-MPE). For efficiency, the algorithm uses a user-defined set of shape parameters and selects the best estimate rather than updating the shape parameters.

IVA-A-GGD [11,12]

IVA-A-GGD is also based on the MGGD model but estimates both the shape parameter and the scatter matrix, taking both SOS and HOS into account. Two versions are provided: the first is based on a Fisher scoring (FS) approach [11] (IVA-A-GGD-MLFS) and the second on fixed-point (FP) updates [12] (IVA-A-GGD-RAFP).

IVA-CMGGD [13,14]

This is an algorithm that uses complex-valued statistics of all orders and a decoupling trick to estimate non-orthogonal demixing matrices by minimizing the mutual information among estimated sources, using a CMGGD adapted to each source from the given data [13,14]. Sample-to-sample dependence is ignored. In addition to optimization using steepest descent, BFGS and limited-memory BFGS are available to accelerate convergence for larger datasets.

MCCA [15,16]

These algorithms implement the generalization of canonical correlation analysis (CCA) to multiple datasets, multiset CCA (MCCA) [15]. The implementation estimates an orthogonal matrix using a deflationary approach. Sample-to-sample dependence and higher-order statistical information are ignored.

ar-cIVA-G [19]

The ar-cIVA-G incorporates prior information about the sources into the IVA cost function. By alternating between a conservative scheme and an assertive scheme, ar-cIVA-G optimally controls the effect of each reference on the corresponding estimated source. There is no need for users to specify the degree of similarity between the estimate and the reference signal.

tf-cIVA-G [19]

The tf-cIVA-G utilizes the references as regularization for the IVA cost function. Both the similarity between the reference and the corresponding source and the (dis)similarity between that reference and the other sources are exploited. tf-cIVA-G eliminates the need for constraint-threshold selection.

IVA-MGGD [20]

Implementation of IVA with the multivariate generalized Gaussian distribution (MGGD). Compared with IVA-GGD [10] and IVA-A-GGD [11, 12], the most important difference is that the shape parameter is not estimated but can be specified by the user via varargin. By default, IVA-MGGD uses beta=0.5, which assumes the sources follow a multivariate Laplacian distribution. With beta=1 the distribution is Gaussian; beta<1 gives a super-Gaussian distribution and beta>1 a sub-Gaussian one. Other updates include:
  • Cost function and gradients are computed using the dispersion matrix instead of the covariance matrix. This ensures that the implementation is consistent with the algorithm's derivation. Only in the Gaussian case (beta=1) does the covariance matrix equal the dispersion matrix. In all other cases, the covariance matrix is a scaled version of the dispersion matrix
  • IVA-MGGD implements the general case for any beta>0. The Kotz constant (the scale factor between the covariance matrix and the dispersion matrix) is computed in the log domain, avoiding numerical issues that can occur when K goes to infinity or beta goes to 0
  • Optimization and simplification:
    1. rewrite the code for computing the cost function: the comp_l_sos_cost function is too lengthy and hard to read. An important fix is the use of the logdet function, which avoids numerical issues when computing the logarithm of the determinant of covariance matrices
    2. rewrite the code for the whiten function: the new code, named "pca_whitening", optimizes the computation of the whitening transformation by using matrix-vector multiplication/division instead of matrix-matrix multiplication/division
    3. rewrite the code for the bss_isi function: the new code, named "joint_isi", simplifies the computation of the Amari joint ISI by only considering the most common case: 3-D matrices. Vectorizing over the columns/rows of the global mixing-demixing matrix is significantly faster than the previous code, which uses a for loop. In addition, the new implementation enables partial joint ISI by allowing users to specify the number of components used to compute the partial metric
    4. simplify the code for decouple_trick function: the new code only includes the method for the QR recursive algorithm
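
The log-domain computation of the covariance-to-dispersion scale factor can be sketched in Python as follows. The sketch assumes the common MGGD parametrization with density proportional to exp(-0.5 * (x' * inv(Sigma) * x)^beta), for which covariance = c * Sigma with c = 2^(1/beta) * Gamma((K+2)/(2*beta)) / (K * Gamma(K/(2*beta))), where K is taken to be the dimension of the MGGD. Whether this matches the toolbox's Kotz constant exactly is an assumption, and the function name is hypothetical.

```python
import math


def log_cov_scale(beta, K):
    """Log of c, where covariance = c * dispersion (assumed MGGD form).

    Working with log-gamma avoids overflow of Gamma(.) when K is large
    or beta is close to 0.
    """
    if beta <= 0 or K < 1:
        raise ValueError("require beta > 0 and K >= 1")
    return (math.log(2.0) / beta
            + math.lgamma((K + 2) / (2.0 * beta))
            - math.log(K)
            - math.lgamma(K / (2.0 * beta)))


if __name__ == "__main__":
    # Gaussian case: covariance equals dispersion, so log c is 0.
    print(log_cov_scale(1.0, 3))
    # Small beta: Gamma(600) would overflow a double, the log does not.
    print(log_cov_scale(0.01, 10))
```

Under this parametrization the Gaussian case beta=1 gives c=1 for any K, consistent with the statement that the two matrices coincide only when beta=1.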


References:

[1] M. Anderson, X.-L. Li, & T. Adali, "Nonorthogonal Independent Vector Analysis Using Multivariate Gaussian Model," International Conference on Latent Variable Analysis and Signal Separation, Springer Berlin / Heidelberg, 2010, 6365, 354-361
[2] M. Anderson, T. Adali, & X.-L. Li, "Joint Blind Source Separation with Multivariate Gaussian Model: Algorithms and Performance Analysis," IEEE Trans. Signal Process., 2012, 60, 1672-1683
[3] M. Anderson, X.-L. Li, & T. Adali, "Complex-valued Independent Vector Analysis: Application to Multivariate Gaussian Model," Signal Process., 2012, 1821-1831
[4] T. Kim, I. Lee, & T.-W. Lee, "Independent Vector Analysis: Definition and Algorithms," Proc. of 40th Asilomar Conference on Signals, Systems, and Computers, 2006, 1393-1396
[5] T. Kim, T. Eltoft, & T.-W. Lee, "Independent Vector Analysis: an extension of ICA to multivariate components," Lecture Notes in Computer Science: Independent Component Analysis and Blind Signal Separation, Springer Berlin / Heidelberg, 2006, 3889, 165-172
[6] T. Kim, H. T. Attias, S.-Y. Lee, & T.-W. Lee, "Blind Source Separation Exploiting Higher-Order Frequency Dependencies," IEEE Trans. Audio Speech Lang. Process., 2007, 15, 70-79
[7] S. Bhinge, R. Mowakeaa, V.D. Calhoun, T. Adalı, "Extraction of time-varying spatio-temporal networks using parameter-tuned constrained IVA." IEEE Transactions on Medical Imaging, 2019, vol. 38, no. 7, 1715-1725
[8] X.-L. Li, T. Adali, & M. Anderson, "Joint Blind Source Separation by Generalized Joint Diagonalization of Cumulant Matrices," Signal Process., 2011, 91, 2314-2322
[9] X.-L. Li, M. Anderson, & T. Adali, "Second and Higher-Order Correlation Analysis of Multiset Multidimensional Variables by Joint Diagonalization," Lecture Notes in Computer Science: Independent Component Analysis and Blind Signal Separation, Latent Variable Analysis and Signal Separation, Springer Berlin / Heidelberg, 2010, 6365, 197-204
[10] M. Anderson, G.-S. Fu, R. Phlypo, and T. Adali, "Independent Vector Analysis: Identification Conditions and Performance Bounds," IEEE Trans. Signal Processing, vol. 62, no. 17, pp. 4399--4410, Sep. 2014.
[11] Z. Boukouvalas, G.-S. Fu, and T. Adali, "An efficient multivariate generalized Gaussian distribution estimator: Application to IVA," in Proc. Conf. on Info. Sciences and Systems (CISS), Baltimore, MD, March 2015.
[12] Z. Boukouvalas, S. Said, L. Bombrun, Y. Berthoumieu, and T. Adali, "A new Riemannian averaged fixed-point algorithm for MGGD parameter estimation," IEEE Signal Proc. Letts., vol. 22, no. 12, pp. 2314-2318, Dec. 2015.
[13] R. Mowakeaa, Z. Boukouvalas, Q. Long, & T. Adali, "IVA using complex multivariate GGD: Application to fMRI analysis," Multidimensional Systems and Signal Processing, Springer, 2019, 1-20
[14] R. Mowakeaa, Z. Boukouvalas, T. Adali, & C. Cavalcante, "On the characterization, generation, and efficient estimation of the complex multivariate GGD," IEEE Sensor Array and Multichannel Signal Processing Workshop, 2016, 1-5
[15] J. R. Kettenring, "Canonical analysis of several sets of variables," Biometrika, 1971, 58, 433-451
[16] Y.-O. Li, T. Adali, W. Wang, V. D. Calhoun, "Joint Blind Source Separation by Multiset Canonical Correlation Analysis," IEEE Trans. Signal Process., 2009, 57, 3918-3929
[17] S. Bhinge, Q. Long, Y. Levin-Schwartz, Z. Boukouvalas, V. D. Calhoun, & T. Adali, "Non-orthogonal constrained independent vector analysis: Application to data fusion," In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pp. 2666-2670. IEEE, 2017
[18] S. Bhinge, R. Mowakeaa, V. D. Calhoun, & T. Adali, "Extraction of time-varying spatiotemporal networks using parameter-tuned constrained IVA," IEEE Transactions on Medical Imaging 38, no. 7 2019, 1715-1725.
[19] V. Trung, F. Laport, H. Yang, V. D. Calhoun, & T. Adali, "Constrained independent vector analysis with reference for multi-subject fMRI analysis," IEEE Transactions on Biomedical Engineering, vol. 1, no. 1, pp. 1--12, July, 2024.
[20] Erdem, "XXXX," XXXX, vol. 1, no. 1, pp. 1--12, July, 2024.