MLSP

Blind Source Separation and Data Fusion

As data now come in a multitude of forms originating from different applications and environments, it has become apparent that we can no longer rely on many of the usual simplifying assumptions, such as stationarity, linearity, Gaussianity, and circularity, when developing methods for analyzing such data. Our lab's research emphasizes the development of data-driven methods, such as independent component analysis (ICA), blind source separation, and canonical dependence analysis (CDA), that minimize the modeling assumptions on the data. These methods achieve useful decompositions of multimodal data that can be used either as informative features or directly for inference.
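As an illustration of the kind of data-driven decomposition described above, the following is a minimal sketch of ICA-based blind source separation in Python with NumPy. It uses a FastICA-style fixed-point iteration with a tanh nonlinearity, which is a common textbook algorithm and not necessarily one of the lab's own methods. The function names (`whiten`, `sym_decorrelate`, `fastica`), the synthetic sources, and the mixing matrix `A` are all assumptions made for this example.

```python
import numpy as np

def whiten(X):
    """Center X (channels x samples) and transform it to identity covariance."""
    Xc = X - X.mean(axis=1, keepdims=True)
    cov = Xc @ Xc.T / Xc.shape[1]
    eigvals, eigvecs = np.linalg.eigh(cov)
    V = eigvecs @ np.diag(eigvals ** -0.5) @ eigvecs.T  # symmetric whitening matrix
    return V @ Xc, V

def sym_decorrelate(W):
    """Symmetric orthogonalization: W <- (W W^T)^(-1/2) W."""
    s, u = np.linalg.eigh(W @ W.T)
    return u @ np.diag(s ** -0.5) @ u.T @ W

def fastica(X, n_iter=200, tol=1e-8, seed=0):
    """Estimate independent sources from mixtures X (channels x samples).

    Returns the estimated sources and the demixing matrix that maps the
    centered observations to those sources.
    """
    rng = np.random.default_rng(seed)
    Z, V = whiten(X)
    n = Z.shape[0]
    W = sym_decorrelate(rng.standard_normal((n, n)))
    for _ in range(n_iter):
        Y = W @ Z
        G = np.tanh(Y)            # nonlinearity g(y)
        G_prime = 1.0 - G ** 2    # its derivative g'(y)
        # Fixed-point update for all rows at once, then re-orthogonalize.
        W_new = G @ Z.T / Z.shape[1] - np.diag(G_prime.mean(axis=1)) @ W
        W_new = sym_decorrelate(W_new)
        # Converged when each new row is parallel to the corresponding old row.
        delta = np.max(np.abs(np.abs(np.diag(W_new @ W.T)) - 1.0))
        W = W_new
        if delta < tol:
            break
    return W @ Z, W @ V

# Hypothetical example: two non-Gaussian sources mixed by an arbitrary matrix.
rng = np.random.default_rng(1)
n_samples = 5000
S = np.vstack([rng.uniform(-1, 1, n_samples),   # sub-Gaussian source
               rng.laplace(size=n_samples)])    # super-Gaussian source
A = np.array([[1.0, 0.5],
              [0.6, 1.0]])                      # assumed mixing matrix
X = A @ S

S_hat, W_demix = fastica(X)

# Absolute correlation between each true source (rows) and each estimate (columns).
corr = np.abs(np.corrcoef(np.vstack([S, S_hat]))[:2, 2:])
print(np.round(corr, 3))
```

ICA recovers the sources only up to permutation, sign, and scale, which is why the check compares absolute correlations rather than the signals themselves. The sketch also assumes as many observed channels as sources and a full-rank mixing matrix.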

Key references:

T. Adalı, F. Kantar, M. A. B. S. Akhonda, S. Strother, V. D. Calhoun, and E. Acar, "Reproducibility in matrix and tensor decompositions: Focus on model match, interpretability, and uniqueness," IEEE Signal Processing Magazine, vol. 39, no. 4, pp. 8-24, July 2022.
We identify critical issues for guaranteeing the reproducibility of solutions in unsupervised matrix and tensor factorizations, with an applied focus, considering the practical case where there is no ground truth. While simulation results can easily demonstrate the advantages of a given model and support a given theoretical development, when the model parameters are not known (the case in most practical problems), their estimation and performance evaluation is a difficult problem. In this article, we review the currently rather limited literature on the topic, discuss the proposed solutions, make suggestions based on those, and identify topics that require attention and further research.
D. Lahat, T. Adali and C. Jutten, "Multimodal data fusion: An overview of methods, challenges, and prospects," Proc. IEEE, vol. 103, no. 9, pp. 1449-1477, Sep. 2015.
This paper provides an overview of the main challenges in multimodal data fusion across various disciplines and addresses two key issues: "why we need data fusion" and "how we perform it."
T. Adali, Y. Levin-Schwartz, and V. D. Calhoun, "Multimodal data fusion using source separation: Two effective models based on ICA and IVA and their properties," Proc. IEEE, vol. 103, no. 9, pp. 1478-1493, Sep. 2015.
This paper introduces two powerful data-driven models and provides guidance on the selection of a given model and its implementation while emphasizing the general applicability of the two models.
T. Adali, Y. Levin-Schwartz, and V. D. Calhoun, "Multimodal data fusion using source separation: Application to medical imaging," Proc. IEEE, vol. 103, no. 9, pp. 1494-1506, Sep. 2015.
This paper demonstrates the application of the two models introduced in the previous paper to the fusion of medical imaging data from three modalities: functional magnetic resonance imaging (fMRI), structural MRI, and electroencephalography (EEG). It also discusses the tradeoffs in various modeling and parameter choices.
T. Adali, M. Anderson, and G.-S. Fu, "Diversity in independent component and vector analyses: Identifiability, algorithms, and applications in medical imaging," IEEE Signal Processing Magazine, vol. 31, no. 3, pp. 18-33, May 2014.
In this overview article, we first present ICA and then its generalization to multiple data sets, IVA, both using mutual information rate. We present conditions for the identifiability of the given linear mixing model and derive the performance bounds. We address how various methods fall under this umbrella and give examples of performance for a few sample algorithms compared with the performance bound. We then discuss the importance of approaching the performance bound depending on the goal, and use medical image analysis as the motivating example.
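
For orientation, the linear mixing models behind ICA and IVA can be written as below. The notation (N sources, K data sets, bracketed superscripts for the data set index) follows common usage and is an assumption here, not something taken from this page.

```latex
% ICA: one data set, N latent sources mixed by an unknown invertible matrix A
\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t),
\qquad \mathbf{x}(t),\, \mathbf{s}(t) \in \mathbb{R}^{N}

% IVA: K data sets, each with its own unknown invertible mixing matrix
\mathbf{x}^{[k]}(t) = \mathbf{A}^{[k]}\,\mathbf{s}^{[k]}(t),
\qquad k = 1, \dots, K

% n-th source component vector (SCV): the n-th source collected across data sets
\mathbf{s}_{n}(t) = \bigl[ s_{n}^{[1]}(t), \dots, s_{n}^{[K]}(t) \bigr]^{\mathsf{T}}
```

In this formulation, the entries within one source component vector may be dependent across data sets, while different source component vectors are assumed independent of each other. With K = 1 the IVA model reduces to ICA.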

Active projects:

Collaborative Research: CISE-ANR: CIF: Small: Learning from Large Datasets - Application to Multi-Subject fMRI Analysis

Funded by NSF CCF, Division of Computing and Communication Foundations (Grant Number: NSF-CCF 2316420)

Dynamic imaging-genomic models for characterizing and predicting psychosis and mood disorders

Funded by NIH-NIMH (Grant Number: R01 MH 118695)

Recent related projects:

CIF: Small: Source Separation with an Adaptive Structure for Multi-Modal Data Fusion

Funded by NSF CCF, Division of Computing and Communication Foundations (Grant Number: NSF-CCF 1618551)

CRCNS: Informed Data-Driven Fusion of Behavior, Brain Function, and Genes

Funded by NIH-NIBIB (Grant Number: R01 EB 005846)

CIF: Small: Collaborative Research: Entropy Rate for Source Separation and Model Selection: Applications in fMRI and EEG Analysis

Funded by NSF-CCF: Award no: 1117056 and Award no: 1116944

Collaborative Research: III: Small: Collaborative Research: Canonical Dependence Analysis for Multi-modal Data Fusion and Source Separation

Funded by NSF-IIS: Award no: 1017718 and Award no: 1016619

Collaborative Research: SEI: Independent Component Analysis of Complex-Valued Brain Imaging Data

Funded by NSF-IIS (Award no: 0612076)

Other selected references:

M. Anderson, G.-S. Fu, R. Phlypo, and T. Adali, "Independent Vector Analysis: Identification Conditions and Performance Bounds," IEEE Trans. Signal Processing, to appear.
M. Anderson, X.-L. Li, and T. Adali, "Joint blind source separation with multivariate Gaussian model: Algorithms and performance analysis," IEEE Trans. Signal Processing, vol. 60, no. 4, pp. 2049-2055, April 2012.

Project team:

Current:

Dr. Tulay Adali

Resources:

MLSP-Lab Resources: Codes for the algorithms we have published.
Fusion ICA Toolbox (FIT): FIT is a MATLAB toolbox that implements the joint ICA and parallel ICA methods. It is used to examine the information shared between features (SPM contrast images, EEG signals, or SNP data).

MLSP-Lab

Machine Learning for Signal Processing Laboratory
