An Effective Tensor Regression with Latent Sparse Regularization

Chen, Ko-shin; Xu, Tingyang; Liang, Guannan; Tong, Qianqian; Song, Minghu; Bi, Jinbo

doi:10.6339/22-JDS1048

Journal of Data Science

Volume 20, Issue 2 (2022), pp. 228–252

Pub. online: 9 May 2022. Type: Statistical Data Science

Open Access

Received: 31 December 2021

Accepted: 10 April 2022

Published: 9 May 2022

Abstract

As data acquisition technologies advance, longitudinal analysis faces the challenges of exploring complex feature patterns in high-dimensional data and of modeling potential temporally lagged effects of features on a response. We propose a tensor-based model to analyze multidimensional data. It simultaneously discovers patterns in features and reveals whether features observed at past time points have an impact on current outcomes. The model coefficient, a k-mode tensor, is decomposed into a sum of k tensors of the same dimension. We introduce a so-called latent F-1 norm that can be applied to the coefficient tensor to perform structured selection of features. Specifically, features are selected along each mode of the tensor. The proposed model accounts for within-subject correlations by employing a tensor-based quadratic inference function. An asymptotic analysis shows that our model can identify the true support as the sample size approaches infinity. To solve the corresponding optimization problem, we develop a linearized block coordinate descent algorithm and prove its convergence for a fixed sample size. Computational results on synthetic datasets and real-life fMRI and EEG datasets demonstrate the superior performance of the proposed approach over existing techniques.
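The abstract does not spell out the latent F-1 norm, so the following is only a minimal NumPy sketch under stated assumptions, not the authors' implementation. It assumes that the F-1 norm along a mode is the sum of the Frobenius norms of the slices obtained by fixing an index along that mode (a group-lasso-style penalty), and that the m-th component of the decomposition is penalized along mode m. The function names f1_norm, latent_f1_penalty, and prox_f1 are hypothetical. Note that latent_f1_penalty evaluates the penalty for one given split of the coefficient tensor; it does not search over all possible splits. The group soft-thresholding step in prox_f1 is the standard proximal operator for such a penalty and is the kind of update a linearized block coordinate descent scheme could apply to each component in turn.

```python
import numpy as np


def f1_norm(tensor, mode):
    """Sum of Frobenius norms of the slices taken along `mode` (assumed definition)."""
    # Move `mode` to the front and flatten the rest, so row i is the i-th slice.
    unfolded = np.moveaxis(tensor, mode, 0).reshape(tensor.shape[mode], -1)
    return float(np.linalg.norm(unfolded, axis=1).sum())


def latent_f1_penalty(components):
    """Penalty for one given split W = sum of components.

    Assumption: component m is penalized along mode m.
    """
    return sum(f1_norm(w, m) for m, w in enumerate(components))


def prox_f1(tensor, mode, threshold):
    """Group soft-thresholding: shrink every slice along `mode` toward zero.

    Slices whose Frobenius norm is at most `threshold` become exactly zero,
    which is how whole slices (features) get deselected.
    """
    moved = np.moveaxis(tensor, mode, 0)
    flat = moved.reshape(moved.shape[0], -1)
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    # Guard against division by zero for slices that are already zero.
    scale = np.maximum(0.0, 1.0 - threshold / np.maximum(norms, 1e-12))
    return np.moveaxis((flat * scale).reshape(moved.shape), 0, mode)
```

For a 2-by-3 array whose first row is (3, 4, 0) and whose second row is zero, f1_norm along mode 0 sums the row norms, while f1_norm along mode 1 sums the column norms, so the two modes generally give different values. Applying prox_f1 along mode 0 rescales the first row and leaves the zero row at zero.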

Supplementary material

The code and data can be found at https://doi.org/10.6084/m9.figshare.19166474.v1. For data generation, we provide DataGenerator.py, which generates synthetic data including training and test sets. For model fitting, we provide tensorQIF_model_Tensorflow_v2.py to run the models and ReportGenerator.py to report on performance. For experimental comparisons, we provide Granger_model.py, GEE_model.m, and Kruskal_model.m.

2022 The Author(s). Published by the School of Statistics and the Center for Applied Statistics, Renmin University of China.

Open access article under the CC BY license.

Keywords

longitudinal data; quadratic inference function; tensors

Funding

This work was partially supported by the U.S. National Science Foundation under grant IIS-1718738 and by the National Institutes of Health under grants R01DA051922, K02DA043063, and R01MH119678 to J Bi.
