Tensors have drawn growing attention in many applications, such as physics, engineering science, social networks, and recommender systems. Tensor decomposition is the key to exploring the intrinsic relationships within tensor data, and it involves many sparse tensor and vector multiplications (SpTV). We analyze a variety of storage formats for sparse tensors and develop a piecewise compression strategy that improves the storage efficiency of large sparse tensors: by avoiding the storage of the many empty slices and empty fibers in a sparse tensor, it significantly reduces the required space. On top of this slice-based high-order compressed format, we design a parallel SpTV algorithm that greatly improves computing performance on the graphics processing unit (GPU). Each tensor is cut into multiple slices, turning the SpTV into a series of sparse matrix and vector multiplications that are executed in pipelined parallel; the transmission time of the slices can be hidden by the pipeline, further optimizing SpTV performance.
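The core reduction described above, slicing a sparse tensor so that a tensor-vector product becomes a sequence of independent sparse matrix-vector multiplications, can be sketched as follows. This is an illustrative sketch only, not the paper's implementation: the shapes, the per-slice CSR storage, and the mode along which the product is taken are assumptions, and the GPU pipelining of slice transfers is not modeled here.

```python
import numpy as np
from scipy.sparse import random as sparse_random

# A 3-way sparse tensor X of shape I x J x K, stored piecewise as one
# sparse CSR matrix per frontal slice X[i, :, :].  A slice that is
# entirely empty could simply be omitted from such a piecewise store.
I, J, K = 4, 5, 6
slices = [sparse_random(J, K, density=0.2, format="csr", random_state=i)
          for i in range(I)]

rng = np.random.default_rng(0)
v = rng.standard_normal(K)

# SpTV along the last mode: Y[i, j] = sum_k X[i, j, k] * v[k].
# Each row of Y is one independent SpMV -- these are the units a GPU
# pipeline could overlap with the transfer of the next slice.
Y = np.stack([Xi @ v for Xi in slices])

# Sanity check against a dense einsum of the same contraction.
X_dense = np.stack([Xi.toarray() for Xi in slices])
assert np.allclose(Y, np.einsum("ijk,k->ij", X_dense, v))
print(Y.shape)  # (4, 5)
```

Because the per-slice SpMVs are independent, overlapping the host-to-device copy of slice i+1 with the multiplication of slice i (e.g., via CUDA streams) is what lets the transmission time be hidden, as the abstract describes.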
ACM Transactions on Knowledge Discovery from Data (TKDD) – Association for Computing Machinery
Published: Nov 11, 2019
Keywords: Pipeline