Enhancing Clustering Quality through Landmark-Based Dimensionality Reduction

Panagis Magdalinos, Athens University of Economics and Business
Christos Doulkeridis, Norwegian University of Science and Technology
Michalis Vazirgiannis, Athens University of Economics and Business

Scaling up data mining algorithms for data of both high dimensionality and high cardinality has lately been recognized as one of the most challenging problems in data mining research. The reason is that typical data mining tasks, such as clustering, cannot produce high-quality results when applied to high-dimensional and/or large (in terms of cardinality) datasets. Data preprocessing, and in particular dimensionality reduction, is a promising tool for dealing with this problem. However, most existing dimensionality reduction algorithms share the same disadvantages as data mining algorithms when applied to large datasets of high dimensionality. In this article, we propose a fast and efficient dimensionality reduction algorithm (FEDRA), which is particularly scalable and therefore suitable for challenging datasets. FEDRA follows the landmark-based paradigm for embedding data objects in a low-dimensional projection space. By means of a theoretical analysis, we prove that FEDRA is efficient, while we demonstrate the achieved quality of results
ACM Transactions on Knowledge Discovery from Data (TKDD) – Association for Computing Machinery
Published: Feb 1, 2011
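
The abstract names the landmark-based paradigm but does not spell out FEDRA's own steps, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes Euclidean distance, randomly chosen landmarks, and a plain distance-to-landmark embedding; the function names `euclidean` and `landmark_embed` are hypothetical and chosen for illustration.

```python
import math
import random


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def landmark_embed(points, k, seed=0):
    """Map each point to its vector of distances to k landmarks.

    Assumption: landmarks are sampled uniformly at random from the data.
    This is a generic landmark embedding, not the FEDRA procedure.
    """
    rng = random.Random(seed)
    landmarks = rng.sample(points, k)
    return [[euclidean(p, l) for l in landmarks] for p in points]


# Toy data: 20 points in 50 dimensions, reduced to 3 dimensions.
data_rng = random.Random(1)
data = [[data_rng.random() for _ in range(50)] for _ in range(20)]
embedded = landmark_embed(data, k=3)
print(len(embedded), len(embedded[0]))  # prints "20 3"
```

In this sketch each output coordinate is a distance to one landmark, so by the triangle inequality two points can never differ in any single coordinate by more than their original distance. The cost is one distance computation per point per landmark, which is the kind of scalability that landmark-based methods generally aim for.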