Dear all,
After receiving more answers to the following question, I am reposting an
improved summary.
"I am looking for distributed or parallel algorithms for Machine Learning.
I am especially interested in surveys (apart from that by Peter Stone), and
in parallel / distributed versions of standard algorithms (from decision
trees, Bayesian approaches or rule learners to neural networks, SVMs, etc)."
Thank you to all who replied to my messages, in particular: Hillol
Kargupta, Alex Freitas, Rui Camacho, Balazs Kegl, Claudia Antunes, Jiri
Ocenasek, Jose Carlos Cortizo and Nitesh Chawla. They have suggested the
following references:
BOOKS ----------------------------------------------------- *** ADDITIONS
Yike Guo and Robert Grossman, editors, High Performance Data Mining:
Scaling Algorithms, Applications and Systems, Kluwer Academic Publishers, 1999.
A.A. Freitas and S.H. Lavington. Mining Very Large Databases with Parallel
Processing. Kluwer, 1998.
http://www.cs.kent.ac.uk/people/staff/aaf/book-springer-ukc.html
Mohammed J. Zaki, Ching-Tien Ho (Eds). Large-Scale Parallel Data Mining.
Springer-Verlag GmbH, Lecture Notes in Computer Science, 1759 / 2000.
Advances in Distributed and Parallel Knowledge Discovery. Edited by Hillol
Kargupta and Philip Chan
MIT/AAAI Press
http://www.aaai.org/Press/Books/Kargupta1/kargupta1.html
Data Mining: Next Generation Challenges and Future Directions. Edited by H.
Kargupta, A. Joshi, K. Sivakumar, and Y. Yesha
MIT/AAAI Press
http://www.cs.umbc.edu/~hillol/Kargupta/ngdmbook.html
( *This book has several chapters on DDM* )
PAPERS ---------------------------------------------------- *** ADDITIONS
A.A. Freitas. A Survey of Parallel Data Mining. Proc. 2nd Int. Conf. on the
Practical Applications of Knowledge Discovery and Data Mining, 287-300.
London: The Practical Application Company, Mar. 1998.
http://www.cs.kent.ac.uk/people/staff/aaf/my-publications-ukc.html
Nuno Fonseca, Fernando Silva, Rui Camacho. Strategies to parallelize ILP
systems. In Proceedings of ILP 2005.
http://ilp2005.in.tum.de/accepted-papers.html
* Thank you for sending me the draft
S. Gambs, B. Kégl, and E. Aïmeur. "Privacy-preserving boosting" Data Mining
and Knowledge Discovery, 2005 (submitted).
http://www.iro.umontreal.ca/~kegl/research/publications/
* This covers AdaBoost; the authors suggest looking at the references
[Lazarevic and Obradovic, 2002] and [Fan et al., 1999] for AdaBoost, and
[Lindell and Pinkas, 2002] for trees.
Mohammed J. Zaki, "Parallel and Distributed Association Mining: A Survey",
IEEE Concurrency, special issue on Parallel Mechanisms for Data Mining,
Vol. 7, No. 4, pp. 14-25, December 1999.
* Good intro in terms of pattern mining / association rules
Ocenasek, J., Schwarz, J., Pelikan, M.: Design of Multithreaded Estimation
of Distribution Algorithms. In: Cantú-Paz et al. (Eds.): Genetic and
Evolutionary Computation Conference - GECCO 2003. Springer Verlag: Berlin,
2003, pp. 1247-1258.
* More papers, theses and code by Jiri Ocenasek (http://jiri.ocenasek.com/).
"Learning Ensembles from Bites: A Scalable and Accurate Approach," Nitesh
V. Chawla, Lawrence O. Hall, Kevin W. Bowyer, W. Philip Kegelmeyer, Journal
of Machine Learning Research (JMLR), 5(Apr):421--451, 2004.
“Distributed Learning with Bagging-like Performance,” Chawla, N.V., Moore,
T.E., Hall, L.O., Bowyer, K.W., Kegelmeyer, W.P., Springer, C., Pattern
Recognition Letters, 24 (1-3) (2003), 455--471.
BIBLIOGRAPHIES -----------------------------------------
Online distributed data mining bibliography:
http://www.cs.umbc.edu/~hillol/DDMBIB/
Again, thank you all. I will improve the summary if I get more replies.
Best regards
Jose Maria Gomez Hidalgo
Departamento de Sistemas Informáticos
Universidad Europea de Madrid
28670 - Villaviciosa de Odon - MADRID
(+34) 912115670
jmgomez@...
http://www.esp.uem.es/~jmgomez/
http://www.esp.uem.es