Dear all
Two days ago I posted the following question:
"I am looking for distributed or parallel algorithms for Machine Learning.
I am especially interested in surveys (apart from that by Peter Stone), and
in parallel / distributed versions of standard algorithms (from decision
trees, Bayesian approaches or rule learners to neural networks, SVMs, etc.)."
Thank you to all who replied to my message, in particular: Hillol Kargupta,
Alex Freitas, Rui Camacho, Balazs Kegl, Claudia Antunes, Jiri Ocenasek and
Jose Carlos Cortizo. They suggested the following references:
BOOKS -----------------------------------------------------
Yike Guo and Robert Grossman, editors, High Performance Data Mining:
Scaling Algorithms, Applications and Systems, Kluwer Academic Publishers, 1999.
A.A. Freitas and S.H. Lavington. Mining Very Large Databases with Parallel
Processing. Kluwer, 1998.
http://www.cs.kent.ac.uk/people/staff/aaf/book-springer-ukc.html
Mohammed J. Zaki, Ching-Tien Ho (Eds). Large-Scale Parallel Data Mining.
Springer-Verlag GmbH, Lecture Notes in Computer Science, vol. 1759, 2000.
PAPERS ----------------------------------------------------
A.A. Freitas. A Survey of Parallel Data Mining. Proc. 2nd Int. Conf. on the
Practical Applications of Knowledge Discovery and Data Mining, 287-300.
London: The Practical Application Company, Mar. 1998.
http://www.cs.kent.ac.uk/people/staff/aaf/my-publications-ukc.html
Nuno Fonseca, Fernando Silva, Rui Camacho. Strategies to parallelize ILP
systems. In Proceedings of ILP 2005.
http://ilp2005.in.tum.de/accepted-papers.html
* Thank you for sending me the draft
S. Gambs, B. Kégl, and E. Aïmeur. "Privacy-preserving boosting" Data Mining
and Knowledge Discovery, 2005 (submitted).
http://www.iro.umontreal.ca/~kegl/research/publications/
* This is for AdaBoost, and the authors suggest looking at the references:
[Lazarevic and Obradovic, 2002] and [Fan et al., 1999] for AdaBoost,
[Lindell and Pinkas, 2002] for trees.
Mohammed J. Zaki, "Parallel and Distributed Association Mining: A Survey",
IEEE Concurrency, special issue on Parallel Mechanisms for Data Mining,
Vol. 7, No. 4, pp. 14-25, December 1999.
* Good intro in terms of pattern mining / association rules
Ocenasek, J., Schwarz, J., Pelikan, M.: Design of Multithreaded Estimation
of Distribution Algorithms. In: Cantú-Paz et al. (Eds.): Genetic and
Evolutionary Computation Conference - GECCO 2003. Springer Verlag: Berlin,
2003, pp. 1247-1258.
* More papers, theses, and code by Jiri Ocenasek (http://jiri.ocenasek.com/).
BIBLIOGRAPHIES -----------------------------------------
Online distributed data mining bibliography:
http://www.cs.umbc.edu/~hillol/DDMBIB/
Again, thank you all. I will improve the summary if I get more replies.
Best regards
Jose Maria Gomez Hidalgo
Departamento de Sistemas Informáticos
Universidad Europea de Madrid
28670 - Villaviciosa de Odon - MADRID
(+34) 912115670
jmgomez@...
http://www.esp.uem.es/~jmgomez/
http://www.esp.uem.es
Spanish law guarantees privacy in electronic communications. This
electronic transmission is strictly confidential and intended solely for
the addressee. If you are not the intended addressee, you are kindly
requested not to disclose nor to copy this transmission and to notify us as
soon as possible.