Abstract
Ensemble learning is a well-established method for improving the generalization performance of learning machines. The idea is to combine a number of learning systems that have been trained on the same task. However, because all the members of the ensemble must operate at the same time, large amounts of memory and long execution times are needed, which limits the practical applicability of the approach. This paper presents a new method (called local averaging) in the context of nearest neighbor (NN) classifiers that generates, from the ensemble, a single classifier with the same complexity as the individual members. Once a collection of prototypes has been generated from different learning sessions using Kohonen's LVQ algorithm, a single set of prototypes is computed by applying a clustering algorithm (such as K-means) to this collection. Local averaging can be viewed either as a technique for reducing the variance of the prototypes or as the result of averaging a series of particular bootstrap replicates. Experimental results on several classification problems confirm the utility of the method and show that local averaging can compute a single classifier whose accuracy is similar to (or even better than) that of ensembles generated with voting.
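The following is a minimal sketch, in Python, of the procedure outlined in the abstract: several LVQ-trained prototype sets are obtained from bootstrap replicates, pooled, and reduced with K-means to a single prototype set used by a 1-NN classifier. The simplified LVQ1 update loop, the per-class application of K-means, and all function names (e.g. `train_lvq1`, `local_averaging`) are illustrative assumptions, not the authors' implementation; NumPy and scikit-learn's `KMeans` are assumed to be available.

```python
# Illustrative sketch of local averaging for LVQ-based NN classifiers.
# Assumptions: simplified LVQ1 training, K-means applied per class to the
# pooled prototypes, and enough samples of every class in each replicate.
import numpy as np
from sklearn.cluster import KMeans

def train_lvq1(X, y, prototypes, proto_labels, lr=0.05, epochs=20, seed=0):
    """Very simplified LVQ1: move the nearest prototype toward a sample
    when their class labels agree, away from it when they disagree."""
    rng = np.random.default_rng(seed)
    P = prototypes.copy()
    for _ in range(epochs):
        for i in rng.permutation(len(X)):
            j = np.argmin(np.linalg.norm(P - X[i], axis=1))
            sign = 1.0 if proto_labels[j] == y[i] else -1.0
            P[j] += sign * lr * (X[i] - P[j])
    return P

def local_averaging(X, y, n_members=10, protos_per_class=4, seed=0):
    """Pool prototypes from several LVQ learning sessions (one per
    bootstrap replicate) and reduce them to a single set via K-means."""
    rng = np.random.default_rng(seed)
    classes = np.unique(y)
    pooled = {c: [] for c in classes}
    for m in range(n_members):
        idx = rng.integers(0, len(X), len(X))      # bootstrap replicate
        Xb, yb = X[idx], y[idx]
        init = np.vstack([Xb[yb == c][:protos_per_class] for c in classes])
        proto_labels = np.repeat(classes, protos_per_class)
        P = train_lvq1(Xb, yb, init, proto_labels, seed=m)
        for c in classes:
            pooled[c].append(P[proto_labels == c])
    final_P, final_labels = [], []
    for c in classes:
        # Local averaging step: cluster the pooled prototypes of class c.
        km = KMeans(n_clusters=protos_per_class, n_init=10,
                    random_state=seed).fit(np.vstack(pooled[c]))
        final_P.append(km.cluster_centers_)
        final_labels.append(np.full(protos_per_class, c))
    return np.vstack(final_P), np.concatenate(final_labels)

def predict_1nn(P, proto_labels, X):
    """Classify each row of X by the label of its nearest prototype."""
    d = np.linalg.norm(X[:, None, :] - P[None, :, :], axis=2)
    return proto_labels[np.argmin(d, axis=1)]
```

The resulting classifier has the same number of prototypes as a single ensemble member, so its memory footprint and classification time match those of one member rather than of the whole ensemble, which is the point of the method.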
Cite this article
Bermejo, S., Cabestany, J. Local Averaging of Ensembles of LVQ-Based Nearest Neighbor Classifiers. Applied Intelligence 20, 47–58 (2004). https://doi.org/10.1023/B:APIN.0000011141.25306.26