creators_name: Wang, Jun
creators_name: Zucker, Jean-Daniel
editors_name: Langley, Pat
type: confpaper
datestamp: 2002-03-10
lastmod: 2011-03-11 08:54:54
metadata_visibility: show
title: Solving Multiple-Instance Problem: A Lazy Learning Approach
ispublished: pub
subjects: comp-sci-art-intel
subjects: comp-sci-mach-learn
full_text_status: public
keywords: multiple-instance problem, multiple-instance learning, lazy learning, nearest neighbor
abstract: As opposed to traditional supervised learning, multiple-instance learning concerns the problem of classifying a bag of instances, given bags that are labeled by a teacher as being overall positive or negative. Current research mainly concentrates on adapting traditional concept learning to solve this problem. In this paper we investigate the use of lazy learning and Hausdorff distance to approach the multiple-instance problem. We present two variants of the K-nearest neighbor algorithm, called Bayesian-KNN and Citation-KNN, that solve the multiple-instance problem. Experiments on the drug discovery benchmark data show that both algorithms are competitive with the best ones conceived in the concept learning framework. Further work includes exploring a combination of lazy and eager multiple-instance problem classifiers.
date: 2000
date_type: published
publisher: Morgan Kaufmann
pagerange: 1119-1125
refereed: TRUE
referencetext: Aha, D. W. (Ed.). (1997). Lazy Learning. Dordrecht, The Netherlands: Kluwer Academic Publishers.
Auer, P. (1997). On learning from multi-instance examples: Empirical evaluation of a theoretical approach. Proceedings of the Fourteenth International Conference on Machine Learning (pp. 21-29). San Francisco: Morgan Kaufmann.
Bergadano, F., Giordana, A., & Saitta, L. (1991). Machine learning: An integrated framework and its application. Chichester, UK: Ellis Horwood.
Blockeel, H., & De Raedt, L. (1998). Top-down induction of first order logical decision trees. Artificial Intelligence, 101, 285-297.
Blum, A., & Kalai, A. (1998). A note on learning from multiple-instance examples. Machine Learning, 30, 23-29.
Bottou, L., & Vapnik, V. (1992). Local learning algorithms. Neural Computation, 4, 888-900.
Dasarathy, B. V. (1991). Nearest neighbor norms: NN pattern classification techniques. Los Alamitos, CA: IEEE Computer Society Press.
De Raedt, L. (1998). Attribute-value learning versus inductive logic programming: The missing links. Proceedings of the Eighth International Conference on Inductive Logic Programming (pp. 1-8). Springer-Verlag.
Dietterich, T. G., Jain, A., Lathrop, R. H., & Lozano-Pérez, T. (1994). A comparison of dynamic reposing and tangent distance for drug activity prediction. Advances in Neural Information Processing Systems, 6, 216-223. San Mateo: Morgan Kaufmann.
Dietterich, T. G., Lathrop, R. H., & Lozano-Pérez, T. (1997). Solving the multiple-instance problem with axis-parallel rectangles. Artificial Intelligence, 89, 31-71.
Edgar, G. A. (1995). Measure, topology, and fractal geometry (3rd print). Springer-Verlag.
Garfield, E. (1979). Citation indexing: Its theory and application in science, technology, and humanities. New York: John Wiley & Sons.
Long, P. M., & Tan, L. (1996). PAC-learning axis aligned rectangles with respect to product distributions from multiple-instance examples. Proceedings of the Ninth Annual Conference on Computational Learning Theory (pp. 228-234). New York: ACM Press.
Maron, O., & Lozano-Pérez, T. (1998). A framework for multiple-instance learning. Advances in Neural Information Processing Systems, 10. MIT Press.
Maron, O., & Ratan, A. L. (1998). Multiple-instance learning for natural scene classification. Proceedings of the Fifteenth International Conference on Machine Learning. San Francisco: Morgan Kaufmann.
Maron, O. (1998). Learning from ambiguity. Doctoral dissertation, Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology.
Perny, P. (1998). Multicriteria filtering methods based on concordance and non-discordance principles. Annals of Operations Research, 80, 137-165. Bussum, The Netherlands: Baltzer Science Publishers.
Ruffo, G. (2000). Learning single and multiple instance decision tree for computer security applications. Doctoral dissertation, Department of Computer Science, University of Turin, Torino, Italy.
Sebag, M., & Rouveirol, C. (1997). Tractable induction and classification in first order logic via stochastic matching. Proceedings of the Fifteenth International Joint Conference on Artificial Intelligence (pp. 888-893). Nagoya, Japan: Morgan Kaufmann.
Vapnik, V. (1995). The nature of statistical learning theory. New York: Springer-Verlag.
Zucker, J.-D., & Ganascia, J.-G. (1996). Changes of representation for efficient learning in structural domains. Proceedings of the Thirteenth International Conference on Machine Learning (pp. 543-551). Bari, Italy: Morgan Kaufmann.
Zucker, J.-D., & Ganascia, J.-G. (1998). Learning structurally indeterminate clauses. Proceedings of the Eighth International Conference on Inductive Logic Programming (pp. 235-244). Springer-Verlag.
citation: Wang, Jun and Zucker, Jean-Daniel (2000) Solving Multiple-Instance Problem: A Lazy Learning Approach. [Conference Paper]
document_url: http://cogprints.org/2124/3/wang_ICML2000.pdf