creators_name: Turney, Peter type: journalp datestamp: 2001-09-17 lastmod: 2011-03-11 08:54:48 metadata_visibility: show title: Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm ispublished: pub subjects: comp-sci-art-intel subjects: comp-sci-mach-learn subjects: comp-sci-stat-model full_text_status: public keywords: cost, classification, ICET, classification errors, decision trees, EG2, average cost, genetic algorithm, IDX. abstract: This paper introduces ICET, a new algorithm for cost-sensitive classification. ICET uses a genetic algorithm to evolve a population of biases for a decision tree induction algorithm. The fitness function of the genetic algorithm is the average cost of classification when using the decision tree, including both the costs of tests (features, measurements) and the costs of classification errors. ICET is compared here with three other algorithms for cost-sensitive classification - EG2, CS-ID3, and IDX - and also with C4.5, which classifies without regard to cost. The five algorithms are evaluated empirically on five real-world medical datasets. Three sets of experiments are performed. The first set examines the baseline performance of the five algorithms on the five datasets and establishes that ICET performs significantly better than its competitors. The second set tests the robustness of ICET under a variety of conditions and shows that ICET maintains its advantage. The third set looks at ICET's search in bias space and discovers a way to improve the search. date: 1995 date_type: published publication: Journal of Artificial Intelligence Research volume: 2 publisher: Morgan Kaufmann pagerange: 369-409 refereed: TRUE referencetext: Ackley, D., & Littman, M. (1991). Interactions between learning and evolution. In Proceedings of the Second Conference on Artificial Life, C. Langton, C. Taylor, D. Farmer, and S. Rasmussen, editors. California: Addison-Wesley. Aha, D.W., Kibler, D., & Albert, M.K. (1991). Instance-based learning algorithms, Machine Learning, 6, 37-66. Aha, D.W., & Bankert, R.L. (1994). Feature selection for case-based classification of cloud types: An empirical comparison. Case-Based Reasoning: Papers from the 1994 Workshop, edited by D.W. Aha, Technical Report WS-94-07, pp. 106-112. Menlo Park, CA: AAAI Press. Anderson, R.W. (in press). Learning and evolution: A quantitative genetics approach. Journal of Theoretical Biology. Baldwin, J.M. (1896). A new factor in evolution. American Naturalist, 30, 441-451. Breiman, L., Friedman, J., Olshen, R., & Stone, C. (1984). Classification and regression trees. California: Wadsworth. De Jong, K.A., Spears, W.M., & Gordon, D.F. (1993). Using genetic algorithms for concept learning. Machine Learning, 13, 161-188. Fogarty, T.C. (1992). Technical note: First nearest neighbor classification on Frey and Slate's letter recognition problem. Machine Learning, 9, 387-388. Frey, P.W., & Slate, D.J., (1991). Letter recognition using Holland-style adaptive classifiers. Machine Learning, 6, 161-182. Friedman, J.H., & Stuetzle, W. (1981). Projection pursuit regression. Journal of the American Statistics Association, 76, 817-823. Gordon, D.F., & Perlis, D. (1989). Explicitly biased generalization. Computational Intelligence, 5, 67-81. Grefenstette, J.J. (1986). Optimization of control parameters for genetic algorithms. IEEE Transactions on Systems, Man, and Cybernetics, 16, 122-128. Grefenstette, J.J., Ramsey, C.L., & Schultz, A.C. (1990). Learning sequential decision rules using simulation models and competition. Machine Learning, 5, 355-381. Hermans, J., Habbema, J.D.F., & Van der Burght, A.T. (1974). Cases of doubt in allocation problems, k populations. Bulletin of the International Statistics Institute, 45, 523-529. Hinton, G.E., & Nowlan, S.J. (1987). How learning can guide evolution. Complex Systems, 1, 495-502. Karakoulas, G. (in preparation). A Q-learning approach to cost-effective classification. Technical Report, Knowledge Systems Laboratory, National Research Council Canada. Also submitted to the Twelfth International Conference on Machine Learning, ML-95. Knoll, U., Nakhaeizadeh, G., & Tausend, B. (1994). Cost-sensitive pruning of decision trees. Proceedings of the Eight European Conference on Machine Learning, ECML-94, pp. 383-386. Berlin, Germany: Springer-Verlag. Koza, J.R. (1992). Genetic Programming: On the programming of computers by means of natural selection. Cambridge, MA: MIT Press. Lirov, Y., & Yue, O.-C. (1991). Automated network troubleshooting knowledge acquisition. Journal of Applied Intelligence, 1, 121-132. Maynard Smith, J. (1987). When learning guides evolution. Nature, 329, 761-762. Morgan, C.L. (1896). On modification and variation. Science, 4, 733-740. Murphy, P.M., & Aha, D.W. (1994). UCI Repository of Machine Learning Databases. University of California at Irvine, Department of Information and Computer Science. Norton, S.W. (1989). Generating better decision trees. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, IJCAI-89, pp. 800-805. Detroit, Michigan. Nunez, M. (1988). Economic induction: A case study. Proceedings of the Third European Working Session on Learning, EWSL-88, pp. 139-145. California: Morgan Kaufmann. Nunez, M. (1991). The use of background knowledge in decision tree induction. Machine Learning, 6, 231-250. Ontario Ministry of Health (1992). Schedule of benefits: Physician services under the health insurance act, October 1, 1992. Ontario: Ministry of Health. Pazzani, M., Merz, C., Murphy, P., Ali, K., Hume, T., & Brunk, C. (1994). Reducing misclassification costs: Knowledge-intensive approaches to learning from noisy data. Proceedings of the Eleventh International Conference on Machine Learning, ML-94, pp. 217-225. New Brunswick, New Jersey. Pearl, J. (1984). Heuristics: Intelligent search strategies for computer problem solving. Massachusetts: Addison-Wesley. Pearl, J. (1988). Probabilistic reasoning in intelligent systems: Networks of plausible inference. California: Morgan Kaufmann. Pipitone, F., De Jong, K.A., & Spears, W.M. (1991). An artificial intelligence approach to analog systems diagnosis. In Testing and Diagnosis of Analog Circuits and Systems, Ruey-wen Liu, editor. New York: Van Nostrand-Reinhold. Provost, F.J. (1994). Goal-directed inductive learning: Trading off accuracy for reduced error cost. AAAI Spring Symposium on Goal-Driven Learning. Provost, F.J., & Buchanan, B.G. (in press). Inductive policy: The pragmatics of bias selection. Machine Learning. Quinlan, J.R. (1992). C4.5: Programs for machine learning. California: Morgan Kaufmann. Ragavan, H., & Rendell, L. (1993). Lookahead feature construction for learning hard concepts. Proceedings of the Tenth International Conference on Machine Learning, ML-93, pp. 252-259. California: Morgan Kaufmann. Rymon, R. (1993). An SE-tree based characterization of the induction problem. Proceedings of the Tenth International Conference on Machine Learning, ML-93, pp. 268-275. California: Morgan Kaufmann. Schaffer, C. (1993). Selecting a classification method by cross-validation. Machine Learning, 13, 135-143. Schaffer, J.D., Whitley, D., & Eshelman, L.J. (1992). Combinations of genetic algorithms and neural networks: A survey of the state of the art. In Combinations of Genetic Algorithms and Neural Networks, D. Whitley and J.D. Schaffer, editors. California: IEEE Computer Society Press. Seshu, R. (1989). Solving the parity problem. Proceedings of the Fourth European Working Session on Learning, EWSL-89, pp. 263-271. California: Morgan Kaufmann. Spears, W.M. (1992). Crossover or mutation? Foundations of Genetic Algorithms 2, FOGA-92, edited by D. Whitley. California: Morgan Kaufmann. Sutton, R.S. (1992). Introduction: The challenge of reinforcement learning. Machine Learning, 8, 225-227. Tan, M., & Schlimmer, J. (1989). Cost-sensitive concept learning of sensor use in approach and recognition. Proceedings of the Sixth International Workshop on Machine Learning, ML-89, pp. 392-395. Ithaca, New York. Tan, M., & Schlimmer, J. (1990). CSL: A cost-sensitive learning system for sensing and grasping objects. IEEE International Conference on Robotics and Automation. Cincinnati, Ohio. Tan, M. (1993). Cost-sensitive learning of classification knowledge and its applications in robotics. Machine Learning, 13, 7-33. Tcheng, D., Lambert, B., Lu, S., Rendell, L. (1989). Building robust learning systems by combining induction and optimization. Proceedings of the Eleventh International Joint Conference on Artificial Intelligence, IJCAI-89, pp. 806-812. Detroit, Michigan. Turney, P.D. (in press). Technical note: Bias and the quantification of stability. Machine Learning. Verdenius, F. (1991). A method for inductive cost optimization. Proceedings of the Fifth European Working Session on Learning, EWSL-91, pp. 179-191. New York: Springer-Verlag. Waddington, C.H. (1942). Canalization of development and the inheritance of acquired characters. Nature, 150, 563-565. Whitley, D., Dominic, S., Das, R., & Anderson, C.W. (1993). Genetic reinforcement learning for neurocontrol problems. Machine Learning, 13, 259-284. Whitley, D., & Gruau, F. (1993). Adding learning to the cellular development of neural networks: Evolution and the Baldwin effect. Evolutionary Computation, 1, 213-233. Whitley, D., Gordon, S., & Mathias, K. (1994). Lamarckian evolution, the Baldwin effect and function optimization. Parallel Problem Solving from Nature -- PPSN III. Y. Davidor, H.P. Schwefel, and R. Manner, editors, pp. 6-15. Berlin: Springer-Verlag. Wilson, S.W. (1987). Classifier systems and the animat problem. Machine Learning, 2, 199-228. citation: Turney, Peter (1995) Cost-Sensitive Classification: Empirical Evaluation of a Hybrid Genetic Decision Tree Induction Algorithm. [Journal (Paginated)] document_url: http://cogprints.org/1805/1/turney95costsensitive.ps document_url: http://cogprints.org/1805/5/turney95a.pdf