creators_name: Iqbal, Ridwan Al
creators_id: stopofeger@yahoo.com
type: preprint
datestamp: 2010-06-06 14:35:46
lastmod: 2011-03-11 08:57:37
metadata_visibility: show
title: Empirical learning aided by weak domain knowledge in the form of feature importance
ispublished: unpub
subjects: comp-sci-mach-learn
full_text_status: public
keywords: neural network, domain knowledge, prior knowledge, feature importance
abstract: Standard hybrid learners that use domain knowledge require strong knowledge that is hard and expensive to acquire. Weaker domain knowledge, however, can retain much of the benefit of prior knowledge while being far more cost effective. Weak knowledge in the form of feature relative importance (FRI) is presented and explained. Feature relative importance is a real-valued approximation of a feature's importance, provided by experts. The advantage of using this knowledge is demonstrated by IANN, a modified multilayer neural network algorithm. IANN is a very simple modification of the standard neural network algorithm, yet it attains significant performance gains. Experimental results on molecular biology tasks show higher performance than other empirical learning algorithms, including standard backpropagation and support vector machines. IANN's performance is even comparable to that of KBANN, a theory refinement system that uses stronger domain knowledge. This shows that feature relative importance can significantly improve the performance of existing empirical learning algorithms with minimal effort.
date: 2010-06-04
date_type: submitted
refereed: FALSE
referencetext:
1 Winston, P. H. Learning structural descriptions from examples. MIT Technical Report, 1970.
2 Pazzani, M., Mani, S., and Shankle, W. R. Comprehensible knowledge discovery in databases. In CogSci-97 (1997).
3 Simard, P., Victorri, B., LeCun, Y., and Denker, J. Tangent Prop - a formalism for specifying selected invariances in an adaptive network. In Advances in Neural Information Processing Systems (San Mateo, CA, 1992), Morgan Kaufmann.
4 Pazzani, M., Brunk, C., and Silverstein, G. A knowledge-intensive approach to learning relational concepts. In Proceedings of the Eighth International Workshop on Machine Learning (San Francisco, 1991), 432-436.
5 Mahoney, J. J. and Mooney, R. J. Combining symbolic and neural learning to revise probabilistic theories. In Proceedings of the 1992 Machine Learning Workshop on Integrated Learning in Real Domains (1992).
6 Towell, G. G. and Shavlik, J. W. Knowledge-based artificial neural networks. Artificial Intelligence, 70 (1994), 50-62.
7 Fung, G., Mangasarian, O., and Shavlik, J. Knowledge-based support vector machine classifiers. In Proceedings of the Sixteenth Conference on Neural Information Processing Systems (NIPS) (Vancouver, Canada, 2002).
8 Scott, A., Clayton, J., and Gibson, E. A Practical Guide to Knowledge Acquisition. Addison-Wesley, 1991.
9 Marcus, S. (Ed.). Special issue on knowledge acquisition. Machine Learning, 4 (1989).
10 Bekkerman, R., El-Yaniv, R., Tishby, N., and Winter, Y. Distributional word clusters vs. words for text categorization. Journal of Machine Learning Research, 3 (2003), 1183-1208.
11 Ruck, D. W., Rogers, S. K., and Kabrisky, M. Feature selection using a multilayer perceptron. Journal of Neural Network Computing, 2 (1990), 40-48.
12 Guyon, I. and Elisseeff, A. An introduction to variable and feature selection. Journal of Machine Learning Research, 3 (2003), 1157-1182.
13 Friedman, J. Greedy function approximation: a gradient boosting machine. Annals of Statistics, 29 (2001), 1189-1232.
14 Zien, A., Krämer, N., Sonnenburg, S., and Rätsch, G. The feature importance ranking measure. In ECML 09 (2009).
15 Mitchell, T. M. Artificial neural networks. In Machine Learning. McGraw-Hill, 1997.
16 Mitchell, T. M. Artificial neural networks. In Machine Learning. McGraw-Hill Science/Engineering/Math, 1997.
17 Quinlan, J. R. C4.5: Programs for Machine Learning. Morgan Kaufmann, San Mateo, CA, 1993.
18 Aha, D., Kibler, D., and Albert, M. Instance-based learning algorithms. Machine Learning, 6 (1991), 37-66.
19 Vapnik, V. N. Statistical Learning Theory. Wiley, New York, 1998.
20 Towell, G., Shavlik, J., and Noordewier, M. Refinement of approximate domain theories by knowledge-based neural networks. In Proceedings of the Eighth National Conference on Artificial Intelligence (Boston, MA, 1990), 861-866.
21 Noordewier, M., Towell, G., and Shavlik, J. Training knowledge-based neural networks to recognize genes in DNA sequences. In Advances in Neural Information Processing Systems (Denver, CO, 1991), Morgan Kaufmann, 530-536.
citation: Iqbal, Ridwan Al (2010) Empirical learning aided by weak domain knowledge in the form of feature importance. [Preprint] (Unpublished)
document_url: http://cogprints.org/6855/1/fri.pdf
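
note: The abstract describes IANN only at a high level. Below is a minimal sketch, in Python, of one plausible way to inject expert-supplied feature relative importance into a standard multilayer perceptron: scaling the initial input-to-hidden weights by each feature's FRI value before ordinary backpropagation. The function names (init_weights_with_fri, train_mlp), the scaling scheme, and the example FRI values are illustrative assumptions, not the paper's documented IANN mechanism.

import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_weights_with_fri(n_in, n_hidden, fri, rng, scale=0.1):
    # Hypothetical scheme: random initialization, with each input's row of
    # weights scaled by its expert-supplied relative importance
    # (larger FRI -> larger initial weight magnitude).
    w = rng.uniform(-scale, scale, size=(n_in, n_hidden))
    return w * np.asarray(fri, dtype=float).reshape(-1, 1)

def train_mlp(X, y, fri, n_hidden=8, lr=0.5, epochs=2000, seed=0):
    rng = np.random.default_rng(seed)
    w1 = init_weights_with_fri(X.shape[1], n_hidden, fri, rng)
    w2 = rng.uniform(-0.1, 0.1, size=(n_hidden, 1))
    for _ in range(epochs):
        h = sigmoid(X @ w1)                 # hidden activations
        out = sigmoid(h @ w2)               # network output
        err = out - y                       # gradient of squared error w.r.t. output
        d_out = err * out * (1 - out)       # delta at the output layer
        d_h = (d_out @ w2.T) * h * (1 - h)  # delta backpropagated to the hidden layer
        w2 -= lr * h.T @ d_out / len(X)     # standard gradient-descent updates
        w1 -= lr * X.T @ d_h / len(X)
    return w1, w2

# Toy usage: feature 0 determines the label, feature 1 is noise;
# an expert might therefore assign FRI = [1.0, 0.2].
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([[0], [0], [1], [1]], dtype=float)
w1, w2 = train_mlp(X, y, fri=[1.0, 0.2])
print(sigmoid(sigmoid(X @ w1) @ w2).round(2))  # approx. [[0], [0], [1], [1]]

Scaling only the initialization biases the search toward solutions that rely on features the expert marked as important, while leaving backpropagation itself unchanged and free to revise the weights if the data disagrees, which is consistent with the abstract's claim that IANN is a very simple modification of the standard algorithm.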