Cogprints

Towards Incremental Parsing of Natural Language using Recursive Neural Networks

Costa, Fabrizio and Frasconi, Paolo and Lombardo, Vincenzo and Soda, Giovanni (2002) Towards Incremental Parsing of Natural Language using Recursive Neural Networks. [Journal (Paginated)] (In Press)

Abstract

In this paper we develop novel algorithmic ideas for building a natural language parser grounded upon the hypothesis of incrementality. Although widely accepted and experimentally supported under a cognitive perspective as a model of the human parser, the incrementality assumption has never been exploited for building automatic parsers of unconstrained real texts. The essentials of the hypothesis are that words are processed in a left-to-right fashion, and the syntactic structure is kept totally connected at each step. Our proposal relies on a machine learning technique for predicting the correctness of partial syntactic structures that are built during the parsing process. A recursive neural network architecture is employed for computing predictions after a training phase on examples drawn from a corpus of parsed sentences, the Penn Treebank. Our results indicate the viability of the approach and lay out the premises for a novel generation of algorithms for natural language processing which more closely model human parsing. These algorithms may prove very useful in the development of efficient parsers.

Item Type: Journal (Paginated)
Keywords: Natural Language Processing, Incremental parsing, Machine Learning, Recursive Neural Networks
Subjects: Computer Science > Language
Computer Science > Machine Learning
Computer Science > Neural Nets
ID Code: 2089
Deposited By: Frasconi, Paolo
Deposited On: 18 Feb 2002
Last Modified: 11 Mar 2011 08:54
