creators_name: Laakso, Aarre
type: preprint
datestamp: 2005-04-12
lastmod: 2011-03-11 08:55:55
metadata_visibility: show
title: On Parsing CHILDES
ispublished: unpub
subjects: ling-comput
full_text_status: public
keywords: language acquisition, syntactic parsing, CHILDES database
note: Submitted to Midwest Computational Linguistics Colloquium (MCLC) 4/10/2005.
abstract: Research on child language acquisition would benefit from the availability of a large body of syntactically parsed utterances between parents and children. We consider the problem of generating such a "treebank" from the CHILDES corpus, which currently contains primarily orthographically transcribed speech tagged for lexical category.
date: 2005
date_type: published
refereed: FALSE
referencetext:
Mark C. Baker. 2005. Mapping the terrain of language learning. Language Learning and Development, 1(1):93–129.
Daniel M. Bikel. 2002. Design of a multi-lingual, parallel-processing statistical parsing engine. In Proceedings of the Human Language Technology Conference, San Diego.
Edward Briscoe and J. Carroll. 2002. Robust accurate statistical annotation of general text. In Proceedings of the Third International Conference on Language Resources and Evaluation (LREC 2002), pages 1499–1504, Las Palmas, Canary Islands.
Eugene Charniak and Mark Johnson. 2001. Edit detection and parsing for transcribed speech. In Second Meeting of the North American Chapter of the Association for Computational Linguistics, pages 118–126.
Michael John Collins. 1999. Head-driven statistical models for natural language parsing. Ph.D. dissertation, University of Pennsylvania.
Stephen Crain and Paul Pietroski. 2002. Why language acquisition is a snap. Linguistic Review, 19(1–2):163–183.
J. J. Godfrey, E. C. Holliman, and J. McDaniel. 1992. Switchboard: telephone speech corpus for research and development. In 1992 IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP-92), volume 1, pages 517–520, San Francisco.
Dennis Grinberg, John Lafferty, and Daniel Sleator. 1995. A robust parsing algorithm for link grammars. Technical Report CMU-CS-95-125, School of Computer Science, Carnegie Mellon University.
Peter Lane and James Henderson. 2001. Incremental syntactic parsing of natural language corpora with simple synchrony networks. IEEE Transactions on Knowledge and Data Engineering, 13(2):219–231.
Brian MacWhinney. 2000. The CHILDES Project: Tools for Analyzing Talk, volume 2: The Database. Lawrence Erlbaum Associates, Mahwah, NJ, 3rd edition.
C. Parisse and M.-T. Le Normand. 2000. Automatic disambiguation of morphosyntax in spoken language corpora. Behavior Research Methods, Instruments, & Computers, 32:468–481.
Kenji Sagae, Brian MacWhinney, and Alon Lavie. 2004. Automatic parsing of parental verbal input. Behavior Research Methods, Instruments, & Computers, 36(1):113–126.
XTAG Research Group. 2001. A lexicalized tree adjoining grammar for English. Technical report, IRCS, University of Pennsylvania.
citation: Laakso, Aarre (2005) On Parsing CHILDES. [Preprint] (Unpublished)
document_url: http://cogprints.org/4204/1/parsing-childes.pdf