Child Language Data Exchange System / CHILDES
The CHILDES project is internationally known for its first language acquisition data and analysis tools. CHILDES tools have been used in more than 1300 published studies in areas including computational linguistics, language disorders, narrative structures, literacy development, phonological analyses and adult sociolinguistics (for a useful introduction to CHILDES, see MacWhinney 1999).
More recently, second language researchers have also begun to use CHILDES tools, as for example in our sister project for French L2.
CHILDES consists of three integrated components:
- The database (Talkbank), consisting primarily of child speech recordings and transcriptions, but also including some language disorder data and bilingual data. It is a condition of using CHILDES tools that any data becomes part of the Talkbank database.
- CHAT (Codes for the Human Analysis of Transcripts) is the set of transcription procedures: a system of notation and coding developed to be compatible with the CLAN analysis programs described below. This 'tagging system' is now XML compatible. The manual containing all the transcription conventions is regularly updated.
- CLAN (Computerized Language Analysis) is a set of computer programs designed to carry out data analyses. It includes morphosyntactic parsers in 12 languages, as well as search tools that allow the output of any of its programs to be interrogated directly; for example, searches can be run directly on the morphosyntactic output of any batch of files. The CLAN programs are designed to recognise the tagging conventions of CHAT.
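
To give a rough feel for how CHAT notation and a search over morphosyntactic output fit together, here is a minimal Python sketch. It is not CLAN and does not reproduce any CLAN program. The sample transcript is invented, the tags on the %mor line are simplified, and the function names parse_chat and find_by_pos are hypothetical. It relies only on the general CHAT layout, in which header lines begin with "@", speaker lines with "*" and dependent tiers with "%".

```python
# Invented sample in the general shape of a CHAT transcript (not corpus data).
# "@" lines are headers, "*" lines are speaker tiers, "%" lines are dependent tiers.
SAMPLE = """@Begin
@Languages:\teng
@Participants:\tCHI Target_Child, MOT Mother
*CHI:\tI want cookie .
%mor:\tpro|I v|want n|cookie .
*MOT:\tyou want a cookie ?
%mor:\tpro|you v|want det|a n|cookie ?
@End
"""


def parse_chat(text):
    """Group each speaker tier with the dependent tiers that follow it."""
    utterances = []
    for line in text.splitlines():
        if line.startswith("*"):
            speaker, _, words = line[1:].partition(":")
            utterances.append(
                {"speaker": speaker, "text": words.strip(), "tiers": {}}
            )
        elif line.startswith("%") and utterances:
            name, _, content = line[1:].partition(":")
            utterances[-1]["tiers"][name] = content.strip()
        # Header lines ("@...") are ignored in this sketch.
    return utterances


def find_by_pos(utterances, pos):
    """Return (speaker, stem) pairs whose simplified %mor tag equals pos."""
    hits = []
    for utt in utterances:
        for token in utt["tiers"].get("mor", "").split():
            tag, sep, stem = token.partition("|")
            if sep and tag == pos:  # skip punctuation, which has no "|"
                hits.append((utt["speaker"], stem))
    return hits


utterances = parse_chat(SAMPLE)
nouns = find_by_pos(utterances, "n")
print(nouns)
```

The sketch skips much of what real transcripts contain, such as continuation lines, compound tags and the full set of header fields, so the CHAT and CLAN manuals remain the reference for actual analysis.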
Here we provide only a very brief introduction to CHILDES. Any researcher wishing to understand how the tools work will need to consult the manuals, which are also available in hardback (MacWhinney, 2000).
The SPLLOC project generally adheres to the ground rules for researchers and users which have been developed by the CHILDES project. These ground rules are available at http://talkbank.org/share/.