TY - GEN
ID - cogprints214
UR - http://cogprints.org/214/
A1 - Bayraktar, Murat
A1 - Say, Bilge
A1 - Akman, Varol
Y1 - 1998///
N2 - Punctuation has usually been ignored by researchers in computational linguistics over the years. Recently, it has been realized that a true understanding of written language will be impossible if punctuation marks are not taken into account. This paper contains the details of a computer-aided exercise to investigate English punctuation practice for the special case of the comma (the most significant punctuation mark) in a parsed corpus. The study classifies the various "structural" uses of the comma according to the syntactic patterns in which a comma occurs. The corpus (Penn Treebank) consists of syntactically annotated sentences with no part-of-speech tag information about individual words.
KW - punctuation
KW - structural punctuation marks
KW - comma
KW - the Penn Treebank
KW - the Wall Street Journal
KW - corpus linguistics
TI - An Analysis of English Punctuation: The Special Case of Comma
SP - 33
AV - public
EP - 57
ER -