---
abstract: |-
This paper describes the development of several algorithms for
extracting the title and the names of the authors from
documents available on the World Wide Web. The algorithms
are designed not to rely on the specific stylistic dictates of
any document formatting standard. Rather, they rely on a
combination of overt and subtle cues that together form a
generalized, common standard for placing this information
in a document so that readers can extract it easily.
altloc:
- http://cs.roosevelt.edu/eric/ebmaics2004b.pdf
chapter: ~
commentary: ~
commref: ~
confdates: 'April 17, 18'
conference: Fifteenth Midwest Artificial Intelligence and Cognitive Science Conference
confloc: ~
contact_email: ~
creators_id: []
creators_name:
- family: Berkowitz
given: Eric
honourific: ''
lineage: ''
- family: Elkhadiri
given: Mohamed Reda
honourific: ''
lineage: ''
- family: Sahouri
given: Tim
honourific: ''
lineage: ''
- family: Abraham
given: Michel
honourific: ''
lineage: ''
date: 2004
date_type: published
datestamp: 2004-06-05
department: ~
dir: disk0/00/00/36/63
edit_lock_since: ~
edit_lock_until: ~
edit_lock_user: ~
editors_id: []
editors_name:
- family: Berkowitz
given: Eric
honourific: Dr
lineage: ''
eprint_status: archive
eprintid: 3663
fileinfo: /style/images/fileicons/application_pdf.png;/3663/1/ebmaics2004b.pdf
full_text_status: public
importid: ~
institution: ~
isbn: ~
ispublished: pub
issn: ~
item_issues_comment: []
item_issues_count: 0
item_issues_description: []
item_issues_id: []
item_issues_reported_by: []
item_issues_resolved_by: []
item_issues_status: []
item_issues_timestamp: []
item_issues_type: []
keywords: 'Document Classification Indexing'
lastmod: 2011-03-11 08:55:37
latitude: ~
longitude: ~
metadata_visibility: show
note: ~
number: ~
pagerange: 119-124
pubdom: FALSE
publication: ~
publisher: Omnipress
refereed: TRUE
referencetext: ~
relation_type: []
relation_uri: []
reportno: ~
rev_number: 12
series: ~
source: ~
status_changed: 2007-09-12 16:52:37
subjects:
- comp-sci-lang
- archives
succeeds: ~
suggestions: ~
sword_depositor: ~
sword_slug: ~
thesistype: ~
title: Intelligent Content Based Title and Author Name Extraction from Formatted Documents
type: confpaper
userid: 4943
volume: ~