---
abstract: |-
  This paper describes several algorithms for extracting the title and the names of the authors from documents available on the World Wide Web. The algorithms are designed not to rely on the specific stylistic dictates of any document formatting standard. Rather, they rely on a combination of overt and subtle cues that form a generalized, common standard for placing this information in a document so that readers can extract it easily.
altloc:
  - http://cs.roosevelt.edu/eric/ebmaics2004b.pdf
chapter: ~
commentary: ~
commref: ~
confdates: 'April 17, 18'
conference: Fifteenth Midwest Artificial Intelligence and Cognitive Science Conference
confloc: ~
contact_email: ~
creators_id: []
creators_name:
  - family: Berkowitz
    given: Eric
    honourific: ''
    lineage: ''
  - family: Elkhadiri
    given: Mohamed Reda
    honourific: ''
    lineage: ''
  - family: Sahouri
    given: Tim
    honourific: ''
    lineage: ''
  - family: Abraham
    given: Michel
    honourific: ''
    lineage: ''
date: 2004
date_type: published
datestamp: 2004-06-05
department: ~
dir: disk0/00/00/36/63
edit_lock_since: ~
edit_lock_until: ~
edit_lock_user: ~
editors_id: []
editors_name:
  - family: Berkowitz
    given: Eric
    honourific: Dr
    lineage: ''
eprint_status: archive
eprintid: 3663
fileinfo: /style/images/fileicons/application_pdf.png;/3663/1/ebmaics2004b.pdf
full_text_status: public
importid: ~
institution: ~
isbn: ~
ispublished: pub
issn: ~
item_issues_comment: []
item_issues_count: 0
item_issues_description: []
item_issues_id: []
item_issues_reported_by: []
item_issues_resolved_by: []
item_issues_status: []
item_issues_timestamp: []
item_issues_type: []
keywords: 'Document Classification Indexing '
lastmod: 2011-03-11 08:55:37
latitude: ~
longitude: ~
metadata_visibility: show
note: ~
number: ~
pagerange: 119-124
pubdom: FALSE
publication: ~
publisher: Omnipress
refereed: TRUE
referencetext: ~
relation_type: []
relation_uri: []
reportno: ~
rev_number: 12
series: ~
source: ~
status_changed: 2007-09-12 16:52:37
subjects:
  - comp-sci-lang
  - archives
succeeds: ~
suggestions: ~
sword_depositor: ~
sword_slug: ~
thesistype: ~
title: Intelligent Content Based Title and Author Name Extraction from Formatted Documents
type: confpaper
userid: 4943
volume: ~