%A Eric Berkowitz
%A Mohamed Reda Elkhadiri
%A Tim Sahouri
%A Michel Abraham
%T Intelligent Content Based Title and Author Name Extraction from Formatted Documents
%X This paper describes the development of algorithms for
extracting the title and the names of the authors from
documents available on the World Wide Web. We describe
several algorithms that do so without relying on the
specific stylistic dictates of any document formatting
standard. Rather, they are
designed to rely on a combination of overt and subtle cues
that form a generalized, common standard for placing this
information in a document so that readers can extract it
easily.
%K Document Classification Indexing
%P 119-124
%E Dr Eric Berkowitz
%D 2004
%I Omnipress
%L cogprints3663