This page last updated Wednesday, 15-Sep-2004 17:34:04 BST
Research and development
On this page: Architecture for reference linking | Services: finding papers in Open Archives | Software solutions | User studies: mining Web logs and user surveys | OAI and metadata | Early OpCit
This page provides access to selected documentation on early-stage research by the project.
The main formal dissemination route for the results of our research will be through published papers and presentations, or in the form of demonstrators designed for evaluation.
Architecture for reference linking
- An Architecture for Reference Linking (Cornell OpCit site), Donna Bergmark (this version May 2000)
Why a new architecture? "True to Cornell Digital Library Research Group tradition we prefer an object-oriented approach. ...in our object-orientation, we want to walk up to a paper and ask it ‘what are your references’ or ‘what is your title’ or ‘what is your DOI’? This led directly to inventing the concept of constructing a surrogate for each paper, specifically for each Item in an archive that is being analyzed. The surrogate would know how to answer these questions. And which questions in particular it knows how to answer is called the API".
See also Reference linking working notes (Cornell OpCit site)
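The surrogate idea in the quotation can be illustrated with a minimal Python sketch. This is not the Cornell implementation: the class names `Surrogate` and `Reference`, the method names, and the sample item are all hypothetical, chosen only to mirror the three questions the quotation mentions (references, title, DOI).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Reference:
    """One entry in a paper's reference list (hypothetical shape)."""
    raw_text: str
    doi: Optional[str] = None


@dataclass
class Surrogate:
    """Stands in for one archive Item and answers questions about it."""
    item_id: str
    title: str
    doi: Optional[str] = None
    references: List[Reference] = field(default_factory=list)

    # The set of questions below is what the quotation calls "the API".
    def get_title(self) -> str:
        return self.title

    def get_doi(self) -> Optional[str]:
        return self.doi

    def get_references(self) -> List[Reference]:
        # Return a copy so callers cannot alter the surrogate's own list.
        return list(self.references)


# Made-up sample item, for illustration only.
paper = Surrogate(
    item_id="item-001",
    title="An Example Paper",
    references=[Reference(raw_text="A. Author, Some Journal 1 (1999) 1")],
)
print(paper.get_title())
print(len(paper.get_references()))
```

A client never touches the stored paper directly; it only asks the surrogate, so the analysis behind each answer can change without affecting callers.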
Services: finding papers in Open Archives
- Citation-ranked search engine for eprint archives: Citebase, Tim Brody (2001- )
Building on OpCit's bibliometric analysis work and citation database, Citebase will continue to be developed by Tim Brody beyond OpCit. Try the prototype search engine, which indexes the arXiv eprint archives.
- OAI Aggregator: Celestial, Tim Brody (2002- )
Celestial is an experimental service that aims to offer a comprehensive and accurate cache of up-to-date records from known OAI repositories. If OAI service providers harvest from this service, the load on data providers will be reduced. An effective aggregator will improve the interoperability, scalability and reliability of OAI services.
- Visualisation of simple citation and co-citation data. Two mapping tools have been developed:
Currently these services can only be shown graphically, because the processing power required to run the applications is too high for them to be offered as Web applications.
Software solutions
- ParaTools, software tools for reference parsing. ParaTools includes two modules: the first provides functionality based on the Citebase citation parser, and the second provides an accurate means of extracting references from documents. The modules, written by Tim Brody, are updated versions of the original reference processing software produced for OpCit by Zhuoan Jiao.
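The citation-ranked search and the co-citation data described above both rest on simple counts over a citation graph. The following is a minimal sketch of those counts, not the Citebase implementation: the function names and the toy `cites` data are hypothetical, and a real service would work from parsed reference lists instead of a hand-written dictionary.

```python
from collections import Counter
from itertools import combinations

# Toy citation data (made up): each paper maps to the set of papers it cites.
cites = {
    "A": {"C", "D"},
    "B": {"C", "D", "E"},
    "F": {"C"},
}


def citation_counts(cites):
    """Count how many papers cite each paper."""
    counts = Counter()
    for cited in cites.values():
        counts.update(cited)
    return counts


def cocitation_counts(cites):
    """Count how often each pair of papers is cited together by one paper."""
    pairs = Counter()
    for cited in cites.values():
        for a, b in combinations(sorted(cited), 2):
            pairs[(a, b)] += 1
    return pairs


def rank_by_citations(paper_ids, counts):
    """Order search hits by descending citation count, then by identifier."""
    return sorted(paper_ids, key=lambda p: (-counts[p], p))


counts = citation_counts(cites)
pairs = cocitation_counts(cites)
ranked = rank_by_citations(["E", "D", "C"], counts)
print(ranked)
```

Sorting each cited set before pairing keeps every pair in one canonical order, so ("C", "D") and ("D", "C") are never counted separately.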
User studies: mining Web logs and user surveys
- Bibliometric analysis: mining the social life of an eprint archive, Tim Brody, Ian Hickman (July-September 2000)
Authors, publications, citations and impact
Usage patterns: Web log analysis, site hits and downloads
The Los Alamos physics archives have been accepting submissions since 1991. Why have eprints not replaced the traditional publication process? What is the relationship between the publication process and the submissions to eprint archives? Do authors receive better exposure by submitting to an archive? How does peer review affect the eprint archive?
See also Early OpCit: Preliminary user/citation analysis of arXiv, Les Carr (July-November 1999)
- A survey of users and non-users of eprint archives, Cathy Hunt (2001)
Who is and is not using eprint archives, how users are using them, why non-users do not use them, and what features should be added to make the archives more useful. Results and report from the survey.
OAI and metadata
- Full text services within the Open Archives Initiative: an OpCit proposal. The simple metadata mandated by the OAI protocol is not sufficient to build a reference linking service, which requires access to the full texts of the papers. OpCit proposed A Storage Architecture for Full-Text Access to Open Archives. As the OAI has become essentially a protocol, it is unlikely to support such developments directly. Instead, OpCit is working with the arXiv data provider to test software that will allow data providers to extract references automatically from papers as they are submitted and then present these data to service providers.
- Academic Metadata Format (AMF)
Having harvested citation data from arXiv, OpCit is looking for an output format in which to make its data available to other OAI data and service providers. AMF, which is being investigated in collaboration with Thomas Krichel and Simeon Warner, is one possibility.
Early OpCit
Project workplan (August 2000)
- Early linking experiments, Les Carr (July/August 1999)
These experiments progressed from recognising an initial set of within-archive citation links and a set of explicit arXiv citations containing an archive reference number, to overlaying these links on the original documents in text format and then in PDF.
- Copying arXiv? Zhuoan Jiao (last updated May 2000)
Converting arXiv to PDF: "As a viewable, linkable format, PDF is convenient for presenting a linked archive. The majority of the papers submitted to the arXiv physics archives are in other formats. PDF files are permanently stored and do not require the user to wait for conversion".
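The early linking experiments listed above started from explicit arXiv citations that contain an archive reference number. As a rough sketch of that first step, and not the code used in the experiments, the Python below recognises identifiers of the old arXiv form `archive/YYMMNNN` (for example `hep-th/9901001`) and overlays a link on the plain text. The regular expression, the function names, the `known_ids` set and the `base_url` default are all assumptions made for illustration.

```python
import re

# Assumed pattern for old-style arXiv identifiers such as hep-th/9901001:
# an archive name, optionally hyphenated, then a slash and seven digits.
ARXIV_ID = re.compile(r"\b([a-z]+(?:-[a-z]+)?/\d{7})\b")


def find_arxiv_ids(text):
    """Return every arXiv-style identifier found in the text, in order."""
    return ARXIV_ID.findall(text)


def overlay_links(text, known_ids, base_url="https://arxiv.org/abs/"):
    """Append a link after each identifier that is present in the archive.

    Identifiers not in known_ids are left untouched, so only
    within-archive citations become links.
    """
    def annotate(match):
        ident = match.group(1)
        if ident in known_ids:
            return f"{ident} <{base_url}{ident}>"
        return ident

    return ARXIV_ID.sub(annotate, text)


# Made-up sample sentence and archive contents.
sample = "See hep-th/9901001 and also astro-ph/9912345."
found = find_arxiv_ids(sample)
linked = overlay_links(sample, known_ids={"hep-th/9901001"})
print(linked)
```

Overlaying the same links on PDF, the later stage of the experiments, would need a PDF toolkit and is outside the scope of this sketch.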
The OpCit Project, maintained by the Open Citation project.