Date: Thu, 22 Jul 1999 15:42:15 +0100
From: Leslie Carr <lac@ecs.soton.ac.uk>
To: harnad@ecs.soton.ac.uk, wh@ecs.soton.ac.uk, sh94r@ecs.soton.ac.uk
Subject: Stealing a March on the EPrintLinks Project
I have spent a week fiddling around with XXX (arXiv physics) and have
now come up with an initial set of within-archive citation links. You can
see the results on the following web page:
http://cogprints.ecs.soton.ac.uk/~lac/eprintlinks/SUMMARY.html
The page shows the references in particular articles that have been
recognised as corresponding to another article within the archive. The
892 links that you see are (a) not conclusively tested :-) and (b) based
on a Heath Robinson process which is applied only to the 3% of articles
which are "most-recently-requested-and-already-in-the-cache". These links
have been derived by "reading" the contents of the references sections;
the accuracy of the Heath Robinson process could be greatly improved.
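As an illustration only (this is not the actual scripts, whose workings are not described here), one way of "reading" a reference is to reduce it to a (journal, volume, page) key and look that key up in an index built from the journal references that archived articles declare in their metadata. The function names, the sample metadata and the journal "J. Hypothetical Phys." are all invented for the sketch, and it assumes the journal name is separated from the author list by a comma.

```python
import re

# Assumed reference shape: "<journal name> <volume> (<year>) <page>".
_JOURNAL_REF = re.compile(r"([A-Za-z][A-Za-z. ]*?)\s*(\d+)\s*\((\d{4})\)\s*(\d+)")


def ref_key(text):
    """Reduce a journal reference to a (journal, volume, page) key, or None."""
    m = _JOURNAL_REF.search(text)
    if m is None:
        return None
    # Drop spaces, dots and case so "Phys.Rev. D" and "Phys. Rev. D" agree.
    journal = re.sub(r"[^a-z]", "", m.group(1).lower())
    return (journal, m.group(2), m.group(4))


def build_index(journal_refs):
    """Map reference keys to archive ids; journal_refs is {archive_id: journal_ref}."""
    index = {}
    for archive_id, jref in journal_refs.items():
        key = ref_key(jref)
        if key is not None:
            index[key] = archive_id
    return index


def link_reference(citation, index):
    """Return the archive id that a citation string points at, or None."""
    key = ref_key(citation)
    return index.get(key) if key is not None else None


# Hypothetical metadata and citation, made up for this example.
index = build_index({"hep-th/9901001": "J. Hypothetical Phys. 12 (1999) 345"})
linked = link_reference("[3] A. Author, J.Hypothetical Phys. 12 (1999) 345.", index)
unlinked = link_reference("A. Author, private communication.", index)
```

A key-based lookup like this only finds articles whose metadata carries a journal reference, and it will miss citations whose layout differs from the assumed pattern.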
As well as the above, there is also a huge set of explicit XXX (arXiv)
citations in which the citation contains an XXX (arXiv) reference number.
These can be seen in
http://cogprints.ecs.soton.ac.uk/~lac/eprintlinks/S.html
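For the explicit citations, picking out the reference numbers is mostly pattern matching. A minimal sketch, assuming identifiers of the form archive/YYMMNNN (for example hep-th/9901001), optionally with a subject class such as math.AG; the function name and the sample reference text are invented for illustration, and the real scripts may well work differently.

```python
import re

# Archive name (e.g. "hep-th", "astro-ph"), optional subject class
# (e.g. ".AG"), a slash, then a seven-digit number YYMMNNN.
_ARCHIVE_ID = re.compile(r"\b[a-z]+(?:-[a-z]+)?(?:\.[A-Z]{2})?/\d{7}\b")


def extract_archive_ids(reference_text):
    """Return every archive reference number found, in order of appearance."""
    return [m.group(0) for m in _ARCHIVE_ID.finditer(reference_text)]


# Hypothetical references section, made up for this example.
sample = (
    "[1] A. Author and B. Writer, hep-th/9901001.\n"
    "[2] C. Scribe, astro-ph/9902002; see also cond-mat/9903003.\n"
    "[3] D. Penman, J. Hypothetical Phys. 12 (1999) 345.\n"
)
found = extract_archive_ids(sample)
```

Reference [3] yields nothing because it carries no reference number, which is the case the reference-reading process has to handle.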
No work has yet been done on feeding the links back into the source, or overlaying them on the ps/pdf view that the user sees.
Note that (SLAC) SPIRES provides a similar service, which is based on
the references being typed in.
---
Leslie Carr
Date: Fri, 23 Jul 1999 17:10:23 +0100
From: Leslie Carr <lac@ecs.soton.ac.uk>
To: Steve Hitchcock <sh94r@ecs.soton.ac.uk>
CC: harnad@ecs.soton.ac.uk, wh@ecs.soton.ac.uk
Subject: Re: Stealing a March on the EPrintLinks Project
I have reworked some of the scripts (to make them work on more "problematic" articles) and the number of "software deduced" links has expanded from 892 to 6243. This looks a much more worthwhile number :-)
Since there are so many explicit citations, it is reasonable to ask whether my software is just putting in a lot of hard work to find out a link that the user has effectively added by hand anyway. But it turns out that there is actually very little overlap at all. Most references to XXX (arXiv) eprints give just the authors and reference number. Hardly any references to journal articles also give the XXX (arXiv) reference.
Now some do! Perhaps this is part of the eprint->refereed article transition process. Or perhaps it is part of a cultural change in physics. We can check this out later. It would be informative to work out
(Just done some quick calculations. Of the 6243 software deduced links,
only 302 appear to correspond to citations where the XXX (arXiv) reference
was explicitly given as well.)
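The overlap check amounts to a set intersection over (citing, cited) pairs. A small sketch, assuming each link is stored as a pair of reference numbers; the function names and the sample pairs are invented for illustration, and only the two totals (302 and 6243) come from the figures above.

```python
def overlapping_links(deduced, explicit):
    """Links found both by reading the reference and by an explicit number."""
    return set(deduced) & set(explicit)


def overlap_share(n_overlap, n_deduced):
    """Fraction of software-deduced links that were also given explicitly."""
    return n_overlap / n_deduced


# Hypothetical (citing, cited) pairs, made up for this example.
deduced = {
    ("hep-th/9907001", "hep-th/9801001"),
    ("hep-th/9907001", "hep-th/9802002"),
    ("hep-ph/9907003", "hep-ph/9803003"),
}
explicit = {
    ("hep-th/9907001", "hep-th/9802002"),
    ("hep-ph/9907004", "hep-ph/9804004"),
}
both = overlapping_links(deduced, explicit)

# The totals quoted in the message: 302 of 6243 deduced links.
share = overlap_share(302, 6243)
```

With the quoted totals the share comes to roughly 4.8%, so over 95% of the deduced links are ones that no explicit reference number would have supplied.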
---
Leslie Carr
From: "Leslie Carr" <lac@ecs.soton.ac.uk>
To: "Stevan Harnad" <harnad@coglit.ecs.soton.ac.uk>,
"Steve Hitchcock" <sh94r@ecs.soton.ac.uk>
Cc: <wh@ecs.soton.ac.uk>
Subject: Re: Stealing a March on the EPrintLinks Project
Date: Mon, 26 Jul 1999 07:40:43 +0100
Peculiarly enough, there seems to be little difference between a preprint and a reprint as far as citation practice is concerned.
In particular, if you divide the archive into pre- and re- prints according to whether they themselves claim journal status in their 'meta-data' or not, then there is almost no difference in the tendency to give XXX (arXiv) citations as opposed to Journal references.
I am going to need to check these results out in more detail to see
what is actually going on, but it seems counter-intuitive to me.
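The comparison can be set out as follows. This is a sketch only, assuming each article record carries its declared journal reference (empty for a preprint) together with counts of the two citation styles; the field names and the two sample records are invented and are not results.

```python
from collections import defaultdict


def archive_citation_share(articles):
    """Share of citations given as archive reference numbers, per group."""
    totals = defaultdict(lambda: [0, 0])  # group -> [archive cites, all cites]
    for art in articles:
        # An article counts as a reprint if its metadata claims a journal.
        group = "reprint" if art["journal_ref"] else "preprint"
        totals[group][0] += art["archive_cites"]
        totals[group][1] += art["archive_cites"] + art["journal_cites"]
    return {g: a / n for g, (a, n) in totals.items() if n}


# Hypothetical records, made up for this example.
articles = [
    {"journal_ref": "", "archive_cites": 3, "journal_cites": 9},
    {"journal_ref": "J. Hypothetical Phys. 12 (1999) 345",
     "archive_cites": 2, "journal_cites": 6},
]
shares = archive_citation_share(articles)
```

Pooling citations per group, rather than averaging per article, stops a handful of articles with very short reference lists from dominating the comparison.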
---
Les
Date: Tue, 27 Jul 1999 17:22:06 +0100
From: Leslie Carr <lac@ecs.soton.ac.uk>
To: Stevan Harnad <harnad@coglit.ecs.soton.ac.uk>
CC: wh@ecs.soton.ac.uk, sh94r@ecs.soton.ac.uk
Subject: Re: Stealing a March on the EPrintLinks Project
Outstanding issue (A)
> > No work has yet been done on feeding the links back into the source,
> > or overlaying them on the ps/pdf view that the user sees.
Outstanding issue (B)
> > Note that SPIRES provides a similar service, which is based on
> > the references being typed in.
I've just made a demo of these two features in action:
http://cogprints.ecs.soton.ac.uk/~lac/eprintlinks/NEW/ZZ.html
(Interestingly enough, now that I check this document out I see that
my reference reading process produced all the links that SPIRES did manually.
Hurrah!)
This of course shows the links only in an ASCII document, but we can
easily transfer this into a PDF context as and when.
....[1 hour pause]
...OK. See the references section on p11 of
http://cogprints.ecs.soton.ac.uk/~lac/eprintlinks/NEW/ZZ.pdf
The links appear as black boxes around the page numbers. Not very pretty
yet, but functional.
To recap
========
The PDF file was produced from the XXX (arXiv) archive and then modified
by a combination of the citation link database that I have already built
and the DLS/PDF linking software. The short html references file was produced
from the SPIRES citation data and the XXX (arXiv) 'detexxing' procedure
that I have been using.
I think that covers all the angles we discussed. I guess I ought to
write up a report and email it to the list.
---
Leslie Carr
Tel: +44 1703 594479
Fax: +44 1703 592865
Email: L.Carr@ecs.soton.ac.uk URL: http://www.ecs.soton.ac.uk/~lac
ACM Member: 5135934
IEEE Member: 40323275
Dept of Electronics and Computer Science, University of Southampton
SO17 1BJ, UK