Re: The preprint is the postprint

From: Stevan Harnad <harnad_at_coglit.ecs.soton.ac.uk>
Date: Sat, 2 Dec 2000 13:08:56 +0000

Greg Kuperberg, a comrade-at-arms insofar as freeing the research
literature through self-archiving is concerned, continues to
misinterpret (profoundly) the causal role of peer review, and even its
timing, in this undertaking. It is important to sort this out, as
these misunderstandings can only slow our already overdue passage to
the optimal and inevitable:

On Fri, 1 Dec 2000, Greg Kuperberg wrote:

> On Fri, Dec 01, 2000 at 06:24:48PM +0000, Stevan Harnad wrote:
> sh> I. The empirical evidence against the thesis that
> sh> "preprint = postprint"
> >
> sh> Since the advent of public online self-archiving of research over a
> sh> decade ago, there has been NO SIGN WHATSOEVER of any change in
> sh> researchers' standard practice of continuing to submit all their
> sh> preprints to refereed journals (or refereed conference proceedings)
> sh> for peer review and publication. Self-archiving, as I have pointed
> sh> out many times, is a SUPPLEMENT TO, not a SUBSTITUTE FOR
> sh> peer-reviewed publication. For example, virtually every one of the
> sh> 130,000 papers in the Physics Archive eventually appears in a
> sh> refereed journal, and this has been true from the very first days
> sh> of the archive onward: The preprint appears first, and then (within
> sh> about 8-12 months) either the postprint is archived too, or the
> sh> preprint is updated to include the reference...
>
> I would not say "virtually every" but I agree that the great majority
> of papers in the arXiv system (counting physics and math) are eventually
> published in the traditional sense.

Not all submitted papers are eventually accepted by some journal, but
most are, across disciplines.

    Harnad, S. (1986) Policing the Paper Chase. (Review of S. Lock, A
    difficult balance: Peer review in biomedical publication.) Nature
    322: 24-25.

The number of papers that are not submitted for refereeing to any
journal at all is even smaller.

I agree that exact figures would be preferable, but
"great majority" is good enough for now; in particular, it is enough to
refute the inferences and conclusions Greg is drawing from the
self-archiving data to date.

> And I also agree that the arXiv
> is intended to supplement peer review. But this is in no way evidence
> against the thesis that preprint equals postprint; rather it is evidence
> in favor.

I cannot follow this reasoning at all -- except if I accept Greg's
apparent assumptions (not backed up by any objective evidence at all)
that:

        (1) There is little substantive difference between preprint and
        postprint. I.e., refereeing makes ("virtually"?) no
        direct difference at all: Where are the data to support this
        contention?

    and

    (2) The "invisible hand" -- i.e., the fact that (the "vast majority"
    of) today's preprints are written with the foreknowledge that they
    will have to be answerable to formal peer review, which is in
    turn implemented by and answerable to a qualified editor, choosing
    qualified referees, etc. -- makes no indirect difference either.
    I.e., there would be no difference between the quality and
    navigability of the preprints in the present,
    answerable-to-peer-review system, and the "p-prints" in a
    hypothetical future, unfiltered system in which they are not
    answerable-to-peer-review at all -- only perhaps (some of them? all
    of them?) to ad-lib, ad-hoc, self-appointed (peer?) post-commentary
    (if any should eventually materialize for any given
    "p-print"): Where are the data to support this contention?

For unless Greg can produce credible evidence in support of (1) and
(2), the DEFAULT conclusion, based on the present universal system, is
that the current quality of the refereed journal literature is due,
both directly and indirectly, to peer review: That is what is
responsible for both the preprint/postprint difference and the
"p-print"/preprint difference, and the difference is qualitative.

Now one could conceivably conduct experiments to test the validity of
this default assumption. Consider the following analogy (a rather
shrill one, I admit, but I leave it to readers to substitute a more
positive and sedate analogy in its place): There are police patrolling the
neighbourhood, but maybe they don't make much difference in people's
behaviour. So why don't we just stop wasting that money and manpower?
People are going about their business as if they weren't there anyway.

The answer is that, human nature being what it is, once the likelihood
of answerability to the law was removed, things would devolve in the
direction from which they came, before the civilizing influence of laws
and law enforcement. (But the experiment can be done to test this, if
anyone thinks it wants testing.)

You catch my drift. And as to self-appointed, post-hoc "reviewers,"
that is analogous to consigning our safety to vigilantes rather than
formal institutions. This too can be tested, if someone wants
convincing.

In other words, I think it is eminently reasonable to suppose that the
current quality of the refereed literature, such as it is, is causally
connected to its current system of quality-control, the formal
peer-review system, i.e., the very preprint/postprint (direct)
difference in quality and the "p-print"/preprint (indirect,
invisible-hand-based) difference in quality, both of whose
existence Greg is trying to deny.

That is the default causal assumption; the burden of proof accordingly
lies with those who believe there is in reality no causal connection
there. They must first test and demonstrate that, before we put the
current quality of our refereed literature (or our neighbourhoods) at
risk.

So, in the meantime, can we just get on with freeing the current
peer-reviewed literature from its fee-barriers (a sure thing) rather
than from its peer review (a decidedly unsure thing)?

> Again, I can only speak for what happens in mathematics, especially with
> my own papers.

But that is precisely the point: One's own papers are NOT the way to
measure the causal role of peer review in mathematics, any more than
one's own conduct is the way to measure the causal role of having the
police in the neighbourhood. I am sure Greg is a constitutionally
law-abiding citizen, and has never needed formal feedback from experts
in order to produce flawless work, but this does not generalize to the
population at large. Moreover, it is far too subjective. What is needed
is objective data of the kind I have outlined above.

> Yes, I do still submit my papers to journals, and yes,
> the reason is that I want them peer reviewed. But I don't agree with the
> "and publication" part at all. I don't really care whether the journals
> publish my papers, in the sense of "making them public", because the
> recent ones are already in the arXiv. That they accept the papers is
> all that I really want.

This is a non sequitur! Greg and I are, as he knows, in complete
agreement that online self-archiving is not only sufficient but
optimal for distributing our papers. So we do not submit them for peer
review for the sake of distribution. That fact is undisputed. It is his
further inference -- that the only service that journals are providing
must accordingly be this superfluous (and fee-based) distribution --
that is fallacious.

For that is not the case at all. The "acceptance" for "(refereed)
publication" is in fact the end-state, the certification, the "tagging"
of the successful outcome of the peer-review process, precisely the
process that Greg has simply assumed, but not even faintly
demonstrated, plays no causal role in maintaining the current quality
of this literature!

In other words, it is only if you ASSUME, a priori, with no supporting
evidence, and against all the existing evidence and arguments to the
contrary, that the "postprint minus preprint" (as well as the "preprint minus
'p-print'") quality-difference equals ZERO that you can conclude that
the only differences would have been in the superfluous formats of
distribution!

> In other contexts it would be absurd to conflate peer review with
> distribution as many people do when discussing journals.

But, Dear Greg, it is you who are conflating. What you need to do is
disentangle the putative causal role of the mode of distribution from
the putative causal role of peer review, not simply to assume, on the
basis of the former, that the latter is null!

> Janet Maslin
> reviews movies; she does not sell movie tickets. The critics certainly
> do have an "invisible hand" in making movies (admittedly a pretty
> weak one sometimes), but who would ever think of separate terms for
> movies before and after the critics have seen them?

I suspect I may bore some of our readers by having to remind Greg that
he is conflating the existing, classical peer review system, which occurs
BEFORE acceptance for publication, with ad lib (and mostly
hypothetical) post-hoc forms of review. (Movies are not, so far as I
know, peer-reviewed; the decision to make one is based on projected
sales, not scientific quality; and the function of film reviews is to
give potential viewers a clue as to whether they might find it worth
seeing. Whatever makes you think there is any relevant similarity
between this commercial, market-based consumer product -- a public
opinion poll would probably evaluate it even better than an individual
review -- and scientific/scholarly quality and quality-control?)

Book reviews and movie reviews are simply bad analogies here.

> A closer example
> is that in mathematics there is a second stage of peer review, in
> the form of Math Reviews and Zentralblatt, that in some respects is
> better than peer review from journals. (The URL for Math Reviews is
> http://www.ams.org/mathscinet/ .) Again, no one says that Math Reviews
> publishes papers. But it does review them.

Ah me. Post-hoc review again. And this time of already-peer-reviewed
postprints! A lovely, welcome supplement to peer review, but as likely
a substitute for it as civic honour rolls would be for having police in
the neighbourhood.

Please try to distinguish peer-reviewed from non-peer-reviewed
domains of human endeavour. And within the former, please do not
conflate reviews on which acceptance is conditional (peer review of
preprints) with reviews of what has already earned acceptance
(post-hoc reviews of peer-reviewed postprints). And do not assume that
preprints in our current era of universal peer review can be expected
to be anything like "p-prints" in a hypothetical future era without
peer review ("no police in the neighbourhood") and only willy-nilly
post-hoc reviews of "p-prints."

[Could it be the word "publication" that is leading us astray? Let me
put my cards on the table at once, then: I think the "publication" of
give-away refereed research has always been an anomaly in the whole of
publishing, most of which is non-give-away, and rooted in an author
royalty- or fee-based model. Hence "publishing" has been synonymous
with "selling a product for a fee." But for refereed-research authors,
there has never been a question of selling a product for a royalty or
fee. What they wanted from their publisher was a certification-tag,
"accepted for publication in journal X," plus the widest possible
distribution (with which toll-booths were strongly at odds all along).
Consequently, now that online access-toll-free self-archiving is
becoming the widest possible means of distribution, PUBLICATION reduces
to "certification as accepted for publication in journal X." That's all
that's left of publication (but it's not nothing). The rest is just
online archiving, for distribution. But if you keep using "publication"
as synonymous with "distribution," it is easy to miss this one
remaining essential service that publishers -- or call them what you
like -- must continue to implement for this anomalous, give-away, but
quality-controlled literature.]

> Note also that even though most arXiv articles are published, some
> fraction of them are not. By now there are thousands of arXiv-only
> e-prints and some of them are very important and even famous. My favorite
> example is "Deformation quantization of Poisson manifolds, I", by Maxim
> Kontsevich [q-alg/9709040]. Because of further research developments
> after it was written, it will probably never be published. But it has
> been cited many times. Beyond that it was reviewed by the committee
> for the 1998 Fields Medals, which then did award Kontsevich the Fields
> Medal, the most prestigious prize in mathematics. Ironically Math Reviews
> ignored q-alg/9709040 as a mere preprint, and at first ignored its sequel
> math.QA/9904055 as a mere preprint as well. But finally this year it
> gave the sequel a rave review, three years after the first news appeared
> in the arXiv. As a final irony the review itself has more references
> to the arXiv than to "published" papers.

First, are there really thousands of cases of papers that are never
submitted for publication at all? (How does one ascertain the cut-off
for this, at any point in time? Our own analyses of the Physics Archive
suggest that there is an average latency of about 9-12 months between
preprint and postprint, but does that allow for papers that need to
undergo more revisions, or that have more trouble finding their niche?
Are your "thousands" more like the Kontsevichs of this world, or more
like these more troubled portions of the Bell Curve?)

Peer review and post-hoc review are not without their flaws, being
performed by mere human beings. But can the minority of cases where
peer-review may have misfired be taken as an empirical basis for
dispensing with peer-review altogether, or replacing it by an untested
alternative? Exceptions do not invalidate a rule; they simply signal
that it is not an absolute rule!

So if your ambition is to improve on peer review, then your efforts
should be directed toward devising and testing alternative systems on
small but representative samples (it would be infeasible, not to
mention reckless, to test them out on the literature as a whole, a
priori).

If and when you do find something that proves to work at least as well,
on average, as classical peer review, there will be many who are ready
to listen, and to consider implementing it.

But right now, you have nothing of the sort, so there is nothing to
implement; nor is arXiv an implementation (or test) of any alternative
to classical peer review. It is in fact completely parasitic on
classical peer review (and its invisible hand) as currently implemented
by the refereed journals.

As to the unpublished wonders you describe, one would like to see the
actual data, and a comparison with the numbers of equivalent
unpublished papers before the on-line era: In particular, what
proportion of them are really diamonds rather than duds, and how does
that compare with the proportion in the corpus of papers that ARE
submitted for peer review? All you have described are subjective and anecdotal
observations, from which you have been ready to draw some rather
strong, even radical conclusions.

For my own part, my ambition is more modest: I just want to free the
current peer-reviewed literature, such as it is, from the obsolete
financial access barriers that are still needlessly, indeed
scandalously, blocking its potential impact.

Peer review could use some reform, but there is no reason to delay the
freeing of the current literature for one millisecond more, by coupling
its fate in any way with the long empirical task of finding and testing
a viable alternative to peer review (if there is one).

Worse, pre-emptively characterizing self-archiving as itself being an
alternative to peer review simply compounds the many confusions that
have been needlessly delaying the research community on its road to the
optimal and inevitable, yea these 10-odd years already!

So why not just call a spade a spade? Authors can and should keep on
doing what they've been doing all along: Prepare and submit your papers
to the peer-reviewed journals of your choice. But to maximize their
accessibility and potential impact, all authors should also free them
for one and all on-line by self-archiving them, either in central,
discipline-based OAI-compliant Archives like http://arxiv.org/ and
http://cogprints.soton.ac.uk, or in distributed, university-based
OAI-compliant Eprint Archives: http://eprints.org

Nothing at all sacrificed or changed; not your preferred peer-reviewed
journal for submission, nor peer review itself. Self-archiving is
merely a supplement, not a substitute for anything. It merely frees
this give-away literature, at last.

--------------------------------------------------------------------
Stevan Harnad harnad_at_cogsci.soton.ac.uk
Professor of Cognitive Science harnad_at_princeton.edu
Department of Electronics and phone: +44 23-80 592-582
             Computer Science fax: +44 23-80 592-865
University of Southampton http://www.ecs.soton.ac.uk/~harnad/
Highfield, Southampton http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM

NOTE: A complete archive of the ongoing discussion of providing free
access to the refereed journal literature online is available at the
American Scientist September Forum (98 & 99 & 00):

    http://amsci-forum.amsci.org/archives/American-Scientist-Open-Access-Forum.html

You may join the list at the site above.

Discussion can be posted to:

    american-scientist-open-access-forum_at_amsci.org
Received on Mon Jan 24 2000 - 19:17:43 GMT
