Re: NIH's Public Archive for the Refereed Literature: PUBMED CENTRAL

From: Stevan Harnad (harnad@cogito.ecs.soton.ac.uk)
Date: Thu Sep 02 1999 - 18:46:54 BST


This is the edited record of an exchange about the latest NIH/PubMed
Central proposal with a comrade-in-arms who has asked to remain
anonymous but agreed to my quoting and commenting on this edited
transcript of our discussion.

> [it seems that the critical issue is: what are the criteria for an
> organization to participate - either for screening or peer-review.]

The critical issue with NIH/PubMed Central is WHO can archive WHAT.

It has been stipulated that only "organizations" can participate, and it's
not clear what this is supposed to mean.

Worst-case scenario: Only publishers can archive refereed papers.

Best-case scenario: Any "institutionally affiliated" (for example)
author can archive refereed papers.

For screening of unrefereed papers, the details can be worked out.

For what counts as "peer-review," the details can likewise be worked
out (and for most of the journal literature it will be pretty
straightforward).

But the serious question is about REFEREED papers. Let us assume that
we have identified which journals count as peer-reviewed. I have just
had an article accepted by one of them: Suppose that journal is NOT one
of the "participating" publishers that will immediately archive it in
the free public archive for me? What then?

THIS is the core ambiguity that MUST be resolved clearly and
explicitly if NIH/PubMed Central is to fly.

The solution is obvious: An author at an accredited institution (or
whatever other accrediting criterion NIH/PubMed Central decides on)
should be able to archive his OWN refereed papers.

If NIH/PubMed Central likes (and can afford it), the archive can (1)
check whether that paper is indeed accepted by the journal the author
indicates it is accepted by, and NIH can even do (2) an ascii-check to
make sure it corresponds verbatim with the accepted text -- but this
would be a waste of time and money, especially if the official journal
version of the text is not available online, or the journal does not
feel like cooperating! To get the archive started it will be enough
that the accredited author CLAIMS that that's the journal that accepted
it, by self-archiving it and tagging it as such (with journal name,
volume, issue, year, etc.)

(The Net will quickly unmask papers that authors claim have been
accepted for publication by Journal X but have not been; the vast
majority of authors are not interested in that kind of fraud, and it
would be absurd and self-defeating to design the NIH archive around an
a-priori restriction, on the assumption that fraudulent self-archiving
of "refereed" papers would pose a major problem from which the archive
must be pre-emptively defended -- at the cost of sacrificing that very
self-archiving capability on which the archive's success and utility so
critically depends!)

So much for restrictions on the self-archiving of refereed papers.

(When it comes to unrefereed papers, apart from whatever special
measures are deemed necessary to protect public health from quackery or
fraud, the situation again seems straightforward. Authors from
accredited institutions should be allowed to self-archive the papers
they are submitting to journals; if NIH finds it useful, Universities,
Departments, Societies, Congresses, etc., could screen these submissions,
or NIH could use the prior funding criteria mentioned in the new
proposal.

But on no account should the restrictions on self-archiving of
unrefereed material be conflated with the conditions for self-archiving
refereed material, otherwise the publishers will simply become the
authorizers for the latter, and the archive will be still-born -- with
the only "participating" publishers being an arbitrary sample of tiny
societies and unimportant journals. The contents of all the
high-quality, high-impact journals -- the raison d'etre of the whole
initiative -- will be missing from the archive, and it will stay that
way, for this asymmetric outcome (low quality for-free, high quality
for-fee) will effectively nip in the bud any genuine initiative for the
freeing of the refereed literature through public archiving. A Trojan
Horse will have become permanently entrenched.)

> [until there's an international advisory committee, the criterion is 3
> editorial board members as PIs on grants from major funding agencies]

Criterion for WHAT? Editorial Boards already exist for the established
journals, and they have nothing to do with any of this (except if they,
mirabile dictu, agree to "participate" by giving away their contents
online for free: Don't hold your breath!).

But what are these boards and committees supposed to be doing?
Screening the unrefereed papers? Fine. But what about the REFEREED
papers? They've ALREADY been "screened." All that's needed is that
they should be ARCHIVED. Who's going to do that (assuming -- and it's a
safe assumption -- that the established publishers are not going to
agree to give them away for free themselves)?

The more I think about it, the more I have to say that the very same
incoherence I noted on this point in the original proposal is still
hovering over this one. Exactly what are these overseers meant to be
doing, and particularly in relation to ALREADY REFEREED papers?

http://www.nih.gov/welcome/director/ebiomed/com0509.htm#harn45

> [participants have a significant stake in the process and this
> will greatly reduce inappropriate content]

Why hamstring the archive's CORE function (self-archiving) with all
kinds of new screening functions that will only restrain self-archiving,
the completely new activity that needs to be ENCOURAGED?

Even more important: Why hamstring the archiving of the
already-screened (i.e. refereed) material? And the worst possible form
of hamstringing would be to allow only the "screeners" themselves (the
refereed journals) to archive the latter. Because they won't! And
although NIH/PubMed Central plans to open in January 2000, the archive
will be virtually barren of refereed material from the established
journals (and will, I'm afraid, stay that way for years to come) if it
is set up like this.

> [it should not be terribly hard for most scientists to form an editorial
> board which can serve as the screening group for self-archiving their
> papers]

I worry a little about leaving it to their initiative (given that the
horses have been led to the waters of self-archiving but have been slow
to drink anyway, without their having to take still further initiatives).

But let's say that for the self-archiving of unrefereed papers "screening"
bodies can be put together.

The REAL issue is the self-archiving of REFEREED papers. These have
already been "screened" (and a lot more); at most, all they still need
is some verification (that they are indeed refereed/accepted, and that
they are indeed the verbatim texts). I am not sure there are either the
resources or the initiative to do this screening; and I'm fairly sure
it's not necessary. For now, self-archiving of refereed papers just
needs to be DONE, even if imperfectly. The rest will take care of
itself, once the momentum is there.

But the momentum is needed, and hamstringing the self-archiving of
refereed papers in any way will be extremely counterproductive. First,
there is still the horses/water problem; add obstacles and you just
make it bigger. Second, there is still the copyright problem (some
publishers try to forbid it, and even more authors feel that it's
somehow forbidden). So if one adds to that the extra handicap of having
to screen the refereed papers too, that amounts to simply raising the
goalposts even higher, instead of lowering them!

And to put the goalposts in the hands of publishers, whose current
interests conflict so profoundly with those of science and scientists
and a free refereed literature, is simply to take the goalposts off the
field altogether, leaving the archive sitting in committee rooms
getting nowhere while the self-archiving era that is upon us fails to
be taken advantage of.

The question of who can self-archive what, under what conditions, MUST
be clarified, and clarified separately for unrefereed and refereed
papers.

> [Regarding refereed papers, many of the journals that would peer review
> an author's ms will oppose the author's posting this in a public
> archive. So the archive will require the participation of the journals.
> If self-archiving is important to an author, don't you think they'll
> take this into account when choosing a journal to submit to?]

I couldn't disagree more!

Consider the logic of the following:

PREMISE: The motivation for the Archive is to make the bioscience and
biomedical research literature available online to everyone for free.

If journal publishers wanted to make it free online for everyone they
could do it on their own! Most already have online archives, so archive
availability is not the problem; they simply have no motivation to make
it free.

This is quite understandable, and I would not expect them to, at this
point. But what I am rather confidently counting on is that they will
not try to block self-archiving on the part of their authors as it
becomes clear that this would be both in direct conflict with the
interests of scientific research AND unnecessary (for there are other
ways to recover the essential costs without blocking access).

But if NIH ITSELF were to capitulate on self-archiving a priori, how
can one expect authors to do it? (NIH should be NEUTRAL on journal
self-archiving policy and should make the Archive open to authors
self-archiving refereed papers. Let the community fight the copyright
battle; don't simply capitulate on their behalf a priori!)

http://www.cogsci.soton.ac.uk/~harnad/science.html
http://www.cogsci.soton.ac.uk/~harnad/Hypermail/Author.Eprint.Archives/0006.html

It would be unrealistic bordering on nonsensical to expect or wait for
authors to submit to a journal other than the best one for their paper,
selecting instead on the basis of the journal's self-archiving policy.

Would I rather submit to the PODUNK ARCHIVES OF PHYSIOLOGY than
the JOURNAL OF PHYSIOLOGY -- sacrificing the latter's prestige, high
quality refereeing, high quality contents and high impact factor -- in
favor of a weak journal, just because it had a liberal self-archiving
policy?

Of course not. So a policy like that would just ensure that the best
authors and the best journals would not appear in NIH/PubMed
Central, because the better journals would be the last to agree to free
archiving.

By biasing the Archive toward the weakest end of the literature ab
ovo, NIH would actually retard rather than hasten the day when the real
research literature will be free, because that would simply stamp in
the status quo and stamp the stigma of a low-end bias onto the medium
from the very outset (in the minds of authors, who still scarcely know
what is going on).

> [publisher reaction to authors "freeing" their content on a public
> site]

I think publisher reaction should not be a concern of NIH's, any more
than it is a concern of LANL's, NSF's or DOE's. That is between the
publishers and the author/reader community, i.e., the scientific
community itself. NIH should not take sides. Provide the self-archiving
facility and let the biological community decide how to use it, exactly
as the physics community did with LANL.

http://xxx.lanl.gov/cgi-bin/show_monthly_submissions
http://xxx.lanl.gov/cgi-bin/show_weekly_graph

> [new journals, which participate in PubMed Central, could
> start and attract authors]

This is still the same flaw as in the original proposal: It is being
imagined/hoped that the Archive will spawn new journals, with new
practices (including free online content) that will supplant the
established journals/practices.

I can only wish such hopes well, and sincerely hope they come true,
because what I regard as the optimal and inevitable outcome for science
-- the entire refereed journal literature online, free for all -- would
indeed be reached if they did come true. Supplanting the literature
through free new journals would free it as surely as self-archiving by
authors would.

http://www.cogsci.soton.ac.uk/~harnad/nature.html

But I strongly doubt that they will come true, and I fear that
restricting the refereed sector of the archive to new journals (and to
"participating" old ones) will simply drag out indefinitely a process
that the NIH Archive had hoped to facilitate and accelerate.

> [self-archiving would be seen as a direct assault by NIH on the
> publishers and would not receive the support of NIH leadership, though
> it might evolve eventually]

I hope it is true that self-archiving might evolve. It seems to me that
if NSF had seen the LANL archive as a "direct assault" by NSF on
publishers, and hence unworthy of support, LANL would never have
evolved either.

Fortunately, the creation of LANL was in the hands of Paul Ginsparg,
and he was not deterred by that consideration.

http://xxx.lanl.gov/blurb/pg14Oct94.html

Nor should NIH be deterred, of course. In providing an archive where
authors can self-archive, as in LANL, they leave it to the authors to
do the "assaulting" (if that's what it is -- but in reality I think the
only assault comes from the existence of the new medium itself; the
rest just has to do with how long it takes scientists to reach the
optimal and inevitable outcome that the new medium has made possible
for them).

> [only a small fraction of journals have the impact that will attract the
> best papers. For most papers of most life scientists, journals that
> participate in PubMed Central would be very attractive because of
> their stability and visibility (over 100,000 different scientists use
> it every day)]

Again, I can only wish this hope well.

But I strongly doubt that the intangible benefits of free online access
to all -- benefits that I have to stress are ENORMOUS in reality, but
just barely enough to persuade authors IN PRACTICE to self-archive in
an unrestricted archive (this is the leading-horses-to-water
problem) -- will be enough to offset authors' well-established,
sensible (and highly rewarded) practice of publishing their papers in
the best journals they can get them into, which means the most
prestigious, highest-quality, highest-impact-factor established
journals.

It is PRECISELY for this reason that I gave up on the hope that

(1) establishing new rival online free journals, or

(2) trying to persuade established publishers to go online and free

was the way to free the literature. Authors will rarely risk their
best work in a new journal, let alone a new online journal, etc. (and
going online free is currently against established journals'
interests).

I took the path of subversive self-archiving because it was the ONLY
one that did not require authors to make this choice (between their
preferred established journal and something else): With self-archiving,
they could have their cake (publish in the prestige refereed journal of
their choice) and eat it too (give it away to one and all for free
through self-archiving).

http://www.arl.org/scomm/subversive/toc.html

The latest draft of the NIH proposal would instead be confronting authors
with a choice between giving up on their preferred journals or giving
up on free archiving, and my prediction is that the only ones who will
choose the first option (apart perhaps from a few zealots like you and
me) are weaker authors, the ones who would choose to submit to new
journals anyway, because they think their chances of acceptance there
are higher (and they are!), and weaker journals, brand new ones, or
older ones who participate because they were barely staying afloat the
old way.

The result would be that FROM THE VERY OUTSET, the contents of
NIH/PubMed Central would be associated with the low-end of the
literature. The effect of that would only be to strengthen the
(unjustified) notion that there is an inherent quality disparity
between the for-free and the for-fee literature. And it would play into
the hands of the publishers of the strong journals, whose interest is
obviously in preserving the status quo for as long as possible.

> [some of the journals that will be starting already seem likely
> to attract the best articles in their field]

Yes, well, have another look at AAAS's On-Line Journal of Clinical
Trials, which also looked as if it was going to "attract the best
articles in their field" and see how its submissions are doing now. Or
keep an eye on the IOP's brave new online-only Journal of Physics (for
which I forecast quite a few years of uphill battles to fill their
pages and cover their costs -- which I hope IOP will have the courage
to sustain until something finally does succeed in putting us all over
the threshold to the optimal/inevitable). Or, for that matter, have a
look at my own decade-old online journal, Psycoloquy. It is still
there, limping along, but it has hardly "captured" the literature yet
-- and will never do so on the strength of new online journals alone.

Something else is needed to get it all over the top. I had hoped it
would be the NIH Archive. But if it does not support self-archiving, I
am afraid it will not be.

> [a number of outstanding scientists have indicated their interest in
> getting involved. My assumption is that the "pyramid of science" will
> exist quite soon in PubMed Central as it does in the current system - a
> relatively small fraction of the journals will be seen as
> prestigious...]

I think your hopeful view is several orders of magnitude more speculative
than what the status quo and the evidence so far imply. You'll always have a
few visionaries backing new initiatives. But the rank and file (and
most of their leaders) will continue doing what has worked for them
until now.

Yes, free online accessibility to one's work is a COLOSSAL bonus -- in
fact it's on that colossal bonus that all my efforts (and yours) are
predicated. But the PROSPECT of this colossal bonus has been just
barely able to get authors to drink from the waters of self-archiving
so far, even when the facilities are available and they need give up
none of their current practices and preferences, but merely add a
tiny task to them (self-archiving).

The current NIH proposal would merely be erecting more hurdles before
they even get to the water, on the strength of hopeful speculations
about what people will/would/might/should do...

> [There are possibilities for evolution in [the current proposal] that
> could become very similar to your self-archiving ideas. I find much of
> what you've written and continue to write interesting and useful. The
> feedback from publishers has been quite helpful even when it's clear
> they've no interest in participating. And believe me, this new
> proposal will certainly stir the pot...]

I hope so. But as it stands, it looks like it might just create another
digression from the road to the optimal and the inevitable, instead of
speeding us there!

You may be right, and I may be wrong. But if it should happen to be the
other way round, the only hope for my views is that they be aired, so
that others can take them up, if it is not to be NIH.

So I hope you don't mind if I now post the gist of my replies in these
off-line interactions -- removing your name and paraphrasing your words. Ok?

> [it was not just interactions with publishers that led to this
> strategy - many scientists favored this approach as well]

If Paul Ginsparg had conducted consultations with scientists before
setting up his self-archiving software, physics would still be where
biomedical research is now.

Thanks for agreeing to let me post my prior replies (camouflaged).

--------------------------------------------------------------------
Stevan Harnad                     harnad@cogsci.soton.ac.uk
Professor of Cognitive Science    harnad@princeton.edu
Department of Electronics and     phone: +44 2380 592-582
Computer Science                  fax:   +44 2380 592-865
University of Southampton         http://www.cogsci.soton.ac.uk/~harnad/
Highfield, Southampton            http://www.princeton.edu/~harnad/
SO17 1BJ UNITED KINGDOM           ftp://ftp.princeton.edu/pub/harnad/
