This is a summary (from my own viewpoint) of the Washington meeting this weekend sponsored by the American Society for Information Science & Technology (ASIST), organized by Michael Leach (Harvard, President, ASIST):
Digital Archives for Science and Engineering Resources
(DASER 2)
(For some other slants on DASER 2, see these two blogs; but beware, as they do contain some notable garbles and omissions, having been blogged in real time by
Dorothea Salo and
Christina Pikas.)
DASER 2 rehearsed some familiar developments, highlighted some of them, and brought out one potentially important new one (re. the
NIH Public Access Policy).
The familiar developments were the worldwide growth in
Institutional Repositories (IRs), and in new services to help institutions to create, maintain or even host IRs:
ProQuest (using
Bepress software),
BioMed Central OpenRepository (using
Dspace software) and
Eprints Services (using
Eprints software).
Fedora software was also discussed, but it was quite apparent (at least to me!) that Fedora's target was not the urgent priority at this DASER meeting. The meeting's specific focus was digital science/engineering resources: hence Open Access (OA) IRs in particular, targeting the self-archiving of institutional peer-reviewed science/engineering article output in order to maximise its visibility, usage and impact, rather than digital curation in general. Fedora's target is much wider and more diffuse: the collection and curation of any and all institutional digital content, incoming or outgoing, research or otherwise. Indeed, there are good reasons for expecting that if the IR movement first puts its full weight and energy behind the focussed archiving of 100% of each institution's own OA IR target content, that will itself prove to be the most effective way to launch and advance the more general digital-curation agenda too.
There was likewise considerable time devoted to the future of publishing, with much discussion of OA publishing and the possibility of an eventual transition to
OA publishing. But here too, the lesson was that the best contribution that OA IRs in particular can make to this possible/eventual transition is to hasten their own transition to the institutional self-archiving of 100% of their own OA target content.
Present and contributing very constructively were the two Learned Society Publishers in whose discipline author self-archiving has been going on the longest, and has gone the farthest (having reached 100% years ago in some fields): The
American Physical Society (the first publisher to adopt [in 1994] an explicit "green" policy on author self-archiving [today about 76% of publishers and 93% of journals are
green]) and the
Institute of Physics (likewise green, along with some notable experiments in "gold" OA publishing).
The keynote speaker was Jan Velterop, formerly publisher of "pure gold" BioMed Central, and now director of OA for Springer's "optional gold"
Open Choice. Jan's main concern was (understandably) to encourage authors to pick the gold option and to encourage their institutions and research councils to fund the author costs.
Jan applauded the growth in the IR movement but noted a substantial decrease in the number of postings on the
American Scientist Open Access Forum (AmSci) in 2004-2005 compared to prior years, and worried that this might reflect a decrease in OA momentum.
On the contrary: the decreased AmSci volume was intentional. In 2004, a new policy for AmSci postings was announced, reserving the Forum for concrete, practical discussion of institutional and research-funder OA policy design and implementation. AmSci's former open-ended (and unending) philosophical and ideological debate about open access was instead redirected to the many other OA lists that have sprung up since the AmSci OA Forum's inception in 1998:
"[T]his Forum, the first of what is now a half dozen lists devoted to OA matters, is -- as has been announced several times -- now reserved for the discussion of concrete, practical means of accelerating OA growth." [December 2004]
The DASER conference also devoted time and thought to the future of librarians in the digital and OA era; again, insofar as IRs are concerned, a good investment of librarians' available time, energy and resources is in helping to create and fill IRs, first OA IRs, and then eventually expanding them to wider and wider digital content, thereby again facilitating the inevitable and desirable transition. (My own personal view, however, is that librarians should abstain from speculation about the future of peer review, which is not really their field of expertise; I also think retraining librarians to become institutional in-house publishers may not be the best use of their time and talents.)
That librarians can be an enormous help in getting institutional authors to deposit their OA content in their IRs was illustrated in
my own talk, using examples from around the world (
CERN, U.
Minho,
Southampton ECS) but with especially striking data from Australia (with thanks to
Arthur Sale, University of Tasmania and Paula Callan,
Queensland University of Technology). I also reported on the growing evidence for the dramatic
OA research impact advantage across all disciplines, now including the humanities and social sciences, and its implications for research and researcher funding and progress.
The OA impact advantage, IRs, and librarian-help are all
necessary conditions for filling IRs with OA content, but to make them into a jointly
sufficient condition, one further critical component is needed, and this has been demonstrated in case after case: The only IRs that are well along the road toward 100% OA are the ones that also have an institutional self-archiving requirement. Without that, spontaneous OA self-archiving is hovering at about 5% - 15% globally.
Which brings us to the last and newest development reported at DASER: The
NIH Public Access Policy is flawed and failing -- its deposit rate is at about 2%, which is even
below the global average for spontaneous self-archiving. But the good news is that NIH has realized this, and is planning to do something about it. The question is: what? There is a committee to look at this question, but at a quick glance, it does not seem to include those who actually know what needs to be done, and how, to make the NIH policy work. Represented are librarians and publishers, but missing are the institutional OA policy-makers who can make self-archiving work.
But the solution is simple, and NIH can do it, very easily. First, it is important to face the 3 flaws of the current NIH policy very forthrightly. Here they are, in order of severity:
(1) Deposit is requested rather than required.
(2) The request is not for immediate deposit but deposit within one year of publication.
(3) The request is for deposit in PubMed Central (PMC) rather than in the author's own IR, from which PMC could harvest it.
The reason the deposit is not required and not immediate is related to the reason the deposit is in PMC instead of the author's own IR: NIH has cast itself in the role of a 3rd-party access-provider, via PMC. This is fine, for its own funded research. But then NIH must deal with its publishers and their conditions (which include access-embargoes of up to 12 months, in order to protect against perceived risks to their revenues).
OA itself does not require a 3rd-party access-provider. All it requires is OA! And for that, any
OAI-compliant archive, whether the author's own institutional repository or a central repository like PMC, will do, because they are all equivalent and interoperable, in the OAI-compliant age, and all accessible to any user or harvester webwide.
So NIH can have what it wants -- 100% of its funded content in PMC within a year of publication -- while still requiring the author's final draft to be deposited immediately upon acceptance for publication (
preferably in the author's IR, harvestable by PMC, but absent that, directly deposited in PMC).
That leaves only the question of how to set the access-privileges, and now those can be merely the subject of a (strong) request to set them to OA immediately upon deposit -- but with the option left open (sic) for the author to set access instead as
restricted to institution-internal and
PMC-harvestable (or, for PMC,
PMC-administrative-only) if the author has reason to prefer that (the reason presumably being that the article is published in one of the 7% of journals that are not yet
green on immediate OA self-archiving).
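The separation between the deposit requirement and the access setting can be made concrete with a small sketch. This is only an illustration under my own assumptions: the names `Deposit`, `Access`, `complies` and `recommended_access` are hypothetical and do not come from NIH, PMC or any repository software, and the rule in `recommended_access` is one reading of "strongly requested OA, with a restricted fallback for non-green journals".

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    # Hypothetical labels for the two access settings described above.
    OPEN = "open access"
    RESTRICTED = "institution-internal, PMC-harvestable"


@dataclass
class Deposit:
    title: str
    journal_is_green: bool         # does the journal endorse immediate OA self-archiving?
    deposited_on_acceptance: bool  # was the final draft deposited immediately?
    access: Access                 # chosen by the author, not by the funder


def complies(deposit: Deposit) -> bool:
    """Compliance depends only on immediate deposit, never on the access setting."""
    return deposit.deposited_on_acceptance


def recommended_access(deposit: Deposit) -> Access:
    """OA is the strong request; restriction is the fallback for non-green journals."""
    return Access.OPEN if deposit.journal_is_green else Access.RESTRICTED


paper = Deposit("Some NIH-funded article", journal_is_green=False,
                deposited_on_acceptance=True, access=Access.RESTRICTED)
print(complies(paper), recommended_access(paper).value)
```

The point of the design is that `complies` never inspects `access`: the requirement is about depositing, and the access setting stays in the author's hands.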
Is this merely a way of tweaking the current NIH policy so as to get deposits up to 100% without getting immediate OA up to 100%? The answer is: Yes and No. Yes, this policy will immediately drive up NIH deposits from their current 2% level to 100%, because deposit will be a fulfilment condition on receiving the NIH grant. But no, it is not true that it will not generate immediate 100% OA. For it can generate that too, with a far smaller delay-loop than 12 months: something more of the order of a few minutes to 12 hours at most:
The solution is very simple (and we are already building it into the
Eprints IR software): The metadata (author, title, journal, date, abstract, etc.) are of course all immediately OA for 100% of deposited papers, regardless of how the access-privileges for the full-text are set. That means that from the moment the text is deposited, the metadata are visible and accessible to all would-be users and harvesters webwide, thanks to OAI and the
OAI search engines, as well as to
Google Scholar and the non-OAI search engines.
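To illustrate what "harvestable" means in practice, here is a minimal sketch of how a harvester might build an OAI-PMH request for a repository's metadata. The `verb`, `metadataPrefix` and `from` parameters are part of the OAI-PMH protocol; the repository address and the function name `list_records_url` are assumptions made up for this example, and the sketch only constructs the URL without contacting any server.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL for a repository's OAI endpoint."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date:
        # Optional selective harvesting: only records changed since this date.
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical repository endpoint, used purely for illustration.
print(list_records_url("https://repository.example.edu/oai", from_date="2005-12-01"))
```

Because every OAI-compliant archive answers the same kind of request, a central service such as PMC could in principle gather metadata from many institutional archives in the same way, whatever the access setting on the full-text.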
But what about the full-text? For about 7% of journal articles (the ones in the
non-green journals), access might not be immediately set to OA. What the Eprints software will do when a would-be user encounters this dead-end is that the IR interface will provide a link that will pop up a window allowing the user to send an automatic email to the author (whose email address is part of the IR's internal metadata) requesting to be emailed an eprint of the full-text in question. The requester's email will be sent by the software -- automatically and immediately -- to the author, with a prepared URL that the author need merely click on, in order to have the eprint immediately emailed to the would-be user. (Eprints requests will be counted, as will direct downloads and eventually also
citation links.)
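The request loop just described can be sketched in a few lines. This is not the Eprints implementation: the class and method names (`Eprint`, `RequestService`, `request_copy`, `approve`), the one-time token and the `/approve` URL are all hypothetical, and the messages are returned as plain dictionaries rather than actually emailed.

```python
import secrets
from dataclasses import dataclass, field


@dataclass
class Eprint:
    eprint_id: int
    title: str
    author_email: str       # internal metadata, never shown to the requester
    request_count: int = 0  # eprint requests are counted, like downloads


@dataclass
class RequestService:
    base_url: str
    pending: dict = field(default_factory=dict)  # token -> (eprint, requester email)

    def request_copy(self, eprint, requester_email):
        """User clicked the request link: build the message sent to the author."""
        eprint.request_count += 1
        token = secrets.token_urlsafe(16)  # one-time token for the approval link
        self.pending[token] = (eprint, requester_email)
        return {
            "to": eprint.author_email,
            "subject": f"Eprint request: {eprint.title}",
            "body": f"{requester_email} has requested a copy. "
                    f"To send it, click: {self.base_url}/approve?token={token}",
        }

    def approve(self, token):
        """Author clicked the prepared URL: build the message sent to the requester."""
        eprint, requester_email = self.pending.pop(token)
        return {
            "to": requester_email,
            "subject": f"Eprint: {eprint.title}",
            "attachment": f"eprint-{eprint.eprint_id}.pdf",
        }


service = RequestService("https://repository.example.edu")
paper = Eprint(42, "A restricted-access article", "author@example.edu")
to_author = service.request_copy(paper, "reader@example.org")
token = to_author["body"].split("token=")[1]
print(service.approve(token)["to"])
```

The author's only action is the single click that triggers `approve`; everything else is handled by the software, which is what keeps the delay down to minutes or hours rather than months.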
This author-mediated access-provision is not quite as convenient, instantaneous or sensible as immediately setting the full-text to unmediated OA, allowing the user to just click to download it; but it is effective 100% OA just the same. And NIH can (as now) harvest the full-text whenever it likes, and can go on to make it OA in PMC whenever it elects to. None of that will be holding back OA any longer.
This immediate-deposit requirement is also the form that the
Research Councils UK (RCUK) Self-Archiving Policy is now taking; and this offers a general model for the rest of the world to adopt too.
Note that this slightly modified policy completely side-lines all publisher objections: It is merely a deposit requirement, not an OA access-setting requirement. It is left up to researchers and the would-be users of their research to sort out access-provision according to the needs of research -- exactly as it should be.
This is of course also the policy that
institutions should adopt, for their own institutional research output, whether or not funded by NIH or RCUK. An immediate-deposit requirement will result in IRs worldwide filling virtually overnight (at long last).
(The other thing NIH should do is to couple its deposit requirement with an explicit statement of NIH's readiness to cover OA journal publication charges for those NIH fundees who choose to publish their findings in an OA journal.)
P.S. In the last session (which I had to miss, to catch my plane) David Stern suggested central archiving as the way to induce more self-archiving. Unfortunately, that's not the solution, because the problem is not that authors can't find an IR to deposit in: it's that only about 15% of authors are self-archiving spontaneously today (i.e., there are plenty of Institutional Repositories, but they are near-empty). Hence what's needed is not central archiving but "central" self-archiving mandates (from the authors' institutions and funders). Central archives are fine as a provisional locus for authors to self-archive until their institutions have an IR, but they are neither necessary nor optimal otherwise. Centralism is obsolescent in the OAI era: distributed interoperable archives, harvested centrally, are the natural way forward. Local institutions, being the primary research content-providers, are the best placed, and the most motivated, to mandate, monitor, and reward compliance with a self-archiving policy for their own institutional research output that is to the joint benefit of authors, their institutions, and their funders (but no other central entity).
Stevan Harnad