Sunday, January 31. 2010
Annual Costs Per Deposit of Hosting Refereed Research Output Centrally Versus Institutionally

SANDY THATCHER: "it's the peer review that is the most expensive part of the whole process, and arXiv is not in the business of peer reviewing."

DAVID PROSSER: "Is that true, Sandy? Can we have a reference please? Tenopir and King back in 2004 suggested that 'manuscript receipt processing, disposition decision-making, identifying reviewers or referees and review processing' constituted 26% of the direct costs of producing an article (which they estimated at $1700 on average). Of course, costs may have shifted in the years since then. Which is why a reference would be welcome."

What Sandy Thatcher said is perfectly correct:

(1) The cost of providing peer review (c. $500 per article -- though more efficient online procedures could lower that) is indeed the most expensive part of the process of providing a peer-reviewed article for free (OA) by depositing it in a central repository like Arxiv (or in the author's own Institutional Repository, IR).

(2) And Arxiv does not provide the peer review. (Nor does any other repository.)

(3) Low as it is, $7 per article just for deposit and archiving is probably an overestimate, because Arxiv needs to do far too much work to process and store all the world's institutions' physics deposits centrally: It would cost even less per article for an Institutional Repository (IR) that archives only its own annual research output (and knows all its own researchers, hence need not do the extra generic precautionary controls). (Be careful not to inflate the estimate by factoring in the costs of online infrastructure that the institution already has, regardless of whether it has an IR: count just the one-time IR set-up cost, the extra server and disk-space, etc., plus the cost per deposit and annual maintenance of the IR only.)
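The accounting described in point (3) can be written out as a small calculation. This is only a sketch: the function name, the amortization period and every figure in the example are hypothetical placeholders, not estimates from any real IR, and the annual maintenance cost is assumed to be independent of how many articles are deposited.

```python
def cost_per_deposit(setup_cost, amortization_years, annual_maintenance,
                     deposits_per_year, marginal_cost_per_deposit=0.0):
    """Annual IR-only cost divided by the number of deposits in that year.

    Only IR-specific costs are counted (one-time set-up, spread over
    `amortization_years`, plus annual maintenance and any per-deposit cost);
    infrastructure the institution would have anyway is deliberately left out.
    """
    if amortization_years <= 0 or deposits_per_year <= 0:
        raise ValueError("amortization_years and deposits_per_year must be positive")
    annual_cost = (setup_cost / amortization_years
                   + annual_maintenance
                   + marginal_cost_per_deposit * deposits_per_year)
    return annual_cost / deposits_per_year


# Hypothetical institution publishing 1000 refereed articles per year.
ANNUAL_OUTPUT = 1000
for capture_rate in (1.00, 0.15):
    deposits = int(ANNUAL_OUTPUT * capture_rate)
    cost = cost_per_deposit(setup_cost=10_000, amortization_years=5,
                            annual_maintenance=3_000,
                            deposits_per_year=deposits,
                            marginal_cost_per_deposit=1.0)
    print(f"capture {capture_rate:.0%}: {deposits} deposits, {cost:.2f} per deposit")
```

With these made-up inputs the fixed costs are spread over far fewer articles at 15% capture than at 100% capture, which is why figures from near-empty IRs would not be a fair basis for comparison.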
It would be useful to have IRs' estimates of their annual cost per article deposited -- but only from mature mandated IRs that are already well on the way to capturing 100% of their annual institutional output of refereed journal articles. (Obviously the IR price per article will be somewhat higher for IRs that are still capturing only 15% or less of their annual refereed research output, as most IRs today still are, because they have not yet mandated deposit.)

Another useful comparison would be the cost -- in money and time -- of doing the unnecessary IR "quality controls" and preprocessing that many IRs think, superstitiously and superfluously, that they need to do. (In this case, estimates from all the immature, near-empty IRs are relevant too.)

At Southampton ECS, the first mandated IR of all (since 2002), we realized within the first year of the mandate that the "quality control" (for the content and metadata of the deposit) was based on a completely unnecessary and dysfunctional misanalogy with library collections and cataloguing. All it did was create needless work and backlogs for the "quality-controllers" and needless resistance and counterproductive resentment from depositing authors who, having taken the trouble to deposit their refereed final drafts, as mandated, were then denied the satisfaction of seeing their deposits go immediately online and start getting downloaded: instead, their deposits had to go into a quality-control queue, sometimes for days or weeks, as the volume of mandated deposits to "process" grew. We quickly jettisoned the gratuitous process and have seen the IR's deposits growing happily ever since.

Leave any "quality control" for your institutional authors' peer-reviewed final drafts in the background. If something is wrong, users will let the author know; if users don't squawk (or there are no users!), the slip-up probably isn't even worth correcting.
Focus on solving the real problem, which is not "quality control" but capturing the IR's target content: the institution's full annual output of refereed research. And remember that -- whilst journals still exist and subscriptions are still paying for their quality control -- your IR is not hosting the all-important version-of-record, but merely an OA supplement. A word to the wise...

Stevan Harnad
American Scientist Open Access Forum

Saturday, January 30. 2010
Arxiv Arcana
Nat Gustafson-Sundell wrote:
NGS: "I don't expect local repositories to ever offer quality control."

Of course not. They are merely offering a locus for authors to provide free access to their preprint drafts before submitting them to journals for peer review, and to their final drafts (postprints) after they have been peer-reviewed and accepted for publication by a journal. Individual institutions cannot peer-review their own research output (that would be in-house vanity-publishing). And global repositories like arxiv or pubmedcentral or citeseerx or google scholar cannot assume the peer-review functions of the thousands and thousands of journals that are actually doing the peer review today. That would add billions to their costs (making each into one monstrous (generic?) megajournal: near impossible, practically, if it weren't also totally unnecessary -- and irrelevant to OA and its costs).

NGS: "Also, users have said again and again that they prefer discovery by subject, which will be possible for semantic docs in local repositories or better indexes (probably built through better collaborations), but not now."

Search should of course be central and subject-tagged, over a harvested central collection from the distributed local IRs, not local, IR by IR. (My point was that central deposit is no longer necessary nor desirable, either for content-provision or for search. The optimal system is institutional deposit (mandated by institutions as well as funders) and then central harvesting for search.)

NGS: "I agree that it would be great if local repositories were more used, and eventually, the systems will be in place to make it possible, but every study I've seen still shows local repository use to remain disappointingly low, although some universities are doing better than others."

"Use" is ambiguous, as it can refer both to author use (for deposit) and user use (for search and retrieval). We agree that the latter makes no sense at the IR level: users search at the harvester level, not the IR level.
But for the former (low author "use," i.e., low levels of deposit), the solution is already known: Unmandated IRs (i.e., most of the existing c. 1500 IRs) are near empty (of OA's target content, which is preprints and postprints of peer-reviewed journal articles) whereas mandated IRs (c. 150, i.e., 1%!) are capturing (or on the way to capturing) their full annual postprint output.

So the solution is mandates. And the locus of deposit for both institutional and funder mandates should be institutional, not central, so the two kinds of mandates converge rather than compete (requiring multiple deposit of the same paper).

For the special case of arxiv, with its long history of unmandated deposit, a university's IR could import its own remote arxiv deposits (or export its local deposits to arxiv) with software like SWORD, but eventually it is clear that institution-external deposit makes no sense: Institutions are the universal providers of all peer-reviewed research, funded and unfunded, across all fields. One-stop/one-step local deposit (followed by automatic import, export, and harvesting to/from whatever central services are needed) is the only sensible, scalable and sustainable system, and also the one that is most conducive to the growth of universal OA deposit mandates from institutions, reinforced by funder mandates likewise requiring institutional deposit, rather than discouraged by gratuitously requiring institution-external deposit.

NGS: "Inter-institutional repositories by subject area (however broadly defined) simply work better, such as arXiv or even the Princeton-Stanford repository for working papers in the classics."

"Work better" for what? Deposit or search?
You are conflating the locus of search (which should, of course, be cross-institutional) with the locus of deposit, which should be institutional, in order to accelerate institutional deposit mandates and in order to prevent discouraging adoption and compliance because of the prospect of having to deposit the same paper in more than one place. (Yes, automatic import/export/harvesting software is indifferent to whether it is transferring from local IRs to central CRs or from central CRs to local IRs, but the logistics and pragmatics of deposit and deposit mandates -- since the institution is always the source of the content -- make it obvious that one-time deposit institutionally fits all output, systematically and tractably, whereas willy-nilly IR/CR deposit, depending on fields' prior deposit habits or funder preferences, is a recipe for many more years of the confusion, inaction, absence of mandates, and near-absence of OA content that we have now.)

NGS: "Currently, universities are paying external middlemen an outsized fee for validation and packaging services. These services can and should be brought "in-house" (at least as an ideal/goal to develop toward whenever the opportunities can be seized) except in cases where prices align with value, which occurs still with some society and commercial publications."

I completely agree that along with hosting their own peer-reviewed research output, and mandating its deposit in their own IRs, institutions can also use their IRs (along with specially developed software for this purpose) to showcase, manage, monitor, and measure their own research output. That is what OA metrics (local and global) will make possible. But not till the problem of getting the content into OA IRs is solved. And the solution is institutional and funder mandates -- for institutional (not institution-external) deposit.
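For readers unfamiliar with what "deposit locally, harvest centrally" involves at the protocol level, here is a minimal sketch. It assumes the repositories expose the standard OAI-PMH interface and its `ListRecords` verb with Dublin Core (`oai_dc`) metadata; the base URLs are hypothetical placeholders, and no request is actually sent.

```python
from urllib.parse import urlencode


def list_records_url(base_url, metadata_prefix="oai_dc", from_date=None):
    """Build an OAI-PMH ListRecords request URL for one repository.

    `from_date` (YYYY-MM-DD) restricts the harvest to records added or
    changed since that date, so a harvester can run incrementally.
    """
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if from_date is not None:
        params["from"] = from_date
    return f"{base_url}?{urlencode(params)}"


# Hypothetical institutional repositories; a central harvester would loop
# over many such endpoints and index the metadata it gets back.
REPOSITORIES = [
    "https://ir.university-a.example/oai",
    "https://ir.university-b.example/oai",
]

for base in REPOSITORIES:
    print(list_records_url(base, from_date="2010-01-01"))
```

The point of the sketch is that the harvester needs nothing from the depositing author: the paper is deposited once, institutionally, and any number of central services can collect its metadata afterwards.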
NGS: "To the extent that an arXiv or the inter-institutional repository for humanities research which will be showing up in 3-7 years moves toward offering these services, they are clearly preferable to old fashioned subscription models (since the financial support is for actual services) and current local repositories which do not offer everything needed in the value chain (as listed in Van de Sompel et al. 2004)."

(1) The reason 99% of IRs offer no value is that 99% of IRs are at least 85% empty. Only the 1% that are mandated are providing the full institutional OA content -- funded and unfunded, across all disciplines -- that all this depends on.

(2) The central collections, as noted, are indispensable for the services they provide, but that does not include locus of deposit and hosting: There, central deposit is counterproductive, a disservice.

(3) With local hosting of all their research output, plus central harvesting services, institutions can get all they need by way of search and metrics, partly through local statistics, partly from central ones.

NGS: "I remember when I first read an article quoting a researcher in an arXiv covered field who essentially said that journals in his field were just for vanity and advancement, since all the "action" was in arXiv (Ober et al. 2007 quoting Manuel 2001 quoting McGinty 1999) -- now think about the value of a repository that doesn't just store content and offer access."

This familiar slogan, often voiced by longstanding arxiv users, that "Journals are obsolete: They're only for tenure committees. We [researchers] only use the arxiv" is as false, empirically, as it is incoherent, logically: It is just another instance of the "Simon Says" phenomenon: (Pay attention to what Simon actually does, not to what he says.)
Although it is perfectly true that most arxiv users don't bother to consult journals any more -- using the OA version in arxiv only, and referring to the journal's canonical version-of-record only in citing -- it is equally (and far more relevantly) true that they all continue to submit all those papers to peer-reviewed journals, and to revise them according to the feedback from the referees, until they are accepted and published. That is precisely the same thing that all other researchers are doing, including the vast majority that do not self-archive their peer-reviewed postprints (or, even more rarely, their unrefereed preprints) at all.

So journals are not just for vanity and advancement; they are for peer review. And arxiv users are just as dependent on that as all other researchers. (No one has ever done the experiment of trying to base all research usage on nothing but unrefereed preprints and spontaneous user feedback.)

So the only thing that is true in what "Simon says" is that when all papers are available, OA, as peer-reviewed final drafts (and sometimes also supplemented earlier by the prerefereeing drafts) there is no longer any need for users or authors to consult the journal's proprietary version of record. (They can just cite it, sight unseen.) But what follows from that is that journals will eventually have to scale down to becoming just peer-review service-providers and certifiers (rather than continuing also to be access-providers or document producers, either on-paper or online).

Nothing follows from that about the value of repositories, except that they are useless if they do not contain the target content (at least after peer review, and, where possible and desired by authors, also before peer review).

Harnad, S. (1998/2000/2004) The invisible hand of peer review. Nature [online] (5 Nov. 1998); Exploit Interactive 5 (2000); and in Shatz, B. (2004) (ed.) Peer Review: A Critical Inquiry. Rowman & Littlefield. Pp. 235-242.
NGS: "Do I think the financial backing will remain in place? It depends on the services actually offered and to what extent subject repositories could replace a patchwork system of single titles offered by a patchwork of publishers."

At the moment the issue is whether arxiv, such as it is (a central locus for institution-external deposit of institutional research content in some fields, mostly physics, plus a search and alerting service), can be sustained by voluntary sub-sidy/scription -- not whether, if arxiv also somehow "took over" the function of journals (peer review), that too could be paid for by voluntary sub-sidy/scription...

NGS: "Universities could save a great deal by refusing to pay the same overhead over and over again to maintain complete collections in single subject areas (not to mention paying for other people's profits)."

I can't quite follow this: You mean universities can cancel journal subscriptions? How do those universities' users then get access to those cancelled journals' contents, unless they are all being systematically made OA? Apart from those areas of physics where it has already been happening since 1991, that isn't going to happen in most other fields till OA is mandated by the universal providers of that content, the universities (reinforced by mandates from their funders).

Then (but only then) can universities cancel their journal subscriptions and use (part of) their windfall saving to pay (journals!) for the peer-review of their own research output, article by article (instead of buying in other universities' output, journal by journal).
NGS: "More importantly, more could be done to make articles useful and discoverable in a collaborative environment, from metadata to preservation, so that the value chain is extended and improved (my sci-fi includes semantic docs, not just cataloged texts, and improved, or multi-stage, peer review, or peer review on top of a working papers repository)."

All fine, and desirable -- but not until all the OA content is being provided, and (outside of physics), it isn't being provided -- except when mandated... So let's not build castles in Spain before we have their contents safely in hand.

NGS: "I think there's been plenty of 'chatter' to indicate that the basic assumptions in conversations between universities are changing (see recent conference agendas), so that we can expect to see more and more practical plans to collaborate on metadata, preservation, and, yes, publications."

I'll believe the "chatter" when it has been cashed into action (deposit mandates). Till then it's just distraction and time-wasting.

NGS: "My head spins to think of the amount of money to be saved on the development of more shared platforms, although, the money will only be saved if other expenditures are slowly turned off."

All this talk about money, while the target content -- which could be provided at no cost -- is still not being provided (or mandated)...

NGS: "Sandy mentioned in another post that she [he] would hope for arXiv like support for university monographs..."

Monographs (not even a clearcut case, like peer-reviewed articles, which are all, already, author give-aways, written only for usage and impact) are moot, while not even peer-reviewed articles are being deposited, or mandated...
NGS: "Open access and NFP publications which do offer the full value chain have been proven to have much lower production costs per page than FP publishers and they do not suffer any impact disadvantages -- and these are still operated on a largely stand-alone basis, without the advantages that can be gained by sharing overhead."

Cash castles in Spain again, while the free content is not yet being provided or mandated...

NGS: "Maybe local repositories really are the way to go, since then each institution has more control over its own contribution, but the collaboration and the support will still need to occur to support discovery (implying metadata, both in production and development of standards and tools) and preservation."

No, search and preservation are not the problem: content is.

NGS: "I suppose another problem with local repositories, however, is that a consensus is far less likely to unite around local repositories as a practical option at this juncture -- the case can't just be made with words, you need the numbers and arXiv has them -- and while I am interested to see strong local repositories emerge, there is greater sense in supporting what can be achieved, since we need more steps in the right direction."

"The numbers" say the following: Physicists have been depositing their preprints and postprints spontaneously (unmandated) in arxiv since 1991, but in the ensuing 20 years this commendable practice has not been taken up by other disciplines. The numbers, in other words, are static, and stagnant. The only cases in which they have grown are those where deposit was mandated (by institutions and funders). And for that, it no longer makes sense (indeed it goes contrary to sense) to deposit them institution-externally, instead of mandating institutional deposit and then harvesting centrally.
And the virtue of that is that it distributes the costs of managing deposits sustainably, by offloading them onto each institution, for its own output, instead of depending on voluntary institutional sub-sidy/scription for obsolete and unnecessary central deposit.

(See also the "denominator fallacy," which arises when you compare the size of central repositories with the size of institutional repositories: The world's 25,000 peer-reviewed journals publish about 2.5 million articles annually, across all fields. A repository's success rate is the proportion of its annual target contents that are being deposited annually. For an institution, the denominator is its own total annual peer-reviewed journal article output across all fields. For a central repository, it is the total annual article output -- in the field(s) it covers -- from all the institutions in the world. Of course the central repository's numerator is greater than any single institutional repository's numerator. But its denominator is far greater still. Arxiv has famously been doing extremely well for certain areas of physics, unmandated, for two decades. But in other areas arxiv is not doing so well, relative to the field's true denominator; and most other central repositories are likewise not doing well. In fact, it is pretty certain that -- apart from physics, with its 2-decade tradition of deposit, plus a few other fields such as economics (preprints) and computer science -- unmandated central repositories are doing exactly as badly as unmandated institutional repositories are doing, namely, about 15%.)

Stevan Harnad
American Scientist Open Access Forum

Simplify OA Deposit But Leave It In the Mandatee's Hands
Congratulations to MIT for this extremely helpful streamlining of the deposit process:

"MIT Libraries began to investigate how SWORD and SWAP could facilitate external contributions by publishers... Entering long and complex information about articles is avoided with the MIT Libraries' customized submission interface. Only two pieces of metadata are required for already published papers: the name of the authorizing MIT author and a DOI or URL. If the paper is unpublished, four fields are requested."

Although entering metadata is not really that complicated or time-consuming at all, we know it is difficult to persuade those who have never deposited a paper in an institutional repository of this fact. So reducing deposit to just entering a name and URL would be a huge step forward in facilitating mandate compliance -- and of course also in encouraging unmandated deposit. I hope we will implement this quickly for EPrints repositories too.

I am, however, far less sanguine about the second -- publisher-deposit -- option, especially for mandated deposit:

"the use of SWORD and SWAP with the DSpace repository at MIT is part of a larger strategy to improve collaboration with publishers, facilitating a 'push' of large amounts of content into a repository without necessitating a platform-specific solution. Ultimately this 'publisher template' could be used with other repository platforms such as Fedora and EPrints. Richard Rodgers, Head of Software Development at MIT Libraries, says, 'If we do this right there will be no code to share. SWORD and SWAP are already open and accessible. We have localized their use to accommodate MIT-specific metadata.'"

It might be all right to quietly provide a way for publishers to facilitate IR deposit, but it would be a huge strategic error to give them an active or essential hand in it. All the power of self-archiving (and of self-archiving mandates from institutions and funders) comes from the fact that it is the author and the author's institution (and funder) that does it, mandates it, and monitors compliance.
Self-archiving -- its doing and its timing -- is all in the research community's own hands. Publisher deposit is not. The little extra content that publisher-deposit or publisher-facilitated deposit might add does not counterbalance the additional author confusion, deposit delay, diffusion of responsibility and difficulty in compliance-monitoring that it is likely to introduce into institutional mandates, as it has already done with those funder mandates that allow fundees to offload their mandate fulfillment obligations onto publishers.

The problem is especially with specifying and monitoring the fulfillment conditions for deposit mandate compliance. (We always have to remember that publishers are neither employees nor fundees, and hence they are not the ones subject to the deposit mandates.) (What kind of mandate is it if it says "You must deposit -- unless your publisher does it for you..."? How is it even to be monitored whether and when the mandate has been complied with?)

So if repositories implement some sort of back door for publisher-facilitated deposit, it is important to keep a low profile on it and to stress that on no account should it be stipulated or relied on as one of the ways to fulfill a deposit mandate: Complying with the mandate must be entirely the responsibility of the author, and the monitoring and verification of compliance must be based entirely on steps taken by the author, not steps the authors leave to a publisher to (possibly) take (sometime) on their behalf...

Stevan Harnad
American Scientist Open Access Forum

Saturday, January 23. 2010
Sub-sidy/scription Business Model for Sustaining ArXiv?
Cornell University Library has proposed a "Collaborative Business Model" for funding the worldwide Physics ArXiv that it hosts (see white paper).

"arXiv will remain free for readers and submitters, but the Library has established a voluntary, collaborative business model to engage institutions that benefit most from arXiv."

Here's an alternative to this voluntary institutional sub-sidy/scription model whose sustainability -- through all economic times, tough and tender -- is less founded on blind faith: Institutions have many self-interested reasons for wanting to host, archive, manage, monitor, measure and showcase their own research article outputs. The annual scale of their own local article output is also manageable and sustainable at the institutional level, within each institution's existing infrastructure:

Carr, L. The Value that Repositories Add

Hence what will happen is that instead of trying to sustain a central repository like Arxiv -- most of whose costliness derives from the fact that it is a single direct locus of deposit and archiving from all institutions, worldwide -- direct deposit and hosting (and its costs) will instead be offloaded and distributed across the network of institutional repositories, with Arxiv becoming merely another central harvester, providing global search services (sustainable if it provides functionality that can compete with other OAI services or Google Scholar).

But voluntary sub-sidy/scription will no doubt sustain things for a while. (Things do seem to catch on rather slowly in this domain...)

Stevan Harnad
American Scientist Open Access Forum

Saturday, January 16. 2010
Creating Institutional Repositories Is Not the Problem
The Undergraduate Science Librarian wrote:

"For a small institution like mine, having our own institutional repository might not make sense. We probably don't have the library staff to run it well... [F]or many of our faculty, their only way of archiving their papers may be to post them on their own personal website, where they might not be as easy to find..."

REPLY:

(1) If a campus has the infrastructure to host a website at all (as SUNY Geneseo clearly does), it has the infrastructure to host its own institutional repository (IR) for its own research article output. (If not, it probably does not have the infrastructure to conduct research at all.)

(2) Library staff are not needed to host an IR.

(3) All that's needed is some disk space on one of the institution's webservers, plus the installation of (free, open-source) IR software.

(4) Even individuals can install and host the IR software on their own PCs or their personal websites. (There is even a (free) Microsoft Windows version of EPrints.)

(5) All EPrints IR installations are OAI-compliant, hence harvested for searchability by all of the major search engines: scirus, scopus, citeseerx, citebase, oaister, base, etc., as well as google and google scholar (the major ports of entry for all IRs). (Worries about IR deposits not being "easy to find" are based on a profound misunderstanding of search over distributed OAI-harvestable contents.)

(6) CalTech alone hosts 26 EPrints IRs.

(7) The importance of institutional IR installations is as the convergent locus of deposit for Open Access self-archiving mandates (without which all IRs, personal or institutional, are doomed to lie fallow).

Stevan Harnad
American Scientist Open Access Forum

Friday, January 8. 2010
ROAR Registry of Open Access Repositories Upgraded to Power of EPrints Functionality
The ROAR Registry of Open Access Repositories has just been upgraded to the full power of the EPrints software's remarkable functionality.
Please come and explore the power of ROAR to display and track repository size, contents and growth across time, by country, repository type, and many performance parameters. This will make it possible to monitor and analyze repository growth worldwide, and to encourage institutions to create their own repositories and adopt deposit mandates for filling them. Please also register your repository if it is not yet in ROAR. The new functionality means you will own your record and will be able to update it as it progresses. And register your institution's Open Access Mandate in ROARMAP, ROAR's companion service, tracking mandate growth worldwide.
The American Scientist Open Access Forum has been chronicling and often directing the course of progress in providing Open Access to Universities' Peer-Reviewed Research Articles since its inception in the US in 1998 by the American Scientist, published by the Sigma Xi Society. The Forum is largely for policy-makers at universities, research institutions and research funding agencies worldwide who are interested in institutional Open Access Provision policy. (It is not a general discussion group for serials, pricing or publishing issues: it is specifically focussed on institutional Open Access policy.)