Tuesday, July 27. 2010
The Mandate of Open Access Institutional Repository Managers
In a UKSG Serials News posting, "Are we nearly there yet? On the road to open access", Graham Stone [GS], Repository Manager, University of Huddersfield and Chair, UK Council of Research Repositories (UKCoRR) wrote:
GS: "Not too long ago, I took a phone call from an academic colleague from the Health Sciences regarding the submission of an article to Biomed Central. [The colleague] phoned me as I am the 'Repository guy' and [the colleague was] learning to play the 'Repository game', that is getting their work out there on open access and increasing their citations. [The colleague was] very impressed that so many people downloaded their last paper within days of it appearing in the Repository."
This upbeat-sounding paragraph is unfortunately a series of (familiar) misunderstandings and non-sequiturs about Open Access (OA) and Institutional Repositories (IRs): (1) Biomed Central (BMC) is a gold OA (pay-to-publish) journal publisher. (2) Publishing in a BMC journal has nothing to do with depositing an article in "the Repository." Which Repository -- Huddersfield's? You don't need to publish in a pay-to-publish gold OA journal in order to deposit in a green OA Institutional Repository (IR) like Huddersfield's, nor in order to benefit from the increased downloads and citations that OA makes possible. All you do is publish in whatever journal you publish in, and deposit the final refereed draft in your OA IR as soon as it is accepted for publication. Or was the deposit in PubMed Central (PMC, not BMC)? Likewise no payment required (but what does deposit in that institution-external repository have to do with U. Huddersfield's IR, or its IR manager?). (3) There is no "Repository game". There is just the research and publication game. (Providing OA maximizes research access, usage and impact, and OA can be provided in two ways. I. 
"Gold OA": by publishing in an OA journal (of which the major ones require payment to publish); or II. "Green OA": by publishing in any journal at all -- whether subscription-based or OA -- and also depositing the final draft in your OA IR: no payment required. The "game" is merely ensuring that all potential users have online access to your published articles, not just those whose institutions can afford to subscribe to the journal in which it happened to be published.) GS: "It struck me as very interesting that to [this colleague], the next stage of the 'game' was to consider switching from green to gold open access - providing someone would pay of course!"The colleague sounds like a researcher who has just deposited an article for the first time in an OA repository (perhaps PMC, though it should have been Huddersfield's IR), and not a researcher who has just paid BMC for gold OA publication (otherwise the colleague would know who was paying!). Something has definitely been garbled here... GS: "This is not the first time that this topic has come up in conversation in the past few weeks. At the recent LIBER conference at Aarhus University in Denmark discussion over dinner turned to open access. One comment from a colleague was that green open access could not be successful in the long run as this was a compromise, and 'compromises never work'."How is providing OA to one's published article by depositing it in one's IR a "compromise"? A compromise of what, with what, for whom? Depositing an article in an IR consists of a few minutes' worth of keystrokes that maximize the access, usage and impact of one's article. 
But perhaps the LIBER discussion was not among (1) researchers, discussing the problem of how to "get their work out there on open access and increase their citations" rather than continue to allow access to it to be restricted only to those researchers whose institutions can afford to pay for subscription access to the journal in which it happens to be published... Perhaps the LIBER discussion was instead among (2) librarians, discussing the problem of how to afford to pay for subscription access? Or perhaps the LIBER discussion was among (3) publishers, discussing the problem of how to guarantee current subscription revenue streams in a growing climate of demand for open access on the part of researchers, their institutions, their funders, and the tax-paying public that funds the research? To repeat: In what sense is green OA self-archiving a "compromise"? A compromise of what, with what, for whom? Is a university repository manager a representative of the immediate interests of the university's researchers (and their institutions, funders, and the tax-paying public that funds the research), or of the interests of publishers and their present and future business models? If librarians are to fulfill the role of repository managers, they need to re-think what they are doing, and why, and what it is that researchers and research need in the OA era. An OA IR is not a buy-in collection of journal subscriptions: It is a give-away provision of access to an institution's published journal articles. An OA IR manager is not a serials librarian, nor someone appointed to direct or second-guess the future course of serials publishing. An OA IR manager is someone appointed to make sure the university's OA IR is filled with its primary target content: the university's published journal article output. "UKCoRR has a vision of the work of repository management as a professionally recognised and supported role within UK research institutions." 
-- What is that "professionally recognised and supported role" if it is not filling their institution's repository with its intended content?
GS: "The road to open access is covered in gold and this is the way forward."
The way forward for whom? And according to whom? And in the interests of what? Researchers can be mandated to provide green OA for their published work. (Without mandates, only about 20% of articles are self-archived.) And funds -- if any are available -- can be provided to pay for gold OA. But publishers cannot be mandated to provide gold OA. And the funds to pay for gold OA cannot be mandated while they are still tied up in paying for subscriptions (and while the asking price for gold OA is designed to preserve publishers' current revenue streams and modus operandi, come what may). The road to green OA is wide open, and traversing it is entirely in the hands of researchers (and their institutions and funders). The road to gold OA is not wide open; it costs money, and it is in the hands of publishers, not researchers. And the potential money to pay for gold OA is currently tied up in institutions' subscription fees, which are being paid to publishers, by institutions' libraries. So how is the road to OA covered with gold, and how is it the way forward? And what has this to do with the research repository manager's "professionally recognised and supported role within UK research institutions"?
GS: "A few days earlier, Kurt de Belder from Leiden University in the Netherlands had laid out his vision of the future, which assumed that open access would be via the gold route and if Repositories existed, they would only contain grey literature."
Kurt de Belder is the director of Leiden University's library (and an advisor to several publishers). Does his golden vision (like the green vision) include a practical means (like the green vision's mandates) of getting us from here to there? 
Or is it all just a golden wish, waiting passively (apart from any spare money being spent on pre-emptive gold OA payments) for publishers to convert to gold and release everyone's subscription money (for incoming journals) to pay their asking price for gold OA (for outgoing articles)? And while the institution's library keeps waiting for this to happen directly, of its own accord, is the access, usage and impact of the institution's research output to continue to be denied to all but subscribing institutions, as it is today, while institutions' IRs (which already exist, by the way) are devoted instead to "grey literature" (whatever that means) instead of to refereed research (green OA)? And meanwhile, visions aside, those who have their eyes wide open cannot help but notice that IRs (which already do exist, remember) do contain green content (20%) rather than just grey content, and that green deposit mandates can and do drive up the percentage green from the baseline 20% to 60%, and approaching 100% within a few years. What's missing, and needed (for those with eyes wide open to see) is more green OA mandates from institutions and funders -- not armchair or dinner-table visions of the future of publishing, evoked in the thrall of pre-emptive gold fever (with no critical reflection on or answerability to practical means and ends). That, perhaps (rather than gold fever), would come closer to a substantive "vision of the work of repository management as a professionally recognised and supported role within UK research institutions." 
GS: "Personally, and not as Chair of UKCoRR (UK Council of Research Repositories), I must admit that I am starting to agree with the gold only route, although I'm not sure I should."
If the Chair of UK's Council of Research Repositories is starting to agree (whether personally or ex officio) with the gold-only route, then perhaps it is time for the Chair to think of resigning, and allowing UKCoRR's direction to be set by those who understand the needs of research and researchers, the power of green OA IRs, and the urgent need for Green OA mandates. Surely there is a "UK Council of Publishing Business Models" that could be joined instead, by those who have become afflicted with gold fever, forgetting about research and researchers' urgent immediate need for OA, and IRs' mission to provide it.
GS: "I have been espousing the virtues of green open access for nearly five years. At Huddersfield we have 26% full text in the Repository despite not yet having a mandate and our full text downloads are really taking off - 46,000 in the last 12 months."
If that 26% is 26% of Huddersfield's current yearly research output, then that deposit rate is somewhat above the global spontaneous (i.e., unmandated) baseline deposit rate of about 20%, but it is a far cry from what the deposit rate would be if Huddersfield were to adopt a mandate. A repository manager espousing the interests of Huddersfield's researchers should be espousing the virtues of green OA mandates to Huddersfield's researchers and administration, not just the virtues of providing green OA spontaneously (although that is, of course, welcome too). Well over five years' consistent experience (and surveys) worldwide have shown that most researchers will not deposit spontaneously but they will deposit (willingly) if deposit is mandated. In the past few years, it is not spontaneous deposit rates that have been picking up, but the rate of adoption of deposit mandates, and the resulting green OA. 
This is not the time for repository managers to succumb to gold fever (which leads next to nowhere, and is not even part of their remit), resigning their IRs to warehousing "grey literature."
GS: "However, for some time I have had my doubts as to whether the championing of green open access was actually taking us down the right road. I could see that gold open access was a good business model."
If we all commit to deposit, we don't need green OA self-archiving mandates. But we don't all commit to deposit, even though it costs nothing. Only about 20% commit unmandated (26% at Huddersfield, perhaps because the IR manager has for five years espoused the virtues of spontaneous deposit so persuasively). But even fewer commit to gold OA, because it costs money, because most of the top journals don't offer it, and because the money to pay for it is still tied up in paying for subscriptions. And there are no mandates to require researchers to pay for gold OA, nor to release the subscription money, nor to dictate publishers' business model or modus operandi, nor to set their asking price. Besides, none of that is within an OA IR manager's remit. It has nothing to do with "the work of repository management as a professionally recognised and supported role within UK research institutions." An OA IR manager is supposed to get his IR filled with OA's target content, and that target content is supposed to be, first and foremost, peer-reviewed journal articles, most of which are today still being published in subscription journals. What needs to be championed by IR managers (and a fortiori, by the Chair of the UK Council of Research Repositories), and championed for their researchers and their institutions, are the virtues of green OA mandates that will fill their IRs -- not the virtues of "good business models," championed for publishers, by librarians. (You don't need to be a "professional and supported" IR manager to go down that road.) 
And those who are indeed committed to championing green OA mandates worldwide are beginning to win them.
GS: "The trouble to me is that the [gold OA] model only really works if we all commit. Otherwise, you end up paying twice, once for the open access article and once for the journal subscription. I just didn't see how we arrived at this brave new world of gold open access journals, no serials budgets and stuff in the cloud."
Yes, that's indeed the size of it: "The [gold OA] model only really works if we all commit. Otherwise, you end up paying twice, once for the open access article and once for the journal subscription." Trying to go directly from the status quo to gold OA is quite simply self-contradictory, like an Escher drawing of an impossible shape: Institutional subscription access tolls are paid per incoming journal; individual OA publication fees are paid per outgoing article. The money to pay for gold OA fees is tied up in subscription tolls. But institutions cannot cancel their journal subscriptions unless the journals' contents are accessible to their users otherwise. Institutions are not necessarily even subscribing annually, for their users, to the same journals in which their researchers are occasionally publishing. Catch 22. (And, as Graham notes, anyone foolish and gullible enough to believe hybrid gold publishers (the ones who charge both subscription tolls + optional gold OA fees) when they say they will reduce subscription tolls proportionately as gold OA fee revenues increase is forgetting that this requires institutions to find the money to pay the gold asking price first, while it is still being spent on the subscriptions! A good "business model" indeed…) (By the way, the somewhat uneven distribution of wealth on the planet can also be fixed "if we all commit." That's not just gold fever, it's the Golden Rule -- but alas far too few in our gene pool are committed to practising it...) 
GS: "But maybe I can see how we get to gold open access now? With researchers taking ownership of the 'game' by realising that gold open access is the only way to ensure access for all and increased citations, maybe we are on the right road after all?"
Researchers "taking ownership of the 'game'"? By "realising that gold OA is the only way"? The self-contradiction on the road to there from here is resolved by "realisation"? By researchers? (The same researchers for whom the only thing they need to do to provide OA is a few keystrokes? And they're not even "committed" enough to do those keystrokes, unless they are first mandated by their institutions or funders?) What does this vision envision that researchers are to do with this newfound golden realisation of theirs? The same thing 34,000 of them did (unsuccessfully) back in 2000? Sign a petition to boycott their journals if they don't go OA? And if researchers were really that committed to "ensuring access for all and increased citations," wouldn't it be simpler than making empty threats against all their publishers just to petition their one and only institution to mandate deposit? Better still, if their realisation about "the only way" were that profound, wouldn't researchers just go ahead and do the keystrokes to deposit of their own accord, unmandated, in order to "ensure access for all, and increased citations"? And would it not be a remarkable coincidence if it turned out that the most pressing thing on researchers' minds was not, in fact, the access and impact of their work (which they can already maximize with a few green keystrokes), but a "good business model" for their publishers and their long-suffering librarians? A remarkable coincidence that what researchers had been yearning for all along turned out (upon "realisation") to be exactly the same thing their librarians had been yearning for -- which was not the filling of their OA IRs but relief from the serials crisis? 
GS: "And maybe, instead of the superfast highway to gold open access that some envisage, are we travelling down the leafy lane of green open access with gold just around the next corner? A bit round the houses, but yes we are certainly getting there."
The super-fast highway to gold OA? Amidst all this "realisation," I don't recall hearing the game plan for solving the problem of the toll booths posted along the ubiquitous subscription highways -- the ones that are currently gobbling up institutions' serial budgets (i.e., the funds that would be used instead to pay for gold OA)... But it is true that green OA, once it becomes universal, may eventually get us to gold OA too -- if universal availability of green eventually causes universal cancellations, forcing journals to cut costs, downsize, and convert to gold OA, thereby releasing the windfall subscription savings to pay the reduced cost of gold OA (peer review alone, with the print and online editions gone, and all access-provision and archiving offloaded onto the worldwide network of OA IRs). But that's not around the next corner, when we're still at 20% green OA. And we are certainly getting ahead of ourselves, if we don't provide the universal green OA first -- for that's what any eventual subscription cancellation windfall is dependent upon. The cancellations can't be done pre-emptively. Certainly not by a single institution, or IR manager -- not even the Chair of the UK Council of Research Repositories. That would require universal institutional subscription cancellations, and all at once (not one institution or country at a time -- otherwise the researchers of that institution or country, instead of gaining open access, lose subscription access altogether). 
My recommendation to OA IR managers who envision "the work of repository management as a professionally recognised and supported role within UK research institutions" would be to focus on their own mandate, which is to fill their own institution's IRs, not to dream about business models that are as good as gold. And the way to get their OA IRs filled is already known: It is by getting their institutions to mandate green OA. (No one connected in any way with OA IRs has a more "professionally recognised and supported role within [their] research institutions" than Southampton's Les Carr and Harvard's Stuart Shieber, the architects of their respective institutions' green OA mandates (Southampton's being the first and Harvard's the most famous).) It's not too late for Huddersfield -- or Nottingham, or the rest of the 17,000 universities that have not yet adopted a mandate. That's all. And that's enough. Mandate green OA for your institution and the rest will take care of itself, in its own time. But meanwhile your institution's researchers will "ensure access for all, and increased citations." That, after all -- not "a good business model" -- is the purpose of OA, and hence the mandate of OA IR managers.
See "Waiting for Gold"
On 2010-07-30, at 2:50 AM, Charles Oppenheim [CO] wrote in JISC-Repositories:
CO: "Mr Stone's (and other repository managers') Job Specifications may say something like "your job is to ensure that articles produced by staff in this University are made OA, whether by means of the Institutional Repository or by any other means deemed appropriate." So, whilst not disagreeing with the argument that the priority should be green repositories, repository managers should not ignore alternative approaches that also produce increased downloads and citations and promote the institution's reputation. 
Even if their job specification is tied to their IR, it would be an unprofessional Repository Manager who was not interested in the pros and cons of alternative methods for achieving OA. Being professional means taking a holistic view of things! I see nothing incompatible therefore between Mr Stone's remarks and being chairman of UKCoRR."
But GS had written:
GS: "I have been espousing the virtues of green open access for nearly five years… However, for some time I have had my doubts as to whether the championing of green open access was actually taking us down the right road… Kurt de Belder... assumed that open access would be via the gold route and if Repositories existed, they would only contain grey literature… I must admit that I am starting to agree with the gold only route…"
And CO has replied:
CO: "...priority should be given to green repositories..."
If the university repository manager's "job is to ensure that articles produced by staff in this University are made OA, whether by means of the Institutional Repository or by any other means deemed appropriate," it is not clear why the job is called "repository manager." (It sounds like something more like "publication advisor" -- and if that advice is to take the gold only route, then it sounds like an anti-repository manager!) Rather than twist simple and obvious job descriptions into complicated ideological knots, might it not be more sensible to look carefully at the concrete, practical reasons why repository managers' "priority should be [filling] green repositories" rather than "the gold only route"? After all, GS himself wrote that the "trouble to me is that the [gold OA] model only really works if we all commit. Otherwise, you end up paying twice." But GS never went on to explain how to surmount this impasse (whereas my posting [above] explains quite explicitly why you could not -- unless universal green OA came first). 
Yet this impasse did not seem to deter Huddersfield's green repository manager and UKCoRR's chairman from announcing that he was "starting to agree with the gold only route" because he "could see that gold open access was a good business model."
CO: "And before Stevan explodes at this posting, let me say (yet again) that I am a strong supporter of the green approach to OA. But I am not blind to the existence, and in some cases success, of alternative OA approaches."
Indisputably there are not one but two ways to provide OA. (We -- CO and 8 other co-authors -- defined the two ways ourselves in a Nature Web Focus six years ago:
Harnad, S., Brody, T., Vallieres, F., Carr, L., Hitchcock, S., Gingras, Y., Oppenheim, C., Stamerjohanns, H., & Hilf, E. (2004) The green and the gold roads to Open Access. Nature Web Focus.)
But from the capability of providing OA to some of the planet's annual 2.5 million refereed journal articles in two different ways, green and gold, it does not follow that each of the ways is capable of scaling up to providing OA to all (or even much or most) of the planet's annual 2.5 million refereed journal articles. This is where the sticky Escherian details (about annual percentage green and gold OA, ongoing subscription needs and commitments, double payment, and especially the power of green mandates) come in. Surely the practical and professional mandate of the newly minted job title "repository manager" is not just a matter of abstract principles but of concrete, practical reality.
Stevan Harnad
American Scientist Open Access Forum
Wednesday, July 21. 2010
Why It Is Not Enough Just To Give Green OA Higher Weight Than Gold OA
MELIBEA's validator assesses OA policies using an algorithm that generates for each policy a one-dimensional measure, "OA%val," based on a number of weighted factors.
In assigning weights to these factors it is not just a matter of whether one puts a greater weight on green than on gold overall. The devil is in the details. Since MELIBEA's "OA%val" is one-dimensional, the exact weights assigned by the algorithm matter very much, for in some crucial combinations the "score" can be deleterious to green (and hence to OA itself) by assigning any non-zero weight at all to gold in an OA policy evaluation. I will use the most problematic case to illustrate: With all the policy components that one can combine in order to give an OA policy a score, consider the relative weighting one is to give to four policy models (upper case meaning that the policy requires green, GR, or funds gold, GO; lower case meaning that it does not):
Policy Model 1 neither requires green nor funds gold (gr/go)
One can agree to weight GR/GO > gr/go
One can also agree (as above) to weight GR/go > gr/GO
One can even agree to weight GR/GO > GR/go (although I do have reservations about this, because of the potential deterrent effects of over-demanding early policy models on the spread of green OA mandates, but I will not bring these reservations into this discussion)
The problematic case concerns whether to assign a greater weight to gr/GO than to gr/go in the MELIBEA score (i.e., whether gr/GO > gr/go, Policy 4 vs. Policy 3). I am strongly opposed to weighting gr/GO > gr/go, because I am convinced that when an institution adopts a premature gold payment policy without first adopting a green requirement policy, this diminishes rather than increases the likelihood of an upgrade to a green requirement. So in that case, despite the fact that a gr/GO policy no doubt generates somewhat more OA than gr/go, this small local increase in OA is not better for the growth of OA overall. Rather, it reinforces the widespread misconception that the way to generate OA is to pay for gold OA (and then wait for others to do the same). Such a policy neglects the much more important need to mandate green OA, cost-free, first. 
It tries to pay for OA even while subscriptions are still paying the full cost of publication, hence still tying down most of the potential funds to pay for gold OA. Giving a gr/GO policy a higher weight than gr/go obscures the fact that paid gold can only cover a small fraction of an institution's output, and at an extra cost, whereas requiring green covers all of it, and at no extra cost. There are ways to remedy this, algorithmically (for example, by giving GO a non-zero weight only when GR also has a non-zero weight). The important point to note, however, is that these algorithmic subtleties are not resolved by simply stating that one assigns a higher weight -- even a much higher weight -- to GR than to GO: Promoting the right priorities in OA policy design requires a much more nuanced approach.
Regarding the question of IR (institutional repository) vs CR (central repository) deposit too, the devil is in the details. Just as one more Gold OA article is indeed one more piece of OA, exactly as one more Green OA deposit is, so too one more CR deposit is indeed one more piece of OA, exactly as one more IR deposit is. But the goal is to weight the algorithm to promote stronger policy models, not just to promote isolated increments in OA. And just as a policy that pays for gold without mandating green is generating only a little more OA at the expense of not generating a lot more OA, so a funder policy that mandates CR deposit instead of IR deposit is generating only a little more OA at the expense of not generating a lot more OA (by reinforcing -- at no cost, and with no loss in OA -- the adoption of a cooperative, convergent IR deposit policy for the rest of each institution's output, funded and unfunded, across all its disciplines, instead of gratuitously competing with institutional OA policies, by adopting a divergent CR deposit policy). The problem is not with publishers' green policies but with institutions' (and funders') lack of green policies! 
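The algorithmic remedy mentioned above (giving GO a non-zero weight only when GR also has a non-zero weight) can be illustrated with a minimal Python sketch. The weights 0.8 and 0.2, the function names and the two-factor policy model are illustrative assumptions only; this is not MELIBEA's actual OA%val algorithm, which combines many more factors.

```python
from itertools import product

# Labels for the four policy models: upper case = green required / gold funded.
LABELS = {
    (False, False): "gr/go",
    (True, False): "GR/go",
    (False, True): "gr/GO",
    (True, True): "GR/GO",
}


def additive_score(requires_green, funds_gold, w_green=0.8, w_gold=0.2):
    """Naive scoring: each factor adds its weight independently.

    Any non-zero gold weight makes gr/GO outrank gr/go, however much
    larger the green weight is.
    """
    return w_green * requires_green + w_gold * funds_gold


def gated_score(requires_green, funds_gold, w_green=0.8, w_gold=0.2):
    """Gated scoring: gold funding counts only if green is also required."""
    gold_term = w_gold * funds_gold if requires_green else 0.0
    return w_green * requires_green + gold_term


def score_table(score_fn):
    """Score all four policy models with the given scoring function."""
    return {
        LABELS[(green, gold)]: score_fn(green, gold)
        for green, gold in product((False, True), repeat=2)
    }
```

Under the additive rule a gold-only policy (gr/GO) always scores above no policy at all (gr/go), whatever the ratio of the two weights. Under the gated rule the two tie, while the ordering GR/GO > GR/go > gr/GO is preserved.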
Over 60% of journals endorse immediate Green OA deposit for the postprint and over 30% more for the preprint (hence over 90% of all articles, overall), yet only 15% of articles are being deposited annually overall, because fewer than 1% of institutions have yet mandated deposit. This is the real gap that needs to be closed -- and can be closed, immediately, by mandating Green OA. And this is what is completely overlooked by institutions and funders hurrying to pay for gold OA instead of first mandating green OA, or funders needlessly mandating CR deposit instead of IR deposit. The fact is that mandates still number far fewer than even 1% of the potential total (about 160, out of a total of perhaps 18,000 universities plus 8,000 research institutions and at least several hundred major funders, funding across multiple institutions, worldwide). The lesson before us is hence most definitely not that mandates are not enough; it is that there are not enough mandates -- far from it. Gold OA payment is a minor matter, providing a small amount of OA, whereas green OA mandates are a major priority, able to scale up to providing 100% OA. Gold is nothing but a distraction -- for either an institution or a funder -- until and unless it first mandates green. Nor is the problem that publishers are only paying lip-service to repository deposit. The problem is that the overwhelming majority of institutions and funders are still only paying lip service to repository deposit -- instead of mandating it. Nor will funders and institutions pre-emptively paying publishers for gold without first mandating green (while subscriptions are still paying for publishing, tying up the potential funds to pay for gold) solve the problem of getting green mandated by institutions and funders. For these reasons it is not enough, in evaluating OA Policy factors, just to give Green OA a higher weight than Gold OA.
Monday, July 19. 2010
Ameliorating MELIBEA's Open Access Policy Evaluator
The MELIBEA Open Access policy validator is timely and promising. It has the potential to become very useful and even influential in shaping OA mandates -- but that makes it all the more important to get it right, rather than releasing MELIBEA prematurely, when it still risks increasing confusion rather than providing clarity and direction in OA policy-making. Remedios Melero is right to point out that -- unlike the CSIC Cybermetrics Lab's University Rankings and Repository Rankings -- the MELIBEA policy validator is not really meant to be a ranking. Yet MELIBEA has set up its composite algorithm and its graphics to make it a ranking just the same. It is further pointed out, correctly, that MELIBEA's policy criteria for institutions and funders are not (and should not be) the same. Yet, with the coding as well as the algorithm, they are treated the same way (and funder policy is taken to be the generic template, institutional policy merely an ill-fitting special case). It is also pointed out, rightly, that a gold OA publishing policy is not central to institutional OA policy making -- yet there it is, contributing sizeable components to the MELIBEA algorithm. It is also pointed out that MELIBEA's green color code has nothing to do with the "green OA" coding -- yet there it is -- red, green, yellow -- competing with the widespread use of green to designate OA self-archiving, and thereby inducing confusion, both overt and covert. MELIBEA could be a useful and natural complement to the ROARMAP registry of OA policies. I (and no doubt other OA advocates) would be more than happy to give MELIBEA feedback on every aspect of its design and rationale. 
But as it is designed now, I can only agree with Steve Hitchcock's points and conclude that consulting MELIBEA today would be likely to create and compound confusion rather than helping to bring the all-important focus and direction to OA policy-making that I am sure CSIC, too, seeks, and seeks to help realize. Here are just a few prima facie points: (1) Since MELIBEA is not, and should not be construed as, a ranking of OA policies -- especially because it includes both institutional and funder policies -- it is important not to plug it into an algorithm until and unless the algorithm has first been carefully tested, with consultation, to make sure it weights policy criteria in a way that optimizes OA progress and guides policy-makers in the right direction. I hope there will be substantive consultation and conscientious redesign of these and other aspects of MELIBEA before it can be recommended for serious consideration and use.
Stevan Harnad
American Scientist Open Access Forum
Saturday, July 17. 2010
Funders Should Mandate Institutional Deposit (and, if desired, central harvest)
SUMMARY: The most effective and natural way to ensure that all institutions -- the universal providers of all research, funded and unfunded, in all fields -- provide open access (OA) to all of their peer-reviewed research (funded and unfunded, in all fields) is for both funders and institutions to mandate cooperative, convergent deposit, by the author, in the author's own institutional repository, rather than competitive, divergent institutional-and/or-institution-external deposit by authors-and/or-publishers.
1. It is important for OA advocates to understand that it is not PubMed Central (PMC) that is making biomedical articles open access (OA) -- it is the depositors of those articles. In the case of PMC, those depositors are authors (and publishers). PMC is serving both as a locus of deposit (i.e., a central, subject-based repository) and as a locus of search and use (like google).
Stevan Harnad
American Scientist Open Access Forum
Thursday, July 8. 2010
On Comparing Institutional Apples With Multi-Institutional Fruit: The Denominator Fallacy Again
Chris Armbruster [CA] wrote in the American Scientist Open Access Forum:
CA: "'Institution' is indeed not a very precise concept, but the repository ranking will not be improved if one were to spend much time trying to decide which repository is institutional and which is not"
If there is any rationale for separately ranking and comparing -- as the Ranking Web of World Repositories (RWWR) does -- both the top 800 repositories and the top 800 institutional repositories (and there is indeed an important rationale for doing so), then that rationale is that the institutions are indeed institutional and not multi-institutional. The purpose is to rank their relative size (and hence their success in capturing their target content), and there is no point in comparing the size of the category "apple" with the size of the category "fruit." This is the "denominator fallacy."
The pros and cons of Chris Armbruster's advocacy of central (multi-institutional) repositories over institutional repositories have already been multiply discussed over the years in this Forum and elsewhere.
The argument for institutional repositories is that (1) institutions are the providers of all of OA's target content, (2) they have a stake in managing their own output, and (most important of all) (3) they are in a position to mandate the deposit of their own output.
The argument for multi-institutional (central) repositories is that they look (superficially) as if they were bigger, hence more "successful" in attracting OA's target content. (Hence Chris's preference for keeping the two kinds of repositories and their sizes conflated in the RWWR rankings.) They also look (superficially) more manageable and sustainable.
The argument against multi-institutional (central) repositories is (a) that multi-institutional entities (notably, funders) cannot mandate the deposit of all institutional research output (because not all research is funded), (b) that central deposit mandates compete with instead of reinforcing institutional mandates (eliciting resistance from authors facing the prospect of having to do double-deposits), and (most relevantly here) (c) that the size and success of a repository can only be evaluated and compared in relation to the size of that repository's total target output. And although there are differences among institutions in the size of their own total output (which can and should be weighted to normalize it and make it comparable), the difference in size between institutions and multi-institutions is the difference in size between the number of apples and the number of fruit. (The denominator fallacy.)
Multi-institutional (central) repositories' content would have to be weighted by the output of all their actual and potential target institutions and the total target content of each, in order to make multi-institutional rankings comparable to those of individual institutions. RWWR is not doing that kind of weighting -- nor would it be easy to determine those weightings for each kind of multi-institutional repository, though it may eventually be possible to estimate in principle. If it were done, however, there would hardly be any need for two rankings (for repositories vs. institutional repositories).
What would be clear from a proper denominator-weighted ranking of institutional and multi-institutional repositories is that, contrary to what Chris has argued, it is not at all true that the multi-institutional repositories are bigger or more successful in collecting their respective total target contents.
Rather, it makes much more sense for both institutions and funders to mandate that researchers deposit in their own institutional repository -- from which multi-institutional collections could then be automatically harvested. (It would then be redundant to try to compare their relative success, as one would clearly be a derivative of the other.) For management and sustainability, local institutional deposit and central harvesting is the complementary -- and optimal -- solution. But first the primary content-provision problem has to be solved, otherwise there is next to nothing to manage and sustain!
CA: "how about also deleting No 10 because it is only a departmental repository?"
A departmental repository, in contrast, is sub-institutional rather than multi-institutional. Hence, unless there is to be a separate RWWR ranking of the top 800 departmental mandates, there is no harm in listing the departmental repositories among the institutional repositories -- except if the university has both an institutional and a departmental repository, and the contents of the departmental repository are also a proper subset of the contents of the institutional repository, hence double-counted.
This is not the case in the instance of ["institutional"] repository #10, University of Southampton School of Electronics and Computer Science, whose contents are not part of institutional repository #27, University of Southampton. Rather than resulting in an inflated ranking for Southampton, this actually results in a lower ranking. The joint RWWR ranking of the integrated institutional repository would be higher for Southampton.
(That said, with a properly weighted denominator, separately tagged departmental repositories would be useful at this time, to compare the relative success of institution-wide mandates vs. departmental/school/faculty mandates -- i.e., Arthur Sale's "patchwork mandate" strategy.)
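The denominator weighting argued for above can be illustrated with a minimal sketch. It assumes each repository is described by just two numbers: how many target articles it holds and how many target articles its institution(s) actually produce. The class name, field names and all figures are hypothetical, chosen only to show how a ranking by raw size and a ranking by capture rate can disagree; this is not RWWR's methodology.

```python
from dataclasses import dataclass


@dataclass
class Repository:
    name: str
    deposited: int      # target articles actually deposited (hypothetical figure)
    target_output: int  # total target articles produced by the institution(s) served

    @property
    def capture_rate(self) -> float:
        """Fraction of the repository's own target content that it holds."""
        if self.target_output == 0:
            return 0.0
        return self.deposited / self.target_output


def rank_by_size(repos):
    """Rank by absolute size: the comparison that commits the denominator fallacy."""
    return sorted(repos, key=lambda r: r.deposited, reverse=True)


def rank_by_capture(repos):
    """Rank by denominator-weighted success in capturing target content."""
    return sorted(repos, key=lambda r: r.capture_rate, reverse=True)


# Invented numbers, for illustration only.
repos = [
    Repository("central (multi-institutional)", deposited=50_000, target_output=1_000_000),
    Repository("institutional A", deposited=6_000, target_output=10_000),
    Repository("institutional B", deposited=1_500, target_output=15_000),
]

for r in rank_by_capture(repos):
    print(f"{r.name}: {r.deposited} deposited, capture rate {r.capture_rate:.0%}")
```

With these invented figures the central repository is the largest in absolute terms but the least successful relative to its own target content, which is the point of comparing apples with apples.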
CA: "Also, it is a bad idea to define repositories as institutional only if they restrict themselves to the output of a single institution. We already have too many repository managers who succumb to this kind of institutionalist logic - and reject OA content only because it is not from their own institution."
If only the problem were that of an overflowing cup, with so much OA target content that it needs to be rejected! Chris has the OA content problem completely upside-down! The problem is that not enough of each institution's own OA target content is being deposited, anywhere -- not that institutions are declining to host the output of other institutions. (It is only Chris's central-repository preoccupation that makes him imagine that the latter is the problem.) What's missing is not repositories to deposit in, but mandates to deposit.
The solution is for institutions and funders to mandate institutional deposit of all content, funded and unfunded, across all disciplines -- and then, if desired, to harvest that content into various central collections, by discipline, funder, language or nation, as desired. Institutions are the universal providers of all that content; they are also the natural locus for deposit mandates.
CA: "The CSIC has a sound methodology for ranking repositories, and it [is] not their job to define exclusively what is an IR and what not. And in cyberspace it is much more interesting to compare repositories according to domains and services they offer…"
I take it that by the CSIC Chris means the RWWR. And as far as I can tell, the only reason Chris finds the methodology sound is that it conflates institutional and multi-institutional repositories, which favors Chris's preference for multi-institutional repositories.
What is much more interesting and important in cyberspace than the locus of the distributed content is the presence of the content. Most (80%) of OA's target content is still missing from anywhere on the (free) web, and long overdue.
Locus matters strategically for the concrete, practical goal of capturing that target content (and making it OA). Chris keeps systematically missing this point. If the content were all there already, none of this would matter in the slightest.
(And a good intuition pump to bear in mind is that the key to the success of Google and the like was not to try to get everyone to deposit their content directly in Google: What happened, and worked, was distributed, local deposit and hosting, followed by central harvesting. Not a bad principle to generalize to OA...)
CA: "Moreover, it would help if we could move beyond the often narrow understanding of what an institutional repository is and what not & acknowledge more clearly that a strategy of privileging institutional repositories as such has not helped."
Chris does not seem to have noticed the growing institutional/departmental repository mandate movement (initiated in 2002 by Southampton ECS, but greatly accelerated since the 16th mandate in 2008 by Harvard FAS, and now running well over 100 institutional/departmental mandates, including UCL, MIT and Stanford, as well as over 40 funder mandates). It is not (and never has been) a matter of merely "privileging" institutional deposit, but mandating it.
CA: "The value & sustainability of IRs (individually, as isolated instances, & if not embedded in a national system) is rather limited for both scholarship and open access."
(1) Repository value is nil without content.
(2) With content, locus is irrelevant, as search is not local but global, via central harvesters.
(3) Sustainability is a red herring (especially with today's sparse OA content); institutional deposit loci and central harvesters are complementary, insofar as preservation is concerned.
(4) Nations can and should mandate OA deposit. Nations can and should harvest OA deposits centrally.
But there is no earthly need (or prospect) of nations directly hosting all their institutional OA output centrally, any more than there is any earthly need for nations to host all their institutions centrally.
(5) If Chris is worried about limitations on OA scholarship, he should set his mind to thinking of how to induce the OA target content providers (institutional researchers) to deposit their content, to make it OA.
(6) IRs will take care of themselves.
CA: "Hence, it is very welcome that more determined efforts are underway at building viable networks of research repositories and integrate IRs in national systems (e.g. Ireland as latest instance)."
All true, but a non sequitur, insofar as the fundamental problem of filling those repositories with their target contents is concerned.
CA: "For a sustained argument, please see": Armbruster & Romary (2010) Comparing Repository Types: Challenges and Barriers for Subject-Based Repositories, Research Repositories, National Repository Systems and Institutional Repositories in Serving Scholarly Communication (accepted for publication in IJDLS)
For a sustained critique and response, see:
I have quickly skimmed (but not read verbatim) the new A & R paper, and I see that all of my prior objections (to A & R's earlier paper) remain unanswered, indeed not even noted.
Conflating OA Repository-Content, Deposit-Locus, and Central-Service Issues
(1) The 4-way classification system -- subject, nation, "research" and institution -- continues to be arbitrary and rather incoherent.
(2) The three far more important and salient distinctions -- direct deposit repositories vs harvested collections, OA target content vs other kinds of content, mandated repositories vs.
unmandated repositories -- are not treated (or not treated in enough depth to understand their salience).
(3) The all-important question of how best to capture OA's target content -- the most central question, before we even talk about repository types, services or sustainability -- is not given any serious consideration.
(4) The very specific question of locus of deposit, and its specific importance for deposit mandates (and hence for capturing the target content) is likewise not given any serious consideration.
(5) The "denominator fallacy" continues to pervade throughout, in the continued reference to absolute repository size, without taking into account the size or proportion of the repository's target contents that the repository is actually capturing. (For an institutional repository, the denominator is its total refereed journal article output; for HAL -- which A & R stunningly misclassify as the most successful of all repositories! -- it is the totality of France's refereed journal article output.)
In short, A & R's approach -- which takes so much of the current sparse and inchoate landscape for granted, and follows after it, instead of facing the real problem, which is to remedy that sparseness, and lead the way toward capturing the vast proportion of OA's target content (at least 80% of it) that is still not being captured (by any repository) -- is not, I believe, a realistic or productive one.
The reality is that most repositories -- of all the kinds A & R consider and don't consider -- are near-empty of their target content. Consequently, search, services and sustainability are not the problem: Content is.
Mandates generate the content, but A & R's treatment imagines that mandates, and their promise, amount mostly to funder mandates (and funder -- i.e., "research" -- repositories).
This is (in my view) an enormous error: Not all scholarly and scientific research (perhaps not even the majority of it) is funded, but virtually all of it comes from institutions -- universities and research institutes. In and of itself, that is strong reason to give institutional repositories and institutional mandates far more serious thought than A & R give them.
Another reason is that once institutional deposit is mandated and OA contents are being systematically deposited in their institutional repositories, they can be harvested to any other collections we may desire -- subject-based, national, "research" or what-have-you.
Nor are the various search and other services that are built atop this OA content meant to be provided at the institutional level (where A & R note their absence as if it were a defect): services are a harvester-level function, whereas content-provision is an institution-level function.
A & R's article is also missing the point of depositing the author's rather than the publisher's version (the author's version has far fewer restrictions and can be provided much earlier); nor does it take into account the power of institutional repositories to provide immediate "Almost OA" even in the case of publisher-embargoed content, via the semi-automatic "eprint request" button.
A & R also make some incorrect assumptions about the difficulty and effort of deposit and the need for library assistance and proxy deposit.
Stevan Harnad
American Scientist Open Access Forum
Ranking Institutional Repositories
Isidro Aguillo wrote in the American Scientist Open Access Forum:
IA: "I disagree with [Hélène Bosc's] proposal [to eliminate from the top 800 institutional repository rankings the multi-institution repositories and the repositories that contain the contents of other repositories as subsets]. We are not measuring only [repository] contents but [repository] contents AND visibility [o]n the web."
Yes, you are measuring both contents and visibility, but presumably you want the difference between (1) the ranking of the top 800 repositories and (2) the ranking of the top 800 institutional repositories to be based on the fact that the latter are institutional repositories whereas the former are all repositories (central, i.e., multi-institutional, as well as institutional).
Moreover, if you list redundant repositories (some being the proper subsets of others) in the very same ranking, it seems to me the meaning of the RWWR rankings becomes rather vague.
IA: "Certainly HyperHAL covers the contents of all its participants, but the impact of these contents depends o[n] other factors. Probably researchers prefer to link to the paper in INRIA because of the prestige of this institution, the affiliation of the author or the marketing of their institutional repository."
All true. But perhaps the significance and usefulness of the RWWR rankings would be greater if you either changed the weight of the factors (volume of full-text content, number of links) or, alternatively, you designed the rankings so the user could select and weight the criteria on which the rankings are displayed.
Otherwise your weightings become like the "h-index" -- an a-priori combination of untested, unvalidated weights that many users may not be satisfied with, or fully informed by...
IA: "But here is a more important aspect: If I were the president of INRIA I [would] prefer people using my institutional repository instead [of] CCSD.
No problem with the [CCSD], they are [doing] a great job and increasing the reach of INRIA, but the papers deposited are a very important (the most important?) asset of INRIA."
But how heavily INRIA papers are linked, downloaded and cited is not necessarily (or even probably) a function of their direct locus! What is important for INRIA (and all institutions) is that as much as possible of their paper output should be OA, simpliciter, so that it can be linked, downloaded, read, applied, used and cited. It is entirely secondary, for INRIA (and all institutions), where their papers are made OA, compared to the necessary condition that they are made OA (and hence freely accessible, useable, harvestable).
Hence (in my view) by far the most important ranking factor for institutional repositories is how much of their (annual) full-text institutional paper output is indeed deposited and made OA. INRIA would have no reason to be disappointed if the locus from which its content was being searched, retrieved and linked happened to be some other, multi-institutional harvester. INRIA still gets the credit and benefits from all those links, downloads and citations of INRIA content!
(Having said that, locus of deposit does matter, very much, for deposit mandates. Deposit mandates are necessary in order to generate OA content. And -- for strategic reasons that are elaborated in my own reply to Chris Armbruster -- it makes a big practical difference for success in reaching agreement on adopting a mandate in the first place that both institutional and funder mandates should require convergent institutional deposit, rather than divergent and competing institutional vs. institution-external deposit.)
Here too, your RWWR repository rankings would be much more helpful and informative if they gave a greater weight to the relative size of each institutional repository's content and eliminated multi-institutional repositories from the institutional repository rankings -- or at least allowed institutional repositories to be ranked independently on content vs links.
I think you are perhaps being misled here by the analogy with your sister rankings of world universities rather than just their repositories. In university rankings, the links to the university site itself matter a lot. But in repository rankings, links matter much less than how much institutional content is freely accessible at all. For the degree of usage and impact of that content, harvester sites may be more relevant measures, and, after all, downloads and citations, unlike links, carry their credits (to the authors and institutions) with them no matter where the transaction happens to occur...
IA: "Regarding the other comments we are going to correct those with mistakes but it is very difficult for us to realize that Virginia Tech University is 'faking' its institutional repository with contents authored by external scholars."
I have called Gail McMillan at Virginia Tech to inquire about this, and she has explained it to me. The question was never whether Virginia Tech was "faking"! They simply host content over and above Virginia Tech content -- for example, OA journals whose content originates from other institutions.
As such, the Virginia Tech repository, besides providing access to Virginia Tech's own content, like other institutional repositories, is also a conduit or portal for accessing the content of other institutions (e.g., those providing the articles in the OA journals Virginia Tech hosts). The "credit" for providing that conduit goes to Virginia Tech, of course. But the credit for the links, usage and citations goes to those other institutions!
When an institutional repository is also used as a portal for other institutions, its function becomes a hybrid one -- both an aggregator and a provider. I think it's far more useful and important to try to keep those functions separate, in both the rankings and the weightings of institutional repositories.
Stevan Harnad
American Scientist Open Access Forum
Saturday, July 3. 2010
Google Scholar Boolean Search on Citing Articles
In the world of journal articles, each article is both a "citing" item and a "cited" item. The list of references a given article cites provides that article's outgoing citations. And all the other articles in whose reference lists that article is cited provide that article's incoming citations.
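The citing/cited relation just described can be sketched as a small data structure. The article IDs and reference lists here are invented for illustration: each article's reference list gives its outgoing citations, and inverting those lists yields every article's incoming citations, which can then be counted to rank articles by how highly cited they are.

```python
from collections import defaultdict

# Hypothetical reference lists: each article ID maps to the IDs it cites
# (that article's outgoing citations).
references = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": [],
    "D": ["C", "A"],
}


def incoming_citations(refs):
    """Invert the reference lists: for each cited article, collect its citers."""
    incoming = defaultdict(set)
    for citing, cited_list in refs.items():
        for cited in cited_list:
            incoming[cited].add(citing)
    return dict(incoming)


incoming = incoming_citations(references)

# Rank all articles by number of incoming citations, most cited first.
most_cited = sorted(
    references,
    key=lambda article: len(incoming.get(article, set())),
    reverse=True,
)

for article in most_cited:
    citers = sorted(incoming.get(article, set()))
    print(f"{article}: cited by {citers}")
```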
Formerly, with Google Scholar (first launched in November 2004) (1) you could do a google-like boolean (and, or, not, etc.) word search, which ranked the articles that it retrieved by how highly cited they were. Then, for any individual article in that ranked list of retrieved articles, (2) you could go on to retrieve all the articles citing that individual cited article, again ranked by how highly cited they were. But you could not go on to do a boolean word search within just that set of citing articles; as of July 1 you can. (Thanks to Joseph Esposito for pointing this out on liblicense.)
Of course, Google Scholar is a potential scientometric killer-app that is just waiting to design and display powers far, far greater and richer than even these. Only two things are holding it back: (a) the sparse Open Access content of the web to date (only about 20% of articles published annually) and (b) the sleepiness of Google, in not yet realizing what a potentially rich scientometric resource and tool they have in their hands (or, rather, their harvested full-text archives).
Citebase gives a foretaste of some more of the latent power of an Open Access impact and influence engine (so does CiteSeerX), but even that is pale in comparison with what is still to come -- if only Green OA self-archiving mandates by the world's universities, the providers of all the missing content, hurry up and get adopted so that they can be implemented, and then all the target content for these impending marvels (not just 20% of it) can begin being reliably provided at long last.
(Elsevier's SCOPUS and Thomson-Reuters' Web of Knowledge are of course likewise standing by, ready to upgrade their services so as to point also to the OA versions of the content they index -- if only we hurry up and make it OA!)
Harnad, S. (2001) Proposed collaboration: google + open citation linking. OAI-General. June 2001.
The American Scientist Open Access Forum has been chronicling and often directing the course of progress in providing Open Access to Universities' Peer-Reviewed Research Articles since its inception in the US in 1998 by the American Scientist, published by the Sigma Xi Society. The Forum is largely for policy-makers at universities, research institutions and research funding agencies worldwide who are interested in institutional Open Access Provision policy. (It is not a general discussion group for serials, pricing or publishing issues: it is specifically focussed on institutional Open Access policy.)