Saturday, May 26. 2007

Craig et al.'s Review of Studies on the OA Citation Advantage
I've read Craig et al.'s critical review concerning the OA citation impact effect and will shortly write a short, mild review. But first here is Sally Morris's posting announcing Craig et al.'s review, on behalf of the Publishing Research Consortium (which "proposed" the review), followed by a commentary from Bruce Royan on diglib, a few remarks from me, then commentary by JWT Smith on jisc-repositories, followed by my response, and, last, a commentary by Bernd-Christoph Kaemper on SOAF, followed by my response.

Sally Morris (Publishing Research Consortium):

Craig, Ian; Andrew Plume, Marie McVeigh, James Pringle & Mayur Amin (2007) Do Open Access Articles Have Greater Citation Impact? A critical review of the literature. Journal of Informetrics.

A new, comprehensive review of recent bibliometric literature finds decreasing evidence for an effect of 'Open Access' on article citation rates. The review, now accepted for publication in the Journal of Informetrics, was proposed by the Publishing Research Consortium (PRC) and is available at its web site at www.publishingresearch.net. It traces the development of this issue from Steve Lawrence's original study in Nature in 2001 to the most recent work of Henk Moed and others.

It is notoriously tricky (at least since David Hume) to "prove" causality empirically. The thrust of the Craig et al. critique is that despite the fact that virtually all studies comparing the citation counts for OA and non-OA articles keep finding the OA citation counts to be higher, it has not been proven beyond a reasonable doubt that the relationship is causal.

Bruce Royan wrote on diglib:

I agree: It is merely highly probable, not proven beyond a reasonable doubt, that articles are more cited because they are OA, rather than OA merely because they are more cited (or both OA and more cited merely because of a third factor). And I also agree that not one of the studies done so far is without some methodological flaw that could be corrected.
But it is also highly probable that the results of the methodologically flawless versions of all those studies will be much the same as the results of the current studies. That's what happens when you have a robust major effect, detected by virtually every study, and only ad hoc methodological cavils and special pleading to rebut each of them with.

But I am sure those methodological flaws will not be corrected by these authors, because -- OJ Simpson's "Dream Team" of Defense Attorneys comes to mind -- Craig et al.'s only interest is evidently in finding flaws and alternative explanations, not in finding out the truth -- if it goes against their client's interests...

Iain D. Craig: Wiley-Blackwell

Here is a preview of my rebuttal. It is mostly just common sense, if one has no conflict of interest, hence no reason for special pleading and strained interpretations:

(1) Research quality is a necessary, but not a sufficient condition for citation impact: The research must also be accessible to be cited.
(2) Research accessibility is a necessary but not a sufficient condition for citation impact: The research must also be of sufficient quality to be cited.
(3) The OA impact effect is the finding that an article's citation counts are positively correlated with the probability that that article has been made OA: The more an article's citations, the more likely that that article has been made OA.
(4) This correlation has at least three causal interpretations that are not mutually exclusive:
(4a) OA articles are more likely to be cited.
(5) Each of these causal interpretations is probably correct, and hence a contributor to the OA impact effect:
(5a) The better the article, the more likely it is to be cited, hence the more citations it gains if it is made more accessible (4a). (OA Article Quality Advantage, QA)
(6) In addition to QB and QA, there is an OA Early Access effect (EA): providing access earlier increases citations.
(7) The OA citation studies have not yet isolated and estimated the relative sizes of each of these (and other) contributing components. (OA also gives a Download Advantage (DA), and downloads are correlated with later citations; OA articles also have a Competitive Advantage (CA), but CA will vanish -- along with QB -- when all articles are OA).
(8) But the handwriting is on the wall as to the benefits of making articles OA, for those with eyes to see, and no conflicting interests to blind them.

I do agree completely, however, with erstwhile (Princetonian and) Royal Society President Bob May's slightly belated call for "an evidence-based approach to the scholarly communications debate."

John Smith (JS) wrote in jisc-repositories:

I wonder if we can come at this discussion concerning the impact of OA on citation counts from another angle? Assuming we have a traditional academic article of interest to only a few specialists there is a simple upper bound to the number of citations it will have no matter how accessible it is.

That is certainly true. It is also true that 10% of articles receive 90% of the citations. OA will not change that ratio, it will simply allow the usage and citations of those articles that were not used and cited because they could not be accessed to rise to what they would have been if they could have been used and cited.

JS: Also, the majority of specialist academics work in educational institutions where they have access to a wide range of paid for sources for their subject.

OA is not for those articles and those users that already have paid access; it is for those that do not. No institution can afford paid access to all or most of the 2.5 million articles published yearly in the world's 24,000 peer-reviewed journals, and most institutions can only afford access to a small fraction of them. OA is hence for that large fraction (the complement of the small fraction) of those articles that most users and most institutions cannot access.
The 10% of that fraction that merit 90% of the citations today will benefit from OA the most, and in proportion to their merit. That increase in citations also corresponds to an increase in scholarly and scientific productivity and progress for everyone.

JS: Therefore any additional citations must mainly come from academics in smaller institutions that do not provide access to all relevant titles for their subject and/or institutions in the poorer countries of the world.

It is correct that the additional citations will come from academics at the institutions that cannot afford paid access to the journals in which the cited articles appeared. It might be the case that the access denial is concentrated in the smaller institutions and the poorer countries, but no one knows to what extent that is true, and one can also ask whether it is relevant. For the OA problem is not just an access problem but an impact problem. And the research output of even the richest institutions is losing a large fraction of its potential research impact because it is inaccessible to the fraction to whom it is inaccessible, whether or not that missing fraction is mainly from the smaller, poorer institutions.

JS: Should it not be possible therefore to examine the citers to these OA articles where increased citation is claimed and show they include academics in smaller institutions or from poorer parts of the world?

Yes, it is possible, and it would be a good idea to test the demography of access denial and OA impact gain. But, again, one wonders: Why would one assign this question of demographic detail a high priority at this time, when the access and impact loss have already been shown to be highly probable, when the remedy (mandated OA self-archiving) is at hand and already overdue, and when most of the skepticism about the details of the OA impact advantage comes from those who have a vested interest in delaying or deterring OA self-archiving mandates from being adopted?
(It is also true that a portion of the OA impact advantage is a competitive advantage that will disappear once all articles are OA. Again, one is inclined to reply: So what?) This is not just an academic exercise but a call to action to remedy a remediable practical problem afflicting research and researchers.

JS: However, even if this were done and positive results found there is still another possible explanation. Items published in both paid for and free form are indexed in additional indexing services including free services like OAIster and CiteSeer. So it may be that it is not the availability per se that increases citation but the findability? Those who would have had access anyway have an improved chance of finding the article. Do we have proof that the additional citers accessed the OA version (assuming there is both an OA and paid for version)?

Increased visibility and improved searching are always welcome, but that is not the OA problem. OAIster's usefulness is limited by the fact that it only contains the c. 15% of the literature that is being self-archived spontaneously (i.e., unmandated) today. CiteSeer is a better niche search engine because computer scientists self-archive a much higher proportion of their research. But the obvious benchmark today is Google Scholar, which is increasingly covering all cited articles, whether OA or non-OA. It is in vain that Google Scholar enhances the visibility of non-OA articles for those would-be users to whom they are not accessible. Those users could already have accessed the metadata of those articles from online indices such as Web of Science or PubMed, only to reach a toll-access barrier when it came to accessing the inaccessible full-text corresponding to the visible metadata.

JS: It is possible that my queries above have already been answered. If so a reference to the work will suffice as a response.

Accessibility is a necessary (but not a sufficient) condition for usage and impact.
There is no risk that maximising accessibility will fail to maximise usage and impact. The only barrier between us and 100% OA is a few keystrokes. It is appalling that we continue to dither about this; it is analogous to dithering about putting on (or requiring) seat-belts until we have made sure that the beneficiaries are not just the small and the poor, and that seat-belts do not simply make drivers more safety-conscious.

JS: Even if the apparent citation advantage of OA turns out to be false it does not weaken the real advantages of OA. We should not be drawn into a time and effort wasting defence of it while there is other work to be done to promote OA.

The real advantage of Open Access is Access. The advantage of Access is Usage and Impact (of which citations are one indicator). The Craig et al. study has not shown that the OA Impact Advantage is not real. It has simply pointed out that correlation does not entail causation. Duly noted. I agree that no time or effort should be spent now trying to demonstrate causation. The time and effort should be used to provide OA.

Bernd-Christoph Kaemper (B-CK) wrote on SOAF:

I couldn't quite follow the logic of this posting. It seemed to be saying that, yes, there is evidence that OA increases impact, it is even trivially obvious, but, no, we cannot estimate how much, because there are possible confounding factors and the size of the increase varies.

All studies have found that the size of the OA impact differential varies from field to field, journal to journal, and year to year. The range of variation is from +25% to over +250%. But the differential is always positive, and mostly quite sizeable. That is why I chose a conservative overall estimate of +50% for the potential gain in impact if it were not just the current 15% of research that was being made OA, but also the remaining 85%. (If you think 50% is not conservative enough, use the lower-bound 25%: You'll still find a substantial potential impact gain/loss.
If you think self-selection accounts for half the gain, split it in half again: there's still plenty of gain, once you multiply by 85% of total citations.)

An interesting question that has since arisen (and could be answered by similar studies) is this: It is a logical possibility that all or most of the top 10% are already among the 15% that are being made OA: I rather doubt it; but it would be worth checking whether it is so.

[Attention lobbyists against OA mandates! Get out your scissors here and prepare to snip an out-of-context quote...]

Since it is known that (in science) the top 10% of articles published receive 90% of the total citations made (Seglen 1992), to what extent is the top 10% of articles published over-represented among the c. 15% of articles that are being spontaneously made OA by their authors today?

[snip]

The empirical studies of the relation between OA and impact have been mostly motivated by the objective of accelerating the growth of OA -- and thereby the growth of research usage and impact. Those who are persuaded that the OA impact differential is merely or largely a non-causal self-selection bias are encouraged to demonstrate that that is the case.

Note very carefully, though, that the observed correlation between OA and citations takes the form of a correlation between the number of OA articles, relative to non-OA articles, at each citation level. The more highly cited an article, the more likely it is OA. This is true within journals, and within and across years, in every field tested. And this correlation can arise because more-cited articles are more likely to be made OA or because articles that are made OA are more likely to be cited (or both -- which is what I think is in reality the case).
It is certainly not the case that self-selection is the default or null hypothesis, and that those who interpret the effect as OA causing the citation increase hence have the burden of proof: The situation is completely symmetric numerically; so your choice between the two hypotheses is not based on the numbers, but on other considerations, such as prima facie plausibility -- or financial interest. Until and unless it is shown empirically that today's OA 15% already contains all or most of the top-cited 10% (and hence 90% of what researchers cite), I think it is a much more plausible interpretation of the existing findings that OA is a cause of the increased usage and citations, rather than just a side-effect of them, and hence that there is usage and impact to be gained by providing and mandating OA. (I can quite understand why those who have a financial interest in its being otherwise [Craig et al. 2007] might prefer the other interpretation, but clearly prima facie plausibility cannot be their justification.)

I also think that 50% of total citations is a plausible overall estimate of the potential gain from OA, as long as it is understood clearly that the 50% gain does not apply to every article made OA. Many articles are not found useful enough to cite no matter how accessible you make them. The 50% citation gain will mostly accrue to the top 10% of articles, as citations always do (though OA will no doubt also help to remedy some inequities and will sometimes help some neglected gems to be discovered and used more widely). In other words, the OA advantage to an article will be roughly proportional to that article's intrinsic citation value (independent of OA).

Other interesting questions: The top-cited articles are not evenly distributed among journals. The top journals tend to get the top-cited articles.
It is also unlikely that journal subscriptions are evenly distributed among journals: The top journals are likely to be subscribed to more, and are hence more accessible. So if someone is truly interested in these questions (as I am not!), they might calculate a "toll-accessibility index" (TAI) for each article, based on the number of researchers/institutions that have toll access to the journal in which that article is published. An analysis of covariance can then be done to see whether and how much the OA citation advantage is reduced if one controls for the article's TAI. (I suspect the answer will be: somewhat, but not much.)

B-CK: Could we do a thought experiment? From a representative group of authors, choose a sample of authors randomly and induce them to make their next article open access. Do you believe they will see as much gain in citations compared to their previous average citation levels as predicted from the various current "OA advantage" studies where several confounding factors are operating? Probably not - but what would remain of that advantage? -- I find that difficult to predict or model.

From a random sample, I would expect an increase of around 50% or more in total citations, 90% of the increased citations going to the top 10%, as always.

B-CK: As I learned from your posting, you seem to predict that it will anyway depend on the previous citedness of the members of that group (if we take that as a proxy for the unknown actual intrinsic citation value of those articles), in the sense that more-cited authors will see a larger percentage increase effect.

I don't think it's just a Matthew Effect; I think the highest quality papers get the most citations (90%), and the highest quality papers are apparently about 10% (in science, according to Seglen).
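The back-of-envelope estimate used in this exchange (a +25% to +50% OA citation differential, applied to the roughly 85% of research not yet OA, optionally halved for self-selection) can be sketched in a few lines of Python. This is only an illustration under those stated assumptions; the function name and its parameters are hypothetical and are not taken from any of the cited studies, and the model is deliberately a simplification that treats the differential as applying uniformly to the non-OA share of total citations.

```python
def potential_citation_gain(oa_differential, oa_share=0.15, self_selection_share=0.0):
    """Estimated fractional gain in total citations if the non-OA remainder became OA.

    oa_differential: assumed OA citation advantage (0.50 means +50%).
    oa_share: fraction of articles already OA (about 15% in the text).
    self_selection_share: fraction of the differential assumed to be non-causal.
    """
    # Keep only the part of the differential assumed to be caused by OA itself.
    causal_differential = oa_differential * (1.0 - self_selection_share)
    # Apply it to the share of research that is not yet OA.
    return causal_differential * (1.0 - oa_share)


if __name__ == "__main__":
    scenarios = [
        ("+50% differential", 0.50, 0.0),
        ("+25% lower bound", 0.25, 0.0),
        ("+25%, half attributed to self-selection", 0.25, 0.5),
    ]
    for label, differential, self_selection in scenarios:
        gain = potential_citation_gain(differential, self_selection_share=self_selection)
        print(f"{label}: {gain:.1%} of current total citations")
```

Even the most cautious of these three scenarios leaves a gain of roughly a tenth of current total citations, which is the sense in which there is "still plenty of gain" after discounting.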
B-CK: To turn your argument around, most authors happily going open access in expectation of increased citation might be disappointed because the 50% increase will only apply to a small minority of them.

That's true; but you could say the same for most authors going into research at all. There is no guarantee that they will produce the highest quality research, but I assume that researchers do what they do in the hope that they will, if not this time, then the next time, produce the highest quality research.

B-CK: That was the reason why I said that (as an individual author) I would rather not believe in any "promised" values for the possible gain.

Where there is life, and effort, there is hope. I think every researcher should do research, and publish, and self-archive, with the ambition of doing the best quality work, and having it rewarded with valuable findings, which will be used and cited.

My "promise", by the way, was never that each individual author would get 50% more citations. (That would actually have been absurd, since over 50% of papers get no citations at all -- apart from self-citation -- and 50% of 0 is still 0.) My promise, in calculating the impact gain/loss that you doubted, was to countries, research funders and institutions. On the assumption that the research output of each roughly covers the quality spectrum, they can expect their total citations to increase by 50% or more with OA, but that increase will be mostly at their high-quality end. (And the total increase is actually about 85% of 50%, as the baseline spontaneous self-archiving rate is about 15%.)

B-CK: That doesn't mean though that there are not enough other reasons to go for open access (I mentioned many of them in my posting).

There are other reasons, but researchers' main motivation for conducting and publishing research is in order to make a contribution to knowledge that will be found useful by, and used by, and built upon by other researchers.
There are pedagogic goals too, but I think they are secondary, and I certainly don't think they are strong enough to induce a researcher to make his publications OA, if the primary reason was not reason enough to induce them. (Actually, I don't think any of the reasons are enough to induce enough researchers to provide OA, and that's why Green OA mandates are needed -- and being provided -- by researchers' institutions and funders.)

B-CK: With respect to the toll accessibility index, I completely agree. The occasional good article in an otherwise "obscure" journal probably has a lot to gain from open access, as many people would not bother to try to get hold of a copy should they find it among a lot of others in a bibliographic database search, if it doesn't look from the beginning like a "perfect match" of what they are looking for.

You agree with the toll-accessibility argument prematurely: There are as yet no data on it, whereas there are plenty of data on the correlation between OA and impact.

B-CK: An interesting question to look at would also be the effect of open access on non-formal citation modes like web linking, especially social bookmarking. Clearly NPG is interested in Connotea also as a means to enhance the visibility of articles in their own toll access articles. Has anyone already tried such investigations?

Although I cannot say how much of it is due to citation links themselves and how much to other kinds of links, the University of Southampton, the first institution with a (departmental) Green OA self-archiving mandate, and also the one with the longest-standing mandate, has a surprisingly high webmetric, university-metric and G-factor rank:

Stevan Harnad
American Scientist Open Access Forum

Bollen, J., Van de Sompel, H., Smith, J. and Luce, R. (2005) Toward alternative metrics of journal impact: A comparison of download and citation data. Information Processing and Management, 41(6): 1419-1440.
Brody, T., Harnad, S. and Carr, L.
(2006) Earlier Web Usage Statistics as Predictors of Later Citation Impact. Journal of the American Society for Information Science and Technology (JASIST) 57(8): 1060-1072.
Craig, Ian; Andrew Plume, Marie McVeigh, James Pringle & Mayur Amin (2007) Do Open Access Articles Have Greater Citation Impact? A critical review of the literature. Journal of Informetrics.
Davis, P. M. and Fromerth, M. J. (2007) Does the arXiv lead to higher citations and reduced publisher downloads for mathematics articles? Scientometrics 71: 203-215. See critiques: 1 and 2.
Diamond, Jr., A. M. (1986) What is a Citation Worth? Journal of Human Resources 21: 200-215.
Eysenbach, G. (2006) Citation Advantage of Open Access Articles. PLoS Biology 4: 157.
Hajjem, C., Harnad, S. and Gingras, Y. (2005) Ten-Year Cross-Disciplinary Comparison of the Growth of Open Access and How it Increases Research Citation Impact. IEEE Data Engineering Bulletin 28(4): 39-47.
Hajjem, C. and Harnad, S. (2006) Manual Evaluation of Robot Performance in Identifying Open Access Articles. Technical Report, Institut des sciences cognitives, Universite du Quebec a Montreal.
Hajjem, C. and Harnad, S. (2006) The Self-Archiving Impact Advantage: Quality Advantage or Quality Bias? Technical Report, ECS, University of Southampton.
Hajjem, C. and Harnad, S. (2007) Citation Advantage For OA Self-Archiving Is Independent of Journal Impact Factor, Article Age, and Number of Co-Authors. Technical Report, Electronics and Computer Science, University of Southampton.
Hajjem, C. and Harnad, S. (2007) The Open Access Citation Advantage: Quality Advantage Or Quality Bias? Technical Report, Electronics and Computer Science, University of Southampton.
Harnad, S. & Brody, T. (2004) Comparing the Impact of Open Access (OA) vs. Non-OA Articles in the Same Journals. D-Lib Magazine 10(6), June.
Harnad, S. (2005) Making the case for web-based self-archiving. Research Money 19(16).
Harnad, S. (2005) Maximising the Return on UK's Public Investment in Research. (Unpublished ms.)
Harnad, S. (2005) OA Impact Advantage = EA + (AA) + (QB) + QA + (CA) + UA. (Unpublished ms.)
Harnad, S. (2005) On Maximizing Journal Article Access, Usage and Impact. Haworth Press (occasional column).
Harnad, S. (2006) Within-Journal Demonstrations of the Open-Access Impact Advantage: PLoS, Pipe-Dreams and Peccadillos (LETTER). PLoS Biology 4(5).
Henneken, E. A., Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant, C., Thompson, D., and Murray, S. S. (2006) Effect of E-printing on Citation Rates in Astronomy and Physics. Journal of Electronic Publishing, Vol. 9, No. 2, Summer 2006.
Henneken, E. A., Kurtz, M. J., Warner, S., Ginsparg, P., Eichhorn, G., Accomazzi, A., Grant, C. S., Thompson, D., Bohlen, E. and Murray, S. S. (2006) E-prints and Journal Articles in Astronomy: a Productive Co-existence. Learned Publishing.
Kurtz, M. J., Eichhorn, G., Accomazzi, A., Grant, C. S., Demleitner, M., Murray, S. S. (2005) The Effect of Use and Access on Citations. Information Processing and Management, 41(6): 1395-1402.
Kurtz, Michael and Brody, Tim (2006) The impact loss to authors and research. In, Jacobs, Neil (ed.) Open Access: Key strategic, technical and economic aspects. Oxford, UK, Chandos Publishing.
Lawrence, S. (2001) Online or Invisible? Nature 411 (6837): 521.
Metcalfe, Travis S. (2006) The Citation Impact of Digital Preprint Archives for Solar Physics Papers. Solar Physics 239: 549-553.
Moed, H. F. (2006) The effect of 'Open Access' upon citation impact: An analysis of ArXiv's Condensed Matter Section (preprint).
Perneger, T. V. (2004) Relation between online 'hit counts' and subsequent citations: prospective study of research papers in the British Medical Journal. British Medical Journal 329: 546-547.
Seglen, P. O. (1992) The skewness of science. Journal of the American Society for Information Science 43: 628-638.