Gaulé, Patrick & Maystre, Nicolas (2011) Getting cited: Does open access help? Research Policy (in press)
G & M: "Cross-sectional studies typically find positive correlations between free availability of scientific articles (‘open access’) and citations… Using instrumental variables, we find no evidence for a causal effect of open access on citations. We provide theory and evidence suggesting that authors of higher quality papers are more likely to choose open access in hybrid journals which offer an open access option. Self-selection mechanisms may thus explain the discrepancy between the positive correlation found in Eysenbach (2006) and other cross-sectional studies and the absence of such correlation in the field experiment of Davis et al. (2008)… Our results may not apply to other forms of open access beyond journals that offer an open access option. Authors increasingly self-archive either on their website or through institutional repositories. Studying the effect of that type of open access is a potentially important topic for future research..."
What the Gaulé & Maystre (G & M) (2011) article shows -- convincingly, in my opinion -- is that in the case of paid hybrid gold OA, most of the observed citation increase is better explained by the fact that the authors of articles that are more likely to be cited are also more likely to pay for hybrid gold OA. (The effect is even stronger when one takes into account the phase in the annual funding cycle when there is more money available to spend.)
But whether or not to pay for OA is definitely not a consideration in the case of Green OA (self-archiving), which costs the author nothing. (The exceedingly low per-article infrastructure costs of hosting Green OA repositories are borne by the institution, not the author: like the incomparably higher journal subscription costs, likewise borne by the institution, they are invisible to the author.)
I rather doubt that G & M's economic model translates into the economics of doing a few extra author keystrokes -- on top of the vast number of keystrokes already invested in keying in the article itself and in submitting and revising it for publication.
It is likely, however -- and we have been noting this from the very outset -- that one of the multiple factors contributing to the OA citation advantage (alongside the article quality factor, the article accessibility factor, the early accessibility factor, the competitive [OA vs non-OA] factor and the download factor) is indeed an author self-selection factor.
What G & M have shown, convincingly, is that in the special case of having to pay for OA in a hybrid Gold journal (PNAS: a high-quality journal that makes all articles OA on its website 6 months after publication), the article quality and author self-selection factors alone (plus the availability of funds in the annual funding cycle) account for virtually all the significant variance in the OA citation advantage:
Paying extra to provide hybrid Gold OA during those first 6 months does not buy authors significantly more citations.
G & M correctly acknowledge, however, that neither their data nor their economic model applies to Green OA self-archiving, which costs the author nothing and can be provided for any article, in any journal (most of which are not made OA on the publisher's website 6 months after publication, as in the case of PNAS). Yet it is on Green OA self-archiving that most of the studies of the OA citation advantage (and the ones with the largest and most cross-disciplinary samples) are based.
I also think that, because both citation counts and the OA citation advantage are correlated with article quality, there is a potential artifact in using estimates of article or author quality as indicators of author self-selection effects: Higher quality articles are cited more, and the size of their OA advantage is also greater. Hence what would need to be done in a test of the self-selection advantage for Green OA would be to estimate article/author quality [but not from their citation counts, of course!] for a large sample and then -- comparing like with like -- to show that among articles/authors estimated to be at the same quality level, there is no significant difference in citation counts between individual articles (published in the same journal and year) that are and are not self-archived by their authors.
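The like-with-like comparison described above can be roughly sketched in Python. This is only an illustration under stated assumptions, not the method of any published study: the record fields (`journal`, `year`, `quality`, `self_archived`, `citations`) and the function name are hypothetical, the `quality` label is assumed to have been estimated independently of citation counts, and the sample records are made up. The sketch computes only the raw difference in mean citations within each journal/year/quality stratum; a real test would also need a significance test on those differences.

```python
from collections import defaultdict
from statistics import mean


def oa_advantage_by_stratum(articles):
    """Map each (journal, year, quality) stratum to the difference
    mean(self-archived citations) - mean(non-self-archived citations).

    Strata that lack either a self-archived or a non-self-archived
    article are skipped, since no like-with-like comparison is possible.
    """
    # For each stratum, collect citation counts separately for
    # self-archived (True) and non-self-archived (False) articles.
    strata = defaultdict(lambda: {True: [], False: []})
    for a in articles:
        key = (a["journal"], a["year"], a["quality"])
        strata[key][a["self_archived"]].append(a["citations"])

    diffs = {}
    for key, groups in strata.items():
        if groups[True] and groups[False]:
            diffs[key] = mean(groups[True]) - mean(groups[False])
    return diffs


# Made-up illustrative records (hypothetical, not real data).
sample = [
    {"journal": "J1", "year": 2005, "quality": "high", "self_archived": True, "citations": 12},
    {"journal": "J1", "year": 2005, "quality": "high", "self_archived": False, "citations": 8},
    {"journal": "J1", "year": 2005, "quality": "high", "self_archived": False, "citations": 10},
    {"journal": "J1", "year": 2005, "quality": "low", "self_archived": True, "citations": 3},
]
result = oa_advantage_by_stratum(sample)
```

If self-selection accounted for the whole OA advantage, these within-stratum differences should hover around zero once quality is estimated finely enough; a consistently positive difference would point to an advantage over and above self-selection.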
No one has done such a study yet -- though we have weakly approximated it (Gargouri et al. 2010) using journal impact-factor quartiles. In our approximation, there remains a significant OA advantage even when comparing OA (self-archived) and non-OA articles (same journal/year) within the same quality-quartile. There is still room for a self-selection effect between and within journals within a quartile, however (a journal's impact factor is an average across its individual articles; PNAS, for example, is in the top quartile, but its individual articles still vary in their citation counts). So a more rigorous study would have to tighten up the quality equation much more closely. But my bet is that a significant OA advantage will be observed even when comparing like with like.
Stevan Harnad
EnablingOpenScholarship