Re: Open access to research worth £1.5bn a year from Stevan Harnad on 2005-09-30 (American-Scientist-Open-Access-Forum)

From: Stevan Harnad <harnad_at_ecs.soton.ac.uk>
Date: Fri, 30 Sep 2005 04:30:16 +0100

Quoting Phil Davis <pmd8_at_cornell.edu>:

> I just read the JEP article (referred to by Peter Banks) comparing
> articles printed in Pediatrics with other articles only appearing in the
> online edition. The authors' main findings suggest that despite wider
> potential audience for articles published freely online, articles
> appearing in print received more citations:

The article compared selected (not necessarily equivalent) articles; it
compared print vs. non-print (not OA vs. non-OA), and it did so some time ago.
Most journals have since become hybrid print/online, and the relevant
comparison today is conventional (print/online) access *only* versus
conventional access *plus* access to a self-archived supplement. That is the
comparison we are making in our studies (which Dr. Banks was challenging),
and our virtually exception-free results show a citation advantage of 50%-
250% for the supplemented access.

It does not even make logical sense to imagine that there would be *fewer*
citations for the supplemented articles -- except if there was a systematic
bias toward self-archiving the inferior articles! In reality, there is a bias
in the opposite direction (a greater tendency to self-archive the better
articles, which partly inflates the OA advantage, but does not constitute all
of it). See the studies of Kurtz et al., in the bibliography I cited.

> "The difference between the mean citation levels for print and online was
> 3.09 ±0.93 in favor of print (95% CI), meaning that an online article
> could expect to receive 2.16 to 4.02 fewer citations in the literature
> than if it had been printed."

Which means nothing more than the fact that at that time there was a print
advantage over non-print (and perhaps also that the better articles were
selected for print). It has next to no bearing on the real question of
interest: Does supplementing paid print/online access with free online
access increase citations?

It does.

> Or in other words, their data do not support the hypothesis that full OA
> journals receive more citations than non-full OA journals.

We are not talking here about either online-only journals vs print journals,
nor about OA journals vs. non-OA journals. The results Dr. Banks challenged
were based on comparing toll-access-only with toll-access plus free online
access.

> Yet it is methodologically difficult to rigorously test this hypothesis,
> and the use of inferential statistics in this study suggests that they are
> trying to generalize beyond their own journal. In this study, the authors
> compared two different sets of articles: 1) those that were selected for
> inclusion in the main journal, and 2) those that were not. Selection bias
> alone may explain the different results, or at least interject a large
> enough bias where the results may not accurately reflect their research
> question. In other words, it would be difficult to understand whether
> their results are a reflection of accessibility, or selection bias.

Yes, there is a big methodological artifact in the comparability of the two
samples: a selection bias. There is also a small, out-dated sample. And a big
question of whether one arbitrary journal is representative of anything at all
(especially under these selective conditions and in this restricted and out-
of-date time-range, in the fast-moving online world).

> Still, this article fails to support the unstated hypothesis that full OA
> journal articles receive more citations than non-full OA journal articles.

To repeat: the studies Dr. Banks was challenging were not comparing OA to
non-OA journals; they were comparing self-archived to non-self-archived articles,
all published in non-OA journals. All journals that were 100% (or 0%) OA were
left out of the analysis of OA/non-OA, for obvious reasons.

> For that conclusion alone, we would be wise to stay with the null
> hypothesis (that is, no significant difference) unless we start seeing
> compelling evidence the other way.

The null hypothesis of no difference between OA and non-OA journals was
supported by comparisons in the ISI studies (see the bibliography I cited),
but it was rejected, repeatedly, by both the Brody et al. data in physics and
the Chawki et al. data in Biomedicine, Psychology, Sociology, Education, and
Business. Stay tuned; more data on the way...

> The other conclusion that we may come to is that it may be impossible to
> come up with universal statements about Open Access publishing (i.e. it
> can provide 50 - 250% more citations). Methodology problems in designing
> rigorous studies may only permit us to make anecdotal statements about
> particular journals or publishing models that have very narrow parameters
> for generalization.

To repeat (yet again): The results Banks was challenging had nothing to do
with OA journals or OA journal publishing. They concerned OA itself, and were
comparing self-archived and non-self-archived articles within the same journal
and year, all in *non-OA* journals.

Stevan Harnad
Received on Fri Sep 30 2005 - 06:11:34 BST
