Please compare Romeo functionality and provide feedback from Stevan Harnad on 2004-04-09 (American-Scientist-Open-Access-Forum)

From: Stevan Harnad <harnad_at_ecs.soton.ac.uk>
Date: Fri, 9 Apr 2004 14:21:14 +0100

Dear colleagues:

It would be very helpful if you could compare and then provide feedback
about the comparative functionality of the two new versions of the
Romeo listing of publisher/journal policies on author self-archiving.

The old Romeo is at:
http://www.lboro.ac.uk/departments/ls/disresearch/romeo/Romeo%20Publisher%20Policies.htm

The two new Romeos to compare are at:

    (1) The SHERPA version at:
    http://www.sherpa.ac.uk/romeo.php?all=yes
    http://www.sherpa.ac.uk/romeo.php?stats=yes

versus

    (2) The Southampton version, which was really intended to demonstrate
    features for the SHERPA version to adopt, rather than to become a
    rival version:
    http://www.ecs.soton.ac.uk/~harnad/Temp/Romeo/romeo.html
    http://www.ecs.soton.ac.uk/~harnad/Temp/Romeo/romeosum.html

It is important to note that SHERPA has made some extremely valuable
improvements on the old Romeo -- in particular, a separate entry for
whether each publisher supports (i) postprint self-archiving and (ii)
preprint self-archiving -- and that these improvements are incorporated in
*both* versions and are hence not in dispute. In fact, there is no dispute
whatsoever about the information conveyed in the two versions. They are
*exactly* equivalent!

The only question is about the functionality of the presentation:

    (a) the formatting, colour coding, and exact wording of the entries
    for postprints and preprints

    (b) the colour-coding of the publishers in the entries and the
    summary statistics

There is also a proposal to add one item of information (so far absent from
both new versions, but about to be incorporated in the Southampton version):

    (c) cloning the present format and information and presenting it not
    only by publisher, but by journal

It would be very helpful if colleagues could compare the two versions and weigh
their actual features (plus the proposed feature of listing journals too)
from the specific standpoint of the *intended users* of this information.

It seems to me (though you may have different views on this) that the three
main things that all users want to know are:

    (1) Which publishers/journals already give their official green
    light to self-archiving *at all*?

    (2) Of those that do, do they give their green light to postprint
    self-archiving or only to preprint self-archiving?

    (3) What are the current numbers, percentages, and growth-rates among
    those publishers/journals who give their green light to postprint
    self-archiving, preprint self-archiving or neither?

Note that there are also more specific details in both lists about each
individual publisher/journal's self-archiving policies, but the two
versions of the new Romeo do not differ at all on those specific details
in the individual entries. The functional comparison is only with respect
to (1)-(3) above (unless you can think of another general thing that users
would need/want to know).

All the formatting, wording, and colour-coding differences concern what is
the most heuristic and informative way to present the answers to (1)-(3)
to users at a glance. Any user prepared to ponder over the data at length
will get all the answers just as he would from studying a raw Excel
spread-sheet, without the help of any colour-code, but the question here
is about *functionality*.

    The colour-code for the answer to (1):

    (1) Which publishers/journals already give their official green
    light to self-archiving *at all*?

SHERPA version: yellow, blue and green, plus green ticks (yes) vs. white,
plus red X's (no).

Proposed alternative: green (yes) vs. gray (no).

    The colour-code for the answer to (2):

    (2) Of those that do, do they give their green light to postprint
    self-archiving or only to preprint self-archiving?

SHERPA version: yellow, plus green tick (preprints only); blue, plus green tick
(postprints only); green, plus green tick (both); white, plus red X (neither)

Proposed alternative: pale-green (preprints only), bright-green (postprints
or both), gray (neither)
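
As a rough illustration only, the two colour codes for (2) can be written
as two small mappings from a publisher's policy to a colour. The names
Policy, sherpa_colour and proposed_colour are hypothetical and are not
taken from either Romeo site; only the colour assignments come from the
descriptions above.

```python
from typing import NamedTuple


class Policy(NamedTuple):
    """Hypothetical record of one publisher's self-archiving policy."""
    preprint: bool   # green light for preprint self-archiving
    postprint: bool  # green light for postprint self-archiving


def sherpa_colour(p: Policy) -> str:
    """Four background colours, as described for the SHERPA version."""
    if p.preprint and p.postprint:
        return "green"
    if p.postprint:
        return "blue"
    if p.preprint:
        return "yellow"
    return "white"


def proposed_colour(p: Policy) -> str:
    """Two shades of green plus gray, as in the proposed alternative."""
    if p.postprint:
        return "bright-green"  # postprints only, or both
    if p.preprint:
        return "pale-green"    # preprints only
    return "gray"              # neither (so far)
```

In this sketch "postprints only" and "both" fall into one bright-green
category, which is the deliberate collapse described for the proposed
alternative.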

    The colour code for the answer to (3):

    (3) What are the current numbers, percentages, and growth-rates among
    those publishers/journals who give their green light to postprint
    self-archiving, preprint self-archiving or neither?

SHERPA version: Publisher numbers and percentages are given for yellow
(preprints), blue (postprints), green (both) and white (neither). No
journal information. No growth information.

Proposed alternative: Publisher and journal numbers, percentages and
growth-rates are given for pale-green (preprints), bright-green (postprints
and both) and gray (neither). (Note that the combination "postprint
but not preprint" is *deliberately* not given a separate colour-code,
although the numbers and percentages are given, and the postprint/preprint
distinction is coded as two shades of green, rather than as separate
colours.)
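
A minimal sketch of how the summary numbers and percentages for (3) could
be tallied under the proposed three-category coding. The function name
summarise and the sample publishers are invented for illustration; they
are not real Romeo data, and growth-rates (which would need counts from
more than one date) are left out.

```python
from collections import Counter

CATEGORIES = ("bright-green", "pale-green", "gray")


def summarise(policies):
    """Return {category: (count, percentage)} for the proposed coding.

    `policies` maps a publisher (or journal) name to a pair
    (preprint_ok, postprint_ok) of booleans.
    """
    def category(preprint_ok, postprint_ok):
        if postprint_ok:
            return "bright-green"  # postprints only, or both
        if preprint_ok:
            return "pale-green"    # preprints only
        return "gray"              # neither

    counts = Counter(category(pre, post) for pre, post in policies.values())
    total = sum(counts.values())
    return {
        c: (counts[c], 100.0 * counts[c] / total if total else 0.0)
        for c in CATEGORIES
    }


# Hypothetical sample: four made-up publishers, not real policy data.
sample = {
    "Publisher A": (True, True),
    "Publisher B": (False, True),
    "Publisher C": (True, False),
    "Publisher D": (False, False),
}
summary = summarise(sample)
```

The same function would serve for a journal-level listing: only the keys
of the input mapping change.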

The prospective users from whose standpoint you should weigh the
functionality of these two ways of colour-coding and wording the
very same information are, in order of priority:

    (i) authors contemplating self-archiving, who want to find out whether
    or not a particular journal or publisher has given its green light
    to self-archiving, and especially self-archiving of the all-important
    article itself, the postprint; they may also wish to browse journals
    or publishers to see which ones support self-archiving; they may also
    want an idea of where things stand currently on overall numbers and
    proportions, as well as growth trends, for self-archiving as a whole,
    and postprints (primarily) and preprints (secondarily) in particular

(In weighing the functionality and the incentive vs. deterrent
effect of coding "not yet" as "gray" -- rather than as a "red X" [or a
"skull-and-cross-bones"!] -- please do bear in mind that (1) many publishers
have not yet formulated a self-archiving policy, in fact have not given
it much thought at all, and that (2) publisher policy on preprints is
not even a copyright matter at all!)

   (ii) institutions (and their libraries) that provide self-archiving
   resources and advice to their authors: they too will want to
   know which publishers and which journals to date have given their
   green light to self-archiving at all, and especially to postprint
   self-archiving. They too will want to be able to browse and look
   up publishers and journals, as well as being able to visualize the
   current summary status and trends at a glance

   (iii) research-funders that want to see at a glance what publishers'
   and journals' self-archiving policies are now, and how they are evolving
   in time

   (iv) publishers and journals that want to see what other publishers'
   and journals' self-archiving policies are, and how they are evolving
   in time

I hope I have not primed you too much in the direction of my favoured
version in putting it this way, but I do think it is important to
conceptualize clearly what the list is for, and what its intended users
are likely to want. Otherwise we are not discussing functionality but
merely one's taste in colour!

Now my reply to David Goodman, who, as far as I can discern, is not saying
anything substantive at all about the comparative functionality of the
two versions, but merely expressing his abstract ideas about colour-coding
(as illustrated by his rather off-topic example of quantum chromodynamics
in physics). I really hope further feedback will be specific about the
comparative functionality of the two versions relative to the intended
uses and users of the Romeo information.

On Thu, 8 Apr 2004, David Goodman wrote:

> I think Stevan Harnad is right on the button when he says:
>
> >sh> "It is to be regretted that there must now be two Romeo
> >sh> sites. (All I had wanted to do was to integrate Romeo, DOAJ and
> >sh> OAIster in tracking and comparing *total growth,* not to provide
> >sh> a rival version of Romeo itself. But if SHERPA/Romeo declines to
> >sh> optimise, preferring to cite precedent instead of considering
> >sh> functionality, I do feel that having this functionality at
> >sh> another site is preferable to not having it at all.)"

It is not clear what David is agreeing with here, for the rest of his message
seems to contradict his agreement here! All I said was that I wish there did not
have to be two Romeo lists, but that having two lists, one with the functionality
I am proposing and one without, was better than not having the functionality at
all!

(Unless I am mistaken, David seems to have parsed this passage only
to the depth: "Better one list than two." And that's all he's saying
is right. I hope David will now look at the functionality of my re-coded
Romeo (more closely than he looked at the contents of this passage!) and
compare it with the SHERPA version. David likes the idea of listing
journals and not just publishers, but I hope he has noticed that that
is not what my proposed re-coding is about, since the re-coded version
does not yet even list journals, and since journals can so easily be
cloned into the SHERPA version: it is about the functionality of the
re-coding itself!)

> Bill Hubbard wrote:
>
> >bh> We understand that this is what Stevan Harnad and his group is now
> >bh> going to do as a separate project. We have now given the underlying
> >bh> SHERPA/RoMEO data to Stevan Harnad's group at their request for
> >bh> this purpose and we look forward to the valuable contribution that
> >bh> this new journal based interface will provide.
>
> But he is not right about everything.

To repeat, so far David has only agreed with himself! He too thinks
Romeo should list journals and not just publishers. But we are comparing
here the functionality of the two different codings, whether or not they
cover journals.

Here is what David has to say about the coding:

> The use of colors and similar arbitrary designations is typical of
> an in-group, sort of a within-family private language. Sometimes these
> things do get accepted and into the mainstream, and stay there forever. I
> suppose the best example is from physics: quantum chromodynamics,
> (along with the properties of charm and strangeness and the very name
> of quark). Or the practice in genetics of naming drosophila genes as
> English or Japanese puns. If we think that our field is of similar
> status and can count on similar acceptance, then it will do no harm.
> If we need to talk to people in other fields (like physics and biology),
> maybe it's another matter.

With all due respect, I think the question of functionality to users that
I am raising has nothing whatsoever to do with quantum chromodynamics
or drosophila genetics, and it would be a great help if those who
are comparing the two new Romeo versions focussed concretely on their
actual and potential uses rather than abstract principles of scientific
nomenclature!

> Stevan as a cognitive scientist should know better than to universalize
> what to him are:
>
> >sh> These simple, easy-to-grasp properties are the ones that should
> >sh> be clearly reflected in the colour codes, not an arbitrary,
> >sh> exhaustive, chromatic nomenclature!

I am not universalizing: I am stating very specifically (at the top
of this posting, and in several prior postings) how and why I think two
colours (green and gray) and two shades (pale-green and bright-green) will
*work* far better for users than five colours (yellow, blue, green, white,
red). Neither David's appeal to abstract principles nor Bill's appeal to
old-Romeo's historic precedent answers the question: Which of these two
codings is more heuristic, informative and useful to the intended users
of the list?

> Bill understands better when he says :
>
> >bh> I would like to move on from discussing different colours! Many
> >bh> different colour-schemes could be proposed, with different benefits
> >bh> and drawbacks.

I regret to say that this is not an answer about functionality. It is simply
begging the question. The benefits/drawbacks have to be compared and weighed!
The old Romeo colour-code (in which I was involved, as I was part of
the old-Romeo working group) was improvised as we went along; it was
nowhere near optimised when the project ended. It is splendid that the
list is being taken over by SHERPA, but that is no reason to perpetuate
non-optimal (in fact, I would argue, dysfunctional) features that the
old Romeo happened to have when the project ended.

Nor did SHERPA merely perpetuate the features of the old Romeo: A very valuable
new feature -- explicit, separate, +/- entries for postprint self-archiving
and preprint self-archiving -- was introduced. That is commendable, and greatly
enhanced the new Romeo's functionality.

But dysfunctional old-Romeo features (particularly the already-excessive colour
code: blue/green/white) were taken over too, and made worse by adding more
colours (yellow, and red crosses, and green ticks with a meaning different
from the green already in use!). And these features were not supported by
functional arguments: It was simply stated that the old colours were inherited,
the new colours were added to mark still more categories, and that there are
many different possible colour schemes, but we have adopted this one,
and there's the end of the matter!

And that's why a second, competing Romeo was born, regrettably, rather than a
dialogue in which features were upgraded on the basis of their functionality
(which includes their economy!).

> I am sure by now he wishes they had never been included!

It would have been just as big a mistake to include *no* colour codes at all.
The distinction between publishers/journals that do or do not support author
self-archiving at all -- self-archiving+ vs. self-archiving- -- *is* an
extremely important one, and if we are to do counts and chart progress,
it is quite natural to use a colour code, especially in a canonical
list. Similarly, the preprint/postprint distinction should be colour-coded,
but clearly coded as two variants of the colour of self-archiving+, not as two
independent colours!

So self-archiving+/self-archiving- become green vs. gray
and postprint+/preprint+ become bright-green vs. pale-green

And that's all that's needed. But this is not just a matter of abstract
boolean logic: Look at the two versions and see whether your eye and brain
don't immediately grasp the pertinent overall distinctions far more readily
with the minimal colour-code rather than with the parti-coloured one.

> As a librarian, I note some errors and omissions:
>
> First, publishers' subdivisions often have different policies. Though
> the list requested groups all of Elsevier together, its division Cell
> Press has totally different and less friendly policies than the rest of
> the company.

This is possibly another argument for having journal listings too, not
just publisher listings. But I'm inclined to say that if the publisher
(say, Elsevier) says all Elsevier authors have the green light to
self-archive (say, the preprint), then authors can and should take them
at their word and self-archive.

David, this is an author matter, not a library matter. Librarians
are accustomed to thinking in terms of IP and permissions for
*bought-in* content: can I use it in course-packs? can I put it on our
intranet? etc. etc. None of this has the slightest relevance to author
self-archiving. If my publisher's policy statement about all Elsevier
authors does not in fact apply to all Elsevier authors, let them notify
me if they have a problem with my self-archiving! Meanwhile, I self-archive.

That's the end of it, David. Trying to squeeze author self-archiving and
the Romeo list into the librarian's IP/permissions straitjacket for
bought-in content is simply a waste of time and energy that could be
usefully redirected elsewhere.

> It is remarkably difficult getting an up-to-date and correct statement of
> policy: even senior executives are often one policy change behind. The
> worst source is the individual journal's instructions to authors: some
> have not been updated for 40 years (as shown by internal evidence).
> Ulrich's is not definitive about title changes and the like; it relies
> upon publisher's information, and can be a year or so behind. On the
> other hand, some large publishers are notorious for being unable to
> produce correct lists of their own titles.

Well, well. It seems that authors can't self-archive until someone
succeeds in sorting out this mess! I don't think so. I think you
are trying to fit author self-archiving into the procrustean bed of
librarians' IP problems, and that that is both unnecessary and a big
mistake -- and would be a great disservice to open-access provision if
it were taken seriously by authors (which I hope it will not be!).

Romeo's publisher self-archiving policy listings are based on the best
current information available (but are not guaranteed). That's good
enough for the function Romeo is intended to perform. (Ulrich's is
not the source of the policies, just of the journal-names.)

None of this, by the way, has anything to do with the question at hand,
which is about the colour-coding of the Romeo data, not their validity.

> Maintaining a proper list takes work. If we can do it all together for
> one list, maybe we can do it right. If the reason we cannot is merely
> disagreement over what codes to use, none of the groups by themselves
> may be able to accomplish it.

One Romeo list of publisher policies on author self-archiving is enough
(and trying to coordinate two independent lists, being independently updated,
would be a nightmare!). The Southampton version takes its data from the
SHERPA version. The disagreement is only about coding, and it is on the
grounds of the functionality of that coding for the user, which David
has not addressed at all, except as regards quantum chromodynamics...

I hope other colleagues will.

Stevan Harnad
Received on Fri Apr 09 2004 - 14:21:14 BST
