Sunday, January 29. 2006
On Fri, 27 Jan 2006, Hélène Bosc wrote (in French): "Can you explain to me what is behind OpenDOAR?" I'll reply in English to your question about what is behind OpenDOAR, so I can post the reply more widely: "Evidently [it repeats] the work already done at Southampton..." It is true that, so far, DOAR is mostly just re-doing, funded, what Tim Brody had already done, unfunded (with ROAR). DOAR so far covers about 3/5 of the archives in ROAR and 1/2 the number in OAIster, and does not yet measure or provide a way to display the time-course of their growth in contents or number, as ROAR does. (DOAR will need Tim's Celestial to do that.)
However, DOAR does provide an OAI Base URL in what looks (to my eyes: DOAR does not yet give tallies) to be a much larger proportion of archives than ROAR (c. 80%) does, and this is presumably because DOAR has contacted, directly and individually, each archive for which the OAI Base URL was missing.
(This is something I had asked Tim to do, but it is perhaps too much to expect from an unfunded doctoral student, primarily working on his thesis! The solution of course is for archives to expose their own OAI Base URLs for harvesters to pick up automatically, and this will of course be the ultimate outcome. For now, there is no Registry that all archives use or aspire to be covered by. If DOAR incorporates all of the useful features of ROAR (especially Celestial), and adds value, it may succeed in becoming that Registry. So far, ROAR's periodic calls to Archives to register have not inspired enough responsiveness. Most of ROAR's new archives for the past year or more have been hand-imported by me and Tim! At least DOAR will be funded to do that thankless task, from now on!)
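To make "exposing an OAI Base URL" concrete, here is a minimal sketch of what a harvester could do with one. It assumes the standard OAI-PMH 2.0 `Identify` verb; the archive address and the sample response are invented for illustration, and no network request is made.

```python
from urllib.parse import urlencode
import xml.etree.ElementTree as ET

# Namespace used by OAI-PMH 2.0 responses.
OAI_NS = {"oai": "http://www.openarchives.org/OAI/2.0/"}


def identify_url(base_url):
    """Build the Identify request for a given OAI Base URL."""
    return base_url + "?" + urlencode({"verb": "Identify"})


# Hypothetical response from a made-up archive (not real data).
SAMPLE_RESPONSE = """<OAI-PMH xmlns="http://www.openarchives.org/OAI/2.0/">
  <responseDate>2006-01-29T12:00:00Z</responseDate>
  <request verb="Identify">http://archive.example.org/cgi/oai2</request>
  <Identify>
    <repositoryName>Example Institutional Archive</repositoryName>
    <baseURL>http://archive.example.org/cgi/oai2</baseURL>
    <protocolVersion>2.0</protocolVersion>
    <earliestDatestamp>2001-01-01</earliestDatestamp>
  </Identify>
</OAI-PMH>"""


def parse_identify(xml_text):
    """Extract the fields a registry would want from an Identify response."""
    root = ET.fromstring(xml_text)
    ident = root.find("oai:Identify", OAI_NS)
    fields = ("repositoryName", "baseURL", "protocolVersion", "earliestDatestamp")
    return {f: ident.findtext("oai:" + f, namespaces=OAI_NS) for f in fields}


if __name__ == "__main__":
    print(identify_url("http://archive.example.org/cgi/oai2"))
    print(parse_identify(SAMPLE_RESPONSE))
```

A registry that knows only the Base URL could, in principle, fill in the rest of an archive's entry this way instead of contacting each archive by hand.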
The second potentially useful feature of DOAR is that it seems to classify the different content types separately; and (I think, though I'm not sure) DOAR has checked that those are all full-texts (rather than just bibliographic metadata; DOAR will need to make this more explicit in its documentation).
If so, then DOAR can potentially provide size and growth-rate charts by content type (preprints, postprints, theses, etc.), though as of now there is no way to do this (or boolean combinations) in DOAR. (The Eprints software already tags and exposes content types as well as whether or not each entry is a full-text; I expect that the other archive softwares will soon follow suit. Then it's up to the archives to provide and expose those metadata, so the harvesters can pick up, tally, and do other useful things with them.)
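As a rough illustration of the kind of tallying meant here, the sketch below counts harvested records by content type and builds a cumulative growth series for one type. The record layout (deposit month, content type, full-text flag) and the sample data are assumptions made up for the example; they are not DOAR's, ROAR's, or Eprints' actual format.

```python
from collections import Counter

# Hypothetical harvested records: (deposit month, content type, has full text).
RECORDS = [
    ("2005-10", "preprint", True),
    ("2005-10", "thesis", True),
    ("2005-11", "postprint", True),
    ("2005-11", "postprint", False),  # bibliographic metadata only
    ("2005-12", "preprint", True),
    ("2006-01", "postprint", True),
]


def tally_by_type(records, full_text_only=True):
    """Count records per content type, optionally skipping metadata-only entries."""
    return Counter(
        ctype for _, ctype, full in records if full or not full_text_only
    )


def cumulative_growth(records, content_type, full_text_only=True):
    """Return (month, running total) pairs for one content type."""
    per_month = Counter(
        month
        for month, ctype, full in records
        if ctype == content_type and (full or not full_text_only)
    )
    total = 0
    growth = []
    for month in sorted(per_month):
        total += per_month[month]
        growth.append((month, total))
    return growth


if __name__ == "__main__":
    print(tally_by_type(RECORDS))
    print(cumulative_growth(RECORDS, "postprint"))
```

The `full_text_only` switch is there because the distinction between full-texts and bare metadata changes the counts, which is why it matters that archives expose that flag.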
Right now, the DOAR entry for an archive looks a lot like a library card catalogue entry for a journal or a book (perhaps by analogy with DOAJ) or even a collection.
This does not quite make sense to me, since users do not consult or use individual online institutional archives the way they look up card-catalogue entries for individual books or journals or collections. For one thing, most of the archives will be university IRs. Most universities produce contents of all of the types listed, and in all of the subjects listed; and rarely will any user want all, or only, the articles on subject X from individual institution Y. They will instead use an OAI harvester and service-provider like OAIster or Citebase or CiteSeer or even Google Scholar, which searches across all institutions on that subject, or even across all subjects.
Hence the only likely use for those type and subject classifications is either (1) for automatic pick-up by OAI harvesters, using them to mediate in harvesting the archives' metadata directly or (2) for individuals interested in gathering summary statistics on individual archive offerings. (And again, the optimal and most likely outcome is that the archives themselves will expose these metadata to be picked up directly by harvesters, rather than having to be mediated by a middle-service, hand-gathering and checking any missing data.)
So there are still functionality issues to be thought through if DOAR is to provide a useful service. But I expect these things will be resolved, and that DOAR will build on ROAR something that provides genuine value to the OA community and the research community in general, helping to hasten the day of 100% OA.
Ceterum censeo: "DOAJ, OAIster and Romeo should chart growth, as EPrints does" (Jan 2004)
Stevan Harnad