Thursday, January 29. 2009
Peter Suber wrote in Open Access News: Notifying authors when they are cited
Elsevier has launched CiteAlert, a free service notifying authors when one of their papers is cited by an Elsevier journal. (Thanks to ResourceShelf.) The service only covers citations to articles published since 2005 in journals indexed by Scopus.

Comments [by Peter Suber]:
"This is useful as far as it goes, and I can see why Elsevier can't take it much further on its own. But imagine if all journal publishers offered similar services. The utility of receiving their reports, knowing that they comprehensively covered the field, would be immense. But the labor of signing up for each one separately would also be immense, not to mention the labor of re-creating the service at thousands of different publishers. The bother of reading separate reports from separate publishers would also be immense. I understand that Elsevier's portfolio is larger than anyone else's, but the long tail of academic publishing means that Elsevier's titles still constitute less than 10% of all of peer-reviewed journals.
"I'd like to see a service that notifies authors when one of their works is cited by any journal, regardless of its publisher. If this can't be done by a creative developer harvesting online information (because the harvester doesn't have access to TA sites), then how about a consortial solution from the publishers themselves? And don't stop at emails to authors. Create RSS feeds which users can mash-up in any way they like. Imagine getting a feed of your citations from this hypothetical service and a feed of your downloads from your institutional repository. Imagine your IR feeding the citations in your articles to an OA database, upon which anyone could draw, including this hypothetical service.
"Who could do this? OpenURL? CrossRef? ParaCite? Google Scholar? OCLC (after it acquires OAIster)? A developer at an institution like Harvard with access to the bulk of TA journals? Perhaps someone could build the OA database now, with the citation-input and email- and RSS-output functions, and worry later about how to recruit publishers and repositories and/or how to harvest their citations." It is clear who should notify whom -- once the global research community's ( Green OA ) task is done. Our task is first to get all refereed research journal articles self-archived in their authors' Institutional Repositories (IRs) immediately upon acceptance for publication. (To accomplish that we need universal Green OA self-archiving mandates to be adopted by all institutions and funders, worldwide.)
Once all current and future articles are being immediately deposited in their authors' IRs, the rest is easy:
The articles are all in OAI-compliant IRs. The IR software treats the articles in the reference list of each of its own deposited articles as metadata, to be linked to the cited article, wherever it too is deposited in the distributed network of IRs. A citation-harvesting service operating over this interlinked network of IRs can then provide (among many, many other scientometric services) a notification service, emailing each author of a deposited article whenever a new deposit cites it. (No proprietary firewalls, no toll- or access-barriers: IR-to-IR, i.e., peer-to-peer.)
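As a rough illustration (not any IR's actual implementation), here is a minimal sketch of such a harvester. The assumptions are hypothetical: each IR exposes a standard OAI-PMH endpoint and exports each deposit's parsed reference list in its Dublin Core "relation" field. The endpoint URLs and the author-email lookup are placeholders.

```python
# Sketch of an IR-to-IR citation-notification harvester. Assumptions (not
# current IR behaviour): each IR exposes OAI-PMH and exports each deposit's
# parsed reference list in its Dublin Core "relation" field. URLs are
# placeholders; the author-email lookup is a stub.
from sickle import Sickle  # a Python OAI-PMH client library

IR_ENDPOINTS = [
    "https://eprints.example.edu/cgi/oai2",   # hypothetical endpoints
    "https://ir.example.ac.uk/oai/request",
]

def harvest_citations(endpoints):
    """Map each cited-work identifier to the new deposits that cite it."""
    cited_by = {}
    for base_url in endpoints:
        for record in Sickle(base_url).ListRecords(metadataPrefix="oai_dc"):
            meta = record.metadata  # dict of Dublin Core fields -> value lists
            citing = meta.get("identifier", ["(unknown)"])[0]
            # Assumption: the IR puts one cited-work identifier per dc:relation
            for cited in meta.get("relation", []):
                cited_by.setdefault(cited, []).append(citing)
    return cited_by

def notify(cited_by, lookup_author_email):
    """Alert each author whose deposited article has new citations (stub)."""
    for cited, citers in cited_by.items():
        email = lookup_author_email(cited)  # IR-side lookup, not defined here
        print(f"To {email}: {cited} newly cited by {len(citers)} deposit(s)")
```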
Stevan Harnad
American Scientist Open Access Forum
Thursday, January 22. 2009
On 22-Jan-09, at 5:18 AM, Francis Jayakanth wrote on the eprints-tech list:

"Until recently, we used to include references for all the uploads into our repository. While copying and pasting metadata content from the PDFs, we don't directly paste the copied content onto the submission screen. Instead, we first copy the content into an editor like Notepad or WordPad and then copy it from the editor onto the submission screen. This is especially true for the references.
"Our experience has been that when the references are copied and pasted on to an editor like notepad or wordpad from the PDF file, invariably non-ascii characters found in almost every reference. Correcting the non-ascii characters takes considerable amount of time. Also, as to be expected, the references from difference publishers are in different styles, which may not make reference linking straight forward. Both these factors forced us take a decision to do away with uploading of references, henceforth. I'll appreciate if you could share your experiences on the said matter." The items in an article's reference list are among the most important of metadata, second only to the equivalent information about the article itself. Indeed they are the canonical metadata: authors, year, title, journal. If each Institutional Repository (IR) has those canonical metadata for every one of its deposited articles as well as for every article cited by every one of its deposited articles, that creates the glue for distributed reference interlinking and metric analysis of the entire distributed OA corpus webwide, as well as a means of triangulating institutional affiliations and even name disambiguation.
Yes, there are technical problems to be solved in order to capture all references, such as they are, while filtering out noise; but those problems are well worth solving (and the solution well worth sharing) for the great benefits they will bestow.
The same is true for handling the numerous (but finite) variant formats that references may take: yes, there are many, including different permutations in the order of the key components, abbreviations, incomplete components, etc. But those variants too are finite; they can be handled once and for all, to a very good approximation; and the solution can be shared and pooled across the distributed IRs and their software. And again, it is eminently worthwhile to make the relatively small effort to do this, because the dividends are so vast.
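For illustration, here is a minimal sketch of such normalization, handling just one common reference pattern; a production-quality, pooled solution (CiteSeer-style) would need a much larger pattern set, and the pattern and test string below are invented.

```python
# Sketch: normalize a reference string into the canonical metadata named above
# (authors, year, title, journal). One common pattern only; a shared, pooled
# solution would need a much larger pattern set. Test string is invented.
import re
from typing import NamedTuple, Optional

class Reference(NamedTuple):
    authors: str
    year: int
    title: str
    journal: str

# Matches e.g.: Author, A. (1999) Title of paper. Journal Name 12: 34-56.
PATTERN = re.compile(
    r"^(?P<authors>[^()]+?)\s*\((?P<year>\d{4})\)\s*"
    r"(?P<title>[^.]+)\.\s*(?P<journal>[^0-9]+)"
)

def parse_reference(raw: str) -> Optional[Reference]:
    # Drop the non-ASCII debris that PDF copy-paste typically introduces
    # (a real system would transliterate rather than discard)
    cleaned = raw.encode("ascii", "ignore").decode().strip()
    m = PATTERN.match(cleaned)
    if m is None:
        return None  # unmatched variants would go to a manual-review queue
    return Reference(
        authors=m.group("authors").rstrip(", "),
        year=int(m.group("year")),
        title=m.group("title").strip(),
        journal=m.group("journal").strip().rstrip(".,"),
    )

print(parse_reference("Author, A. (1999) Title of paper. Journal Name 12: 34-56."))
```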
I hope the IR community in general -- and the EPrints community in particular -- will make the relatively small, distributed, collaborative effort it takes to ensure that this all-important OA glue unites all the IRs in one of their most fundamental functions. (Roman Chyla has replied on eprints-tech with one potential solution: "The technical solution has been there for quite some time: look at CiteSeer, where all the references are extracted automatically (the code of CiteSeer, the old version, was available upon request -- I don't know if that is the case now, but it was in the past). That would be the right way to go, imo. I seem to remember that one CiteSeer-based library for economics existed, so it is not only computer-science texts, with their predictable reference styles, that can be processed. With the humanities it is yet another story.")

Stevan Harnad
American Scientist Open Access Forum
Wednesday, January 14. 2009
"[A]n investigation of the use of open access by researchers from developing countries... show[s] that open access journals are not characterised by a different composition of authors than the traditional toll access journals... [A]uthors from developing countries do not cite open access more than authors from developed countries... [A]uthors from developing countries are not more attracted to open access than authors from developed countries. [underscoring added]"(Frandsen 2009, J. Doc. 65(1))
(See also "Open Access: No Benefit for Poor Scientists".)

Open Access is not the same thing as Open Access Journals.
Articles published in conventional non-Open-Access journals can also be made Open Access (OA) by their authors -- by self-archiving them in their own Institutional Repositories.
The Frandsen study focused on OA journals, not on OA articles. It is problematic to compare OA and non-OA journals, because journals differ in quality and content, and OA journals tend to be newer and fewer than non-OA journals (and often not at the top of the quality hierarchy).
Some studies have reported that OA journals are cited more, but because of the problem of equating journals, these findings are limited. In contrast, most studies that have compared OA and non-OA articles within the same journal and year have found a significant citation advantage for OA. It is highly unlikely that this is only a developed-world effect; indeed it is almost certain that a goodly portion of OA's enhanced access, usage and impact comes from developing-world users.
It is unsurprising that developing world authors are hesitant about publishing in OA journals, as they are the least able to pay author/institution publishing fees (if any). It is also unsurprising that there is no significant shift in citations toward OA journals in preference to non-OA journals (whether in the developing or developed world): Accessibility is a necessary -- not a sufficient -- condition for usage and citation: The other necessary condition is quality. Hence it was to be expected that the OA Advantage would affect the top quality research most. That's where the proportion of OA journals is lowest.
The Seglen effect ("skewness of science") is that the top 20% of articles tend to receive 80% of the citations. A journal's average citation count is therefore dominated by its few most-cited articles, which is why the OA Advantage is more detectable by comparing OA and non-OA articles within the same journal than by comparing OA and non-OA journals.
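For concreteness, here is a minimal sketch of that within-journal comparison, on invented article-level data: for each journal-year cell containing both OA and non-OA articles, compare the mean citation counts of the two groups.

```python
# Sketch: the within-journal OA/non-OA comparison described above. All data
# below are invented solely to show the shape of the computation.
from collections import defaultdict
from statistics import mean

# One row per article: (journal, year, is_oa, citation_count)
articles = [
    ("Journal A", 2005, True, 14),
    ("Journal A", 2005, False, 6),
    ("Journal A", 2005, False, 9),
    ("Journal B", 2005, True, 3),
    ("Journal B", 2005, False, 2),
]

# Group articles into journal-year cells, split by OA status
cells = defaultdict(lambda: {True: [], False: []})
for journal, year, is_oa, cites in articles:
    cells[(journal, year)][is_oa].append(cites)

# The advantage is only measurable where a cell contains both groups
for (journal, year), groups in sorted(cells.items()):
    if groups[True] and groups[False]:
        ratio = mean(groups[True]) / mean(groups[False])
        print(f"{journal} {year}: OA/non-OA mean-citation ratio = {ratio:.2f}")
```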
We will soon be reporting results showing that the within-journal OA Advantage is higher in "higher-impact" (i.e., more cited) journals. Although citations are not identical with quality, they do correlate with quality (when comparing like with like). So an easy way to understand the OA Advantage is as a quality advantage -- with OA "levelling the playing field" by allowing authors to select which papers to cite on the basis of their quality, unconstrained by their accessibility. This effect should be especially strong in the developing world, where access-deprivation is greatest.

Leslie Chan -- "Associate Director of Bioline International, co-signatory of the Budapest Open Access Initiative, supervisor in the new media and international studies programs at the University of Toronto, and tireless champion for the needs of the developing world" (Poynder 2008) -- has added the following in the American Scientist Open Access Forum:

I concur with Stevan's comments, and would like to add the following:
1. From our perspective, OA is as much about the flow of knowledge from the South to the North as about the traditional concern with access to literature from the North. So the question to ask is whether, with OA, authors from the North are starting to cite authors from the South. This is a study we are planning. We already have good evidence that more authors from the North are publishing in OA journals in the South (already an interesting reversal), but we need a more careful analysis of the citation data.
2. The more critical issue regarding OA and developing country scientists is that most of those who publish in "international" journals cannot access their own publications. This is where open repositories are crucial, to provide access to research from the South that is otherwise inaccessible.
3. The Frandsen study focuses on biology journals, and I am not sure what percentage of them is available to developing-country researchers through HINARI/AGORA. This would explain why researchers in this area would not need to rely on OA materials as much. But HINARI etc. are not OA programs, and local researchers will be left with nothing when the programs are terminated. OA is the only sustainable way to build local research capacity in the long term.
4. Norris et al.'s (2008) "Open access citation rates and developing countries" focuses instead on mathematics, a field not covered by HINARI, and they conclude that "the majority of citations were given by Americans to Americans, but the admittedly small number of citations from authors in developing countries do seem to show a higher proportion of citations given to OA articles than is the case for citations from developed countries. Some of the evidence for this conclusion is, however, mixed, with some of the data pointing toward a more complex picture of citation behaviour."
5. Citation behaviour is complex indeed and more studies on OA's impact in the developing world are clearly needed. Davis's eagerness to pronounce that there is "No Benefit for Poor Scientists" based on one study is highly premature.
If there should be a study showing that people in developing countries prefer imported bottled water over local drinking water, should efforts to ensure clean water supplies locally be questioned?
Leslie Chan
Stevan Harnad
American Scientist Open Access Forum
Tuesday, January 13. 2009
Harnad, Stevan (2009) Multiple metrics required to measure research performance. Nature (Correspondence) 457: 785 (12 February 2009) doi:10.1038/457785a
Nature's editorial "Experts still needed" (Nature 457: 7-8, 1 January 2009) is right that no one metric alone can substitute for the expert evaluation of research performance (based on already-published, peer-reviewed research), because no single metric (including citation counts) is strongly enough correlated with expert judgments to take their place. However, some individual metrics (such as citation counts) are nevertheless significantly correlated with expert judgments; and it is likely that a battery of multiple metrics, used jointly, will be even more strongly correlated with expert judgments. That is the unique opportunity that the current UK Research Assessment Exercise (RAE) -- and our open, online age, with its rich spectrum of potential performance indicators -- jointly provide: the opportunity to systematically cross-validate a rich and diverse battery of candidate metrics of research productivity, performance and impact (including citations, co-citations, downloads, tags, growth/decay metrics, etc.) against expert judgments, field by field. The rich data provided by the 2008 RAE returns make it possible to do this validation exercise now, for all disciplines, on a nation-sized database. If successfully validated, the metric batteries can then not only pinch-hit for experts in future RAEs; they will also provide an open database that allows anyone, anywhere, at any time to do comparative evaluations of research performance: continuous assessment and answerability.
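To make the proposed cross-validation concrete, here is a minimal sketch on synthetic data (all numbers invented): within one field, correlate a single metric with expert rankings, then fit a jointly weighted battery of three metrics and compare the battery's multiple correlation against the single-metric correlation.

```python
# Sketch: cross-validating a battery of metrics against expert rankings,
# on synthetic data for a single field. All numbers are invented.
import numpy as np

rng = np.random.default_rng(0)
n = 200                          # research groups assessed in one field
expert = rng.normal(size=n)      # standardized expert (RAE-style) rankings

# Each candidate metric correlates imperfectly with expert judgment
citations   = 0.6 * expert + rng.normal(scale=0.8, size=n)
downloads   = 0.4 * expert + rng.normal(scale=0.9, size=n)
cocitations = 0.3 * expert + rng.normal(scale=1.0, size=n)
X = np.column_stack([citations, downloads, cocitations])

# Validity of one metric alone: its correlation with expert rankings
r_single = np.corrcoef(citations, expert)[0, 1]

# Validity of the jointly weighted battery: least-squares weights, then the
# correlation of the weighted combination with expert rankings (multiple R)
beta, *_ = np.linalg.lstsq(X, expert, rcond=None)
r_battery = np.corrcoef(X @ beta, expert)[0, 1]

print(f"citations alone: r = {r_single:.2f}")
print(f"weighted battery: R = {r_battery:.2f}")
```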
(Note that what is at issue is whether metrics can substitute for costly and time-consuming expert rankings in the retrospective assessment of published, peer-reviewed research. It is of course not peer review itself -- another form of expert judgment -- that metrics are being proposed to replace [or simplify and supplement], for either submitted papers or research proposals.)
Harnad, S. (2008) Validating Research Performance Metrics Against Peer Rankings. Ethics in Science and Environmental Politics 8(11) doi:10.3354/esep00088. Special issue: The Use and Misuse of Bibliometric Indices in Evaluating Scholarly Performance.
Stevan Harnad
American Scientist Open Access Forum