Wednesday, June 21. 2006
SUMMARY: The conversion of the UK Research Assessment Exercise (RAE) from the present costly, wasteful exercise to time-saving and cost-efficient metrics is timely and welcome, but it is critically important not to bias the outcome by restricting the metric to prior research funding. Otherwise, this will merely generate a Matthew Effect -- a self-fulfilling prophecy -- and the RAE will no longer be a semi-independent funding source but merely a multiplier effect on prior research funding. Open Access will provide a rich digital database from which to harvest a broad and diverse spectrum of metrics which can then be weighted and adapted to each discipline.
Let 1000 RAE Metric Flowers Bloom:
Avoid Matthew Effect as Self-Fulfilling Prophecy
Stevan Harnad
The conversion of the UK Research Assessment Exercise (RAE) from the present costly, wasteful exercise to time-saving and cost-efficient metrics is welcome, timely, and indeed long overdue, but the worrying thing is that the RAE planners currently seem to be focused on just one metric -- prior research funding -- instead of the full and rich spectrum of new (and old) metrics that will become available in an Open Access world, with all the research performance data digitally available online for analysis and use.
Mechanically basing the future RAE rankings exclusively on prior funding would just generate a Matthew Effect (making the rich richer and the poor poorer), a self-fulfilling prophecy that is simply equivalent to increasing the amount given to those who were previously funded (and scrapping the RAE altogether, as a separate, semi-independent performance evaluator and funding source).
What the RAE should be planning to do is to look at weighted combinations of all available research performance metrics -- including the many that are correlated, but not so tightly correlated, with prior RAE rankings, such as author/article/book citation counts, article download counts, co-citations (co-cited with and co-cited by, weighted with the citation weight of the co-citer/co-citee), endogamy/exogamy metrics (citations by self or collaborators versus others, within and across disciplines), hub/authority counts (in-cites and out-cites, weighted recursively by the citation's own in-cite and out-cite counts), download and citation growth rates, semantic-web correlates, etc.
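To make the idea of a weighted metric battery concrete, here is a minimal sketch in Python. It is purely illustrative: the metric names, baseline values and weights are hypothetical assumptions, not anything the RAE has adopted.

```python
# A minimal sketch (not the RAE's actual method) of combining several
# research performance metrics, each normalised against a discipline
# baseline, into one weighted score. All numbers are hypothetical.

def combined_score(metrics: dict[str, float],
                   weights: dict[str, float],
                   norms: dict[str, float]) -> float:
    """Weighted sum of metrics, each normalised by a discipline baseline."""
    score = 0.0
    for name, weight in weights.items():
        value = metrics.get(name, 0.0)
        baseline = norms.get(name, 1.0) or 1.0  # avoid division by zero
        score += weight * (value / baseline)
    return score

# Hypothetical departmental profile and discipline-specific weighting.
metrics = {"citations": 1200, "downloads": 15000, "prior_funding": 2.5e6}
weights = {"citations": 0.5, "downloads": 0.3, "prior_funding": 0.2}
norms   = {"citations": 1000, "downloads": 10000, "prior_funding": 2.0e6}

print(round(combined_score(metrics, weights, norms), 3))  # 1.3
```

The point of the sketch is simply that prior funding enters as one weighted term among many, with the weights free to vary by discipline and to be validated against one another.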
It would be both arbitrary and absurd to blunt the potential sensitivity, power, predictivity and validity of metrics a priori, by biasing them toward the prior-funding metric alone. Prior funding should be just one of a full battery of weighted metrics, adjusted to each discipline and validated against one another (and against human judgment too).
Shadbolt, N., Brody, T., Carr, L. and Harnad, S. (2006) The Open Research Web: A Preview of the Optimal and the Inevitable. In: Jacobs, N. (Ed.) Open Access: Key Strategic, Technical and Economic Aspects, chapter 21. Chandos.
Pinfield, S. (2006) UK plans research funding overhaul. The Scientist, 21 June 2006.
Stevan Harnad
American Scientist Open Access Forum
Saturday, June 17. 2006
SUMMARY: In addition to depositing in their institutional repositories the metadata plus the full-texts of their journal articles, researchers should also deposit the metadata plus the cited-reference lists of their books. This will allow the book citations to be harvested webwide and citation-linked, exactly as article citations will be, thereby providing book citation-impact metrics for book-based disciplines, alongside the usual journal-article citation-impact metrics.
For all disciplines -- but especially for disciplines that are more book-based than journal-article-based -- it would be highly beneficial for authors to self-archive in their institutional repositories the metadata as well as the cited-reference lists (bibliographies) for the books they publish annually. That way, next-generation scientometric search engines like citebase will be able to harvest and link their reference lists (exactly as they do the reference lists of articles whose full texts have been self-archived). This will generate a book citation impact metric.
Books cite and are cited by books; moreover, books cite articles and are cited by articles. It is already possible to scrape together a rudimentary book-impact index from Thomson ISI's Web of Knowledge along with data from Google Books and Google Scholar, but a worldwide Open Access database, across all disciplines, indexing all the article output as well as the book output self-archived in all the world's institutional repositories could do infinitely better than that:
All that's needed is for authors' institutions and funders to mandate institutional (author) self-archiving of (1) the metadata and full-texts of all their article output along with (2) the metadata and reference lists of all their book output.
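As a rough illustration of what a scientometric harvester could then do with those deposits, here is a minimal sketch of linking deposited cited-reference lists to deposited metadata and tallying book and article citation counts. The record structure and the crude normalised-title matching rule are hypothetical assumptions; real services such as citebase use far more robust matching.

```python
# A minimal sketch of citation-linking deposited reference lists against
# deposited book/article metadata. Matching is a naive normalised-title
# lookup; all identifiers and titles below are made up for illustration.

import re

def normalise(title: str) -> str:
    """Lowercase a title and strip punctuation for approximate matching."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()

def link_citations(deposits: list[dict],
                   reference_lists: dict[str, list[str]]) -> dict[str, int]:
    """Count citations received by each deposited item.

    deposits:        metadata records, each with an 'id' and a 'title'
    reference_lists: citing item id -> list of cited reference titles
    """
    by_title = {normalise(d["title"]): d["id"] for d in deposits}
    counts = {d["id"]: 0 for d in deposits}
    for refs in reference_lists.values():
        for ref in refs:
            target = by_title.get(normalise(ref))
            if target is not None:
                counts[target] += 1
    return counts

deposits = [{"id": "book:001", "title": "A Hypothetical Monograph on Divinity"},
            {"id": "article:042", "title": "An Example Journal Article"}]
reference_lists = {"article:099": ["A hypothetical monograph on divinity",
                                   "Some uncatalogued work"]}
print(link_citations(deposits, reference_lists))
# {'book:001': 1, 'article:042': 0}
```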
We can even do better than that, because although many book authors may not wish to make their books' full-texts Open Access (OA), they can still deposit their books' full-texts in their institutional repositories and set access as Closed Access -- accessible only to scientometric full-text harvesters and indexers (like Google Books) for full-text inversion, Boolean search, and semiometric analysis (text endogamy/exogamy, text-overlap, text similarity/proximity, semantic lineage, latent semantic analysis, etc.) -- without making the full-text itself OA to individual users (i.e., potential book-buyers) if they do not wish to.
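As one toy example of such semiometric measures, the sketch below computes a simple bag-of-words cosine similarity between two texts. Real text-overlap and latent-semantic analyses are of course far more sophisticated; nothing here is specific to any particular harvester.

```python
# A minimal sketch of one "semiometric" measure: cosine similarity between
# two texts represented as bags of words. Illustrative only.

import math
from collections import Counter

def cosine_similarity(text_a: str, text_b: str) -> float:
    """Similarity in [0, 1] between two texts based on shared word counts."""
    a, b = Counter(text_a.lower().split()), Counter(text_b.lower().split())
    dot = sum(a[w] * b[w] for w in set(a) & set(b))
    norm = math.sqrt(sum(v * v for v in a.values())) * \
           math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0

print(round(cosine_similarity("open access research impact",
                              "research impact of open access"), 2))  # 0.89
```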
This will help provide the UK's new metrics-based Research Assessment Exercise (RAE) with research performance indicators better suited for the disciplines whose research is not as journal-article- (and conference-paper-) based as that of the physical, biological and engineering sciences.
Carr, L., Hitchcock, S., Oppenheim, C., McDonald, J.W., Champion, T. and Harnad, S. (2006) Can journal-based research impact assessment be generalised to book-based disciplines? (Research Proposal)
SUMMARY: The "impact" of academic research is typically measured by how much it is read, used and cited, and by how much new work it generates and influences. Services that measure impact work well for journal-based disciplines. Book-based disciplines can now benefit from online tools and methods of impact analysis too.These analyses also predict fruitful directions for future research, and so can inform research assessment and funding. The present research project will extend tools for online bibliometric data collection of publications and their citations with the aim of testing and evaluating new Web metrics to assist research assessment in book-based disciplines. Stevan Harnad
American Scientist Open Access Forum
Friday, June 16. 2006
SUMMARY: Larry Hurtado (Divinity, Edinburgh) suggests that a metrics-based alternative to the panel-based UK Research Assessment Exercise (RAE) may not be appropriate for some disciplines because their research is not journal-article-based, and hence citation-impact metrics are not valid. I reply that all disciplines have research output, and its usage and impact can be objectively measured. For example, book citation-impact can be measured too, if authors deposit their books' reference lists in their institutional repositories. Moreover, the RAE is just one of the two components of the UK dual research funding system. The other, far larger component, continues to be individual research proposal funding, adjudicated by peer review. Metrics are merely a supplement to -- not a substitute for -- peer review, but published articles and books have already been peer reviewed, hence there is no need for RAE panels to try to repeat the exercise.
On Wed, 14 Jun 2006, Larry Hurtado, Department of Divinity, University of Edinburgh, wrote in the American Scientist Open Access Forum:

LH: "Stevan Harnad is totally in favour of a "metrics based" approach to judging research merit with a view toward funding decisions, and greets the news of such a shift from past/present RAE procedure with unalloyed joy."

No, metrics are definitely not meant to serve as the basis for all or most research funding decisions: research proposals, as noted, are assessed by peer review.
Metrics are intended for the other component in the UK dual funding system, in which, in addition to directly funded research, based on competitive peer review of research bids, there is also a smaller, secondary (but prestigious) top-slicing system, the Research Assessment Exercise (RAE). It is the RAE that needed to be converted to metrics from the absurd, wasteful and costly juggernaut that it used to be.

LH: "Well, hmmm. I'm not so sure (at least not yet). Perhaps there is more immediate reason for such joy in those disciplines that already rely heavily on a metrics approach to making decisions about researchers."

No discipline uses metrics systematically yet; moreover, many metrics are still to be designed and tested. However, the only thing "metrics" really means is: the objective measurement of quantifiable performance indicators. Surely all disciplines have measurable performance indicators. Surely it is not true of any discipline that the only way, or the best way, to assess all of its annual research output is by having each piece individually re-reviewed after it has already been peer-reviewed twice -- before execution, by a funding council's peer-reviewers as a research proposal, and after execution, by a journal's referees as a research publication.

LH: "In the sciences, and also now social sciences, there are citation-services that count publications and citations thereof in a given list of journals deemed the "canon" of publication venues for a given discipline. And in these disciplines journal articles are deemed the main (perhaps sole) mode of research publication. Ok. Maybe it'll work for these chaps."

First, with an Open Access database, there need be no separate "canon": articles in any of the world's 24,000 peer-reviewed journals and congresses can count -- though some will (rightly) count for more than others, based on the established and known quality standards and impact of the journal in which they appeared (this too can be given a metric weight). Alongside the weighted impact factor of the journal, there will be the citation counts for each article itself, its author, the co-citations in and out, the download counts, the hub/authority weights, the endogamy/exogamy weights, etc.
All these metrics (and many more) will be derivable for all disciplines from an Open Access database (no longer just restricted to ISI's Web of Knowledge).
That includes, by the way, citations of books by journal articles -- and also citations of books and journal articles by books, because although most book authors may not wish to make their books' full-texts OA, they can and should certainly make their books' bibliographic metadata, including their bibliography of cited references, OA. Those book-impact metrics can then be added to the metric harvest, citation-linked, counted, and duly weighted, along with all the other metrics.
There are even Closed-Access ways of self-archiving books' digital full-texts (such as Google Book Search) so they can be processed for semiometric analysis (endogamy/exogamy, content overlap, proximity, lineage, chronometric trends) by harvesters that do not make the full text available openly. All disciplines can benefit from this.

LH: "But I'd like to know how it will work in Humanities fields such as mine. Some questions, for Stevan or whomever. First, to my knowledge, there is no such citation-count service in place. So, will the govt now fund one to be set up for us? Or how will the metrics be compiled for us? I.e., there simply is no mechanism in place for doing "metrics" for Humanities disciplines."

All the government needs to do is to mandate the self-archiving of all UK research output in each researcher's own OAI-compliant institutional (or central) repository. (The US and the rest of Europe will shortly follow suit, once the prototype policy model is at long last adopted by a major player!) The resulting worldwide interoperable database will be the source of all the metric data, and a new generation of scientometric and semiometric harvesters and analysers will quickly be spawned to operate on it, mining it to extract the rich new generation of metrics.
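For readers curious what "OAI-compliant" buys the harvesters in practice, here is a minimal sketch of pulling Dublin Core metadata from a repository's OAI-PMH interface. The endpoint URL is a placeholder; a real harvester would also follow resumption tokens, handle errors, and harvest incrementally by datestamp.

```python
# A minimal sketch of harvesting Dublin Core metadata from an OAI-PMH
# repository. The base URL below is a hypothetical placeholder.

import urllib.request
import xml.etree.ElementTree as ET

DC = "{http://purl.org/dc/elements/1.1/}"

def harvest_titles(base_url: str) -> list[str]:
    """Fetch one page of ListRecords and return the dc:title of each record."""
    url = base_url + "?verb=ListRecords&metadataPrefix=oai_dc"
    with urllib.request.urlopen(url) as response:
        tree = ET.parse(response)
    return [t.text for t in tree.getroot().iter(DC + "title") if t.text]

# Hypothetical repository endpoint:
# print(harvest_titles("https://eprints.example.ac.uk/cgi/oai2"))
```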
There is absolutely nothing exceptional about the humanities (as long as book bibliographies are self-archived too, alongside journal-article full-texts). Research uptake and usage is a generic indicator of research performance, and citations and downloads are generic indicators of research uptake and usage. The humanities are no different in this regard. Moreover, inasmuch as OA also enhances research uptake and usage itself, the humanities stand to benefit from OA, exactly like the other disciplines.

LH: "Second, for us, journal articles are only one, and usually not deemed the primary/preferred, mode of research publication. Books still count quite heavily. So, if we want to count citations, will some to-be-imagined citation-counting service/agency comb through all the books in my field as well as the journal articles to count how many of my publications get cited and how often? If not, then the "metrics" will be so heavily flawed as to be completely misleading and useless."

All you need to do is self-archive your books' metadata and cited-reference lists, along with all your journal articles, in your OAI-compliant institutional repository. The scientometric search engines -- like citebase, citeseer, Google Scholar, and more to come -- will take care of all the rest. If you want to do even better, scan in, OCR and self-archive the legacy literature too (the journal articles, plus the metadata and cited-reference lists of books of yore). And if you are worried about variations in reference-citing styles: don't worry! Just get the digital texts in, and algorithms can start sorting them out and improving themselves.

LH: "Third, in many sciences, esp. natural and medical sciences, research simply can't be conducted without significant external funding. But in many/most Humanities disciplines truly groundbreaking and highly influential research continues to be done without much external funding."

So what is your point? That the authors of unfunded research, uncoerced by any self-archiving mandate, will not self-archive? Don't worry. They will. They may not be the first ones, but they will follow soon afterwards, as the power and potential of self-archiving to measure as well as to accelerate and increase research impact and progress become more and more manifest.

LH: "(Moreover, no govt has yet seen fit to provide funding for the Humanities constituency of researchers commensurate with that available for Sciences. So, it's a good thing we don't have to depend on such funding!)"

Funding grumbles are a worthy topic, but they have nothing whatsoever to do with OA and the benefits of self-archiving, or metrics.

LH: "My point is that the "metrics" for the Humanities will have to be quite a bit different in what is counted, at the very least."

No doubt. And the metrics used, and their weights, will be adjusted accordingly. But metrics they will be. No exceptions there. And no regression back to either human re-evaluation or delphic oracles: objective, countable performance indicators (for the bulk of research output; of course, for special prizes and honours, individual human judgment will have to be re-invoked, in order to compare like with like, individually).

LH: "Fourth, I'm not convinced (again, not yet; but I'm open to persuasion) that counting things = research quality and impact. Example: A number of years ago, coming from a tenure meeting at my previous University I ran into a colleague in Sociology.
He opined that it was unnecessary to labour over tenure, and that he needed only two pieces of information: number of publications and number of citations. I responded, "I have two words for you: Pons and Fleischmann". Remember these guys? They were cited in Time and Newsweek and everywhere else for a season as discoverers of "cold fusion". And over the next couple of years, as some 50 or so labs tried unsuccessfully to replicate their alleged results, they must have been among the most frequently-cited guys in the business. And the net effect of all that citation was to discredit their work. So, citation = "impact". Well, maybe, but in this case "impact" = negative impact. So, are we really so sure of "metrics"?"

Not only do citations have to be weighted, as they can and will be, recursively, by the weight of their source (Proceedings of the Royal Society vs. The Daily Sun, citations from Nobel Laureates vs. citations from uncited authors), but semiometric algorithms will even begin to have a go at sorting positive citations from negative ones, disinterested ones from endogamous ones, etc. Are you proposing to defer to individual expert opinion in some (many? most? all?) cases, rather than using a growing wealth and diversity of objective performance indicators? Do you really think it is harder to find individual cases of subjective opinion going wrong than of objective metrics going wrong?

LH: "Perhaps, however, Stevan can help me see the light, and join him in acclaiming the advent of metrics."

I suggest that the best way to see the light on the subject of Open Access digitometrics is to start self-archiving and sampling the (few) existing digitometric engines, such as citebase. You might also wish to have a look at the chapter I recommended (no need to buy the book: it's OA: just click!):

Shadbolt, N., Brody, T., Carr, L. and Harnad, S. (2006) The Open Research Web: A Preview of the Optimal and the Inevitable. In: Jacobs, N. (Ed.) Open Access: Key Strategic, Technical and Economic Aspects, chapter 21. Chandos.

Stevan Harnad
American Scientist Open Access Forum