Monday, November 15, 2010
The Value of Shared Access and Reuse of Publicly Funded Scientific Data
A Public Symposium
Board on Research Data and Information
National Academy of Sciences
20 F Street Conference Center
Conference Room B, 20 F Street NW
Washington, DC
December 1, 2010, 2:00-4:15 p.m.
"The scientific community generates increasingly vast amounts of publicly funded digital data and information, and disseminates much of it online. The public investment in the production and management of such data resources in the United States alone is estimated to be several billions of dollars. Research communities within the United States and throughout the world have adopted different policies regarding whether or not to require publication of publicly funded data, how the research data and information created by individuals and projects are to be made available, and the terms under which that material may be reused by other parties. At the same time, there appears to be a broad recognition in both the public and private sectors of the importance of broad access to and reuse of publicly funded scientific data, not only for other researchers, but for the economy and society at large. The intangible social benefits of different types of scientific data are harder to measure, but they also can be very significant. They include educational, research, good-governance, and various other benefits that contribute directly and indirectly to improvement of the public welfare.
"At the same time, there are many legitimate reasons for not disclosing scientific data publicly – among them, the need to protect national security and law enforcement, personal privacy, proprietary interests, and confidentiality. Furthermore, many data sets are not sufficiently documented or organized, or of good enough quality, to make them useful to others. Questions about how to properly balance these competing interests and deficiencies in the preparation, access, and reuse of datasets remain unresolved, but will be addressed in the future work of the Board and elsewhere.
"Despite the huge public investments in generating and managing publicly funded data, and the even larger estimated downstream spillover effects of making it available, surprisingly little is known about the costs and benefits of open access and reuse on downstream research for our information society, and the knowledge economy. Many government agencies, academic organizations, and the research community generally are beginning to look into these issues in more depth.
"This public symposium will look at some of the research, economic, and social benefits that can be derived from providing online access to publicly-funded scientific data, as well as how such benefits can be evaluated, with a view to adding to that inquiry. The event will include presentations on the scientific data sharing and reuse policies of the federal government; compelling examples of the value of free online access and unfettered reuse of data; methods of assessing the value and effects on research, the economy, and society; and comments by Board members. The symposium is open to the public, but advance registration is requested (contact: Cheryl Levey, clevey@nas.edu or call 202-334-1531)."
Symposium Program
2:00 p.m. Opening remarks by the Board Chair
Michael Lesk
Rutgers University
2:10 Overview of scientific data sharing and reuse policies of the Federal government
[TBD], Interagency Working Group on Digital Data, OSTP *
* Not yet confirmed
2:30 Benefits of data sharing and reuse in policy research: case studies in environmental sciences
Rod Atkinson and Jan Johansson, Congressional Research Service
2:50 Benefits of data sharing and reuse in biomedical research: the Alzheimer’s Disease Neuroimaging Initiative
Neil S. Buckholtz
National Institute on Aging, NIH
3:10 Evaluating the effects of federal data programs
Carl Shapiro
U.S. Geological Survey
3:30 Evaluating the effects of open access to scientific data and literature
Heather Joseph
SPARC
Comments
3:50 Michael Carroll
Washington School of Law, American University
4:00 Paul David
Stanford Institute for Economic Policy Research, Stanford University
4:10 Stevan Harnad
Université du Québec à Montreal & University of Southampton
4:15 Concluding observations by Symposium Chair, Michael Lesk
4:20 End of Symposium
Some supplementary remarks on Open Access and Open Data (part of a brief pre-recorded video to be delivered at the above NAS symposium):
Open Access to Refereed Research Publications and Open Access to Research Data: A Crucial Strategic Distinction
Stevan Harnad
Canada Research Chair in Cognitive Sciences
Université du Québec à Montreal
&
School of Electronics and Computer Science
University of Southampton
(1) On the Open Access Impact Advantage for Refereed Research Reports
It has now been repeatedly demonstrated that refereed research articles that are made Open Access (OA) are used and cited significantly more in every scientific and scholarly field tested than those that are not made OA. It has now also been shown that this OA advantage is just as great for mandated OA as it is for self-selected OA. This means that the OA Advantage is not (as some have suggested) simply an artifact of selectively making higher-impact research open access: OA is the cause of the increased research impact. This finding greatly increases the importance and urgency of mandating OA for the sake of increasing and accelerating research uptake and progress.
Gargouri, Y., Hajjem, C., Lariviere, V., Gingras, Y., Brody, T., Carr, L. and Harnad, S. (2010) Self-Selected or Mandated, Open Access Increases Citation Impact for Higher Quality Research. PLOS ONE 5(10): e13636.
(2) On the Importance and Potential of Open Access Data-Archiving
Although there is not yet enough OA data to be able to demonstrate that the same kind of impact benefits will be generated by OA to research data as those that have been demonstrated for OA to research articles, it is highly probable that that will prove to be the outcome. Moreover, the impact benefits of making research articles OA, and the rich new means of measuring research usage and impact that OA is generating will also serve as incentives to encourage researchers to provide OA to both their articles and their data.
Brody, T., Carr, L., Gingras, Y., Hajjem, C., Harnad, S. and Swan, A. (2007) Incentivizing the Open Access Research Web: Publication-Archiving, Data-Archiving and Scientometrics. CTWatch Quarterly, 3 (3).
(3) On the Crucial Differences Between Research Archiving and Data-Archiving -- And Why Immediate Data-Archiving Cannot be Mandated
There is, however, a crucial difference between providing OA to research articles and providing OA to data: Scientists and scholars are not primarily data-gatherers. They gather data in order to data-mine, analyze, interpret and build further findings, theories and applications on it. Hence (except in the rare cases where the data speak for themselves), researchers cannot be expected (or mandated) to make their data OA immediately upon having collected or generated it, for all other researchers to data-mine and analyze. Researchers must be given sufficient time to data-mine their data, having invested the time and effort into collecting or generating it. And the length of the fair embargo interval on Open Access to data will vary depending on the nature of the data and the time, effort and ingenuity required to collect or generate it. This is fundamentally different from the case of refereed research reports, for which there is no justification whatsoever for embargoing Open Access once the paper has been peer-reviewed and accepted for publication.
Hence providing OA to refereed research reports can and should be mandated by researchers' institutions and funders, immediately upon acceptance for publication. Such immediate OA mandates cannot, however, be simplistically extended to research data (nor to unrefereed preprints of research reports) without generating the risk of needless and counterproductive conflicts of interest with the researchers who gathered the data. OA data-archiving, as soon as possible, should be strongly encouraged; in some cases embargo length limits can be set. But it cannot and should not be mandated (except in very special cases where the data-gathering itself is the research that is being funded).
OA, OA self-archiving, OA publishing, and data archiving
Open Access and Open Data
On Not Conflating Open Data (OD) With Open Access (OA)
More on Potential Conflict of Interest with Open Data (OD) Mandates
Don't Risk Getting Less By Needlessly Demanding More
How Green Open Access Supports Text- and Data-Mining
On Patience, and Letting (Human) Nature Take Its Course
Open Access: What Comes With the Territory
Stevan Harnad
American Scientist Open Access Forum