Preserv

       
Preserv 2 final report 'candid and realistic'
The final report from the Preserv 2 project has been described by the JISC programme manager responsible for funding the project, Neil Grindley, as "candid and realistic about the ... more
Project Partners

Oxford University Library Services
ECS, University of Southampton
The National Archives
Project Advisors
The British Library
Funded By
JISC

PRESERV 2 is funded by JISC within its capital programme in response to the September 06 call (Circular 04/06), Repositories and Preservation strand

PRESERV was originally funded by JISC within the 4/04 programme Supporting Digital Preservation and Asset Management in Institutions, theme 3: Institutional repository infrastructure development

MORE INFORMATION?

EMAIL: Steve Hitchcock, Project Manager

TEL: +44 (0)23 8059 3256
FAX: +44 (0)23 8059 2865

PRESERV Project,
IAM (Intelligence, Agents, Multimedia) Group,
Department of Electronics & Computer Science,
University of Southampton,
Highfield,
Southampton
SO17 1BJ, UK


Storage
       

Storage controllers: choosing between local disk and the network 'cloud'

Different types of storage services are available to repositories, from local disks to the distributed 'cloud', offering choices of scale, bandwidth and cost. As data volumes and data types grow, repositories are likely to choose a combination of services, or 'hybrid' storage, rather than adopt a single storage approach. Once there is more than one storage option, the copying and transfer of content from the repository to the chosen locations has to be managed. For this purpose Preserv has developed a storage controller for EPrints software.

Extended abstract From the Desktop to the Cloud: Leveraging Hybrid Storage Architectures in your Repository, updated April 2009, accepted for Open Repositories Conference 2009 (OR09), Atlanta, May
Introduces the EPrints storage controller, which allows repositories using this software to integrate with emerging network, storage and cloud services. For a less technical approach, see the furniture removal analogy in this blog commentary (February 2009) to help you understand these cloud-storage controller developments.

[Figure: EPrints storage controller architecture]

The EPrints storage controller has been successfully tested for storing content in Amazon S3/Cloudfront, and will be implemented and available from EPrints version 3.2 (availability tba). Two more storage plug-ins are available so far for the storage controller: the local storage plug-in, which also supports the legacy local disk layout, and a plug-in for the Sun STK5800 server. Find out how to write a storage plug-in.
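
To give a rough idea of what a pluggable storage layer involves, the following Python sketch shows one way a storage plug-in could look. It is an illustration only and not the EPrints plug-in API (EPrints itself is written in Perl): the names StoragePlugin, LocalDiskPlugin, store, retrieve, delete and demo are all hypothetical, and the flat one-file-per-object layout is an assumption made for brevity.

```python
import tempfile
from abc import ABC, abstractmethod
from pathlib import Path


class StoragePlugin(ABC):
    """Hypothetical plug-in interface; NOT the real EPrints API."""

    @abstractmethod
    def store(self, object_id: str, data: bytes) -> str:
        """Save the bytes and return a location string for them."""

    @abstractmethod
    def retrieve(self, object_id: str) -> bytes:
        """Return the bytes previously stored under object_id."""

    @abstractmethod
    def delete(self, object_id: str) -> None:
        """Remove the stored object."""


class LocalDiskPlugin(StoragePlugin):
    """Stores each object as one file in a single directory (assumed layout)."""

    def __init__(self, root: Path) -> None:
        self.root = Path(root)
        self.root.mkdir(parents=True, exist_ok=True)

    def _path(self, object_id: str) -> Path:
        # Assumes object_id is a plain file name without path separators.
        return self.root / object_id

    def store(self, object_id: str, data: bytes) -> str:
        path = self._path(object_id)
        path.write_bytes(data)
        return str(path)

    def retrieve(self, object_id: str) -> bytes:
        return self._path(object_id).read_bytes()

    def delete(self, object_id: str) -> None:
        self._path(object_id).unlink()


def demo() -> bytes:
    """Round-trip one object through a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        plugin = LocalDiskPlugin(Path(tmp))
        plugin.store("report.pdf", b"example bytes")
        return plugin.retrieve("report.pdf")
```

A plug-in for a network or cloud service would implement the same three methods against that service's own interface, so the repository code calling store or retrieve would not need to know where the bytes end up.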

Presentation From open storage to smart storage: enabling EPrints repository preservation (slides), Sun Preservation and Archiving Special Interest Group (PASIG) meeting, May 2008
First description of the storage controller for EPrints repository software. Supports a pluggable storage layer for repositories, providing the ability to store objects in different locations based on a set of rules, e.g. using metadata or type. For example, a generated thumbnail could be stored locally while the original image is stored in Amazon S3 and in a local archival server. Another example would be storing files of a certain size or classification offsite and sending these to a tape queue for backup.
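
The rule-based placement described above can be sketched in a few lines of Python. This is a hypothetical illustration, not the EPrints implementation: the class names, the destination labels ("local", "s3", "local_archive", "offsite", "tape_queue"), the 1 GB size threshold and the first-match-wins rule order are all assumptions chosen to mirror the two examples in the text.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple


@dataclass
class StoredObject:
    """Minimal metadata the rules look at (hypothetical fields)."""
    name: str
    mime_type: str
    size_bytes: int
    is_derived: bool = False  # e.g. a generated thumbnail


# A rule pairs a test on the object's metadata with a list of destinations.
Rule = Tuple[Callable[[StoredObject], bool], List[str]]


class StorageController:
    """Picks destinations for an object; the first matching rule wins."""

    def __init__(self, rules: List[Rule], default: List[str]) -> None:
        self.rules = rules
        self.default = default

    def destinations(self, obj: StoredObject) -> List[str]:
        for matches, targets in self.rules:
            if matches(obj):
                return targets
        return self.default


# Assumed rule set mirroring the examples in the text.
RULES: List[Rule] = [
    # Generated thumbnails stay on local disk.
    (lambda o: o.is_derived, ["local"]),
    # Original images go to Amazon S3 and a local archival server.
    (lambda o: o.mime_type.startswith("image/"), ["s3", "local_archive"]),
    # Very large files (over an assumed 1 GB) go offsite and to a tape queue.
    (lambda o: o.size_bytes > 1_000_000_000, ["offsite", "tape_queue"]),
]

controller = StorageController(RULES, default=["local"])
```

Because rules are checked in order, a derived thumbnail is kept local even though its type is also an image; reordering the list changes the policy without touching the controller.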

Listing Fedora Commons (the repository software) as a storage service in the slide above led to some confusion initially. Using one repository software package (EPrints) as a front-end to another (Fedora) offers intriguing possibilities, particularly in this case, where the two have complementary strengths in interfaces and data management. Keeping two repositories in sync when both are trying to perform similar operations could be tricky, but it is possible when dealing with the input of new items.

Alternative storage controllers

As part of Fedora Commons, the Akubra project is implementing a plug-in based storage abstraction.

DuraSpace, a joint DSpace/Fedora project, seeks to offer a commercial service to mediate between the respective repository software and storage services: "'DuraSpace' (is) a new web-based service that will allow institutions to easily distribute content to multiple storage providers, both 'cloud-based' and institution-based. The idea behind DuraSpace is to provide a trusted, value-added service layer to augment the capabilities of generic storage providers by making stored digital content more durable, manageable, accessible and sharable." Full DSpace press release.

For a perspective on early progress with DuraSpace from a Preserv team leader, see this blog entry (25 February 2009), which applauds the concept of durability for repository content and storage flexibility but is cautious about offering too many new services: "Let's do it in the cloud - but lets work really hard at articulating the benefits that the cloud end user will enjoy and stop relying on general talk about value-added services. I think researchers/end-users will forgive us for not having finished implementing something yet, but they won't forgive us for a lack of imagination."

This page produced and maintained by the PRESERV Project. Contact us