Preserv

       
Project Partners

Oxford University Library Services
ECS, University of Southampton
The National Archives
Project Advisors
The British Library
Funded By
JISC

PRESERV 2 is funded by JISC within its capital programme in response to the September 06 call (Circular 04/06), Repositories and Preservation strand

PRESERV was originally funded by JISC within the 4/04 programme Supporting Digital Preservation and Asset Management in Institutions, theme 3: Institutional repository infrastructure development

MORE INFORMATION?

EMAIL: Steve Hitchcock, Project Manager

TEL: +44 (0)23 8059 3256
FAX: +44 (0)23 8059 2865

PRESERV Project,
IAM (Intelligence, Agents, Multimedia) Group,
Department of Electronics & Computer Science,
University of Southampton,
Highfield,
Southampton
SO17 1BJ, UK


Preservation Services
       

Is the file at risk?

Once the files have been classified, this information can be used to assess the risks associated with each file type and version. As mentioned previously, one of the easiest checks is to ensure that the file conforms to its format specification and does not contain errors that are likely to cause it to fail when used in the future. Examples where this can be applied include:
  • XML-based files (XML, HTML, RDF)
    • These have a specified type header in the file that can be used in conjunction with a parser to verify the structure of the contents.
  • Project source code
    • For tagged releases it should be checked that the source code can be compiled.
  • LaTeX source files
    • These are similar to project source code, but produce documents rather than programs. They often contain comments and mark-up that are not present in the final version, much like DOC files that have margin comments.
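The first check in the list above can be sketched in a few lines of Python. This is a minimal illustration, not the PRESERV implementation: the function name `is_well_formed_xml` and the sample records are hypothetical, and the check only tests well-formedness with the standard library parser. It does not validate against a schema, and HTML that is not written as XML would be rejected.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(data: bytes) -> bool:
    """Return True if the bytes parse as well-formed XML, False otherwise."""
    try:
        ET.fromstring(data)
    except ET.ParseError:
        # The parser rejected the content: unclosed tags, no root element, etc.
        return False
    return True


# Hypothetical sample inputs: the second has an unclosed <title> element.
good = b'<?xml version="1.0" encoding="UTF-8"?><record><title>Report</title></record>'
bad = b'<?xml version="1.0" encoding="UTF-8"?><record><title>Report</record>'

print(is_well_formed_xml(good))
print(is_well_formed_xml(bad))
```

A file that fails this check is a candidate for closer inspection, since a future parser is likely to fail on it in the same way.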

Although checks can be performed on individual files, this is likely to be unrealistic for archives ingesting huge amounts of content in a short time. In this case we need to look at grouping the files to provide a risk analysis based on the features of each set of files.

Grouping was initiated using DROID file identification. DROID not only classifies files by application but also shows the format version. With this specific information, the functionality of the PRONOM registry can be used to add features that allow a risk score to be retrieved for each format type and version.
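The grouping and lookup described above might look something like the following sketch. Everything in it is an assumption made for illustration: the `(file_name, format_id)` pairs stand in for the output of a file identification tool, the identifiers are PUID-style strings, and the `RISK_SCORES` table is a hypothetical local stand-in with invented values. It does not call DROID or PRONOM.

```python
from collections import defaultdict

# Hypothetical risk scores keyed by a PUID-style format identifier.
# The values are invented for illustration; they are not taken from PRONOM.
RISK_SCORES = {
    "fmt/18": 1,
    "fmt/40": 4,
}


def group_by_format(identified):
    """Group file names by format identifier.

    `identified` is an iterable of (file_name, format_id) pairs, standing in
    for the output of a file identification tool.
    """
    groups = defaultdict(list)
    for file_name, format_id in identified:
        groups[format_id].append(file_name)
    return dict(groups)


def risk_for_groups(groups, scores):
    """Map each format identifier to its risk score, or None if unknown."""
    return {format_id: scores.get(format_id) for format_id in groups}


# Hypothetical identification results for four deposited files.
identified = [
    ("thesis.pdf", "fmt/18"),
    ("paper.pdf", "fmt/18"),
    ("notes.doc", "fmt/40"),
    ("data.bin", "x-unknown/0"),
]

groups = group_by_format(identified)
risks = risk_for_groups(groups, RISK_SCORES)
for format_id, files in groups.items():
    print(format_id, len(files), risks[format_id])
```

Scoring each group once, rather than each file, is what keeps the analysis practical when large amounts of content are ingested at once.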

The figure below shows the same EPrints classification screen as before, except that this time it also shows the risk score for each format. Using thresholds, which can be adjusted by the repository administrator, the file formats are grouped into traffic-light-style categories that clearly show the user the risks related to all of their objects, including those where no risk has been found.
[Figure: EPrints Format & Risk Classification. EPrints format classification screen showing risk categories.]
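The threshold-based grouping into traffic-light categories can be illustrated with a short sketch. The function name `traffic_light`, the default threshold values, and the convention that a higher score means a higher risk are all assumptions for this example; the actual scale and defaults used in EPrints are not stated here.

```python
def traffic_light(score, amber_threshold=3, red_threshold=6):
    """Map a numeric risk score to a traffic-light category.

    Assumes higher scores mean higher risk. Both thresholds are hypothetical
    defaults that a repository administrator would adjust.
    """
    if amber_threshold > red_threshold:
        raise ValueError("amber_threshold must not exceed red_threshold")
    if score >= red_threshold:
        return "red"
    if score >= amber_threshold:
        return "amber"
    # Formats below both thresholds are still reported, as "green".
    return "green"


# Hypothetical scores for three format groups.
for score in (1, 4, 8):
    print(score, traffic_light(score))

# An administrator lowering the red threshold changes the grouping.
print(traffic_light(4, amber_threshold=2, red_threshold=4))
```

Keeping the thresholds as parameters rather than constants reflects the point that the administrator, not the scoring service, decides where the category boundaries fall.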


This page produced and maintained by the PRESERV Project. Contact us