University of Southampton OCS (beta), CAA 2012

Archaeology in broad strokes: collating data for England from 1500 BC to AD 1086
Chris Green, Chris Gosden, Zena Kamash, Letty Ten Harkel, Xin Xiong, John Pybus

Last modified: 2011-12-18

Abstract


Landscape and Identities: the case of the English Landscape 1500 BC - AD 1086 (EngLaID) is a five-year, ERC-funded project at the University of Oxford that began in the second half of 2011.

The central concept of EngLaID is to bring together as many large-scale spatial datasets as possible in order to learn about identity and change in the English landscape from the Middle Bronze Age until Domesday. These include English Heritage’s National Mapping Programme (NMP) data, data collected from England’s Historic Environment Records (HERs), data collected under the Portable Antiquities Scheme (PAS), and several other period and thematic datasets.

All of these datasets are recorded in different ways and to differing levels of spatial and categorical precision, and recording methods vary even within each broad grouping. Combining such disparate data within a single analytical environment is therefore a considerable challenge. The project is aided in this task by semantic web, linked data, and GIS technologies. Eventually, the intention is to publish as much of the collected data as possible in an accessible web-based format.

Beyond their differences in format, these datasets are also all very large: for example, the English Heritage database for the south east of England alone contains over 86,000 records. This scale brings, amongst other issues, particular difficulties in terms of computer processing power, error checking, and data duplication.

This paper will outline the scope of EngLaID and discuss the challenges encountered to date (i.e. by March 2012), particularly with regard to the GIS implementation of the NMP, HER and PAS datasets. It will also discuss some of our initial ideas about how we might publish the final outcomes online, including how we might handle the different levels of reporting permitted by each data provider and how we might cope with the sheer size of these datasets in terms of data resupply.

We also intend to submit a further paper on some of the semantic web and linked data technologies used and built for EngLaID.


Keywords


GIS; large datasets; semantic web; linked data; NMP; HER; PAS