AKT EPrint Archive

Developing a Service-Oriented Architecture to Harvest Information for the Semantic Web

Norton, Mr Barry and Chapman, Mr Sam and Ciravegna, Prof Fabio (2004) Developing a Service-Oriented Architecture to Harvest Information for the Semantic Web. In Proceedings First AKT Workshop on Semantic Web Services (AKT-SWS04), KMi, The Open University, Milton Keynes, UK.

Armadillo is a tool that provides automatic annotation for the Semantic Web, using unannotated resources such as the existing Web for information harvesting; that is, it combines a crawling mechanism with an extensible architecture for ontology population. The latter is achieved via largely unsupervised machine learning, bootstrapped from oracles such as web-site wrappers and backed up by 'evidential reasoning', which allows evidence to be gained from the redundancy in the Web and allows the inaccuracies in information, also characteristic of today's Web, to be circumvented. In this paper we sketch how the Armadillo architecture has been reinterpreted as workflow templates that compose semantic web services, and show how the porting of Armadillo to new domains, and the application of new tools, has thus been simplified.
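
The idea of gaining evidence from redundancy can be illustrated with a small sketch. The abstract does not describe how Armadillo combines evidence, so everything here is an assumption made for illustration only: the noisy-OR combination rule, the function names combine_evidence and accept, the 0.8 threshold, and the sample confidences.

```python
from collections import defaultdict


def combine_evidence(observations):
    """Combine independent sightings of each candidate fact with noisy-OR.

    observations: iterable of (candidate, confidence) pairs, confidence in [0, 1].
    Noisy-OR is an illustrative assumption, not the rule used by Armadillo.
    """
    disbelief = defaultdict(lambda: 1.0)
    for candidate, confidence in observations:
        # Each extra sighting shrinks the chance that every source is wrong.
        disbelief[candidate] *= 1.0 - confidence
    return {candidate: 1.0 - d for candidate, d in disbelief.items()}


def accept(observations, threshold=0.8):
    """Return the candidates whose combined score reaches the (hypothetical) threshold."""
    scores = combine_evidence(observations)
    return sorted(c for c, s in scores.items() if s >= threshold)


if __name__ == "__main__":
    # Hypothetical extractions: a weakly supported fact seen on three pages
    # versus a moderately supported fact seen on one page.
    sightings = [("fact A", 0.5), ("fact A", 0.5), ("fact A", 0.5), ("fact B", 0.6)]
    print(combine_evidence(sightings))
    print(accept(sightings))
```

Under these assumptions, a fact seen weakly on several pages can outrank a fact seen more strongly on a single page, which is the sense in which redundancy compensates for inaccurate individual sources.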

Keywords: Armadillo, SWS, Semantic Web Services, BPEL, Architecture, Annotation, Sheffield, Information Extraction, Information Integration, IE, II
Subjects: AKT Challenges > Knowledge acquisition
          AKT Challenges > Knowledge maintenance
ID Code: 376
Deposited By: Chapman, Mr Sam
Deposited On: 28 January 2005
Alternative Locations: http://www.dcs.shef.ac.uk/~sam/papers/aktsws04.pdf