Ferrara, Emilio and Baumgartner, Robert (2011) Intelligent Self-Repairable Web Wrappers. [Conference Paper]
There is a more recent version of this eprint available.
Full text available as: PDF - Published Version
Abstract
The amount of information available on the Web is growing at an incredibly high rate. Systems and procedures for extracting these data from Web sources already exist, and different approaches and techniques have been investigated in recent years. On the one hand, reliable solutions should provide robust Web data mining algorithms that can automatically cope with possible malfunctions or failures. On the other hand, the literature lacks solutions for the maintenance of these systems. Procedures that extract Web data may be tightly coupled to the structure of the data source itself; thus, malfunctions or the acquisition of corrupted data can be caused, for example, by structural modifications that owners make to their data sources. At present, data integrity verification and maintenance are mostly handled manually to ensure that these systems work correctly and reliably. In this paper we propose a novel approach for creating procedures that extract data from Web sources (so-called Web wrappers) that can cope with malfunctions caused by modifications to the structure of the data source and can automatically repair themselves.
Item Type: | Conference Paper |
---|---|
Additional Information: | ISSN: 0302-9743 |
Subjects: | Computer Science > Artificial Intelligence |
ID Code: | 7648 |
Deposited By: | Ferrara, Dr. Emilio |
Deposited On: | 01 Oct 2011 00:34 |
Last Modified: | 01 Oct 2011 00:34 |
Available Versions of this Item
- Intelligent Self-Repairable Web Wrappers. (deposited 01 Oct 2011 00:34) [Currently Displayed]