Ferrara, Emilio and Baumgartner, Robert (2010) Automatic Wrapper Adaptation by Tree Edit Distance Matching. [Book Chapter]
Full text available as: PDF, Published Version (340Kb)
Available under License Creative Commons Attribution Non-commercial No Derivatives.
Abstract
The amount of information distributed through the Web keeps growing day by day, and for this reason several techniques for extracting Web data have been proposed in recent years. Extraction tasks are often performed by so-called wrappers, procedures that extract information from Web pages, e.g. by implementing logic-based techniques. Many application fields today require wrappers to be highly robust, so as not to compromise information assets or the reliability of the extracted data. Unfortunately, a wrapper may fail to extract data from a Web page if the page's structure changes, sometimes even slightly. Such failures call for new techniques that automatically adapt the wrapper to the new structure of the page. In this work we present a novel approach to automatic wrapper adaptation based on measuring the similarity of trees through improved tree edit distance matching techniques.
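The abstract does not spell out the matching algorithm, so the following is only a minimal sketch of the general idea of scoring the similarity of two page trees. It uses the classic Simple Tree Matching algorithm as a stand-in, not the authors' improved technique. The names `Node`, `simple_tree_matching` and `similarity`, the normalization formula, and the toy page trees are all assumptions made for illustration.

```python
from __future__ import annotations
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical minimal DOM-like node: a tag name plus ordered children."""
    tag: str
    children: list[Node] = field(default_factory=list)


def simple_tree_matching(a: Node, b: Node) -> int:
    """Number of node pairs in a maximum top-down, order-preserving matching."""
    if a.tag != b.tag:
        return 0  # roots differ, so nothing below them can be matched
    m, n = len(a.children), len(b.children)
    # best[i][j]: best score using the first i children of a and first j of b
    best = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            best[i][j] = max(
                best[i - 1][j],
                best[i][j - 1],
                best[i - 1][j - 1]
                + simple_tree_matching(a.children[i - 1], b.children[j - 1]),
            )
    return best[m][n] + 1  # +1 for the matched roots


def size(t: Node) -> int:
    """Total number of nodes in the tree."""
    return 1 + sum(size(c) for c in t.children)


def similarity(a: Node, b: Node) -> float:
    """Assumed normalization: matched nodes relative to the mean tree size."""
    return 2 * simple_tree_matching(a, b) / (size(a) + size(b))


# Toy example: the new page version gains a <span> between <h1> and <p>.
old_page = Node("html", [Node("body", [Node("div", [Node("h1"), Node("p")])])])
new_page = Node(
    "html", [Node("body", [Node("div", [Node("h1"), Node("span"), Node("p")])])]
)
score = similarity(old_page, new_page)  # 2 * 5 / (5 + 6)
```

Under these assumptions, a wrapper adaptation step could compare the subtree the wrapper originally targeted against candidate subtrees of the changed page and keep the candidates whose score exceeds a chosen threshold.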
Item Type: | Book Chapter |
---|---|
Subjects: | Computer Science > Artificial Intelligence |
ID Code: | 7642 |
Deposited By: | Ferrara, Dr. Emilio |
Deposited On: | 01 Oct 2011 00:34 |
Last Modified: | 01 Oct 2011 00:34 |