TY - GEN
N1 - ISBN: 978-989-8425-40-9
ID - cogprints7640
UR - http://cogprints.org/7640/
A1 - Ferrara, Emilio
A1 - Baumgartner, Robert
Y1 - 2011///
N2 - Nowadays, the huge amount of information distributed through the Web motivates studying techniques to
be adopted in order to extract relevant data in an efficient and reliable way. Both academia and enterprises
developed several approaches of Web data extraction, for example using techniques of artificial intelligence or
machine learning. Some commonly adopted procedures, namely wrappers, ensure a high degree of precision
of information extracted from Web pages, and, at the same time, have to prove robustness in order not to
compromise quality and reliability of data themselves.
In this paper we focus on some experimental aspects related to the robustness of the data extraction process
and the possibility of automatically adapting wrappers. We discuss the implementation of algorithms for
finding similarities between two different versions of a Web page, in order to handle modifications, avoiding
the failure of data extraction tasks and ensuring reliability of information extracted. Our purpose is to evaluate
performance, advantages and drawbacks of our novel system of automatic wrapper adaptation.
TI - Design of Automatically Adaptable Web Wrappers
SP - 211
AV - public
EP - 217
ER -