Items where author is affiliated with TU Wien
Number of items: 1.
and Holzinger, Wolfgang
and Darmaputra, Yansen
and Baumgartner, Robert A Flight Meta-Search Engine with Metamorph.
We demonstrate a ﬂight meta-search engine that is based on the Metamorph framework. Metamorph provides mechanisms to model web forms together with the interactions which are needed to fulﬁl a request, and can generate interaction sequences that pose queries using these web forms and collect the results. In this paper, we discuss an interesting new feature that makes use of the forms themselves as an information source. We show how data can be extracted from web forms (rather than the data behind web forms) to generate a graph of ﬂight connections between cities. The ﬂight connection graph allows us to vastly reduce the number of queries that the engine sends to airline websites in the most interesting search scenarios; those that involve the controversial practice of creative ticketing, in which agencies attempt to ﬁnd lower price fares by using more than one airline for a journey. We describe a system which attains data from a number of websites to identify promising routes and prune the search tree. Heuristics that make use of geographical information and an estimation of cost based on historical data are employed. The results are then made available to improve the quality of future search requests. Categories and Subject Descriptors: H.3.4 [Information Storage and Retrieval]: Systems and Software General Terms: Algorithms, Design, Experimentation. Keywords: Hidden Web, Web Data Extraction, Web Form Mapping, Web Form Extraction.
About this site
This website has been set up for WWW2009 by Christopher Gutteridge of the University of Southampton, using our EPrints software.
We (Southampton EPrints Project) intend to preserve the files and HTML pages of this site for many years, however we will turn it into flat files for long term preservation. This means that at some point in the months after the conference the search, metadata-export, JSON interface, OAI etc. will be disabled as we "fossilize" the site. Please plan accordingly. Feel free to ask nicely for us to keep the dynamic site online longer if there's a rally good (or cool) use for it... [this has now happened, this site is now static]