Number of items: 6.
Chapelle, Olivier and
Zhang, Ya A Dynamic Bayesian Network Click Model for Web Search Ranking. As with any application of machine learning, web search ranking requires labeled data. The labels usually come in the form of relevance assessments made by editors. Click logs can also provide an important source of implicit feedback and can be used as a cheap proxy for editorial labels. The main difficulty however comes from the so called position bias — urls appearing in lower positions are less likely to be clicked even if they are relevant. In this paper, we propose a Dynamic Bayesian Network which aims at providing us with unbiased estimation of the relevance from the click logs. Experiments show that the proposed click model outperforms other existing click models in predicting both click-through rate and relevance.
Zheng, Shuyi and
Dmitriev, Pavel and
Lee Giles, C. Graph Based Crawler Seed Selection. This paper identifies and explores the problem of seed selection in a web-scale crawler. We argue that seed selection is not a trivial but very important problem. Selecting proper seeds can increase the number of pages a crawler will discover, and can result in a collection with more “good” and less “bad” pages. Based on the analysis of the graph structure of the web, we propose several seed selection algorithms. Effectiveness of these algorithms is proved by our experimental results on real web data.
Chu, Wei and
Park, Seung-Taek Personalized Recommendation on Dynamic Content Using Predictive Bilinear Models. In Web-based services of dynamic content (such as news articles), recommender systems face the difficulty of timely identifying new items of high-quality and providing recommendations for new users. We propose a feature-based machine learning approach to personalized recommendation that is capable of handling the cold-start issue effectively. We maintain profiles of content of interest, in which temporal characteristics of the content, e.g. popularity and freshness, are updated in real-time manner. We also maintain profiles of users including demographic information and a summary of user activities within Yahoo! properties. Based on all features in user and content profiles, we develop predictive bilinear regression models to provide accurate personalized recommendations of new items for both existing and new users. This approach results in an offline model with light computational overhead compared with other recommender systems that require online re-training. The proposed framework is general and flexible for other personalized tasks. The superior performance of our approach is verified on a large-scale data set collected from the Today-Module on Yahoo! Front Page, with comparison against six competitive approaches.
Agarwal, Deepak and
Chen, Bee-Chung and
Elango, Pradheep Spatio-Temporal Models for Estimating Click-through Rate. We propose novel spatio-temporal models to estimate clickthrough rates in the context of content recommendation. We track article CTR at a fixed location over time through a dynamic Gamma-Poisson model and combine information from correlated locations through dynamic linear regressions, significantly improving on per-location model. Our models adjust for user fatigue through an exponential tilt to the firstview CTR (probability of click on first article exposure) that is based only on user-specific repeat-exposure features. We illustrate our approach on data obtained from a module (Today Module) published regularly on Yahoo! Front Page and demonstrate significant improvement over commonly used baseline methods. Large scale simulation experiments to study the performance of our models under different scenarios provide encouraging results. Throughout, all modeling assumptions are validated via rigorous exploratory data analysis.
Chang, William and
Pantel, Patrick and
Popescu, Ana-Maria and
Gabrilovich, Evgeniy Towards Intent-Driven Bidterm Suggestion. In online advertising, pervasive in commercial search engines, advertisers typically bid on few terms, and the scarcity of data makes ad matching difficult. Suggesting additional bidterms can significantly improve ad clickability and conversion rates. In this paper, we present a large-scale bidterm suggestion system that models an advertiser’s intent and finds new bidterms consistent with that intent. Preliminary experiments show that our system significantly increases the coverage of a state of the art production system used at Yahoo while maintaining comparable precision.
Chung, Sukwon and
Shiowattana, Dungjit and
Dmitriev, Pavel and
Chan, Su The Web of Nations. In this paper, we report on a large-scale study of structural differences among the national webs. The study is based on a webscale crawl conducted in the summer 2008. More specifically, we study two graphs derived from this crawl, the nation graph, with nodes corresponding to nations and edges – to links among nations, and the host graph, with nodes corresponding to hosts and edges – to hyperlinks among pages on the hosts. Contrary to some of the previous work [2], our results show that webs of different nations are often very different from each other, both in terms of their internal structure, and in terms of their connectivity with other nations.
This list was generated on Fri Feb 15 09:04:19 2019 GMT.
About this site
This website has been set up for WWW2009 by Christopher Gutteridge of the University of Southampton, using our EPrints software.
Preservation
We (Southampton EPrints Project) intend to preserve the files and HTML pages of this site for many years, however we will turn it into flat files for long term preservation. This means that at some point in the months after the conference the search, metadata-export, JSON interface, OAI etc. will be disabled as we "fossilize" the site. Please plan accordingly. Feel free to ask nicely for us to keep the dynamic site online longer if there's a rally good (or cool) use for it... [this has now happened, this site is now static]