Items from Social Networks and Web 2.0 track
Number of items: 12.
Backstrom, Lars
Crandall, David and Backstrom, Lars and Huttenlocher, Daniel and Kleinberg, Jon Mapping the World's Photos. We investigate how to organize a large collection of geotagged photos, working with a dataset of about 35 million images collected from Flickr. Our approach combines content analysis based on text tags and image data with structural analysis based on geospatial data. We use the spatial distribution of where people take photos to define a relational structure between the photos that are taken at popular places. We then study the interplay between this structure and the content, using classification methods for predicting such locations from visual, textual and temporal features of the photos. We find that visual and temporal features improve the ability to estimate the location of a photo, compared to using just textual features. We illustrate using these techniques to organize a large photo collection, while also revealing various interesting properties about popular cities and landmarks at a global scale.
Bai, Hongjie
Chen, Wen-Yen and Chu, Jon-Chyuan and Luan, Junyi and Bai, Hongjie and Wang, Yi and Chang, Edward Y. Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior. Users of social networking services can connect with each other by forming communities for online interaction. Yet as the number of communities hosted by such websites grows over time, users have even greater need for effective community recommendations in order to meet more users. In this paper, we investigate two algorithms from very different domains and evaluate their effectiveness for personalized community recommendation. First is association rule mining (ARM), which discovers associations between sets of communities that are shared across many users. Second is latent Dirichlet allocation (LDA), which models user-community co-occurrences using latent aspects. In comparing LDA with ARM, we are interested in discovering whether modeling low-rank latent structure is more effective for recommendations than directly mining rules from the observed data. We experiment on an Orkut data set consisting of 492,104 users and 118,002 communities. Our empirical comparisons using the top-k recommendations metric show that LDA performs consistently better than ARM for the community recommendation task when recommending a list of 4 or more communities. However, for recommendation lists of up to 3 communities, ARM is still a bit better. We analyze examples of the latent information learned by LDA to explain this finding. To efficiently handle the large-scale data set, we parallelize LDA on distributed computers [1] and demonstrate our parallel implementation’s scalability with varying numbers of machines.
Bauckhage, Christian
Kunegis, Jérôme and Lommatzsch, Andreas and Bauckhage, Christian The Slashdot Zoo: Mining a Social Network with Negative Edges. We analyse the corpus of user relationships of the Slashdot technology news site. The data was collected from the Slashdot Zoo feature where users of the website can tag other users as friends and foes, providing positive and negative endorsements. We adapt social network analysis techniques to the problem of negative edge weights. In particular, we consider signed variants of global network characteristics such as the clustering coefficient, node-level characteristics such as centrality and popularity measures, and link-level characteristics such as distances and similarity measures. We evaluate these measures on the task of identifying unpopular users, as well as on the task of predicting the sign of links and show that the network exhibits multiplicative transitivity which allows algebraic methods based on matrix multiplication to be used. We compare our methods to traditional methods which are only suitable for positively weighted edges.
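The multiplicative transitivity mentioned in the abstract above can be illustrated with a minimal sketch. It assumes a toy signed adjacency matrix (+1 friend, -1 foe, 0 no link) and predicts the sign of a missing link from the sign of the corresponding entry of the squared matrix, so that "the foe of my foe" counts as a friend. The function name `predict_sign` and the matrix `zoo` are hypothetical, and this is not claimed to be the authors' actual method or data.

```python
def predict_sign(adj, u, v):
    """Predict the sign of the link u -> v from signed two-hop paths.

    The score is entry (u, v) of adj multiplied by itself: each path
    u -> k -> v contributes the product of its two edge signs.
    Returns +1, -1, or 0 when there is no two-hop evidence.
    """
    n = len(adj)
    score = sum(adj[u][k] * adj[k][v] for k in range(n))
    return (score > 0) - (score < 0)


# Hypothetical toy network: +1 = friend, -1 = foe, 0 = no link.
zoo = [
    [0, 1, -1, 0],   # user 0: friend of 1, foe of 2
    [0, 0, 0, 1],    # user 1: friend of 3
    [0, 0, 0, -1],   # user 2: foe of 3
    [0, 0, 0, 0],    # user 3: no outgoing links
]

# Both two-hop paths from user 0 to user 3 have a positive sign product.
print(predict_sign(zoo, 0, 3))
```

Using only the sign of the two-hop score is the simplest possible choice; higher matrix powers or weighted combinations would follow the same pattern.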
Brandes, Ulrik
Brandes, Ulrik and Kenis, Patrick and Lerner, Jürgen and van Raaij, Denise Network Analysis of Collaboration Structure in Wikipedia. In this paper we give models and algorithms to describe and analyze the collaboration among authors of Wikipedia from a network analytical perspective. The edit network encodes who interacts how with whom when editing an article; it significantly extends previous network models that code author communities in Wikipedia. Several characteristics summarizing some aspects of the organization process and allowing the analyst to identify certain types of authors can be obtained from the edit network. Moreover, we propose several indicators characterizing the global network structure and methods to visualize edit networks. It is shown that the structural network indicators are correlated with quality labels of the associated Wikipedia articles.
Cha, Meeyoung
Cha, Meeyoung and Mislove, Alan and Gummadi, Krishna P. A Measurement-driven Analysis of Information Propagation in the Flickr Social Network. Online social networking sites like MySpace, Facebook, and Flickr have become a popular way to share and disseminate content. Their massive popularity has led to viral marketing techniques that attempt to spread content, products, and ideas on these sites. However, there is little data publicly available on viral propagation in the real world and few studies have characterized how information spreads over current online social networks. In this paper, we collect and analyze large-scale traces of information dissemination in the Flickr social network. Our analysis, based on crawls of the favorite markings of 2.5 million users on 11 million photos, aims at answering three key questions: (a) how widely does information propagate in the social network? (b) how quickly does information propagate? and (c) what is the role of word-of-mouth exchanges between friends in the overall propagation of information in the network? Contrary to viral marketing “intuition,” we find that (a) even popular photos do not spread widely throughout the network, (b) even popular photos spread slowly through the network, and (c) information exchanged between friends is likely to account for over 50% of all favorite-markings, but with a significant delay at each hop.
Chang, Edward Y.
Chen, Wen-Yen and Chu, Jon-Chyuan and Luan, Junyi and Bai, Hongjie and Wang, Yi and Chang, Edward Y. Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior.
Chen, Wen-Yen
Chen, Wen-Yen and Chu, Jon-Chyuan and Luan, Junyi and Bai, Hongjie and Wang, Yi and Chang, Edward Y. Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior.
Chu, Jon-Chyuan
Chen, Wen-Yen and Chu, Jon-Chyuan and Luan, Junyi and Bai, Hongjie and Wang, Yi and Chang, Edward Y. Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior.
Chu, Wei
Chu, Wei and Park, Seung-Taek Personalized Recommendation on Dynamic Content Using Predictive Bilinear Models. In Web-based services of dynamic content (such as news articles), recommender systems face the difficulty of timely identifying new items of high-quality and providing recommendations for new users. We propose a feature-based machine learning approach to personalized recommendation that is capable of handling the cold-start issue effectively. We maintain profiles of content of interest, in which temporal characteristics of the content, e.g. popularity and freshness, are updated in real-time manner. We also maintain profiles of users including demographic information and a summary of user activities within Yahoo! properties. Based on all features in user and content profiles, we develop predictive bilinear regression models to provide accurate personalized recommendations of new items for both existing and new users. This approach results in an offline model with light computational overhead compared with other recommender systems that require online re-training. The proposed framework is general and flexible for other personalized tasks. The superior performance of our approach is verified on a large-scale data set collected from the Today-Module on Yahoo! Front Page, with comparison against six competitive approaches.
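As a rough illustration of what a bilinear model over user and content features computes, the sketch below scores a user-item pair as the sum of user[a] * weights[a][b] * item[b]. Because the score depends only on features, a previously unseen user or item can still be scored, which is one way to read the cold-start claim in the abstract above. All names and numbers here (`bilinear_score`, `weights`, the feature vectors) are hypothetical; the fitting procedure of the paper is not shown.

```python
def bilinear_score(user, weights, item):
    """Return the bilinear form user^T * weights * item.

    user:    list of user feature values (length m)
    weights: m x n matrix of interaction weights (assumed already learned)
    item:    list of content feature values (length n)
    """
    return sum(
        user[a] * weights[a][b] * item[b]
        for a in range(len(user))
        for b in range(len(item))
    )


# Hypothetical weights for 2 user features x 2 content features.
weights = [
    [0.2, 0.1],
    [0.4, -0.3],
]

new_user = [1.0, 0.0]    # e.g. only a demographic indicator is known
fresh_item = [0.5, 2.0]  # e.g. popularity and freshness features

print(bilinear_score(new_user, weights, fresh_item))
```

Ranking candidate items for a user then amounts to sorting them by this score.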
Crandall, David
Crandall, David and Backstrom, Lars and Huttenlocher, Daniel and Kleinberg, Jon Mapping the World's Photos.
Goel, Sharad
Goel, Sharad and Muhamad, Roby and Watts, Duncan Social Search in "Small-World" Experiments. The “algorithmic small-world hypothesis” states that not only are pairs of individuals in a large social network connected by short paths, but that ordinary individuals can find these paths. Although theoretically plausible, empirical evidence for the hypothesis is limited, as most chains in “small-world” experiments fail to complete, thereby biasing estimates of “true” chain lengths. Using data from two recent small-world experiments, comprising a total of 162,328 message chains, and directed at one of 30 “targets” spread across 19 countries, we model heterogeneity in chain attrition rates as a function of individual attributes. We then introduce a rigorous way of estimating true chain lengths that is provably unbiased, and can account for empirically observed variation in attrition rates. Our findings provide mixed support for the algorithmic hypothesis. On the one hand, it appears that roughly half of all chains can be completed in 6-7 steps, thus supporting the “six degrees of separation” assertion, but on the other hand, estimates of the mean are much longer, suggesting that for at least some of the population, the world is not “small” in the algorithmic sense. We conclude that search distances in social networks are fundamentally different from topological distances, for which the mean and median of the shortest path lengths between nodes tend to be similar.
Gummadi, Krishna P.
Cha, Meeyoung and Mislove, Alan and Gummadi, Krishna P. A Measurement-driven Analysis of Information Propagation in the Flickr Social Network.
Huttenlocher, Daniel
Crandall, David and Backstrom, Lars and Huttenlocher, Daniel and Kleinberg, Jon Mapping the World's Photos.
Karagiannis, Thomas
Karagiannis, Thomas and Vojnovic, Milan Behavioral Profiles for Advanced Email Features. We examine the behavioral patterns of email usage in a large-scale enterprise over a three-month period. In particular, we focus on two main questions: (Q1) what do replies depend on? and (Q2) what is the gain of augmenting contacts through the friends of friends from the email social graph? For Q1, we identify and evaluate the significance of several factors that affect the reply probability and the email response time. We find that all factors of our considered set are significant, provide their relative ordering, and identify the recipient list size, and the intensity of email communication between the correspondents as the dominant factors. We highlight various novel threshold behaviors and provide support for existing hypotheses such as that of the least-effort reply. For Q2, we find that the number of new contacts extracted from the friends-of-friends relationships amounts to a large number, but which is still a limited portion of the total enterprise size. We believe that our results provide significant insights towards informed design of advanced email features, including those of social-networking type.
Categories & Subject Descriptors: H.4.3 [Communications Applications]: Electronic mail
General Terms: Design, Measurement, Human Factors
Keywords: Reply time, reply probability, email profiles.
Kenis, Patrick
Brandes, Ulrik and Kenis, Patrick and Lerner, Jürgen and van Raaij, Denise Network Analysis of Collaboration Structure in Wikipedia.
Kleinberg, Jon
Crandall, David and Backstrom, Lars and Huttenlocher, Daniel and Kleinberg, Jon Mapping the World's Photos.
Kunegis, Jérôme
Kunegis, Jérôme and Lommatzsch, Andreas and Bauckhage, Christian The Slashdot Zoo: Mining a Social Network with Negative Edges.
Lerman, Kristina
Plangprasopchok, Anon and Lerman, Kristina Constructing Folksonomies from User-Specified Relations on Flickr. Automatic folksonomy construction from tags has attracted much attention recently. However, inferring hierarchical relations between concepts from tags has a drawback in that it is difficult to distinguish between more popular and more general concepts. Instead of tags we propose to use user-specified relations for learning folksonomy. We explore two statistical frameworks for aggregating many shallow individual hierarchies, expressed through the collection/set relations on the social photosharing site Flickr, into a common deeper folksonomy that reflects how a community organizes knowledge. Our approach addresses a number of challenges that arise while aggregating information from diverse users, namely noisy vocabulary, and variations in the granularity level of the concepts expressed. Our second contribution is a method for automatically evaluating learned folksonomy by comparing it to a reference taxonomy, e.g., the Web directory created by the Open Directory Project. Our empirical results suggest that user-specified relations are a good source of evidence for learning folksonomies.
Lerner, Jürgen
Brandes, Ulrik and Kenis, Patrick and Lerner, Jürgen and van Raaij, Denise Network Analysis of Collaboration Structure in Wikipedia.
Lommatzsch, Andreas
Kunegis, Jérôme and Lommatzsch, Andreas and Bauckhage, Christian The Slashdot Zoo: Mining a Social Network with Negative Edges.
Luan, Junyi
Chen, Wen-Yen and Chu, Jon-Chyuan and Luan, Junyi and Bai, Hongjie and Wang, Yi and Chang, Edward Y. Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior.
Matsuo, Yutaka
Matsuo, Yutaka and Yamamoto, Hikaru Community Gravity: Measuring Bidirectional Effects by Trust and Rating on Online Social Networks. Several attempts have been made to analyze customer behavior on online E-commerce sites. Some studies particularly emphasize the social networks of customers. Users’ reviews and ratings of a product exert effects on other consumers’ purchasing behavior. Whether a user refers to other users’ ratings depends on the trust accorded by a user to the reviewer. On the other hand, the trust that is felt by a user for another user correlates with the similarity of two users’ ratings. This bidirectional interaction that involves trust and rating is an important aspect of understanding consumer behavior in online communities because it suggests clustering of similar users and the evolution of strong communities. This paper presents a theoretical model along with analyses of an actual online E-commerce site. We analyzed a large community site in Japan: @cosme. The noteworthy characteristics of @cosme are that users can bookmark their trusted users; in addition, they can post their own ratings of products, which facilitates our analyses of the ratings’ bidirectional effects on trust and ratings. We describe an overview of the data in @cosme, analyses of effects from trust to rating and vice versa, and our proposition of a measure of community gravity, which measures how strongly a user might be attracted to a community. Our study is based on the @cosme dataset in addition to the Epinions dataset. It elucidates important insights and proposes a potentially important measure for mining online social networks.
Mislove, Alan
Cha, Meeyoung and Mislove, Alan and Gummadi, Krishna P. A Measurement-driven Analysis of Information Propagation in the Flickr Social Network.
Muhamad, Roby
Goel, Sharad and Muhamad, Roby and Watts, Duncan Social Search in "Small-World" Experiments.
Park, Seung-TaekChu, Wei and Park, Seung-Taek Personalized Recommendation on Dynamic Content Using Predictive Bilinear Models. In Web-based services of dynamic content (such as news articles), recommender systems face the difficulty of timely identifying new items of high-quality and providing recommendations for new users. We propose a feature-based machine learning approach to personalized recommendation that is capable of handling the cold-start issue effectively. We maintain profiles of content of interest, in which temporal characteristics of the content, e.g. popularity and freshness, are updated in real-time manner. We also maintain profiles of users including demographic information and a summary of user activities within Yahoo! properties. Based on all features in user and content profiles, we develop predictive bilinear regression models to provide accurate personalized recommendations of new items for both existing and new users. This approach results in an offline model with light computational overhead compared with other recommender systems that require online re-training. The proposed framework is general and flexible for other personalized tasks. The superior performance of our approach is verified on a large-scale data set collected from the Today-Module on Yahoo! Front Page, with comparison against six competitive approaches.
Plangprasopchok, AnonPlangprasopchok, Anon and Lerman, Kristina Constructing Folksonomies from User-Specified Relations on Flickr. Automatic folksonomy construction from tags has attracted much attention recently. However, inferring hierarchical relations between concepts from tags has a drawback in that it is difficult to distinguish between more popular and more general concepts. Instead of tags we propose to use user-specified relations for learning folksonomy. We explore two statistical frameworks for aggregating many shallow individual hierarchies, expressed through the collection/set relations on the social photo-sharing site Flickr, into a common deeper folksonomy that reflects how a community organizes knowledge. Our approach addresses a number of challenges that arise while aggregating information from diverse users, namely noisy vocabulary, and variations in the granularity level of the concepts expressed. Our second contribution is a method for automatically evaluating learned folksonomy by comparing it to a reference taxonomy, e.g., the Web directory created by the Open Directory Project. Our empirical results suggest that user-specified relations are a good source of evidence for learning folksonomies.
Riedl, JohnSen, Shilad and Vig, Jesse and Riedl, John Tagommenders: Connecting Users to Items through Tags. Tagging has emerged as a powerful mechanism that enables users to find, organize, and understand online entities. Recommender systems similarly enable users to efficiently navigate vast collections of items. Algorithms combining tags with recommenders may deliver both the automation inherent in recommenders, and the flexibility and conceptual comprehensibility inherent in tagging systems. In this paper we explore tagommenders, recommender algorithms that predict users’ preferences for items based on their inferred preferences for tags. We describe tag preference inference algorithms based on users’ interactions with tags and movies, and evaluate these algorithms based on tag preference ratings collected from 995 MovieLens users. We design and evaluate algorithms that predict users’ ratings for movies based on their inferred tag preferences. Our tag-based algorithms generate better recommendation rankings than state-of-the-art algorithms, and they may lead to flexible recommender systems that leverage the characteristics of items users find most important.
San Pedro, JoseSan Pedro, Jose and Siersdorfer, Stefan Ranking and Classifying Attractiveness of Photos in Folksonomies. Web 2.0 applications like Flickr, YouTube, or Del.icio.us are increasingly popular online communities for creating, editing and sharing content. The growing size of these folksonomies poses new challenges in terms of search and data mining. In this paper we introduce a novel methodology for automatically ranking and classifying photos according to their attractiveness for folksonomy members. To this end, we exploit image features known for having significant effects on the visual quality perceived by humans (e.g. sharpness and colorfulness) as well as textual meta data, in what is a multi-modal approach. Using feedback and annotations available in the Web 2.0 photo sharing system Flickr, we assign relevance values to the photos and train classification and regression models based on these relevance assignments. With the resulting machine learning models we categorize and rank photos according to their attractiveness. Applications include enhanced ranking functions for search and recommender methods for attractive content. Large scale experiments on a collection of Flickr photos demonstrate the viability of our approach.
Sen, ShiladSen, Shilad and Vig, Jesse and Riedl, John Tagommenders: Connecting Users to Items through Tags. Tagging has emerged as a powerful mechanism that enables users to find, organize, and understand online entities. Recommender systems similarly enable users to efficiently navigate vast collections of items. Algorithms combining tags with recommenders may deliver both the automation inherent in recommenders, and the flexibility and conceptual comprehensibility inherent in tagging systems. In this paper we explore tagommenders, recommender algorithms that predict users’ preferences for items based on their inferred preferences for tags. We describe tag preference inference algorithms based on users’ interactions with tags and movies, and evaluate these algorithms based on tag preference ratings collected from 995 MovieLens users. We design and evaluate algorithms that predict users’ ratings for movies based on their inferred tag preferences. Our tag-based algorithms generate better recommendation rankings than state-of-the-art algorithms, and they may lead to flexible recommender systems that leverage the characteristics of items users find most important.
Siersdorfer, StefanSan Pedro, Jose and Siersdorfer, Stefan Ranking and Classifying Attractiveness of Photos in Folksonomies. Web 2.0 applications like Flickr, YouTube, or Del.icio.us are increasingly popular online communities for creating, editing and sharing content. The growing size of these folksonomies poses new challenges in terms of search and data mining. In this paper we introduce a novel methodology for automatically ranking and classifying photos according to their attractiveness for folksonomy members. To this end, we exploit image features known for having significant effects on the visual quality perceived by humans (e.g. sharpness and colorfulness) as well as textual meta data, in what is a multi-modal approach. Using feedback and annotations available in the Web 2.0 photo sharing system Flickr, we assign relevance values to the photos and train classification and regression models based on these relevance assignments. With the resulting machine learning models we categorize and rank photos according to their attractiveness. Applications include enhanced ranking functions for search and recommender methods for attractive content. Large scale experiments on a collection of Flickr photos demonstrate the viability of our approach.
Vig, JesseSen, Shilad and Vig, Jesse and Riedl, John Tagommenders: Connecting Users to Items through Tags. Tagging has emerged as a powerful mechanism that enables users to find, organize, and understand online entities. Recommender systems similarly enable users to efficiently navigate vast collections of items. Algorithms combining tags with recommenders may deliver both the automation inherent in recommenders, and the flexibility and conceptual comprehensibility inherent in tagging systems. In this paper we explore tagommenders, recommender algorithms that predict users’ preferences for items based on their inferred preferences for tags. We describe tag preference inference algorithms based on users’ interactions with tags and movies, and evaluate these algorithms based on tag preference ratings collected from 995 MovieLens users. We design and evaluate algorithms that predict users’ ratings for movies based on their inferred tag preferences. Our tag-based algorithms generate better recommendation rankings than state-of-the-art algorithms, and they may lead to flexible recommender systems that leverage the characteristics of items users find most important.
Vojnovic, MilanKaragiannis, Thomas and Vojnovic, Milan Behavioral Profiles for Advanced Email Features. We examine the behavioral patterns of email usage in a large-scale enterprise over a three-month period. In particular, we focus on two main questions: (Q1) what do replies depend on? and (Q2) what is the gain of augmenting contacts through the friends of friends from the email social graph? For Q1, we identify and evaluate the significance of several factors that affect the reply probability and the email response time. We find that all factors of our considered set are significant, provide their relative ordering, and identify the recipient list size, and the intensity of email communication between the correspondents as the dominant factors. We highlight various novel threshold behaviors and provide support for existing hypotheses such as that of the least-effort reply. For Q2, we find that the number of new contacts extracted from the friends-of-friends relationships amounts to a large number, but which is still a limited portion of the total enterprise size. We believe that our results provide significant insights towards informed design of advanced email features, including those of social-networking type. Categories & Subject Descriptors: H.4.3 [Communications Applications]: Electronic mail General Terms: Design, Measurement, Human Factors Keywords: Reply time, reply probability, email profiles.
Wang, YiChen, Wen-Yen and Chu, Jon-Chyuan and Luan, Junyi and Bai, Hongjie and Wang, Yi and Chang, Edward Y. Collaborative Filtering for Orkut Communities: Discovery of User Latent Behavior. Users of social networking services can connect with each other by forming communities for online interaction. Yet as the number of communities hosted by such websites grows over time, users have even greater need for effective community recommendations in order to meet more users. In this paper, we investigate two algorithms from very different domains and evaluate their effectiveness for personalized community recommendation. First is association rule mining (ARM), which discovers associations between sets of communities that are shared across many users. Second is latent Dirichlet allocation (LDA), which models user-community co-occurrences using latent aspects. In comparing LDA with ARM, we are interested in discovering whether modeling low-rank latent structure is more effective for recommendations than directly mining rules from the observed data. We experiment on an Orkut data set consisting of 492,104 users and 118,002 communities. Our empirical comparisons using the top-k recommendations metric show that LDA performs consistently better than ARM for the community recommendation task when recommending a list of 4 or more communities. However, for recommendation lists of up to 3 communities, ARM is still a bit better. We analyze examples of the latent information learned by LDA to explain this finding. To efficiently handle the large-scale data set, we parallelize LDA on distributed computers [1] and demonstrate our parallel implementation’s scalability with varying numbers of machines.
Watts, DuncanGoel, Sharad and Muhamad, Roby and Watts, Duncan Social Search in "Small-World" Experiments. The “algorithmic small-world hypothesis” states that not only are pairs of individuals in a large social network connected by short paths, but that ordinary individuals can find these paths. Although theoretically plausible, empirical evidence for the hypothesis is limited, as most chains in “small-world” experiments fail to complete, thereby biasing estimates of “true” chain lengths. Using data from two recent small-world experiments, comprising a total of 162,328 message chains, and directed at one of 30 “targets” spread across 19 countries, we model heterogeneity in chain attrition rates as a function of individual attributes. We then introduce a rigorous way of estimating true chain lengths that is provably unbiased, and can account for empirically observed variation in attrition rates. Our findings provide mixed support for the algorithmic hypothesis. On the one hand, it appears that roughly half of all chains can be completed in 6-7 steps, thus supporting the “six degrees of separation” assertion, but on the other hand, estimates of the mean are much longer, suggesting that for at least some of the population, the world is not “small” in the algorithmic sense. We conclude that search distances in social networks are fundamentally different from topological distances, for which the mean and median of the shortest path lengths between nodes tend to be similar.
Yamamoto, HikaruMatsuo, Yutaka and Yamamoto, Hikaru Community Gravity: Measuring Bidirectional Effects by Trust and Rating on Online Social Networks. Several attempts have been made to analyze customer behavior on online E-commerce sites. Some studies particularly emphasize the social networks of customers. Users’ reviews and ratings of a product exert effects on other consumers’ purchasing behavior. Whether a user refers to other users’ ratings depends on the trust accorded by a user to the reviewer. On the other hand, the trust that is felt by a user for another user correlates with the similarity of two users’ ratings. This bidirectional interaction that involves trust and rating is an important aspect of understanding consumer behavior in online communities because it suggests clustering of similar users and the evolution of strong communities. This paper presents a theoretical model along with analyses of an actual online E-commerce site. We analyzed a large community site in Japan: @cosme. The noteworthy characteristics of @cosme are that users can bookmark their trusted users; in addition, they can post their own ratings of products, which facilitates our analyses of the ratings’ bidirectional effects on trust and ratings. We describe an overview of the data in @cosme, analyses of effects from trust to rating and vice versa, and our proposition of a measure of community gravity, which measures how strongly a user might be attracted to a community. Our study is based on the @cosme dataset in addition to the Epinions dataset. It elucidates important insights and proposes a potentially important measure for mining online social networks.
van Raaij, DeniseBrandes, Ulrik and Kenis, Patrick and Lerner, Jürgen and van Raaij, Denise Network Analysis of Collaboration Structure in Wikipedia. In this paper we give models and algorithms to describe and analyze the collaboration among authors of Wikipedia from a network analytical perspective. The edit network encodes who interacts how with whom when editing an article; it significantly extends previous network models that code author communities in Wikipedia. Several characteristics summarizing some aspects of the organization process and allowing the analyst to identify certain types of authors can be obtained from the edit network. Moreover, we propose several indicators characterizing the global network structure and methods to visualize edit networks. It is shown that the structural network indicators are correlated with quality labels of the associated Wikipedia articles.
This list was generated on Fri Feb 15 08:40:32 2019 GMT.
About this site
This website has been set up for WWW2009 by Christopher Gutteridge of the University of Southampton, using our EPrints software.
Preservation
We (Southampton EPrints Project) intend to preserve the files and HTML pages of this site for many years; however, we will turn it into flat files for long-term preservation. This means that at some point in the months after the conference the search, metadata-export, JSON interface, OAI etc. will be disabled as we "fossilize" the site. Please plan accordingly. Feel free to ask nicely for us to keep the dynamic site online longer if there's a really good (or cool) use for it... [this has now happened, this site is now static]