Semantic-Oriented Knowledge Transfer for Review Rating
王波;张宁;林泉;陈松灿;李玉华;
Abstract:
With the rapid development of Web 2.0, more and more people share their opinions about products online, producing a large volume of product review data. However, products are difficult to compare directly by their ratings, because ratings are often given on different scales or are missing altogether. This paper addresses the following question: given textual reviews, how can we automatically determine the semantic orientations of reviewers and then rank different items? Because many reviews carry no ratings, sufficient rating data is hard to collect for some product categories (e.g., movies), while it is often easier to find in a different but related category (e.g., books). We refer to this problem as transfer rating, and aim to train a better ranking model for items in the category of interest with the help of rating data from a related category. Specifically, we develop a ranking-oriented method called TRate that determines semantic orientations and ranks items. We formulate it as a regularized algorithm for rating knowledge transfer that bridges the two related categories via a shared latent semantic space. Experiments on the Epinion dataset verify its effectiveness.
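The general idea of regularized ranking transfer described in the abstract can be illustrated with a small sketch. This is not the TRate algorithm itself, whose formulation is not given here. It assumes reviews are already mapped to numeric feature vectors in a common space, and it swaps in a simple shared-plus-specific weight decomposition trained with a pairwise hinge loss. All function names and the parameters `lam`, `mu`, `lr`, and `epochs` are hypothetical.

```python
import numpy as np

def pairwise_diffs(X, y):
    """Feature differences x_i - x_j for every pair with y_i > y_j."""
    n = len(y)
    return np.array([X[i] - X[j] for i in range(n) for j in range(n) if y[i] > y[j]])

def hinge_grad(w, D):
    """Subgradient of the mean pairwise hinge loss max(0, 1 - w.d) over rows d of D."""
    active = D @ w < 1.0  # pairs that violate the margin
    if not active.any():
        return np.zeros_like(w)
    return -D[active].sum(axis=0) / len(D)

def fit_transfer_ranker(Xs, ys, Xt, yt, lam=0.1, mu=1.0, lr=0.1, epochs=500):
    """Learn a target-category ranker from source (s) and target (t) rating data.

    Assumption: each category scores items with w0 + v, where w0 is shared
    across categories and v is a category-specific deviation. The penalty mu
    keeps the deviations small, so the plentiful source ratings shape w0.
    """
    Ds, Dt = pairwise_diffs(Xs, ys), pairwise_diffs(Xt, yt)
    d = Xs.shape[1]
    w0, vs, vt = np.zeros(d), np.zeros(d), np.zeros(d)
    for _ in range(epochs):
        gs = hinge_grad(w0 + vs, Ds)
        gt = hinge_grad(w0 + vt, Dt)
        w0 -= lr * (gs + gt + lam * w0)  # shared part sees both categories
        vs -= lr * (gs + mu * vs)        # source-specific deviation
        vt -= lr * (gt + mu * vt)        # target-specific deviation
    return w0 + vt  # weight vector used to rank target-category items
```

Items in the target category would then be ranked by sorting on `X_new @ w`, where `w` is the returned vector. A larger `mu` ties the two categories more tightly together, which may help when target ratings are very scarce but could hurt when the categories differ substantially.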
Keywords:
Foundation: Supported by the National Natural Science Foundation of China (Nos. 60773061 and 70771043), the National Natural Science Foundation of Jiangsu Province of China (No. BK2008381), and the National High-Tech Research and Development (863) Program of China (No. 2009AA01Z138).