Abstract - With the drastic increase in the amount of data shared across the Internet, it has become important to utilize human feedback during information retrieval. Collaborative Filtering is a promising approach that introduces a ‘human’ element into information retrieval, but it still suffers from one major drawback: current collaborative filtering systems each deal with a single type of product. A system for books deals only with books (e.g. Amazon Book Search), a system for movies deals only with movies (e.g. MovieLens), and so on. The data stored in one such system is not accessible from another. I intend to explore whether this inflexibility can be addressed by realizing a collaborative filtering system for a generic user-product model. The underlying assumption behind this approach remains the same as in Collaborative Filtering: ‘People who agreed in the past are likely to agree again in the future’.
Index Terms – Collaborative Filtering, Evaluation Metrics, Item-Item Based Filtering, User-Product Model.
INTRODUCTION
With the tremendous increase in the amount of data available on the Internet, we have reached a point where we require some sort of ‘smart browsing’ that lets us reach the most valuable information in the least possible time. We have all felt overwhelmed at the sheer amount of information displayed when looking for movies, research papers or newspaper articles, and we rarely trust the usefulness of a new product without obtaining a second opinion from a recognized source (e.g. a friend or a professor) or testing it ourselves. Recommender systems based on Collaborative Filtering are a powerful technology with the potential to act as a guide and a trusted source while directing us to relevant information.
Collaborative Filtering [1,2,3,4] works by building a database of users' preferences for items. A new user, Alice, is matched against the database to discover her neighbors, that is, other users who have historically had tastes similar to hers. Items liked by these neighbors are then recommended to Alice, as she will probably also like them. Collaborative Filtering has been successfully deployed in many E-commerce and Information Filtering applications.
However, most deployments of such algorithms to date have been confined to specific fields (such as music, books, movies, newspaper articles and clothing), and this raises some important issues. The first is that a user is required to interact with different interfaces for advice on different products. Alice and Bob may be good friends with similar tastes in many things, but on the Internet they need separate profiles on each E-commerce website in order to recommend items to each other.
Secondly, it is plausible to assume that if two people agree on items of one kind, they may also agree on items of a different kind. Users who share similar tastes in music often also share similar tastes in movies, literature, etc. Such complex human behavior can be explored further if all the information a user provides while browsing the Internet can be processed by a single system.
Finally, current recommender systems use only a small subset of the available information about the customer in making their recommendations [5]. Systems can use demographic information, purchase data, explicit ratings and ownership data to generate ratings, but no system has yet been devised that can use all of this data simultaneously for real-time recommendations.
In this paper I first state the User-Product Model that I have chosen to implement and then describe the Item Based Collaborative Algorithm [6] that will be used to generate predictions and provide recommendations. Item Based Collaborative Algorithms, unlike k-Nearest Neighbor based Collaborative Algorithms, avoid the need to find suitable neighbors among a large population of potential neighbors [7] and instead explore relationships between items first. The pseudocode for the implementation of the system is provided in the next section. Finally, the evaluation metrics used to judge the efficiency of the system are stated explicitly, and the last section deals with possible future work.
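To make the idea of exploring relationships between items concrete, the following is a minimal Python sketch of item-based prediction. It is an illustration under stated assumptions, not the implementation described in this paper: the toy ratings, the item names with product-type prefixes, the choice of cosine similarity over co-rating users, the weighted-sum prediction and all function names are hypothetical.

```python
from math import sqrt

# Hypothetical toy data: user -> {item: rating on a 1-5 scale}.
# Items of different product types share one table, mirroring the
# generic user-product idea (the "type:name" keys are an assumption).
ratings = {
    "alice": {"book:Dune": 5, "movie:Alien": 4, "movie:Heat": 1},
    "bob":   {"book:Dune": 4, "movie:Alien": 5},
    "carol": {"book:Dune": 1, "movie:Alien": 2, "movie:Heat": 5},
    "dave":  {"movie:Alien": 4, "movie:Heat": 2},
}

def item_vectors(ratings):
    """Invert user -> item ratings into item -> {user: rating}."""
    items = {}
    for user, prefs in ratings.items():
        for item, rating in prefs.items():
            items.setdefault(item, {})[user] = rating
    return items

def cosine_similarity(a, b):
    """Cosine similarity restricted to users who rated both items."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[u] * b[u] for u in common)
    norm_a = sqrt(sum(a[u] ** 2 for u in common))
    norm_b = sqrt(sum(b[u] ** 2 for u in common))
    return dot / (norm_a * norm_b)

def predict(ratings, user, target):
    """Predict a rating as a similarity-weighted average of the
    user's own ratings; no search over neighboring users is needed."""
    items = item_vectors(ratings)
    if target not in items:
        return None  # nobody has rated the target item yet
    num = den = 0.0
    for item, rating in ratings[user].items():
        if item == target:
            continue
        sim = cosine_similarity(items[target], items[item])
        num += sim * rating
        den += abs(sim)
    return num / den if den else None
```

For example, `predict(ratings, "dave", "book:Dune")` estimates Dave's rating for a book from his movie ratings alone, which is the kind of cross-product inference a generic user-product model is meant to allow.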