Dissertation
Combining collaborative and content-based filtering to recommend research papers
Author
Torres Júnior, Roberto Dias
Abstract
The number of research papers available today is growing at a staggering rate, generating more information than people can keep up with. According to a trend reported by the United States National Science Foundation, more than 10 million new papers will be published in the next 20 years. Because most of these papers will be available on the Web, this research explores issues in recommending research papers to users, in order to lead users directly to papers of interest to them. Recommender systems recommend items to users from a huge stream of available items, according to the users' interests. This research focuses on the two most prevalent techniques to date, namely Content-Based Filtering and Collaborative Filtering. The first explores the text of the paper itself, recommending items similar in content to the ones the user has rated in the past. The second explores the web of citations that exists among papers. As these two techniques have complementary advantages, we explored hybrid approaches to recommending research papers. We created standalone and hybrid versions of the algorithms and evaluated them through both offline experiments on a database of 102,295 papers and an online experiment with 110 users. Our results show that the two techniques can be successfully combined to recommend papers. Coverage also increases in the hybrid algorithms, reaching 100%. In addition, we found that different algorithms are more suitable for recommending different kinds of papers. Finally, we verified that users' research experience influences the way they perceive recommendations. In parallel, we found no significant differences in recommending papers to users from different countries. However, our results showed that users interacting with a research paper recommender system are much happier when the interface is presented in their native language, regardless of the language in which the papers are written. Therefore, an interface should be tailored to the user's native language.
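As an illustration of how the two techniques described in the abstract might be combined, the following is a minimal sketch and not the algorithms evaluated in the dissertation. It assumes a hypothetical five-paper corpus, a plain word-count cosine similarity for the content-based part, a co-citation similarity (papers cited by the same papers) for the collaborative part, and a weighted sum with a hypothetical parameter `alpha` as the hybrid. All names and data are invented for the example.

```python
import math
from collections import Counter

# Hypothetical toy corpus: paper id -> (text, ids of the papers it cites).
PAPERS = {
    "p1": ("collaborative filtering for recommender systems", ["p3", "p4"]),
    "p2": ("hybrid recommender systems for research papers", ["p3", "p4"]),
    "p3": ("content based filtering of text documents", []),
    "p4": ("citation analysis of research papers", []),
    "p5": ("content based filtering for research papers", []),  # never cited
}


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def content_score(candidate, profile):
    """Content-based part: best word-count similarity to any profile paper."""
    cand = Counter(PAPERS[candidate][0].split())
    return max(cosine(cand, Counter(PAPERS[p][0].split())) for p in profile)


def citers(paper):
    """Binary vector of the papers that cite `paper`."""
    return {pid: 1 for pid, (_, refs) in PAPERS.items() if paper in refs}


def collaborative_score(candidate, profile):
    """Collaborative part: best co-citation similarity to any profile paper."""
    cand = citers(candidate)
    return max(cosine(cand, citers(p)) for p in profile)


def recommend(profile, alpha=0.5):
    """Rank unseen papers by a weighted sum of the two scores (assumed hybrid)."""
    scores = {}
    for pid in PAPERS:
        if pid in profile:
            continue
        scores[pid] = (alpha * content_score(pid, profile)
                       + (1 - alpha) * collaborative_score(pid, profile))
    return sorted(scores, key=scores.get, reverse=True)


if __name__ == "__main__":
    # A user whose profile contains only paper p3.
    print(recommend(["p3"]))
```

In this toy setting, paper `p5` is never cited, so the citation-based score alone cannot distinguish it from other uncited papers, while the content-based score still can. This is one simple way to see why a hybrid might cover more papers than collaborative filtering on its own.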