Friday, November 9, 2012

Open Source Recommendation Systems Survey

Here follows a survey I did back in 2010 when I was studying Recommender Systems. Hope it is useful.


The growth of web content and the expansion of e-commerce have greatly increased interest
in recommender systems, which has led to the development of several open source projects in the area.
Among the recommender system implementations available on the web, we can distinguish the projects described below.


All of these projects offer collaborative filtering implementations, in different programming languages.

The Duine Framework also supplies a hybrid implementation. It is a Java package that combines content-based and collaborative filtering in a switching engine: it dynamically switches between the two predictors according to the current state of the data. For example, if there aren't many ratings available, it uses the content-based approach, and it switches to the collaborative one when the scenario changes. It also provides an Explanation API, which can be used to create user-friendly recommendations, and a demo application with a Java client example.
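To make the switching idea concrete, here is a minimal sketch in Python. It is not Duine's API or its actual decision rule: the function name, the `min_ratings` threshold and the two predictor callables are all assumptions made up for illustration.

```python
from typing import Callable, Dict

# Hypothetical threshold: users with fewer ratings than this are treated
# as "sparse" and served by the content-based predictor.
MIN_RATINGS_FOR_CF = 5


def switching_predict(
    user: str,
    item: str,
    ratings: Dict[str, Dict[str, float]],
    content_based: Callable[[str, str], float],
    collaborative: Callable[[str, str], float],
    min_ratings: int = MIN_RATINGS_FOR_CF,
) -> float:
    """Choose one predictor per request, based on how many ratings the user has."""
    n_user_ratings = len(ratings.get(user, {}))
    if n_user_ratings < min_ratings:
        # Not enough rating data yet: use the content-based prediction.
        return content_based(user, item)
    # Enough ratings: switch to the collaborative prediction.
    return collaborative(user, item)
```

A real switching engine would probably look at more than a rating count (for example, how confident each predictor is), but the structure is the same: several predictors behind one entry point, and a rule that selects one of them.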

Apache Mahout is a Java framework for data mining. It has incorporated the Taste recommender system, a collaborative filtering engine for personalized recommendations.
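For readers new to collaborative filtering, the sketch below shows the general user-based idea in plain Python. It does not use Mahout or Taste, and it is not how they are implemented; the function names and the choice of cosine similarity are my own assumptions for illustration.

```python
from math import sqrt
from typing import Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # user -> {item -> rating}


def cosine_similarity(a: Dict[str, float], b: Dict[str, float]) -> float:
    """Cosine similarity computed over the items both users have rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    norm_a = sqrt(sum(a[i] ** 2 for i in common))
    norm_b = sqrt(sum(b[i] ** 2 for i in common))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def predict_user_based(ratings: Ratings, user: str, item: str) -> Optional[float]:
    """Similarity-weighted average of other users' ratings for `item`."""
    num = 0.0
    den = 0.0
    for other, other_ratings in ratings.items():
        if other == user or item not in other_ratings:
            continue
        sim = cosine_similarity(ratings.get(user, {}), other_ratings)
        if sim <= 0:
            continue
        num += sim * other_ratings[item]
        den += sim
    # None means no similar user has rated the item.
    return num / den if den > 0 else None


toy = {
    "alice": {"m1": 5, "m2": 3},
    "bob": {"m1": 5, "m2": 3, "m3": 4},
    "carol": {"m1": 1, "m2": 5, "m3": 2},
}
print(predict_user_based(toy, "alice", "m3"))
```

In this toy data alice agrees with bob more than with carol, so the prediction for `m3` lands closer to bob's rating than to carol's.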

Vogoo is a PHP framework that implements a collaborative filtering recommender system. It also includes a Slope One implementation.

A Java version of the collaborative filtering method is implemented in the Cofi library. It was developed by Daniel Lemire, the creator of the Slope One algorithms. There is also a PHP version available on Lemire's webpage.

OpenSlopeOne offers a Slope One implementation in PHP that focuses on performance.
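Since several of these projects are built around Slope One, here is a small Python sketch of the weighted variant. It is only an illustration of the algorithm, not the code of Vogoo, Cofi or OpenSlopeOne, and the function names and toy ratings are made up.

```python
from collections import defaultdict
from typing import Dict, Optional

Ratings = Dict[str, Dict[str, float]]  # user -> {item -> rating}


def build_deviations(ratings: Ratings):
    """Return (dev, count): dev[i][j] is the average of (r_i - r_j) over the
    users who rated both items, and count[i][j] is how many such users exist."""
    diff_sum = defaultdict(lambda: defaultdict(float))
    count = defaultdict(lambda: defaultdict(int))
    for user_ratings in ratings.values():
        for i, r_i in user_ratings.items():
            for j, r_j in user_ratings.items():
                if i != j:
                    diff_sum[i][j] += r_i - r_j
                    count[i][j] += 1
    dev = {
        i: {j: diff_sum[i][j] / count[i][j] for j in diff_sum[i]}
        for i in diff_sum
    }
    return dev, count


def predict_slope_one(user_ratings: Dict[str, float], item: str, dev, count) -> Optional[float]:
    """Weighted Slope One: average of (dev[item][j] + r_j), weighted by the
    number of users who rated both `item` and j."""
    num = 0.0
    den = 0
    for j, r_j in user_ratings.items():
        c = count.get(item, {}).get(j, 0)
        if j == item or c == 0:
            continue
        num += (dev[item][j] + r_j) * c
        den += c
    return num / den if den else None


toy = {
    "john": {"A": 5, "B": 3, "C": 2},
    "mark": {"A": 3, "B": 4},
    "lucy": {"B": 2, "C": 5},
}
dev, count = build_deviations(toy)
print(round(predict_slope_one(toy["lucy"], "A", dev, count), 2))  # prints 4.33
```

The deviation table only depends on pairs of items, so it can be precomputed once and reused for every prediction, which is a large part of why Slope One is popular in small libraries.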

SUGGEST is a recommendation library made by George Karypis and distributed in binary format.

Many of these projects are built with the help of Maven, a project management tool by Apache that can be downloaded from its website.
In this project, they were tested with the MovieLens dataset, made available by GroupLens Research. It comes in three packages, with 100,000, 1 million and 10 million ratings from users on items, on a scale of up to 5.
For my specific project, I had to choose one of these open source packages. It was then natural to compare them, analyzing which one was a better fit for our requirements.
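As a small illustration of working with this data, the sketch below parses rating lines in the tab-separated layout used by the 100,000-rating package (user id, item id, rating, timestamp). The sample rows are made up, and the function name is hypothetical; the larger packages use a different field separator, so the `sep` argument would need to change for them.

```python
from collections import defaultdict
from typing import Dict, Iterable


def load_ratings(lines: Iterable[str], sep: str = "\t") -> Dict[str, Dict[str, float]]:
    """Parse 'user<sep>item<sep>rating<sep>timestamp' lines into
    a nested dict: user -> {item -> rating}."""
    ratings = defaultdict(dict)
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        user, item, rating, _timestamp = line.split(sep)
        ratings[user][item] = float(rating)
    return dict(ratings)


# Made-up rows in the same layout, for illustration only.
sample = [
    "1\t10\t4\t880000000",
    "1\t20\t3\t880000100",
    "2\t10\t5\t880000200",
]
ratings = load_ratings(sample)
print(ratings)
```

With a real file, the same function can be called on an open file handle, for example `load_ratings(open(path))`, where `path` points to the extracted ratings file.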


  Comparison

 

Analysing software in the recommendation area is not a simple task, since it is difficult to define measurement standards. In this work, we proposed some evaluation criteria: the types of recommendation implemented by the project, the programming language, the level of documentation and the magnitude of the project.
The documentation was evaluated based on its volume and clarity. The volume of documentation provided by Mahout and Duine is remarkably larger than that of the other systems. Both offer installation and usage guides and come with a demonstration example. It must be taken into account that OpenSlopeOne and Cofi are smaller projects, and because of that their documentation tends to be smaller.
The Downloads column represents the magnitude of the project. It shows the number of times the software, in any version, was downloaded from its source. Although Mahout does not publish this number, its very active mailing lists show that it is widely used.

The two projects that stood out were Apache Mahout and Duine. We tested them in order to verify which one was more applicable to our work. Both of them are Java frameworks and include a demonstration example with the MovieLens dataset.
The fact that Mahout is a larger project and has multiple machine-learning algorithms made it more interesting to our research. Its modular structure also encouraged us to choose it.

Here follow the main advantages and characteristics of the two most qualified projects for our needs.

To read more about Mahout.

9 comments:

  1. thank you for sharing this nice survey

  2. Good one !

    If possible can you include few more open source engines like myrrix, easyRec, and one from lenskit.groupon.org..

    rgds
    Harish

  3. Hello, I need to develop a recommendation system in college and am looking for a framework for content-based filtering. What is the best framework for developing a content-based recommendation system?

    Replies
    1. Hi,

      Hmm, I think you could start by taking a look at Duine and seeing how it is implemented there. There are several articles on content-based filtering that you could also use as a basis for your implementation. This article describes a content-based recommender system in detail:

      http://www.ics.uci.edu/~welling/teaching/CS77Bwinter12/handbook/ContentBasedRS.pdf

      you could use it to do your implementation.

    2. I will look at the article, thank you!

  4. Any Python SVD++ recommendation you know of?

    Replies
    1. Hello Swaroop, I've heard about this implementation of a SVD++ in python:

      http://gustavonarea.net/blog/posts/korens-svd-python-implementation/

      but never tested it. Mahout also offers an SVD++ implementation https://mahout.apache.org/users/recommender/matrix-factorization.html but as you might be aware, it is in Java. Nevertheless, I found a Mahout and Python integration here http://bayesianbrain.blogspot.com.br/2011/03/mahout-and-python-integration-using.html. Maybe you could try that!

  5. Hi, I need to develop a location-aware recommender system for my thesis work. I am looking for a framework that uses the collaborative filtering technique.
