This dataset was given directly to Paolo Massa by Epinions staff. As a consequence, it also contains the distrust lists (which users are distrusted by which users), which are not shown on the site but kept private. Note that it is not a typical collaborative filtering dataset, since the ratings are about articles rather than items: each rating represents how a certain user rates a certain textual article written by another user, i.e. a review.
The dataset contains
* ~132,000 users, who issued
* 841,372 statements (717,667 trusts and 123,705 distrusts).
* ~85,000 users received at least one statement.
Users and items are represented by anonymized numeric identifiers.
The dataset consists of 3 files.
user_rating.txt.gz (4.7 Megabytes): Trust is the mechanism by which a user states that they like the content or the behavior of a particular user and would like to see more of what that user does on the site. Distrust is the opposite of trust: the user states that they want to see less of what that user does.
mc.txt.gz (15 Megabytes): Each article is written by a user.
rating.txt (684 Megabytes): Ratings are quantified statements made by users regarding the quality of content on the site. Ratings are the basis on which content is sorted and filtered.
How to use these files?
Just download the .txt.gz files to your hard disk, then decompress them from the command line of your GNU/Linux shell (any gzip-compatible tool, for example gunzip, should work).
Some people reported that under Windows the files seem to be doubly zipped.
When you unzip the files, you'll get a .txt file which is not really a text file: it is still a zip file. Change the extension to .zip and unzip the file again. Then you are done. Let me know if you have any problems.
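For those who prefer a script over renaming files by hand, the following is a minimal sketch of the same idea in Python: keep decompressing while the result still looks compressed. It assumes that every layer is gzip (as the .gz extension suggests) and detects this from the two gzip magic bytes; if the inner layer turns out to be a real ZIP archive instead, the zipfile module would be needed. The names is_gzip, gunzip_fully and max_layers are hypothetical helpers introduced for this illustration and are not part of the dataset or its tooling.

```python
import gzip
import shutil
import tempfile
from pathlib import Path

GZIP_MAGIC = b"\x1f\x8b"  # first two bytes of every gzip stream


def is_gzip(path):
    """Return True if the file starts with the gzip magic bytes."""
    with open(path, "rb") as f:
        return f.read(2) == GZIP_MAGIC


def gunzip_fully(src, dst, max_layers=3):
    """Decompress src into dst, peeling off repeated gzip layers.

    Returns the number of layers removed. max_layers is an arbitrary
    safety limit chosen for this sketch (assumption, not from the source).
    """
    src, dst = Path(src), Path(dst)
    layers = 0
    current = src
    with tempfile.TemporaryDirectory() as tmp:
        while is_gzip(current) and layers < max_layers:
            out = Path(tmp) / f"layer{layers}"
            with gzip.open(current, "rb") as fin, open(out, "wb") as fout:
                # Stream in chunks so the large rating file never has to fit in memory.
                shutil.copyfileobj(fin, fout)
            current = out
            layers += 1
        # Copy the innermost (plain) file to its final location.
        shutil.copyfile(current, dst)
    return layers
```

A call such as gunzip_fully("user_rating.txt.gz", "user_rating.txt") would then return 1 for a normally compressed file and 2 for a doubly compressed one. Intermediate layers are written to a temporary directory, so enough free disk space for the decompressed data is assumed.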
If you use this dataset, you might want to cite one of the following papers:
Trustlet, open research on trust metrics. P Massa, K Souren, M Salvetti, D Tomasoni. Scalable Computing: Practice and Experience 9 (4)
Trust-aware recommender systems. P Massa, P Avesani. Proceedings of the 2007 ACM conference on Recommender systems, 17-24