Meeting 23 July 2004

Massa, Avesani

We summarize some of the theses of my PhD:

1. Trust metrics (using trust statements) can compute a relevance measure for more users than user similarity (Pearson correlation, using ratings on items)

2. CF + Trust metrics > CF

3. Trust metrics > CF: this was the thesis of PaperCoopis2004

What does > mean?

Better coverage: the number of predictable ratings, and the number of users for which at least one prediction is possible (this was especially true for ColdStartUser).

Better accuracy: the MeanAbsoluteError was smaller (actually this is not true: the error is more or less the same).

(About this we thought a bit about better evaluation techniques; see below.)

4. Compose(Trust,Ratings): find an intelligent algorithm to compose ratings and trust information in order to improve CF

5. Propose a better local trust metric (and evaluate the proposed ones, especially the global/local trade-offs)

We need to fine-tune our instruments for RS evaluation. For example, if we simply consider MeanAbsoluteError, then very simple algorithms (such as "return the average rating of the item") perform very well!!
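The point about MAE rewarding trivial baselines can be made concrete. A minimal sketch (the dataset and function names below are made up for illustration, not our evaluation code):

```python
# Trivial baseline: predict the mean of the item's known ratings.
# Even this scores a decent MAE, which is why MAE alone is a weak
# evaluation instrument.

def item_mean_predictor(ratings, item):
    """Predict the average of the item's known ratings."""
    vals = [r for (_, i), r in ratings.items() if i == item]
    return sum(vals) / len(vals)

def mae(predictions, truth):
    """MeanAbsoluteError over the test pairs."""
    errors = [abs(predictions[k] - truth[k]) for k in truth]
    return sum(errors) / len(errors)

# (user, item) -> rating; hypothetical ratings on a 1-5 scale
train = {("u1", "matrix"): 5, ("u2", "matrix"): 4, ("u3", "matrix"): 5}
test = {("u4", "matrix"): 4}

preds = {k: item_mean_predictor(train, k[1]) for k in test}
print(round(mae(preds, test), 2))  # a small error, with zero intelligence
```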

Better evaluation techniques are:

* compute the error (MAE or MeanSquaredError?) only on items with more than x ratings, with fewer than x ratings, ...

On items with few ratings (the vast majority in real-world datasets), trust-aware techniques are expected to work best!

* compute the error (MAE or MeanSquaredError?) only on items with variance greater than y.

These are the ControversialItems, and we could argue that they are the really interesting ones to predict. While every algorithm will be able to tell you "you will rate Matrix 4.9" and make a small error (on average), it is much more difficult to predict the rating of a very controversial movie. (xxx make an example! xxx On the epinions.com dataset, "Backstreet Boys [ECD] - Backstreet Boys/musc_mu-242546" has variance > 2.85 && ratings > 25: #ratings: 33, mean: 2.84848484848485, variance: 2.88257575757576, deviation: 1.69781499509686.)

* compute the error (MAE or MeanSquaredError?) only on users with more than x ratings, with fewer than x ratings, ...

This was already done in PaperCoopis2004.

We call these portions "views".
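A "view" is just a filter on the test set before computing the error. A sketch of the controversial-items view (the tiny dataset, thresholds, and helper names are assumptions for illustration):

```python
# Compute MAE only on a "view": here, items whose ratings have
# variance above a threshold (the ControversialItems view).
from statistics import mean, pvariance

# (user, item) -> rating; tiny hypothetical dataset
ratings = {
    ("u1", "i1"): 5, ("u2", "i1"): 1, ("u3", "i1"): 5, ("u4", "i1"): 1,
    ("u1", "i2"): 4, ("u2", "i2"): 4,
}

def item_ratings(item):
    return [r for (_, i), r in ratings.items() if i == item]

def view_controversial_items(test, min_variance=2.0):
    """Keep only test pairs whose item has rating variance > min_variance."""
    return {k: v for k, v in test.items()
            if pvariance(item_ratings(k[1])) > min_variance}

def mae(preds, truth):
    return mean(abs(preds[k] - truth[k]) for k in truth)

test = {("u5", "i1"): 5, ("u5", "i2"): 4}
preds = {("u5", "i1"): 3.0, ("u5", "i2"): 4.0}   # some algorithm's output

view = view_controversial_items(test)
print(sorted(k[1] for k in view))            # only the controversial item
print(mae({k: preds[k] for k in view}, view))
```

The other views (items or users with more/fewer than x ratings) are the same pattern with a different filter predicate.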

---

Good point made by paoloavesani:

Change perspective! We have always looked at trust as a means to improve the use of ratings (CF). We can do the opposite: look at ratings as a way to improve trust metrics (moleskiing can be a testbed and brain arena for this!). CHANGE PERSPECTIVE (or should I say "think different" ;-)

About the thesis "propose a better local metric":

15% of the posts (items) on slashdot get both positive and negative comments --> they are controversial! How many users on epinions get both trust and distrust statements (i.e. are controversial)?
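Counting those users is a one-pass scan over the statements. A sketch, where the `(source, target, +1/-1)` tuple format is an assumption about how the epinions statements could be loaded:

```python
# Find users who receive BOTH trust (+1) and distrust (-1) statements:
# the "controversial users".
from collections import defaultdict

# hypothetical statements: (truster, trusted, value)
statements = [
    ("a", "angel", +1), ("b", "angel", +1), ("c", "angel", +1),
    ("a", "spammer", -1), ("b", "spammer", -1),
    ("a", "contro", +1), ("b", "contro", -1), ("c", "contro", +1),
]

received = defaultdict(set)          # target -> set of statement values
for source, target, value in statements:
    received[target].add(value)

controversial = [u for u, vals in received.items() if vals == {+1, -1}]
print(controversial)
```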

Global trust metrics (PageRank, slashdot, ebay, ...) perform well when people are standardized: if every one of us likes user Angel and dislikes user Spammer, then the goal is just to average the ratings and spot that Angel is trusted by everyone and Spammer by no one! But if ControversialUser is trusted by 1000 users and distrusted by another 1000, then a global trust metric is probably ineffective! In this case a local trust metric can be better, but we must prove it!!! IT CAN BE a ubercool paper!!

How to prove it? Find the controversial users in the epinions dataset (trust + distrust) and show that PageRank produces a higher error than local trust metrics (even a simple one) on their predicted trust values! See TODO below.
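The intuition behind the claim can be sketched in a few lines. All names and numbers here are hypothetical; the "local metric" is deliberately the simplest one imaginable (only count statements from people in your own web of trust):

```python
# For a controversial user, a global metric (average of ALL incoming
# statements) is near-useless, while a local metric (statements from
# people *you* trust) can still give a sharp answer.

# incoming statements about "contro": truster -> +1 (trust) / -1 (distrust)
incoming = {"a": +1, "b": +1, "c": -1, "d": -1}

# global metric: one value for everybody
global_score = sum(incoming.values()) / len(incoming)

# local metric from the point of view of user "me", who trusts a and b
my_web_of_trust = {"a", "b"}
local_votes = [v for t, v in incoming.items() if t in my_web_of_trust]
local_score = sum(local_votes) / len(local_votes)

print(global_score)  # 0.0 -> the global metric cannot decide
print(local_score)   # 1.0 -> from "me"'s perspective, contro is trusted
```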

An interesting collateral point is the following: PageRank uses a link as a "trust statement", but actually a link can be a trust or a distrust link (example: "I think this guy is an idiot!"; in this case the link is not a vote for but a vote against...). We can run PageRank on epinions.com considering only trust statements, and then considering trust + distrust. (THINK BETTER!!!)
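The effect is easy to demonstrate on a toy graph. A sketch (plain power-iteration PageRank, hypothetical three-node graph), run once on trust links only and once with the distrust links counted as if they were ordinary positive links:

```python
# When PageRank treats a "vote against" link as a vote for,
# the target of the distrust links gets its rank inflated.

def pagerank(edges, nodes, d=0.85, iters=50):
    """Plain power-iteration PageRank; dangling nodes spread evenly."""
    rank = {n: 1.0 / len(nodes) for n in nodes}
    out = {n: [t for s, t in edges if s == n] for n in nodes}
    for _ in range(iters):
        new = {n: (1 - d) / len(nodes) for n in nodes}
        for n in nodes:
            targets = out[n] or list(nodes)   # dangling: spread evenly
            for t in targets:
                new[t] += d * rank[n] / len(targets)
        rank = new
    return rank

nodes = {"a", "b", "idiot"}
trust = [("a", "b"), ("b", "a")]
distrust = [("a", "idiot"), ("b", "idiot")]   # "this guy is an idiot!"

r_trust_only = pagerank(trust, nodes)
r_all_links = pagerank(trust + distrust, nodes)  # distrust counted as trust
print(r_all_links["idiot"] > r_trust_only["idiot"])  # True: rank inflated
```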

Similar points can be made for bibliometrics: in citeseer a citation is not always a "vote for"; sometimes it is a vote against.

---

marco_gori will be here in Trento on Thursday 29 July. I could speak with him (around 17.00) about:

* search engines and algorithms that take into account (as input) both positive and negative links. Are there papers? Running demos?

Has anyone proposed adding "negative links" to the PageRank formulation?

* personalized search engines that take into account personal opinions (and links) [if I link www.linux.org, when I ask for "operating system", it is probably a bad idea to return "microsoft.com"]: are there papers? Running demos?

* are you aware of "local" PageRank uses? Such as in http://moloko.itc.it/paoloblog/archives/2003/11/14/trust_management_for_the_semantic_web.html

TODO:

* extract the controversial users from the epinions.com dataset (how many are there?)

* recompute the errors based on the number and variance of the ratings on items (it should be enough to change a "Results..." class and run it against the already computed result matrices)

* think more about the thesis that PageRank considers links as "votes for" while sometimes they are "votes against": how much can this reduce performance? Does it depend on the number of "negative" links? What can we prove on the epinions.com dataset? What about the citeseer dataset?

* speak with marco gori on the afternoon of 29 July