Examples
From TrustLet, a free, collaborative project for collecting and analyzing information about trust metrics.
When using trustlet in an interactive way (e.g. in IPython) it's convenient (but not very clean) to import everything into the main namespace:
from trustlet import *
Loading datasets
It's easy to start with a dataset and get some information about it:
# create dummy dataset
D = DummyNetwork()
D.info()  # show information
For several advogato-style websites trustlet will download the most recent dataset automatically:
# create Kaitiaki dataset
K = KaitiakiNetwork()
K.info()  # show information

# create SqueakFoundation dataset
S = SqueakFoundationNetwork()
S.info()

# create Advogato dataset
A = AdvogatoNetwork()
A.info()
You can also specify a date in the past:
# create the Advogato dataset as it was on a certain date.
# The .dot file is taken from http://www.trustlet.org/datasets/advogato/,
# looking for the correctly dated file.
AD = AdvogatoNetwork(date="2007-10-13")
AD.info()
Generate graphics for an arbitrary number of trust metrics on controversial nodes
To generate these graphics you need to compute, for each controversiality level, the type of error that you want to plot. The first step is to create the network:
IdentifierNetwork = AdvogatoNetwork(
    date="year-month-day",
    base_path="your/dataset/directory"  # only needed if the dataset is not in your home directory
)
Next, create the trust metrics that you want to plot.
TM = TrustMetric( IdentifierNetwork, trust_metric_function )
# and/or
TM1 = PageRankTM( IdentifierNetwork )
# and/or ...
Now you should have a certain number of trust metrics (we call them tm1, tm2, ..., tmN). To plot them we must calculate (or read, if it was already calculated) the PredGraph for each one: a network whose edges carry both the original trust and the predicted trust. On this class we then invoke the graphcontroversiality method. (For the documentation of the method, open python, import trustlet and run "print PredGraph.graphcontroversiality.__doc__".)
PredictedNetwork1 = PredGraph( tm1 )
PredictedNetwork2 = PredGraph( tm2 )
. . .
PredictedNetworkN = PredGraph( tmN )
Now we are ready to calculate the points to be plotted. The graphcontroversiality method returns them as a list of tuples:
ListOfPoints1 = PredictedNetwork1.graphcontroversiality( maxcontroversiality, step, typeOfError [, NumberOfYourProcessor] )
ListOfPoints2 = PredictedNetwork2.graphcontroversiality( maxcontroversiality, step, typeOfError [, NumberOfYourProcessor] )
. . .
ListOfPointsN = PredictedNetworkN.graphcontroversiality( maxcontroversiality, step, typeOfError [, NumberOfYourProcessor] )
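To make the shape of such a list of points concrete, here is a minimal sketch in plain Python that does not use trustlet. It assumes that a node's controversiality is the standard deviation of the trust values it receives, and that each point pairs a controversiality threshold with the mean absolute error over the edges pointing to nodes at or above that threshold. Both are assumptions made for illustration; the function names and the toy data are hypothetical and this is not the actual graphcontroversiality implementation.

```python
from statistics import pstdev

# Toy data (hypothetical): (source, target) -> (original trust, predicted trust).
edges = {
    ("a", "x"): (1.0, 0.8),
    ("b", "x"): (0.0, 0.6),
    ("c", "x"): (1.0, 0.8),
    ("a", "y"): (0.8, 0.8),
    ("b", "y"): (0.8, 0.6),
}


def controversiality(edges):
    """Assumed definition: standard deviation of the trust each node receives."""
    received = {}
    for (_, target), (orig, _) in edges.items():
        received.setdefault(target, []).append(orig)
    return {node: pstdev(values) for node, values in received.items()}


def controversiality_points(edges, max_controversiality, step):
    """Return [(threshold, mean absolute error), ...] for each threshold
    0, step, 2*step, ... up to max_controversiality."""
    contr = controversiality(edges)
    n_steps = int(round(max_controversiality / step))
    points = []
    for i in range(n_steps + 1):
        threshold = i * step
        # Keep only edges whose target is at least this controversial.
        errors = [
            abs(orig - pred)
            for (_, target), (orig, pred) in edges.items()
            if contr[target] >= threshold
        ]
        if errors:  # skip thresholds that no edge satisfies
            points.append((threshold, sum(errors) / len(errors)))
    return points


points = controversiality_points(edges, max_controversiality=0.4, step=0.2)
for threshold, error in points:
    print(threshold, error)
```

In this toy data node "x" receives conflicting votes, so the error measured only on controversial nodes is higher than the error over the whole network.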
Now (finally ;-) ) we have all the data needed to plot the graph. We use the prettyplot function. (See its documentation by running "print prettyplot.__doc__".)
prettyplot( [ListOfPoints1, ListOfPoints2, ..., ListOfPointsN], "path/to/img/to/save.png",
    legend=('Short comment List1', 'Short comment List2', ..., 'Short comment ListN')[, ...other parameters] )
If you have done everything correctly, this command shows a graph of the data you selected and saves it to the path you specified.
Evaluating trust metrics
# load the Advogato network dataset
A = AdvogatoNetwork()
# create a trust metric based on MoleTrust with horizon 3 and threshold 0.5
moletrust3 = TrustMetric(A, moletrust_generator(horizon=3))
# use the trust metric to predict all the present trust edges (leave-one-out technique)
pred_graph = PredGraph(moletrust3)
# compute some error measures on the trust values predicted for trust edges
pred_graph.abs_error()
pred_graph.coverage()
# TODO: write something about generating a pred_graph only on edges satisfying some conditions
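The leave-one-out idea and the two measures can be illustrated without trustlet. The sketch below is only an approximation of the concept: the predictor is a deliberately simple stand-in (the average trust that other users placed in the target), not MoleTrust, and the function names merely mirror abs_error and coverage; they are not trustlet's implementation.

```python
# Toy trust network (hypothetical): (source, target) -> trust value in [0, 1].
trust = {
    ("a", "b"): 1.0,
    ("c", "b"): 0.6,
    ("a", "c"): 0.8,
    ("b", "d"): 0.4,
}


def predict(network, source, target):
    """Stand-in metric: average trust that other users placed in target.

    Returns None when no prediction is possible."""
    values = [v for (s, t), v in network.items() if t == target and s != source]
    return sum(values) / len(values) if values else None


def leave_one_out(network):
    """Hide each edge in turn and predict it from the remaining edges."""
    predictions = {}
    for edge, value in network.items():
        rest = {e: v for e, v in network.items() if e != edge}
        predictions[edge] = (value, predict(rest, *edge))
    return predictions


def abs_error(predictions):
    """Mean absolute error over the edges that received a prediction."""
    errors = [abs(orig - pred) for orig, pred in predictions.values() if pred is not None]
    return sum(errors) / len(errors) if errors else None


def coverage(predictions):
    """Fraction of edges for which a prediction could be made."""
    predicted = sum(1 for _, pred in predictions.values() if pred is not None)
    return predicted / len(predictions)


predictions = leave_one_out(trust)
print(abs_error(predictions), coverage(predictions))
```

Reporting coverage next to the error matters because a metric can look accurate simply by declining to predict the difficult edges.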
Network Evolution
The package trustlet.netevolution provides some tools to compare different snapshots of the same network in order to study its evolution.
USAGE: ./netevolution.py startdate enddate dataset_path save_path [debug file] [-s step]
- startdate and enddate are dates such as 2008-05-12.
- dataset_path is the folder in which all the datasets are stored (e.g. /home/.../datasets/AdvogatoNetwork/, because it contains all the datasets).
- save_path is the path where the .png image of the graph and the .gnuplot text file used to create it are saved.
- debug file is the file in which debug information is stored. Passing this parameter automatically enables debug mode and stores all the information in this file. If the file does not exist, it will be created.
- -s step specifies the distance between the networks that are calculated and plotted. To calculate a network only every 10 days, specify -s 10.
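How startdate, enddate and the step interact can be sketched in a few lines of plain Python. This is an illustration only: the function name snapshot_dates is hypothetical, and it assumes that enddate is inclusive and that the step defaults to one day, neither of which is confirmed by the usage text.

```python
from datetime import date, timedelta


def snapshot_dates(startdate, enddate, step=1):
    """List the dates (as year-month-day strings) visited from startdate
    to enddate, moving forward step days at a time."""
    if step < 1:
        raise ValueError("step must be at least 1 day")
    current = date.fromisoformat(startdate)
    end = date.fromisoformat(enddate)
    dates = []
    while current <= end:  # assumption: enddate is inclusive
        dates.append(current.isoformat())
        current += timedelta(days=step)
    return dates


# Equivalent in spirit to passing "-s 10" on the command line.
dates = snapshot_dates("2008-05-12", "2008-06-12", step=10)
print(dates)
```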
Some functions calculated by netevolution
- edgespernode: shows the average number of votes for each user
- trustAverage: shows the average trust of the network
- usersgrown: shows the growth of the network
. . .
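As a rough illustration of what these three quantities measure, the following sketch computes them on two toy snapshots in plain Python. The data and helper names are hypothetical, and the exact definitions used by netevolution may differ (for example, which users count as part of the network).

```python
# Toy snapshots (hypothetical): date -> {(source, target): trust value}.
snapshots = {
    "2008-05-12": {("a", "b"): 1.0, ("b", "c"): 0.6},
    "2008-05-22": {("a", "b"): 1.0, ("b", "c"): 0.6, ("c", "d"): 0.8, ("d", "a"): 0.6},
}


def users(edges):
    """All users that appear as source or target of at least one edge."""
    return {user for edge in edges for user in edge}


def edges_per_node(edges):
    """Average number of votes (edges) per user."""
    return len(edges) / len(users(edges))


def trust_average(edges):
    """Average trust value over all edges."""
    return sum(edges.values()) / len(edges)


def users_grown(snapshots):
    """Number of users in each snapshot, in date order."""
    return [(day, len(users(edges))) for day, edges in sorted(snapshots.items())]


for day, edges in sorted(snapshots.items()):
    print(day, edges_per_node(edges), trust_average(edges))
print(users_grown(snapshots))
```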
wikixml2graph.py
Generates .c2 files representing the Network instance (plus lists of special users).
USAGE:
./wikixml2graph.py xml_file [--history|--current] lang date [base_path|real<real_path>] [--input-size bytes]
Default base_path = home dir
If base_path starts with 'real', the graph will be saved in real_path
If lang and date are both '-', wikixml2dot will read them from the file name
If xml_file is '-', it will use stdin
input-size: useful if xml_file is stdin
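Reading lang and date from the file name can be pictured with the sketch below. It assumes dump names of the form itwiki-20080626-stub-meta-history.xml, as in the examples on this page; the function name and the regular expression are hypothetical, and the script's own parsing may differ.

```python
import os
import re

# Assumed naming scheme: <lang>wiki-<YYYYMMDD>-<dump type>.xml
_DUMP_NAME = re.compile(r"^(?P<lang>[a-z_]+)wiki-(?P<date>\d{8})-")


def lang_and_date(xml_path):
    """Extract (lang, 'YYYY-MM-DD') from a Wikipedia dump file name."""
    name = os.path.basename(xml_path)
    match = _DUMP_NAME.match(name)
    if match is None:
        raise ValueError("cannot read lang and date from %r" % name)
    raw = match.group("date")
    return match.group("lang"), "%s-%s-%s" % (raw[:4], raw[4:6], raw[6:])


print(lang_and_date("/mnt/data/datasetwiki/itwiki-20080626-stub-meta-history.xml"))
```

When the XML arrives on stdin there is no file name to inspect, which is why lang and date have to be given explicitly in that case.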
Download the compressed datasets from http://download.wikipedia.org/backup-index.html and decompress them. We need pages-meta-current.xml.bz2 and stub-meta-history.xml.gz.
$ ./wikixml2graph.py /mnt/data/datasetwiki/itwiki-20080626-stub-meta-history.xml - - ~ --history
$ ./wikixml2graph.py /mnt/data/datasetwiki/itwiki-20080626-pages-meta-current.xml - - ~ --current
Using xargs we can create a lot of graphs with one command:
$ ls /mnt/data/datasetwiki/*current.xml | xargs -I f ./wikixml2graph.py f - - ~ --current
It's also possible to create graphs without decompressing the datasets:
$ zcat /mnt/data/datasetwiki/itwiki-20080626-stub-meta-history.xml.gz | ./wikixml2graph.py - it 2008-06-26 ~ --history
$ bzcat /mnt/data/datasetwiki/itwiki-20080626-pages-meta-current.xml.bz2 | ./wikixml2graph.py - it 2008-06-26 ~ --current
Combining xargs with this approach is more difficult.

