SourceForge networks

OSSmole collects data about SourceForge contributions.

In particular, this page describes which data are collected (every 2 months since 2003!).

The most interesting data from a network point of view are those recording which developers contribute to which projects, and in which role. It would be possible, for instance, to build a network of developers who work on the same projects, in which the weight of a link is the normalized number of projects the two developers work on together.
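A minimal sketch of such a network follows. The membership pairs and the function names are hypothetical and do not reflect the actual OSSmole file layout. The normalization is left open above, so the sketch assumes one possible choice: the number of shared projects divided by the number of distinct projects the two developers have between them (a Jaccard index).

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (developer, project) pairs; not real OSSmole data.
memberships = [
    ("alice", "proj_a"),
    ("alice", "proj_b"),
    ("bob", "proj_a"),
    ("bob", "proj_b"),
    ("bob", "proj_c"),
    ("carol", "proj_c"),
]


def developer_projects(pairs):
    """Map each developer to the set of projects they belong to."""
    projects = defaultdict(set)
    for dev, proj in pairs:
        projects[dev].add(proj)
    return dict(projects)


def shared_project_network(pairs):
    """Return {(dev_a, dev_b): weight} for developers sharing a project.

    Assumed normalization: shared projects / union of both project sets.
    """
    projects = developer_projects(pairs)
    edges = {}
    for a, b in combinations(sorted(projects), 2):
        shared = projects[a] & projects[b]
        if shared:
            union = projects[a] | projects[b]
            edges[(a, b)] = len(shared) / len(union)
    return edges


network = shared_project_network(memberships)
for (a, b), weight in sorted(network.items()):
    print(f"{a} -- {b}: {weight:.2f}")
```

With this choice of normalization every link has a single weight, so the resulting network is undirected.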

Note: the network is probably symmetric. The exception is if we use as weight the number of common projects divided by each developer's total number of projects, in which case the weight depends on the direction of the link, but the network will be almost symmetric anyway. We could also use the different roles as an indication of involvement. Unfortunately, it seems OSSmole does not collect [Networks of CVS commits|commits into the code repository].
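To illustrate where the asymmetry comes from, here is a small sketch that assumes the "total number of projects" is that of the developer at the source end of the link. The data and the function name `directed_weights` are hypothetical.

```python
from collections import defaultdict

# Hypothetical (developer, project) pairs; not real OSSmole data.
memberships = [
    ("alice", "proj_a"),
    ("alice", "proj_b"),
    ("bob", "proj_a"),
    ("bob", "proj_b"),
    ("bob", "proj_c"),
    ("carol", "proj_c"),
]


def directed_weights(pairs):
    """Return {(source, target): shared projects / projects of source}."""
    projects = defaultdict(set)
    for dev, proj in pairs:
        projects[dev].add(proj)

    weights = {}
    for source in projects:
        for target in projects:
            if source == target:
                continue
            shared = len(projects[source] & projects[target])
            if shared:
                # Normalize by the source developer's own project count,
                # so w(a, b) and w(b, a) can differ.
                weights[(source, target)] = shared / len(projects[source])
    return weights


weights = directed_weights(memberships)
print(weights[("alice", "bob")])  # share of alice's projects that include bob
print(weights[("bob", "alice")])  # share of bob's projects that include alice
```

In this toy data all of alice's projects include bob, but only two of bob's three projects include alice, so the two directions get different weights. How close to symmetric the real network is would depend on how unevenly project counts are distributed among developers.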

Collected Data
Project Items
 * Project names (long name and short unique 'unixname')
 * Project Descriptions
 * Project URLs (URL on SourceForge and 'real' URL)
 * Project registration date
 * Project intended audience(s)
 * Project license(s)
 * Project programming language(s)
 * Project database environment(s)
 * Project operating system(s)
 * Project donor(s)
 * Project status (alpha, beta, mature, etc)
 * Project topic(s)
 * Project user interface(s)

Developer items
Note about developer items: we only have information on SourceForge users (developers) associated with a project. If someone is signed up as a SourceForge user but is not associated with any project, we will not know about that person. Similarly, if a person is on a SourceForge project at one scrape (say April), then leaves the project before the next scrape (say June) and does not join another project, that person will no longer appear in our data files for June, even though they were in our data files for April.
 * Project developers (username, real name, SourceForge email address)
 * Developer role(s) on project, including whether an administrator or not
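
Because developers without any project drop out of the data files, comparing two consecutive scrapes shows who appeared and who disappeared. The sketch below assumes each scrape has already been loaded into a mapping from username to project set. The data and the function name `compare_scrapes` are hypothetical.

```python
# Hypothetical scrapes: username -> set of project unixnames.
april = {
    "alice": {"proj_a"},
    "bob": {"proj_a", "proj_b"},
}
june = {
    "bob": {"proj_b"},
    "carol": {"proj_a"},
}


def compare_scrapes(earlier, later):
    """Classify developers by their presence in two scrapes."""
    before, after = set(earlier), set(later)
    return {
        # In the earlier scrape only: no longer on any project.
        "disappeared": sorted(before - after),
        # In the later scrape only: newly associated with a project.
        "appeared": sorted(after - before),
        # Present in both scrapes (possibly on different projects).
        "retained": sorted(before & after),
    }


changes = compare_scrapes(april, june)
for label, developers in changes.items():
    print(label, developers)
```

A developer listed as "disappeared" has only left all projects as far as the data can tell; the account itself may still exist on SourceForge.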

Statistics items
 * Project downloads (sum of project downloads over 60-day window)
 * Project ranks (project rank averaged over 60-day window)
 * Project tracker sums (sums of tracker opens and closes over 60-day window)