Social Network dataset
This webpage contains the social network datasets gathered by Golnoosh Farnadi. If you have any questions, feel free to contact: golnoosh.farnadi[at]ugent[dot]be
Netlog datasets
- Netlog is a social networking site that started in 2005 and currently has more than 90 million users worldwide. Here you can find information about two Netlog datasets used in scientific experiments. The links for downloading the datasets are provided at the end of each section.
(1) Netlog Dataset (Soft Quantification Analysis)
- The Netlog dataset used for the experiments in the paper Soft Quantification in Statistical Relational Learning was collected by our crawler from the Netlog web site in July 2014. By applying snowball sampling, starting from one user, we crawled the profiles of 3015 users, called the core users henceforth. Out of these 3015 profiles, 765 users (25%) have private profiles (the private core users) and 2241 (75%) have public profiles (the public core users). Next, we crawled the user profiles of all the friends of the public core users, resulting in 171,439 additional profiles, referred to as the background users. Note that we could not do the same for the private core users, as their friend lists are not publicly accessible.
We ended up with a sample network including 174,454 users.
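The two-stage collection described above can be sketched as follows. This is a minimal illustration and not the authors' crawler: it assumes a breadth-first reading of snowball sampling, and the function name `crawl_sample`, the in-memory `friends` and `is_public` dictionaries, and the toy user ids are all hypothetical stand-ins for the live site.

```python
from collections import deque


def crawl_sample(friends, is_public, seed, max_core):
    """Collect core users by snowball sampling, then add background users.

    friends:   dict mapping user id -> list of friend ids (hypothetical stand-in for the site)
    is_public: dict mapping user id -> True if the profile and friend list are visible
    """
    core = []
    seen = {seed}
    queue = deque([seed])
    # Stage 1: expand from the seed until max_core profiles have been collected.
    while queue and len(core) < max_core:
        user = queue.popleft()
        core.append(user)
        if not is_public[user]:
            continue  # the friend list of a private profile cannot be read
        for friend in friends[user]:
            if friend not in seen:
                seen.add(friend)
                queue.append(friend)
    core_set = set(core)
    public_core = [u for u in core if is_public[u]]
    private_core = [u for u in core if not is_public[u]]
    # Stage 2: background users are friends of public core users outside the core.
    background = set()
    for user in public_core:
        for friend in friends[user]:
            if friend not in core_set:
                background.add(friend)
    return public_core, private_core, background


# Toy network (made up for illustration): user "c" has a private profile.
friends = {
    "a": ["b", "c"],
    "b": ["a", "d"],
    "c": ["a", "e"],
    "d": ["b"],
    "e": ["c", "f"],
    "f": ["e"],
}
is_public = {"a": True, "b": True, "c": False, "d": True, "e": True, "f": True}

public_core, private_core, background = crawl_sample(friends, is_public, "a", max_core=3)
print(public_core, private_core, sorted(background))
```

In this toy run, user "e" never enters the sample because it is reachable only through the private profile "c", which mirrors the limitation noted above for private core users.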
You can find the detailed statistics of our sample data from Netlog in the following table:
General statistics on the sample Netlog data set
- Number of nodes (users): 174,454
- Number of core users: 3,015
- Public core: 2,241
- Private core: 765
- Background: 171,439
Gender
- # of females: 82,387 (47%)
- # of males: 92,067 (53%)
Age
- # of young users (age <= 25 years): 93,090 (53%)
- # of non-young users (age > 25 years): 81,364 (47%)
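The figures in the table can be cross-checked with a few lines of arithmetic. The sketch below only restates numbers from the table; the variable names and the `pct` helper are illustrative. The core and background counts add up to the total, and the rounded percentages match. The public and private core counts add up to 3,006, which is 9 fewer than the 3,015 core users; the page does not say how the remaining profiles are classified.

```python
# Counts copied from the table above.
total_users = 174_454
core_users = 3_015
public_core = 2_241
private_core = 765
background = 171_439
females, males = 82_387, 92_067
young, non_young = 93_090, 81_364

# Each partition of the sample should add up to the total number of users.
assert core_users + background == total_users
assert females + males == total_users
assert young + non_young == total_users


def pct(part, whole):
    """Share of `part` in `whole`, rounded to a whole percent."""
    return round(100 * part / whole)


print(pct(females, total_users), pct(males, total_users))   # 47 53
print(pct(young, total_users), pct(non_young, total_users))  # 53 47
print(core_users - (public_core + private_core))             # 9
```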
Download
- The Netlog social graph and user profiles can be downloaded as a zip file below:
File: MLJ Netlog dataset
Description: Netlog profiles (anonymized user profiles including users' age and gender) and social graph
Citation
- If you use this database, please cite the following publication:
Soft Quantification in Statistical Relational Learning
G. Farnadi, S. H. Bach, M.-F. Moens, L. Getoor, and M. De Cock
Accepted at Machine Learning Journal (MLJ), 2017
@article{farnadi2017MLJ,
title={Soft Quantification in Statistical Relational Learning},
author={Farnadi, Golnoosh and Bach, Stephen H. and Moens, Marie-Francine and Getoor, Lise and De Cock, Martine},
journal={Machine Learning Journal (MLJ)},
year={2017},
publisher={Springer}
}
(2) Netlog Dataset (Big Data Analysis)
- The Netlog dataset used for the experiments in the paper Scalable Adaptive Label Propagation in Grappa was collected by our crawler from the Netlog web site in May 2014. We crawled the network from a random user in a breadth-first way, following the friendship links, and collected the publicly available information of the users. The Netlog data has been anonymized by replacing each user's id with a new id. Also, while the social graph and profile information from this dataset have been provided, the interpretation of users has been obscured.
Using the crawler, we obtained 3,359,775 users in 1,059 connected components. We chose the giant connected component of this set of graphs, which includes 3,351,975 users and 8,029,423 friendship links. You can find the detailed statistics of our sample data from Netlog in the following table:
General statistics on the sample Netlog data set
- Number of nodes (users): 3,351,975
- Number of edges (friendship links): 8,029,423
- Degree distribution power law exponent: 2.25
- Clustering coefficient: 0.11
Gender
- # of females: 1,585,080 (47.3%)
- # of males: 1,766,895 (52.7%)
Age
- Min age: 1
- Max age: 114
- Mean age: 29.4
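Selecting the giant connected component, as described above, amounts to finding all connected components and keeping the largest one. The sketch below shows one way to do this on a toy edge list using only the Python standard library. The function names and the toy edges are illustrative assumptions, not part of the dataset or of the authors' pipeline. Users without any friendship link do not appear in an edge list and are therefore not counted here.

```python
from collections import defaultdict, deque


def connected_components(edges):
    """Return a list of node sets, one per connected component of an undirected graph."""
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)
    seen = set()
    components = []
    for start in adjacency:
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        seen.add(start)
        component = {start}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        components.append(component)
    return components


def giant_component(edges):
    """Return the largest component and the edges that lie inside it."""
    giant = max(connected_components(edges), key=len)
    kept_edges = [(u, v) for u, v in edges if u in giant and v in giant]
    return giant, kept_edges


# Toy friendship links (made up): three components of sizes 4, 2 and 2.
toy_edges = [(1, 2), (2, 3), (3, 1), (3, 4), (5, 6), (7, 8)]
giant, kept = giant_component(toy_edges)
print(sorted(giant), len(kept))  # [1, 2, 3, 4] 4
```

Applied to the figures above, this step keeps 3,351,975 of the 3,359,775 crawled users, so 7,800 users in the remaining 1,058 components are left out.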
Download
- The Netlog social graph and user profiles can be downloaded as a zip file below:
File: Netlog GraphML [zip]
Description: Netlog profiles (anonymized user profiles including users' age and gender) and social graph
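Since the file is distributed as GraphML, a small reader can be sketched with the Python standard library. The example below parses an embedded toy document and not the real file. The attribute keys `age` and `gender`, the node ids, and the function name `parse_graphml` are assumptions made for illustration; the actual key names in the download may differ and should be checked in the file's `<key>` declarations.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Standard GraphML namespace.
NS = {"g": "http://graphml.graphdrawing.org/xmlns"}

# Toy document; the keys "age" and "gender" are assumed, not taken from the real file.
SAMPLE_GRAPHML = """<graphml xmlns="http://graphml.graphdrawing.org/xmlns">
  <key id="age" for="node" attr.name="age" attr.type="int"/>
  <key id="gender" for="node" attr.name="gender" attr.type="string"/>
  <graph id="G" edgedefault="undirected">
    <node id="u1"><data key="age">23</data><data key="gender">F</data></node>
    <node id="u2"><data key="age">31</data><data key="gender">M</data></node>
    <node id="u3"><data key="age">27</data><data key="gender">F</data></node>
    <edge source="u1" target="u2"/>
    <edge source="u2" target="u3"/>
  </graph>
</graphml>
"""


def parse_graphml(text):
    """Return (profiles, adjacency) from a GraphML string."""
    root = ET.fromstring(text)
    graph = root.find("g:graph", NS)
    profiles = {}
    for node in graph.findall("g:node", NS):
        # Attribute values are kept as strings, exactly as stored in the file.
        profiles[node.get("id")] = {
            data.get("key"): data.text for data in node.findall("g:data", NS)
        }
    adjacency = defaultdict(set)
    for edge in graph.findall("g:edge", NS):
        u, v = edge.get("source"), edge.get("target")
        adjacency[u].add(v)
        adjacency[v].add(u)
    return profiles, adjacency


profiles, adjacency = parse_graphml(SAMPLE_GRAPHML)
degrees = {user: len(adjacency[user]) for user in profiles}
print(degrees)  # {'u1': 1, 'u2': 2, 'u3': 1}
```

For a graph with millions of nodes, loading the whole document at once may use a lot of memory; an incremental parser such as `xml.etree.ElementTree.iterparse` is likely a better fit.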
Citation
- If you use this database, please cite the following publication:
Scalable Adaptive Label Propagation in Grappa
G. Farnadi, Z. Mahdavifar, I. Keller, J. Nelson, A. Teredesai, M.-F. Moens, M. De Cock
In: Proceedings of IEEE-BigData 2015 (IEEE International Conference on Big Data), 2015
@inproceedings{farnadi2015IEEE-BigData,
title={Scalable Adaptive Label Propagation in Grappa},
author={Farnadi, Golnoosh and Mahdavifar, Zeinab and Keller, Ivan and Nelson, Jacob and Teredesai, Ankur and Moens, Marie-Francine and De Cock, Martine},
booktitle={IEEE International Conference on Big Data (IEEE-BigData 2015)},
year={2015},
organization={IEEE}
}