Dataset about the accesses to honey accounts in the Dark Web and the Surface Web

Authors: Dario Bermudez Villalva, Jeremiah Onaolapo, Gianluca Stringhini, and Mirco Musolesi (2018).

This repository contains the data we collected by leaking the credentials of honey accounts, as presented in the paper "Under and Over the Surface: A Comparison Of The Use Of Leaked Account Credentials in the Dark and Surface Web".

Below is a description of the files in the repository. As described in the paper, we collected data about accesses to Gmail honey accounts, including: cookie identifier, access timestamps, leak outlet, hashed IP address, operating system, browser, location, and duration.

Files

  • raw_data/attributionS.json - This file stores a dictionary of honey accounts with the accesses recorded from saved pages and the actions attributed to those accounts in the Surface Web.

  • raw_data/attributionD.json - This file stores a dictionary of honey accounts with the accesses recorded from saved pages and the actions attributed to those accounts in the Dark Web.
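As a starting point for exploring the two raw files, the sketch below loads one of them and lists its top-level keys. It assumes only what is stated above, namely that each file holds a JSON dictionary. The nested layout of each entry is not documented here, so the code does not rely on it. The function names `load_attribution` and `summarize` are illustrative and not part of the repository.

```python
import json
from pathlib import Path


def load_attribution(path):
    """Load one attribution file and return its top-level dictionary."""
    with Path(path).open(encoding="utf-8") as handle:
        data = json.load(handle)
    # Assumption: the top level is a JSON object (a dictionary), as the
    # file descriptions state. Anything else is reported as an error.
    if not isinstance(data, dict):
        raise ValueError(
            f"expected a JSON object in {path}, got {type(data).__name__}"
        )
    return data


def summarize(data, limit=5):
    """Return the number of top-level keys and the first few of them."""
    keys = list(data)
    return len(keys), keys[:limit]


if __name__ == "__main__":
    # Paths as listed above; files that are not present are skipped.
    for name in ("raw_data/attributionS.json", "raw_data/attributionD.json"):
        if Path(name).exists():
            count, sample = summarize(load_attribution(name))
            print(f"{name}: {count} top-level entries, e.g. {sample}")
```

Printing a few keys first is a cheap way to check what the dictionary is keyed by before writing any analysis code against it.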

CSV File

A spreadsheet that summarizes the information extracted from the files mentioned previously can be found here:

https://bitbucket.org/gianluca_students/surface_vs_dark_credentials_dataset/downloads/dataset.csv

It combines the information about the accesses to the honey accounts in both environments (the Surface Web and the Dark Web).
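A minimal sketch for inspecting the CSV once it has been downloaded is shown below. It assumes the file was saved locally as `dataset.csv`, has a header row, and is UTF-8 encoded. The column names are not listed in this description, so the code reads them from the header instead of hard-coding them. The helpers `read_rows` and `count_by` are hypothetical names used only for illustration.

```python
import csv
from collections import Counter
from pathlib import Path


def read_rows(path):
    """Return (column names, rows), with each row as a dict keyed by column."""
    with Path(path).open(newline="", encoding="utf-8") as handle:
        reader = csv.DictReader(handle)
        rows = list(reader)
        # fieldnames is None for an empty file, so fall back to an empty list.
        return list(reader.fieldnames or []), rows


def count_by(rows, column):
    """Count how many rows share each value of the given column."""
    # Raises KeyError if the column is not present in the file.
    return Counter(row[column] for row in rows)


if __name__ == "__main__":
    # Assumption: the file was downloaded to the current directory.
    if Path("dataset.csv").exists():
        columns, rows = read_rows("dataset.csv")
        print(f"{len(rows)} rows, columns: {columns}")
```

After checking the printed column names, `count_by(rows, <column>)` can be called with whichever column holds the field of interest, for example the one recording the environment or the browser.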