TY - GEN
T1 - TRANCO: A Research-Oriented Top Sites Ranking Hardened Against Manipulation
AU - Le Pochat, Victor
AU - Van Goethem, Tom
AU - Tajalizadehkhoob, Samaneh
AU - Korczyński, Maciej
AU - Joosen, Wouter
PY - 2019
Y1 - 2019
N2 - In order to evaluate the prevalence of security and privacy practices on a representative sample of the Web, researchers rely on website popularity rankings such as the Alexa list. While the validity and representativeness of these rankings are rarely questioned, our findings show the contrary: we show for four main rankings how their inherent properties (similarity, stability, representativeness, responsiveness and benignness) affect their composition and therefore potentially skew the conclusions made in studies. Moreover, we find that it is trivial for an adversary to manipulate the composition of these lists. We are the first to empirically validate that the ranks of domains in each of the lists are easily altered, in the case of Alexa through as little as a single HTTP request. This allows adversaries to manipulate rankings on a large scale and insert malicious domains into whitelists or bend the outcome of research studies to their will. To overcome the limitations of such rankings, we propose improvements to reduce the fluctuations in list composition and guarantee better defenses against manipulation. To allow the research community to work with reliable and reproducible rankings, we provide TRANCO, an improved ranking that we offer through an online service available at https://tranco-list.eu.
UR - http://www.scopus.com/inward/record.url?scp=85170646912&partnerID=8YFLogxK
U2 - 10.14722/ndss.2019.23386
DO - 10.14722/ndss.2019.23386
M3 - Conference contribution
SN - 1-891562-55-X
T3 - 26th Annual Network and Distributed System Security Symposium, NDSS 2019
BT - Network and Distributed Systems Security (NDSS) Symposium 2019
T2 - Network and Distributed Systems Security Symposium 2019
Y2 - 24 February 2019 through 27 February 2019
ER -