Tranco

A Research-Oriented Top Sites Ranking Hardened Against Manipulation

By Victor Le Pochat, Tom Van Goethem, Samaneh Tajalizadehkhoob, Maciej Korczyński and Wouter Joosen

Download the latest Tranco list

Researchers in web security and Internet measurement often rely on rankings of popular websites (e.g. the Alexa ranking). However, we showed that these rankings disagree on which domains are most popular, can change significantly from day to day, and can be manipulated by malicious actors.

As the research community still benefits from regularly updated lists of popular domains, we provide Tranco, a new ranking that addresses the shortcomings of current lists. We also emphasize the reproducibility of these rankings, and of the studies using them, by providing permanent citable references.

Access the list

We recommend using the standard Tranco list: one million domains obtained by combining all four rankings, averaged over the past 30 days.

You can directly retrieve the latest Tranco list at the following permanent URL, e.g. for use in daily crawls. The file uses the same format as the Alexa and Cisco Umbrella lists. You can retrieve the ID of this list separately to refer to it later on.
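As a rough illustration of working with the downloaded file, the sketch below parses ranking data in Python. It assumes that the shared format is headerless CSV with one `rank,domain` pair per line; the sample data and the helper name `parse_ranking` are made up for this example and are not part of the Tranco service.

```python
import csv
import io


def parse_ranking(text, limit=None):
    """Parse 'rank,domain' lines into a list of (rank, domain) tuples.

    Assumes headerless CSV with the rank in the first column and the
    domain in the second.
    """
    rows = []
    for record in csv.reader(io.StringIO(text)):
        if not record:
            continue  # skip blank lines
        rank = int(record[0])
        domain = record[1].strip().lower()
        rows.append((rank, domain))
        if limit is not None and len(rows) >= limit:
            break  # stop early when only the top entries are needed
    return rows


# Made-up sample in the assumed format; not real Tranco data.
SAMPLE = "1,example.com\n2,example.org\n3,example.net\n"

top_two = parse_ranking(SAMPLE, limit=2)
print(top_two)
```

For a daily crawl, the same function could be applied to the contents of the downloaded file, with the list ID recorded alongside the crawl results so the exact list can be cited later.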

Latest Tranco list | ID of latest list

You can install the tranco Python package to access the list directly from within your code.

PyPI project | Source code on GitHub

You can retrieve past versions of the standard Tranco list.

You can customize your list to have specific properties.

Customize

About this service

Researchers often use rankings of popular websites when measuring security practices, evaluating defenses, analyzing ecosystems or conducting Internet-wide observations. However, little is known about the data collection and processing methodologies of these rankings, even though these methodologies affect the validity of conclusions drawn from results obtained with them.

We uncovered how both the inherent properties of these rankings and their vulnerability to adversarial manipulation may affect the conclusions of security studies. In addition, we found that researchers overwhelmingly fail to report when the list was downloaded, when the websites on the list were visited and what proportion was actually reachable. This hampers the reproducibility of these studies, given the daily changes in list composition and ranks.

Overall, our study revealed significant shortcomings of current rankings and calls for a more cautious approach from the research community. We provide this service to make researchers more aware of the properties and effects of these popularity rankings, and to help ensure that studies using them are reproducible. Here, researchers can obtain rankings with properties that are more appropriate to their studies, and the methodology and composition of these rankings are stored for future reference and reproduction.

We propose a ranking that is suitable for most research purposes, consisting of data from all available rankings over a period of 30 days. We also allow researchers to customize rankings, applying multiple filters to best reflect their needs. We archive all generated rankings and supply a permanent link to a page that gives additional information on the methodology behind the ranking and allows downloading the exact list used in a study. This improves the validity of research results through a better understanding and awareness of these rankings, and improves reproducibility by keeping a permanent reference to the rankings.
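To make the idea of combining several rankings more concrete, here is a minimal sketch of one possible aggregation. This page does not state which combination rule is used, so the sketch assumes a Dowdall-style rule, in which each input list awards a domain 1/rank points and domains are ordered by their total. The function name `combine_rankings` and the sample domains are hypothetical.

```python
from collections import defaultdict


def combine_rankings(rankings):
    """Combine ranked domain lists with an assumed Dowdall-style rule.

    Each input list is ordered from most to least popular. A domain at
    position r in a list receives 1/r points from that list; a domain
    missing from a list receives nothing from it.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for position, domain in enumerate(ranking, start=1):
            scores[domain] += 1.0 / position
    # Highest total first; ties are broken alphabetically for determinism.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Made-up input: three small lists, e.g. different providers or days.
daily_lists = [
    ["a.example", "b.example", "c.example"],
    ["b.example", "a.example", "d.example"],
    ["a.example", "c.example", "b.example"],
]

combined = combine_rankings(daily_lists)
print(combined)
```

Under this assumed rule, a domain that tops a single list on a single day gains comparatively little when many provider and day combinations are summed, which hints at why aggregating over a longer window dampens both daily fluctuation and short-lived manipulation.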

About us

Feel free to contact us regarding our research, the data or this new ranking.