Tranco
A Research-Oriented Top Sites Ranking Hardened Against Manipulation
By Victor Le Pochat, Tom Van Goethem, Samaneh Tajalizadehkhoob, Maciej Korczyński and Wouter Joosen
Download the latest Tranco list
Researchers in web security or Internet measurements often use rankings of popular websites. However, in our paper we showed that these rankings disagree on which domains are most popular, can change significantly on a daily basis and can be manipulated (by malicious actors).
As the research community still benefits from regularly updated lists of popular domains, we provide Tranco, a ranking that addresses the shortcomings of current lists. We also emphasize the reproducibility of these rankings and the studies using them by providing permanent citable references.
Access the Tranco ranking
We advise using the latest standard Tranco list:
one million domains obtained by averaging all five rankings over the past 30 days.
You can also use the latest standard Tranco list with subdomains.
You can directly retrieve the latest Tranco list at this permanent URL (with subdomains), e.g. for use in daily crawls. You can retrieve the permanent ID of this list (with subdomains) separately to refer to it later on.
The file uses the same format as the Alexa and Cisco Umbrella lists. A daily update to the list is made available by 0:00 UTC; the Last-Modified header provides an exact timestamp.
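As a rough illustration of working with that format, the sketch below assumes each line is a `rank,domain` pair with no header row (the layout of the Alexa and Cisco Umbrella CSV files) and parses a `Last-Modified` header value with Python's standard library. The function names and the sample data are hypothetical and are not part of Tranco.

```python
import csv
import io
from datetime import datetime
from email.utils import parsedate_to_datetime


def parse_ranking(csv_text: str) -> dict[str, int]:
    """Map each domain to its rank, assuming headerless `rank,domain` rows."""
    ranks: dict[str, int] = {}
    for row in csv.reader(io.StringIO(csv_text)):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        rank, domain = row
        ranks[domain.strip().lower()] = int(rank)
    return ranks


def parse_last_modified(header_value: str) -> datetime:
    """Convert an HTTP Last-Modified header value into a datetime."""
    return parsedate_to_datetime(header_value)


# Hypothetical sample data, not real Tranco ranks.
sample = "1,example.com\n2,example.org\n3,example.net\n"
ranks = parse_ranking(sample)
updated = parse_last_modified("Tue, 01 Aug 2023 00:00:00 GMT")
```

Storing the parsed timestamp alongside the downloaded file makes it easier to state later exactly which daily update a crawl was based on.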
You can query (historical) ranks of individual domains.
Query domain rank
You can access the Tranco ranking programmatically through our API.
Many thanks to Patrik Lundin for creating his own Tranco API and inspiring the further development of the API that we provide.
You can use the Tranco ranking through BigQuery. The table tranco.daily.daily contains the full daily list. tranco.list_ids.list_ids contains the corresponding list IDs.
You can access Tranco directly from within your code using these packages:
- Python (tranco)
- Go (tranco), contributed by Yihang Wang (王一航)
You can retrieve past versions of the standard Tranco list.
You can customize your list to have specific properties.
Configure a custom list
Attribution: We currently use the lists from five providers: Cisco Umbrella (available free of charge), Majestic (available under a CC BY 3.0 license), Farsight (only for the default list), the Chrome User Experience Report (CrUX) (available under a CC BY-SA 4.0 license), and Cloudflare Radar (available under a CC BY-NC 4.0 license). Tranco is not affiliated with any of these providers.
Updates
- The Chrome User Experience Report and Cloudflare Radar rankings have been integrated into the default Tranco list since August 1, 2023. The Alexa ranking has been deprecated, as it is no longer available.
- Tranco received the 2022 ACSAC Cybersecurity Artifacts Competition - Impactful Dataset Award.
- The Farsight ranking has been integrated into the default Tranco list since May 1, 2022.
- Additional API endpoints are now available, including one to generate custom lists. Find your API key on the account page.
- The Quantcast ranking has been unavailable since April 1, 2020, and is therefore no longer included in the default Tranco list. Many thanks to Jeremia Halim for notifying us of this change!
- The daily Tranco list is available through Google BigQuery.
- We presented a long-term evaluation of Tranco at the 12th USENIX Workshop on Cyber Security Experimentation and Test.
- We posted a summary of our findings on the RIPE Labs blog.
- We have enabled filters on Google Safe Browsing (to remove malicious websites) and the Chrome User Experience Report (to keep only sites regularly visited in Chrome).
Usage of Tranco
- Over 600 academic publications have referenced and/or used Tranco.
- Mozilla runs a regular scan on Tranco to find websites using TLS 1.0 or 1.1.
- The Norwegian public broadcaster NRK used Tranco to study cookies on Norwegian websites.
- The NoLeaks project analyzes tracking on EU domains in Tranco.
- Scott Helme scans Tranco to assess the state of security on the web.
- The Why No HTTPS? project lists domains in Tranco that do not yet support HTTPS.
- Tranco is used in URLhaus by abuse.ch.
- W3Techs covers Tranco domains in their web technology survey.
- Tranco is available as a feed in The Honeynet Project's Intel Owl as well as ThreatConnect's threat intelligence platform.
- Tranco is used in the Internet Society's Insights project.
- The Markup used Tranco to analyze online tracking.
- BuiltWith covers sites in Tranco for its web technologies analysis.
- Cloudflare evaluated the performance of its Oblivious DoH service on Tranco sites.
- BitSight measured DMARC adoption of domains on the Tranco list.
- The Tranco API is available as an integrated data provider for data enrichment on Databar.
If you're using Tranco, let us know!
About this service
Researchers often use rankings of popular websites when measuring security practices, evaluating defenses, analyzing ecosystems or conducting Internet-wide observations. However, little is known about the data collection and processing methodologies of these rankings, even though these methodologies affect the validity of conclusions drawn from results obtained using the rankings.
We uncovered how both inherent properties and vulnerabilities to adversarial manipulation of these rankings may affect the conclusions of security studies. In addition, we found that researchers overwhelmingly fail to report when the list was downloaded, when the websites on the list were visited, and what proportion was actually reachable. This hampers the reproducibility of these studies, given the daily changes in list composition and ranks.
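To make the missing details concrete, here is a minimal sketch of a record that a study could keep alongside its results. The class, its field names and the sample values (including the list ID placeholder) are hypothetical and are not a format defined by Tranco.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class ListProvenance:
    """Details a study could report about the ranking it used (hypothetical)."""
    list_id: str            # permanent ID of the list (placeholder value below)
    downloaded_on: date     # when the list was downloaded
    crawled_on: date        # when the listed sites were visited
    domains_total: int      # number of domains taken from the list
    domains_reachable: int  # number of those domains that were reachable

    @property
    def reachable_share(self) -> float:
        """Proportion of listed domains that were reachable."""
        if self.domains_total == 0:
            return 0.0
        return self.domains_reachable / self.domains_total


# Illustrative values only, not measurements.
record = ListProvenance(
    list_id="PLACEHOLDER-ID",
    downloaded_on=date(2023, 8, 1),
    crawled_on=date(2023, 8, 3),
    domains_total=1000,
    domains_reachable=925,
)
print(f"{record.reachable_share:.1%} of domains reachable")
```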
Overall, our study revealed significant shortcomings of current rankings and calls for a more cautious approach from the research community. To make researchers more aware of the properties and effects of these popularity rankings, and to help ensure that studies using them are reproducible, we provide this service. Here researchers can obtain rankings with properties that are more appropriate to their studies, and the methodology and composition of these rankings are stored for future reference and reproduction.
We propose a ranking that is suitable for most research purposes, consisting of data from all available rankings over a period of 30 days. We also allow researchers to customize rankings, applying multiple filters in order to best reflect their needs. We archive all generated rankings and supply a permanent link to a page with additional information on the methodology behind the ranking and the ability to download the exact same list as was used in the study. This improves the validity of research results through a better understanding and awareness of these rankings, as well as reproducibility by keeping a permanent reference to the rankings.
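This page describes the proposed ranking only as combining data from all available rankings over 30 days. One way such a combination could work is a Dowdall-style rule, in which a domain at position r in a list earns 1/r points and the points are summed over every provider and day. The sketch below illustrates that idea under this assumption. It is not a statement of the exact method behind the standard list, and the function name and sample domains are hypothetical.

```python
from collections import defaultdict


def combine_rankings(rankings: list[list[str]], top_n: int) -> list[str]:
    """Combine ordered domain lists with a Dowdall-style rule.

    Each list is ordered from most to least popular. A domain at
    1-based position r in a list earns 1/r points from that list.
    """
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for position, domain in enumerate(ranking, start=1):
            scores[domain] += 1.0 / position
    # Sort by descending score; break ties alphabetically for determinism.
    ordered = sorted(scores, key=lambda d: (-scores[d], d))
    return ordered[:top_n]


# Hypothetical input: three provider-day snapshots, not real data.
snapshots = [
    ["a.example", "b.example", "c.example"],
    ["b.example", "a.example", "d.example"],
    ["a.example", "c.example", "b.example"],
]
combined = combine_rankings(snapshots, top_n=3)
```

Because every snapshot contributes to the total, a domain that ranks highly in only one list on one day gains little, which is one reason a multi-provider, multi-day combination is harder to manipulate than a single daily list.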
About us
Feel free to contact us regarding Tranco and our research: [email protected] (main contact)
The source code for the Tranco ranking is available on GitHub. If you run into any problems with the ranking or the website, you can create an issue there.