Downloads provided by UsageCounts
doi: 10.5061/dryad.q447c
In increasing numbers, researchers around the world are turning to Sci-Hub, the controversial website that hosts 50 million pirated papers and counting. Now, with server log data from Alexandra Elbakyan, the neuroscientist who created Sci-Hub in 2011 as a 22-year-old graduate student in Kazakhstan, Science addresses some basic questions: Who are Sci-Hub's users, where are they, and what are they reading? The Sci-Hub data provide the first detailed view of what is becoming the world's de facto open-access research library. Among the revelations that may surprise both fans and foes alike: Sci-Hub users are not limited to the developing world. Some critics of Sci-Hub have complained that many users can access the same papers through their libraries but turn to Sci-Hub instead—for convenience rather than necessity. The data provide some support for that claim. Over the 6 months leading up to March, Sci-Hub served up 28 million documents, with Iran, China, India, Russia, and the United States the leading requestors.
Sci-Hub download data
These data include 28 million download request events from the server logs of Sci-Hub from 1 September 2015 through 29 February 2016. The uncompressed 2.7 gigabytes of data are separated into 6 data files, one for each month, in tab-delimited text format.
scihub_data.zip

IPython Notebook for Sci-Hub raw data
IPython Notebook used to process the raw server log data (processing the GIS files into CSV, scraping DOI metadata, etc.).
Sci-Hub.html
Sci-Hub.ipynb

Sci-Hub publisher DOI prefixes
Data scraped from the CrossRef website which can be used to replicate the analysis of downloads by publisher.
publisher_DOI_prefixes.csv
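The download logs are described as tab-delimited text, and the publisher prefix file is intended for replicating the per-publisher analysis. The following is a minimal sketch of how download events might be tallied by DOI prefix. It is not the method used in the Sci-Hub.ipynb notebook. The column position of the DOI (`doi_column`), the helper names, and the sample rows are all assumptions made for illustration; the real column layout should be checked against the data files before use.

```python
import csv
from collections import Counter
from typing import Iterable


def doi_prefix(doi: str) -> str:
    """Return the part of a DOI before the first '/', e.g. '10.5061'."""
    return doi.strip().split("/", 1)[0]


def count_by_prefix(lines: Iterable[str], doi_column: int) -> Counter:
    """Count download events per DOI prefix in tab-delimited log lines.

    doi_column is the zero-based index of the DOI field. This index is an
    assumption: the dataset description does not state the column order.
    """
    counts: Counter = Counter()
    reader = csv.reader(lines, delimiter="\t", quoting=csv.QUOTE_NONE)
    for row in reader:
        if len(row) <= doi_column:
            continue  # skip short or malformed rows
        prefix = doi_prefix(row[doi_column])
        if prefix.startswith("10."):  # DOI prefixes begin with '10.'
            counts[prefix] += 1
    return counts


# Made-up sample rows for illustration only; not taken from the dataset.
SAMPLE = [
    "2015-09-01 00:00:01\t10.1000/abc.123\tCountryA",
    "2015-09-01 00:00:02\t10.1000/def.456\tCountryB",
    "2015-09-01 00:00:03\t10.2000/xyz\tCountryA",
    "malformed line",
]

if __name__ == "__main__":
    for prefix, n in count_by_prefix(SAMPLE, doi_column=1).most_common():
        print(prefix, n)
```

Because `count_by_prefix` accepts any iterable of lines, an open file handle for one of the monthly files could be passed in directly, and the resulting prefixes could then be matched against publisher_DOI_prefixes.csv once its column names are known.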
open access, scientific communication
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 7 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Top 10% |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Top 1% |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Top 10% |
| views | | 4K |
| downloads | | 785 |

Views provided by UsageCounts