Downloads provided by UsageCounts
High-quality dataset gathered from ChEMBL version 22 for the target with UniProt accession P34972. Regarding activity data, potential duplicates were ignored, no activity or data validity comments were allowed, and only data from binding assays with a pChEMBL value were kept. This led to a dataset of 3925 chemical compounds (instances) represented by 2132 features. The first 2048 features encode chemical structure fingerprints (in FCFP_6 notation), while the remaining 84 are physicochemical descriptors (such as Fractional Polar Surface Area, Rotatable Bonds or Molecular Weight). Finally, the set was transformed into a binary classification set, with the activity cut-off defined at a pChEMBL value > 7, and written to a tab-delimited text file. The final set contained 1977 active compounds and 1948 inactive compounds. Table 3 shows the codification of each feature grouped by type.
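The labelling rule and feature layout described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the actual column names and layout of the tab-delimited file are not given in the description, so the column name `pchembl_value` and the helper names `label_activity`, `split_features` and `read_tab_delimited` are hypothetical. Only the cut-off (pChEMBL > 7) and the feature counts (2048 fingerprint bits followed by 84 descriptors) come from the description.

```python
import csv
import io

N_FINGERPRINT_BITS = 2048  # FCFP_6 fingerprint features (from the description)
N_DESCRIPTORS = 84         # physicochemical descriptors (from the description)
ACTIVITY_CUTOFF = 7.0      # a compound is active if its pChEMBL value is > 7


def label_activity(pchembl, cutoff=ACTIVITY_CUTOFF):
    """Return 1 (active) if pchembl is strictly above the cutoff, else 0."""
    return 1 if pchembl > cutoff else 0


def split_features(row):
    """Split a flat feature row into (fingerprint bits, descriptors)."""
    expected = N_FINGERPRINT_BITS + N_DESCRIPTORS
    if len(row) != expected:
        raise ValueError(f"expected {expected} features, got {len(row)}")
    return row[:N_FINGERPRINT_BITS], row[N_FINGERPRINT_BITS:]


def read_tab_delimited(handle, label_column="pchembl_value"):
    """Yield (features, label) pairs from a tab-delimited file with a header.

    The header and the `label_column` name are assumptions for illustration;
    every other column is treated as a numeric feature.
    """
    reader = csv.DictReader(handle, delimiter="\t")
    for record in reader:
        pchembl = float(record.pop(label_column))
        features = [float(value) for value in record.values()]
        yield features, label_activity(pchembl)


# Tiny in-memory example with two made-up feature columns (not real data).
sample = io.StringIO("f1\tf2\tpchembl_value\n1\t0\t8.2\n0\t1\t6.1\n")
labelled_rows = list(read_tab_delimited(sample))
```

Because the comparison is strict, a compound with a pChEMBL value of exactly 7 is labelled inactive, which matches the "> 7" cut-off stated in the description.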
drug screening set, drug discovery
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 9 |
| downloads | | 2 |

Views provided by UsageCounts