Downloads provided by UsageCounts
pcr_patents.csv is the main dataset, generated by randomly sampling Google Patents with a Python library. It comprises around 250,000 US patents together with their titles, abstracts, and citations; each patent has about 27 citations on average. The zip file contains three derived datasets for training and testing patent citation recommendation systems. They were generated from the main dataset and consist of around 1 million instances, covering both positive and negative samples. The three files differ in how the negative samples were generated: pcr_cpc_negative_sample_data.csv uses CPC subclass codes, pcr_random_negative_sample_data.csv uses random selection, and pcr_sem_sim_negative_sample_data_2.csv uses nearest-neighbor relations.
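The difference between the random and CPC-based negative sampling strategies can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy `patents` mapping, the patent ids, the field names `cpc` and `cites`, and the helper functions are hypothetical and do not reflect the actual column layout of the CSV files. In particular, treating a CPC-based negative as "a non-cited patent in the same CPC subclass" is only one plausible reading of the description, and the nearest-neighbor variant is not covered here.

```python
import random

# Toy stand-in for the main dataset. The ids, CPC subclass codes, and
# field names are invented for this sketch, not taken from the CSV files.
patents = {
    "P1": {"cpc": "G06F", "cites": {"P2"}},
    "P2": {"cpc": "G06F", "cites": set()},
    "P3": {"cpc": "G06F", "cites": set()},
    "P4": {"cpc": "H04L", "cites": set()},
    "P5": {"cpc": "H04L", "cites": {"P4"}},
}


def random_negative(pid, rng):
    """Pick any patent that `pid` does not cite (excluding `pid` itself)."""
    pool = sorted(
        p for p in patents
        if p != pid and p not in patents[pid]["cites"]
    )
    return rng.choice(pool) if pool else None


def cpc_negative(pid, rng):
    """Pick a non-cited patent sharing `pid`'s CPC subclass (assumed reading)."""
    pool = sorted(
        p for p in patents
        if p != pid
        and p not in patents[pid]["cites"]
        and patents[p]["cpc"] == patents[pid]["cpc"]
    )
    return rng.choice(pool) if pool else None


def build_pairs(negative_fn, seed=0):
    """Emit (citing, candidate, label) rows: one negative per positive if available."""
    rng = random.Random(seed)
    rows = []
    for pid, record in sorted(patents.items()):
        for cited in sorted(record["cites"]):
            rows.append((pid, cited, 1))  # positive: a real citation
            negative = negative_fn(pid, rng)
            if negative is not None:
                rows.append((pid, negative, 0))  # negative: not cited
    return rows


cpc_rows = build_pairs(cpc_negative)
random_rows = build_pairs(random_negative)
```

In this sketch the CPC-based strategy yields no negative for `P5`, because its only subclass neighbor is the patent it cites, whereas random sampling always finds a candidate. Same-subclass negatives are typically harder for a model to separate from true citations than randomly drawn ones.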
patent citation, citation recommendation
| Indicator | Description | Value |
| --- | --- | --- |
| selected citations | These citations are derived from selected sources. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | 0 |
| popularity | This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network. | Average |
| influence | This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically). | Average |
| impulse | This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network. | Average |
| views | | 20 |
| downloads | | 7 |

Views provided by UsageCounts