
This source code and its required materials implement a content-based recommender system in the context of the Open Research Knowledge Graph (ORKG). The recommender system accepts a research paper's title and abstract as input and recommends existing ORKG predicates that are semantically relevant to the given paper.

All notebooks depend on this dataset. To run the notebooks, download the files and upload them to your Google Drive. You will also need to adapt the notebooks to your Google Drive folder and your Google Cloud Storage bucket name. You can additionally configure the applied clustering algorithm (agglomerative or kmeans) and the number of clusters "k".

scibert_embeddings.ipynb: This notebook represents the training and test instances as SciBERT embeddings. Its output is the files scibert_training_representations.npz and scibert_test_representations.npz, which are required for running predicates_clustering_scibert.ipynb.

predicates_clustering_scibert.ipynb: This notebook depends on the output of scibert_embeddings.ipynb. It trains different clustering models, depending on "N_CLUSTERS", using SciBERT embeddings and uploads the trained models to a specified bucket on Google Cloud Storage. It is also responsible for downloading the trained models and evaluating them.

predicates_clustering_tfidf.ipynb: This notebook trains different clustering models, depending on "N_CLUSTERS", using TF-IDF embeddings and uploads the trained models to a specified bucket on Google Cloud Storage. It is also responsible for downloading the trained models, evaluating them, and analyzing the constructed clusters.

Pre-trained Models: We publish 2 pre-trained clustering models with the naming format <embedding approach>_<clustering algorithm>_<number of clusters>.pkl:

scibert_kmeans_2050.pkl, with a micro-averaged F1-score of 72.6%
tfidf_agglomerative_1300.pkl, with a micro-averaged F1-score of 80.4%
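The scores reported for the pre-trained models are micro-averaged F1-scores. As a rough illustration of what that metric measures for set-valued recommendations, the sketch below pools true positives, false positives, and false negatives over all papers before computing precision and recall. It is a minimal sketch under stated assumptions, not the notebooks' evaluation code: the function name `micro_f1`, the variable `toy_pairs`, and the predicate labels are hypothetical and do not come from the released notebooks.

```python
from typing import Iterable, Set, Tuple


def micro_f1(pairs: Iterable[Tuple[Set[str], Set[str]]]) -> float:
    """Micro-averaged F1 over (expected, recommended) predicate sets.

    Counts are pooled over all papers before precision and recall are
    computed, so papers with many predicates weigh more than papers with few.
    """
    tp = fp = fn = 0
    for expected, recommended in pairs:
        tp += len(expected & recommended)   # correctly recommended predicates
        fp += len(recommended - expected)   # recommended but not expected
        fn += len(expected - recommended)   # expected but not recommended
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: these predicate labels are made up for illustration.
toy_pairs = [
    ({"method", "dataset"}, {"method", "dataset", "metric"}),
    ({"result"}, {"result"}),
    ({"approach", "evaluation"}, {"approach"}),
]
score = micro_f1(toy_pairs)
print(f"micro-F1 = {score:.3f}")
```

On the toy data the pooled counts are 4 true positives, 1 false positive, and 1 false negative, so precision and recall are both 0.8. Micro-averaging differs from macro-averaging in that a paper with many predicates contributes more to the final score than a paper with few.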
Clustering algorithms, Artificial Intelligence, Open Research Knowledge Graph, Content-based recommender system