
Source Code - Clustering Semantic Predicates in the Open Research Knowledge Graph

Creator: Arab Oghli, Omar
Keywords: Clustering algorithms, Artificial Intelligence, Open Research Knowledge Graph, Content-based recommender system

Research software > Software | 14 Jan 2022 | English | Publisher: Zenodo

doi: 10.5281/zenodo.6973678 , 10.5281/zenodo.6514138 , 10.5281/zenodo.6514139

Abstract

This source code and its required materials implement a content-based recommender system in the context of the Open Research Knowledge Graph (ORKG). The recommender system accepts a research paper's title and abstract as input and recommends existing ORKG predicates that are semantically relevant to the given paper.

All notebooks depend on this Dataset. To run the notebooks, download the files and upload them to your Google Drive. Adapt the notebooks to your Google Drive folder and your Google Cloud Storage bucket name, and optionally configure the clustering algorithm (agglomerative or kmeans) and the number of clusters "k".

- scibert_embeddings.ipynb: Represents the training and test instances as SciBERT embeddings. Its output files, scibert_training_representations.npz and scibert_test_representations.npz, are required to run predicates_clustering_scibert.ipynb.
- predicates_clustering_scibert.ipynb: Depends on the output of scibert_embeddings.ipynb. It trains different clustering models depending on "N_CLUSTERS" using SciBERT embeddings and uploads the trained models to a specified bucket on Google Cloud Storage. It also downloads the trained models and evaluates them.
- predicates_clustering_tfidf.ipynb: Trains different clustering models depending on "N_CLUSTERS" using TF-IDF embeddings and uploads the trained models to a specified bucket on Google Cloud Storage. It also downloads the trained models, evaluates them, and analyzes the constructed clusters.

Pre-trained Models: We publish 2 pre-trained clustering models with the naming format <embedding approach>_<clustering algorithm>_<number of clusters>.pkl:

- scibert_kmeans_2050.pkl with micro-averaged F1-score 72.6%
- tfidf_agglomerative_1300.pkl with micro-averaged F1-score 80.4%
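The naming format of the pre-trained models can be handled programmatically. The following is a minimal sketch, not part of the published notebooks: the helper names `ModelName`, `parse_model_name`, and `build_model_name` are hypothetical, and the pattern assumes that the embedding approach and clustering algorithm are single lowercase words, as in the two published file names.

```python
import re
from typing import NamedTuple


class ModelName(NamedTuple):
    """Parts of a file name of the form <embedding>_<algorithm>_<k>.pkl (hypothetical helper)."""
    embedding: str
    algorithm: str
    n_clusters: int


# Assumption: both text parts are lowercase words without underscores.
_MODEL_NAME = re.compile(
    r"^(?P<embedding>[a-z]+)_(?P<algorithm>[a-z]+)_(?P<n_clusters>\d+)\.pkl$"
)


def parse_model_name(filename: str) -> ModelName:
    """Split a model file name into embedding approach, algorithm, and cluster count."""
    match = _MODEL_NAME.match(filename)
    if match is None:
        raise ValueError(f"unexpected model file name: {filename!r}")
    return ModelName(
        embedding=match["embedding"],
        algorithm=match["algorithm"],
        n_clusters=int(match["n_clusters"]),
    )


def build_model_name(embedding: str, algorithm: str, n_clusters: int) -> str:
    """Compose a file name that follows the same naming format."""
    return f"{embedding}_{algorithm}_{n_clusters}.pkl"


if __name__ == "__main__":
    for name in ("scibert_kmeans_2050.pkl", "tfidf_agglomerative_1300.pkl"):
        print(parse_model_name(name))
```

A helper like this could be used to select the matching embedding pipeline (SciBERT or TF-IDF) before a model file is loaded; how the notebooks themselves load the .pkl files is not described here.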

Impact by BIP!

- citations: 0. This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- popularity: Average. This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
- influence: Average. This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
- impulse: Average. This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.

Usage by UsageCounts

- views: 8
- downloads: 3


Related to Research communities

Knowmad Institut