
# Implementations of LLMmap

We include the weights of the models used to collect the main results in the paper, as well as the code necessary to run them. In particular, models are stored in the directory `./data/models/`:

- `closed_set_8q`: The closed-set model used in Table 2, Figures 5, 7, 8, and G.1.
- `open_set_8q`: The open-set model used in Table 2, D.1, Figures 6, C.1, and Appendix B.

Model weights are stored in the standard `keras` format. We include the necessary code to run these models in `./LLMmap`. The script `./main_interactive` acts as the main entry point for all the models.

In addition, we provide:

- `unseen_model_random_forest.pickle`: the random forest model used to detect whether a model is unseen, based on the predictions of the open-set model (Appendix E). It is saved as a standard pickle file and contains a pre-trained `sklearn.ensemble.RandomForestClassifier`. Code to load and use the model is given in `LLMmap.unseen_detector`.

# Datasets

We include the dataset generated to train and test the models throughout the paper. It is stored in `./data/dataset.jsonl` in JSON format. We provide the function `read_dataset`, located in `./LLMmap/data_pipeline.py`, to load the dataset and partition it into train and test sets.
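
For readers who want to inspect the dataset without going through the package, the following is a minimal standalone sketch of reading a `.jsonl` file and splitting it. It assumes, based only on the file extension, that each non-empty line holds one JSON record. The helper names `parse_jsonl`, `load_jsonl`, and `split_records` are hypothetical and are not part of LLMmap. The shuffled split shown here is a generic one and may differ from the partition that `read_dataset` produces, so use `read_dataset` to reproduce the paper's train and test sets.

```python
import json
import random
from typing import Any, Iterable


def parse_jsonl(lines: Iterable[str]) -> list[Any]:
    """Parse JSON Lines text: one JSON value per line, blank lines skipped."""
    return [json.loads(line) for line in lines if line.strip()]


def load_jsonl(path: str) -> list[Any]:
    """Read every record from a JSON Lines file (hypothetical helper)."""
    with open(path, "r", encoding="utf-8") as handle:
        return parse_jsonl(handle)


def split_records(records, test_fraction=0.2, seed=0):
    """Shuffle with a fixed seed and return (train, test).

    This is a generic split, assumed for illustration only; it is not
    the partition logic used by LLMmap's read_dataset.
    """
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    return shuffled[n_test:], shuffled[:n_test]
```

A typical call would be `train, test = split_records(load_jsonl("./data/dataset.jsonl"))`. Fixing the seed keeps the split reproducible across runs.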
