
Auxiliary material, up-to-date documentation, and issue tracking are available at: https://github.com/rub-softsec/MLCerts

Docker images for reproducing the artifacts are available at: https://zenodo.org/records/17850372

Datasets

Raw PEM certificates used in differential testing:

- v3-chain.tar.bz2: 12 synthetic certificate datasets.
- v3-experiments-extra.tar.bz2: MLCerts 1M dataset.
- frankencerts-v1-8M.tar.bz2: Frankencerts 8M dataset.
- seeds30k.tar.bz2: Transcert 30K dataset.

The CA information is available in the customCA/ directory.

Language Models (llm-code-MLcerts-EXPORT.zip)

One of the model architectures below is used to generate synthetic ASN.1 instances (with BEGIN/END tags). asn1_to_pem.py then converts them into PEM format, with CA information copied from the customCA/ directory.

RNN models

Code for the RNN models, based on Char-RNN-Python, is available in the Char-RNN-PyTorch directory. charRNN-custom.py is used for training, and generate.py for generating synthetic certificate instances:

    python3 generate.py saved_model hidden_size layers temperature original_cert_dataset extra_run_name

The available saved models are:

- 2022-scanned-1024-3-0.0002lr-0.1dropout-epoch3-step300000
- 2022-scanned-256-3-0.0002lr-0.1dropout-epoch3-step300000
- balanced-versions-1024-3-0.0002lr-0.1dropout-epoch3-step300000
- balanced-versions-256-3-0.0002lr-0.1dropout-epoch3-step300000
- zmap-data-256-3-0.0002lr-0.1dropout-epoch3-step300000
- zmap-data-1024-3-0.0002lr-0.1dropout-epoch3-step300000

To generate certificates with the final model used for the paper's results (IPv4/RNN-Medium with Temperature = 1.5), use:

    python3 generate.py zmap-data-1024-3-0.0002lr-0.1dropout-epoch3-step300000 1024 3 1.5 zmap-data testZmap1M

GPT Models

Code for the GPT models, based on GPT-Neo-125, is available in the Transformers directory. train_script.py is used for training (train_script_scratch.py for training from scratch), and generate.py for generating synthetic certificate instances.
generate.py takes the following arguments:

    python3 generate.py saved_model checkpoint_num training_type temperature

training_type can be 'finetune' or 'custom', for instance:

    python3 generate.py 2022-scanned-custom checkpoint-284400 custom 1.0

The available saved models are:

- 2022-scanned
- 2022-scanned-custom
- balanced-versions
- balanced-versions-custom
- zmap-data-custom
- zmap-data

The custom versions are the ones trained from scratch. conda-env.yml can be consulted for environment dependencies.

BibTeX

Please cite our paper if you rely on the datasets for your work.

@inproceedings{icse2026-hallucinating-certificates,
  title     = {{Hallucinating Certificates: Differential Testing of TLS Certificate Validation Using Generative Language Models}},
  author    = {Paracha, Talha and Posluns, Kyle and Borgolte, Kevin and Lindorfer, Martina and Choffnes, David},
  booktitle = {Proceedings of the 48th IEEE/ACM International Conference on Software Engineering (ICSE)},
  date      = {2026-04},
  edition   = {48},
  editor    = {Mezini, Mira and Zimmermann, Thomas},
  location  = {Rio de Janeiro, Brazil},
  publisher = {Association for Computing Machinery (ACM)/Institute of Electrical and Electronics Engineers (IEEE)}
}
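The generate.py invocations listed under RNN models and GPT Models can also be assembled programmatically, which may help when scripting several runs. The following is a minimal Python sketch and not part of the released artifact. The helper names (parse_rnn_model_name, rnn_generate_command, gpt_generate_command) are hypothetical. It assumes that RNN saved-model names follow the pattern <dataset>-<hidden_size>-<layers>-<lr>lr-<dropout>dropout-epoch<N>-step<M>, which is only inferred from the listed names, and that original_cert_dataset defaults to the dataset prefix, as in the documented example.

```python
from typing import List, Optional

# The two training_type values named in the GPT usage text.
GPT_TRAINING_TYPES = ("finetune", "custom")


def parse_rnn_model_name(saved_model: str) -> dict:
    """Extract dataset, hidden size, and layer count from an RNN model name.

    ASSUMPTION (inferred from the listed names, not documented):
    <dataset>-<hidden_size>-<layers>-<lr>lr-<dropout>dropout-epoch<N>-step<M>
    """
    # Split from the right so dataset names containing '-' stay intact.
    parts = saved_model.rsplit("-", 6)
    if len(parts) != 7:
        raise ValueError(f"unexpected RNN model name: {saved_model!r}")
    dataset, hidden_size, layers = parts[0], parts[1], parts[2]
    return {
        "dataset": dataset,
        "hidden_size": int(hidden_size),
        "layers": int(layers),
    }


def rnn_generate_command(
    saved_model: str,
    temperature: float,
    extra_run_name: str,
    original_cert_dataset: Optional[str] = None,
) -> List[str]:
    """Build the argument list for the RNN generate.py script."""
    info = parse_rnn_model_name(saved_model)
    # ASSUMPTION: the dataset argument matches the model-name prefix
    # unless the caller overrides it.
    dataset = original_cert_dataset or info["dataset"]
    return [
        "python3", "generate.py", saved_model,
        str(info["hidden_size"]), str(info["layers"]),
        str(temperature), dataset, extra_run_name,
    ]


def gpt_generate_command(
    saved_model: str, checkpoint_num: str, training_type: str, temperature: float
) -> List[str]:
    """Build the argument list for the GPT generate.py script."""
    if training_type not in GPT_TRAINING_TYPES:
        raise ValueError(f"training_type must be one of {GPT_TRAINING_TYPES}")
    return [
        "python3", "generate.py", saved_model,
        checkpoint_num, training_type, str(temperature),
    ]


if __name__ == "__main__":
    # Print the two documented example commands.
    print(" ".join(rnn_generate_command(
        "zmap-data-1024-3-0.0002lr-0.1dropout-epoch3-step300000",
        1.5, "testZmap1M")))
    print(" ".join(gpt_generate_command(
        "2022-scanned-custom", "checkpoint-284400", "custom", 1.0)))
```

Each returned list can be passed to subprocess.run with cwd set to the directory that holds the matching generate.py (Char-RNN-PyTorch for the RNN models, Transformers for the GPT models). The exact path of those directories after unpacking the archive is an assumption to verify locally.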
