Web Server Log Dataset

This dataset contains real-world web server log files collected from two public sector organizations in Indonesia, referred to as Organization X and Organization Y to preserve anonymity. The logs were used as datasets in forensic event reconstruction research for web application attacks.

Dataset contents

The dataset includes raw log files from each organization, covering the following log artifact types where available: access.log, error.log, auth.log, and syslog. These log types capture HTTP requests, web server errors, authentication activities, and operating system events, respectively.

Organization X

Apache web server logs, comprising 60 log files that cover the period from August 12, 2025 to October 7, 2025. The logs reflect extensive reconnaissance and exploitation attempts. No evidence of privilege escalation was identified, indicating that the attacks were detected or mitigated before reaching critical stages.

Organization Y

OpenLiteSpeed (CyberPanel) web server logs, comprising 208 log files. The logs capture a successful privilege escalation incident that occurred on April 29, 2025 (05:28–07:07 local time), despite the absence of clear prior reconnaissance traces. This scenario represents cases where attackers may have bypassed initial detection or performed reconnaissance through alternative channels.

Ground truth

Ground truth labeling rules are provided for both datasets in YAML format. Each rule specifies an attack type label, a sensitivity mode (moderate or strict), and one or more filter patterns. The rules are designed to be applied to a log timeline through pattern matching. A log entry is assigned the corresponding attack label if it matches all filter patterns in a rule. Entries that do not match any rule are labeled as benign. This allows researchers to systematically reproduce ground truth labels from the raw log files without manual annotation.
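
The labeling logic described above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the dataset's actual tooling: the field names (`label`, `mode`, `filters`), the two sample rules, the use of regular expressions as the pattern language, the first-match-wins ordering, and the treatment of the sensitivity mode as an optional rule selector are all hypothetical. The rules are written as Python dictionaries standing in for a parsed YAML file; consult the YAML files shipped with the dataset for the real schema.

```python
import re

# Hypothetical rules, standing in for the parsed contents of a YAML rule file.
# Field names and patterns are illustrative assumptions, not the real schema.
RULES = [
    {
        "label": "sql_injection",
        "mode": "strict",
        "filters": [r"(?i)union(%20|\+|\s)+select", r"\.php"],
    },
    {
        "label": "path_traversal",
        "mode": "moderate",
        "filters": [r"\.\./"],
    },
]


def label_entry(entry, rules, mode=None):
    """Return the label of the first rule whose filters ALL match the entry.

    If `mode` is given, only rules with that sensitivity mode are considered
    (an assumption about how the mode is meant to be used).
    Entries that match no rule are labeled "benign".
    """
    for rule in rules:
        if mode is not None and rule["mode"] != mode:
            continue
        if all(re.search(pattern, entry) for pattern in rule["filters"]):
            return rule["label"]
    return "benign"


def label_timeline(entries, rules, mode=None):
    """Label every entry of a log timeline, preserving order."""
    return [(entry, label_entry(entry, rules, mode)) for entry in entries]
```

A timeline would then be labeled with a call such as `label_timeline(lines, RULES, mode="strict")`, where `lines` is a list of raw log lines. Requiring every filter in a rule to match (logical AND) mirrors the description above; how overlapping rules should be resolved is not stated, so the first-match ordering here is only one possible choice.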

Intended use

This dataset is intended for use in research on digital forensics, web security incident analysis, log-based anomaly detection, and forensic event reconstruction.
