
arXiv: 1904.01725
Web traffic is a valuable data source, typically used in the marketing space to track brand awareness and advertising effectiveness. However, web traffic is also a rich source of information for cybersecurity monitoring efforts. To better understand the threat of malicious cyber actors, this study develops a methodology to monitor and evaluate web activity using data archived from Google Analytics. Google Analytics collects and aggregates web traffic, including information about web visitors' location, date and time of visit, visited webpages, and searched keywords. This study seeks to streamline analysis of this data and uses rule-based anomaly detection and predictive modeling to identify web traffic that deviates from normal patterns. Rather than evaluating pieces of web traffic individually, the methodology seeks to emulate real user behavior by creating a new unit of analysis: the user session. User sessions group individual pieces of traffic from the same location and date, which transforms the available information from single point-in-time snapshots to dynamic sessions showing users' trajectory and intent. The result is faster and better insight into large volumes of noisy web traffic.
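The session-building step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the `Hit` fields, the sample data, and the `max_pages` rule are hypothetical stand-ins chosen for illustration, and they do not reflect the actual Google Analytics export schema or the rules used in the study.

```python
from collections import defaultdict
from typing import NamedTuple


class Hit(NamedTuple):
    """One piece of web traffic (field names are assumptions)."""
    location: str
    date: str  # ISO date, e.g. "2017-06-01"
    time: str  # zero-padded "HH:MM", so string order matches time order
    page: str


def build_sessions(hits):
    """Group hits that share a (location, date) key, ordered by time of visit."""
    sessions = defaultdict(list)
    for hit in hits:
        sessions[(hit.location, hit.date)].append(hit)
    return {key: sorted(group, key=lambda h: h.time)
            for key, group in sessions.items()}


def flag_sessions(sessions, max_pages=3):
    """Hypothetical rule: flag sessions that visit more than max_pages distinct pages."""
    return [key for key, group in sessions.items()
            if len({h.page for h in group}) > max_pages]


# Made-up sample traffic, for illustration only.
hits = [
    Hit("Berlin", "2017-06-01", "09:05", "/home"),
    Hit("Berlin", "2017-06-01", "09:01", "/login"),
    Hit("Oslo", "2017-06-01", "10:00", "/home"),
    Hit("Berlin", "2017-06-01", "09:07", "/admin"),
    Hit("Berlin", "2017-06-01", "09:09", "/staff"),
]

sessions = build_sessions(hits)
flagged = flag_sessions(sessions)
```

Grouping first and applying rules afterwards means each rule sees a visitor's ordered page sequence rather than isolated hits, which is the shift in unit of analysis the abstract describes.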
2017 IEEE International Conference on Big Data (Big Data)
FOS: Computer and information sciences, Computer Science - Cryptography and Security, Cryptography and Security (cs.CR), Information Retrieval (cs.IR), Computer Science - Information Retrieval
| selected citations | 1 |
| popularity | Average |
| influence | Average |
| impulse | Average |
