Powered by OpenAIRE graph
Universidade de Lisb...
UTL Repository
Doctoral thesis . 2025
Data sources: UTL Repository

OSINT-Based Data-Driven Cybersecurity Discovery
Authors: Alves, Fernando Baptista Leal
Abstract

Cybersecurity has been attracting growing attention due to the continuous increase in the number and severity of attacks. Protecting an IT system effectively requires a continuous supply of cybersecurity news and events, including new vulnerabilities and attacks. A viable alternative to paid subscriptions, many of which are expensive, is to obtain this information from open sources (Open Source Intelligence [OSINT]), since cybersecurity specialists publish large volumes of content on the topic every day. Twitter stands out as an OSINT platform because it is a natural aggregator of content, including cybersecurity content. This thesis focuses on the collection and processing of tweets about cybersecurity. First, we carried out a qualitative and quantitative study of the cybersecurity information published on Twitter and compared the results with the information in databases dedicated to storing vulnerabilities and attacks; our study shows that Twitter is a relevant and complete source of cybersecurity information. The remaining work was devoted to developing a platform for collecting, processing, and aggregating cybersecurity tweets. Our platform comprises stages for text processing, numeric conversion, binary classification, clustering, and the creation of security indicators (Indicators of Compromise [IoC]). First, we show how to build a classification model suited to tweets using machine learning best practices. Next, we devised a new strategy for applying the k-means algorithm so that the number of clusters does not need to be defined in advance, while clustering is performed over a continuous stream of tweets rather than a static set. From these clusters we generate IoCs so that our platform can easily feed other cybersecurity tools. Finally, we demonstrate the integration of our platform with the cybersecurity event management system of a national electricity provider.

Cybersecurity is a topic of growing concern as the number and severity of cyberattacks continue to increase. Receiving the latest updates, patches, and news is crucial to maintaining a high level of security in an IT infrastructure. An alternative to purchasing expensive security news feeds is to collect Open Source Intelligence (OSINT): a wealth of knowledge published daily by users, security companies, researchers, and hackers, among others. In particular, Twitter has become a hub for up-to-date information about many subjects, including cybersecurity. This thesis focuses on the collection and processing of cybersecurity-related tweets. First, we conducted a qualitative and quantitative study of the security data found on Twitter and compared it to databases that publish confirmed vulnerabilities or exploits. Our study shows that Twitter is a relevant cybersecurity source. The remainder of the work develops a framework for collecting, processing, and delivering security tweets. Its pipeline comprises text filtering, text feature extraction, a binary classifier, clustering, and Indicator of Compromise (IoC) generation. We show how to obtain a tweet classifier model that accounts for tweet characteristics and follows machine learning best practices. Our clustering strategy adapts the k-means algorithm to an unknown number of clusters and to clustering and updating over a stream of tweets instead of the classical batch operation. From the clusters we generate IoCs, structured data formats used in cybersecurity; this step eases the integration of our tool with existing cybersecurity tools. Finally, we showcase one such integration with the Security Information and Event Management (SIEM) system of a nationwide electrical utility company.
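To make the idea of clustering a tweet stream without fixing the number of clusters more concrete, here is a minimal sketch. It is not the strategy developed in the thesis, whose details the abstract does not give. It is a generic threshold-based variant of sequential k-means: a tweet joins the nearest centroid if it is similar enough, otherwise it starts a new cluster. The class name `StreamingClusters`, the bag-of-words `vectorize` helper, the 0.5 similarity threshold, and the sample tweets (including the CVE identifier) are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a tweet into a sparse bag-of-words vector (token -> count)."""
    return Counter(re.findall(r"[a-z0-9-]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(v * b.get(k, 0.0) for k, v in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class StreamingClusters:
    """Hypothetical online clusterer: k grows as new topics appear."""

    def __init__(self, threshold=0.5):
        self.threshold = threshold  # assumed value, would need tuning
        self.centroids = []         # one sparse vector per cluster
        self.sizes = []             # number of tweets per cluster

    def add(self, text):
        """Assign one tweet to a cluster and return the cluster index."""
        vec = vectorize(text)
        best, best_sim = None, 0.0
        for i, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = i, sim
        if best is None or best_sim < self.threshold:
            # No sufficiently similar cluster: open a new one.
            self.centroids.append(dict(vec))
            self.sizes.append(1)
            return len(self.centroids) - 1
        # Sequential k-means update: move the centroid toward the new
        # point so that it stays the running mean of its members.
        self.sizes[best] += 1
        n = self.sizes[best]
        centroid = self.centroids[best]
        for key in set(centroid) | set(vec):
            old = centroid.get(key, 0.0)
            centroid[key] = old + (vec.get(key, 0) - old) / n
        return best


# Usage: tweets arrive one at a time; the identifiers are made up.
stream = [
    "Critical RCE in OpenSSL CVE-2099-0001 patch now",
    "OpenSSL RCE CVE-2099-0001 exploit released, patch now",
    "New phishing campaign targets bank customers",
]
clusters = StreamingClusters(threshold=0.5)
labels = [clusters.add(tweet) for tweet in stream]
print(labels)  # [0, 0, 1]
```

The first two tweets share most of their tokens and end up in the same cluster, while the third opens a new one. A real system would likely use a richer text representation and some policy for retiring stale clusters, neither of which is shown here.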

H2020 DiSIEM (H2020-700692)

Country
Portugal
Keywords

Open Source Intelligence, Cybersecurity, Twitter, Scientific Domain/Area::Natural Sciences::Computer and Information Sciences, Security operations centre

  • BIP!
    Impact by BIP!
    selected citations
    These citations are derived from selected sources.
    This is an alternative to the "Influence" indicator, which also reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    0
    popularity
    This indicator reflects the "current" impact/attention (the "hype") of an article in the research community at large, based on the underlying citation network.
    Average
    influence
    This indicator reflects the overall/total impact of an article in the research community at large, based on the underlying citation network (diachronically).
    Average
    impulse
    This indicator reflects the initial momentum of an article directly after its publication, based on the underlying citation network.
    Average
Access route: Green