OSINT

Cybersecurity has been attracting increasing attention due to the continuous growth in the number and severity of attacks. To protect an information system effectively, it is necessary to continuously receive cybersecurity-related news and events, including new vulnerabilities and attacks. A viable alternative to paid subscriptions (many of which are expensive) is to obtain this information from open sources (Open Source Intelligence [OSINT]), since cybersecurity specialists publish vast amounts of content on the topic every day. Twitter stands out as an OSINT platform because it is a natural aggregator of content, including cybersecurity content. This thesis focuses on the collection and processing of tweets about cybersecurity. First, we carried out a qualitative and quantitative study of the cybersecurity information published on Twitter and compared the results with the information in databases dedicated to storing vulnerabilities and attacks; our study shows that Twitter is a relevant and complete source of cybersecurity information. The remaining work was devoted to developing a platform for collecting, processing, and aggregating cybersecurity tweets. Our platform comprises stages for text processing, numerical conversion, binary classification, clustering, and the creation of security indicators (Indicators of Compromise [IoC]). First, we show how to build a classification model suited to tweets using machine learning best practices. Next, we created a new strategy for applying the k-means algorithm so that the number of clusters does not have to be defined in advance, while clustering is performed over a continuous stream of tweets rather than a static set. From these clusters we generate IoCs so that our platform can easily feed other cybersecurity tools. Finally, we demonstrate the integration of our platform with the cybersecurity event management system of a national electricity supplier.

Cybersecurity is a topic of growing concern as the number and severity of cyberattacks continue to increase. Receiving the latest updates, patches, and news is crucial to keeping an IT infrastructure’s security level high. An alternative to purchasing expensive security news feeds is to collect Open Source Intelligence (OSINT): a wealth of knowledge published daily by users, security companies, researchers, and hackers, among others. In particular, Twitter has become an information hub for obtaining cutting-edge information about many subjects, including cybersecurity. This thesis focuses on the collection and processing of cybersecurity-related tweets. First, we conducted a qualitative and quantitative study of the security data found on Twitter and compared it to databases that publish confirmed vulnerabilities or exploits. Our study shows that Twitter is a relevant cybersecurity source. The remainder of the work is about developing a framework for collecting, processing, and delivering security tweets. Its pipeline comprises text filtering, text feature extraction, a binary classifier, clustering, and Indicator of Compromise (IoC) generation. We show how to obtain a tweet classifier model that takes tweet characteristics into account and follows machine learning best practices. Our clustering strategy adapts the k-means algorithm to an unknown number of clusters and to clustering and updating over a stream of tweets instead of the classical batch operation. From the clusters we generate Indicators of Compromise, which are structured data formats used in cybersecurity; this step eases the integration of our tool with existing cybersecurity tools. Finally, we showcase one such integration with the Security Information and Event Management (SIEM) system of a nationwide electrical utility company.
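The abstract does not spell out how k-means is adapted to a stream with an unknown number of clusters, so the following is only a minimal sketch of one common way to get that behaviour, not the thesis's actual algorithm. It assumes a bag-of-words representation, cosine similarity, and a hypothetical `threshold` parameter: each incoming tweet joins the nearest centroid if it is similar enough, and otherwise starts a new cluster. All names (`vectorize`, `cosine`, `StreamingClusterer`) are illustrative.

```python
import math
import re
from collections import Counter


def vectorize(text):
    """Turn a tweet into a sparse bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9\-]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dict-likes."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


class StreamingClusterer:
    """Assigns tweets to clusters one at a time; clusters are created on demand."""

    def __init__(self, threshold=0.5):
        # Hypothetical parameter: minimum similarity needed to join a cluster.
        self.threshold = threshold
        self.centroids = []  # one dict (term -> mean count) per cluster
        self.counts = []     # number of tweets assigned to each cluster

    def add(self, text):
        """Assign one tweet to a cluster and return the cluster index."""
        vec = vectorize(text)
        best, best_sim = None, -1.0
        for index, centroid in enumerate(self.centroids):
            sim = cosine(vec, centroid)
            if sim > best_sim:
                best, best_sim = index, sim

        # No cluster is close enough: open a new one seeded with this tweet.
        if best is None or best_sim < self.threshold:
            self.centroids.append(dict(vec))
            self.counts.append(1)
            return len(self.centroids) - 1

        # Otherwise move the centroid toward the tweet (running mean update).
        self.counts[best] += 1
        n = self.counts[best]
        centroid = self.centroids[best]
        for term in set(centroid) | set(vec):
            old = centroid.get(term, 0.0)
            centroid[term] = old + (vec.get(term, 0) - old) / n
        return best
```

Calling `add` once per tweet returns a cluster index, so the number of clusters grows only when a tweet is dissimilar to everything seen so far. The threshold trades off cluster purity against fragmentation and would need tuning on real data.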

H2020 DiSIEM (H2020-700692)

Country

Portugal

Related Organizations

Universidade de Lisboa
Portugal
Technical University of Lisbon
Portugal

Keywords

Open Source Intelligence, Cybersecurity, Twitter, Domain/Scientific Area::Natural Sciences::Computer and Information Sciences, Security operations centre



Related to Research communities

Knowmad Institut