TY - GEN
T1 - Auto Semi-supervised Outlier Detection for Malicious Authentication Events
AU - Kaiafas, Georgios
AU - Hammerschmidt, Christian
AU - Lagraa, Sofiane
AU - State, Radu
PY - 2020
Y1 - 2020
N2 - Cyber-attacks become more sophisticated and complex especially when adversaries steal user credentials to traverse the network of an organization. Detecting a breach is extremely difficult and this is confirmed by the findings of studies related to cyber-attacks on organizations. A study conducted last year by IBM found that it takes 206 days on average to US companies to detect a data breach. As a consequence, the effectiveness of existing defensive tools is in question. In this work we deal with the detection of malicious authentication events, which are responsible for effective execution of the stealthy attack, called lateral movement. Authentication event logs produce a pure categorical feature space which creates methodological challenges for developing outlier detection algorithms. We propose an auto semi-supervised outlier ensemble detector that does not leverage the ground truth to learn the normal behavior. The automatic nature of our methodology is supported by established unsupervised outlier ensemble theory. We test the performance of our detector on a real-world cyber security dataset provided publicly by the Los Alamos National Lab. Overall, our experiments show that our proposed detector outperforms existing algorithms and produces a 0 False Negative Rate without missing any malicious login event and a False Positive Rate which improves the state-of-the-art. In addition, by detecting malicious authentication events, compared to the majority of the existing works which focus solely on detecting malicious users or computers, we are able to provide insights regarding when and at which systems malicious login events happened. Beyond the application on a public dataset we are working with our industry partner, POST Luxembourg, to employ the proposed detector on their network.
AB - Cyber-attacks become more sophisticated and complex especially when adversaries steal user credentials to traverse the network of an organization. Detecting a breach is extremely difficult and this is confirmed by the findings of studies related to cyber-attacks on organizations. A study conducted last year by IBM found that it takes 206 days on average to US companies to detect a data breach. As a consequence, the effectiveness of existing defensive tools is in question. In this work we deal with the detection of malicious authentication events, which are responsible for effective execution of the stealthy attack, called lateral movement. Authentication event logs produce a pure categorical feature space which creates methodological challenges for developing outlier detection algorithms. We propose an auto semi-supervised outlier ensemble detector that does not leverage the ground truth to learn the normal behavior. The automatic nature of our methodology is supported by established unsupervised outlier ensemble theory. We test the performance of our detector on a real-world cyber security dataset provided publicly by the Los Alamos National Lab. Overall, our experiments show that our proposed detector outperforms existing algorithms and produces a 0 False Negative Rate without missing any malicious login event and a False Positive Rate which improves the state-of-the-art. In addition, by detecting malicious authentication events, compared to the majority of the existing works which focus solely on detecting malicious users or computers, we are able to provide insights regarding when and at which systems malicious login events happened. Beyond the application on a public dataset we are working with our industry partner, POST Luxembourg, to employ the proposed detector on their network.
KW - Cybersecurity
KW - Embedding
KW - Ensemble learning
KW - Outlier detection
KW - Semi-supervised learning
UR - http://www.scopus.com/inward/record.url?scp=85083682397&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-43887-6_14
DO - 10.1007/978-3-030-43887-6_14
M3 - Conference contribution
AN - SCOPUS:85083682397
SN - 978-3-030-43886-9
T3 - Communications in Computer and Information Science
SP - 176
EP - 190
BT - Machine Learning and Knowledge Discovery in Databases - International Workshops of ECML PKDD 2019, Proceedings
A2 - Cellier, Peggy
A2 - Driessens, Kurt
PB - Springer
CY - Cham
T2 - 19th Joint European Conference on Machine Learning and Principles and Practice of Knowledge Discovery in Databases, ECML PKDD 2019
Y2 - 16 September 2019 through 20 September 2019
ER -