The Cross-evaluation of Machine Learning-based Network Intrusion Detection Systems


This paper proposes a cross-evaluation framework, meaning that evaluation is carried out across different datasets. We endorse the idea of cross-evaluating ML-NIDS by using malicious samples captured in different network datasets. By performing such cross-evaluations, it is possible to gauge additional properties of ML-NIDS, allowing a better understanding of the state-of-the-art at no extra labelling cost.

However, most related work simply used such data as an ‘additional’ setting in which to perform their experiments. In contrast, in this paper we promote a different approach, based on mixing different network data to cross-evaluate ML-NIDS.
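The mixing idea described above can be illustrated with a toy sketch. This is not the paper's XeNIDS implementation. It assumes synthetic two-feature "flows", a hypothetical helper `make_flows`, and a simple nearest-centroid classifier standing in for a real ML-NIDS. Benign samples come from one (simulated) dataset, malicious samples from another, and the detector is trained and tested on the mix.

```python
import random

def make_flows(n, mean, label, seed):
    """Hypothetical stand-in for a labelled dataset: n synthetic flows as (features, label)."""
    rng = random.Random(seed)
    return [([rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)], label) for _ in range(n)]

def split(samples, train_frac=0.8):
    """Split a list of samples into a training part and a test part."""
    cut = int(len(samples) * train_frac)
    return samples[:cut], samples[cut:]

def mix(benign, malicious):
    """Combine benign flows from one dataset with malicious flows from another."""
    return benign + malicious

def fit_centroids(train):
    """Nearest-centroid classifier: compute the mean feature vector per label."""
    sums, counts = {}, {}
    for x, y in train:
        acc = sums.setdefault(y, [0.0] * len(x))
        for i, v in enumerate(x):
            acc[i] += v
        counts[y] = counts.get(y, 0) + 1
    return {y: [v / counts[y] for v in acc] for y, acc in sums.items()}

def predict(centroids, x):
    """Return the label whose centroid is closest to x (squared Euclidean distance)."""
    return min(centroids, key=lambda y: sum((a - b) ** 2 for a, b in zip(x, centroids[y])))

def detection_rate(centroids, test):
    """Fraction of malicious test flows (label 1) that are flagged as malicious."""
    malicious = [x for x, y in test if y == 1]
    return sum(predict(centroids, x) == 1 for x in malicious) / len(malicious)

# Assumption: dataset A supplies the benign traffic, dataset B the attacks.
benign_a = make_flows(200, 0.0, 0, seed=1)
malicious_b = make_flows(200, 5.0, 1, seed=2)

benign_train, benign_test = split(benign_a)
mal_train, mal_test = split(malicious_b)

train = mix(benign_train, mal_train)   # cross-dataset training set
test = mix(benign_test, mal_test)      # cross-dataset test set

model = fit_centroids(train)
rate = detection_rate(model, test)
print(f"detection rate on cross-dataset attacks: {rate:.2f}")
```

In a real cross-evaluation the two sources would be actual labelled flow datasets with a shared feature set, and making them compatible is a large part of the difficulty the paper points to. The sketch only shows where the mixing happens.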

Link: https://arxiv.org/abs/2203.04686

Anomaly detection plays a supporting role in discovering real intrusion attacks.

Specifically, in NID, by creating a training dataset where the samples are distinguished between benign and malicious, it is possible to develop a fully autonomous Machine Learning-based Network Intrusion Detection System (ML-NIDS).

Abstract: Enhancing Network Intrusion Detection Systems (NIDS) with supervised Machine Learning (ML) is tough. ML-NIDS must be trained and evaluated, operations requiring data where benign and malicious samples are clearly labelled.

Such labels demand costly expert knowledge, resulting in a lack of real deployments, as well as in papers always relying on the same outdated data. The situation improved recently, as some efforts disclosed their labelled datasets. However, most past works used such datasets merely as ‘yet another’ testbed, overlooking the added potential provided by such availability.

In contrast, we promote using such existing labelled data to cross-evaluate ML-NIDS. Such an approach has received only limited attention and, due to its complexity, requires a dedicated treatment. We hence propose the first cross-evaluation model. Our model highlights the broader range of realistic use-cases that can be assessed via cross-evaluations, allowing the discovery of still unknown qualities of state-of-the-art ML-NIDS. For instance, their detection surface can be extended at no additional labelling cost. However, conducting such cross-evaluations is challenging. Hence, we propose the first framework, XeNIDS, for reliable cross-evaluations based on Network Flows. By using XeNIDS on six well-known datasets, we demonstrate the concealed potential, but also the risks, of cross-evaluations of ML-NIDS.