Network Anomaly Detection Based on Machine Learning

 2022-04-14 08:04

Total thesis length: 46,662 characters

ABSTRACT

Traffic-based network anomaly detection is an important research topic in cyberspace security. Mainstream machine learning approaches often perform poorly, for two reasons. Representing the data as a set of independent points ignores the temporal relationships within the traffic and loses correlation information, while representing it as a time series lengthens the input sequence and increases the complexity of data processing.

To address these problems, this thesis proposes an efficient time-series anomaly detection model. The model captures temporal correlation through recurrence and uses long short-term memory (LSTM) units as its basic structure. To overcome the performance degradation of LSTM units on long input sequences, we introduce an attention mechanism that modifies the output of the units' intermediate states. The intermediate output at every step of the input sequence is retained, and the model is trained to assign attention to the inputs and relate it to these intermediate outputs, which makes the model more targeted. Finally, the output sequence is passed through a classifier that determines the type of network attack.
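
The attention step described above can be illustrated with a minimal sketch. It assumes the per-step LSTM outputs are already available as plain vectors and that attention scores come from a dot product with a learned vector; the abstract does not state the scoring function, so `attention_pool`, `query`, and the sample numbers are hypothetical and are not the thesis's actual implementation.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(
    hidden_states: Sequence[Sequence[float]],
    query: Sequence[float],
) -> Tuple[List[float], List[float]]:
    """Weight every retained per-step hidden state and sum them.

    hidden_states: one vector per time step (stand-ins for LSTM outputs).
    query: hypothetical learned scoring vector; dot-product scoring is an
           assumption, not something the abstract specifies.
    Returns (attention weights, context vector).
    """
    # One scalar score per time step.
    scores = [sum(h_i * q_i for h_i, q_i in zip(h, query)) for h in hidden_states]
    # Normalise the scores so the weights sum to 1.
    weights = softmax(scores)
    dim = len(hidden_states[0])
    # Context vector: attention-weighted sum of all hidden states.
    context = [
        sum(w * h[d] for w, h in zip(weights, hidden_states)) for d in range(dim)
    ]
    return weights, context


# Three time steps with 2-dimensional hidden states (made-up numbers).
states = [[0.1, 0.0], [0.9, 0.2], [0.3, 0.8]]
weights, context = attention_pool(states, query=[1.0, 1.0])
```

In a full model the context vector, rather than only the last hidden state, would be fed to the classifier, so information from early steps of a long sequence is not lost.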

Experiments on the dataset show that the model reaches a macro precision and a macro recall of 98% and a macro F1 score of 0.98, a clear improvement over other machine learning algorithms. This thesis proposes an optimized design of the traditional LSTM model structure that overcomes the original model's weakness of scattered information and poor performance. The application of deep learning combined with an attention mechanism to traffic detection also offers ideas that research in related fields can draw on.
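
For readers unfamiliar with macro-averaged metrics, the following sketch shows one common way to compute them: score each class separately, then take the unweighted mean. It assumes macro F1 is the mean of the per-class F1 values; the abstract does not spell out its definition. The function name `macro_scores` and the toy labels are illustrative only and are not results from the thesis.

```python
from typing import Dict, List, Sequence


def macro_scores(y_true: Sequence[str], y_pred: Sequence[str]) -> Dict[str, float]:
    """Compute macro precision, recall and F1 as unweighted per-class means."""
    classes = sorted(set(y_true) | set(y_pred))
    precisions: List[float] = []
    recalls: List[float] = []
    f1s: List[float] = []
    for c in classes:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == c and p == c)  # true positives
        fp = sum(1 for t, p in pairs if t != c and p == c)  # false positives
        fn = sum(1 for t, p in pairs if t == c and p != c)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (
            2 * precision * recall / (precision + recall)
            if precision + recall
            else 0.0
        )
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    return {
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Toy labels for three traffic classes (made-up, not thesis data).
y_true = ["benign", "benign", "dos", "dos", "scan", "scan"]
y_pred = ["benign", "dos", "dos", "dos", "scan", "benign"]
scores = macro_scores(y_true, y_pred)
```

Because every class counts equally, macro averaging does not let a dominant benign class hide poor detection of rare attack types, which matters for imbalanced traffic data.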

KEY WORDS: network traffic, anomaly detection, LSTM, attention mechanism

CONTENTS

Abstract (Chinese)

Abstract (English)

1. Introduction

1.1 Research Background and Significance

1.2 Research Status and Development Trends

1.2.1 Research Status

1.2.2 Development Trends

1.3 Main Research Content of This Thesis

1.4 Organization of This Thesis

2. Related Theory

2.1 Network Security Attacks

2.1.1 DoS Attacks

2.1.2 Network Scanning

2.1.3 Web Attacks

2.1.4 Botnets

2.1.5 Infiltration Attacks

2.1.6 Brute-Force Attacks

2.2 Overview of RNNs

2.3 The LSTM Model

2.4 Attention Mechanism

2.4.1 Encoder-Decoder Framework

2.4.2 Attention Model

2.5 Chapter Summary

3. Anomaly Detection Model Based on the Attention Mechanism

3.1 Hidden State Output

3.1.1 Forget Gate

3.1.2 Input Gate

3.1.3 Output Gate

3.2 Attention Computation

3.3 Traffic Detection and Classification

3.4 Chapter Summary

4. Experimental Design and Result Analysis

4.1 Experimental Environment and Dataset

4.1.1 Environment Setup

4.1.2 The CICIDS2017 Dataset

4.2 Dataset Optimization

4.2.1 Data Preprocessing

4.2.2 Merging Minority Classes

4.2.3 Model Evaluation

4.3 Evaluation Metrics

4.4 Experimental Results and Analysis

4.5 Chapter Summary

5. Summary and Outlook

5.1 Summary of Research

5.2 Innovations

5.3 Future Work

References

Appendix
