Discovering Anomalous Behavior of IoT Devices Based on Unsupervised Learning

 2022-08-11 04:08

Total thesis length: 36,530 characters

Abstract

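The Internet of Things (IoT), as the name suggests, is an internet in which things are connected to things. In recent years IoT devices have developed rapidly, and the IoT is regarded as the third revolutionary development of the world's information industry after the computer and the Internet. Today IoT devices are everywhere: from shared bicycles and surveillance cameras, to smart home and smart medical devices, to military defense and energy and power systems. This prevalence raises concern about the security of IoT devices. In fact, intrusion and surveillance targeting IoT devices continually threaten their security, so detecting anomalous device behavior is essential to keep the devices from being exploited by attackers.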

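First, this thesis analyzes IoT network-layer data flows and selects several representative features, such as the time periods in which a device accesses the network, the access frequency, the traffic volume, and the direction of data access, to build an IoT device behavior matrix. Based on this matrix, it then analyzes the kinds of anomalous points and classifies them into three types: values outside the normal range, scene anomalies, and 0-1 anomalies.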
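To make the idea of a behavior matrix concrete, the following is a minimal sketch, assuming each flow record is a tuple of device name, hour of day, byte count, and direction. The record layout, the sample data, and the function name `build_behavior_matrix` are hypothetical; the thesis's actual feature encoding is not shown in this excerpt.

```python
from collections import defaultdict

# Hypothetical flow records: (device, hour_of_day, n_bytes, direction).
# The field layout and values are assumptions made for illustration only.
FLOWS = [
    ("camera", 9, 1200, "out"),
    ("camera", 9, 800, "out"),
    ("camera", 23, 300, "in"),
    ("lamp", 20, 100, "in"),
]

def build_behavior_matrix(flows):
    """Return {device: [active_hours, flow_count, total_bytes, outbound_ratio]}."""
    hours = defaultdict(set)      # distinct hours in which the device was active
    count = defaultdict(int)      # access frequency
    total = defaultdict(int)      # traffic volume in bytes
    outbound = defaultdict(int)   # number of outbound flows
    for device, hour, n_bytes, direction in flows:
        hours[device].add(hour)
        count[device] += 1
        total[device] += n_bytes
        if direction == "out":
            outbound[device] += 1
    return {
        d: [len(hours[d]), count[d], total[d], outbound[d] / count[d]]
        for d in count
    }

matrix = build_behavior_matrix(FLOWS)
```

Each row of the resulting matrix is one device's feature vector, which is the form of input that distance-based detectors such as DBSCAN and LOF expect. In practice the columns would usually be rescaled so that the byte count does not dominate the distances.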

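Next, it analyzes and compares three typical existing unsupervised anomaly detection algorithms: K-means, DBSCAN, and LOF. LOF and DBSCAN each have a strength: LOF achieves a very high detection rate, while DBSCAN has a very low false positive rate. Drawing on the k-nearest-neighbor concept of LOF and the cluster-forming idea of DBSCAN, the thesis proposes a new algorithm, K-DBSCAN.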
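The k-nearest-neighbor concept borrowed from LOF can be illustrated with a small standard-library implementation of the standard Local Outlier Factor score. This is a sketch for small data sets, not the thesis's code; the helper names are hypothetical, and it assumes the data contain no group of more than k identical points (which would make a density infinite).

```python
import math

def knn(points, i, k):
    """(distance, index) pairs of the k nearest neighbors of points[i]."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return dists[:k]

def lof_scores(points, k):
    """Local Outlier Factor of every point (larger means more anomalous)."""
    neigh = [knn(points, i, k) for i in range(len(points))]
    k_dist = [n[-1][0] for n in neigh]  # distance to the k-th neighbor

    def lrd(i):
        # Local reachability density: inverse of the mean reachability
        # distance from point i to each of its k nearest neighbors.
        reach = [max(k_dist[j], d) for d, j in neigh[i]]
        return len(reach) / sum(reach)

    density = [lrd(i) for i in range(len(points))]
    # LOF: average neighbor density divided by the point's own density.
    return [
        sum(density[j] for _, j in neigh[i]) / (len(neigh[i]) * density[i])
        for i in range(len(points))
    ]

# Four points on a unit square plus one distant point.
sample = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
scores = lof_scores(sample, k=2)
```

Points inside a uniformly dense region score close to 1, while a point whose neighbors are much denser than itself scores well above 1. A threshold on this score decides which points are reported as anomalies.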

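Then the proposed algorithm, K-DBSCAN, is presented. It combines the k-nearest-neighbor concept of LOF with the cluster-forming method of DBSCAN, forming clusters through k-nearest neighbors in order to reduce the algorithm's sensitivity to its parameters. It also adopts a new anomaly detection mechanism with a relatively low threshold to raise the detection rate, and introduces a compensation mechanism that compensates misjudged points in two passes to lower the false positive rate.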
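The full description of K-DBSCAN is not available in this excerpt, so the following is only an illustrative sketch of the general idea of forming clusters from k-nearest-neighbor relations instead of a fixed radius. Linking points that are mutual k-nearest neighbors, the `min_size` threshold, and the function names are all assumptions made here; this is not the author's algorithm, and the compensation mechanism is not reproduced.

```python
import math

def knn_sets(points, k):
    """For each point, the set of indices of its k nearest neighbors."""
    result = []
    for i, p in enumerate(points):
        order = sorted(
            (j for j in range(len(points)) if j != i),
            key=lambda j: math.dist(p, points[j]),
        )
        result.append(set(order[:k]))
    return result

def mutual_knn_outliers(points, k, min_size):
    """Flag points whose mutual-kNN cluster has fewer than min_size members.

    Assumption: two points are linked when each is among the other's
    k nearest neighbors; clusters are connected components of these links.
    """
    neigh = knn_sets(points, k)
    label = [None] * len(points)
    clusters = []
    for start in range(len(points)):
        if label[start] is not None:
            continue
        # Grow one cluster by following mutual k-nearest-neighbor links.
        label[start] = len(clusters)
        members, stack = [start], [start]
        while stack:
            i = stack.pop()
            for j in neigh[i]:
                if label[j] is None and i in neigh[j]:
                    label[j] = label[start]
                    members.append(j)
                    stack.append(j)
        clusters.append(members)
    return sorted(i for c in clusters if len(c) < min_size for i in c)

sample = [(0, 0), (0, 1), (1, 0), (1, 1), (5, 5)]
outliers = mutual_knn_outliers(sample, k=2, min_size=2)
```

Because neighborhoods are defined by rank rather than by an absolute distance, the same K adapts to regions of different density, which is one plausible reason a single K could be less sensitive than the pair MinPts and Eps.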

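Finally, the algorithm is implemented and tested. The results show that the improved algorithm keeps a detection rate of about 90% with a fairly low false positive rate, that its performance stays stable as the parameter K varies, and that it is not affected by the form in which anomalous points are input.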
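The two reported quantities can be computed as below. This sketch assumes the common definitions: detection rate is the share of true anomalies that are flagged, and false positive rate is the share of normal points that are wrongly flagged. The thesis's exact definitions are not visible in this excerpt, and the function name and the numbers in the example are made up for illustration.

```python
def detection_metrics(true_outliers, flagged, n_points):
    """Return (detection_rate, false_positive_rate) for one test run.

    Assumed definitions:
      detection rate      = flagged true anomalies / all true anomalies
      false positive rate = flagged normal points  / all normal points
    """
    true_outliers, flagged = set(true_outliers), set(flagged)
    hits = len(true_outliers & flagged)
    false_alarms = len(flagged - true_outliers)
    n_normal = n_points - len(true_outliers)
    return hits / len(true_outliers), false_alarms / n_normal

# Hypothetical run: 100 points, 10 true anomalies (indices 0..9);
# the detector flags 9 of them plus 2 normal points.
rate, fpr = detection_metrics(range(10), list(range(9)) + [50, 51], 100)
```

Reporting both numbers matters because a lower flagging threshold raises the detection rate and the false positive rate together, which is the trade-off the compensation mechanism is meant to offset.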

Keywords: IoT devices, anomalous behavior recognition, unsupervised learning

ABSTRACT

The Internet of Things (IoT), as the name suggests, is a network of devices connected to each other over the Internet. In recent years the IoT has developed rapidly and is considered the third revolutionary development in the world information industry after the computer and the Internet. Nowadays IoT devices can be seen everywhere, such as shared bikes, Internet cameras, home appliances and smart medical devices. IoT devices even play an important role in military defense and the energy industry. This raises concern about the security of IoT devices. In fact, many attackers have tried to invade and exploit IoT devices for illegal gain, so anomaly detection on IoT device behavior is important.

First, we analyzed the network-layer data flow of IoT devices. We chose several representative features of the data flow, such as communication time and communication frequency, and used them to build a device behavior matrix. We then grouped outliers into three types: outliers whose values fall beyond the normal range, outliers that occur in the wrong scene, and 0-1 outliers.

Next, we examined three unsupervised anomaly detection algorithms: K-means, DBSCAN and LOF. DBSCAN has a low false positive rate and LOF has a high detection rate. We therefore propose a new algorithm that takes the k-nearest-neighbor concept from LOF and the clustering approach from DBSCAN, aiming to improve stability when the parameters change and to improve the detection rate.

After that, we describe the new algorithm, which combines the k-nearest-neighbor concept from LOF with clustering from DBSCAN. We introduce a single parameter, K, to replace the two original parameters, MinPts and Eps, in the hope of improving stability. We use a low threshold to improve the detection rate and a compensation mechanism to reduce the false positive rate.

Finally, we implemented and tested the algorithm. The results show a detection rate of about 90% and a low false positive rate. The algorithm is also more stable than before: its performance is not affected by changes in K or by the way in which the outliers are input.

KEY WORDS: IoT Devices, Abnormal Behavior Recognition, Unsupervised Learning

Contents

Abstract

ABSTRACT

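1 Introduction

1.1 Research Background and Significance

1.1.1 Development of the Internet of Things

1.1.2 IoT Security Incidents

1.1.3 Significance of the Project

1.2 Research Status at Home and Abroad

1.2.1 Research Status of IoT Anomalous Behavior

1.2.2 Research Status of Unsupervised Learning

1.3 Tasks and Main Work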

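2 Analysis of Anomalous Behavior of IoT Devices

2.1 Feature Selection for Network-Layer Data Flows

2.2 Analysis of Types of Anomalous IoT Device Behavior

2.3 Chapter Summary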

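3 Analysis of Typical Unsupervised Anomaly Detection Algorithms

3.1 Anomaly Detection with DBSCAN Clustering

3.1.1 Introduction to the DBSCAN Algorithm

3.1.2 Algorithm Analysis

3.1.3 Test Results

3.2 Anomaly Detection with K-means Clustering

3.2.1 Introduction to the K-means Algorithm

3.2.2 Algorithm Analysis

3.2.3 Algorithm Testing

3.3 Anomaly Detection with the LOF (Local Outlier Factor) Algorithm

3.3.1 Introduction to the LOF Algorithm

3.3.2 Algorithm Analysis

3.4 Comparison and Analysis of Unsupervised Algorithms

3.5 Chapter Summary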

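4 The Unsupervised Anomaly Detection Algorithm K-DBSCAN

4.1 Motivation for the Improvement

4.2 Description of the K-DBSCAN Algorithm

4.3 Theoretical Analysis

4.4 Chapter Summary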

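5 Testing and Analysis of the K-DBSCAN Algorithm

5.1 Experimental Environment

5.2 Testing of the Unsupervised Anomaly Detection Algorithm

5.2.1 Detection Rate and False Positive Rate of Anomalous Points

5.2.2 Effect of Parameter K on Anomaly Detection

5.2.3 Effect of Data Input Form, Position, and Order on the Algorithm

5.3 Comparison of Algorithm Performance

5.4 Chapter Summary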

6 Summary and Outlook

6.1 Summary

6.2 Outlook

References

Appendix
