Research on Adversarial Example Generation Based on Deep Learning

 2022-05-11 08:05

Thesis length: 33,319 characters

Abstract (in Chinese)

To date, many deep learning models can perform image recognition tasks with high accuracy and efficiency. From identifying objects in images and automatically generating captions to question answering, face recognition, and gait recognition, these models have demonstrated both power and practical value. However, some of their weaknesses are equally well known: they rely too heavily on superficial features, and their generalization ability is weak. Adversarial examples are crafted precisely against these weaknesses: by altering only a small part of the original data, for example a few pixels in an image that are imperceptible to the human eye, one can fool a deep learning model into making a wrong prediction. Several methods have been proposed to defend against adversarial examples. In this thesis, we analyze the statistical characteristics of samples in current image and text datasets and study how information with different properties affects current deep learning classifiers. Building on the separation and filtering of different kinds of information, we propose a method that improves the accuracy and generalization of current classifiers on Domain Adaptation tasks, together with a scheme for generating the corresponding adversarial examples.

  1. For the image recognition problem, we first use the gray-level co-occurrence matrix (GLCM) to extract information that prior knowledge suggests is superficial, such as the texture of an image rather than its meaningful semantic content. We then project the integrated features learned by an ordinary model onto the vector space orthogonal to these GLCM features, filtering out the influence of this superficial information on classification. This helps the model resist small perturbations in the dataset or shifts in its statistical distribution, and thus improves its ability to defend against adversarial examples.
  2. Building on the above, we experiment on the MNIST digit recognition dataset, rotating its samples by different angles to generate new samples and test the robustness of our method. We then generate new adversarial examples by modifying the learned GLCM features of the images.
  3. Analogously to the image setting, we run the same experiments on text datasets. Instead of GLCM features, here we improve the model's robustness by filtering out the datasets' BOW (Bag-of-Words) features, and we verify the conclusion with experiments on the SNLI, MultiNLI, and SICK datasets.
  4. Exploiting the BOW features of these text datasets and the defects inherent in manually constructed datasets, we generate new adversarial examples by adding or modifying relevant keywords.
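The GLCM extraction and orthogonal-projection filtering described in item 1 can be sketched as follows. This is a minimal illustration, not the thesis's actual code: the tiny `glcm` routine stands in for library implementations such as skimage's `graycomatrix`, and `filter_texture` removes the component of learned feature vectors lying in the span of assumed texture directions `G` via h - G (G^T G)^{-1} G^T h.

```python
import numpy as np

def glcm(img, levels=8, offset=(0, 1)):
    """Normalized gray-level co-occurrence matrix for one pixel offset.

    Counts how often gray level i co-occurs with gray level j at the
    given (dy, dx) offset; a stand-in for library routines such as
    skimage.feature.graycomatrix.
    """
    dy, dx = offset
    m = np.zeros((levels, levels))
    h, w = img.shape
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                m[img[y, x], img[yy, xx]] += 1
    return m / m.sum()

def filter_texture(H, G):
    """Project each row of H (n x d learned features) onto the orthogonal
    complement of span(G), where G (d x k) holds texture directions:
    H_perp = H - H G (G^T G)^{-1} G^T.
    """
    P = G @ np.linalg.pinv(G)   # symmetric projector onto the texture subspace
    return H - H @ P            # texture component removed from each feature row
```

After filtering, a classifier head trained on the projected features cannot exploit directions spanned by the texture statistics, which is the mechanism the method relies on.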

Keywords: deep learning, adversarial examples, generalization, GLCM, generation and defense

Abstract

So far, many deep learning models have been proposed for image recognition tasks, and most of them perform efficiently with high accuracy. They have shown their power and practical value in tasks ranging from object recognition and automatic caption generation to question answering and face or gait recognition. However, their weaknesses are just as evident as their impressive abilities: they rely too heavily on superficial features, and their generalization ability does not live up to expectations. When confronted with adversarial examples, which are crafted to deceive deep learning models by adjusting tiny, practically invisible parts of the original images (e.g. a few pixels), they are likely to make wrong decisions and lose accuracy. This can be dangerous for applications that are widely deployed in daily life. Some methods that help deep learning models detect and resist adversarial examples have been proposed. In this thesis, we analyze the influence of information with different properties by examining the statistical characteristics of current image and text datasets. By separating and filtering these disparate kinds of information, or features, we propose a new method that improves the accuracy of deep learning models on domain adaptation tasks as well as their ability to generalize. Building on this analysis, we also propose a scheme for creating corresponding adversarial examples from the samples in the dataset.

  1. To deal with the image recognition problem, we first use the gray-level co-occurrence matrix (GLCM) to extract features that we consider superficial, such as texture information, which has little to do with the content of the image, unlike meaningful semantic features. We then project the integrated features learned by an ordinary deep learning model (e.g. a simple CNN) onto the orthogonal complement of the GLCM features to filter out the influence these superficial features might have. By doing this, we help the model resist disturbances caused by invisible adjustments to the dataset or shifts in its statistical distribution, thus improving its ability to detect and withstand adversarial examples.
  2. Based on the steps mentioned above, we conducted experiments on the MNIST dataset, one of which rotates the pictures in the dataset to create new samples, and then uses this newly built data to test the robustness of our method. We also generate new adversarial examples by modifying the GLCM features extracted by the model.
  3. Similarly, we carried out the same experiments on text datasets. Unlike the image case, here we use BOW (Bag-of-Words) features instead of GLCM features. We tested our method on the SNLI, MultiNLI, and SICK datasets and verified our conclusions.
  4. We built new adversarial examples by adding or modifying keywords, exploiting the defects of the manually constructed datasets and the BOW features they exhibit.
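As a concrete illustration of items 3 and 4, the sketch below builds a bag-of-words vector and perturbs an NLI-style hypothesis by inserting a single trigger word. The vocabulary, trigger choice, and insertion position are illustrative assumptions rather than the thesis's actual procedure; negation words such as "not" are a known annotation artifact correlated with the contradiction label in SNLI-style data.

```python
from collections import Counter

def bow(tokens, vocab):
    """Bag-of-words count vector over a fixed vocabulary."""
    counts = Counter(tokens)
    return [counts[w] for w in vocab]

def perturb(hypothesis, trigger="not"):
    """Insert a label-correlated trigger word before the final token,
    changing the BOW features while leaving the rest of the sentence
    intact (an illustrative keyword-based adversarial edit).
    """
    tokens = hypothesis.split()
    tokens.insert(-1, trigger)  # insert just before the last token
    return " ".join(tokens)
```

For example, `perturb("a man is sleeping")` yields `"a man is not sleeping"`: a one-word edit that shifts the BOW cue toward contradiction while the premise is untouched, which is the kind of dataset defect item 4 exploits.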

Contents

Abstract (in Chinese)

Abstract

Chapter 1 Introduction

1.1 Background

1.1.1 Image Recognition and Its Difficulties

1.1.2 Adversarial Examples: Generation and Defense

1.1.3 Gray-Level Co-occurrence Matrix (GLCM)

1.1.4 Domain Adaptation and Generalization

1.2 Contents and Organization of This Thesis

Chapter 2 A Filtering Algorithm for Irrelevant Information

2.1 Main Method and Model Architecture

2.2 The Filtering Algorithm

2.2.1 Extraction of GLCM Features

2.2.2 Filtering of GLCM Features

Chapter 3 Experiments and Results

3.1 Experimental Materials

3.1.1 Models Used

3.1.2 Datasets Used

3.2 Experimental Setup

3.3 Experimental Results

3.3.1 Extracting (Semantics-Independent) Texture Information with GLCM

3.3.2 Projection onto the Vector Space Orthogonal to the Texture Information

3.3.3 Validation Experiments on the MNIST Dataset

3.4 Generation of Adversarial Examples

Chapter 4 A Transfer Study on Text Datasets

4.1 Overview of Experiments

4.1.1 Background

4.1.2 Definitions

4.2 Experimental Materials

4.2.1 Models Used

4.2.2 Datasets Used

4.3 Evaluation Methods

4.3.1 Stress Test

4.3.2 Cross-Domain

4.4 Experimental Results

4.4.1 Stress Test

4.4.2 Cross-Domain

4.5 Case Discussion

4.6 Generation of Adversarial Examples

4.7 Related NLI Work
