WO2022037295A1 - Targeted attack method for deep hash retrieval and terminal device - Google Patents

Targeted attack method for deep hash retrieval and terminal device

Info

Publication number
WO2022037295A1
Authority
WO
WIPO (PCT)
Prior art keywords
hash
sample
deep
retrieval
adversarial
Prior art date
Application number
PCT/CN2021/104818
Other languages
English (en)
Chinese (zh)
Inventor
夏树涛
白家旺
陈斌
戴涛
李清
齐竹云
Original Assignee
鹏城实验室
清华大学深圳国际研究生院
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 鹏城实验室, 清华大学深圳国际研究生院
Publication of WO2022037295A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/903 Querying
    • G06F16/90335 Query processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/60 Protecting data
    • G06F21/62 Protecting access to data via a platform, e.g. using keys or access control rules
    • G06F21/6218 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database
    • G06F21/6227 Protecting access to data via a platform, e.g. using keys or access control rules to a system of files or objects, e.g. local or distributed file system or database where protection concerns the structure of data, e.g. records, types, queries

Definitions

  • The invention relates to the technical field of hash retrieval, and in particular to a targeted attack method and terminal device for deep hash retrieval.
  • Approximate nearest neighbor retrieval over large-scale data is efficient and performs well, and many search engines, such as Google and Bing, use it to retrieve images or videos.
  • Hash-based retrieval in particular has received growing attention: it maps data into a compact binary space, so that similarity can be measured with the Hamming distance and computation becomes more efficient.
  • Hash retrieval methods based on deep learning currently achieve the best performance in hash retrieval.
  • However, many studies have shown that deep learning models are vulnerable to adversarial attacks, which degrade their performance.
  • Adversarial sample generation can be divided into two types: untargeted attacks and targeted attacks.
  • An untargeted attack aims to degrade the performance of the attacked model.
  • A targeted attack aims to achieve a specific goal chosen by the attacker (for example, in a classification task, the goal is to have adversarial examples classified into a specified class).
  • There are few methods for adversarial attacks on retrieval tasks, and no targeted attack method for deep hash retrieval, which hinders research on the robustness and security of retrieval systems.
  • The technical problem to be solved by the present invention is to provide, in view of the deficiencies of the prior art, a targeted attack method and terminal device for deep hash retrieval, aiming to address the lack of a targeted attack method for deep hash retrieval in the prior art, which hinders research on the robustness and security of retrieval systems.
  • a targeted attack method for deep hash retrieval comprising the steps of:
  • the label t specifies the category expected to be returned by the attacker, and the label t is different from the category of the query image x;
  • the representative hash code ha is obtained by adopting the bit voting algorithm
  • tanh is the hyperbolic tangent function
  • x' is the adversarial sample
  • the sample xi is a picture or a video.
  • The targeted attack method for deep hash retrieval, wherein the step of using a bit voting algorithm to obtain the representative hash code ha includes:
  • From the hash codes of all samples in the sample set, the representative hash code ha is obtained according to the bit voting method.
  • The targeted attack method for deep hash retrieval, wherein the step of obtaining the representative hash code ha from the hash codes of all samples in the sample set includes:
  • A computer-readable storage medium, wherein the computer-readable storage medium stores one or more programs, and the one or more programs can be executed by one or more processors so as to implement the steps of the targeted attack method for deep hash retrieval described in the present invention.
  • a terminal device comprising: a processor, a memory and a communication bus; a computer-readable program executable by the processor is stored on the memory;
  • the communication bus implements connection communication between the processor and the memory
  • the present invention provides a targeted attack method, storage medium and terminal device for deep hash retrieval.
  • First, the targeted attack in retrieval is defined as a point-to-set optimization problem, that is, minimizing the average distance between the hash code of the adversarial sample and the set of hash codes of the desired category. Then a bit-voting algorithm is designed to obtain the optimal representative hash code of that set. To keep the adversarial perturbation imperceptible, the adversarial noise is further optimized under an infinity-norm constraint, so that the distance between the hash code of the adversarial sample and the representative hash code is as small as possible.
  • The method of the invention not only keeps the adversarial sample indistinguishable from the original sample, but also achieves a good targeted attack effect; adopting this attack method when designing a deep hash retrieval model helps improve the security and robustness of the model.
  • The generated adversarial examples can make the retrieval model return samples of the class expected by the attacker.
  • FIG. 1 is a flowchart of a preferred embodiment of a targeted attack method for deep hash retrieval provided by the present invention.
  • FIG. 2 is a schematic diagram of a targeted attack method for deep hash retrieval provided by the present invention.
  • FIG. 3 is a schematic structural diagram of a terminal device provided by the present invention.
  • the present invention provides a targeted attack method, storage medium and terminal device for deep hash retrieval.
  • The present invention is described in further detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are only used to explain the present invention, not to limit it.
  • When the user uploads a picture, a hash function converts it into a binary (0/1) code. The distance between this code and the code of every picture in the database is then calculated using the Hamming distance: the picture's binary code is XORed with each binary code in the database, and the number of 1s in the result is the distance. All distances are sorted, the 100 closest pictures are selected as similar pictures, and the original pictures are then looked up by index and displayed.
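  • The XOR-and-count procedure described above can be illustrated with a minimal Python sketch. It assumes each code is packed into an integer; the function names `hamming_distance` and `retrieve_top_k` and the 4-bit example codes are hypothetical and chosen only for illustration.

```python
def hamming_distance(code_a: int, code_b: int) -> int:
    """Number of differing bits between two binary codes packed into integers."""
    # XOR leaves a 1 wherever the two codes disagree; count those 1s.
    return bin(code_a ^ code_b).count("1")


def retrieve_top_k(query_code, database_codes, k=100):
    """Return the indices of the k database codes nearest to the query."""
    # sorted() is stable, so equally distant items keep their database order.
    ranked = sorted(
        range(len(database_codes)),
        key=lambda i: hamming_distance(query_code, database_codes[i]),
    )
    return ranked[:k]


# Toy database of 4-bit codes (real systems use longer codes).
database = [0b0110, 0b1110, 0b1001, 0b0100]
query = 0b0110
distances = [hamming_distance(query, code) for code in database]
top2 = retrieve_top_k(query, database, k=2)
```

  • The returned indices play the role of the "index" mentioned above: they are used to look up and display the original pictures.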
  • As an example, the CIFAR-10 data set can be used. First, GIST features are extracted from the data set, so that each image is represented by a vector.
  • Each image is represented by a 512-dimensional vector, so 10,000 pictures form a 10000×512 matrix.
  • The data is divided into a training set and a test set; the training set is used to train the hash function.
  • The test set is used to measure precision and recall.
  • The hash function is trained on the training set.
  • The training data and the test data are each converted into hash codes by the hash function. The distance from each test item to the training data is calculated and sorted, and the 100 pictures with the smallest distance are selected; these 100 pictures are the approximate nearest neighbors.
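  • How precision and recall might be measured for one test query can be sketched as follows. This uses the standard definitions (hits among the top k divided by k, and hits divided by the number of relevant items); the function name, the label strings and the counts are hypothetical illustration values, not figures from the source.

```python
def precision_recall_at_k(ranked_labels, query_label, k, num_relevant):
    """Precision and recall of the top-k results for a single query.

    ranked_labels: labels of retrieved items, nearest first.
    num_relevant:  how many items in the whole database share the query label.
    """
    top = ranked_labels[:k]
    hits = sum(1 for label in top if label == query_label)
    precision = hits / k
    recall = hits / num_relevant if num_relevant else 0.0
    return precision, recall


# Labels of the five nearest database items for a "dog" query (toy data).
ranked_labels = ["dog", "dog", "cat", "dog", "fish"]
precision, recall = precision_recall_at_k(
    ranked_labels, "dog", k=4, num_relevant=6
)
```

  • Averaging these two values over all test queries gives the figures used to compare hash functions.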
  • Hash retrieval methods based on deep learning currently achieve the best performance in hash retrieval; however, research shows that deep learning models are vulnerable to adversarial attacks, which degrade their performance.
  • Adversarial sample generation can be divided into two types: untargeted attacks and targeted attacks.
  • An untargeted attack aims to degrade the performance of the attacked model.
  • A targeted attack aims to achieve a specific goal chosen by the attacker (for example, in a classification task, the goal is to have adversarial examples classified into a specified class).
  • the targeted attack methods in classification cannot be directly transferred to retrieval.
  • Embodiments of the present invention provide a targeted attack method for deep hash retrieval, which includes the steps:
  • the label t specifies the category expected to be returned by the attacker, and the label t is different from the category of the query image x;
  • the representative hash code ha is obtained by adopting the bit voting algorithm
  • tanh is the hyperbolic tangent function
  • x' is the adversarial sample
  • The retrieval process for a query sample x is as follows: first, the model outputs the hash code F(x) of x; then the Hamming distance d_H(F(x), F(x_i)) between the query hash code and the hash code of every sample x_i in the database is calculated; finally, the retrieval system sorts the samples in the database by this distance and returns the result.
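  • The Hamming distance used for ranking has a convenient closed form when codes are written with entries in {-1, +1} rather than {0, 1}. That encoding is an assumption made here (a 0/1 code c maps to it through h = 2c - 1), as are the symbols K for the code length and h, h' for two codes:

```latex
d_H(h, h') \;=\; \frac{1}{2}\left(K - h^{\top} h'\right),
\qquad h, h' \in \{-1, +1\}^{K}
```

  • Each agreeing bit contributes +1 to the inner product and each differing bit contributes -1, so h^T h' = K - 2 d_H(h, h'). Minimizing the Hamming distance to a target code is therefore equivalent to maximizing the inner product with it.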
  • The targeted attack method for deep hash retrieval provided by this embodiment first defines the targeted attack in deep hash retrieval as a point-to-set optimization problem, that is, minimizing the average distance between the hash code of the adversarial sample and the set of hash codes of the desired category. Then a bit-voting algorithm is designed to obtain the optimal representative hash code of that set. To keep the adversarial perturbation imperceptible, the adversarial noise is further optimized under an infinity-norm constraint, so that the distance between the hash code of the adversarial example and the representative hash code is as small as possible.
  • The method of this embodiment not only keeps the adversarial sample indistinguishable from the original sample, but also achieves a good targeted attack effect. Adopting this attack method when designing a deep hash retrieval model helps improve the security and robustness of the model, and the resulting adversarial examples make the retrieval model return samples of the class expected by the attacker.
  • The attacker specifies the desired category t to be returned, and t must differ from the true category of x; for example, if the category of x is dog, the desired category t can be cat, pig, fish, chicken, etc., but is not limited thereto.
  • The attacker provides a set of samples X(t) with label t and uses the model F(·) to generate hash codes for all samples in X(t). From these hash codes, the representative hash code ha is obtained by the bit voting method.
  • Then the hyperparameter α is set to a value between 0 and 1 and the loss function is designed, where tanh is the hyperbolic tangent function and x' is the adversarial sample. The gradient with respect to x' is calculated and used to update x' by gradient descent, and the generated adversarial sample x' is projected so that it satisfies the infinity-norm constraint and stays within the image space. It is then judged whether the preset number of updates has been reached: if so, the adversarial sample x' is obtained; if not, the procedure returns to step S06 and continues updating x'. Finally, the adversarial sample x' is input into the deep hash retrieval model, which returns samples of the desired category.
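  • The update loop (gradient step, projection, fixed number of updates) can be sketched in plain Python under several loudly stated assumptions. The loss formula did not survive in this text, so the sketch assumes the form -(1/K) Σ_k h_k tanh(α f_k(x')), which decreases as the relaxed code tanh(α f(x')) aligns with the representative code. The linear `toy_model`, the weight matrix, the step size and the bound `epsilon` are all hypothetical stand-ins for a real deep hash network and its settings.

```python
import math


def sign_code(outputs):
    """Binarize real-valued model outputs into a {-1, +1} hash code."""
    return [1 if value >= 0 else -1 for value in outputs]


def toy_model(x, weights):
    """Hypothetical stand-in for the hash model: f(x) = W (x - 0.5)."""
    return [sum(w * (v - 0.5) for w, v in zip(row, x)) for row in weights]


def targeted_attack(x, weights, anchor_code, alpha=0.5, epsilon=0.1,
                    step=0.05, num_updates=100):
    """Perturb x within an L-infinity ball so its code approaches anchor_code."""
    num_bits = len(anchor_code)
    # Feasible box: within epsilon of x and inside the image range [0, 1].
    lower = [max(v - epsilon, 0.0) for v in x]
    upper = [min(v + epsilon, 1.0) for v in x]
    x_adv = list(x)
    for _ in range(num_updates):
        outputs = toy_model(x_adv, weights)
        # Assumed loss: -(1/K) * sum_k h_k * tanh(alpha * f_k(x')).
        # Its derivative with respect to f_k is the coefficient below.
        coeffs = [-h * alpha * (1.0 - math.tanh(alpha * f) ** 2) / num_bits
                  for h, f in zip(anchor_code, outputs)]
        # Chain rule through the linear model gives the gradient w.r.t. x'.
        grad = [sum(c * row[j] for c, row in zip(coeffs, weights))
                for j in range(len(x_adv))]
        # Gradient-descent update followed by projection onto the box.
        x_adv = [min(max(v - step * g, lo), hi)
                 for v, g, lo, hi in zip(x_adv, grad, lower, upper)]
    return x_adv


weights = [[3.0, -1.0, 0.0], [-1.0, 3.0, 0.0], [0.0, 1.0, 2.0]]
query = [0.45, 0.55, 0.48]   # toy "image" with three pixels in [0, 1]
anchor = [1, -1, -1]         # representative code of the desired category
adversarial = targeted_attack(query, weights, anchor)
original_code = sign_code(toy_model(query, weights))
adversarial_code = sign_code(toy_model(adversarial, weights))
```

  • In this toy setting the query starts with code [-1, 1, 1] and ends with the anchor code while every pixel moves by at most `epsilon`. With a real network the gradient would come from automatic differentiation rather than the hand-written chain rule.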
  • The adversarial sample generated by this algorithm is first input into the hash model: the adversarial query picture of a "dog" is fed to the feature extractor and fully connected layer to obtain the hash code of the adversarial sample.
  • This hash code is then used to retrieve neighbor samples in the database, and the neighbors obtained belong to the category preset by the attacker in the targeted attack, that is, "cat" in the figure.
  • The hyperparameter α is set between 0 and 1 to prevent vanishing gradients during backpropagation and to speed up convergence of the adversarial sample generation algorithm. The loss function is optimized under the constraint that the infinity norm of the difference between the original query image and the generated adversarial sample is smaller than a given threshold; that is, the hash code of the adversarial sample is brought as close as possible to the representative hash code ha while the two samples remain indistinguishable.
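  • Why a value of α below 1 helps against vanishing gradients can be seen from the derivative of tanh(αz), which is α(1 - tanh²(αz)). The short sketch below evaluates it for a large model output; the output value 10.0 and the two α values are arbitrary illustration choices.

```python
import math


def tanh_relaxation_gradient(output, alpha):
    """Derivative of tanh(alpha * output) with respect to output."""
    return alpha * (1.0 - math.tanh(alpha * output) ** 2)


large_output = 10.0  # a strongly saturated pre-binarization output
grad_alpha_one = tanh_relaxation_gradient(large_output, alpha=1.0)
grad_alpha_small = tanh_relaxation_gradient(large_output, alpha=0.1)
```

  • With α = 1 the tanh is saturated and the gradient is close to zero, whereas with α = 0.1 it stays several orders of magnitude larger, so the update of x' keeps making progress.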
  • The preset number of updates is a parameter set by the attacker and can be set to 2000. The number is chosen so that the attack succeeds within an acceptable computation time; stopping before the preset number of updates is reached may leave the generated adversarial examples with a poor attack effect.
  • the sample x' is a picture or a video.
  • This embodiment uses the bit voting algorithm to calculate the representative hash code, which provides an optimization target for targeted adversarial attacks and makes the attack efficient and stable.
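  • Bit voting can be read as a per-bit majority vote over the hash codes of the desired category. The detailed steps are not preserved in this text, so the sketch below is one plausible reading rather than the patented procedure: it assumes codes with entries in {-1, +1} and breaks ties toward +1, and the names `bit_vote` and `total_hamming` are hypothetical.

```python
from itertools import product


def bit_vote(hash_codes):
    """Per-bit majority vote over a set of {-1, +1} hash codes."""
    anchor = []
    for bits in zip(*hash_codes):  # iterate over bit positions
        plus = sum(1 for b in bits if b == 1)
        minus = len(bits) - plus
        # Tie-breaking toward +1 is an assumption of this sketch.
        anchor.append(1 if plus >= minus else -1)
    return anchor


def total_hamming(code, hash_codes):
    """Sum of Hamming distances from one code to every code in the set."""
    return sum(sum(1 for a, b in zip(code, other) if a != b)
               for other in hash_codes)


target_codes = [[1, -1, 1, 1], [1, 1, -1, 1], [-1, -1, 1, 1]]
anchor = bit_vote(target_codes)
# Brute-force check over all 2^4 candidate codes.
best_total = min(total_hamming(list(c), target_codes)
                 for c in product([-1, 1], repeat=4))
```

  • Because the Hamming distance decomposes bit by bit, the majority code attains the smallest total (and hence average) distance to the set, which the brute-force comparison confirms for this toy example.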
  • This embodiment provides a computer-readable storage medium that stores one or more programs, which can be executed by one or more processors to implement the steps of the targeted attack method for deep hash retrieval described in the above embodiments.
  • The present invention also provides a terminal device, as shown in FIG. 3, which includes at least one processor 20, a display screen 21 and a memory 22, and may also include a communications interface 23 and a bus 24.
  • The processor 20, the display screen 21, the memory 22 and the communication interface 23 can communicate with each other through the bus 24.
  • the display screen 21 is set to display a user guide interface preset in the initial setting mode.
  • the communication interface 23 can transmit information.
  • the processor 20 may invoke logic instructions in the memory 22 to perform the methods in the above-described embodiments.
  • logic instructions in the memory 22 can be implemented in the form of software functional units and can be stored in a computer-readable storage medium when sold or used as an independent product.
  • the memory 22 may be configured to store software programs and computer-executable programs, such as program instructions or modules corresponding to the methods in the embodiments of the present disclosure.
  • the processor 20 executes functional applications and data processing by running the software programs, instructions or modules stored in the memory 22, ie, implements the methods in the above embodiments.
  • The memory 22 may include a program storage area and a data storage area: the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created through use of the terminal device, and the like. Additionally, the memory 22 may include high-speed random access memory and may also include non-volatile memory, for example a USB flash drive, a removable hard disk, read-only memory (ROM), random access memory (RAM), a magnetic disk, an optical disk or other media that can store program code, or a transitory storage medium.
  • the present invention provides a targeted attack method, storage medium and terminal device for deep hash retrieval.
  • First, the targeted attack in retrieval is defined as a point-to-set optimization problem, that is, minimizing the average distance between the hash code of the adversarial sample and the set of hash codes of the desired category. Then a bit voting method is designed to obtain the optimal representative hash code of that set. To keep the adversarial perturbation imperceptible, the adversarial noise is further optimized under an infinity-norm constraint, so that the distance between the hash code of the adversarial sample and the representative hash code is as small as possible.
  • The method of the invention not only keeps the adversarial sample indistinguishable from the original sample, but also achieves a good targeted attack effect; adopting this attack method when designing a deep hash retrieval model helps improve the security and robustness of the model.
  • The generated adversarial examples can make the retrieval model return samples of the class expected by the attacker.
  • the present invention provides support for improving the robustness and security of the retrieval system by proposing a targeted adversarial attack method for deep hash retrieval, verifying the robustness of the retrieval model under this attack.
  • The invention corrupts the model's retrieval result by adding imperceptible adversarial noise to the input image, so that samples of the attacker's desired category are returned.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Bioethics (AREA)
  • Software Systems (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Hardware Design (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention relates to a targeted attack method for deep hash retrieval and a terminal device. The method comprises: providing a sample set with a label t, inputting all samples in the sample set into a deep hash retrieval model, and generating the corresponding hash codes; obtaining a representative hash code ha by using a bit voting algorithm; setting the hyperparameter α to a value between 0 and 1, and designing a loss function; calculating the gradient with respect to x' using gradient descent and updating x' with the gradient; projecting the generated adversarial sample x' so that x' satisfies the infinity-norm constraint and the image space; determining whether a preset number of updates has been reached and, if so, obtaining the adversarial sample x'; and inputting the adversarial sample x' into the deep hash retrieval model, which returns samples of the desired category. Adopting this attack method when designing a deep hash retrieval model can improve the security and robustness of the model, and the generated adversarial sample can make the retrieval model return samples of the category expected by the attacker.
PCT/CN2021/104818 2020-08-20 2021-07-06 Targeted attack method for deep hash retrieval and terminal device WO2022037295A1 (fr)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010841276.8 2020-08-20
CN202010841276.8A CN112115317B (zh) 2020-08-20 2020-08-20 一种针对深度哈希检索的有目标攻击方法及终端设备

Publications (1)

Publication Number Publication Date
WO2022037295A1 (fr) 2022-02-24

Family

ID=73805608

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/104818 WO2022037295A1 (fr) 2020-08-20 2021-07-06 Targeted attack method for deep hash retrieval and terminal device

Country Status (2)

Country Link
CN (1) CN112115317B (fr)
WO (1) WO2022037295A1 (fr)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114757336A (zh) * 2022-04-06 2022-07-15 西安交通大学 深度学习模型对抗攻击敏感频带检测方法及相关装置
CN114882312A (zh) * 2022-05-13 2022-08-09 北京百度网讯科技有限公司 对抗图像样本的生成方法、装置、电子设备以及存储介质
CN116070277A (zh) * 2023-03-07 2023-05-05 浙江大学 一种基于深度哈希的纵向联邦学习隐私保护方法和系统
CN116662490A (zh) * 2023-08-01 2023-08-29 山东大学 融合层次化标签信息的去混淆文本哈希算法和装置
CN118069885A (zh) * 2024-04-19 2024-05-24 山东建筑大学 一种动态视频内容编码检索方法及系统

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112115317B (zh) * 2020-08-20 2024-05-14 鹏城实验室 一种针对深度哈希检索的有目标攻击方法及终端设备
CN113343025B (zh) * 2021-08-05 2021-11-02 中南大学 基于加权梯度哈希激活热力图的稀疏对抗攻击方法
CN113727301B (zh) * 2021-08-05 2023-07-11 西安交通大学 面向v2n低时延通信服务的哈希安全接入方法及系统

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111027060A (zh) * 2019-12-17 2020-04-17 电子科技大学 基于知识蒸馏的神经网络黑盒攻击型防御方法
CN111368725A (zh) * 2020-03-03 2020-07-03 广州大学 一种基于深度学习的hrrp有目标对抗样本生成方法
CN112115317A (zh) * 2020-08-20 2020-12-22 鹏城实验室 一种针对深度哈希检索的有目标攻击方法及终端设备

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109558890B (zh) * 2018-09-30 2023-03-31 天津大学 基于自适应权重哈希循环对抗网络的零样本图像分类方法
CN111127385B (zh) * 2019-06-06 2023-01-13 昆明理工大学 基于生成式对抗网络的医学信息跨模态哈希编码学习方法
CN110321957B (zh) * 2019-07-05 2023-03-24 重庆大学 融合三元组损失和生成对抗网络的多标签图像检索方法

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
YANG ERKUN: "Deep Compact Coding for Multimedia Nearest Neighbor Search", CHINESE DOCTORAL DISSERTATIONS FULL-TEXT DATABASE, no. 3, 15 March 2020 (2020-03-15), XP055902200 *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114757336A (zh) * 2022-04-06 2022-07-15 西安交通大学 深度学习模型对抗攻击敏感频带检测方法及相关装置
CN114882312A (zh) * 2022-05-13 2022-08-09 北京百度网讯科技有限公司 对抗图像样本的生成方法、装置、电子设备以及存储介质
CN116070277A (zh) * 2023-03-07 2023-05-05 浙江大学 一种基于深度哈希的纵向联邦学习隐私保护方法和系统
CN116070277B (zh) * 2023-03-07 2023-08-29 浙江大学 一种基于深度哈希的纵向联邦学习隐私保护方法和系统
CN116662490A (zh) * 2023-08-01 2023-08-29 山东大学 融合层次化标签信息的去混淆文本哈希算法和装置
CN116662490B (zh) * 2023-08-01 2023-10-13 山东大学 融合层次化标签信息的去混淆文本哈希算法和装置
CN118069885A (zh) * 2024-04-19 2024-05-24 山东建筑大学 一种动态视频内容编码检索方法及系统

Also Published As

Publication number Publication date
CN112115317B (zh) 2024-05-14
CN112115317A (zh) 2020-12-22

Similar Documents

Publication Publication Date Title
WO2022037295A1 (fr) Targeted attack method for deep hash retrieval and terminal device
Li et al. Universal perturbation attack against image retrieval
Jiang et al. Discrete latent factor model for cross-modal hashing
CN110309331B (zh) 一种基于自监督的跨模态深度哈希检索方法
Ke et al. End-to-end automatic image annotation based on deep CNN and multi-label data augmentation
US11244205B2 (en) Generating multi modal image representation for an image
CN111382868B (zh) 神经网络结构搜索方法和神经网络结构搜索装置
US10510021B1 (en) Systems and methods for evaluating a loss function or a gradient of a loss function via dual decomposition
Liu et al. Sequential compact code learning for unsupervised image hashing
CN110796057A (zh) 行人重识别方法、装置及计算机设备
CN109492776B (zh) 基于主动学习的微博流行度预测方法
CN110020711A (zh) 一种采用灰狼优化算法的大数据分析方法
CN114329109B (zh) 基于弱监督哈希学习的多模态检索方法及系统
CN115358305A (zh) 一种基于边界样本迭代生成的增量学习鲁棒性提升方法
JP2022548187A (ja) 対象再識別方法および装置、端末並びに記憶媒体
Chu et al. Visualization feature and CNN based homology classification of malicious code
CN113656700A (zh) 基于多相似度一致矩阵分解的哈希检索方法
Qin et al. Efficient non-targeted attack for deep hashing based image retrieval
KR102615073B1 (ko) 유사도 검색을 위한 신경 해싱
CN113869005A (zh) 一种基于语句相似度的预训练模型方法和系统
Feng et al. A novel feature selection method with neighborhood rough set and improved particle swarm optimization
CN113535947A (zh) 一种带有缺失标记的不完备数据的多标记分类方法及装置
JP2012155394A (ja) 文書分類学習制御装置、文書分類装置およびコンピュータプログラム
CN112364198A (zh) 一种跨模态哈希检索方法、终端设备及存储介质
CN116069985A (zh) 一种基于标签语义增强的鲁棒在线跨模态哈希检索方法

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 21857390; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 Ep: PCT application non-entry in European phase. Ref document number: 21857390; Country of ref document: EP; Kind code of ref document: A1