CN113343025B - Sparse attack resisting method based on weighted gradient Hash activation thermodynamic diagram - Google Patents

Info

Publication number
CN113343025B
Authority
CN
China
Prior art keywords
hash
video
sparse
matrix
representing
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110893931.9A
Other languages
Chinese (zh)
Other versions
CN113343025A (en)
Inventor
黄亮
施荣华
胡超
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Central South University
Original Assignee
Central South University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Central South University
Priority to CN202110893931.9A
Publication of CN113343025A
Application granted
Publication of CN113343025B

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73 Querying
    • G06F16/732 Query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70 Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/71 Indexing; Data structures therefor; Storage structures

Abstract

The invention provides a sparse adversarial attack method based on a weighted gradient hash activation heat map, comprising: step 1, inputting a query video into a video hash retrieval model to obtain a query video hash code; step 2, acquiring a target video set and inputting each target video in the set into the video hash retrieval model to generate a plurality of target video hash codes; step 3, computing the dot product of the query video hash code with each of the target video hash codes to construct a Hamming distance function between the query video hash code and the plurality of target video hash codes. The method uses the sensitivity of the weighted gradient hash activation heat map to accurately determine the positions and sensitive regions for the sparse adversarial attack, which reduces the pixel cost of the attack and improves the accuracy and efficiency of the sparse adversarial attack and the imperceptibility of the adversarial samples.

Description

Sparse adversarial attack method based on weighted gradient hash activation heat map
Technical Field
The invention relates to the technical field of adversarial attacks on video, and in particular to a sparse adversarial attack method based on a weighted gradient hash activation heat map.
Background
The application of deep neural networks to hash retrieval has greatly improved retrieval efficiency. In recent years, however, deep neural networks have been shown to be highly vulnerable to adversarial attacks, which has drawn attention to their security and spurred further research on adversarial attacks. Deep retrieval systems therefore inherit the risks of deep neural networks along with their benefits.
Existing adversarial attack methods fall into two categories: dense adversarial attacks and sparse adversarial attacks. Compared with a dense attack, a sparse attack achieves its effect by perturbing pixels at only some positions, and its biggest challenge is determining which positions to perturb.
At present, the only attack aimed at video hash retrieval systems is the deep hash targeted attack method. Its principle is to optimize for a target label, using a newly designed voting component to obtain the optimal representation of the set of hash codes of the target label, thereby improving the accuracy of the adversarial attack. However, the deep hash targeted attack method is a dense adversarial attack: it generates many redundant pixels and is ill-suited to real scenarios, and because a dense adversarial attack must pay a higher pixel cost, the imperceptibility of its adversarial samples is low.
Disclosure of Invention
The invention provides a sparse adversarial attack method based on a weighted gradient hash activation heat map, and aims to solve the problems that conventional adversarial attack methods generate many redundant pixels, are ill-suited to real scenarios, and require a higher pixel cost, resulting in low imperceptibility of the adversarial samples.
To achieve the above object, an embodiment of the present invention provides a sparse adversarial attack method based on a weighted gradient hash activation heat map, comprising:
step 1, inputting a query video into a video hash retrieval model to obtain a query video hash code;
step 2, acquiring a target video set and respectively inputting target videos in the target video set into a video hash retrieval model to generate a plurality of target video hash codes;
step 3, performing dot multiplication on the query video hash code and the plurality of target video hash codes respectively to construct a Hamming distance function between the query video hash code and the plurality of target video hash codes;
step 4, differentiating the output of the Hamming distance function with respect to the intermediate-layer input of the video hash retrieval model by the chain rule, and then linearly combining the resulting gradient weights with the intermediate-layer input of the video hash retrieval model to generate a weighted gradient hash activation heat map;
step 5, mapping the time dimension and the space dimensions of the weighted gradient hash activation heat map through trilinear interpolation and upsampling to obtain a weighted gradient hash activation heat matrix, activating the heat matrix with the ReLU function and binarizing it with the Binarize function to obtain a sparse mask matrix;
step 6, multiplying the adversarial perturbation by the mask matrix to obtain an adversarial mask matrix, constructing an adversarial objective function from the adversarial mask matrix and the Hamming distance function, and optimizing the adversarial objective function with the Adam (adaptive moment estimation) optimization method to obtain a sparse adversarial video sample.
Wherein, the step 1 specifically comprises:
step 11, defining the video hash retrieval model as $F(\cdot)$;
step 12, inputting the query video into the video hash retrieval model $F(\cdot)$, which generates the query video hash code as follows:
$H_q = F(X_q)$    (1)
wherein $H_q$ denotes the query video hash code, $H_q \in \{0,1\}^N$, i.e. $H_q$ is a binary hash code sequence of length $N$; $X_q$ denotes the query video, $X_q \in \mathbb{R}^{C \times G \times B \times T}$, where $C$ is the number of frames of the query video, $G$ is the width of each frame, $B$ is the height of each frame, and $T$ is the number of channels of each frame.
Wherein, the step 2 specifically comprises:
step 21, obtaining a target video set $X_t = \{x_{t1}, x_{t2}, \dots, x_{ti}\}$, wherein $x_{ti}$ denotes the $i$-th target video in the target video set, $i = 1, 2, \dots, n$;
step 22, inputting each target video in the target video set into the video hash retrieval model, which generates a plurality of target video hash codes as follows:
$H_{ti} = F(x_{ti})$    (2)
wherein $H_{ti}$ denotes the hash code of the $i$-th target video, $i = 1, 2, \dots, n$.
Wherein, the step 3 specifically comprises:
the Hamming distance function between the query video hash code and the plurality of target video hash codes is as follows:
[Formula (3) appears as an image in the original publication and is not reproduced here.]
wherein $d(\cdot,\cdot)$ denotes the dot product operation.
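The link between a dot product and a Hamming distance that step 3 relies on can be illustrated with a short sketch. It assumes, for illustration only, that the {0,1} hash codes are mapped to {-1,+1}, so that the Hamming distance of two N-bit codes equals (N - a·b)/2, and that the distances to the n target codes are averaged. Formula (3) is not reproduced in this text, so the averaging and the function names below are assumptions, not the patented definition.

```python
import numpy as np

def to_signed(code01):
    """Map a {0,1} hash code to {-1,+1} (assumed convention)."""
    return 2 * np.asarray(code01, dtype=np.int64) - 1

def hamming_via_dot(h_q, h_t):
    """Hamming distance of two {0,1} codes computed from a dot product."""
    a, b = to_signed(h_q), to_signed(h_t)
    n_bits = a.shape[0]
    # Each agreeing bit contributes +1 to a.b, each differing bit -1.
    return (n_bits - int(a @ b)) // 2

def mean_hamming_to_targets(h_q, targets):
    """Average distance from the query code to every target code (assumption)."""
    return float(np.mean([hamming_via_dot(h_q, h) for h in targets]))

h_q = [1, 0, 1, 1, 0, 0, 1, 0]
targets = [[1, 0, 1, 0, 0, 0, 1, 0],   # differs from h_q in 1 bit
           [0, 1, 0, 0, 1, 1, 0, 1]]   # differs from h_q in all 8 bits
print(mean_hamming_to_targets(h_q, targets))  # 4.5
```

Because the dot product is differentiable with respect to a relaxed (real-valued) hash output, a distance written this way can be backpropagated through the retrieval model, which is what step 4 requires.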
Wherein, the step 4 specifically comprises:
step 41, differentiating the output of the Hamming distance function with respect to the intermediate-layer input of the video hash retrieval model by the chain rule to obtain the gradient of the intermediate feature map, and using this gradient as the weight of the intermediate feature map, as follows:
[Formula (4) appears as an image in the original publication and is not reproduced here.]
wherein $W$ denotes the weight of the intermediate feature map, $W \in \mathbb{R}^{c \times y \times g \times b}$, $c$ denotes the number of frames of the intermediate feature map, $y$ denotes the number of weight maps of the intermediate feature map per frame, $g$ denotes the weight map width of the intermediate feature map for each frame, $b$ denotes the weight map height of the intermediate feature map for each frame, and $A$ denotes the intermediate-layer input, which is the intermediate feature map;
step 42, globally averaging the weight $W$ of the intermediate feature map over the second dimension to obtain the global average weight $w_c$ of each frame feature map, as follows:
[Formula (5) appears as an image in the original publication and is not reproduced here.]
wherein $w_c$ denotes the global average weight of each frame feature map, [a symbol shown as an image in the original] denotes the spatial resolution of each frame feature map, and $i$ and $j$ denote pixel coordinates;
step 43, linearly combining the global average weight $w_c$ of each frame feature map with the intermediate-layer input $A$ to obtain the weighted gradient hash activation heat map $Q_k$, as follows:
$Q_k = w_c A$    (6)
wherein $Q_k$ denotes the weighted gradient hash activation heat map, $k = 1, 2, \dots, c$.
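Steps 41 to 43 follow the general pattern of gradient-weighted activation maps, and a minimal NumPy sketch may make the tensor shapes concrete. Since formulas (4) and (5) are not reproduced in this text, the sketch assumes a Grad-CAM-style reading: the gradient is averaged over the spatial positions of each map to give one scalar weight per frame and map, and the weighted maps of each frame are then summed. The gradient itself is replaced by random numbers here; in practice it would come from automatic differentiation of the Hamming distance function. The function name is hypothetical.

```python
import numpy as np

def hash_activation_heatmap(grad_w, feat_a):
    """Combine gradient weights and feature maps into per-frame heat maps.

    grad_w, feat_a: arrays of shape (c, y, g, b) = (frames, maps, width, height).
    Returns an array of shape (c, g, b).
    """
    # Assumed reading of step 42: average each gradient map over its
    # spatial positions, giving one scalar weight per (frame, map).
    w = grad_w.mean(axis=(2, 3))                      # shape (c, y)
    # Step 43: weight each feature map and sum over the maps of a frame.
    return np.einsum("cy,cygb->cgb", w, feat_a)       # shape (c, g, b)

rng = np.random.default_rng(0)
c, y, g, b = 4, 3, 7, 7
grad_w = rng.standard_normal((c, y, g, b))   # stands in for the gradient W
feat_a = rng.standard_normal((c, y, g, b))   # stands in for the feature map A
q = hash_activation_heatmap(grad_w, feat_a)
print(q.shape)  # (4, 7, 7)
```

The result has one low-resolution heat map per frame, which is why step 5 upsamples it to the size of the video.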
Wherein, the step 5 specifically comprises:
step 51, mapping the weighted gradient hash activation heat map, through trilinear interpolation and upsampling, into a weighted gradient hash activation heat matrix of the same size as the target video;
step 52, inputting the weighted gradient hash activation heat matrix into the ReLU activation function to obtain an activation matrix $V_T$;
step 53, setting a threshold $\varepsilon$ and, using the threshold $\varepsilon$, inputting the activation matrix $V_T$ into the Binarize function to generate a sparse mask matrix, as follows:
[Formula (7) appears as an image in the original publication and is not reproduced here.]
wherein $M_n$ denotes the mask matrix, Binarize denotes the binarization function, and $V_T$ denotes the activation matrix; the mask is set to 0 for pixels whose weight in the activation matrix $V_T$ is below the threshold and to 1 for pixels whose weight in the activation matrix $V_T$ is above the threshold.
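Steps 51 to 53 can be sketched in NumPy as follows. Trilinear interpolation is implemented here as three successive one-dimensional linear resamplings (time, width, height). Dividing the activation matrix by its maximum before thresholding is an assumption made so that a threshold such as 0.5 is meaningful; the text does not state how the heat matrix is scaled. The helper names are hypothetical, and the resulting mask has no channel axis, so it would be broadcast over the channels of the video.

```python
import numpy as np

def resize_linear(x, new_len, axis):
    """Linearly resample x to new_len samples along one axis."""
    old_len = x.shape[axis]
    pos = np.linspace(0.0, old_len - 1, new_len)   # endpoints aligned
    lo = np.floor(pos).astype(int)
    hi = np.minimum(lo + 1, old_len - 1)
    shape = [1] * x.ndim
    shape[axis] = new_len
    frac = (pos - lo).reshape(shape)
    return np.take(x, lo, axis=axis) * (1 - frac) + np.take(x, hi, axis=axis) * frac

def sparse_mask(heatmap, video_shape, eps=0.5):
    """heatmap: (c, g, b); video_shape: (C, G, B). Returns a {0,1} mask."""
    v = heatmap
    for axis, n in enumerate(video_shape):   # three 1-D passes = trilinear
        v = resize_linear(v, n, axis)
    v = np.maximum(v, 0.0)                   # ReLU
    peak = v.max()
    if peak > 0:
        v = v / peak                         # assumed normalisation to [0, 1]
    return (v > eps).astype(np.uint8)        # Binarize with threshold eps

heat = np.arange(8, dtype=float).reshape(2, 2, 2)
mask = sparse_mask(heat, (4, 6, 6), eps=0.5)
print(mask.shape, int(mask.sum()))
```

Raising eps keeps fewer pixels in the mask and therefore makes the attack sparser.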
Wherein, the step 6 specifically comprises:
step 61, constructing an adversarial objective function, as follows:
[Formula (8) appears as two images in the original publication and is not reproduced here.]
wherein $E$ denotes the adversarial perturbation, $M$ denotes the mask matrix, $M$ being shorthand for $M_n$; [a symbol shown as an image in the original] denotes the maximum value in the matrix, and $\tau$ and [a symbol shown as an image in the original] denote constants;
step 62, optimizing formula (8) with the Adam optimization method, with the objective of minimizing the Hamming distance function between the query video hash code and the plurality of target video hash codes while minimizing the added adversarial perturbation, to obtain the optimal solution and thereby the sparse adversarial video sample $X_a$.
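The idea of step 6, optimizing a perturbation that is only applied where the mask is 1, can be illustrated with a toy example. The objective below is a stand-in, not the patent's formula (8), which is not reproduced in this text: the "retrieval model" is a random linear projection followed by tanh, the similarity to a signed target code replaces the Hamming distance term, and a squared-norm penalty replaces the perturbation term. Adam is written out by hand with its usual default moment coefficients. All names and hyperparameters are assumptions chosen for illustration.

```python
import numpy as np

def masked_adam_attack(x, mask, proj, h_target, lam=0.01, lr=0.01, steps=300):
    """Optimise a perturbation E that is only applied where mask == 1.

    Toy stand-in objective (NOT the patent's formula (8)):
        loss(E) = -mean(h_target * tanh(proj @ (x + mask*E))) + lam * ||mask*E||^2
    """
    e = np.zeros_like(x)
    m1 = np.zeros_like(x)            # Adam first-moment estimate
    m2 = np.zeros_like(x)            # Adam second-moment estimate
    beta1, beta2, tiny = 0.9, 0.999, 1e-8
    n_bits = proj.shape[0]
    for t in range(1, steps + 1):
        z = np.tanh(proj @ (x + mask * e))
        # Analytic gradient of the toy loss with respect to E.
        grad_sim = -(proj.T @ (h_target * (1.0 - z ** 2))) / n_bits
        grad = mask * (grad_sim + 2.0 * lam * e)
        m1 = beta1 * m1 + (1 - beta1) * grad
        m2 = beta2 * m2 + (1 - beta2) * grad ** 2
        step = (m1 / (1 - beta1 ** t)) / (np.sqrt(m2 / (1 - beta2 ** t)) + tiny)
        e = e - lr * step
    return x + mask * e

def similarity(x, proj, h_target):
    """Mean agreement between the relaxed code of x and the target code."""
    return float(np.mean(h_target * np.tanh(proj @ x)))

rng = np.random.default_rng(0)
d, n_bits = 64, 16
x = rng.standard_normal(d)                        # flattened stand-in "video"
proj = rng.standard_normal((n_bits, d)) / np.sqrt(d)
h_target = rng.choice([-1.0, 1.0], size=n_bits)   # signed target hash code
mask = (rng.random(d) > 0.5).astype(float)        # stand-in sparse mask
x_adv = masked_adam_attack(x, mask, proj, h_target)
print(similarity(x, proj, h_target), similarity(x_adv, proj, h_target))
```

Multiplying both the perturbation and its gradient by the mask guarantees that entries outside the mask are never modified, which is what limits the pixel cost of the attack.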
The scheme of the invention has the following beneficial effects:
In the sparse adversarial attack method based on the weighted gradient hash activation heat map according to the embodiment of the invention, the positions of the sparse adversarial attack on the video hash retrieval model are determined by generating the weighted gradient hash activation heat map; the dot product of the query video hash code with each of the target video hash codes determines the sensitive regions of the attack; and the mask matrix limits the range of the attack. This reduces the pixel cost of the sparse adversarial attack, improves its accuracy and efficiency, and improves the imperceptibility of the adversarial samples.
Drawings
FIG. 1 is a flow chart of the present invention;
FIG. 2 is a schematic visualization of the mask matrix of the present invention.
Detailed Description
In order to make the technical problems, technical solutions and advantages of the present invention more apparent, the following detailed description is given with reference to the accompanying drawings and specific embodiments.
To address the problems that existing adversarial attack methods generate many redundant pixels, are ill-suited to real scenarios, and require a higher pixel cost, resulting in low imperceptibility of the adversarial samples, the invention provides a sparse adversarial attack method based on a weighted gradient hash activation heat map.
As shown in FIG. 1 and FIG. 2, an embodiment of the present invention provides a sparse adversarial attack method based on a weighted gradient hash activation heat map, comprising:
step 1, inputting a query video into a video hash retrieval model to obtain a query video hash code;
step 2, acquiring a target video set and inputting each target video in the set into the video hash retrieval model to generate a plurality of target video hash codes;
step 3, computing the dot product of the query video hash code with each of the target video hash codes to construct a Hamming distance function between the query video hash code and the plurality of target video hash codes;
step 4, differentiating the output of the Hamming distance function with respect to the intermediate-layer input of the video hash retrieval model by the chain rule, and then linearly combining the resulting gradient weights with the intermediate-layer input of the video hash retrieval model to generate a weighted gradient hash activation heat map;
step 5, mapping the time dimension and the space dimensions of the weighted gradient hash activation heat map through trilinear interpolation and upsampling to obtain a weighted gradient hash activation heat matrix, activating the heat matrix with the ReLU function and binarizing it with the Binarize function to obtain a sparse mask matrix;
step 6, multiplying the adversarial perturbation by the mask matrix to obtain an adversarial mask matrix, constructing an adversarial objective function from the adversarial mask matrix and the Hamming distance function, and optimizing the adversarial objective function with the Adam (adaptive moment estimation) optimization method to obtain a sparse adversarial video sample.
Wherein, the step 1 specifically comprises: step 11, defining the video hash retrieval model as $F(\cdot)$;
step 12, inputting the query video into the video hash retrieval model $F(\cdot)$, which generates the query video hash code as follows:
$H_q = F(X_q)$    (1)
wherein $H_q$ denotes the query video hash code, $H_q \in \{0,1\}^N$, i.e. $H_q$ is a binary hash code sequence of length $N$; $X_q$ denotes the query video, $X_q \in \mathbb{R}^{C \times G \times B \times T}$, where $C$ is the number of frames of the query video, $G$ is the width of each frame, $B$ is the height of each frame, and $T$ is the number of channels of each frame.
Wherein, the step 2 specifically comprises: step 21, obtaining a target video set $X_t = \{x_{t1}, x_{t2}, \dots, x_{ti}\}$, wherein $x_{ti}$ denotes the $i$-th target video in the target video set, $i = 1, 2, \dots, n$;
step 22, inputting each target video in the target video set into the video hash retrieval model, which generates a plurality of target video hash codes as follows:
$H_{ti} = F(x_{ti})$    (2)
wherein $H_{ti}$ denotes the hash code of the $i$-th target video, $i = 1, 2, \dots, n$.
Wherein, the step 3 specifically comprises: the Hamming distance function between the query video hash code and the plurality of target video hash codes is as follows:
[Formula (3) appears as an image in the original publication and is not reproduced here.]
wherein $d(\cdot,\cdot)$ denotes the dot product operation.
Wherein, the step 4 specifically comprises: step 41, differentiating the output of the Hamming distance function with respect to the intermediate-layer input of the video hash retrieval model by the chain rule to obtain the gradient of the intermediate feature map, and using this gradient as the weight of the intermediate feature map, as follows:
[Formula (4) appears as an image in the original publication and is not reproduced here.]
wherein $W$ denotes the weight of the intermediate feature map, $W \in \mathbb{R}^{c \times y \times g \times b}$, $c$ denotes the number of frames of the intermediate feature map, $y$ denotes the number of weight maps of the intermediate feature map per frame, $g$ denotes the weight map width of the intermediate feature map for each frame, $b$ denotes the weight map height of the intermediate feature map for each frame, and $A$ denotes the intermediate-layer input, which is the intermediate feature map;
step 42, globally averaging the weight $W$ of the intermediate feature map over the second dimension to obtain the global average weight $w_c$ of each frame feature map, as follows:
[Formula (5) appears as an image in the original publication and is not reproduced here.]
wherein $w_c$ denotes the global average weight of each frame feature map, [a symbol shown as an image in the original] denotes the spatial resolution of each frame feature map, and $i$ and $j$ denote pixel coordinates;
step 43, linearly combining the global average weight $w_c$ of each frame feature map with the intermediate-layer input $A$ to obtain the weighted gradient hash activation heat map $Q_k$, as follows:
$Q_k = w_c A$    (6)
wherein $Q_k$ denotes the weighted gradient hash activation heat map, $k = 1, 2, \dots, c$.
In the sparse adversarial attack method based on the weighted gradient hash activation heat map according to the above embodiment of the invention, the activation heat map is a class-discriminative localization technique that makes any model based on a convolutional neural network more transparent by generating a visual explanation. This visual explanation is the weighted gradient hash activation heat map, which can locate the key regions in an image. The method applies the weighted gradient hash activation heat map to video to obtain the key regions in the video, and then combines it with adversarial attack techniques to achieve a sparse adversarial attack on the video hash retrieval model.
Wherein, the step 5 specifically comprises: step 51, mapping the weighted gradient hash activation heat map, through trilinear interpolation and upsampling, into a weighted gradient hash activation heat matrix of the same size as the target video;
step 52, inputting the weighted gradient hash activation heat matrix into the ReLU activation function to obtain an activation matrix $V_T$;
step 53, setting a threshold $\varepsilon$ and, using the threshold $\varepsilon$, inputting the activation matrix $V_T$ into the Binarize function to generate a sparse mask matrix, as follows:
[Formula (7) appears as an image in the original publication and is not reproduced here.]
wherein $M_n$ denotes the mask matrix, Binarize denotes the binarization function, and $V_T$ denotes the activation matrix; the mask is set to 0 for pixels whose weight in the activation matrix $V_T$ is below the threshold and to 1 for pixels whose weight in the activation matrix $V_T$ is above the threshold.
In the sparse adversarial attack method based on the weighted gradient hash activation heat map according to the above embodiment of the present invention, the threshold $\varepsilon$ in formula (7) is set to 0.5 when the sparse adversarial video sample is generated.
Wherein, the step 6 specifically comprises: step 61, constructing an adversarial objective function, as follows:
[Formula (8) appears as two images in the original publication and is not reproduced here.]
wherein $E$ denotes the adversarial perturbation, $M$ denotes the mask matrix, $M$ being shorthand for $M_n$; [a symbol shown as an image in the original] denotes the maximum value in the matrix, and $\tau$ and [a symbol shown as an image in the original] denote constants;
step 62, optimizing formula (8) with the Adam optimization method, with the objective of minimizing the Hamming distance function between the query video hash code and the plurality of target video hash codes while minimizing the added adversarial perturbation, to obtain the optimal solution and thereby the sparse adversarial video sample $X_a$.
In the sparse adversarial attack method based on the weighted gradient hash activation heat map according to the above embodiment of the present invention, the obtained sparse adversarial video sample $X_a$ is verified as follows:
1. acquiring a test video data set and inputting it into the video hash retrieval model to generate a hash code database of the test video data set;
2. inputting the sparse adversarial video sample $X_a$ into the video hash retrieval model to generate an adversarial hash code $H_a$;
3. computing the dot product of the adversarial hash code $H_a$ with each test video hash code in the hash code database of the test video data set, and constructing the Hamming distance function between the adversarial hash code and each test video hash code to obtain the Hamming distance between the adversarial hash code and each test video hash code;
4. sorting the Hamming distances between the adversarial hash code and each test video hash code in ascending order to obtain the retrieval result, wherein the higher a test video hash code ranks in the retrieval result, the smaller its Hamming distance to the adversarial hash code and the higher its similarity to the adversarial hash code;
5. defining Mean Average Precision (MAP) to measure the sorted retrieval result, as follows:
[Formula (9) appears as an image in the original publication and is not reproduced here.]
wherein $O$ denotes the number of test videos in the test data set; $k$ denotes the rank, in the retrieval result, of the Hamming distance between the adversarial hash code and a test video hash code; $P(k)$ denotes the precision, $P(k) = r/k$, where $r$ is the number of test video hash codes among the top $k$ that are consistent with the adversarial hash code; $rel(k)$ is 1 when the test video hash code at rank $k$ is consistent with the adversarial hash code and 0 otherwise; and $R$ denotes the number of test video hash codes in the retrieval result that are consistent with the adversarial hash code.
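The ranking and scoring procedure described above can be sketched in plain Python. Formula (9) is not reproduced in this text, so the sketch assumes the standard definition of average precision, in which P(k) is summed over the ranks where rel(k) = 1 and divided by R, and MAP is the mean over queries. How "consistent" is decided is not specified here, so relevance is passed in as ready-made flags. The function names are hypothetical.

```python
def hamming(a, b):
    """Number of positions in which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def rank_by_hamming(query_code, database):
    """Indices of database codes sorted by ascending Hamming distance."""
    return sorted(range(len(database)),
                  key=lambda i: hamming(query_code, database[i]))

def average_precision(rel):
    """rel: 0/1 relevance flags in ranked order (rank 1 first)."""
    hits = 0
    total = 0.0
    for k, flag in enumerate(rel, start=1):
        if flag:
            hits += 1
            total += hits / k      # P(k) = r / k, counted where rel(k) = 1
    return total / hits if hits else 0.0

def mean_average_precision(rel_lists):
    """Mean of the average precision over several queries."""
    return sum(average_precision(r) for r in rel_lists) / len(rel_lists)

query = [1, 0, 1, 1]
database = [[0, 1, 0, 0], [1, 0, 1, 1], [1, 0, 0, 1], [1, 1, 1, 1]]
relevant = [False, True, False, True]          # assumed ground truth
order = rank_by_hamming(query, database)       # [1, 2, 3, 0]
rel = [int(relevant[i]) for i in order]        # [1, 0, 1, 0]
print(order, average_precision(rel))
```

Python's `sorted` is stable, so database entries at equal Hamming distance keep their original order in the ranking.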
In the sparse adversarial attack method based on the weighted gradient hash activation heat map according to the embodiment of the invention, $n$ target videos with hash code lengths of 16 bits, 32 bits and 64 bits are selected from each of the public data sets UCF101 and HMDB51, where $n = 1, 3, 5, 7, 9$ denotes the number of target videos. The target videos with different hash code lengths and different numbers, together with different values of $\varepsilon$, are substituted into formula (8) to calculate the MAP values of the adversarial video samples under the different hash code lengths and numbers of target videos. S (Sparsity) is introduced to measure the mask matrix set in formula (7) and thereby indicate the number of perturbed pixels introduced by the adversarial attack: S = U/L, where S denotes the percentage of perturbed pixels, U denotes the number of perturbed pixels in the video, i.e. the number of pixels set to 1 in the mask matrix, and L denotes the total number of pixels in the video, i.e. the number of pixels in the mask matrix. The smaller S is, the fewer perturbed pixels are added and the lower the pixel cost to be paid. The experimental results are shown in Table 1:
TABLE 1
[Table 1 appears as an image in the original publication and is not reproduced here.]
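The sparsity measure S = U/L defined before Table 1 can be computed directly from a mask matrix. The following minimal sketch uses a hypothetical mask in which a 4×4 block of each 8×8 frame is perturbed; the shapes and values are illustrative and are not taken from the experiments.

```python
import numpy as np

def sparsity(mask):
    """S = U / L: share of mask entries equal to 1."""
    mask = np.asarray(mask)
    return np.count_nonzero(mask) / mask.size

mask = np.zeros((4, 8, 8), dtype=np.uint8)
mask[:, 2:6, 2:6] = 1            # 16 of 64 pixels perturbed in every frame
print(f"{sparsity(mask):.2%}")   # 25.00%
```

A dense attack corresponds to a mask of all ones, for which this function returns 1.0, i.e. a Sparsity of 100%.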
Calculation of the MAP value for the sparse adversarial video samples (Sparse), taking the MAP value of 91.61% as an example: when $n$ is not less than 1, a target video with a hash code length of 16 bits from the UCF101 data set is input into the video hash retrieval model to generate the target hash code, and $\varepsilon$ in formula (7) is set to 0.5 to generate a sparse mask matrix. The target hash code, the number of target videos and the sparse mask matrix are substituted into formula (8), and formula (8) is optimized with the Adam optimizer to obtain a sparse adversarial video sample, Sparse16bits, with a hash code length of 16 bits. Sparse16bits is input into the video hash retrieval model to generate a sparse adversarial hash code, whose dot product with each test video hash code in the hash code database of the test video data set is used to construct the Hamming distance function and obtain the Hamming distance between the sparse adversarial hash code and each test video hash code. The Hamming distances are sorted in ascending order to obtain the retrieval result, and the MAP value is calculated from the retrieval result by formula (9), giving the MAP value of the sparse adversarial hash code of the sparse adversarial video sample with the target video hash code as the target.
Calculation of the MAP value for the dense adversarial video samples (Dense), taking the MAP value of 91.76% as an example: when $n$ is not less than 1, a target video with a hash code length of 16 bits from the UCF101 data set is input into the video hash retrieval model to generate the target hash code, and $\varepsilon$ in formula (7) is set so as to generate a dense mask matrix. The target hash code, the number of target videos and the dense mask matrix are substituted into formula (8), and formula (8) is optimized with the Adam optimizer to obtain a dense adversarial video sample, Dense16bits, with a hash code length of 16 bits. Dense16bits is input into the video hash retrieval model to generate a dense adversarial hash code, whose dot product with each test video hash code in the hash code database of the test video data set is used to construct the Hamming distance function and obtain the Hamming distance between the dense adversarial hash code and each test video hash code. The Hamming distances are sorted in ascending order to obtain the retrieval result, and the MAP value is calculated from the retrieval result by formula (9), giving the MAP value of the dense adversarial hash code of the dense adversarial video sample with the target video hash code as the target.
In Table 1, Origin denotes the MAP result of the target video. When the hash lengths of the sparse adversarial video samples generated from the UCF101 data set are 16 bits, 32 bits and 64 bits, the Sparsity values are 66.38%, 65.03% and 62.42% respectively, and for the dense adversarial video samples generated from the UCF101 data set at the same hash lengths the Sparsity values are all 100%. When the hash lengths of the sparse adversarial video samples generated from the HMDB51 data set are 16 bits, 32 bits and 64 bits, the Sparsity values are 69.14%, 59.79% and 54.18% respectively, and for the dense adversarial video samples generated from the HMDB51 data set at the same hash lengths the Sparsity values are all 100%. A Sparsity of 100% means that perturbation is added to every pixel in the mask matrix. Dense denotes the dense setting, $\varepsilon = 1$, in which all pixels in the mask matrix are 1; Sparse denotes the sparse setting, $\varepsilon = 0.5$, in which only part of the pixels in the mask matrix are 1.
As can be seen from Table 1, the MAP value gradually increases as the number of hash bits increases, and the MAP value at 64 bits is the largest. The MAP value also gradually increases as the number of target videos $n$ increases: when $n < 5$, MAP increases most rapidly; when $n = 7$ or $9$, the rate of increase drops markedly. The MAP of the sparse adversarial video samples is slightly lower than that of the dense adversarial video samples, but the number of perturbed pixels of the sparse adversarial video samples is markedly reduced. As the number of hash bits increases, the Sparsity value gradually decreases, which indicates that longer hash codes carry richer information and make it easier to find the key regions of the target video in which to add noise; that is, fewer pixels are needed for the adversarial attack and the pixel cost is lower. FIG. 2 shows a visualization of the sparse mask matrix, in which the white portions represent pixel positions set to 1 and the black portions represent pixel positions set to 0.
The sparse adversarial attack method based on the weighted gradient hash activation thermodynamic diagram according to the above embodiment of the present invention carries out a sparse adversarial attack on the video hash retrieval model. The position of the sparse adversarial attack is determined by generating the weighted gradient hash activation thermodynamic diagram; the query video hash code is dot-multiplied with each of a plurality of target video hash codes to determine the region that is sensitive to the attack; and the range of the attack is limited through the mask matrix. This reduces the pixel cost of the sparse adversarial attack, improves its accuracy and efficiency, and improves the imperceptibility of the adversarial sample.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (6)

1. A sparse adversarial attack method based on a weighted gradient hash activation thermodynamic diagram, characterized by comprising the following steps:
step 1, inputting a query video into a video hash retrieval model to obtain a query video hash code;
step 2, acquiring a target video set and respectively inputting target videos in the target video set into a video hash retrieval model to generate a plurality of target video hash codes;
step 3, performing dot multiplication on the query video hash code and the plurality of target video hash codes respectively to construct a Hamming distance function between the query video hash code and the plurality of target video hash codes;
step 4, differentiating the output of the Hamming distance function with respect to the intermediate layer input of the video hash retrieval model by the chain rule, and then linearly combining the result with the intermediate layer input of the video hash retrieval model to generate a weighted gradient hash activation thermodynamic diagram;
step 5, mapping the time dimension and the space dimension of the weighted gradient hash activation thermodynamic diagram through trilinear interpolation and upsampling to obtain a weighted gradient hash activation thermodynamic matrix, and passing the weighted gradient hash activation thermodynamic matrix through ReLU function activation and Binarize function binarization to obtain a sparse mask matrix;
step 6, multiplying the adversarial perturbation by the mask matrix to obtain an adversarial mask matrix, constructing an adversarial objective function from the adversarial mask matrix and the Hamming distance function, and optimizing the adversarial objective function by the ADAM (adaptive moment estimation) optimization method to obtain a sparse adversarial video sample;
wherein step 6 specifically includes:
step 61, constructing the adversarial objective function, as follows:
[Formula image not reproduced: adversarial objective function]
wherein E represents the adversarial perturbation; M represents the mask matrix, M being shorthand for M_n; [symbol image not reproduced] represents the maximum value in the matrix; τ and [symbol image not reproduced] represent constants; d(·,·) represents the dot product operation function; F(·) represents the video hash retrieval model; X_q represents the query video, X_q ∈ R^{C×G×B×T}, where C represents the number of frames of the query video, G represents the width of each frame of the query video, B represents the height of each frame of the query video, and T represents the number of channels of each frame of the query video; H_ti represents the hash code of the i-th target video, i = 1, 2, …, n;
step 62, optimizing the adversarial objective function by the ADAM optimization method, and solving for the optimal solution with the objectives of minimizing the Hamming distance function between the query video hash code and the plurality of target video hash codes and minimizing the added adversarial perturbation, to obtain the sparse adversarial video sample X_a.
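The optimization in steps 61 and 62 can be illustrated with a toy sketch. Everything below is an assumption made for illustration: the objective formula itself is not reproduced in this text, so the sketch substitutes a tiny differentiable stand-in for F(·) (a tanh of a linear map, with hypothetical weights), uses the mean relaxed Hamming distance (N − code·H_t)/2 to the target codes as the loss, keeps the perturbation within [−τ, τ] by clipping, and implements the Adam update by hand. It only shows how a mask restricts where the perturbation E can change.

```python
import math


def tanh_hash(weights, x):
    """Toy stand-in for F(.): relaxed hash code tanh(W x) with entries in (-1, 1)."""
    return [math.tanh(sum(w * v for w, v in zip(row, x))) for row in weights]


def loss_and_grad(weights, x, perturbation, mask, targets):
    """Mean relaxed Hamming distance to the targets and its gradient w.r.t. E."""
    n_bits = len(weights)
    x_adv = [v + e * m for v, e, m in zip(x, perturbation, mask)]
    code = tanh_hash(weights, x_adv)
    # Averaging (N - code . H_t) / 2 over targets equals using the mean target code.
    mean_target = [sum(t[k] for t in targets) / len(targets) for k in range(n_bits)]
    loss = 0.5 * (n_bits - sum(c * t for c, t in zip(code, mean_target)))
    # Chain rule through tanh: d tanh(u)/du = 1 - tanh(u)^2.
    upstream = [-0.5 * t * (1.0 - c * c) for c, t in zip(code, mean_target)]
    grad = [m * sum(upstream[k] * weights[k][j] for k in range(n_bits))
            for j, m in enumerate(mask)]
    return loss, grad


def adam_attack(weights, x, mask, targets, tau=0.1, lr=0.01, steps=200,
                beta1=0.9, beta2=0.999, eps=1e-8):
    """Hand-written Adam loop over the perturbation E, restricted by the mask."""
    e = [0.0] * len(x)
    m1 = [0.0] * len(x)  # first-moment estimate
    m2 = [0.0] * len(x)  # second-moment estimate
    for t in range(1, steps + 1):
        _, grad = loss_and_grad(weights, x, e, mask, targets)
        for j, g in enumerate(grad):
            m1[j] = beta1 * m1[j] + (1 - beta1) * g
            m2[j] = beta2 * m2[j] + (1 - beta2) * g * g
            m1_hat = m1[j] / (1 - beta1 ** t)
            m2_hat = m2[j] / (1 - beta2 ** t)
            step = lr * m1_hat / (math.sqrt(m2_hat) + eps)
            # Clip to [-tau, tau]; entries with mask 0 are forced to stay 0.
            e[j] = max(-tau, min(tau, e[j] - step)) * mask[j]
    return e


# Hypothetical toy data: 4 "pixels", 2 hash bits, mask allows pixels 0 and 2.
weights = [[0.5, -0.3, 0.8, 0.1], [-0.2, 0.7, 0.4, -0.6]]
x_q = [0.2, -0.1, 0.4, 0.3]
mask = [1, 0, 1, 0]
targets = [[-1, 1], [-1, 1]]
e = adam_attack(weights, x_q, mask, targets)
x_a = [v + p * m for v, p, m in zip(x_q, e, mask)]  # adversarial sample
print(e, x_a)
```

A real attack would replace the toy model with the video hash retrieval model and obtain gradients by automatic differentiation; the masking and clipping logic would stay the same in spirit.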
2. The sparse adversarial attack method based on the weighted gradient hash activation thermodynamic diagram according to claim 1, wherein step 1 specifically comprises:
step 11, defining the video hash retrieval model as F(·);
step 12, inputting the query video into the video hash retrieval model F(·), the video hash retrieval model generating a query video hash code, wherein the query video hash code is generated as follows:
H_q = F(X_q)    (1)
wherein H_q represents the query video hash code, H_q ∈ {0,1}^N, N represents the length, and H_q is a binary hash code sequence of length N; X_q represents the query video, X_q ∈ R^{C×G×B×T}, where C represents the number of frames of the query video, G represents the width of each frame of the query video, B represents the height of each frame of the query video, and T represents the number of channels of each frame of the query video.
3. The sparse adversarial attack method based on the weighted gradient hash activation thermodynamic diagram according to claim 2, wherein step 2 specifically comprises:
step 21, obtaining a target video set X_t = {x_t1, x_t2, …, x_ti}, wherein x_ti represents the i-th target video in the target video set, i = 1, 2, …, n;
step 22, respectively inputting the target videos in the target video set into the video hash retrieval model, the video hash retrieval model generating a plurality of target video hash codes, wherein the target video hash codes are generated as follows:
H_ti = F(x_ti)    (2)
wherein H_ti represents the hash code of the i-th target video, i = 1, 2, …, n.
4. The sparse adversarial attack method based on the weighted gradient hash activation thermodynamic diagram according to claim 3, wherein step 3 specifically comprises:
the Hamming distance function between the query video hash code and the plurality of target video hash codes is as follows:
[Formula image not reproduced]    (3)
wherein d(·,·) represents the dot product operation function.
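The link between a dot product and a Hamming distance can be illustrated with a standard identity: for two codes of length N with entries in {-1, +1}, the Hamming distance equals (N − a·b)/2. The sketch below applies that identity to {0,1} codes by first mapping each bit b to 2b − 1. This is an illustration of the general idea only; formula (3) is not reproduced here and may be written differently, and the function names and toy codes are hypothetical.

```python
def to_signed(code):
    """Map a {0, 1} hash code to {-1, +1} so the dot product counts agreements."""
    return [2 * bit - 1 for bit in code]


def hamming_via_dot(code_a, code_b):
    """Hamming distance computed from the dot product of the signed codes."""
    a, b = to_signed(code_a), to_signed(code_b)
    dot = sum(x * y for x, y in zip(a, b))
    return (len(a) - dot) // 2


def distances_to_targets(query_code, target_codes):
    """Hamming distance from the query hash code to each target hash code."""
    return [hamming_via_dot(query_code, t) for t in target_codes]


h_q = [1, 0, 1, 1]
h_targets = [[1, 0, 1, 1], [0, 0, 1, 0], [0, 1, 0, 0]]
print(distances_to_targets(h_q, h_targets))  # [0, 2, 4]
```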
5. The sparse adversarial attack method based on the weighted gradient hash activation thermodynamic diagram according to claim 4, wherein step 4 specifically comprises:
step 41, differentiating the output of the Hamming distance function with respect to the intermediate layer input of the video hash retrieval model by the chain rule to obtain the gradient of the intermediate feature map, and using the gradient of the intermediate feature map as the weight of the intermediate feature map, as follows:
[Formula image not reproduced]    (4)
wherein W represents the weight of the intermediate feature map, W ∈ R^{c×y×g×b}, where c represents the number of frames of the intermediate feature map, y represents the number of weight maps of the intermediate feature map for each frame, g represents the weight map width of the intermediate feature map for each frame, and b represents the weight map height of the intermediate feature map for each frame; A represents the intermediate layer input, the intermediate layer input being the intermediate feature map;
step 42, performing global averaging of the weight W of the intermediate feature map over the second dimension to obtain the global average weight w_c of each frame of the feature map, as follows:
[Formula image not reproduced]    (5)
wherein w_c represents the global average weight of each frame of the feature map, [symbol image not reproduced] represents the spatial resolution of each frame of the feature map, and i and j represent pixel coordinates;
step 43, linearly combining the global average weight w_c of each frame of the feature map with the intermediate layer input A to obtain the weighted gradient hash activation thermodynamic diagram Q_k, as follows:
Q_k = w_c A    (6)
wherein Q_k represents the weighted gradient hash activation thermodynamic diagram, k = 1, 2, …, c.
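Steps 42 and 43 can be sketched for a single frame as follows. The sketch rests on stated assumptions: formulas (4) and (5) are not reproduced in this text, so it takes the gradients as given, averages each gradient map over its pixel coordinates (i, j) to get one weight per feature map, and forms the linear combination of those weights with the feature maps. Function names and the toy numbers are hypothetical.

```python
def global_average_weights(grad):
    """grad[y][i][j]: gradient of the distance w.r.t. feature map y of one frame.

    Returns one averaged weight per feature map (mean over pixel coordinates).
    """
    return [sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
            for fmap in grad]


def activation_map(features, weights):
    """Linear combination sum_y weights[y] * features[y] for one frame."""
    height, width = len(features[0]), len(features[0][0])
    return [[sum(w * fmap[i][j] for w, fmap in zip(weights, features))
             for j in range(width)]
            for i in range(height)]


# Toy frame with two 2x2 feature maps and their gradients.
grad = [[[1, 1], [1, 1]], [[0, 2], [-2, 0]]]
features = [[[1, 2], [3, 4]], [[5, 5], [5, 5]]]
w = global_average_weights(grad)
print(w)                            # [1.0, 0.0]
print(activation_map(features, w))  # [[1.0, 2.0], [3.0, 4.0]]
```

The second feature map receives weight 0 because its gradients cancel out, so only the first feature map contributes to the resulting map.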
6. The sparse adversarial attack method based on the weighted gradient hash activation thermodynamic diagram according to claim 5, wherein step 5 specifically comprises:
step 51, mapping the weighted gradient hash activation thermodynamic diagram, through trilinear interpolation and upsampling, into a weighted gradient hash activation thermodynamic matrix of the same size as the target video;
step 52, inputting the weighted gradient hash activation thermodynamic matrix into the ReLU activation function to obtain an activation matrix V_T;
step 53, setting a threshold ε, and inputting the activation matrix V_T together with the threshold ε into the Binarize function to generate a sparse mask matrix, as follows:
[Formula image not reproduced]    (7)
wherein M_n represents the mask matrix, Binarize represents the binarization function, and V_T represents the activation matrix; the mask of pixels in the activation matrix V_T whose weight is below the threshold is set to 0, and the mask of pixels in the activation matrix V_T whose weight is above the threshold is set to 1.
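Steps 52 and 53 can be sketched on a small nested list as follows. The sketch makes two assumptions that the text does not settle: a value exactly equal to the threshold ε is mapped to 0, and the input is already upsampled to the video size (the trilinear interpolation of step 51 is not shown). Function names and the toy values are hypothetical.

```python
def relu(volume):
    """Element-wise max(0, v) on a nested list of any depth."""
    if isinstance(volume, list):
        return [relu(v) for v in volume]
    return max(0.0, volume)


def binarize(volume, eps):
    """Set entries above the threshold eps to 1 and all other entries to 0."""
    if isinstance(volume, list):
        return [binarize(v, eps) for v in volume]
    return 1 if volume > eps else 0


# One frame of 2x2 pixels with toy activation values.
heat = [[[0.9, -0.3], [0.2, 0.6]]]
v_t = relu(heat)                 # negative values are clamped to 0
mask = binarize(v_t, eps=0.5)    # keep only strongly activated pixels
print(mask)  # [[[1, 0], [0, 1]]]
```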
CN202110893931.9A 2021-08-05 2021-08-05 Sparse attack resisting method based on weighted gradient Hash activation thermodynamic diagram Active CN113343025B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110893931.9A CN113343025B (en) 2021-08-05 2021-08-05 Sparse attack resisting method based on weighted gradient Hash activation thermodynamic diagram

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110893931.9A CN113343025B (en) 2021-08-05 2021-08-05 Sparse attack resisting method based on weighted gradient Hash activation thermodynamic diagram

Publications (2)

Publication Number Publication Date
CN113343025A CN113343025A (en) 2021-09-03
CN113343025B true CN113343025B (en) 2021-11-02

Family

ID=77480812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110893931.9A Active CN113343025B (en) 2021-08-05 2021-08-05 Sparse attack resisting method based on weighted gradient Hash activation thermodynamic diagram

Country Status (1)

Country Link
CN (1) CN113343025B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115761263B (en) * 2022-12-09 2023-07-25 中南大学 Deep hash method
CN115878848B (en) * 2023-02-22 2023-05-02 中南大学 Antagonistic video sample generation method, terminal equipment and medium

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8885984B1 (en) * 2013-05-07 2014-11-11 Picscout (Israel) Ltd. Efficient image matching for large sets of images
CN107977461A (en) * 2017-12-21 2018-05-01 厦门美图之家科技有限公司 A kind of video feature extraction method and device
CN108304573A (en) * 2018-02-24 2018-07-20 江苏测联空间大数据应用研究中心有限公司 Target retrieval method based on convolutional neural networks and supervision core Hash
CN112016686A (en) * 2020-08-13 2020-12-01 中山大学 Antagonism training method based on deep learning model
CN112115317A (en) * 2020-08-20 2020-12-22 鹏城实验室 Targeted attack method for deep hash retrieval and terminal device
CN112395457A (en) * 2020-12-11 2021-02-23 中国搜索信息科技股份有限公司 Video to-be-retrieved positioning method applied to video copyright protection
CN112949678A (en) * 2021-01-14 2021-06-11 西安交通大学 Method, system, equipment and storage medium for generating confrontation sample of deep learning model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
基于卷积神经网络的污点攻击与防御;胡慧敏等;《浙江科技学院学报》;20200114(第01期);第44-49页 *
深度学习中的对抗攻击与防御;刘西蒙等;《网络与信息安全学报》;20201013(第05期);第40-57页 *

Also Published As

Publication number Publication date
CN113343025A (en) 2021-09-03

Similar Documents

Publication Publication Date Title
CN109948663B (en) Step-length self-adaptive attack resisting method based on model extraction
Cheng et al. Perturbation-seeking generative adversarial networks: A defense framework for remote sensing image scene classification
CN113343025B (en) Sparse attack resisting method based on weighted gradient Hash activation thermodynamic diagram
CN108154167B (en) Chinese character font similarity calculation method
CN110765458A (en) Malicious software detection method and device based on deep learning
CN110929080B (en) Optical remote sensing image retrieval method based on attention and generation countermeasure network
CN104008174A (en) Privacy-protection index generation method for mass image retrieval
CN110826056B (en) Recommended system attack detection method based on attention convolution self-encoder
CN113806746B (en) Malicious code detection method based on improved CNN (CNN) network
CN108595688A (en) Across the media Hash search methods of potential applications based on on-line study
Zhang et al. Positional context aggregation network for remote sensing scene classification
CN116385928A (en) Space-time action detection method, equipment and medium based on self-adaptive decoder
Cao et al. Improving generative adversarial networks with local coordinate coding
CN111523586A (en) Noise-aware-based full-network supervision target detection method
CN114399630A (en) Countercheck sample generation method based on belief attack and significant area disturbance limitation
CN111581352B (en) Credibility-based Internet malicious domain name detection method
Chen et al. Query Attack by Multi-Identity Surrogates
CN116975864A (en) Malicious code detection method and device, electronic equipment and storage medium
CN111967909A (en) Trust attack detection method based on convolutional neural network
Sasipriyaa et al. Design and simulation of handwritten detection via generative adversarial networks and convolutional neural network
CN116310728A (en) Browser identification method based on CNN-Linformer model
Yang et al. APE-GAN++: An improved APE-GAN to eliminate adversarial perturbations
CN115081627B (en) Cross-modal data hash retrieval attack method based on generative network
CN115294424A (en) Sample data enhancement method based on generation countermeasure network
CN111984800B (en) Hash cross-modal information retrieval method based on dictionary pair learning

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant