CN114240951A - Black box attack method of medical image segmentation neural network based on query - Google Patents

Black box attack method of medical image segmentation neural network based on query

Info

Publication number
CN114240951A
CN114240951A
Authority
CN
China
Prior art keywords
perturbation
medical image
neural network
image segmentation
iteration
Prior art date
Legal status
Granted
Application number
CN202111520299.XA
Other languages
Chinese (zh)
Other versions
CN114240951B (en)
Inventor
徐行 (Xu Xing)
李思远 (Li Siyuan)
沈复民 (Shen Fumin)
杨阳 (Yang Yang)
Current Assignee
University of Electronic Science and Technology of China
Original Assignee
University of Electronic Science and Technology of China
Priority date
Filing date
Publication date
Application filed by University of Electronic Science and Technology of China
Priority to CN202111520299.XA
Publication of CN114240951A
Application granted
Publication of CN114240951B
Status: Active (granted)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/10: Segmentation; Edge detection
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/048: Activation functions
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00: Indexing scheme for image analysis or image enhancement
    • G06T 2207/20: Special algorithmic details
    • G06T 2207/20084: Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a query-based black-box attack method for medical image segmentation neural networks. The method establishes a probability distribution that learns how perturbations are constructed, continually builds new perturbations during iteration, generates adversarial examples from them to query the attacked model, and dynamically adjusts the parameters of the probability distribution according to the attacked model's feedback, thereby generating, within a small number of queries, adversarial examples that cause the attacked model to make severe segmentation errors. The invention makes full use of the prior information provided by the image label; this information helps the attack focus on the key foreground pixels of the image, avoiding unnecessary queries and making the attack stealthier. At the same time, the way perturbations are constructed is adjusted dynamically according to the attacked model's feedback, i.e. the method is adaptive, and compared with other existing methods the generated adversarial examples cause larger segmentation errors in medical image segmentation neural networks.

Description

Black box attack method of medical image segmentation neural network based on query
Technical Field
The invention belongs to the field of adversarial attacks in medical image segmentation, and particularly relates to a query-based black-box attack method for medical image segmentation neural networks.
Background
Medical image segmentation lies at the intersection of medical image processing and semantic segmentation. Its goal is to identify organs or lesions in a medical image and determine their exact locations for subsequent processing. Medical image segmentation often serves as a preliminary task for other medical image processing tasks and is therefore widely used in computer-aided diagnosis.
Adversarial attacks first appeared in the field of image recognition. By adding a tiny perturbation, imperceptible to the human eye, to an image, an adversarial attack can make even a powerful neural network model produce wrong outputs; the perturbed image is called an adversarial example. With the development of adversarial attack techniques, adversarial examples have also been found in object detection, semantic segmentation, image retrieval, and other fields.
According to how much the attacker knows about the attacked model, existing adversarial attack methods fall into the following two categories:
1) White-box attack methods: these assume that the attacker knows all information about the attacked model, including its network structure and parameters, its training data set, and so on. This means the attacker can obtain the attacked model's gradient information through back-propagation and easily use it to generate adversarial examples. The representative FGSM (Fast Gradient Sign Method) belongs to this category.
2) Black-box attack methods: these assume that the attacker can obtain only partial information about the target model. Compared with white-box attacks, black-box attacks are more constrained and more difficult, but they are closer to real-world conditions and therefore have higher research value. According to their working principle, black-box attack methods can be further divided into transfer-based and query-based methods.
Unlike natural image segmentation, medical image segmentation places higher demands on accuracy: even a slight segmentation error may be enough to alter the diagnosis and cause serious medical consequences. This makes medical image segmentation neural networks all the more vulnerable to adversarial examples. Research on adversarial attacks, and black-box attacks in particular, in medical image segmentation therefore has important application value: it can expose potential threats and eliminate medical safety hazards, and it also helps train more robust medical neural network models.
Disclosure of Invention
The invention aims to fill the gap left by black-box adversarial attack methods in medical image segmentation, and provides a query-based black-box attack method for medical image segmentation neural networks: a limited number of queries are issued to the attacked model, the way perturbations are constructed is dynamically adjusted according to the model's feedback, and an adversarial example that makes the model produce a wrong segmentation result is finally generated.
In the query-based black-box attack method of the invention, a probability distribution is established to learn how perturbations are constructed; in each iteration a new perturbation is built and an adversarial example is generated to query the attacked model, and the parameters of the probability distribution are adjusted according to the attacked model's feedback, so that adversarial examples causing severe segmentation errors in the attacked model are generated within a small number of queries. The method specifically comprises the following steps:
Step S1: set the initial perturbation and the initial parameters of the perturbation pattern distribution;
Step S2: sample from the perturbation pattern distribution, construct a perturbation from the sampling result, and generate an adversarial example;
Step S3: input the adversarial example into the medical image segmentation neural network and judge from its output whether to terminate the iteration; if not, go to step S4, otherwise terminate the attack;
Step S4: compute the gradient of the objective function with respect to the parameters of the perturbation pattern distribution, and update those parameters by gradient ascent according to the gradient information;
Step S5: update the perturbation, end the current iteration, return to step S2, and enter the next iteration.
Further, the initial perturbation r_0 in step S1 is defined as a tensor with the same dimensions as the image x (number of channels C, height H, width W), with each component set randomly to -ε or ε; ε denotes the maximum infinity norm allowed for the perturbation. Each component of the image x is a real number between 0 and 1. Specifically:
x ∈ [0,1]^{C×H×W}, r_0 ∈ {-ε, ε}^{C×H×W}
further, the disturbance mode distribution D in step S1 is a two-dimensional continuous probability distribution with at least one parameter.
Further, step S2 specifically includes:
Step S21: sample from the perturbation pattern distribution D to obtain a sample point s = (s_1, s_2), where s_1 and s_2 denote the two components of the two-dimensional vector s;
Step S22: map s to the pixel p' in row i, column j of the image, with the mapping rule
i = ⌊s_1 · H⌋, j = ⌊s_2 · W⌋,
where ⌊·⌋ denotes rounding down;
Step S23: take the square region of side length S centered on p' in the perturbation r_{n-1} of the previous iteration and randomly perform one of the following two actions: set every component (each component meaning a component of the tensor) of r_{n-1} inside the square region to -ε, or set every component of r_{n-1} inside the square region to ε. After the action is performed, the new perturbation (i.e. the perturbation of the current iteration) r_n is obtained;
Step S24: add the new perturbation r_n to the image x to obtain the adversarial example x_n.
Further, the iteration termination conditions in step S3 are "the maximum allowed number of queries is reached" and "all foreground pixels in the adversarial example are classified into wrong categories"; the iteration terminates as soon as either condition is satisfied.
Further, in step S4 the objective function is defined as the foreground Dice coefficient. The label y of the image x is a two-dimensional tensor of height H and width W whose components are integers between 1 and Q, where Q denotes the total number of pixel classes in x. The output f(x) obtained by feeding the image x into the medical image segmentation neural network f is a two-dimensional tensor of height H and width W whose components are real numbers between 0 and 1, representing the highest confidence of each pixel of x over the 1 to Q categories. Specifically:
y ∈ {1, 2, …, Q}^{H×W}, f(x) ∈ [0,1]^{H×W}
the foreground dess coefficient on picture x is:
Figure BDA0003407058850000031
where (·)_{ij} denotes the element in row i, column j of a two-dimensional tensor. The foreground pixel mask M of the image x is a two-dimensional tensor of height H and width W whose components are real numbers between 0 and 1; it is defined as
M_{ij} = 1 if the pixel x_{ij} of the image x belongs to a foreground class according to the label y, and M_{ij} = 0 otherwise,
where x_{ij} denotes the pixel in row i, column j of the image x.
Further, step S4 specifically includes:
Step S41: compute the foreground Dice coefficient of the adversarial example x_n and put (x_n, FD(f(x_n), y)) into an experience pool, where x_n denotes the adversarial example constructed in the current iteration;
Step S42: take from the experience pool the adversarial examples of the latest K iterations together with their foreground Dice coefficients:
{(x_{n-k}, FD(f(x_{n-k}), y)) | k = 1, 2, …, K},
where x_{n-k} denotes the adversarial example constructed in iteration n-k;
Step S43: compute the gradient of the objective function with respect to the parameter of the perturbation pattern distribution:
∇_ω E_{s∼D}[FD] ≈ (1/K) Σ_{k=1}^{K} FD(f(x_{n-k}), y) · ∇_ω log p_ω(s_{n-k}),
where ∇_ω E_{s∼D}[FD] denotes the gradient, with respect to the parameter ω of the perturbation pattern distribution, of the mathematical expectation of the foreground Dice coefficient FD over the perturbation pattern distribution D; s_{n-k} is the sampling result drawn from the perturbation pattern distribution in iteration n-k; p_ω(·) is the probability density function of the perturbation pattern distribution; and ω is the parameter of the perturbation pattern distribution;
Step S44: update the parameter ω of D by gradient ascent:
ω ← ω + α ∇_ω E_{s∼D}[FD],
where α denotes the learning rate used by the gradient ascent method; the expression assigns the updated value on the right-hand side of the arrow to the parameter on its left-hand side.
Step S5 specifically includes: if FD(f(x_n), y) < FD(f(x_{n-1}), y), keep r_n unchanged and return to step S2; otherwise roll r_n back to r_{n-1} and return to step S2. The iteration counter is then incremented by 1 and the next iteration begins.
The invention makes full use of the prior information provided by the image label; this information helps the attack focus on the key foreground pixels of the image, avoiding unnecessary queries and making the attack stealthier. The invention dynamically adjusts the way perturbations are constructed according to the attacked model's feedback, i.e. it is adaptive, and compared with other existing methods the generated adversarial examples cause larger segmentation errors in medical image segmentation neural networks.
Drawings
FIG. 1 is a flow chart of an implementation of the present invention;
FIG. 2 is a schematic illustration of an embodiment of the invention;
FIG. 3 shows the effect of the attack on chest radiographs from the JSRT and SCR databases according to an embodiment of the invention.
Detailed Description
The following description of embodiments of the invention, given with reference to the accompanying drawings, is provided so that those skilled in the art can better understand the invention.
The invention provides a query-based black-box attack method for medical image segmentation neural networks; its workflow is shown in FIG. 1, and it specifically comprises the following steps:
Step S1: set the initial perturbation and the initial parameters of the perturbation pattern distribution;
Step S2: sample from the perturbation pattern distribution, construct a perturbation from the sampling result, and generate an adversarial example;
Step S3: input the adversarial example into the medical image segmentation neural network and judge from its output whether to terminate the iteration; if not, go to step S4, otherwise terminate the attack;
Step S4: compute the gradient of the objective function with respect to the parameters of the perturbation pattern distribution, and update those parameters by gradient ascent according to the gradient information;
Step S5: update the perturbation, end the current iteration, return to step S2, and enter the next iteration.
The initial perturbation r_0 in step S1 is defined as a tensor with the same dimensions as the image x (number of channels C, height H, width W), with each component set randomly to -ε or ε; ε denotes the maximum infinity norm allowed for the perturbation, and each component of the image x is a real number between 0 and 1. Specifically:
x ∈ [0,1]^{C×H×W}, r_0 ∈ {-ε, ε}^{C×H×W}
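For concreteness, this initialization can be sketched in a few lines of NumPy. This is an illustrative sketch, not the patent's reference implementation; the function name init_perturbation, the RNG choice, and the example budget eps = 8/255 are assumptions.

import numpy as np

def init_perturbation(C, H, W, eps, rng=None):
    # Each component of r0 is set to -eps or +eps uniformly at random,
    # matching the C x H x W shape of the image x (step S1).
    rng = np.random.default_rng() if rng is None else rng
    return eps * rng.choice([-1.0, 1.0], size=(C, H, W))

# Illustrative use on a single-channel 256 x 256 image with values in [0, 1]:
r0 = init_perturbation(C=1, H=256, W=256, eps=8 / 255)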
the disturbance mode distribution D in step S1 is a two-dimensional independent normal distribution, and its probability density function is as follows:
Figure BDA0003407058850000051
wherein p (s; mu, C) represents a probability density function with a mean value mu and a covariance C; (.)TA transposition operation of a representative vector; | · | represents a determinant operation of the matrix; c-1Represents the inverse matrix of C. Parameters of the disturbance mode distribution specifically include a mean value mu and a covariance C; the initial value of the two-dimensional mean vector μ is set to (0, 0), and the initial value of the covariance matrix C is set to (C)
Figure BDA0003407058850000052
Step S2 specifically includes:
Step S21: sample from the perturbation pattern distribution D to obtain a sample point s = (s_1, s_2), where s_1 and s_2 denote the two components of the two-dimensional vector s;
Step S22: transform the coordinates of the sample point into the range [0, 1] using the Sigmoid function
σ(s) = 1 / (1 + e^{-s}),
where s is the input variable of the Sigmoid function; the transformed sample point is s^t = (σ(s_1), σ(s_2));
Step S23: map s^t to the pixel p' in row i, column j of the image x, with the mapping rule
i = ⌊s^t_1 · H⌋, j = ⌊s^t_2 · W⌋,
where ⌊·⌋ denotes rounding down;
Step S24: take the square region of side length S centered on p' in the perturbation r_{n-1} of the previous iteration and randomly perform one of the following two actions: set every component of the tensor r_{n-1} inside the square region to -ε, or set every component of r_{n-1} inside the square region to ε. After the action is performed, the new perturbation (the perturbation of the current iteration) r_n is obtained;
Step S25: add the new perturbation r_n to the image x to obtain the adversarial example x_n.
The iteration termination conditions in step S3 are "the maximum allowed number of queries is reached" and "all foreground pixels in the adversarial example are classified into wrong categories"; the iteration terminates as soon as either condition is satisfied.
The objective function in step S4 is defined as the foreground Dice coefficient. The label y of the image x is a two-dimensional tensor of height H and width W whose components are integers between 1 and Q, where Q denotes the total number of pixel classes in x. The output f(x) obtained by feeding the image x into the medical image segmentation neural network f is a two-dimensional tensor of height H and width W whose components are real numbers between 0 and 1, representing the highest confidence of each pixel of x over the 1 to Q categories. Specifically:
y ∈ {1, 2, …, Q}^{H×W}, f(x) ∈ [0,1]^{H×W}
the foreground dess coefficient on picture x is:
Figure BDA0003407058850000061
where (·)_{ij} denotes the element in row i, column j of a two-dimensional tensor. The foreground pixel mask M of the image x is a two-dimensional tensor of height H and width W whose components are real numbers between 0 and 1; it is defined as
M_{ij} = 1 if the pixel x_{ij} of the image x belongs to a foreground class according to the label y, and M_{ij} = 0 otherwise,
where x_{ij} denotes the pixel in row i, column j of the image x.
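The foreground Dice coefficient and the mask can be sketched as below. The exact FD formula appears only as an equation image in the source; the masked soft-Dice form used here is one plausible reading and should be treated as an assumption, as should the binary mask construction.

import numpy as np

def foreground_mask(y, foreground_classes):
    # Assumed reading: M_ij = 1 where the label marks a foreground pixel, else 0.
    return np.isin(y, foreground_classes).astype(float)

def foreground_dice(conf, M, smooth=1e-8):
    # conf is the H x W output f(x) (per-pixel confidences), M the foreground mask.
    # Masked soft Dice: equals 1 when every foreground pixel has confidence 1 and
    # approaches 0 as foreground confidences collapse; background pixels cannot
    # change the value, so queries are never spent on them.
    inter = (M * conf).sum()
    return (2.0 * inter + smooth) / (M.sum() + inter + smooth)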
the step S4 specifically includes:
step S41: computing a confrontation sample xnAnd will be (x)n,FD(f(xn) Y)) into an experience pool, where xnRepresenting the confrontation sample constructed in the current iteration;
step S42: taking the confrontation samples in the latest K iterations and the foreground Splace coefficients thereof from the experience pool:
Figure BDA0003407058850000071
wherein xn-kRepresenting the challenge samples constructed in the n-k iterations.
Step S43: calculating the gradient of the objective function to the disturbance mode distribution parameters mu and C:
Figure BDA0003407058850000072
wherein
Figure BDA0003407058850000073
And
Figure BDA0003407058850000074
the gradients of the mathematical expectations of the foreground dess coefficient FD on the perturbation mode distribution D on the parameters μ and C of the perturbation mode distribution (ω being specifically denoted μ and C), respectively; p is a radical ofμ(. and p)C(. cndot.) is the edge probability density function for μ and C, respectively. sn-kAnd (5) iterating sample points sampled from D and transformed by a Sigmoid function for n-k rounds.
Step S44: parameters μ and C of D are updated using a gradient ascent method:
Figure BDA0003407058850000075
wherein alpha isμAnd alphaCThe learning rates used by the gradient ascent method for μ and C, respectively. The above two equations represent assigning the updated parameter (i.e., the right end of the equation) to the parameter before updating (i.e., the left end of the equation).
Step S5 specifically includes: if FD (f (x)n),y)<FD(f(xn-1) Y), then r is maintainednUnchanged and returns to step S2, otherwise r isnBack off is rn-1And returns to step S2. Then the current iteration number n is added by 1 and the next iteration is entered.
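Steps S41 to S44 and the accept/reject rule of step S5 can be sketched as follows. The closed-form score functions of a 2-D Gaussian are standard; weighting them with the raw FD values over the last K queries, averaging over K, and the ascent step itself follow the description above but are assumptions, and no positive-definiteness safeguard for C is included.

import numpy as np

def grad_log_gaussian(s, mu, C_cov):
    # Score function of N(mu, C): gradients of log p(s; mu, C)
    # with respect to mu and to the covariance matrix C.
    C_inv = np.linalg.inv(C_cov)
    d = (s - mu).reshape(2, 1)
    g_mu = (C_inv @ d).ravel()
    g_C = 0.5 * (C_inv @ d @ d.T @ C_inv - C_inv)
    return g_mu, g_C

def update_distribution(mu, C_cov, pool, alpha_mu, alpha_C):
    # pool holds the last K pairs (s_{n-k}, FD_{n-k}) (steps S41-S42).
    # Step S43: score-function (REINFORCE-style) estimate of the gradient
    # of E[FD]; step S44: one gradient-ascent step on mu and C.
    g_mu, g_C = np.zeros(2), np.zeros((2, 2))
    for s, fd in pool:
        gm, gc = grad_log_gaussian(np.asarray(s), mu, C_cov)
        g_mu += fd * gm
        g_C += fd * gc
    K = len(pool)
    return mu + alpha_mu * g_mu / K, C_cov + alpha_C * g_C / K

# Step S5, sketched: keep the new perturbation only if FD dropped.
# if fd_n < fd_prev: r, fd_prev = r_n, fd_n    # accept r_n
# else:              r_n = r_prev              # roll back to r_{n-1}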
Embodiment:
as shown in fig. 2, the present invention provides a black box attack method for a query-based medical image segmentation neural network, which converts the generation problem of the countersample into an equivalent optimization problem, i.e. minimizing the mathematical expectation of the foreground dess coefficient, and the found optimal solution is the countersample which causes the attacked model to generate the wrong segmentation result. In order to make the generated confrontation sample imperceptible, the search space needs to be limited to the epsilon infinite norm neighborhood of the original picture.
A black-box attack assumes that the structure and parameters of the attacked model are unknown, so the above optimization problem cannot be solved using gradient information. The random search algorithm is an iterative, gradient-free optimization method whose flow is as follows: in each iteration, an observation point is chosen from the current iterate along a random direction; if the objective value at the observation point is lower than at the iterate, the iterate is replaced by the observation point, otherwise the iterate is kept; the next iteration then begins, until the maximum allowed number of iterations is reached.
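A minimal sketch of this plain random-search baseline (all names illustrative, not from the patent):

def random_search(objective, x0, propose, max_iters):
    # propose(x) returns a random neighbour of x; the candidate is kept
    # only if it lowers the objective, exactly as described above.
    x, fx = x0, objective(x0)
    for _ in range(max_iters):
        cand = propose(x)
        f_cand = objective(cand)
        if f_cand < fx:
            x, fx = cand, f_cand
    return x, fx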
On top of the random search algorithm, the invention introduces a pattern distribution, which models the probability of obtaining a smaller objective value at each feasible solution in the search space. Once the pattern distribution is established, each iteration can pick a new observation point in a direction that is likely to yield a smaller objective value, which improves search efficiency. To this end, the pattern distribution is updated during the iterations based on the sampled feasible-solution positions and the corresponding objective values.
When generating adversarial examples for a medical image segmentation neural network, the foreground pixels of the image should be weighted heavily and the background pixels ignored, because changing the class of background pixels poses no serious security threat to the attacked model. The foreground pixel mask is therefore introduced into the Dice coefficient, so that the resulting foreground Dice coefficient does not decrease when background pixel classes change, which prevents the limited query budget from being wasted on background pixels.
In this embodiment, three classical medical image segmentation neural networks, UNet, UNet-Attention, and COPLE-Net, are selected as attacked models, and the attack effect is tested on the chest X-ray data set from the JSRT and SCR databases. UNet is the most common neural network model in image segmentation and the precursor of many other segmentation networks. UNet-Attention introduces an attention mechanism into UNet, with shallow features supervising deep features so that the model gradually focuses on the region to be segmented. COPLE-Net introduces a bridging layer and an ASPP module into UNet, narrowing the semantic gap and giving the model better performance on multi-scale segmentation targets. The chest radiograph data set from the JSRT and SCR databases contains 247 chest radiographs and their labels. For convenience, the images are uniformly scaled to 256 × 256, with pixel values stored as unsigned integers between 0 and 255. The labels provide pixel-level class information identifying the class of each pixel, namely foreground (heart) and background (everything else).
In this embodiment, the attack effect of the proposed method is evaluated by the mean foreground Dice coefficient over all adversarial examples; the lower the foreground Dice coefficient, the larger the proportion of misclassified foreground pixels in the adversarial examples, i.e. the better the attack. To better show the superiority of the invention, this embodiment also tests the random search algorithm without the perturbation pattern distribution when attacking the medical image segmentation neural networks. As Table 1 shows, under the same number of queries the mean foreground Dice coefficient on the adversarial examples generated by the invention is significantly lower than on those generated by the random search algorithm, indicating that the invention attacks medical image neural networks more effectively than the random search algorithm.
TABLE 1: Comparison of the effect of the invention and the random search algorithm in attacking medical image neural networks
[Table 1 is provided as an image in the original publication.]
As shown in FIG. 3, for three different chest images (a), (b), and (c), the adversarial examples generated by the invention can hardly be distinguished from the normal images by the human eye, yet the foreground region (the white region in the image) in the prediction of the medical image segmentation neural network shrinks markedly. This shows that the adversarial examples generated by the invention can pose a sufficient security threat to medical image segmentation neural networks without being perceived by humans, and proves the effectiveness of the proposed method.
Although illustrative embodiments of the invention have been described above to help those skilled in the art understand the invention, the invention is not limited to the scope of these embodiments. To those skilled in the art, various changes fall within the protection of the invention as long as they remain within the spirit and scope of the invention as defined and determined by the appended claims; everything that makes use of the inventive concept is protected.

Claims (6)

1. A query-based black-box attack method for a medical image segmentation neural network, characterized in that the method uses a query-based attack to attack a medical image segmentation neural network model, dynamically adjusts the way perturbations are constructed according to the attacked model's feedback to the queries, and finally generates an adversarial example that makes the attacked model produce a wrong segmentation result; the method specifically comprises the following steps:
Step S1: set the initial perturbation and the initial parameters of the perturbation pattern distribution;
Step S2: sample from the perturbation pattern distribution, construct a perturbation from the sampling result, and generate an adversarial example;
Step S3: input the adversarial example into the medical image segmentation neural network and judge from its output whether to terminate the iteration: if the iteration termination condition is not met, go to step S4; otherwise terminate the attack;
Step S4: compute the gradient of the objective function with respect to the parameters of the perturbation pattern distribution, and update those parameters by gradient ascent according to the gradient information;
Step S5: update the perturbation, end the current iteration, return to step S2, and enter the next iteration.
2. The query-based black-box attack method for a medical image segmentation neural network according to claim 1, characterized in that the initial perturbation r_0 in step S1 is defined as a tensor with the same dimensions as the image x, with each component set randomly to -ε or ε, the dimensions comprising the number of channels C, the height H, and the width W; ε denotes the maximum infinity norm allowed for the perturbation, and each component of the image x is a real number between 0 and 1, specifically:
x ∈ [0,1]^{C×H×W}, r_0 ∈ {-ε, ε}^{C×H×W}
the disturbance mode distribution D in step S1 is a two-dimensional independent normal distribution, and the probability density function thereof is expressed as follows:
Figure FDA0003407058840000011
wherein p (s; mu, C) represents a probability density function with a mean value mu and a covariance C; (.)TRepresenting a transpose operation; | · | represents a determinant operation of the matrix; c-1An inverse matrix representing C; of disturbance mode distribution DThe parameters include mean μ and covariance C; the initial value of the two-dimensional mean vector μ is set to (0, 0), and the initial value of the covariance matrix C is set to (C)
Figure FDA0003407058840000012
3. The query-based black-box attack method for a medical image segmentation neural network according to claim 2, characterized in that step S2 specifically comprises:
Step S21: sample from the perturbation pattern distribution D to obtain a sample point s = (s_1, s_2), where s_1 and s_2 denote the two components of the two-dimensional vector s;
Step S22: transform the coordinates of the sample point s into the range [0, 1] using the Sigmoid function
σ(s) = 1 / (1 + e^{-s}),
where s is the input variable of the Sigmoid function; the transformed sample point is s^t = (s^t_1, s^t_2) = (σ(s_1), σ(s_2)), where s^t_1 and s^t_2 denote the two components of s^t;
Step S23: map s^t to the pixel p' in row i, column j of the image x, with the mapping rule
i = ⌊s^t_1 · H⌋, j = ⌊s^t_2 · W⌋,
where ⌊·⌋ denotes rounding down;
Step S24: take the square region of side length S centered on p' in the perturbation r_{n-1} of the previous iteration and randomly perform one of the following two actions: set every component of the tensor r_{n-1} inside the square region to -ε, or set every component of r_{n-1} inside the square region to ε; after performing the action, the perturbation r_n of the current iteration is obtained;
Step S25: add the perturbation r_n of the current iteration to the image x to obtain the adversarial example x_n of the current iteration.
4. The query-based black-box attack method for a medical image segmentation neural network according to claim 3, characterized in that the iteration termination conditions in step S3 are "the maximum allowed number of queries is reached" and "all foreground pixels in the adversarial example are classified into wrong categories", and the iteration terminates as soon as either condition is satisfied.
5. The query-based black-box attack method for a medical image segmentation neural network according to claim 4, characterized in that in step S4 the objective function is defined as the foreground Dice coefficient; the label y of the image x is a two-dimensional tensor of height H and width W whose components are integers between 1 and Q, where Q denotes the total number of pixel classes in x;
the output f(x) obtained by feeding the image x into the medical image segmentation neural network f is a two-dimensional tensor of height H and width W whose components are real numbers between 0 and 1, representing the highest confidence of each pixel of x over the 1 to Q categories; specifically:
y ∈ {1, 2, …, Q}^{H×W}, f(x) ∈ [0,1]^{H×W}
the foreground dess coefficient on picture x is:
Figure FDA0003407058840000031
where (·)_{ij} denotes the element in row i, column j of a two-dimensional tensor; the foreground pixel mask M of the image x is a two-dimensional tensor of height H and width W whose components are real numbers between 0 and 1, defined as
M_{ij} = 1 if the pixel x_{ij} of the image x belongs to a foreground class according to the label y, and M_{ij} = 0 otherwise,
where x_{ij} denotes the pixel in row i, column j of the image x;
the step S4 specifically includes:
step S41: computing a confrontation sample xnAnd will be (x)n,FD(f(xn) Y)) into an experience pool, where xnRepresenting the confrontational sample constructed in the round of iteration;
step S42: taking the confrontation samples in the latest K iterations and the foreground Splace coefficients thereof from the experience pool:
Figure FDA0003407058840000033
wherein xn-kRepresenting the constructed confrontation sample in the n-k iterations;
step S43: calculating the gradient of the objective function to the disturbance mode distribution parameters mu and C:
Figure FDA0003407058840000034
wherein
Figure FDA0003407058840000035
And
Figure FDA0003407058840000036
respectively representing the mathematical expectation pair perturbation mode of the foreground Splace coefficient FD on the perturbation mode distribution DThe gradient of the parameters μ and C of the formula distribution; p is a radical ofμ(. and p)C(. h) edge probability density functions for μ and C, respectively; sn-kSampling from D for n-k rounds of iteration and carrying out Sigmoid function transformation on the sampled points;
step S44: parameters μ and C of D are updated using a gradient ascent method:
Figure FDA0003407058840000037
wherein alpha isμAnd alphaCThe learning rates adopted by the gradient ascent method for μ and C, respectively, are expressed by the above two equations, in which the updated parameters on the right side of the arrow are assigned to the parameters before the update on the left side of the arrow.
6. The query-based black-box attack method for a medical image segmentation neural network according to claim 5, characterized in that step S5 specifically comprises: if FD(f(x_n), y) < FD(f(x_{n-1}), y), keep r_n unchanged and return to step S2; otherwise roll r_n back to r_{n-1} and return to step S2; the current iteration counter n is then incremented by 1 and the next iteration begins.
CN202111520299.XA (priority 2021-12-13, filed 2021-12-13): Black box attack method of medical image segmentation neural network based on query; Active; granted as CN114240951B (en)

Priority Applications (1)

Application Number: CN202111520299.XA | Priority Date: 2021-12-13 | Filing Date: 2021-12-13 | Title: Black box attack method of medical image segmentation neural network based on query (granted as CN114240951B)

Applications Claiming Priority (1)

Application Number: CN202111520299.XA | Priority Date: 2021-12-13 | Filing Date: 2021-12-13 | Title: Black box attack method of medical image segmentation neural network based on query (granted as CN114240951B)

Publications (2)

CN114240951A (publication of application): 2022-03-25
CN114240951B (publication of grant): 2023-04-07

Family

Family ID: 80755313

Family Applications (1)

Application Number: CN202111520299.XA (Active, granted as CN114240951B) | Priority Date: 2021-12-13 | Filing Date: 2021-12-13 | Title: Black box attack method of medical image segmentation neural network based on query

Country Status (1)

CN: CN114240951B (en)



Patent Citations (16)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180218502A1 (en) * 2017-01-27 2018-08-02 Arterys Inc. Automated segmentation utilizing fully convolutional networks
US20180260957A1 (en) * 2017-03-08 2018-09-13 Siemens Healthcare Gmbh Automatic Liver Segmentation Using Adversarial Image-to-Image Network
CN108491837A (en) * 2018-03-07 2018-09-04 浙江工业大学 A kind of confrontation attack method improving car plate attack robust
US20210290096A1 (en) * 2018-07-31 2021-09-23 Washington University Methods and systems for segmenting organs in images using a cnn-based correction network
CN110516695A (en) * 2019-07-11 2019-11-29 南京航空航天大学 Confrontation sample generating method and system towards Medical Images Classification
CN110807762A (en) * 2019-09-19 2020-02-18 温州大学 Intelligent retinal blood vessel image segmentation method based on GAN
WO2021109695A1 (en) * 2019-12-06 2021-06-10 支付宝(杭州)信息技术有限公司 Adversarial attack detection method and device
CN111027060A (en) * 2019-12-17 2020-04-17 电子科技大学 Knowledge distillation-based neural network black box attack type defense method
CN111291828A (en) * 2020-03-03 2020-06-16 广州大学 HRRP (high resolution ratio) counterattack method for sample black box based on deep learning
US20210312242A1 (en) * 2020-03-26 2021-10-07 The Regents Of The University Of California Synthetically Generating Medical Images Using Deep Convolutional Generative Adversarial Networks
WO2021239858A1 (en) * 2020-05-27 2021-12-02 Tomtom Global Content B.V. Neural network model for image segmentation
CN111967006A (en) * 2020-08-13 2020-11-20 成都考拉悠然科技有限公司 Adaptive black box anti-attack method based on neural network model
CN112149609A (en) * 2020-10-09 2020-12-29 中国人民解放军空军工程大学 Black box anti-sample attack method for electric energy quality signal neural network classification model
CN112381818A (en) * 2020-12-03 2021-02-19 浙江大学 Medical image identification enhancement method for subclass diseases
CN113077471A (en) * 2021-03-26 2021-07-06 南京邮电大学 Medical image segmentation method based on U-shaped network
CN113570627A (en) * 2021-07-02 2021-10-29 上海健康医学院 Training method of deep learning segmentation network and medical image segmentation method

Non-Patent Citations (8)

* Cited by examiner, † Cited by third party
Title
SIYUAN LI et al.: "Query-based black-box attack against medical image segmentation model", Future Generation Computer Systems *
XIANGXIANG CUI et al.: "DEAttack: A differential evolution based attack method for the robustness evaluation of medical image segmentation", Neurocomputing *
ZHENG LIU et al.: "Robustifying Deep Networks for Medical Image Segmentation", Journal of Digital Imaging *
LIU Qixu (刘奇旭) et al.: "Application of adversarial machine learning in network intrusion detection", Journal on Communications (《通信学报》) *
XU Xing (徐行) et al.: "Adversarial defense for image retrieval based on feature transformation", Computer Science (《计算机科学》) *
CAO Yuhong (曹玉红) et al.: "A survey of medical image segmentation based on deep learning", Journal of Computer Applications (《计算机应用》) *
WANG Bingxuan (王炳璇): "Research on universal adversarial example techniques for convolutional neural networks", China Master's Theses Full-text Database, Information Science and Technology *
HU Xiaohan (胡潇菡): "Research on weakly supervised image semantic segmentation based on generative adversarial networks", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612688A (en) * 2022-05-16 2022-06-10 University of Science and Technology of China Adversarial example generation method, model training method, processing method, and electronic device
CN114612688B (en) * 2022-05-16 2022-09-09 University of Science and Technology of China Adversarial example generation method, model training method, processing method, and electronic device
CN116383795A (en) * 2023-06-01 2023-07-04 Hangzhou Hikvision Digital Technology Co., Ltd. Biological feature recognition method and device, and electronic equipment
CN116383795B (en) * 2023-06-01 2023-08-25 Hangzhou Hikvision Digital Technology Co., Ltd. Biological feature recognition method and device, and electronic equipment

Also Published As

CN114240951B (en): 2023-04-07

Similar Documents

Publication Publication Date Title
CN109639710B (en) Network attack defense method based on adversarial training
CN111881935B (en) Adversarial example generation method based on content-aware GAN
CN114240951B (en) Black box attack method of medical image segmentation neural network based on query
CN111709435B (en) Adversarial example generation method based on discrete wavelet transform
CN110941794A (en) Adversarial attack defense method based on a universal inverse perturbation defense matrix
CN112200257B (en) Method and device for generating adversarial examples
CN111047054A (en) Adversarial example defense method based on two-stage adversarial knowledge transfer
CN111507384B (en) Method for generating adversarial examples for a black-box deep model
CN112085050A (en) Adversarial attack and defense method and system based on a PID controller
CN109034218B (en) Model training method, device, equipment and storage medium
CN112200243A (en) Black-box adversarial example generation method based on low-query image data
CN113487015A (en) Adversarial example generation method and system based on random image-brightness transformation
CN113435264A (en) Face recognition adversarial attack method and device based on black-box substitute model search
CN114399630A (en) Adversarial example generation method based on belief attack and salient-region perturbation limitation
CN113935396A (en) Manifold-theory-based adversarial example attack method and related device
JP2021093144A (en) Sensor-specific image recognition device and method
CN117011508A (en) Adversarial training method based on visual transformation and feature robustness
CN115510986A (en) Adversarial example generation method based on AdvGAN
CN113159317B (en) Adversarial example generation method based on dynamic residual erosion
CN115393776A (en) Black-box attack method for self-supervised video object segmentation
CN112686249B (en) Grad-CAM attack method based on adversarial patches
CN113673324A (en) Video recognition model attack method based on temporal movement
CN114444690B (en) Transfer attack method based on task augmentation
CN112529047A (en) Adversarial example generation method based on gradient masking
Min et al. Adversarial attack? don't panic

Legal Events

PB01: Publication
SE01: Entry into force of request for substantive examination
GR01: Patent grant