CN112506797A - Performance test method for medical image recognition system - Google Patents


Info

Publication number
CN112506797A
CN112506797A (application CN202011525218.0A; granted as CN112506797B)
Authority
CN
China
Prior art keywords
image, test, loss, performance, accuracy
Prior art date
Legal status
Granted
Application number
CN202011525218.0A
Other languages
Chinese (zh)
Other versions
CN112506797B (en)
Inventor
陈芳
成楚凡
张道强
Current Assignee
Nanjing University of Aeronautics and Astronautics
Original Assignee
Nanjing University of Aeronautics and Astronautics
Priority date
Filing date
Publication date
Application filed by Nanjing University of Aeronautics and Astronautics
Priority to CN202011525218.0A
Publication of CN112506797A
Application granted
Publication of CN112506797B
Current status: Active
Anticipated expiration

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F11/00Error detection; Error correction; Monitoring
    • G06F11/36Preventing errors by testing or debugging software
    • G06F11/3668Software testing
    • G06F11/3672Test management
    • G06F11/3684Test management for test design, e.g. generating new test cases
    • G06F11/3688Test management for test execution, e.g. scheduling of test suites

Abstract

The invention discloses a performance test method for a medical image recognition system, comprising: 1) a multi-class image test data generation module, comprising an adversarial sample generation network and an entity-background recombination method; 2) a multi-angle test module covering system stability, reliability and security; and 3) a model decision evaluation module. The invention realizes the generation of multi-class image test data and complete multi-angle system testing, and finally completes the decision evaluation of the medical image recognition system; it has broad application prospects.

Description

Performance test method for medical image recognition system
Technical Field
The invention belongs to the technical field of performance analysis of medical image recognition systems, and particularly relates to a performance test method for a medical image recognition system.
Background
Medical image recognition systems play an important role in clinical diagnosis: they have greatly changed the way clinical diagnoses are made and have promoted the development of clinical medicine. Intelligent medical image recognition is based on artificial intelligence technology and is used to analyze and process images and surgical videos produced by common medical imaging techniques such as X-ray film, computed tomography and magnetic resonance imaging; its main development directions include intelligent image diagnosis, three-dimensional image reconstruction and registration, and intelligent surgical video analysis. Research in this field has progressed considerably and is gradually moving toward clinical application, so evaluating and testing the performance of medical image recognition systems is particularly important for the future development of clinical medicine. FERET set a performance standard for recognition algorithms for the first time and defined a series of evaluation criteria, greatly promoting the development of recognition technology; the evaluation standards and protocols it established still influence the state of the art and have had a profound influence on the later development of face recognition technology. However, because recognition technology was immature at the time, the systems participating in FERET evaluation were mostly prototype systems from university laboratories, and their recognition performance was not very satisfactory. Moreover, although some test schemes exist for general image recognition systems, no test scheme has yet been proposed for medical image recognition systems.
In recent years there has been increasing demand for test methods that analyze a recognition model and perform performance analysis automatically. With the rapid development of deep learning, the performance indicators of medical image recognition systems have improved rapidly and recognition efficiency has greatly increased, so how to test the performance of these models urgently needs to be solved. Since testing a medical image recognition system must consider the visual realism of the generated test images, a method combining an adversarial sample generation network with entity-background recombination is proposed, which fully guarantees the authenticity of the generated samples; and since a medical recognition system places higher requirements on reliability and security testing, a multi-angle test scheme is proposed that applies adversarial samples to medical images, achieving a better analysis of the medical recognition model.
Disclosure of Invention
The invention provides a performance testing method for a medical image recognition system, which aims to solve the problems in the prior art.
In order to achieve the purpose, the invention adopts the following technical scheme:
a performance test method for a medical image recognition system comprises the following steps: the multi-class image test data generation module comprises a confrontation sample generation network and an entity and background recombination method; the multi-angle test module comprises a performance test, a reliability test and a safety test; the decision evaluation module analyzes the input test result, judges the performance of the model and gives a detailed test report;
the network receives a group of pictures to be classified and recognized. The pictures are first input into the multi-class image test data generation module; after image augmentation, the model under test classifies them, and the classification results are input into the multi-angle test module. The multi-angle test module tests the model's learning results and passes the results to the decision evaluation module, which analyzes the input test results, judges the performance of the model and gives a detailed test report.
Further, the adversarial sample generation network and entity-background recombination method includes using multi-loss hybrid adversarial camouflage for adversarial augmentation. The multiple loss function L is expressed as:

L = λ·L_adv + L_style + L_content + L_smooth  (1)

wherein: λ denotes the adversarial strength, L_adv the adversarial loss, L_style the style loss used for style generation, L_content the content loss used to preserve the source image content, and L_smooth the smoothness loss used to ensure the smoothness of the augmented sample;
the user defines an existing image, a target attack region and a desired target pattern; the required pattern is generated in the required region, and additional physical adaptation training is applied to the generated augmented sample at each step;
the style distance between two images is defined by the difference between their style representations:

L_style = Σ_{l∈S} D(G_l(x'), G_l(x_s))  (2)

wherein: D(·,·) is a feature distance, l indexes the style-layer features, S is the set of style layers from which the style representation is extracted, G_l is the style feature extractor that takes the Gram matrix of the deep features at layer l, x_s is the style reference image, and x' is the generated adversarial sample;
the style loss used for pattern generation can make the content of the augmented image differ greatly from the content of the original image, so a content loss is used to preserve it. Specifically:

L_content = Σ_{t∈C} D(F_t(x'), F_t(x))  (3)

wherein: L_content is the content loss, t indexes the content-layer features, C is the set of content layers from which the content representation is extracted, F_t is the feature extractor for the content layers, x is the original image, and x' is the generated adversarial sample;
the smoothness of the augmented image is improved by reducing the variation between adjacent pixels; for the augmented image, the smoothness loss is defined as

L_smooth = Σ_{i,j} ((x'_{i,j} − x'_{i+1,j})² + (x'_{i,j} − x'_{i,j+1})²)  (4)

wherein: x'_{i,j} is the pixel value of the adversarial sample at coordinate (i, j), and x'_{i+1,j} and x'_{i,j+1} are the pixel values of its neighbours at coordinates (i+1, j) and (i, j+1);
for the adversarial loss L_adv, the following cross-entropy loss is used:

L_adv = log p_y(x') − log p_{y_adv}(x')  (5)

wherein: p_{y_adv}(·) and p_y(·) are the probability outputs of the target model F for the label y_adv (the class of the adversarial sample) and the label y (the class of the original image) respectively (F refers to the objective function of a general machine learning model; e.g., the objective function F of VGG is the fc8 layer, from which the probability outputs corresponding to the 1000 classes can be derived).
Realistic conditions are introduced into the generation of the augmented sample as follows:

L_adv = E_{t∈T, o∈O} [ log p_y(t(x', o)) − log p_{y_adv}(t(x', o)) ]  (6)

wherein: o is a random background image sampled from the physical world, t is a random transformation of rotation, resizing and color shift, and T is the set of transformations; by compositing the original image x with the background image o, the resulting augmented sample remains essentially legitimate to a human observer;
for target-background recombination augmentation, the target is segmented from the background using the segmentation algorithm Mask R-CNN, pixels are supplemented to the blank part of the background using an interpolation algorithm, and finally the target and background are randomly combined to realize the image augmentation.
Furthermore, the performance test in the multi-angle test module comprises different angles: judging the recognition accuracy (Accuracy), judging the recognition loss value (Loss), and judging the metamorphic relation. Accuracy and Loss are judged by subtracting the accuracy and loss values output by the model before and after augmentation, yielding the recognition accuracy difference percentage Δacc and the recognition loss difference percentage Δloss before and after augmentation;
the metamorphic test is defined as follows: C_i is the class label assigned by the image recognition system to the original test image x_i, and S_i is the confidence score of x_i; C'_i is the class label of the new test image x'_i synthesized from x_i using the metamorphic relation, and S'_i is the confidence score of x'_i. The metamorphic relation is then expressed as:

C_i = C'_i and ΔS = |S_i − S'_i| < c  (7)
wherein: c is a hyperparameter, 0 < c < 100, set here to 50, and ΔS is the difference between the confidence scores before and after augmentation.
Further, the reliability test in the multi-angle test module is a robustness test: provided the original image x satisfies the confidence guarantee, it is immune to attack within the norm-ball radius R:

g(x) = max_{ε ∈ B(x; R)} Z(x + ε) ≤ r  (8)

wherein: Z(·) is a loss function, g(·) is the objective function to be optimized, ε is the introduced noise, B(x; R) is the noise set (the norm ball of radius R around x), R is the norm-ball radius, r is a value infinitely close to 0, and x is the original image;
the final robustness accuracy (RobAcc) is defined as the fraction of test samples that remain correctly classified under all perturbations within B(x; R):

RobAcc = (number of robust samples) / (total number of samples)  (9)
the security test in the multi-angle test module is a model invariance test: a random image is selected, a one-pixel perturbation is produced using one of the four methods described below, and the sensitivity of the network to the perturbation is then measured. The first method is the "Crop" method: a square is randomly selected in the original image and resized to 224x224px; the square is then shifted diagonally by one pixel to create a second image that differs from the first by a single-pixel shift. The second method is the "Embedding" method: the image is first reduced to a minimum size of 100px while maintaining the aspect ratio and embedded at a random location within a 224x224px image, with the rest of the image filled with black pixels; the embedding location is then shifted by a single pixel, again creating two images that are identical up to a single-pixel shift. The third method is the same as the second, except that a simple inpainting algorithm is applied (each black pixel is replaced by a weighted average of the non-black pixels in its neighborhood) so that the background is filled rather than black. The fourth method is the same as the second protocol except that the embedding location is kept unchanged and the size of the embedded image is changed by a single pixel (e.g., from 100x100px to 101x101px).
Further, in the security test two measures of sensitivity are used for the model invariance test. The first, called P(Top-1 Change), is the probability that the network's top-1 prediction changes after the single-pixel perturbation; the second, called "mean absolute change" (MAC), measures the mean absolute change of the probability the network assigns to the top class (i.e., the class with the highest probability in the first of the two frames) after the single-pixel perturbation.
Further, the decision evaluation module analyzes the input test results, judges the model performance [Accuracy (recognition accuracy after augmentation), Loss (recognition loss after augmentation), Δacc (recognition accuracy difference before and after augmentation), Δloss (recognition loss difference before and after augmentation), CR (model robustness, characterized by RobAcc), ΔS (confidence score difference before and after augmentation), P(Top-1 Change) (probability that the network's top-1 prediction changes after a single-pixel perturbation), and MAC (mean absolute change)] and gives a detailed test report. When comparing the performance of several recognition models, a large number of individual performance indicators is often too complicated for users, making it difficult to reach a reasonable judgment; the design of the performance indicators therefore considers the combined influence of the different indicators on the recognition system, and a composite performance index CM (composite metric) is defined to reflect the overall performance of the different recognition systems. The formula is:

CM_i = (1/N) · Σ_{j=1}^{N} ω_j · (2·max(M_j) − M_ij) / (2·max(M_j) − min(M_j))  (10)

wherein: CM_i is the composite performance value of the i-th recognition system, ω_j is the weight of the j-th performance indicator, max(M_j) and min(M_j) are the maximum and minimum of the j-th individual performance indicator across the recognition systems, M_ij is the j-th performance indicator value of the i-th recognition system, and N is the total number of performance indicators; the factor (2·max(M_j) − M_ij)/(2·max(M_j) − min(M_j)) normalizes M_ij to the [0, 1] interval. The larger the CM value, the better the overall performance of the recognition system.
For some recognition system performance indicators, such as Loss, P(Top-1 Change) and MAC, a smaller value of M_ij indicates better recognition performance; substituting such values directly into the formula would lower the CM value, which is not desired, so these indicator values are first processed by using (1 − M_ij) in place of M_ij in the formula.
Further, the decision evaluation module finally outputs the results of the multi-angle tests and generates the corresponding test report tables, as shown in Tables 1-3. Because different task scenarios have different requirements, the test system also gives corresponding suggestions.
Compared with the prior art, the invention has the following beneficial effects:
the invention realizes the generation of multi-class image test data and multi-angle complete system test, finally completes the decision evaluation of the medical image recognition system, and has wide future application prospect.
Drawings
FIG. 1 is a framework flow diagram of the present invention;
FIG. 2 is a flow chart of the present invention for countering amplification;
FIG. 3 is a flow chart of the background recombination augmentation of an object in the present invention;
FIG. 4 is a block diagram of a decision evaluation module of the present invention.
Detailed Description
The present invention will be further described with reference to the following examples.
A performance test method for a medical image recognition system, as shown in FIG. 1, comprises: a multi-class image test data generation module, a multi-angle test module and a decision evaluation module. The multi-class image test data generation module comprises an adversarial sample generation network and an entity-background recombination method; the multi-angle test module comprises a performance test, a reliability test and a security test; the decision evaluation module analyzes the input test results, judges the performance of the model and gives a detailed test report;
the network receives a group of pictures to be classified and recognized. The pictures are first input into the multi-class image test data generation module; after image augmentation, the model under test classifies them, and the classification results are input into the multi-angle test module. The multi-angle test module tests the model's learning results and passes the results to the decision evaluation module, which analyzes the input test results, judges the performance of the model and gives a detailed test report.
Considering the characteristics of medical images and the visual realism of the generated test images, the adversarial sample generation network and entity-background recombination method uses adversarial sample generation combined with an entity-background recombination scheme. Adversarial augmentation uses multi-loss hybrid adversarial camouflage, which can generate new augmented images that look legitimate to a human observer without relying on large amounts of data to train a generating network. The aim is to develop a mechanism that generates augmented samples with a custom pattern, realizing image augmentation with a style transformation technique and concealment with an adversarial attack technique. The final multiple loss function L combines the product of the adversarial strength λ and the adversarial loss L_adv with the style loss L_style used for style generation, the content loss L_content used to preserve the source image content, and the smoothness loss L_smooth used to ensure the smoothness of the augmented sample.

The multiple loss function is expressed as:

L = λ·L_adv + L_style + L_content + L_smooth  (1)

wherein: λ denotes the adversarial strength, L_adv the adversarial loss, L_style the style loss used for style generation, L_content the content loss used to preserve the source image content, and L_smooth the smoothness loss used to ensure the smoothness of the augmented sample;
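The combination in Eq. (1) can be sketched in a few lines. The function below is an illustrative stand-in, not the patent's actual training code; the individual loss values and λ are hypothetical inputs assumed to have been computed elsewhere:

```python
def total_loss(l_adv, l_style, l_content, l_smooth, lam=1.0):
    """Multiple loss of Eq. (1): adversarial term weighted by the
    adversarial strength lam, plus style, content and smoothness terms."""
    return lam * l_adv + l_style + l_content + l_smooth
```

In practice each term would be produced by its own sub-network or pixel-level computation, and λ trades off attack strength against visual realism.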
As shown in FIG. 2, an overview of the adversarial augmentation method is given. The user defines an existing image, a target attack region and a desired target pattern, and the required pattern is generated in the required region, as shown on the right side of FIG. 2. To make the augmented samples robust to various environmental conditions (including illumination, rotation, etc.), additional physical adaptation training is applied to the generated augmented sample at each step;
the style distance between two images is defined by the difference between their style representations:

L_style = Σ_{l∈S} D(G_l(x'), G_l(x_s))  (2)

wherein: D(·,·) is a feature distance, l indexes the style-layer features, S is the set of style layers from which the style representation is extracted, G_l is the style feature extractor that takes the Gram matrix of the deep features at layer l, x_s is the style reference image, and x' is the generated adversarial sample;
the style loss used for pattern generation can make the content of the augmented image differ greatly from the content of the original image, so a content loss is used to preserve it. Specifically:

L_content = Σ_{t∈C} D(F_t(x'), F_t(x))  (3)

wherein: L_content is the content loss, t indexes the content-layer features, C is the set of content layers from which the content representation is extracted, F_t is the feature extractor for the content layers, x is the original image, and x' is the generated adversarial sample;
the smoothness of the augmented image is improved by reducing the variation between adjacent pixels; for the augmented image, the smoothness loss is defined as

L_smooth = Σ_{i,j} ((x'_{i,j} − x'_{i+1,j})² + (x'_{i,j} − x'_{i,j+1})²)  (4)

wherein: x'_{i,j} is the pixel value of the adversarial sample at coordinate (i, j), and x'_{i+1,j} and x'_{i,j+1} are the pixel values of its neighbours at coordinates (i+1, j) and (i, j+1);
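The smoothness term of Eq. (4) is a standard total-variation-style penalty and can be sketched directly in NumPy (a minimal illustration, assuming a single-channel image array):

```python
import numpy as np

def smoothness_loss(x_adv):
    """Smoothness loss of Eq. (4): sum of squared differences between each
    pixel of the adversarial sample and its right/bottom neighbours."""
    dv = x_adv[1:, :] - x_adv[:-1, :]   # differences toward (i+1, j)
    dh = x_adv[:, 1:] - x_adv[:, :-1]   # differences toward (i, j+1)
    return float((dv ** 2).sum() + (dh ** 2).sum())
```

A constant image has zero smoothness loss; any high-frequency noise added by the attack increases it, which is what pushes the optimizer toward visually smooth perturbations.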
for the adversarial loss L_adv, the following cross-entropy loss is used:

L_adv = log p_y(x') − log p_{y_adv}(x')  (5)

wherein: p_{y_adv}(·) and p_y(·) are the probability outputs of the target model F for the label y_adv (the class of the adversarial sample) and the label y (the class of the original image) respectively (F refers to the objective function of a general machine learning model; e.g., the objective function F of VGG is the fc8 layer, from which the probability outputs corresponding to the 1000 classes can be derived).
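Assuming the reconstruction of Eq. (5) above (log-probability of the original class minus that of the adversarial target), the term can be sketched as follows; `probs` is a hypothetical softmax output of the target model:

```python
import numpy as np

def adversarial_loss(probs, y, y_adv):
    """Adversarial loss of Eq. (5): log p_y(x') - log p_yadv(x').
    Minimising it moves probability mass from the original class y
    toward the adversarial target class y_adv."""
    probs = np.asarray(probs, dtype=float)
    return float(np.log(probs[y]) - np.log(probs[y_adv]))
```

The loss is positive while the model still prefers the original class and turns negative once the adversarial class dominates.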
To make the adversarial image samples realistic in the real world, realistic conditions are modeled in the process of generating the augmented samples. Since real-world environments often involve fluctuating conditions such as viewpoint movement, image noise and other natural transformations, a series of adjustments is used to accommodate these different conditions. In particular, a technique similar to Expectation Over Transformation (EOT) is used. The goal is to improve the adaptability of the augmented samples to different physical conditions, so the transformations model fluctuations in physical-world conditions, including rotation, scaling, color shift (to model lighting variations) and random backgrounds. Realistic conditions are introduced into the generation of the augmented sample as follows:

L_adv = E_{t∈T, o∈O} [ log p_y(t(x', o)) − log p_{y_adv}(t(x', o)) ]  (6)

wherein: o is a random background image sampled from the physical world, t is a random transformation of rotation, resizing and color shift, and T is the set of transformations; by compositing the original image x with the background image o, the resulting augmented sample remains essentially legitimate to a human observer;
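The expectation in Eq. (6) is normally estimated by Monte-Carlo sampling. The sketch below is a toy stand-in, not the patent's implementation: 90-degree rotations substitute for arbitrary rotation, a uniform brightness shift for color shift, and simple averaging for the compositing step:

```python
import numpy as np

def eot_loss(x_adv, backgrounds, loss_fn, n_samples=8, seed=0):
    """Monte-Carlo estimate of the expectation in Eq. (6): average the
    loss over random rotations, brightness shifts and backgrounds."""
    rng = np.random.default_rng(seed)
    total = 0.0
    for _ in range(n_samples):
        o = backgrounds[rng.integers(0, len(backgrounds))]
        t = np.rot90(0.5 * x_adv + 0.5 * o, k=rng.integers(0, 4))
        t = np.clip(t + rng.uniform(-0.1, 0.1), 0.0, 1.0)  # colour shift
        total += loss_fn(t)
    return total / n_samples
```

Optimizing the adversarial pattern against this averaged loss, rather than against a single rendering, is what gives the sample its robustness to physical-world fluctuations.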
For target-background recombination augmentation, the segmentation algorithm Mask R-CNN is used to segment the target from the background, an interpolation algorithm is used to supplement pixels to the blank part of the background, and finally the target and background are randomly combined to realize the image augmentation; the overall framework of the method is shown in FIG. 3.
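A minimal sketch of the recombination step, assuming a boolean segmentation mask is already available (in the method it would come from Mask R-CNN) and using the background mean as a crude stand-in for the interpolation-based hole filling:

```python
import numpy as np

def recombine(image, mask, new_background):
    """Cut the target (mask == True) out of `image`, fill the hole left in
    the original background, and paste the target onto `new_background`."""
    background = image.copy()
    background[mask] = image[~mask].mean()   # hole fill (interp. stand-in)
    combined = new_background.copy()
    combined[mask] = image[mask]             # target on the new background
    return background, combined
```

Randomly pairing segmented targets with filled backgrounds from other images yields the recombined test set.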
The performance test in the multi-angle test module comprises different angles: judging the recognition accuracy (Accuracy), judging the recognition loss value (Loss), and judging the metamorphic relation. Accuracy and Loss are judged by subtracting the accuracy and loss values output by the model before and after augmentation, yielding the recognition accuracy difference percentage Δacc and the recognition loss difference percentage Δloss before and after augmentation;
the metamorphic test is defined as follows: C_i is the class label assigned by the image recognition system to the original test image x_i, and S_i is the confidence score of x_i; C'_i is the class label of the new test image x'_i synthesized from x_i using the metamorphic relation, and S'_i is the confidence score of x'_i. The metamorphic relation is then expressed as:

C_i = C'_i and ΔS = |S_i − S'_i| < c  (7)
wherein: c is a hyperparameter, 0 < c < 100, set here to 50, and ΔS is the difference between the confidence scores before and after augmentation.
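The metamorphic check of Eq. (7) is a one-line predicate; the sketch below uses hypothetical class labels and percentage confidence scores:

```python
def metamorphic_ok(c_orig, s_orig, c_new, s_new, c=50.0):
    """Metamorphic relation of Eq. (7): the predicted class must be
    unchanged and the confidence scores (percentages) must differ by
    less than the hyperparameter c (set to 50 in the text)."""
    return c_orig == c_new and abs(s_orig - s_new) < c
```

A model that violates the relation on a synthesized image (label flip, or confidence swing of c or more) fails the metamorphic test for that pair.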
The reliability test in the multi-angle test module is a robustness test: provided the original image x satisfies the confidence guarantee, it is immune to attack within the norm-ball radius R:

g(x) = max_{ε ∈ B(x; R)} Z(x + ε) ≤ r  (8)

wherein: Z(·) is a loss function, g(·) is the objective function to be optimized, ε is the introduced noise, B(x; R) is the noise set (the norm ball of radius R around x), R is the norm-ball radius, r is a value infinitely close to 0, and x is the original image;
the final robustness accuracy (RobAcc) is defined as the fraction of test samples that remain correctly classified under all perturbations within B(x; R):

RobAcc = (number of robust samples) / (total number of samples)  (9)
The security test in the multi-angle test module is a model invariance test: a random image is selected, a one-pixel perturbation is produced using one of the four methods described below, and the sensitivity of the network to the perturbation is then measured. The first method is the "Crop" method: a square is randomly selected in the original image and resized to 224x224px; the square is then shifted diagonally by one pixel to create a second image that differs from the first by a single-pixel shift. The second method is the "Embedding" method: the image is first reduced to a minimum size of 100px while maintaining the aspect ratio and embedded at a random location within a 224x224px image, with the rest of the image filled with black pixels; the embedding location is then shifted by a single pixel, again creating two images that are identical up to a single-pixel shift. The third method is the same as the second, except that a simple inpainting algorithm is applied (each black pixel is replaced by a weighted average of the non-black pixels in its neighborhood) so that the background is filled rather than black. The fourth method is the same as the second protocol except that the embedding location is kept unchanged and the size of the embedded image is changed by a single pixel (e.g., from 100x100px to 101x101px).
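The first ("Crop") protocol reduces to taking the same window twice with a one-pixel diagonal offset. The sketch below keeps just that core step (the resize to 224x224px is omitted, since the one-pixel shift is what the invariance test probes); window size and position are illustrative parameters:

```python
import numpy as np

def crop_pair(image, size, top, left):
    """'Crop' protocol sketch: the same square window and the window
    shifted diagonally by one pixel."""
    first = image[top:top + size, left:left + size]
    second = image[top + 1:top + 1 + size, left + 1:left + 1 + size]
    return first, second
```

Feeding both crops to the network and comparing the outputs gives one sample for the sensitivity statistics described next.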
In the security test, two measures of sensitivity are used for the model invariance test. The first, called P(Top-1 Change), is the probability that the network's top-1 prediction changes after the single-pixel perturbation; the second, called "mean absolute change" (MAC), measures the mean absolute change of the probability the network assigns to the top class (i.e., the class with the highest probability in the first of the two frames) after the single-pixel perturbation.
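Both sensitivity measures can be computed from one pair of probability vectors; averaging the two returned quantities over many image pairs gives P(Top-1 Change) and MAC respectively (the probability vectors here are hypothetical softmax outputs):

```python
import numpy as np

def top1_change_and_mac(probs_before, probs_after):
    """Per-pair sensitivity: did the top-1 class change, and the absolute
    change of the probability of the class ranked first in the first frame."""
    pb = np.asarray(probs_before, dtype=float)
    pa = np.asarray(probs_after, dtype=float)
    top = int(np.argmax(pb))                 # top class of the first frame
    changed = top != int(np.argmax(pa))
    mac = float(abs(pb[top] - pa[top]))
    return changed, mac
```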
As shown in FIG. 4, the decision evaluation module analyzes the input test results, judges the model performance [Accuracy (recognition accuracy after augmentation), Loss (recognition loss after augmentation), Δacc (recognition accuracy difference before and after augmentation), Δloss (recognition loss difference before and after augmentation), CR (model robustness, characterized by RobAcc), ΔS (confidence score difference before and after augmentation), P(Top-1 Change) (probability that the network's top-1 prediction changes after a single-pixel perturbation), and MAC (mean absolute change)] and gives a detailed test report. When comparing the performance of several recognition models, a large number of individual performance indicators is often too complicated for users, making it difficult to reach a reasonable judgment; the design of the performance indicators therefore considers the combined influence of the different indicators on the recognition system, and a composite performance index CM (composite metric) is defined to reflect the overall performance of the different recognition systems. The formula is:

CM_i = (1/N) · Σ_{j=1}^{N} ω_j · (2·max(M_j) − M_ij) / (2·max(M_j) − min(M_j))  (10)

wherein: CM_i is the composite performance value of the i-th recognition system, ω_j is the weight of the j-th performance indicator, max(M_j) and min(M_j) are the maximum and minimum of the j-th individual performance indicator across the recognition systems, M_ij is the j-th performance indicator value of the i-th recognition system, and N is the total number of performance indicators; the factor (2·max(M_j) − M_ij)/(2·max(M_j) − min(M_j)) normalizes M_ij to the [0, 1] interval. The larger the CM value, the better the overall performance of the recognition system.
For some recognition system performance indicators, such as Loss, P(Top-1 Change) and MAC, a smaller value of M_ij indicates better recognition performance; substituting such values directly into the formula would lower the CM value, which is not desired. The values of these indicators are therefore preprocessed, using (1 − M_ij) in place of M_ij in the formula.
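A minimal sketch of the composite metric, implementing the normalization and the (1 − M_ij) substitution exactly as described above (the function and parameter names are illustrative, not from the patent; ties where max(M_j) = min(M_j) are not handled):

```python
import numpy as np

def composite_metric(M, weights, smaller_is_better):
    """CM_i = sum_j w_j * (2*max(M_j) - M_ij) / (2*max(M_j) - min(M_j)).

    M: (n_systems, n_metrics) raw indicator values.
    weights: length-n_metrics indicator weights.
    smaller_is_better: boolean flags for indicators such as Loss,
    P(Top-1 Change) and MAC, replaced by (1 - M_ij) before normalizing.
    """
    M = np.asarray(M, dtype=float).copy()
    weights = np.asarray(weights, dtype=float)
    for j, flip in enumerate(smaller_is_better):
        if flip:
            M[:, j] = 1.0 - M[:, j]       # the (1 - M_ij) substitution
    mx = M.max(axis=0)
    mn = M.min(axis=0)
    norm = (2.0 * mx - M) / (2.0 * mx - mn)  # per-column normalization
    return norm @ weights                    # weighted sum over indicators
```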
The decision evaluation module finally outputs the test results of the multi-angle test module and generates the corresponding test report tables shown in Tables 1-3. Because different task scenarios have different requirements, the test system also gives corresponding suggestions.
TABLE 1 Performance index report
TABLE 2 safety index report
TABLE 3 model stability index report and model combination property report
The above describes only preferred embodiments of the present invention. It should be noted that those skilled in the art can make various modifications and adaptations without departing from the principles of the invention, and these are intended to fall within the scope of the invention.

Claims (8)

1. A performance test method for a medical image recognition system, characterized in that the method comprises: a multi-class image test data generation module, comprising an adversarial sample generation network and an entity-and-background recombination method; a multi-angle test module, comprising a performance test, a reliability test and a security test; and a decision evaluation module, which analyzes the input test results, judges the performance of the model and gives a detailed test report;
a group of pictures to be classified and recognized is first input into the multi-class image test data generation module; after image augmentation, the input model classifies the pictures and the classification results are input into the multi-angle test module; the multi-angle test module tests the learning results of the model and transmits the results to the decision evaluation module, and the decision evaluation module analyzes the input test results, judges the performance of the model and gives a detailed test report.
2. The performance testing method for medical image recognition system according to claim 1, wherein:
the adversarial sample generation network and the entity-and-background recombination method include adversarial augmentation using multi-loss hybrid adversarial camouflage; the multiple loss function L is expressed as:
L = L_s + L_c + L_m + λ·L_adv    (1)
wherein: λ represents the adversarial strength, L_adv represents the adversarial loss, L_s represents the style loss used for style generation, L_c represents the content loss used for preserving the content of the source image, and L_m represents the smoothness loss used for ensuring the smoothness of the augmented sample;
the user defines an existing image, a target attack area and an expected target pattern; the required pattern is generated in the required area, and additional physical adaptation training is applied to the generated augmented sample at each step;
the style distance between two images is defined by the difference in the style representations of the two images:
L_s = Σ_{l∈S_l} D_l(x_s, x′),  D_l(x_s, x′) = ‖ G_l(x_s) − G_l(x′) ‖²    (2)
wherein: D_l is the feature distance at style layer l, S_l is the set of style layers from which the style representation is extracted, G_l is the style feature extractor, namely the Gram matrix of the deep-layer features extracted by the network at layer l, x_s is the style reference image, and x′ is the generated adversarial sample;
the style loss used for pattern generation can make the content of the enhanced image deviate greatly from the content of the original image; a content loss L_c is therefore used, defined as follows:
L_c = Σ_{t∈C_t} ‖ F_t(x) − F_t(x′) ‖²    (3)
wherein: L_c is the content loss, t is a content-layer feature, C_t is the set of content layers from which the content representation is extracted, F_t is the feature extractor of the content layer, x is the original image, and x′ is the generated adversarial sample;
the smoothness of the enhanced image is improved by reducing the variation between adjacent pixels; for the enhanced image, the smoothness loss is defined as:
L_m = Σ_{i,j} [ (x′_{i,j} − x′_{i+1,j})² + (x′_{i,j} − x′_{i,j+1})² ]    (4)
wherein: x′_{i,j} is the pixel value of the adversarial sample at coordinate (i, j), x′_{i+1,j} is the pixel value at coordinate (i+1, j), and x′_{i,j+1} is the pixel value at coordinate (i, j+1);
for the adversarial loss L_adv, the following cross-entropy loss is used:
L_adv = log p_y(x′) − log p_{y_adv}(x′)    (5)
wherein: p_{y_adv}(·) and p_y(·) are the probabilities output by the target model F for the labels y_adv and y respectively, y_adv is the class of the adversarial sample, and y is the class of the original image;
realistic conditions are introduced into the generation process of the augmented sample as follows:
L_adv = E_{o∈O, t∈T} [ log p_y(t(x′ + o)) − log p_{y_adv}(t(x′ + o)) ]    (6)
wherein: o is a random background image sampled from the physical world, t is a random transformation of rotation, resizing and color shift, and T is the set of such transformations;
in the entity-and-background recombination augmentation, the target is segmented from the background using the segmentation algorithm Mask R-CNN, the blank part left in the background is filled with pixels using an interpolation algorithm, and finally targets and backgrounds are randomly recombined to realize the image augmentation.
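Of the loss terms in claim 2, the smoothness loss is the simplest to make concrete; the following is a minimal NumPy sketch (the function name is illustrative, and the image is assumed to be a single-channel array), summing squared differences between vertically and horizontally adjacent pixels as in equation (4):

```python
import numpy as np

def smoothness_loss(x_adv):
    """L_m: sum of squared differences between adjacent pixels of the
    enhanced (adversarial) image, penalizing non-smooth patterns."""
    x = np.asarray(x_adv, dtype=float)
    dv = x[1:, :] - x[:-1, :]   # differences along rows (i vs i+1)
    dh = x[:, 1:] - x[:, :-1]   # differences along columns (j vs j+1)
    return float((dv ** 2).sum() + (dh ** 2).sum())
```

Minimizing this term together with the style, content and adversarial losses keeps the generated camouflage pattern free of high-frequency noise.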
3. The performance testing method for medical image recognition system according to claim 1, wherein:
the performance test in the multi-angle test module covers different angles: judging the recognition accuracy Accuracy, judging the recognition loss value Loss, and judging the metamorphic relation; the recognition accuracy Accuracy and the recognition loss value Loss output by the model before and after augmentation are subtracted to obtain the recognition accuracy difference Δacc and the recognition loss difference ΔLoss before and after augmentation;
the metamorphic test is defined as follows: C_i is the classification label assigned by the image recognition system to the original test image I_i, and S_i is the confidence score of the original test image I_i; C′_i is the classification label of the new test image I′_i synthesized from I_i according to the metamorphic relation, and S′_i is the confidence score of the new test image I′_i;
the metamorphic relation is expressed as:
C_i = C′_i and ΔS = |S_i − S′_i| < c    (7)
wherein: c is a hyperparameter with 0 < c < 100, here set to 50, and ΔS is the difference of the confidence scores before and after augmentation.
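The metamorphic relation in equation (7) reduces to a simple predicate; a minimal Python sketch (names are illustrative; confidence scores are assumed to lie on a 0-100 scale, consistent with 0 < c < 100):

```python
def metamorphic_holds(c_orig, s_orig, c_new, s_new, c=50.0):
    """Equation (7): the class label must be unchanged (C_i = C'_i) and the
    confidence scores must differ by less than the hyperparameter c."""
    delta_s = abs(s_orig - s_new)
    return c_orig == c_new and delta_s < c
```

A violation of this predicate on a synthesized test image flags a defect in the recognition system without requiring a ground-truth label for the new image.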
4. The performance testing method for medical image recognition system according to claim 1, wherein:
the reliability test in the multi-angle test module is a verified-robustness test; when the original image x satisfies the confidence guarantee, the model is immune to attack within a norm ball of radius R:
Z(g(x + ε)) = Z(g(x)),  ∀ ε ∈ B(x; R)    (8)
wherein: Z(·) is the loss function, g(·) is the objective function to be optimized, ∀ denotes "for arbitrary", ε is the introduced noise, B(x; R) is the noise set around x, R is the norm-ball radius, and x is the original image;
the final robustness accuracy robac is defined as:
robac = N_robust / N_total    (9)
wherein: N_robust is the number of test images that are both correctly classified and verified robust within the radius R, and N_total is the total number of test images.
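Exhaustive verification over the norm ball is generally intractable, so a sampling surrogate is one way to approximate robac; the following sketch (the names and the sampling strategy are assumptions, not the patent's verification procedure) counts an image as robust only if the prediction is correct on the clean input and unchanged for every sampled perturbation of norm R:

```python
import numpy as np

def robust_accuracy(model, images, labels, radius, n_samples=20, seed=0):
    """Approximate robac: fraction of images that are correctly classified
    and keep their prediction under sampled noise of norm `radius`.
    `model` maps an image array to a predicted class index."""
    rng = np.random.default_rng(seed)
    robust = 0
    for x, y in zip(images, labels):
        if model(x) != y:            # must be correct on the clean image
            continue
        ok = True
        for _ in range(n_samples):
            eps = rng.normal(size=x.shape)
            eps *= radius / max(np.linalg.norm(eps), 1e-12)  # scale to norm R
            if model(x + eps) != y:
                ok = False
                break
        robust += ok
    return robust / len(images)
```

Sampling can only refute robustness, never certify it, so this is an optimistic estimate compared with a true verification procedure.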
5. the performance testing method for medical image recognition system according to claim 1, wherein:
the security test in the multi-angle test module is a model invariance test: a random image is selected, a single-pixel perturbation is generated by one of the four methods described below, and the sensitivity of the network to the perturbation is then measured. The first method is the Crop method: a square is randomly selected in the original image and resized to 224x224 px; the square is then translated by one pixel along its diagonal to create a second image that differs from the first by a single-pixel translation. The second method is the Embedding method: the image is first reduced, preserving its aspect ratio, until its smaller side is 100 px, and embedded at a random location within a 224x224 px image whose remainder is filled with black pixels; the embedding location is then shifted by a single pixel, again creating two images that differ by a single-pixel shift. The third method is the same as the second, except that a simple inpainting algorithm is additionally applied: each black pixel is replaced by the weighted average of the non-black pixels in its neighborhood. The fourth method is also the same as the second, except that the embedding location is kept unchanged and the size of the embedded image is changed by a single pixel.
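The Crop method's single-pixel-shift pair can be sketched as follows (illustrative names; the resize step from an arbitrary square to 224x224 px is omitted, and the crop is simply taken at the target size):

```python
import numpy as np

def crop_pair(image, size=224, seed=0):
    """Return two size x size crops whose top-left corners are one pixel
    apart along the diagonal, so the inputs differ by a single-pixel
    translation."""
    img = np.asarray(image)
    h, w = img.shape[:2]
    rng = np.random.default_rng(seed)
    top = rng.integers(0, h - size)    # high is exclusive: leaves room
    left = rng.integers(0, w - size)   # for the 1-px diagonal shift
    first = img[top:top + size, left:left + size]
    second = img[top + 1:top + size + 1, left + 1:left + size + 1]
    return first, second
```

Feeding both crops through the network and comparing the outputs yields the per-image inputs for P(Top-1 Change) and MAC.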
6. The performance testing method for medical image recognition system according to claim 5, wherein:
in the security test, sensitivity is measured as an invariance test of the model by two methods: the first, called P(Top-1 Change), is the probability that the network's top-1 prediction changes after a single-pixel perturbation; the second, called MAC (mean absolute change), is the mean absolute change of the probability the network assigns to the top class after the single-pixel perturbation.
7. The performance testing method for medical image recognition system according to claim 1, wherein:
the decision evaluation module analyzes the input test results and judges the model performance, namely the recognition accuracy Accuracy after augmentation, the recognition loss Loss after augmentation, the recognition accuracy difference Δacc before and after augmentation, the recognition loss difference ΔLoss before and after augmentation, the model robustness CR characterized by robac, the confidence score difference ΔS before and after augmentation, the probability P(Top-1 Change) that the network's top-1 prediction changes after a single-pixel perturbation, and the mean absolute change MAC, and gives a test report; the formula is as follows:
CM_i = Σ_{j=1}^{N} ω_j · (2·max(M_j) − M_ij) / (2·max(M_j) − min(M_j))    (10)
wherein: CM_i represents the composite performance value of the i-th recognition system, ω_j represents the weight of the j-th performance indicator, max(M_j) represents the maximum value of the j-th individual performance indicator across the recognition systems, min(M_j) represents the minimum value of the j-th performance indicator across the recognition systems, M_ij represents the j-th performance indicator value of the i-th recognition system, and N represents the total number of performance indicator values; the term (2·max(M_j) − M_ij)/(2·max(M_j) − min(M_j)) normalizes M_ij to the interval [0, 1].
8. The performance testing method for medical image recognition system according to claim 1, wherein:
and the decision evaluation module finally outputs the various test results of the multi-angle test module and generates the corresponding test report tables.
CN202011525218.0A 2020-12-22 2020-12-22 Performance test method for medical image recognition system Active CN112506797B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011525218.0A CN112506797B (en) 2020-12-22 2020-12-22 Performance test method for medical image recognition system


Publications (2)

Publication Number Publication Date
CN112506797A true CN112506797A (en) 2021-03-16
CN112506797B CN112506797B (en) 2022-05-24

Family

ID=74923062

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011525218.0A Active CN112506797B (en) 2020-12-22 2020-12-22 Performance test method for medical image recognition system

Country Status (1)

Country Link
CN (1) CN112506797B (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113268870A (en) * 2021-05-19 2021-08-17 北京航空航天大学 Monte Carlo-based image recognition reliability evaluation method under outdoor environment condition
CN113486899A (en) * 2021-05-26 2021-10-08 南开大学 Saliency target detection method based on complementary branch network
CN113780557A (en) * 2021-11-11 2021-12-10 中南大学 Method, device, product and medium for resisting image attack based on immune theory
US11900553B2 (en) 2021-12-31 2024-02-13 Samsung Electronics Co., Ltd. Processing method and apparatus with augmented reality

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2008060022A1 (en) * 2006-11-13 2008-05-22 Electronics And Telecommunications Research Institute System and method for evaluating and certifying image identifier
US20160314064A1 (en) * 2015-04-21 2016-10-27 Cloudy Days Inc. Dba Nouvola Systems and methods to identify and classify performance bottlenecks in cloud based applications
CN110458213A (en) * 2019-07-29 2019-11-15 四川大学 A kind of disaggregated model robust performance appraisal procedure
CN110516695A (en) * 2019-07-11 2019-11-29 南京航空航天大学 Confrontation sample generating method and system towards Medical Images Classification
CN111191660A (en) * 2019-12-30 2020-05-22 浙江工业大学 Rectal cancer pathology image classification method based on multi-channel collaborative capsule network
US20200285896A1 (en) * 2019-03-09 2020-09-10 Tongji University Method for person re-identification based on deep model with multi-loss fusion training strategy
CN111681210A (en) * 2020-05-16 2020-09-18 浙江德尚韵兴医疗科技有限公司 Method for identifying benign and malignant breast nodules by shear wave elastogram based on deep learning
CN111753985A (en) * 2020-06-28 2020-10-09 浙江工业大学 Image deep learning model testing method and device based on neuron coverage rate
CN111782529A (en) * 2020-06-30 2020-10-16 平安国际智慧城市科技股份有限公司 Test method and device for auxiliary diagnosis system, computer equipment and storage medium
CN112052186A (en) * 2020-10-10 2020-12-08 腾讯科技(深圳)有限公司 Target detection method, device, equipment and storage medium


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BIN LIU ET AL.: "A Data Augmentation Method Based on Generative Adversarial Networks for Grape Leaf Disease Identification", 《IEEE ACCESS》 *
SHI Jun et al.: "A Survey of Deep Learning Applications in Medical Imaging", Journal of Image and Graphics *
YUAN Gongping et al.: "Vehicle Type Recognition Method Based on Deep Convolutional Neural Network", Journal of Zhejiang University (Engineering Science) *


Also Published As

Publication number Publication date
CN112506797B (en) 2022-05-24

Similar Documents

Publication Publication Date Title
CN112506797B (en) Performance test method for medical image recognition system
CN109583342B (en) Human face living body detection method based on transfer learning
CN108537743B (en) Face image enhancement method based on generation countermeasure network
CN109815893B (en) Color face image illumination domain normalization method based on cyclic generation countermeasure network
CN112101426B (en) Unsupervised learning image anomaly detection method based on self-encoder
CN111754596B (en) Editing model generation method, device, equipment and medium for editing face image
CN110490158B (en) Robust face alignment method based on multistage model
CN113283444B (en) Heterogeneous image migration method based on generation countermeasure network
CN109977887A (en) A kind of face identification method of anti-age interference
CN113642621A (en) Zero sample image classification method based on generation countermeasure network
CN111652864A (en) Casting defect image generation method for generating countermeasure network based on conditional expression
CN113362416A (en) Method for generating image based on text of target detection
CN113378949A (en) Dual-generation confrontation learning method based on capsule network and mixed attention
CN101404059A (en) Iris image database synthesis method based on block texture sampling
CN114565880B (en) Method, system and equipment for detecting counterfeit video based on optical flow tracking
CN114266933A (en) GAN image defogging algorithm based on deep learning improvement
CN113627504B (en) Multi-mode multi-scale feature fusion target detection method based on generation of countermeasure network
CN113486712B (en) Multi-face recognition method, system and medium based on deep learning
CN116563957B (en) Face fake video detection method based on Fourier domain adaptation
CN112818774A (en) Living body detection method and device
Duan et al. Image information hiding method based on image compression and deep neural network
Meng et al. A Novel Steganography Algorithm Based on Instance Segmentation.
CN114119356A (en) Method for converting thermal infrared image into visible light color image based on cycleGAN
CN114067187A (en) Infrared polarization visible light face translation method based on countermeasure generation network
CN113420608A (en) Human body abnormal behavior identification method based on dense space-time graph convolutional network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant