WO2021151276A1

WO2021151276A1 - Oct image-based image recognition method and apparatus, and device and storage medium

Info

Publication number: WO2021151276A1
Application number: PCT/CN2020/098976
Authority: WO
Inventors: 张成奋; 吕彬; 吕传峰; 谢国彤
Original assignee: 平安科技（深圳）有限公司
Priority date: 2020-05-20
Filing date: 2020-06-29
Publication date: 2021-08-05
Also published as: CN111695605A; CN111695605B

Abstract

An OCT image-based image recognition method, relating to the field of artificial intelligence, the method comprising: acquiring OCT images not containing an abnormal region to serve as sample images to construct a generative adversarial network, training a generator and a discriminator of the generative adversarial network respectively to obtain a target discriminator and a target generator, alternately iterating the target generator and the target discriminator to train the generative adversarial network until training is complete, acquiring an image to be recognized uploaded by a client and input same into the completely trained generative adversarial network to obtain a simulated image, using a first algorithm to calculate an abnormality score between the simulated image and the image to be recognized, and when the abnormality score is greater than a second pre-set threshold, determining the image to be recognized to be an abnormal image containing an abnormal region. The present method is able to improve the accuracy of recognizing whether the information reflected in an OCT image is abnormal.

Description

Image recognition method, device, equipment and storage medium based on OCT image

This application claims the priority of a Chinese patent application filed with the Chinese Patent Office on May 20, 2020, the application number is CN202010431416.4, and the invention title is "Image recognition method, server and storage medium based on OCT images", and its entire contents Incorporated in this application by reference.

Technical field

This application relates to the field of artificial intelligence technology, and in particular to an image recognition method, device, equipment and storage medium based on OCT images.

Background technique

OCT (Optical Coherence Tomography) is an imaging technology that has developed rapidly in the past ten years. It uses the basic principle of weak coherence light interferometer to detect the back of the incident weak coherence light at different depth levels of biological tissues. By scanning the reflected or scattered signals, the two-dimensional or three-dimensional structure image of the biological tissue can be obtained, that is, the OCT image. Due to the particularity of OCT images, it is usually necessary to use specific instruments to identify whether the information reflected in the corresponding OCT images is abnormal. Not only the accuracy of image recognition is low, but the recognition efficiency of image results is not high, and with the rapid development of neural networks, More and more neural networks are also applied to intelligently identify whether OCT images are abnormal.

Since most of the existing neural networks need to use a large number of abnormal OCT images (that is, images containing suspected lesion areas) samples in the training process, in practice, because abnormal OCT images involve patient privacy issues, they cannot be like normal OCT images. Such easy availability has caused many difficulties in the application of existing neural networks in the medical field. The inventor realized that even a recognition model obtained by training a small amount of abnormal OCT images has the problem of low recognition accuracy.

Summary of the invention

The main purpose of this application is to provide an image recognition method, device, equipment and storage medium based on OCT images, aiming to improve the accuracy of identifying and judging whether the information reflected in the OCT image is abnormal.

In order to achieve the foregoing objectives, an image recognition method based on OCT images provided in this application is applied to computer equipment, and the method includes:

Obtaining steps: Obtain the OCT image of the non-abnormal area as a sample image, and construct a generative confrontation network including a generator and a discriminator;

The first processing step: input the sample image into the generator, use the convolutional layer of the generator to down-sample each sample image to obtain the first image, and perform high-level feature encoding on the first image to obtain the first image A feature vector, calculating the similarity value between each first feature vector and each second feature vector in the preset storage table, using the second feature vector corresponding to the maximum similarity value as the target feature vector, and using the target feature vector The corresponding first feature vector is stored in the preset storage table as the second feature vector, and the target feature vector is up-sampled using the transposed convolution layer of the generator to obtain an analog image and used as the generator's Output result;

The second processing step: based on the output result, adjust the parameters of the generator with the goal of minimizing the first loss function value of the generator, and when the first loss function value is less than the first preset threshold, use all the parameters of the generator. The parameter of the first loss function value update generator obtains the target generator;

The third processing step: respectively input the sample image and its corresponding analog image to the discriminator to obtain the corresponding first probability value and second probability value, based on the first probability value and the second probability value, with the smallest value Change the value of the second loss function of the discriminator to the parameter of the target adjustment discriminator, and when the value of the second loss function is less than the first preset threshold, update the discriminator with the value of the second loss function To obtain a target discriminator from the parameters of, and perform alternate iterations on the target generator and target discriminator to train the generative confrontation network until the training is completed; and

Recognition step: receiving the image to be recognized uploaded by the client and inputting the trained generative countermeasure network to obtain a simulated image, using the first algorithm to calculate the anomaly score between the simulated image and the image to be recognized, when the abnormality When the score is greater than the second preset threshold, it is determined that the image to be recognized is an abnormal image including an abnormal area.

In order to solve the above-mentioned problems, the present application also provides an image recognition device based on OCT images, the device including:

Acquisition module: used to acquire OCT images of non-abnormal areas as sample images to construct a generative confrontation network including generators and discriminators;

The first processing module: used to input the sample image into the generator, use the convolutional layer of the generator to down-sample each sample image to obtain a first image, and perform high-level feature encoding on the first image Obtain the first feature vector, calculate the similarity value between each first feature vector and each second feature vector in the preset storage table, use the second feature vector corresponding to the maximum similarity value as the target feature vector, and set the target The first feature vector corresponding to the feature vector is stored in the preset storage table as the second feature vector, and the target feature vector is up-sampled using the transposed convolution layer of the generator to obtain an analog image and used as the generator The output result of the device;

The second processing module is used to adjust the parameters of the generator by minimizing the first loss function value of the generator based on the output result, and when the first loss function value is less than the first preset threshold, Update the parameters of the generator by using the first loss function value to obtain the target generator;

The third processing module is used to input the sample image and its corresponding analog image into the discriminator to obtain the corresponding first probability value and second probability value, based on the first probability value and the second probability value, The second loss function value of the discriminator is minimized to adjust the parameters of the discriminator, and when the second loss function value is less than the first preset threshold value, the second loss function value is used to update the The parameters of the discriminator obtain the target discriminator, and the target generator and the target discriminator are alternately iterated to train the generative confrontation network until the training is completed; and

Recognition module: used to receive the image to be recognized uploaded by the client and input the trained generative countermeasure network to obtain a simulated image, and use the first algorithm to calculate the anomaly score between the simulated image and the image to be recognized. When the abnormal score is greater than the second preset threshold, it is determined that the image to be recognized is an abnormal image including an abnormal area.

In order to achieve the above object, the present application also provides a computer device, including a memory, a processor, and a computer program stored in the memory and running on the processor, and the processor executes the computer program when the computer program is executed. The following steps:

To achieve the foregoing objective, the present application also provides a computer-readable storage medium on which a computer program is stored, wherein the computer program is executed by a processor to implement the following steps:

This application constructs a generative countermeasure network by acquiring OCT images without abnormal regions as sample images, and trains the generator and discriminator of the generative countermeasure network to obtain the target discriminator and target generator, and discriminate the target generator and target The generator performs alternate iterations to train the generative confrontation network until the training is completed, obtains the image to be recognized uploaded by the client and enters the generative confrontation network to obtain a simulated image, and uses the first algorithm to calculate the abnormal score between the simulated image and the image to be recognized. When the abnormal score is greater than the second preset threshold, it is determined that the image to be recognized is an abnormal image containing an abnormal area. This application can improve the accuracy of identifying and judging whether the information reflected in the OCT image is abnormal.

Description of the drawings

Figure 1 is an application environment diagram of a preferred embodiment of the computer equipment of this application;

2 is a schematic diagram of modules of an image recognition device based on OCT images;

FIG. 3 is a schematic flowchart of a preferred embodiment of an image recognition method based on OCT images according to the present application.

The realization, functional characteristics, and advantages of the purpose of this application will be further described in conjunction with the embodiments and with reference to the accompanying drawings.

Detailed ways

In order to make the purpose, technology, and advantages of the present application clearer and more comprehensible, the following further describes the present application in detail with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the application, and are not used to limit the application. Based on the embodiments in this application, all other embodiments obtained by those of ordinary skill in the art without creative work shall fall within the protection scope of this application.

It should be noted that the descriptions related to "first", "second", etc. in this application are only for descriptive purposes, and cannot be understood as indicating or implying their relative importance or implicitly indicating the number of indicated technical features . Therefore, the features defined with "first" and "second" may explicitly or implicitly include at least one of the features. In addition, the technology between the various embodiments in this embodiment can be combined with each other, but it must be based on what can be realized by a person of ordinary skill in the art. When the combination of the technology in this embodiment is contradictory or cannot be realized, it should be considered that this technology is essential. The combination of the embodiments does not exist, nor does it fall within the scope of protection claimed by this application.

This application provides a computer device 1.

The computer device 1 includes, but is not limited to, a memory 11, a processor 12, and a network interface 13.

The memory 11 includes at least one type of readable storage medium, and the readable storage medium includes flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory, etc.), magnetic memory, magnetic disk, optical disk, and the like. The memory 11 may be an internal storage unit of the computer device 1 in some embodiments, for example, a hard disk of the computer device 1. In other embodiments, the memory 11 may also be an external storage device of the computer device 1, such as a plug-in hard disk, a smart media card (SMC), and a secure digital (Secure Digital, SD) equipped on the computer device 1. ) Card, Flash Card, etc.

Further, the memory 11 may also include both an internal storage unit of the computer device 1 and an external storage device. The memory 11 can be used not only to store application software and various data installed in the computer device 1, such as the code of the image recognition program 10 based on the OCT image, etc., but also to temporarily store data that has been output or will be output.

In some embodiments, the processor 12 may be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, for running program codes or processing stored in the memory 11 Data, for example, the image recognition program 10 based on the OCT image is executed.

The network interface 13 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is usually used to establish a communication connection between the computer device 1 and other electronic devices.

The client can be a desktop computer, notebook, tablet computer, mobile phone, etc.

The network may be the Internet, a cloud network, a wireless fidelity (Wi-Fi) network, a personal network (PAN), a local area network (LAN), and/or a metropolitan area network (MAN). Various devices in the network environment can be configured to connect to the communication network according to various wired and wireless communication protocols. Examples of such wired and wireless communication protocols may include, but are not limited to, at least one of the following: Transmission Control Protocol and Internet Protocol (TCP/IP), User Datagram Protocol (UDP), Hypertext Transfer Protocol (HTTP), File Transfer Protocol (FTP), ZigBee, EDGE, IEEE 802.11, Optical Fidelity (Li-Fi), 802.16, IEEE 802.11s, IEEE 802.11g, multi-hop communication, wireless access point (AP), device-to-device communication, cellular communication Protocol and/or Bluetooth (BlueTooth) communication protocol or a combination thereof.

Optionally, the computer device 1 may also include a user interface. The user interface may include a display (Display) and an input unit such as a keyboard (Keyboard). The optional user interface may also include a standard wired interface and a wireless interface. Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode, organic light-emitting diode) touch device, etc. Among them, the display may also be called a display screen or a display unit, which is used to display the information processed in the computer device 1 and to display a visualized user interface.

FIG. 1 only shows a computer device 1 with components 11-13 and an image recognition program 10 based on OCT images. Those skilled in the art can understand that the structure shown in FIG. 1 does not constitute a limitation on the computer device 1. It may include fewer or more components than shown, or a combination of some components, or a different component arrangement.

In this embodiment, when the OCT image-based image recognition program 10 of FIG. 1 is executed by the processor 12, the following steps are implemented:

The third processing step: respectively input the sample image and its corresponding analog image to the discriminator to obtain the corresponding first probability value and second probability value, based on the first probability value and the second probability value, with the smallest value Change the second loss function value of the discriminator to a target adjustment parameter of the discriminator, and when the second loss function value is less than the first preset threshold, update the discriminator with the second loss function value To obtain a target discriminator from the parameters of, and perform alternate iterations on the target generator and target discriminator to train the generative confrontation network until the training is completed; and

In another embodiment, the program further executes the following steps:

The target feature vector and the first feature vector are input into a weight calculation formula to obtain first result data with a value in a preset value interval (for example, 0-0.1), and the extraction weight calculation formula is:

Among them, w _i represents the extraction weight of the target feature vector, exp represents the exponential operation symbol with e as the base, d represents the similarity value between the first feature vector and the second feature vector, z represents the first feature vector of the first image, m represents the first feature vector, m _j represents the second feature vector, and j represents the total number of second feature vectors in the preset storage table.

In another embodiment, the program further executes the following steps:

Performing Gaussian down-sampling on the image to be recognized to obtain a second image;

Normalize the pixels larger than Maximum/10 in the second image, where Maximum represents the maximum brightness of the second image with different preset scales;

Construct brightness Gaussian pyramids at nine scales, use Gabor filters to construct four directions, θ{0°, 45°, 90°, 135°} direction Gaussian pyramids, and calculate the brightness and direction Gaussian pyramids separately The feature map corresponding to the brightness and direction Gaussian pyramid, where the brightness feature map is: I(c,s)=|I(c)-I(s)|, and the direction feature map is: O(c,s,θ)= |O(c,θ)-O(s,θ)|, c, s represent scale parameters, θ represents angle parameters, c∈{2,3,4}, s=c+δ,δ∈{3,4 };

Obtain a preset number of feature maps, suppress the feature maps with the first preset number of activity peaks, enhance the feature maps with the second preset number of activity peaks, adjust all the feature maps to a uniform size and add them to obtain a salient feature map, The first predetermined number is greater than the second predetermined number; and

The second algorithm is used to calculate the abnormal probability value of each pixel in the image to be recognized, the abnormal probability value and the saliency feature map are respectively subjected to matrix inner product to obtain the corresponding second result data, and the second result The pixel area corresponding to the data greater than or equal to the third preset threshold is used as the target area.

For a detailed description of the above steps, please refer to the following description of the schematic diagram of the module of the image recognition device based on OCT image in FIG. 2 and the schematic diagram of the method flow of the embodiment of the image recognition method based on OCT image in FIG. 3.

Referring to FIG. 2, it is a functional block diagram of an image recognition device 100 based on OCT images of this application.

The OCT image-based image recognition apparatus 100 described in this application can be installed in a computer device. According to the realized function. The module described in the present invention can also be called a unit, which refers to a series of computer program segments that can be executed by the processor of a computer device and can complete fixed functions, and are stored in the memory of the computer device.

In this embodiment, the OCT image-based image recognition device 100 includes an acquisition module 110, a first processing module 120, a second processing module 130, a third processing module 140, and an identification module 150.

The acquisition module 110 is used to acquire an OCT image of a non-abnormal area as a sample image, and construct a generative confrontation network including a generator and a discriminator.

In this embodiment, by acquiring a large number of OCT images that do not contain abnormal regions as sample images, a generative confrontation network is constructed.

Among them, Generative Adversarial Networks (GAN, Generative Adversarial Networks) is a deep learning model. The model generates fairly good output through the mutual game learning of (at least) two modules in the framework: Generative Model and Discriminative Model, also known as generator G and discriminator D.

For example, take an analog image generated by a generator and input it into the discriminator. The discriminator initiates a vote based on the input analog image to determine the authenticity of the input analog image. Generally speaking, the closer the value of the discriminator output is to 0, the simulation of the input The more real the image, and the closer the output value is to 1, the more false the input simulated image. The generator generates a simulated image input from a real image and trains itself to fool the discriminator into thinking that the simulated images it generates are real. Therefore, the goal of training the discriminator is to maximize the images from the real data distribution and minimize the images that are not from the real data distribution.

Therefore, in this embodiment, the simulated OCT image that is closest to the sample image similarity can be generated through the generative countermeasure network, and the simulated OCT image generated by the generative countermeasure network can be used to intelligently identify whether the image to be recognized is abnormal (that is, whether it contains Area of suspected lesion).

The first processing module 120 is configured to input the sample image into the generator, use the convolutional layer of the generator to down-sample each sample image to obtain a first image, and perform high-level features on the first image The first feature vector is obtained by encoding, the similarity value between each first feature vector and each second feature vector in the preset storage table is calculated, the second feature vector corresponding to the maximum similarity value is used as the target feature vector, and the The first feature vector corresponding to the target feature vector is stored in the preset storage table as the second feature vector, and the target feature vector is up-sampled using the transposed convolution layer of the generator to obtain an analog image and used as the The output of the generator.

In order to train the generator to generate the analog image with the highest similarity to the image to be recognized, in this embodiment, first input multiple sample images into the generator, and use a convolutional layer with a step size of 2 for each The sample image is down-sampled multiple times to obtain a low-resolution first image, and the first image is subjected to high-level feature encoding to obtain the corresponding first feature vector, and each first feature vector is respectively compared with each of the preset storage tables The preset second feature vector performs similarity value calculation to obtain the corresponding similarity value.

Wherein, a large number of randomly generated image feature vectors are pre-stored in the preset storage table, and the second feature corresponding to the highest similarity value is selected by continuously calculating the similarity value with the sample image in the process of training the generator The vector is stored in the preset storage table. Since the sample image is an OCT image that does not contain an abnormal area, that is, a normal image, the second feature vector selected has the characteristics of a normal image, that is, the second feature vector in the preset storage table They are all feature vectors of normal images.

The second feature vector obtained from each training of the generator will optimize the preset storage table, so that the second feature vector in the preset storage table is richer and closer to the normal image.

The similarity value calculation method may adopt the cosine similarity algorithm. After the cosine similarity algorithm is used to calculate the similarity value corresponding to each first eigenvector and the second eigenvector, the second eigenvector corresponding to the largest similarity value is queried As the target feature vector, a transposed convolutional layer with a step size of 2 is used to upsample the target feature vector multiple times until the input resolution is restored for image reconstruction, and a high-resolution analog image is generated as the output result of the generator.

Since each preset second feature vector in the preset storage table is close to the feature vector of the normal image, no matter whether the image to be recognized in the input generator is abnormal or not, the simulated image output by the generator does not contain abnormal areas Normal image, but no matter how close the feature vector of the image to be recognized is to the normal image feature in the preset storage table, there is always a big difference from the feature vector of the normal image. Under normal circumstances, only when the If the image is a normal image, the simulated image output by the generator will be less different from the image to be recognized. Therefore, using this point, it is possible to determine whether the image to be recognized is abnormal by calculating the abnormal score between the simulated image and the image to be recognized.

In another embodiment, in order to avoid extreme situations, the abnormal image containing the abnormal area (that is, the suspected lesion area) of the partial input generator is compared with the simulated image obtained by the complex feature vector combination and the image to be recognized. Obviously, it affects the recognition accuracy of this solution. Therefore, the second feature vector corresponding to the one with the largest query similarity value as the target feature vector includes:

The second processing module 130 is configured to adjust the parameters of the generator with a goal of minimizing the first loss function value of the generator based on the output result, when the first loss function value is less than a first preset threshold , Using the first loss function value to update the parameters of the generator to obtain the target generator.

In order to improve the simulation image output by the generator to be more objective and accurate, therefore, in this embodiment, according to the first output result obtained, the first loss function value of the generator is minimized to adjust the parameters of the generator, when the generator When the first loss function value of is less than the first preset threshold value, the parameter of the generator is updated with the first loss function value to obtain the target generator.

The calculation formula of the first loss function value is:

L _g ＝E _x-ρ [log avg(E(x))+log(1-avg(E(G(x))))-αρ(z,E(G(x)))]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z.

In order to enable the generator to reconstruct well only when the OCT image without abnormal areas is input, the residual loss is designed to maximize the similarity between the image sample without abnormal areas and the simulated image generated by it. Therefore, in In another embodiment, the calculation formula of the first loss function value may also be:

L _g ＝E _x-ρ [log avg(E(x))+log(1-avg(E(G(x))))-αρ(z,E(G(x)))]+μE _{x- ρ} [xG(x)]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z, μ represents the value of the variable.

In order to avoid that the abnormal image containing the abnormal area is also well reconstructed, the extraction weight of the feature vector in the preset storage table is constrained to make it further sparse. Therefore, in another embodiment, the value of the first loss function is The calculation formula can also be:

L _g ＝E _x-ρ [log avg(E(x))+log(1-avg(E(G(x))))-αρ(z,E(G(x)))]+μE _{x- ρ} [xG(x)]+E _w-ρ [-log(w)]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z, μ represents the variable value, and w represents the extraction weight of the target feature vector.

The third processing module 140 is configured to input the sample image and its corresponding analog image into the discriminator to obtain the corresponding first probability value and second probability value, based on the first probability value and the second probability value , To minimize the second loss function value of the discriminator to adjust the parameters of the discriminator, when the second loss function value is less than the first preset threshold, use the second loss function value to update the The parameters of the discriminator obtain a target discriminator, and the target generator and the target discriminator are alternately iterated to train the generative confrontation network until the training is completed.

In this embodiment, the sample image and its corresponding analog image are respectively input to the discriminator to obtain the first probability value and the second probability value, based on the first probability value and the second probability value, to minimize the second loss of the discriminator The function value is the target to adjust the parameters of the discriminator. When the second loss function value of the discriminator is less than the first preset threshold, the second loss function value is used to update the parameters of the discriminator to obtain the target discriminator. The discriminator performs alternate iterations to train the generative adversarial network until the training is completed.

The method of alternating iteratively on the target generator and target discriminator is to minimize the objective function. The generator G and the discriminator D are interactively iterated respectively. When the generator G is fixed, the discriminator D is optimized, and when the discriminator D is fixed, it is optimized Generator G until the process converges.

The calculation formula of the second loss function value is:

L _d ＝E _x-ρ [log avg(E(G(x)))-αρ(z,E(G(x)))]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z. The added constraint items enable the discrimination network to have the ability of image coding while correctly outputting the true and false tags of the image, which improves the accuracy of the scheme for identifying the positive abnormality of the image to be recognized.

The recognition module 150 is configured to receive the image to be recognized uploaded by the client and input the trained generative countermeasure network to obtain a simulated image, and use the first algorithm to calculate the abnormal score between the simulated image and the image to be recognized, when When the abnormal score is greater than the second preset threshold, it is determined that the image to be recognized is an abnormal image including an abnormal area.

After completing the training of the generative countermeasure network, the computer device 1 inputs the image to be recognized and uploaded to the client into the generative countermeasure network to obtain a simulated image, and uses a predetermined first algorithm to calculate the distance between the simulated image and the image to be recognized. When the abnormal score is greater than the second preset threshold, it is determined that the image to be identified is an abnormal image containing an abnormal area.

The first algorithm is:

A(x)=(1-λ)R(x)+λD(x)

Among them, λ represents the variable value, R(x) represents the pixel residual of the simulated image and the image to be recognized, and D(x) represents the high-dimensional spatial residual of the discriminator encoding.

In another embodiment, in order to be able to identify the location of the abnormal area (that is, the suspected lesion area) in the OCT image containing the abnormal area, the program also executes a target detection module for:

Acquire a preset number (for example, 30) of feature maps, suppress the feature maps with the first preset number of activity peaks, enhance the feature maps with the second preset number of activity peaks, and adjust all the feature maps to a uniform size (for example, A quarter of the image to be recognized) and then add up to obtain a salient feature map, the first preset number is greater than the second preset number; and

The second algorithm is used to calculate the abnormal probability value of each pixel in the image to be recognized, the abnormal probability value and the saliency feature map are respectively subjected to matrix inner product to obtain the corresponding second result data, and the second result The pixel area corresponding to data greater than or equal to the third preset threshold is used as the target area, and the second algorithm is:

B(x)=x-G(x) B(x)=x-G(x)

Among them, x represents the image to be recognized, and G(x) represents the generator.

In addition, this application also provides an image recognition method based on OCT images. Refer to FIG. 3, which is a schematic diagram of the method flow of an embodiment of an OCT image-based image recognition method of this application. The processor 12 of the computer device 1 executes the image recognition program 10 based on the OCT image stored in the memory 11 to implement the following steps of the image recognition method based on the OCT image:

S110: Obtain an OCT image of a non-abnormal area as a sample image, and construct a generative confrontation network including a generator and a discriminator.

Among them, Generative Adversarial Networks (GAN, Generative Adversarial Networks) is a deep learning model. The model generates very good output through the mutual game learning of (at least) two modules in the framework: Generative Model and Discriminative Model, also known as generator G and discriminator D.

S120. Input the sample image to the generator, use the convolutional layer of the generator to down-sample each sample image to obtain a first image, and perform high-level feature encoding on the first image to obtain a first feature vector Calculate the similarity value between each first feature vector and each second feature vector in the preset storage table, use the second feature vector corresponding to the maximum similarity value as the target feature vector, and set the first feature vector corresponding to the target feature vector A feature vector is stored in the preset storage table as a second feature vector, and the target feature vector is up-sampled by using the transposed convolution layer of the generator to obtain a simulated image and used as the output result of the generator.

S130. Based on the output result, adjust the parameters of the generator with a goal of minimizing the first loss function value of the generator, and when the first loss function value is less than a first preset threshold, use the first loss function value The loss function value updates the parameters of the generator to obtain the target generator.

The calculation formula of the first loss function value is:

L _g ＝E _x-ρ [log avg(E(x))+log(1-avg(E(G(x))))-αρ(z,E(G(x)))]

S140, respectively input the sample image and its corresponding analog image to the discriminator to obtain corresponding first probability value and second probability value, based on the first probability value and the second probability value, to minimize the The second loss function value of the discriminator is the parameter of the target adjustment discriminator. When the second loss function value is less than the first preset threshold, the second loss function value is used to update the parameters of the discriminator to obtain A target discriminator, which alternately iterates the target generator and the target discriminator to train the generative confrontation network until the training is completed.

The calculation formula of the second loss function value is:

L _d ＝E _x-ρ [log avg(E(G(x)))-αρ(z,E(G(x)))]

S150. Receive an image to be recognized uploaded by the client and input the trained generative countermeasure network to obtain a simulated image, and calculate an abnormal score between the simulated image and the image to be recognized by using the first algorithm. When the abnormal score is When the value is greater than the second preset threshold, it is determined that the image to be identified is an abnormal image including an abnormal area.

The first algorithm is:

A(x)=(1-λ)R(x)+λD(x)

In another embodiment, in order to be able to identify the location of the abnormal area (that is, the suspected lesion area) in the OCT image containing the abnormal area, the method further includes a target detection step:

B(x)=x-G(x) B(x)=x-G(x)

In addition, the embodiment of the present application also proposes a computer-readable storage medium. The computer-readable storage medium may be non-volatile or volatile. The computer-readable storage medium may be a hard disk, a multimedia card, or an SD card. , Flash memory card, SMC, read only memory (ROM), erasable programmable read only memory (EPROM), portable compact disk read only memory (CD-ROM), USB memory, etc. any one or more of them random combination. The computer-readable storage medium includes an image recognition program 10 based on OCT images. The specific implementation of the computer-readable storage medium of this application is substantially the same as the above-mentioned OCT image-based image recognition method and the specific implementation of the computer device 1, here No longer.

It should be noted that the sequence date of the above examples of the present application is only for description, and does not represent the merits of the examples. And the terms "include", "include" or any other variants thereof in this article are intended to cover non-exclusive inclusion, so that a process, device, article or method including a series of elements not only includes those elements, but also includes those elements that are not explicitly included. The other elements listed may also include elements inherent to the process, device, article, or method. If there are no more restrictions, the element defined by the sentence "including a..." does not exclude the existence of other identical elements in the process, device, article, or method that includes the element.

The above sequence days of the embodiments of the present application are only for description, and do not represent the advantages and disadvantages of the embodiments. Through the description of the above implementation manners, those skilled in the art can clearly understand that the above-mentioned embodiment method can be implemented by means of software plus the necessary general hardware platform, of course, it can also be implemented by hardware, but in many cases the former is better.的实施方式。 Based on this understanding, this embodiment of the technology of the present application essentially or the part that contributes to the existing technology can be embodied in the form of a software product, and the computer software product is stored in a storage medium (such as ROM) as described above. /RAM, magnetic disk, optical disk) includes several instructions to make a terminal device (which can be a mobile phone, a computer, a server, or a network device, etc.) execute the methods described in the various embodiments of the present application.

The above are only the preferred embodiments of the application, and do not limit the scope of the patent for this application. Any equivalent structure or equivalent process transformation made using the content of the description and drawings of the application, or directly or indirectly applied to other related technical fields , The same reason is included in the scope of patent protection of this application.

Claims

An image recognition method based on OCT images, applied to computer equipment, wherein the method includes:

Obtaining steps: Obtain the OCT image of the non-abnormal area as a sample image, and construct a generative confrontation network including a generator and a discriminator;

The first processing step: input the sample image into the generator, use the convolutional layer of the generator to down-sample each sample image to obtain the first image, and perform high-level feature encoding on the first image to obtain the first image A feature vector, calculating the similarity value between each first feature vector and each second feature vector in the preset storage table, using the second feature vector corresponding to the maximum similarity value as the target feature vector, and using the target feature vector The corresponding first feature vector is stored in the preset storage table as the second feature vector, and the target feature vector is up-sampled using the transposed convolution layer of the generator to obtain an analog image and used as the generator's Output result;

The second processing step: based on the output result, adjust the parameters of the generator with the goal of minimizing the first loss function value of the generator, and when the first loss function value is less than the first preset threshold, use all the parameters of the generator. The parameter of the first loss function value update generator obtains the target generator;

The third processing step: respectively input the sample image and its corresponding analog image to the discriminator to obtain the corresponding first probability value and second probability value, based on the first probability value and the second probability value, with the smallest value Change the value of the second loss function of the discriminator to the parameter of the target adjustment discriminator, and when the value of the second loss function is less than the first preset threshold, update the discriminator with the value of the second loss function To obtain a target discriminator from the parameters of, and perform alternate iterations on the target generator and target discriminator to train the generative confrontation network until the training is completed; and

Recognition step: receiving the image to be recognized uploaded by the client and inputting the trained generative countermeasure network to obtain a simulated image, using the first algorithm to calculate the anomaly score between the simulated image and the image to be recognized, when the abnormality When the score is greater than the second preset threshold, it is determined that the image to be recognized is an abnormal image including an abnormal area.
8. The image recognition method based on OCT images according to claim 1, wherein said using the second feature vector corresponding to the maximum similarity value as the target feature vector comprises:

The target feature vector and the first feature vector are input into a weight calculation formula to obtain first result data with a value in a preset value interval (for example, 0-0.1), and the extraction weight calculation formula is:

Among them, w i represents the extraction weight of the target feature vector, exp represents the exponential operation symbol with e as the base, d represents the similarity value between the first feature vector and the second feature vector, z represents the first feature vector of the first image, m represents the first feature vector, m j represents the second feature vector, and j represents the total number of second feature vectors in the preset storage table.
The image recognition method based on OCT images according to claim 1, wherein the first algorithm is:

A(x)=(1-λ)R(x)+λD(x)

Among them, λ represents the variable value, R(x) represents the pixel residual of the simulated image and the image to be recognized, and D(x) represents the high-dimensional spatial residual of the discriminator encoding.
The image recognition method based on OCT images as claimed in claim 1, wherein the method further comprises a target detection step:

Performing Gaussian down-sampling on the image to be recognized to obtain a second image;

Normalize the pixels larger than Maximum/10 in the second image, where Maximum represents the maximum brightness of the second image with different preset scales;

Construct brightness Gaussian pyramids at nine scales, use Gabor filters to construct four directions, θ{0°, 45°, 90°, 135°} direction Gaussian pyramids, and calculate the brightness and direction Gaussian pyramids separately The feature map corresponding to the brightness and direction Gaussian pyramid, where the brightness feature map is: I(c,s)=|I(c)-I(s)|, and the direction feature map is: O(c,s,θ)= |O(c,θ)-O(s,θ)|, c, s represent scale parameters, θ represents angle parameters, c∈{2,3,4}, s=c+δ,δ∈{3,4 };

Obtain a preset number of feature maps, suppress the feature maps with the first preset number of activity peaks, enhance the feature maps with the second preset number of activity peaks, adjust all the feature maps to a uniform size and add them to obtain a salient feature map, The first predetermined number is greater than the second predetermined number; and

The second algorithm is used to calculate the abnormal probability value of each pixel in the image to be recognized, the abnormal probability value and the saliency feature map are respectively subjected to matrix inner product to obtain the corresponding second result data, and the second result The pixel area corresponding to the data greater than or equal to the third preset threshold is used as the target area.
The image recognition method based on OCT images according to claim 4, wherein the second algorithm is:

B(x)=x-G(x) B(x)=x-G(x)

Among them, x represents the image to be recognized, and G(x) represents the generator.
3. The image recognition method based on OCT images according to claim 1, wherein the calculation formula of the first loss function value is:

L g ＝E x-ρ [log avg(E(x))+log(1-avg(E(G(x))))-αρ(z,E(G(x)))]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z.
3. The image recognition method based on OCT images according to claim 1, wherein the calculation formula of the second loss function value is:

L d ＝E x-ρ [log avg(E(G(x)))-αρ(z,E(G(x)))]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z.
An image recognition device based on OCT images, wherein the device includes:

Acquisition module: used to acquire OCT images of non-abnormal areas as sample images to construct a generative confrontation network including generators and discriminators;

The first processing module: used to input the sample image into the generator, use the convolutional layer of the generator to down-sample each sample image to obtain a first image, and perform high-level feature encoding on the first image Obtain the first feature vector, calculate the similarity value between each first feature vector and each second feature vector in the preset storage table, use the second feature vector corresponding to the maximum similarity value as the target feature vector, and set the target The first feature vector corresponding to the feature vector is stored in the preset storage table as the second feature vector, and the target feature vector is up-sampled using the transposed convolution layer of the generator to obtain an analog image and used as the generator The output result of the device;

The second processing module is used to adjust the parameters of the generator by minimizing the first loss function value of the generator based on the output result, and when the first loss function value is less than the first preset threshold, Update the parameters of the generator by using the first loss function value to obtain the target generator;

The third processing module is used to input the sample image and its corresponding analog image into the discriminator to obtain the corresponding first probability value and second probability value, based on the first probability value and the second probability value, The second loss function value of the discriminator is minimized to adjust the parameters of the discriminator, and when the second loss function value is less than the first preset threshold value, the second loss function value is used to update the The parameters of the discriminator obtain the target discriminator, and the target generator and the target discriminator are alternately iterated to train the generative confrontation network until the training is completed; and

Recognition module: used to receive the image to be recognized uploaded by the client and input the trained generative countermeasure network to obtain a simulated image, and use the first algorithm to calculate the anomaly score between the simulated image and the image to be recognized. When the abnormal score is greater than the second preset threshold, it is determined that the image to be recognized is an abnormal image including an abnormal area.
A computer device includes a memory, a processor, and a computer program stored in the memory and running on the processor, wherein the processor implements the following steps when the processor executes the computer program:

Obtaining steps: Obtain the OCT image of the non-abnormal area as a sample image, and construct a generative confrontation network including a generator and a discriminator;

The first processing step: input the sample image into the generator, use the convolutional layer of the generator to down-sample each sample image to obtain the first image, and perform high-level feature encoding on the first image to obtain the first image A feature vector, calculating the similarity value between each first feature vector and each second feature vector in the preset storage table, using the second feature vector corresponding to the maximum similarity value as the target feature vector, and using the target feature vector The corresponding first feature vector is stored in the preset storage table as the second feature vector, and the target feature vector is up-sampled using the transposed convolution layer of the generator to obtain an analog image and used as the generator's Output result;

The second processing step: based on the output result, adjust the parameters of the generator with the goal of minimizing the first loss function value of the generator, and when the first loss function value is less than the first preset threshold, use all the parameters of the generator. The parameter of the first loss function value update generator obtains the target generator;

The third processing step: respectively input the sample image and its corresponding analog image to the discriminator to obtain the corresponding first probability value and second probability value, based on the first probability value and the second probability value, with the smallest value Change the value of the second loss function of the discriminator to the parameter of the target adjustment discriminator, and when the value of the second loss function is less than the first preset threshold, update the discriminator with the value of the second loss function To obtain a target discriminator from the parameters of, and perform alternate iterations on the target generator and target discriminator to train the generative confrontation network until the training is completed; and

Recognition step: receiving the image to be recognized uploaded by the client and inputting the trained generative countermeasure network to obtain a simulated image, using the first algorithm to calculate the anomaly score between the simulated image and the image to be recognized, when the abnormality When the score is greater than the second preset threshold, it is determined that the image to be recognized is an abnormal image including an abnormal area.
9. The computer device according to claim 9, wherein said using the second feature vector corresponding to the maximum similarity value as a target feature vector comprises:

The target feature vector and the first feature vector are input into a weight calculation formula to obtain first result data with a value in a preset value interval (for example, 0-0.1), and the extraction weight calculation formula is:

Among them, w i represents the extraction weight of the target feature vector, exp represents the exponential operation symbol with e as the base, d represents the similarity value between the first feature vector and the second feature vector, z represents the first feature vector of the first image, m represents the first feature vector, m j represents the second feature vector, and j represents the total number of second feature vectors in the preset storage table.
9. The computer device of claim 9, wherein the first algorithm is:

A(x)=(1-λ)R(x)+λD(x)

Among them, λ represents the variable value, R(x) represents the pixel residual of the simulated image and the image to be recognized, and D(x) represents the high-dimensional spatial residual of the discriminator encoding.
11. The computer device of claim 11, wherein the processor further implements the target detection step when executing the computer program:

Performing Gaussian down-sampling on the image to be recognized to obtain a second image;

Normalize the pixels larger than Maximum/10 in the second image, where Maximum represents the maximum brightness of the second image with different preset scales;

Construct brightness Gaussian pyramids at nine scales, use Gabor filters to construct four directions, θ{0°, 45°, 90°, 135°} direction Gaussian pyramids, and calculate the brightness and direction Gaussian pyramids separately The feature map corresponding to the brightness and direction Gaussian pyramid, where the brightness feature map is: I(c,s)=|I(c)-I(s)|, and the direction feature map is: O(c,s,θ)= |O(c,θ)-O(s,θ)|, c, s represent scale parameters, θ represents angle parameters, c∈{2,3,4}, s=c+δ,δ∈{3,4 };

Obtain a preset number of feature maps, suppress the feature maps with the first preset number of activity peaks, enhance the feature maps with the second preset number of activity peaks, adjust all the feature maps to a uniform size and add them to obtain a salient feature map, The first predetermined number is greater than the second predetermined number; and

The second algorithm is used to calculate the abnormal probability value of each pixel in the image to be recognized, the abnormal probability value and the saliency feature map are respectively subjected to matrix inner product to obtain the corresponding second result data, and the second result The pixel area corresponding to the data greater than or equal to the third preset threshold is used as the target area.
The computer device of claim 12, wherein the second algorithm is:

B(x)=x-G(x) B(x)=x-G(x)

Among them, x represents the image to be recognized, and G(x) represents the generator.
9. The computer device of claim 9, wherein the calculation formula of the first loss function value is:

L g ＝E x-ρ [log avg(E(x))+log(1-avg(E(G(x))))-αρ(z,E(G(x)))]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z.
9. The computer device according to claim 9, wherein the calculation formula of the second loss function value is:

L d ＝E x-ρ [log avg(E(G(x)))-αρ(z,E(G(x)))]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z.
A computer-readable storage medium having a computer program stored on the computer-readable storage medium, wherein, when the computer program is executed by a processor, the following steps are implemented:

Obtaining steps: Obtain the OCT image of the non-abnormal area as a sample image, and construct a generative confrontation network including a generator and a discriminator;

The first processing step: input the sample image into the generator, use the convolutional layer of the generator to down-sample each sample image to obtain the first image, and perform high-level feature encoding on the first image to obtain the first image A feature vector, calculating the similarity value between each first feature vector and each second feature vector in the preset storage table, using the second feature vector corresponding to the maximum similarity value as the target feature vector, and using the target feature vector The corresponding first feature vector is stored in the preset storage table as the second feature vector, and the target feature vector is up-sampled using the transposed convolution layer of the generator to obtain an analog image and used as the generator's Output result;

The second processing step: based on the output result, adjust the parameters of the generator with the goal of minimizing the first loss function value of the generator, and when the first loss function value is less than the first preset threshold, use all the parameters of the generator. The parameter of the first loss function value update generator obtains the target generator;

The third processing step: respectively input the sample image and its corresponding analog image to the discriminator to obtain the corresponding first probability value and second probability value, based on the first probability value and the second probability value, with the smallest value Change the value of the second loss function of the discriminator to the parameter of the target adjustment discriminator, and when the value of the second loss function is less than the first preset threshold, update the discriminator with the value of the second loss function To obtain a target discriminator from the parameters of, and perform alternate iterations on the target generator and target discriminator to train the generative confrontation network until the training is completed; and

Recognition step: receiving the image to be recognized uploaded by the client and inputting the trained generative countermeasure network to obtain a simulated image, using the first algorithm to calculate the anomaly score between the simulated image and the image to be recognized, when the abnormality When the score is greater than the second preset threshold, it is determined that the image to be recognized is an abnormal image including an abnormal area.
15. The computer-readable storage medium of claim 16, wherein said using the second feature vector corresponding to the maximum similarity value as a target feature vector comprises:

The target feature vector and the first feature vector are input into a weight calculation formula to obtain first result data with a value in a preset value interval (for example, 0-0.1), and the extraction weight calculation formula is:

Among them, w i represents the extraction weight of the target feature vector, exp represents the exponential operation symbol with e as the base, d represents the similarity value between the first feature vector and the second feature vector, z represents the first feature vector of the first image, m represents the first feature vector, m j represents the second feature vector, and j represents the total number of second feature vectors in the preset storage table.
The computer-readable storage medium of claim 16, wherein the first algorithm is:

A(x)=(1-λ)R(x)+λD(x)

Among them, λ represents the variable value, R(x) represents the pixel residual of the simulated image and the image to be recognized, and D(x) represents the high-dimensional spatial residual of the discriminator encoding.
15. The computer-readable storage medium of claim 16, wherein the calculation formula of the first loss function value is:

L g ＝E x-ρ [log avg(E(x))+log(1-avg(E(G(x))))-αρ(z,E(G(x)))]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z.
15. The computer-readable storage medium of claim 16, wherein the calculation formula of the second loss function value is:

L d ＝E x-ρ [log avg(E(G(x)))-αρ(z,E(G(x)))]

Among them, x represents the sample image, E(x) represents the convolutional layer in the discriminator, G(x) represents the generator, E(G(x)) represents the convolutional layer in the generator, α represents the weight coefficient, β represents The correlation between E(G(x)) and z.