WO2020168647A1

WO2020168647A1 - Image recognition method and related device

Info

Publication number: WO2020168647A1
Application number: PCT/CN2019/088825
Authority: WO
Inventors: 王健宗; 肖京
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-02-21
Filing date: 2019-05-28
Publication date: 2020-08-27
Also published as: CN109978004B; CN109978004A

Abstract

The embodiments of the present application disclose an image recognition method and a related device. The method comprises: inputting a target lung scan image into a first neural network to obtain a first category probability map; inputting the first category probability map into a second neural network to obtain a second category probability map; extracting nodule units from the target lung scan image according to the first category probability map to obtain a plurality of nodule units; inputting each of the plurality of nodule units into a third neural network to obtain a third category probability map for the nodule type of each of the plurality of nodule units; and inputting the second category probability map and the third category probability map into a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image. The present application improves the accuracy of image recognition of lung cancer lesion sites.

Description

Image recognition method and related equipment

This application claims priority to Chinese patent application No. 201910135802.6, entitled "Image Recognition Method and Related Equipment" and filed with the Chinese Patent Office on February 21, 2019, the entire content of which is incorporated into this application by reference.

Technical field

This application relates to the field of data processing technology, and mainly relates to an image recognition method and related equipment.

Background technique

Lung cancer is one of the malignant tumors with the fastest-growing morbidity and mortality, and one of the greatest threats to human health and life. Over the past 50 years, many countries have reported significant increases in the incidence and mortality of lung cancer. Traditional lung cancer screening relies on professional medical personnel to interpret lung images and pick out suspicious lung nodules. This places an extremely heavy workload on medical personnel and is prone to false positive diagnoses. Therefore, how to improve the accuracy of image recognition of lung cancer lesions is a technical problem to be solved by those skilled in the art.

Summary of the invention

The embodiments of the present application provide an image recognition method and related equipment, which can recognize the lung cancer prevalence probability of a patient through lung scan images, and improve the accuracy of image recognition of lung cancer lesions.

In the first aspect, an embodiment of the present application provides an image recognition method, wherein:

Inputting a target lung scan image to a first neural network to obtain a first category probability map for nodules and no nodules, wherein the first neural network is used to identify nodule images in the target lung scan image;

Inputting the first category probability map to a second neural network to obtain a second category probability map for benign nodules, malignant nodules and no nodules, wherein the second neural network is used to identify the nodule type of the nodule images in the first category probability map;

Extracting nodule units from the target lung scan image according to the first category probability map to obtain multiple nodule units;

Inputting each of the multiple nodule units to a third neural network to obtain a third category probability map for the nodule type of each of the multiple nodule units, wherein the nodule types include benign nodules and malignant nodules, and the third neural network is used to identify the nodule type of each of the multiple nodule units;

Inputting the second category probability map and the third category probability map to a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image, wherein the fourth neural network is used to classify the second category probability map and the third category probability map.
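The data flow of the five steps above can be sketched as follows. This is only an illustrative outline: the function name `lung_cancer_probability`, the `ProbMap` alias and the idea of passing the four networks and the unit extractor in as plain callables are assumptions made for the sketch, not part of the application, which does not specify any programming interface.

```python
from typing import Callable, List, Sequence

# Hypothetical stand-in: a probability map is modelled as a flat list of floats.
ProbMap = List[float]


def lung_cancer_probability(
    scan: Sequence[float],
    first_net: Callable[[Sequence[float]], ProbMap],
    second_net: Callable[[ProbMap], ProbMap],
    extract_units: Callable[[Sequence[float], ProbMap], List[Sequence[float]]],
    third_net: Callable[[Sequence[float]], ProbMap],
    fourth_net: Callable[[ProbMap, List[ProbMap]], float],
) -> float:
    """Chain the four networks in the order given by the first aspect."""
    first_map = first_net(scan)                 # nodule / no nodule
    second_map = second_net(first_map)          # global: benign / malignant / none
    units = extract_units(scan, first_map)      # nodule units from the scan
    third_maps = [third_net(u) for u in units]  # local: benign / malignant per unit
    return fourth_net(second_map, third_maps)   # combined lung cancer probability
```

The second network works on the whole first category probability map (the globally recognized nodule type), while the third network works on each extracted nodule unit (the locally recognized nodule type); the fourth network combines both.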

In the second aspect, an embodiment of the present application provides an image recognition device, wherein:

The first processing unit is configured to input a target lung scan image to a first neural network to obtain a first category probability map for nodules and no nodules, wherein the first neural network is used to identify nodule images in the target lung scan image;

The second processing unit is configured to input the first category probability map to a second neural network to obtain a second category probability map for benign nodules, malignant nodules and no nodules, wherein the second neural network is used to identify the nodule type of the nodule images in the first category probability map;

The third processing unit is configured to extract nodule units from the target lung scan image according to the first category probability map to obtain multiple nodule units, and to input each of the multiple nodule units to a third neural network to obtain a third category probability map for the nodule type of each of the multiple nodule units, wherein the nodule types include benign nodules and malignant nodules, and the third neural network is used to identify the nodule type of each of the multiple nodule units;

The fourth processing unit is configured to input the second category probability map and the third category probability map to a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image, wherein the fourth neural network is used to classify the second category probability map and the third category probability map.

In a third aspect, embodiments of the present application provide an electronic device, including a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the programs include instructions for some or all of the steps described in the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium, wherein the computer-readable storage medium stores a computer program, and the computer program causes a computer to execute some or all of the steps described in the first aspect.

With the above image recognition method and related equipment, the electronic device first recognizes the nodule images in the lung scan image, and then determines the lung cancer probability from both the locally recognized nodule type and the globally recognized nodule type, which improves the accuracy of image recognition of lung cancer lesion sites.

Description of the drawings

In order to more clearly describe the technical solutions in the embodiments of the present application, the following will briefly introduce the drawings needed in the embodiments. Obviously, the drawings in the following description are only some embodiments of the present application. For those of ordinary skill in the art, without creative work, other drawings can be obtained based on these drawings.

FIG. 1 is a schematic flowchart of an image recognition method provided by an embodiment of this application;

FIG. 2 is a schematic structural diagram of an image recognition device provided by an embodiment of this application;

FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the application.

Detailed description

In order to enable those skilled in the art to better understand the solutions of the present application, the technical solutions in the embodiments of the present application will be clearly and completely described below in conjunction with the drawings in the embodiments of the present application. The following describes the embodiments of the present application in detail.

Please refer to FIG. 1, which is a schematic flowchart of an image recognition method provided by an embodiment of the present application. The image recognition method is applied to an electronic device. The electronic devices involved in the embodiments of the present application may include various handheld devices with wireless communication functions, wearable devices, computing devices or other processing devices connected to a wireless modem, as well as various forms of user equipment (UE), mobile stations (MS), terminal devices, and so on. For ease of description, the devices mentioned above are collectively referred to as electronic devices.

Specifically, as shown in Figure 1, an image recognition method is applied to an electronic device, where:

S101: Input the scanned image of the target lung to the first neural network to obtain the first category probability map for nodules and no nodules.

In the present application, the target lung scan image is obtained by computed tomography (CT) of a patient's lungs in a hospital. This application does not limit the specific scanning method. For example, the patient may be placed in the supine position, head first, and scanned by helical acquisition from the apex of the lung to the base of the lung, with an acquisition slice thickness less than or equal to 1 mm and a spacing of 5 to 7 mm. The mediastinal window has a width of 300 to 500 HU and a level of 30 to 50 HU; the lung window has a width of 800 to 1500 HU and a level of -600 to 800 HU. HU is the unit of CT value, known as the Hounsfield unit, and expresses the relative density of tissue structures on a CT image.

In a possible embodiment, before the target lung scan image is input to the first neural network to obtain the first category probability map for nodules and no nodules, the method further includes: obtaining multiple lung scan images to be recognized; performing morphological denoising on each of the multiple lung scan images to obtain multiple first processed images; performing pixel normalization on each of the multiple first processed images to obtain multiple second processed images; and stacking the multiple second processed images three-dimensionally according to the scan sequence of the multiple lung scan images and a preset size to obtain the target lung scan image.

The multiple lung scan images are planar scan images whose pixel values range over (-1024, 3071), corresponding to radiodensity in Hounsfield units. Morphological operations are image processing methods developed for binary images on the basis of the set theory of mathematical morphology. Lung scan images inevitably contain noise; for example, the original CT may include clothing, medical equipment and the like, which is not limited here. In this embodiment, morphology-based denoising removes noise from the lung scan images, which helps improve the efficiency and accuracy of image recognition.

In a possible embodiment, if the multiple first processed images include a target first processed image, performing morphological denoising on each of the multiple lung scan images to obtain the multiple first processed images includes: performing a dilation operation on the lung scan image corresponding to the target first processed image to obtain a first vector; performing an erosion operation on the same lung scan image to obtain a second vector; and merging the first vector with the second vector to obtain the target first processed image.

The dilation operation expands the highlighted part of an image, as if a region were growing, so the resulting image has a larger highlighted area than the original. The erosion operation eats into the highlighted part of the original image, so the resulting image has a smaller highlighted area than the original. Mathematically, both dilation and erosion convolve the image with a kernel, which can be of any shape and size. Taking the target first processed image as an example, the dilation operation and the erosion operation are each performed on its corresponding lung scan image, and the two results are then merged by vector addition to obtain the denoised first processed image. In this way, noise in the lung scan image is removed, which helps improve the efficiency and accuracy of image recognition.
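As a minimal sketch of the two operations, dilation and erosion on a grayscale image can be written as a neighbourhood maximum and a neighbourhood minimum. The 3*3 square kernel, the edge handling (clipping the window at the image border) and the helper names `dilate`, `erode` and `merge` are assumptions for illustration; the application leaves the kernel shape and size open and does not say how the summed values are rescaled.

```python
from typing import Callable, List

Image = List[List[float]]


def _neighbourhood_filter(img: Image, pick: Callable[[List[float]], float]) -> Image:
    """Apply `pick` (max or min) over a 3*3 window clipped at the image border."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [
                img[rr][cc]
                for rr in range(max(0, r - 1), min(h, r + 2))
                for cc in range(max(0, c - 1), min(w, c + 2))
            ]
            row.append(pick(window))
        out.append(row)
    return out


def dilate(img: Image) -> Image:
    """Dilation: bright regions grow (neighbourhood maximum)."""
    return _neighbourhood_filter(img, max)


def erode(img: Image) -> Image:
    """Erosion: bright regions shrink (neighbourhood minimum)."""
    return _neighbourhood_filter(img, min)


def merge(a: Image, b: Image) -> Image:
    """Element-wise (vector) addition of the two results."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]
```

For a single bright pixel in a dark 3*3 image, `dilate` spreads the bright value to every pixel while `erode` removes it entirely, which is the behaviour the paragraph above describes.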

In a possible embodiment, performing morphological denoising on each of the multiple lung scan images to obtain multiple first processed images includes: preprocessing each of the multiple lung scan images to obtain multiple fourth processed images; and performing morphological denoising on each of the multiple fourth processed images to obtain the multiple first processed images.

Preprocessing may include image format conversion, filling of missing image data, mean subtraction, normalization, principal component analysis (PCA), whitening, and so on. In this embodiment, preprocessing the lung scan images into fourth processed images further improves the efficiency and accuracy of image recognition.

This application does not limit the preset size, which can be 512*512*512, with the real aspect ratio maintained as far as possible. Morphological denoising of the lung scan images obtained from multiple scans yields multiple first processed images with noise removed, which helps improve the efficiency and accuracy of image recognition. Pixel normalization is then performed on each first processed image to obtain multiple second processed images whose pixel values are normalized to the range (0, 1), which removes the effect of differing scales and makes the data comparable. Finally, the multiple second processed images are stacked three-dimensionally according to the scan sequence of the multiple lung scan images and the preset size to obtain a three-dimensional target lung scan image. This meets the input requirements of the neural network and helps improve the efficiency and accuracy of image recognition.
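The normalization and stacking steps can be sketched as below. The sketch assumes linear min-max scaling over the stated pixel range of -1024 to 3071 (so the end points map to exactly 0 and 1), clips any value outside that range, and leaves out the resizing to the preset size; the names `normalize_slice` and `stack_slices` are hypothetical.

```python
from typing import List, Tuple

Slice = List[List[float]]

HU_MIN, HU_MAX = -1024.0, 3071.0  # pixel value range stated for the scan images


def normalize_slice(slice_hu: Slice) -> Slice:
    """Min-max scale one planar image from the HU range to the unit interval."""
    scale = HU_MAX - HU_MIN
    return [
        [(min(max(v, HU_MIN), HU_MAX) - HU_MIN) / scale for v in row]
        for row in slice_hu
    ]


def stack_slices(indexed_slices: List[Tuple[int, Slice]]) -> List[Slice]:
    """Order the slices by scan index and stack them into a 3D volume."""
    ordered = sorted(indexed_slices, key=lambda pair: pair[0])
    return [normalize_slice(s) for _, s in ordered]
```

Each element of the returned list is one normalized slice, so the result is indexed as volume[z][y][x] in scan order.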

In this application, the first neural network is used to identify nodule images in the target lung scan image; that is, inputting the target lung scan image to the first neural network yields the first category probability map for nodules and no nodules. The training of the first neural network is completed before step S101 is executed, and this application does not limit the training method. In a possible embodiment, the method further includes: dividing each of a plurality of marked images into regions to obtain a plurality of first images; extracting a second-threshold number of uniform grid images from each of the plurality of first images to obtain a plurality of second images; performing size processing on each of the plurality of second images to obtain a plurality of third images; obtaining the reference nodule position corresponding to each of the plurality of third images according to the nodule marking information included in each of the plurality of marked images; training a first initial neural network according to the plurality of third images and the reference nodule position corresponding to each third image to obtain first network parameters of the first neural network; and obtaining the first neural network according to the first initial neural network and the first network parameters.

In this application, each marked image includes nodule marking information. The marked images are acquired and processed with the aforementioned scanning and processing methods, and each is marked manually. For example, three radiologists, or another designated number of radiologists, agree on the number, location, size or type of the nodules in each image, together with any other nodule marking information.

Each first image includes a plurality of uniform grid images, and the size of each uniform grid image is a first threshold. This application does not limit the first threshold, which can be 16*16*16. That is, each marked image is divided into regions so that every uniform grid image in the resulting first image has the size of the first threshold.

Each second image includes a second-threshold number of uniform grid images. This application does not limit the second threshold, which can be 128. That is, only a specified number of uniform grid images are extracted from each first image, which improves computational efficiency.
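The region division and extraction can be illustrated with grid origins rather than pixel data. The sketch assumes non-overlapping cells and uniform random selection of the 128 cells; the application only says that a specified number of uniform grid images is extracted and does not state how they are chosen. The names `grid_origins` and `sample_cells` are hypothetical.

```python
import random
from itertools import product
from typing import List, Sequence, Tuple

Origin = Tuple[int, ...]


def grid_origins(shape: Sequence[int], cell: int = 16) -> List[Origin]:
    """Corner coordinates of every non-overlapping cell*cell*cell grid image."""
    return list(product(*(range(0, dim - cell + 1, cell) for dim in shape)))


def sample_cells(origins: List[Origin], count: int = 128, seed: int = 0) -> List[Origin]:
    """Keep at most `count` grid images (assumed here to be picked at random)."""
    if len(origins) <= count:
        return list(origins)
    return random.Random(seed).sample(origins, count)
```

With the example sizes above, a 512*512*512 volume splits into 32 cells per axis, of which only 128 are kept per image.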

The first initial neural network is the first neural network before its network parameters are defined, and the size of each third image meets the input size defined by the first initial neural network. This application does not limit the size processing applied to the second images: the images may be zero-padded; uniform grid images with nodules may be copied to maintain class balance; or 3D convolution and pooling may be used, with a 1*1*1 convolution replacing the global average pooling operation, to obtain images that meet the training image size. This application does not limit the size of the third image, which can be 32*32*32. The smaller input size improves computational efficiency.

As mentioned above, each marked image includes nodule marking information, and each third image is a processed image corresponding to a marked image. The reference nodule position corresponding to a third image, that is, the reference nodule position of the image to be trained, can therefore be obtained from the nodule marking information of the marked image.

In a possible embodiment, performing size processing on each of the plurality of second images to obtain a plurality of third images includes: extracting the uniform grid images with nodules from the plurality of second images to obtain a plurality of fourth images; and copying the fourth images of each of the plurality of second images to obtain the plurality of third images.

The fourth image is a uniform grid image with nodules. This application does not limit the method for extracting uniform grid images with nodules. In a possible embodiment, if the plurality of second images include a target second image, and the target second image corresponds to a plurality of target second uniform grid images, the method further includes: dividing the plurality of target second uniform grid images to obtain a plurality of uniform grid image sets; superimposing the nodule probabilities corresponding to each of the plurality of uniform grid image sets to obtain a plurality of superimposed values; averaging the superimposed value corresponding to each uniform grid image set to obtain a plurality of average values; and extracting the uniform grid images in the uniform grid image sets whose average value is greater than a third threshold to obtain the plurality of fourth images.

The uniform grid image sets may be divided arbitrarily; for example, every 10 uniform grid images scanned may form one group. This application does not limit the third threshold, which can be 0.5. The plurality of target second uniform grid images are grouped into uniform grid image sets, the nodule probabilities within each set are superimposed to obtain a superimposed value, and each superimposed value is averaged to obtain an average value. If an average value is greater than the third threshold, every uniform grid image in the corresponding uniform grid image set is determined to contain a nodule. Deciding at the level of the image set whether nodules are present improves the efficiency of extracting the fourth images.
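The grouping and thresholding just described amounts to a group mean followed by a comparison. The sketch below uses the example values from the text (groups of 10, third threshold 0.5) and assumes the groups are consecutive runs in scan order; the function name `select_nodule_cells` is hypothetical.

```python
from typing import List, Sequence


def select_nodule_cells(
    probabilities: Sequence[float],
    group_size: int = 10,
    threshold: float = 0.5,
) -> List[int]:
    """Return indices of grid images whose group has a mean nodule probability above the threshold."""
    selected: List[int] = []
    for start in range(0, len(probabilities), group_size):
        group = probabilities[start:start + group_size]
        superimposed = sum(group)          # superimposed value of the set
        mean = superimposed / len(group)   # average value of the set
        if mean > threshold:
            # Every grid image in a qualifying set is treated as containing a nodule.
            selected.extend(range(start, start + len(group)))
    return selected
```

Note that the decision is made per set, so a low-probability grid image inside a high-average set is still extracted; that is the trade-off the text accepts in exchange for extraction speed.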

This application does not limit the training process of the first initial neural network. Batch gradient descent (BGD), stochastic gradient descent (SGD), mini-batch gradient descent (mini-batch SGD) or a similar algorithm can be used for training. One training cycle consists of a single forward pass and one backward propagation of gradients: the image to be trained is input forward into the neural network to be trained to obtain an output target object; if the target object fails to match the reference object, a loss function is obtained from the target object and the reference object and propagated backward through the neural network to adjust its network parameters, such as weights and biases. The next image to be trained is then input, until the match succeeds or all images have been trained. In the training of the first neural network, the reference object is the reference nodule position and the target object is the target nodule position.
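To make the forward pass and backward update concrete, here is a deliberately tiny stand-in: stochastic gradient descent on a one-input logistic model with a binary cross-entropy loss. The model, the loss and the name `train_sgd` are assumptions chosen to keep the example short; they are not the networks of this application.

```python
import math
from typing import List, Tuple


def sigmoid(z: float) -> float:
    return 1.0 / (1.0 + math.exp(-z))


def train_sgd(
    samples: List[Tuple[float, int]],
    lr: float = 0.1,
    epochs: int = 200,
) -> Tuple[float, float]:
    """One parameter update per sample: forward pass, loss gradient, backward update."""
    w, b = 0.0, 0.0
    for _ in range(epochs):
        for x, y in samples:
            p = sigmoid(w * x + b)   # forward pass: predicted probability
            grad = p - y             # gradient of binary cross-entropy w.r.t. the logit
            w -= lr * grad * x       # backward step: adjust the weight
            b -= lr * grad           # backward step: adjust the bias
    return w, b
```

Batch and mini-batch gradient descent differ only in how many samples contribute to the gradient before each update.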

In a possible embodiment, training the first initial neural network according to the plurality of third images and the reference nodule position corresponding to each third image to obtain the first network parameters of the first neural network includes: dividing the plurality of third images according to a preset ratio to obtain a plurality of first training images and a plurality of first verification images; classifying with the first initial neural network according to the plurality of first training images and the reference nodule position corresponding to each first training image to obtain to-be-verified network parameters of the first neural network; and verifying the to-be-verified network parameters according to the plurality of first verification images to obtain the first network parameters.

This application does not limit the preset ratio, which can be 7:3. This application does not limit the classification algorithm either: logistic regression or a decision tree algorithm can be used to classify the image features and reference nodule positions corresponding to the plurality of first training images, so as to obtain the to-be-verified network parameters of the first neural network.
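A 7:3 division into training and verification images can be sketched as follows. Shuffling before the split and the fixed seed are assumptions made so the example is reproducible; the application only specifies the ratio. The name `split_by_ratio` is hypothetical.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def split_by_ratio(
    items: Sequence[T],
    train_parts: int = 7,
    val_parts: int = 3,
    seed: int = 0,
) -> Tuple[List[T], List[T]]:
    """Shuffle, then split into training and verification subsets by the given ratio."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    cut = len(shuffled) * train_parts // (train_parts + val_parts)
    return shuffled[:cut], shuffled[cut:]
```

The two subsets are disjoint and together cover every input image, so no third image is used for both training and verification.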

In the verification process, the neural network carrying the to-be-verified network parameters is trained further on the plurality of first verification images to obtain the first network parameters of the first neural network. For details, refer to the training cycle described above, which is not repeated here. A test image can then be input, that is, S101 is executed.

To summarize, the plurality of third images are divided according to a preset ratio into a plurality of first training images and a plurality of first verification images; the first initial neural network is then trained on the first training images to obtain the to-be-verified network parameters of the first neural network; and finally the to-be-verified network parameters are verified on the first verification images to obtain the first network parameters. Training and verifying with the batch gradient descent algorithm in this way improves the training speed of the first neural network.

This application does not limit the training parameters of the first initial neural network either. For example, training may use a mini-batch size of 24 for 10,000 iterations, a learning rate of 0.01 and a weight decay of 0.0001, with the Adam optimizer at its default parameters (β₁ = 0.9, β₂ = 0.999).
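For reference, a single Adam update for one scalar parameter with these example hyperparameters looks as follows. The sketch assumes that weight decay is applied as an L2 term added to the gradient and that ε is 1e-8; neither detail is stated in the application, and the name `adam_step` is hypothetical.

```python
import math
from typing import Tuple


def adam_step(
    param: float,
    grad: float,
    m: float,
    v: float,
    t: int,
    lr: float = 0.01,
    beta1: float = 0.9,
    beta2: float = 0.999,
    eps: float = 1e-8,
    weight_decay: float = 0.0001,
) -> Tuple[float, float, float]:
    """Return the updated (param, m, v) after step t (t starts at 1)."""
    grad = grad + weight_decay * param         # assumed L2-style weight decay
    m = beta1 * m + (1 - beta1) * grad         # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad * grad  # second-moment estimate
    m_hat = m / (1 - beta1 ** t)               # bias correction
    v_hat = v / (1 - beta2 ** t)
    param = param - lr * m_hat / (math.sqrt(v_hat) + eps)
    return param, m, v
```

On the first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's magnitude.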

In a possible embodiment, the rectified linear unit (ReLU) function is used as the activation function.

The ReLU function is expressed as f(x) = max(0, x). Used as the activation function, ReLU strengthens the non-linear characteristics of the decision function and of the entire neural network without changing the convolutional layers themselves.

In a possible embodiment, a weighted cross entropy function is used as the loss function. This counteracts strong class imbalance: the loss is balanced with weights computed for each batch, so that more weight is applied to the under-represented category.
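One common way to realize such a loss is sketched below. The inverse-frequency weighting computed per batch and the normalization by the sum of weights are assumptions; the application does not give the weighting formula. The names `inverse_frequency_weights` and `weighted_cross_entropy` are hypothetical.

```python
import math
from typing import List, Sequence


def inverse_frequency_weights(labels: Sequence[int], num_classes: int) -> List[float]:
    """Per-batch class weights: rarer classes in the batch receive larger weights."""
    counts = [0] * num_classes
    for y in labels:
        counts[y] += 1
    n = len(labels)
    return [n / (num_classes * c) if c else 0.0 for c in counts]


def weighted_cross_entropy(
    probs: Sequence[Sequence[float]],
    labels: Sequence[int],
    class_weights: Sequence[float],
) -> float:
    """Weighted mean of -log(p[true class]) over the batch."""
    total, weight_sum = 0.0, 0.0
    for p, y in zip(probs, labels):
        w = class_weights[y]
        total += w * -math.log(max(p[y], 1e-12))  # clamp to avoid log(0)
        weight_sum += w
    return total / weight_sum
```

In a batch where most grid images contain no nodule, a misclassified nodule sample then contributes more to the loss than a misclassified background sample.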

This application does not limit the form of the first category probability map, which may be a density histogram describing the nodule probability of each uniform grid image.

To summarize, each marked image is first divided into regions to obtain multiple first images with the same grid size, and a specified number of uniform grid images are then extracted to obtain multiple second images, which improves computational efficiency. To meet the input requirements, the multiple second images are further subjected to size processing to obtain multiple third images. The reference nodule position corresponding to each third image is then obtained according to the nodule marking information of each marked image. Finally, the first initial neural network is trained according to the multiple third images and the reference nodule position corresponding to each third image to obtain the first network parameters of the first neural network, and the first neural network is obtained according to the first initial neural network and the first network parameters. In this way, the training speed of the first neural network is improved.

S102: Input the first category probability map to the second neural network to obtain a second category probability map for benign nodules, malignant nodules and no nodules.

In this application, the second neural network is used to identify the nodule type of the nodule images, that is, to further identify the nodule type of the nodule images in the first category probability map. Inputting the first category probability map to the second neural network yields the second category probability map for benign nodules, malignant nodules and no nodules. Because the first category probability map is input directly to the second neural network, the time otherwise spent identifying regions without nodules is saved, which improves recognition efficiency.

This application does not limit the form of the second category probability map, which may be a density histogram describing the nodule type probability of each uniform grid image.

This application does not limit the labeling method for the target nodule type. All nodules of patients with cancer can be marked as malignant, and all nodules of non-cancer patients can be marked as benign. The cancer diagnosis window is 1 year; that is, all nodules in scans of patients diagnosed with cancer within 1 year are marked as malignant.

The second neural network is trained before step S102 is performed. Its training can follow the training method of the first neural network, which is not repeated here, with the reference nodule type as the reference object and the target nodule type as the target object.

This application also does not limit the training parameters of the second neural network. For example, the training phase performs 20,000 iterations with a learning rate of 0.01, and the verification phase performs 30,000 iterations with a learning rate of 0.001.

S103: Extract nodule units in the scan image of the target lung according to the first category probability map to obtain multiple nodule units.

In this application, a nodule unit is a unit identified as a nodule in the first category probability map. If a uniform grid image intersects the bounding box of a nodule, that uniform grid image can be determined to be a nodule unit.
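The intersection test can be sketched as an axis-aligned box overlap check. The sketch assumes that grid images are cubes identified by their corner coordinates and that a bounding box is given as a pair of minimum and maximum corners with an exclusive upper bound; the names `intersects` and `nodule_units` are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[int, int, int]
Box = Tuple[Point, Point]  # (minimum corner, maximum corner)


def intersects(origin: Point, cell: int, box_min: Point, box_max: Point) -> bool:
    """True if the cube starting at `origin` overlaps the box on every axis."""
    return all(
        o < hi and o + cell > lo
        for o, lo, hi in zip(origin, box_min, box_max)
    )


def nodule_units(origins: Sequence[Point], cell: int, boxes: Sequence[Box]) -> List[Point]:
    """Grid images that intersect at least one nodule bounding box."""
    return [
        o for o in origins
        if any(intersects(o, cell, lo, hi) for lo, hi in boxes)
    ]
```

A nodule that straddles a cell boundary therefore yields more than one nodule unit, one for each grid image it touches.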

S104: Input each of the multiple nodule units to a third neural network to obtain a third category probability map for the nodule type of each of the multiple nodule units, where the nodule types include benign nodules and malignant nodules.

In this application, the third neural network is used to identify the nodule type of each nodule unit separately, that is, to further identify the nodule type of each nodule unit corresponding to the first category probability map. When the multiple nodule units are input to the third neural network, the probability that each nodule unit is a benign nodule or a malignant nodule can be determined. Directly inputting the multiple nodule units extracted according to the first category probability map to the third neural network improves the accuracy of identifying the nodule type.

This application does not limit the form of the third category probability map, which may be a density histogram describing the nodule type probability of each nodule unit.

In a possible embodiment, the marking information of each first image in the first image set further includes a target nodule type, and the method further includes: performing data enhancement on each of the plurality of fourth images to obtain a plurality of fifth images; obtaining the reference nodule type corresponding to each of the plurality of fifth images according to the nodule marking information included in each of the plurality of marked images; and training a second initial neural network according to the plurality of fifth images and the reference nodule type corresponding to each fifth image to obtain second network parameters of the third neural network.

This application does not limit the method of data enhancement, which may include volume enhancement, rotation, mean subtraction, enlargement, reduction and so on. In a possible embodiment, if the plurality of fourth images include a target fourth image, performing data enhancement on each of the plurality of fourth images to obtain a plurality of fifth images includes: rotating the mask corresponding to the target fourth image by a first angle to obtain a first sub-processed image; subtracting the average value from the first sub-processed image to obtain a second sub-processed image; scaling the width of the mask corresponding to the second sub-processed image by a first multiple to obtain a third sub-processed image; scaling the length of the mask corresponding to the third sub-processed image by a second multiple to obtain a fourth sub-processed image; scaling the fourth sub-processed image by a third multiple to obtain a fifth sub-processed image; and mirror-flipping the mask of the fifth sub-processed image according to a second angle to obtain the fifth image corresponding to the target fourth image.

This application does not limit the first angle, the first multiple, the second multiple, the third multiple or the second angle. The first angle may be less than or equal to 270 degrees, the first multiple may be 0.9 or 1.1, the second multiple may be 0.9 or 1.1, the third multiple may be 0.8 or 1.2, and the second angle may be less than or equal to 270 degrees. An object can be rotated by setting its rotation property to a number between 0 and 360, in degrees, which represents the amount of rotation applied to the object.
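A reduced sketch of the enhancement chain on a planar mask is given below. It covers only rotation, mean subtraction and mirror flipping; the width, length and overall scaling steps are omitted because they need an interpolation scheme that the application does not specify. Restricting rotation to multiples of 90 degrees (which stays within the 270-degree example bound) and flipping left to right are further assumptions, and the function names are hypothetical.

```python
from typing import List

Image = List[List[float]]


def rotate90(img: Image, turns: int = 1) -> Image:
    """Rotate clockwise by `turns` quarter turns."""
    out = [list(row) for row in img]
    for _ in range(turns % 4):
        out = [list(row) for row in zip(*out[::-1])]
    return out


def subtract_mean(img: Image) -> Image:
    """Subtract the average pixel value from every pixel."""
    flat = [v for row in img for v in row]
    mean = sum(flat) / len(flat)
    return [[v - mean for v in row] for row in img]


def mirror(img: Image) -> Image:
    """Flip left to right."""
    return [row[::-1] for row in img]


def augment(img: Image, turns: int, flip: bool) -> Image:
    """Rotate, subtract the mean, then optionally mirror."""
    out = subtract_mean(rotate90(img, turns))
    return mirror(out) if flip else out
```

Applying several combinations of `turns` and `flip` to one nodule image yields several training images, which is how data enhancement enlarges the set of fifth images.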

It can be understood that, in this embodiment, taking the target third image as an example, the above-mentioned processing steps are performed before any third image in the multiple third images is used for training; that is, the target third image undergoes rotation, average value subtraction, size processing, and mirror inversion processing, so that the fifth image corresponding to the target third image has undergone data enhancement. In this way, the clarity of the image is improved, which helps improve the recognition efficiency of the second neural network.

In this application, the second initial neural network is the third neural network without defined network parameters. The training method of the third neural network can refer to the training method of the first neural network, wherein the reference object is the reference nodule type and the target object is the target nodule type. That is, multiple fifth images are input to the neural network to be trained or verified to obtain the target nodule type in each fifth image. If the target nodule type fails to match the previously marked reference nodule type, a loss function is obtained from the target nodule type and the reference nodule type, and the network parameters of the neural network are updated according to the loss function.
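As a non-limiting illustration of comparing the target nodule type with the reference nodule type, the following Python sketch computes a loss and its gradient for one fifth image. This application does not name a specific loss function; softmax cross-entropy is assumed here, and the names `NODULE_TYPES`, `softmax`, `cross_entropy` and `loss_gradient` are hypothetical.

```python
import math

NODULE_TYPES = ("benign", "malignant")


def softmax(logits):
    """Convert raw network outputs into class probabilities."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, reference_index):
    """Loss is small when the reference type receives high probability."""
    return -math.log(softmax(logits)[reference_index])


def loss_gradient(logits, reference_index):
    """Gradient of the cross-entropy loss with respect to the logits."""
    probs = softmax(logits)
    return [p - (1.0 if i == reference_index else 0.0)
            for i, p in enumerate(probs)]


# Example: the network favours "benign" but the marked type is "malignant".
logits = [2.0, 0.5]
reference = NODULE_TYPES.index("malignant")
target = max(range(len(logits)), key=logits.__getitem__)
if target != reference:
    loss = cross_entropy(logits, reference)
    gradient = loss_gradient(logits, reference)  # used to update parameters
```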

This application also does not limit the training parameters of the third neural network. For example, the batch size is 32, the Adam optimizer is run for 6000 iterations, the learning rate is 0.01, and the weight decay is 0.0001.
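As a non-limiting illustration, the example training parameters above may be collected in a configuration object and applied in a scalar Adam update, as in the following Python sketch. The values of `beta1`, `beta2` and `eps` are the commonly used Adam defaults and are assumptions, as is the treatment of the weight decay as an L2 term added to the gradient; this application does not specify them.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    batch_size: int = 32          # from the example in the text
    iterations: int = 6000        # from the example in the text
    learning_rate: float = 0.01   # from the example in the text
    weight_decay: float = 0.0001  # from the example in the text
    beta1: float = 0.9            # assumed Adam default
    beta2: float = 0.999          # assumed Adam default
    eps: float = 1e-8             # assumed Adam default


def adam_step(w, grad, m, v, t, cfg):
    """One Adam update for a single scalar parameter; t starts at 1."""
    g = grad + cfg.weight_decay * w            # L2-style weight decay
    m = cfg.beta1 * m + (1.0 - cfg.beta1) * g
    v = cfg.beta2 * v + (1.0 - cfg.beta2) * g * g
    m_hat = m / (1.0 - cfg.beta1 ** t)         # bias correction
    v_hat = v / (1.0 - cfg.beta2 ** t)
    w = w - cfg.learning_rate * m_hat / (math.sqrt(v_hat) + cfg.eps)
    return w, m, v
```

In an actual training run, `grad` would be the gradient of the loss averaged over one batch of 32 fifth images.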

It can be understood that the uniform grid images with nodules in the multiple second images are first extracted to obtain multiple fourth images, that is, only the nodule units are extracted. Then, data enhancement is performed on each fourth image among the multiple fourth images to obtain multiple fifth images, which can improve the data processing efficiency. Then, according to the nodule marking information included in each marked image in the multiple marked images, the reference nodule type corresponding to each fifth image in the multiple fifth images is obtained. Finally, according to the multiple fifth images and the reference nodule type corresponding to each fifth image in the multiple fifth images, the second initial neural network is trained to obtain the second network parameters of the third neural network, the second initial neural network being the third neural network without defined network parameters. In this way, the training efficiency of the third neural network is improved.

It should be noted that the training images of the third neural network may be a batch of images different from the training images of the first neural network, and the processing performed before training can refer to the processing of the training images of the first neural network, which will not be repeated here.

S105: Input the second category probability map and the third category probability map to a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image.

In this application, the fourth neural network is used to classify the second category probability map and the third category probability map. That is to say, the globally recognized nodule types obtained by the second neural network and the locally recognized nodule types obtained by the third neural network are classified to obtain the lung cancer probability of the target patient corresponding to the target lung scan image. In other words, when the second category probability map and the third category probability map are input to the fourth neural network, the probability that the target patient corresponding to the target lung scan image has lung cancer can be determined. It can be understood that the lung cancer probability is determined from both the locally recognized and the globally recognized nodule types, which further improves the accuracy of recognizing lung cancer.
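As a non-limiting illustration, the data flow between the four neural networks may be sketched in Python as follows. Each network is represented by an arbitrary callable; the function name, the parameter names and the toy stand-in callables in the usage example are hypothetical and do not represent trained networks.

```python
def lung_cancer_probability(scan, first_net, second_net,
                            extract_units, third_net, fourth_net):
    """Chain the four networks in the order described in the text."""
    first_map = first_net(scan)                 # nodule / no nodule
    second_map = second_net(first_map)          # global nodule types
    units = extract_units(scan, first_map)      # nodule units
    third_maps = [third_net(u) for u in units]  # local type per unit
    return fourth_net(second_map, third_maps)   # lung cancer probability


# Toy stand-ins so the sketch can be executed (not real networks).
scan = [0.1, 0.9, 0.8]
p = lung_cancer_probability(
    scan,
    first_net=lambda s: [v > 0.5 for v in s],
    second_net=lambda m: sum(m) / len(m),
    extract_units=lambda s, m: [v for v, hit in zip(s, m) if hit],
    third_net=lambda unit: unit,
    fourth_net=lambda g, local: (g + sum(local) / len(local)) / 2.0,
)
```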

In this application, the training method of the fourth neural network can refer to the training method of the first neural network, wherein the reference object is the reference lung cancer probability and the target object is the target lung cancer probability. This application also does not limit the training parameters of the fourth neural network. For example, all data is used as one batch, the Adam optimizer is run for 2000 iterations, and the weight decay is 0.0001.

In a possible embodiment, inputting the second category probability map and the third category probability map to a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image includes: performing data enhancement on the second category probability map and the third category probability map to obtain a target second category probability map and a target third category probability map; and inputting the target second category probability map and the target third category probability map to the fourth neural network to obtain the lung cancer probability.

Here, the data enhancement may be volume transposition or cropping, and may also refer to the data enhancement operation used for the third neural network, which is not limited here. It can be understood that the data enhancement operation improves the clarity of the image, which helps improve the recognition efficiency of the fourth neural network.

In a possible embodiment, inputting the second category probability map and the third category probability map to a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image includes: performing feature weighting on the second category probability map and the third category probability map to obtain a fourth category probability map for the nodule type of each nodule unit in the plurality of nodule units; and inputting the fourth category probability map to the fourth neural network to obtain the lung cancer probability.

This application does not limit the form of the fourth category probability map, which may be a density histogram describing the nodule type probability of each uniform grid image.

This application can compute, for the second neural network and the third neural network, the number, minimum, maximum, average, standard deviation, and integration of all maximum outputs in the second category probability map and the third category probability map, and then perform feature weighting according to the weights of the two networks.
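As a non-limiting illustration, the statistics listed above may be computed and weighted as in the following Python sketch. Each probability map is assumed to be a list of per-cell class probability lists, the "integration" is interpreted as a plain sum, the standard deviation is taken as the population standard deviation, and the equal default weights are placeholders; all of these are assumptions, and the function names are hypothetical.

```python
import statistics


def summarize_max_outputs(prob_map):
    """Statistics over the maximum class probability of every cell."""
    maxima = [max(cell) for cell in prob_map]
    return {
        "count": len(maxima),
        "min": min(maxima),
        "max": max(maxima),
        "mean": statistics.mean(maxima),
        "std": statistics.pstdev(maxima),
        "sum": sum(maxima),  # assumed meaning of "integration"
    }


def weighted_features(second_map, third_map,
                      second_weight=0.5, third_weight=0.5):
    """Combine the global and local statistics with per-network weights."""
    g = summarize_max_outputs(second_map)
    loc = summarize_max_outputs(third_map)
    return {k: second_weight * g[k] + third_weight * loc[k] for k in g}


second_map = [[0.1, 0.7, 0.2], [0.3, 0.3, 0.4]]  # benign / malignant / none
third_map = [[0.9, 0.1]]                         # benign / malignant
features = weighted_features(second_map, third_map)
```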

It can be understood that the locally and globally determined recognition results for the nodule types in the target lung scan image are first feature-weighted to obtain the fourth category probability map, and the lung cancer probability is then determined from the nodule type of each nodule in the fourth category probability map, which improves the accuracy of identifying lung cancer.

In the image recognition method shown in FIG. 1, the nodule images in the lung scan image are first recognized, and the lung cancer probability is then determined from the locally recognized nodule types and the globally recognized nodule types, which improves the accuracy of image recognition of lung cancer lesions.

Consistent with the embodiment in FIG. 1, please refer to FIG. 2. FIG. 2 is a schematic structural diagram of an image recognition device provided by an embodiment of the present application, and the device is applied to electronic equipment. As shown in FIG. 2, the above-mentioned image recognition device 200 includes:

The first processing unit 201 is configured to input a target lung scan image to a first neural network to obtain a first category probability map for nodules and no nodules, and the first neural network is used to identify the nodules in the target lung scan image;

The second processing unit 202 is configured to input the first category probability map to a second neural network to obtain a second category probability map for benign nodules, malignant nodules and no nodules, and the second neural network is used to identify the nodule type of the nodule image in the first category probability map;

The third processing unit 203 is configured to extract nodule units in the target lung scan image according to the first category probability map to obtain a plurality of nodule units; and to input each nodule unit of the plurality of nodule units to the third neural network to obtain a third category probability map for the nodule type of each nodule unit in the plurality of nodule units, where the nodule types include benign nodules and malignant nodules, and the third neural network is used to identify the nodule type of each nodule unit in the plurality of nodule units;

The fourth processing unit 204 is configured to input the second category probability map and the third category probability map to a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image, and the fourth neural network is used to classify the second category probability map and the third category probability map.

It can be understood that the image recognition device first recognizes the nodule images in the lung scan image, and then determines the lung cancer probability from the locally recognized nodule types and the globally recognized nodule types, which improves the accuracy of image recognition of lung cancer lesions.

In a possible example, the device 200 further includes:

The preprocessing unit 205 is configured to obtain multiple lung scan images to be recognized; perform morphological denoising on each lung scan image in the multiple lung scan images to obtain multiple first processed images; perform pixel normalization processing on each first processed image in the multiple first processed images to obtain multiple second processed images; and three-dimensionally stack the multiple second processed images according to the scan sequence and preset size of the multiple lung scan images to obtain the target lung scan image.
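As a non-limiting illustration, the pixel normalization and three-dimensional stacking performed by the preprocessing unit 205 may be sketched in Python as follows. The morphological denoising step is omitted. The intensity window of -1000 to 400, the interpretation of the preset size as a fixed number of slices with zero padding or truncation, and all function names are assumptions made for illustration.

```python
def normalize_pixels(slice_2d, low=-1000.0, high=400.0):
    """Clip each pixel to [low, high] and rescale it to [0, 1]."""
    span = high - low
    return [[(min(max(v, low), high) - low) / span for v in row]
            for row in slice_2d]


def stack_slices(indexed_slices, depth):
    """Order slices by scan index and pad or truncate to `depth` slices."""
    ordered = [s for _, s in sorted(indexed_slices, key=lambda p: p[0])]
    if not ordered:
        raise ValueError("at least one slice is required")
    rows, cols = len(ordered[0]), len(ordered[0][0])
    volume = ordered[:depth]
    while len(volume) < depth:
        volume.append([[0.0] * cols for _ in range(rows)])  # zero padding
    return volume


# Two 1 x 2 slices given out of order, stacked into a 3-slice volume.
raw = [(2, [[400.0, 2000.0]]), (1, [[-1000.0, -300.0]])]
normalized = [(index, normalize_pixels(s)) for index, s in raw]
volume = stack_slices(normalized, depth=3)
```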

In a possible example, the preprocessing unit 205 is further configured to divide each marked image in the multiple marked images into regions to obtain multiple first images, where each first image includes multiple uniform grid images, the size of each uniform grid image is a first threshold, and each marked image includes nodule marking information; extract a second threshold number of the uniform grid images from each first image in the multiple first images to obtain multiple second images; perform size processing on each second image in the multiple second images to obtain multiple third images, where the size of each third image meets the input size defined by the first initial neural network, and the first initial neural network is the first neural network without defined network parameters; and acquire, according to the nodule marking information included in each marked image in the multiple marked images, the reference nodule position corresponding to each third image in the multiple third images; the device 200 further includes:

The training unit 206 is configured to train the first initial neural network according to the multiple third images and the reference nodule position corresponding to each third image in the multiple third images to obtain the first network parameters of the first neural network; and to obtain the first neural network according to the first initial neural network and the first network parameters.
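As a non-limiting illustration of the region division performed by the preprocessing unit 205, the following Python sketch splits a two-dimensional marked image into uniform grid images whose side length plays the role of the first threshold. The function name is hypothetical, and discarding incomplete cells at the image border is an assumption.

```python
def split_into_grid(image, cell):
    """Split a 2D list into non-overlapping cell x cell grid images."""
    rows, cols = len(image), len(image[0])
    cells = []
    for r in range(0, rows - cell + 1, cell):
        for c in range(0, cols - cell + 1, cell):
            cells.append([row[c:c + cell] for row in image[r:r + cell]])
    return cells


image = [[r * 4 + c for c in range(4)] for r in range(4)]
grid_images = split_into_grid(image, cell=2)
```

For the 4 x 4 example and a cell size of 2, four uniform grid images are produced in row-major order.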

In a possible example, in the aspect of performing size processing on each second image of the plurality of second images to obtain a plurality of third images, the preprocessing unit 205 is specifically configured to extract the uniform grid images with nodules in the plurality of second images to obtain multiple fourth images; and perform copy processing on the fourth image of each second image in the multiple second images to obtain the multiple third images.

In a possible example, if the multiple second images include a target second image, and the target second image corresponds to multiple target second uniform grid images, the preprocessing unit 205 is specifically configured to divide the multiple target second uniform grid images to obtain multiple uniform grid image sets; superimpose the nodule probabilities corresponding to each uniform grid image set in the multiple uniform grid image sets to obtain multiple superimposed values; average the superimposed value corresponding to each uniform grid image set in the multiple uniform grid image sets to obtain multiple average values; and extract the uniform grid images in the uniform grid image sets whose average value among the multiple average values is greater than the third threshold, to obtain the multiple fourth images.
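As a non-limiting illustration, the superposition, averaging and threshold steps above may be sketched in Python as follows. The sketch assumes that each uniform grid image already has a nodule probability, and that the sets are formed from consecutive grid images; the grouping rule and the function name are assumptions made for illustration.

```python
def select_nodule_cells(cell_probs, set_size, third_threshold):
    """Return indices of grid images whose set average exceeds the threshold."""
    selected = []
    for start in range(0, len(cell_probs), set_size):
        group = cell_probs[start:start + set_size]  # one grid image set
        superimposed = sum(group)                   # superposition
        average = superimposed / len(group)         # averaging
        if average > third_threshold:
            selected.extend(range(start, start + len(group)))
    return selected


probs = [0.9, 0.8, 0.1, 0.2, 0.7, 0.6]
fourth_image_indices = select_nodule_cells(probs, set_size=2,
                                           third_threshold=0.5)
```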

In a possible example, the preprocessing unit 205 is further configured to perform data enhancement on each fourth image in the multiple fourth images to obtain multiple fifth images; obtain, according to the nodule marking information included in each marked image in the multiple marked images, the reference nodule type corresponding to each fifth image in the multiple fifth images; and train a second initial neural network according to the multiple fifth images and the reference nodule type corresponding to each fifth image in the multiple fifth images to obtain the second network parameters of the third neural network, where the second initial neural network is the third neural network without defined network parameters.

In a possible example, the marking information of each first image in the first image set further includes a target nodule type, and the training unit is further configured to obtain, according to the nodule marking information included in each of the multiple marked images, the reference nodule type corresponding to each fifth image in the multiple fifth images; and to train the second initial neural network according to the multiple fifth images and the reference nodule type corresponding to each fifth image in the multiple fifth images, to obtain the second network parameters of the third neural network.

In a possible example, if the multiple fourth images include a target fourth image, the preprocessing unit is specifically configured to perform rotation processing on the mask corresponding to the target fourth image according to the first angle to obtain the first sub-processed image; subtract the average value from the first sub-processed image to obtain the second sub-processed image; perform size processing on the width of the mask corresponding to the second sub-processed image according to the first multiple to obtain the third sub-processed image; perform size processing on the length of the mask corresponding to the third sub-processed image according to the second multiple to obtain the fourth sub-processed image; perform size processing on the fourth sub-processed image according to the third multiple to obtain the fifth sub-processed image; and perform mirror inversion processing on the mask of the fifth sub-processed image according to the second angle to obtain the fifth image corresponding to the target fourth image.

In a possible example, the fourth processing unit 204 is specifically configured to perform feature weighting on the second category probability map and the third category probability map to obtain a fourth category probability map for the nodule type of each nodule unit in the multiple nodule units; and input the fourth category probability map to the fourth neural network to obtain the lung cancer probability.

Consistent with the embodiment of FIG. 1, please refer to FIG. 3. FIG. 3 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 3, the electronic device 300 includes a processor 310, a memory 320, a communication interface 330, and one or more programs 340. The one or more programs 340 are stored in the memory 320 and are configured to be executed by the processor 310, and the program 340 includes instructions for executing the following steps:

It can be understood that the electronic device first recognizes the nodule image of the lung scan image, and then determines the probability of lung cancer through the locally recognized nodule type and the globally recognized nodule type, which improves the accuracy of image recognition of lung cancer lesions.

In a possible example, the program 340 is also used to execute the instructions of the following steps:

Acquire multiple lung scan images to be recognized;

Performing morphological denoising on each lung scan image in the plurality of lung scan images to obtain a plurality of first processed images;

Performing pixel normalization processing on each first processed image in the plurality of first processed images to obtain a plurality of second processed images;

According to the scanning sequence and the preset size of the multiple lung scan images, the multiple second processed images are stereoscopically stacked to obtain the target lung scan image.

Each marked image in the multiple marked images is divided into regions to obtain multiple first images, where each first image includes multiple uniform grid images, the size of each uniform grid image is the first threshold, and each marked image includes nodule marking information;

Extracting a second threshold number of the uniform grid images from each of the plurality of first images to obtain a plurality of second images;

Perform size processing on each second image in the plurality of second images to obtain a plurality of third images, where the size of each third image meets the input size defined by the first initial neural network, and the first initial neural network is the first neural network without defined network parameters;

Acquiring a reference nodule position corresponding to each third image in the plurality of third images according to nodule marking information included in each of the plurality of marked images;

According to the multiple third images and the reference nodule position corresponding to each third image in the multiple third images, the first initial neural network is trained to obtain the first network parameters of the first neural network;

Acquiring the first neural network according to the first initial neural network and the first network parameters.

In a possible example, the program 340 is specifically used to execute instructions of the following steps:

Extracting uniform grid images with nodules in the plurality of second images to obtain a plurality of fourth images;

Copy processing is performed on the fourth image of each second image in the plurality of second images to obtain the plurality of third images.

In a possible example, if the multiple second images include a target second image, and the target second image corresponds to multiple target second uniform grid images, the program 340 is specifically configured to execute the instructions of the following steps:

Dividing the multiple target second uniform grid images to obtain multiple uniform grid image sets;

Performing a superposition operation on the nodule probabilities corresponding to each uniform grid image set in the plurality of uniform grid image sets to obtain a plurality of superimposed values;

Performing an averaging operation on the superimposed value corresponding to each uniform grid image set in the plurality of uniform grid image sets to obtain multiple average values;

Extracting the uniform grid image in the uniform grid image set corresponding to the average value of the plurality of average values greater than the third threshold to obtain the plurality of fourth images.

Performing data enhancement on each fourth image in the plurality of fourth images to obtain a plurality of fifth images;

Obtaining a reference nodule type corresponding to each fifth image in the plurality of fifth images according to the nodule marking information included in each of the plurality of marked images;

According to the multiple fifth images and the reference nodule type corresponding to each fifth image in the multiple fifth images, the second initial neural network is trained to obtain the second network parameters of the third neural network, where the second initial neural network is the third neural network without defined network parameters.

In a possible example, the marking information of each first image in the first image set further includes the target nodule type, and the program 340 is further used to execute instructions of the following steps:

In a possible example, if the multiple fourth images include the target fourth image, the program 340 is further used to execute the instructions of the following steps:

Performing rotation processing on the mask corresponding to the target fourth image according to the first angle to obtain the first sub-processed image;

Performing average value subtraction processing on the first sub-processed image to obtain a second sub-processed image;

According to the first multiple, size processing is performed on the width of the mask corresponding to the second sub-processed image to obtain a third sub-processed image;

According to the second multiple, size processing is performed on the length of the mask corresponding to the third sub-processed image to obtain a fourth sub-processed image;

Performing size processing on the fourth sub-processed image according to the third multiple to obtain a fifth sub-processed image;

According to the second angle, the mask of the fifth sub-processed image is mirrored and inverted to obtain the fifth image corresponding to the target fourth image.

Performing feature weighting on the second category probability map and the third category probability map to obtain a fourth category probability map for the nodule type of each nodular unit in the plurality of nodular units;

The fourth category probability map is input to a fourth neural network to obtain the lung cancer probability.

The embodiments of the present application also provide a computer-readable storage medium, where the computer-readable storage medium stores a computer program that enables a computer to execute part or all of the steps of any method recorded in the method embodiments, and the computer includes an electronic device.

The embodiments of the present application also provide a computer program product. The computer program product includes a non-transitory computer-readable storage medium storing a computer program. The computer program is operable to make a computer execute part or all of the steps of any method described in the method embodiments. The computer program product may be a software installation package, and the computer includes an electronic device.

In the above-mentioned embodiments, the description of each embodiment has its own focus. For parts that are not described in detail in an embodiment, reference may be made to related descriptions of other embodiments.

Those skilled in the art should be aware that in one or more of the above examples, the functions described in this application can be implemented by hardware, software, firmware or any combination thereof. When implemented by software, these functions can be stored in a computer-readable medium or transmitted as one or more instructions or codes on the computer-readable medium. The computer-readable medium includes a computer storage medium and a communication medium, where the communication medium includes any medium that facilitates the transfer of a computer program from one place to another. The storage medium may be any available medium that can be accessed by a general-purpose or special-purpose computer.

The specific implementations described above further describe the purpose, technical solutions and beneficial effects of this application in detail. It should be understood that the above are only specific implementations of this application and are not intended to limit the protection scope of this application. Any modification, equivalent replacement, improvement, etc. made on the basis of the technical solution of this application shall be included in the protection scope of this application.

Claims

An image recognition method, characterized in that it comprises:

Inputting a target lung scan image to a first neural network to obtain a first category probability map for nodules and no nodules, where the first neural network is used to identify the nodules in the target lung scan image;

Inputting the first category probability map to a second neural network to obtain a second category probability map for benign nodules, malignant nodules and no nodules, where the second neural network is used to identify the nodule type of the nodule image in the first category probability map;

Extracting nodular units in the scan image of the target lung according to the first category probability map to obtain multiple nodular units;

Inputting each nodular unit in the multiple nodular units to a third neural network to obtain a third category probability map for the nodule type of each nodular unit in the multiple nodular units, where the nodule types include benign nodules and malignant nodules, and the third neural network is used to identify the nodule type of each nodular unit of the plurality of nodular units;

Inputting the second category probability map and the third category probability map to a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image, where the fourth neural network is used to classify the second category probability map and the third category probability map.
The method according to claim 1, characterized in that, before inputting the target lung scan image to the first neural network to obtain the first category probability map for nodules and no nodules, the method further includes:

Acquire multiple lung scan images to be recognized;

Performing morphological denoising on each lung scan image in the plurality of lung scan images to obtain a plurality of first processed images;

Performing pixel normalization processing on each first processed image in the plurality of first processed images to obtain a plurality of second processed images;

According to the scanning sequence and the preset size of the multiple lung scan images, the multiple second processed images are stereoscopically stacked to obtain the target lung scan image.
The method according to claim 1, characterized in that, before inputting the target lung scan image to the first neural network to obtain the first category probability map for nodules and no nodules, the method further includes:

Dividing each marked image in the multiple marked images into regions to obtain multiple first images, where each first image includes multiple uniform grid images, the size of each uniform grid image is the first threshold, and each marked image includes nodule marking information;

Extracting a second threshold number of the uniform grid images from each of the plurality of first images to obtain a plurality of second images;

Performing size processing on each second image in the plurality of second images to obtain a plurality of third images, where the size of each third image meets the input size defined by the first initial neural network, and the first initial neural network is the first neural network without defined network parameters;

Acquiring a reference nodule position corresponding to each third image in the plurality of third images according to nodule marking information included in each of the plurality of marked images;

Training the first initial neural network according to the multiple third images and the reference nodule position corresponding to each third image in the multiple third images to obtain the first network parameters of the first neural network;

Acquiring the first neural network according to the first initial neural network and the first network parameters.
The method according to claim 3, wherein the performing size processing on each second image in the plurality of second images to obtain a plurality of third images comprises:

Extracting uniform grid images with nodules in the plurality of second images to obtain a plurality of fourth images;

Copy processing is performed on the fourth image of each second image in the plurality of second images to obtain the plurality of third images.
The method according to claim 4, wherein if the plurality of second images includes a target second image, and the target second image corresponds to a plurality of target second uniform grid images, the extracting uniform grid images with nodules in the plurality of second images to obtain multiple fourth images includes:

Dividing the multiple target second uniform grid images to obtain multiple uniform grid image sets;

Performing a superposition operation on the nodule probabilities corresponding to each uniform grid image set in the plurality of uniform grid image sets to obtain a plurality of superimposed values;

Performing an averaging operation on the superimposed value corresponding to each uniform grid image set in the plurality of uniform grid image sets to obtain multiple average values;

Extracting the uniform grid image in the uniform grid image set corresponding to the average value of the plurality of average values greater than the third threshold to obtain the plurality of fourth images.
The method according to claim 4, wherein the method further comprises:

Performing data enhancement on each fourth image in the plurality of fourth images to obtain a plurality of fifth images;

Acquiring the reference nodule type corresponding to each fifth image in the plurality of fifth images according to the nodule marking information included in each of the plurality of marked images;

Training the second initial neural network according to the multiple fifth images and the reference nodule type corresponding to each fifth image in the multiple fifth images to obtain the second network parameters of the third neural network, where the second initial neural network is the third neural network without defined network parameters.
The method according to claim 6, wherein the marking information of each first image in the first image set further includes a target nodule type, and after the performing data enhancement on each fourth image in the plurality of fourth images to obtain multiple fifth images, the method further includes:

Acquiring the reference nodule type corresponding to each fifth image in the plurality of fifth images according to the nodule marking information included in each of the plurality of marked images;

Training the second initial neural network according to the multiple fifth images and the reference nodule type corresponding to each fifth image in the multiple fifth images to obtain the second network parameters of the third neural network.
The method according to claim 6 or 7, wherein if the plurality of fourth images include a target fourth image, the performing data enhancement on each fourth image in the plurality of fourth images to obtain multiple fifth images includes:

Performing rotation processing on the mask corresponding to the target fourth image according to the first angle to obtain the first sub-processed image;

Performing average value subtraction processing on the first sub-processed image to obtain a second sub-processed image;

Performing size processing on the width of the mask corresponding to the second sub-processed image according to a first multiple to obtain a third sub-processed image;

Performing size processing on the length of the mask corresponding to the third sub-processed image according to a second multiple to obtain a fourth sub-processed image;

Performing size processing on the fourth sub-processed image according to a third multiple to obtain a fifth sub-processed image;

Mirroring and flipping the mask of the sixth sub-processed image according to a second angle to obtain the fifth image corresponding to the target fourth image.
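By way of non-limiting illustration, the data enhancement sequence above might be sketched on a small two-dimensional mask as follows. Several simplifications are assumed here and are not stated in the claims: the first angle is restricted to multiples of 90 degrees, the multiples are positive integers applied by pixel repetition, the second angle is reduced to a choice between horizontal and vertical mirroring, and the mirroring is applied to the output of the last size-processing step. All function names are hypothetical.

```python
def rotate90(img, quarter_turns):
    """Rotate a 2D list clockwise by quarter_turns * 90 degrees."""
    for _ in range(quarter_turns % 4):
        img = [list(row) for row in zip(*img[::-1])]
    return img


def subtract_mean(img):
    """Subtract the mean of all pixels from every pixel."""
    values = [v for row in img for v in row]
    mu = sum(values) / len(values)
    return [[v - mu for v in row] for row in img]


def scale_width(img, k):
    """Repeat every column k times."""
    return [[v for v in row for _ in range(k)] for row in img]


def scale_length(img, k):
    """Repeat every row k times."""
    return [list(row) for row in img for _ in range(k)]


def mirror(img, horizontal=True):
    """Flip left-right if horizontal, otherwise top-bottom."""
    return [row[::-1] for row in img] if horizontal else img[::-1]


def augment(img, quarter_turns, width_mult, length_mult, overall_mult, horizontal):
    out = rotate90(img, quarter_turns)      # first sub-processed image
    out = subtract_mean(out)                # second sub-processed image
    out = scale_width(out, width_mult)      # third sub-processed image
    out = scale_length(out, length_mult)    # fourth sub-processed image
    out = scale_length(scale_width(out, overall_mult), overall_mult)  # fifth
    return mirror(out, horizontal)          # resulting fifth image


fifth_image = augment([[1, 2], [3, 4]], quarter_turns=1, width_mult=2,
                      length_mult=1, overall_mult=1, horizontal=True)
```

Applying several parameter combinations to the same fourth image would yield the plurality of fifth images used for training.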
The method according to any one of claims 1-8, wherein inputting the second category probability map and the third category probability map to a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image comprises:

Performing feature weighting on the second category probability map and the third category probability map to obtain a fourth category probability map for the nodule type of each nodule unit in the plurality of nodule units;

Inputting the fourth category probability map to the fourth neural network to obtain the lung cancer probability.
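By way of non-limiting illustration, one possible form of the feature weighting is a weighted sum of the two probability maps followed by renormalization. The weights `w2` and `w3`, the dictionary representation of the maps, and the decision to fuse only the benign and malignant entries are assumptions for this sketch; the claims do not specify the weighting scheme.

```python
def weight_features(second_map, third_maps, w2=0.5, w3=0.5):
    """Fuse an image-level map with per-unit maps into a fourth map.

    second_map: probabilities for 'benign', 'malignant' and 'none'.
    third_maps: one {'benign', 'malignant'} dict per nodule unit.
    Assumes at least one fused entry is positive for each unit.
    """
    fused = []
    for unit_probs in third_maps:
        combined = {
            label: w2 * second_map[label] + w3 * unit_probs[label]
            for label in ("benign", "malignant")
        }
        total = sum(combined.values())
        fused.append({label: value / total for label, value in combined.items()})
    return fused


# Hypothetical probabilities for one scan containing a single nodule unit.
second = {"benign": 0.2, "malignant": 0.6, "none": 0.2}
third = [{"benign": 0.4, "malignant": 0.6}]
fourth_map = weight_features(second, third)
```

The resulting list holds one normalized benign/malignant distribution per nodule unit and would serve as the input to the fourth neural network.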
An image recognition device, characterized by comprising:

The first processing unit is configured to input the target lung scan image to a first neural network to obtain a first category probability map for nodules and no nodules, wherein the first neural network is used to identify nodules in the target lung scan image;

The second processing unit is configured to input the first category probability map to a second neural network to obtain a second category probability map for benign nodules, malignant nodules and no nodules, wherein the second neural network is used to identify the nodule type of the nodule image in the first category probability map;

The third processing unit is configured to extract nodule units in the target lung scan image according to the first category probability map to obtain a plurality of nodule units, and to input each nodule unit in the plurality of nodule units to a third neural network respectively to obtain a third category probability map for the nodule type of each nodule unit in the plurality of nodule units, wherein the nodule types include benign nodules and malignant nodules, and the third neural network is used to identify the nodule type of each nodule unit in the plurality of nodule units;

The fourth processing unit is configured to input the second category probability map and the third category probability map to a fourth neural network to obtain the lung cancer probability of the target patient corresponding to the target lung scan image, wherein the fourth neural network is used to classify the second category probability map and the third category probability map.
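By way of non-limiting illustration, the data flow between the four processing units might be sketched as follows. The networks are passed in as plain callables, and the stub networks in the example are hypothetical stand-ins used only to show the wiring; they are not the trained networks of the application.

```python
def predict_lung_cancer_probability(scan, net1, net2, net3, net4, extract_units):
    """Chain the four networks in the order recited in the device claim."""
    first_map = net1(scan)                       # nodule vs. no nodule
    second_map = net2(first_map)                 # benign / malignant / no nodule
    units = extract_units(scan, first_map)       # candidate nodule units
    third_maps = [net3(unit) for unit in units]  # per-unit benign / malignant
    return net4(second_map, third_maps)          # lung cancer probability


# Stub networks over a toy one-dimensional "scan" (assumptions for illustration).
scan = [0.1, 0.9, 0.8, 0.2]
probability = predict_lung_cancer_probability(
    scan,
    net1=lambda s: list(s),
    net2=lambda m: {"malignant": max(m)},
    net3=lambda unit: {"malignant": unit},
    net4=lambda second, thirds: (
        second["malignant"] + sum(t["malignant"] for t in thirds) / len(thirds)
    ) / 2,
    extract_units=lambda s, m: [v for v, p in zip(s, m) if p > 0.5],
)
```

Note that the second network consumes the first category probability map, while the third network consumes nodule units extracted from the original scan.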
The device according to claim 10, wherein the device further comprises:

The preprocessing unit is configured to: obtain a plurality of lung scan images to be recognized; perform morphological denoising on each lung scan image in the plurality of lung scan images to obtain a plurality of first processed images; perform pixel normalization processing on each first processed image in the plurality of first processed images to obtain a plurality of second processed images; and three-dimensionally stack the plurality of second processed images according to the scan sequence and a preset size of the plurality of lung scan images to obtain the target lung scan image.
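By way of non-limiting illustration, the preprocessing chain above might be sketched as follows. Grey-scale opening with a 3x3 window is assumed as the morphological denoising, min-max scaling is assumed as the pixel normalization, and the preset size is reduced to a maximum number of slices; none of these choices is specified in the claims, and all function names are hypothetical.

```python
def _filter3x3(img, reducer):
    """Apply reducer (min or max) over each pixel's 3x3 neighbourhood."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(h):
        row = []
        for c in range(w):
            window = [img[rr][cc]
                      for rr in range(max(0, r - 1), min(h, r + 2))
                      for cc in range(max(0, c - 1), min(w, c + 2))]
            row.append(reducer(window))
        out.append(row)
    return out


def morphological_open(img):
    """Grey-scale opening: erosion (min) followed by dilation (max)."""
    return _filter3x3(_filter3x3(img, min), max)


def normalize(img):
    """Min-max scale pixel values to the range [0, 1]."""
    values = [v for row in img for v in row]
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1  # avoid division by zero on flat images
    return [[(v - lo) / span for v in row] for row in img]


def build_volume(scans, preset_depth):
    """scans: list of (scan_index, 2D image). Returns slices in scan order."""
    ordered = sorted(scans, key=lambda item: item[0])
    slices = [normalize(morphological_open(img)) for _, img in ordered]
    return slices[:preset_depth]


speck = [[0, 0, 0], [0, 9, 0], [0, 0, 0]]      # isolated bright pixel
gradient = [[0, 0, 0], [4, 4, 4], [8, 8, 8]]
volume = build_volume([(2, gradient), (1, speck)], preset_depth=2)
```

In this sketch the isolated bright pixel is removed by the opening, and the slices are stacked in ascending scan index regardless of input order.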
The device according to claim 10, wherein the device further comprises:

The preprocessing unit is configured to: divide each marked image in a plurality of marked images to obtain a plurality of first images, wherein each first image includes a plurality of uniform grid images, the size of each uniform grid image is a first threshold, and each marked image includes nodule marking information; extract a second threshold quantity of the uniform grid images from each first image in the plurality of first images to obtain a plurality of second images; perform size processing on each second image in the plurality of second images to obtain a plurality of third images, wherein the size of each third image meets the input size defined by a first initial neural network, and the first initial neural network is the first neural network with undefined network parameters; and acquire, according to the nodule marking information included in each marked image in the plurality of marked images, the reference nodule position corresponding to each third image in the plurality of third images;

The training unit is configured to train the first initial neural network according to the plurality of third images and the reference nodule position corresponding to each third image in the plurality of third images to obtain first network parameters of the first neural network, and to obtain the first neural network according to the first initial neural network and the first network parameters.
The device according to claim 12, wherein the preprocessing unit is specifically configured to extract the uniform grid images with nodules in the plurality of second images to obtain a plurality of fourth images, and to copy the fourth image of each second image in the plurality of second images to obtain the plurality of third images.
The device according to claim 13, wherein if the plurality of second images include a target second image, and the target second image corresponds to a plurality of target second uniform grid images, the preprocessing unit is specifically configured to: divide the plurality of target second uniform grid images to obtain a plurality of uniform grid image sets; superimpose the nodule probabilities corresponding to each uniform grid image set in the plurality of uniform grid image sets to obtain a plurality of superimposed values; average the superimposed value corresponding to each uniform grid image set in the plurality of uniform grid image sets to obtain a plurality of average values; and extract the uniform grid images in each uniform grid image set whose average value, among the plurality of average values, is greater than a third threshold, to obtain the plurality of fourth images.
The device according to claim 13, wherein the preprocessing unit is further configured to: perform data enhancement on each fourth image in the plurality of fourth images to obtain a plurality of fifth images; acquire, according to the nodule marking information included in each marked image in the plurality of marked images, the reference nodule type corresponding to each fifth image in the plurality of fifth images; and train a second initial neural network according to the plurality of fifth images and the reference nodule type corresponding to each fifth image in the plurality of fifth images to obtain second network parameters of the third neural network, wherein the second initial neural network is the third neural network with undefined network parameters.
The device according to claim 15, wherein the marking information of each first image in the first image set further includes a target nodule type, and the training unit is further configured to: acquire, according to the nodule marking information included in each marked image in the plurality of marked images, the reference nodule type corresponding to each fifth image in the plurality of fifth images; and train the second initial neural network according to the plurality of fifth images and the reference nodule type corresponding to each fifth image in the plurality of fifth images to obtain the second network parameters of the third neural network.
The device according to claim 15 or 16, wherein if the plurality of fourth images include a target fourth image, the preprocessing unit is specifically configured to: perform rotation processing on the mask corresponding to the target fourth image according to a first angle to obtain a first sub-processed image; perform average value subtraction processing on the first sub-processed image to obtain a second sub-processed image; perform size processing on the width of the mask corresponding to the second sub-processed image according to a first multiple to obtain a third sub-processed image; perform size processing on the length of the mask corresponding to the third sub-processed image according to a second multiple to obtain a fourth sub-processed image; perform size processing on the fourth sub-processed image according to a third multiple to obtain a fifth sub-processed image; and mirror and flip the mask of the sixth sub-processed image according to a second angle to obtain the fifth image corresponding to the target fourth image.
The device according to any one of claims 10-17, wherein the fourth processing unit is specifically configured to: perform feature weighting on the second category probability map and the third category probability map to obtain a fourth category probability map for the nodule type of each nodule unit in the plurality of nodule units; and input the fourth category probability map to the fourth neural network to obtain the lung cancer probability.
An electronic device, characterized by comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, and the one or more programs include instructions for executing the steps of the method according to any one of claims 1-9.
A computer-readable storage medium, characterized in that it is used to store a computer program, wherein the computer program causes a computer to execute the method according to any one of claims 1-9.