CN110136103B - Medical image interpretation method, device, computer equipment and storage medium


Info

Publication number
CN110136103B
Authority
CN
China
Prior art keywords
image
target
medical image
map
value
Prior art date
Legal status
Active
Application number
CN201910334702.6A
Other languages
Chinese (zh)
Other versions
CN110136103A (en)
Inventor
庞烨 (Pang Ye)
韦嘉楠 (Wei Jianan)
王义文 (Wang Yiwen)
王健宗 (Wang Jianzong)
Current Assignee
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201910334702.6A
Publication of CN110136103A
Priority to PCT/CN2019/102544 (WO2020215557A1)
Application granted
Publication of CN110136103B
Legal status: Active


Classifications

    • G06F18/241: Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/24147: Distances to closest patterns, e.g. nearest neighbour classification
    • G06T7/0012: Biomedical image inspection
    • G06T2207/10072: Tomographic images
    • G06T2207/10116: X-ray image
    • G06T2207/10132: Ultrasound image
    • G06T2207/20081: Training; Learning
    • G06T2207/20084: Artificial neural networks [ANN]
    • G06T2207/30096: Tumor; Lesion

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Physics & Mathematics (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a medical image interpretation method, device, computer equipment and storage medium. The method comprises: acquiring an image analysis request that contains a target medical image; recognizing the target medical image with a pre-trained image recognition model and obtaining the feature maps output by the last convolutional layer of the model; obtaining, based on the feature maps, a predicted probability value for each original lesion category output by the model; determining the original lesion category with the largest predicted probability value as the target lesion category; acquiring the map weights corresponding to the target lesion category; applying an activation mapping formula to the feature maps and the map weights to obtain a heat map; and superimposing the heat map on the target medical image to generate a target thermodynamic diagram, thereby improving the image recognition rate.

Description

Medical image interpretation method, device, computer equipment and storage medium
Technical Field
The present invention relates to the field of intelligent decision-making technologies, and in particular to a medical image interpretation method and device, a computer device, and a storage medium.
Background
With the development of science, convolutional neural networks have achieved excellent results in image recognition in recent years: their network structures have been gradually improved, classification precision on each data set has risen rapidly, and the classification error rate has gradually fallen, even below that of a briefly trained human. In the medical field, medical staff usually diagnose from medical images based on experience, and insufficient experience may lead to misdiagnosis, so improving the recognition rate of medical images has become an urgent problem.
Disclosure of Invention
The embodiment of the invention provides a medical image interpretation method, a medical image interpretation device, computer equipment and a storage medium, which are used for solving the problem of low recognition rate of medical images.
A medical image interpretation method, comprising:
Acquiring an image analysis request, wherein the image analysis request comprises a target medical image;
recognizing the target medical image with a pre-trained image recognition model, and obtaining the feature maps output by the last convolutional layer of the image recognition model;
obtaining, based on the feature maps, a predicted probability value corresponding to each original lesion category output by the image recognition model;
determining the original lesion category with the largest predicted probability value as the target lesion category, acquiring the map weights corresponding to the target lesion category, and performing class activation mapping on the feature maps and the map weights using an activation mapping formula to obtain a heat map, wherein the activation mapping formula is $M_c(x,y)=\sum_{k=1}^{K} w_k^c f_k(x,y)$, where $c$ is the target lesion category, $M_c(x,y)$ is the heat map corresponding to the target lesion category, $w_k^c$ is the map weight corresponding to the $k$-th feature map, $K$ is the number of feature maps, and $f_k(x,y)$ is the $k$-th feature map; and
superimposing the heat map on the target medical image to generate a target thermodynamic diagram.
A medical image interpretation device, comprising:
the image analysis request acquisition module is used for acquiring an image analysis request, wherein the image analysis request comprises a target medical image;
the feature map acquisition module, used for recognizing the target medical image with a pre-trained image recognition model and obtaining the feature maps output by the last convolutional layer of the image recognition model;
the predicted probability value acquisition module, used for obtaining, based on the feature maps, the predicted probability value corresponding to each original lesion category output by the image recognition model;
the heat map acquisition module, used for determining the original lesion category with the largest predicted probability value as the target lesion category, acquiring the map weights corresponding to the target lesion category, and performing class activation mapping on the feature maps and the map weights using an activation mapping formula to obtain a heat map, wherein the activation mapping formula is $M_c(x,y)=\sum_{k=1}^{K} w_k^c f_k(x,y)$, where $c$ is the target lesion category, $M_c(x,y)$ is the heat map corresponding to the target lesion category, $w_k^c$ is the map weight corresponding to the $k$-th feature map, $K$ is the number of feature maps, and $f_k(x,y)$ is the $k$-th feature map; and
the target thermodynamic diagram acquisition module, used for superimposing the heat map on the target medical image to generate a target thermodynamic diagram.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the medical image interpretation method described above when executing the computer program.
A computer readable storage medium storing a computer program which when executed by a processor implements the medical image interpretation method described above.
With the medical image interpretation method, device, computer equipment and storage medium described above, the server acquires an image analysis request, recognizes the target medical image with a pre-trained image recognition model, and obtains the feature maps output by the last convolutional layer of the model; the feature maps carry the semantic information and position information of the target medical image, so the target lesion category can be heat-mapped from that position and semantic information. Based on the feature maps, the predicted probability value corresponding to each original lesion category output by the model is obtained; the original lesion category with the largest predicted probability value is determined as the target lesion category; the map weights corresponding to the target lesion category are acquired; and class activation mapping is applied to the feature maps and the map weights to obtain a heat map characterizing the lesion category. The heat map is superimposed on the target medical image and visualized as a target thermodynamic diagram, which gives the convolutional-neural-network-based image recognition model a degree of interpretability for its classification of the target medical image. This makes it convenient for medical staff to diagnose from the target thermodynamic diagram, assists clinical decisions, serves as a basis for clinical diagnosis, reduces misdiagnosis, and improves the recognition rate of medical images.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings needed in the description of the embodiments are briefly introduced below. Obviously, the drawings in the following description show only some embodiments of the present invention, and other drawings may be obtained from them by a person skilled in the art without inventive effort.
FIG. 1 is a schematic diagram of an application environment of the medical image interpretation method according to an embodiment of the present invention;
FIG. 2 is a flowchart of the medical image interpretation method according to an embodiment of the present invention;
FIG. 3 is a flowchart of the medical image interpretation method according to an embodiment of the present invention;
FIG. 4 is a flowchart of the medical image interpretation method according to an embodiment of the present invention;
FIG. 5 is a flowchart of the medical image interpretation method according to an embodiment of the present invention;
FIG. 6 is a flowchart of the medical image interpretation method according to an embodiment of the present invention;
FIG. 7 is a flowchart of the medical image interpretation method according to an embodiment of the present invention;
FIG. 8 is a schematic block diagram of the medical image interpretation device according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of a computer device according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The medical image interpretation method provided by the embodiments of the present invention can be applied in the application environment shown in FIG. 1. The target medical image is recognized by a pre-trained image recognition model; according to the feature maps and the map weights of the original lesion category with the largest predicted probability value, the abnormal lesion category in the target medical image is visually displayed through a target thermodynamic diagram; and symptoms are diagnosed from the target thermodynamic diagram, improving the recognition rate of medical images. The user terminal may be, but is not limited to, a personal computer, notebook computer, smartphone, tablet computer, or portable wearable device. The server may be implemented as an independent server or as a server cluster formed by a plurality of servers.
In an embodiment, as shown in fig. 2, a medical image interpretation method is provided, and the method is applied to the server in fig. 1 for illustration, and specifically includes the following steps:
S10: an image analysis request is acquired, the image analysis request comprising a target medical image.
The target medical image is an image obtained through the interaction of some medium (such as X-rays, electromagnetic fields or ultrasound) with the human body, displaying the internal tissue and organ structure and density of the human body in image form.
Specifically, if medical staff analyze a target medical image directly from experience to obtain a diagnosis and determine the lesion, there is a considerable probability of misjudgment. An image analysis request containing the target medical image can therefore be sent to the server, so that the server can subsequently recognize the target medical image with a pre-trained image recognition model and display the abnormal lesion categories, improving the recognition rate of medical images and reducing misjudgment.
S20: and identifying the target medical image by adopting a pre-trained image identification model, and obtaining a characteristic map output by a last layer of convolution layer in the image identification model.
The feature map refers to a map output by a last layer of convolution layer in the image recognition model, wherein the feature map is similar to a Tensor (Tensor) of m×h, and the Tensor can be simply understood as a multidimensional array, and specifically is a multiple linear map that can be used to represent a linear relationship among vectors, scalar quantities and other tensors.
Specifically, after the server acquires the target medical image, the target medical image is identified by adopting a pre-trained image identification model, wherein the image identification model comprises at least two convolution layers. In this embodiment, a feature map output by a final layer of convolution layer in the image recognition model is obtained, so that features of the target medical image are represented by the feature map, that is, semantic information and position information of the target medical image are represented by the feature map.
The feature map of the output of the convolution layer can be calculated through a formula a i l=σ(zi l)=σ(ai l-1*Wl+bl), wherein a i l represents the output of the ith focus category label of the convolution layer of the first layer, z i l represents the output of the ith focus category label before the processing of the activation function is not adopted, a i l-1 represents the output of the ith focus category label of the l-1 layer convolution layer (i.e. the output of the last layer), sigma represents the activation function, sigma for the convolution layer can be ReLu (RECTIFIED LINEAR Unit, linear rectification function), compared with other activation functions, the effect of the adopted activation function sigma can be better, the x represents the convolution operation, W l represents the convolution kernel (weight) of the convolution layer of the first layer, and b l represents the bias of the convolution layer of the first layer. Preferably, the convolution kernel of the convolution layer is (3*3) and the number of channels is doubled layer by layer.
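As a minimal illustration of this forward computation, the sketch below applies one convolutional layer to a single-channel input with NumPy and SciPy; the array shapes and values are illustrative assumptions, not part of the patent.

```python
import numpy as np
from scipy.signal import convolve2d

def conv_layer(a_prev, W, b):
    """One convolutional layer: a^l = sigma(a^{l-1} * W^l + b^l), sigma = ReLU."""
    z = convolve2d(a_prev, W, mode="valid") + b   # convolution plus bias
    return np.maximum(z, 0.0)                     # ReLU activation

# Toy usage: a 6x6 single-channel input and one 3x3 kernel.
a0 = np.random.rand(6, 6)
W1 = np.random.rand(3, 3)
out = conv_layer(a0, W1, b=0.1)                   # output has shape (4, 4)
```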
S30: based on the characteristic spectrum, obtaining a predicted probability value corresponding to each original focus category output by the image recognition model.
The original focus category refers to a focus category which can be identified through an image identification model, and when the image identification model is trained, focus categories marked in historical medical images can be understood. The predicted probability value refers to a probability value that the feature map of the target medical image belongs to each original focus category, specifically, the output layer in the image recognition model is used for calculating so as to determine the predicted probability value that the feature map of the target medical image belongs to each original focus category.
Specifically, the image recognition model includes at least two pooling layers, the window of the pooling layers is (2×2), the step length is 2, in order to improve the display precision of the target thermodynamic diagram, the last pooling layer in the image recognition model adopts a global average pooling layer (global average pooling, GAP), the global average pooling layer does not need to be downsampled, and the rest pooling layers adopt the largest pooling layer. If the first layer is the largest pooling layer, the output of the pooling layer may be denoted as a l=pool(al-1), where pool refers to a downsampling calculation that may select the largest pooling method, and a l-1 represents the i-th lesion category label output of the l-1 layer. When an image passes through a convolution layer, the convolution is carried out through a convolution kernel, and a plurality of convolution layers exist, so that the dimension of a feature map output by the convolution layer is increased by a plurality of times relative to the dimension of input data (image training sample data), and therefore, the feature dimension reduction is carried out by adopting downsampling calculation. If the first layer is a global averaging pooling layer, the output of the global averaging pooling layer is denoted as (m× 1*1), and typically, the previous layer of the global averaging pooling layer is a feature map (feature map) of m×h, to improve the accuracy of the final target thermodynamic diagram, the last downsampling of the global averaging pooling layer in the network is removed, and the feature map of m×h is converted into a feature vector with a length of m× 1*1 by the global averaging pooling layer, where the feature vector represents highly abstract semantic information about the target medical image.
The image recognition model comprises an output layer, wherein the output layer inputs a feature vector, the output layer adopts a softmax function which is equivalent to a plurality of classifiers, namely, the first layer is an output layer L, the activation function sigma adopts the softmax function, a formula of calculating the output of the output layer L is a L=softmax(zl)=softmax(WLaL-1+bL),aL which is the output of the finally obtained output layer (namely, a predicted probability value), W L represents a weight corresponding to the L layer, and a L-1 represents the output corresponding to the L-1 layer.
It can be understood that, after the image recognition model obtains the feature map, the feature map is input into the global average pooling layer, the feature vector is output through the global average pooling layer, the feature vector is input into the output layer, the classification prediction of the focus category is performed through the softmax function of the output layer, and the prediction probability value corresponding to each original focus category is output, so that the prediction probability value corresponding to each original focus category output by the image recognition model is obtained.
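The probability computation described above can be sketched in a few lines of NumPy; the bias-free output layer follows the embodiment described in step S40 below, and the array shapes are illustrative assumptions.

```python
import numpy as np

def predict_probabilities(feature_maps, W_out):
    """GAP over M feature maps, then softmax over the lesion categories.

    feature_maps: (M, H, H) output of the last convolutional layer.
    W_out: (num_categories, M) output-layer weights (no bias here).
    """
    v = feature_maps.mean(axis=(1, 2))       # global average pooling -> (M,)
    z = W_out @ v                            # output-layer logits
    z = z - z.max()                          # stabilize the exponentials
    return np.exp(z) / np.exp(z).sum()       # softmax probabilities

probs = predict_probabilities(np.random.rand(8, 7, 7), np.random.rand(3, 8))
target_category = int(np.argmax(probs))      # category with the largest probability
```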
S40: determining an original focus category with the maximum prediction probability value as a target focus category, acquiring a map weight corresponding to the target focus category, classifying and activating and mapping the characteristic map and the map weight by adopting an activating and mapping formula to acquire a thermal map, wherein the activating and mapping formula is as followsC refers to the target lesion category, M c (x, y) refers to the thermodynamic map corresponding to the target lesion category,/>The characteristic spectrum weight value corresponding to the kth characteristic spectrum is indicated, K is the number of the characteristic spectrum, and f (x, y) is indicated as the kth characteristic spectrum.
The target lesion category refers to an original lesion category corresponding to the maximum predicted probability value. The map weight refers to a set of weights corresponding to the target lesion category when the output layer calculates the maximum prediction probability value in the image recognition model, the output layer L outputs a formula of a L=softmax(zl)=softmax(WLaL-1+bL, no bias b is added in the embodiment, the weight corresponding to the calculated maximum prediction probability value is obtained through the formula and is taken as the map weight, namely W L corresponding to the maximum prediction probability value a L is taken as the map weight. The thermal map is a map which is mapped according to the map weight corresponding to the target focus category and the characteristic map and can represent the focus category.
Specifically, according to the predicted probability value corresponding to each original focus category, the original focus category corresponding to the maximum predicted probability value is obtained, the original focus category is determined as a target focus category, according to the target focus category, a group of weights of the maximum predicted probability value of the target focus category calculated by an output layer in an image recognition model are determined as map weights, and an activation mapping formula is adopted to classify and activate the feature map and the map weights, so as to obtain a thermal map. Wherein, the activation mapping formula is thatC is the target focus category, M c (x, y) is the thermodynamic mapping diagram corresponding to the target focus category c, k is the number of characteristic maps,/>The characteristic spectrum weight corresponding to the kth characteristic spectrum is indicated, and f (x, y) is indicated as the kth characteristic spectrum. It should be noted that, corresponding semantic information and position information thereof, specifically, tensor of m×h×h, are retained in the feature map, classification is completed through the output layer, the target focus category is determined, the input data is identified as data corresponding to the target focus category, the map weight of the output layer is weighted to the feature map, and the process of summing the weighted map weights is referred to as CAM. It can be appreciated that the convolutional neural network has strong image processing and classifying capabilities, and can also locate key parts in the image, namely, the focus category in the target medical image can be located, and the process is called Class Activation Mapping, CAM for short. Wherein CAM is the process by which pointers locate key parts in the target medical image.
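A minimal NumPy sketch of the activation mapping formula, assuming the feature maps arrive as a K×H×H array and the map weights as a length-K vector (shapes and values are illustrative):

```python
import numpy as np

def class_activation_map(feature_maps, map_weights):
    """M_c(x, y) = sum_k w_k^c * f_k(x, y): weighted sum over feature maps.

    feature_maps: (K, H, H) output of the last convolutional layer.
    map_weights: (K,) output-layer weights for the target lesion category c.
    """
    # Collapse the K channel axis with the per-map weights, leaving one
    # H x H heat map for the target category.
    return np.tensordot(map_weights, feature_maps, axes=([0], [0]))

heat_map = class_activation_map(np.random.rand(8, 7, 7), np.random.rand(8))
```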
S50: and superposing the thermodynamic mapping diagram and the target medical image to generate a target thermodynamic diagram.
The target thermodynamic diagram is a diagram showing the lesion type visualized in the form of heat on the target medical image, and it is understood that the heat generated at the position with relatively large influence in the target medical image is relatively high, whereas the heat generated at the position with relatively large influence in the target medical image is relatively low or no heat is generated.
Specifically, after the server acquires the thermal map, the thermal map is superimposed on the target medical image, and after the thermal map is superimposed, the location including the lesion category may be displayed in the target medical image by heat, so as to form a target thermodynamic diagram.
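One way to realize this superposition, sketched with OpenCV under the assumption that the heat map is a small 2-D float array and the target medical image is an 8-bit grayscale or BGR array; the blending weight is an illustrative choice, not a value from the patent.

```python
import cv2
import numpy as np

def overlay_heat_map(heat_map, image, alpha=0.4):
    """Superimpose the CAM heat map on the target medical image."""
    if image.ndim == 2:                        # gray image -> 3-channel BGR
        image = cv2.cvtColor(image, cv2.COLOR_GRAY2BGR)
    h, w = image.shape[:2]
    hm = cv2.resize(heat_map, (w, h))          # upsample to the image size
    hm = cv2.normalize(hm, None, 0, 255, cv2.NORM_MINMAX).astype(np.uint8)
    colored = cv2.applyColorMap(hm, cv2.COLORMAP_JET)   # red = high heat
    # Blend: alpha controls how strongly the heat shows over the image.
    return cv2.addWeighted(colored, alpha, image, 1.0 - alpha, 0)
```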
In steps S10-S50, the server acquires an image analysis request, recognizes the target medical image with a pre-trained image recognition model, and obtains the feature maps output by the last convolutional layer of the model; the feature maps represent the semantic information and position information of the target medical image, so the target lesion category can be heat-mapped from that position and semantic information. Based on the feature maps, the predicted probability value corresponding to each original lesion category output by the model is obtained; the original lesion category with the largest predicted probability value is determined as the target lesion category; the map weights corresponding to the target lesion category are acquired; and class activation mapping is applied to the feature maps and the map weights to obtain a heat map characterizing the lesion category. The heat map is superimposed on the target medical image and visualized as a target thermodynamic diagram, which gives the convolutional-neural-network-based image recognition model a degree of interpretability for its classification of the target medical image, makes it convenient for medical staff to diagnose from the target thermodynamic diagram, assists clinical decisions, serves as a basis for clinical diagnosis, reduces misdiagnosis, and improves the recognition rate of medical images.
In one embodiment, the image analysis request further includes a user type. The user type refers to a type of a user sending the target medical image to the server, and the user type may include a common user type.
As shown in fig. 3, after step S30, that is, after obtaining the predicted probability value corresponding to each original lesion category output by the image recognition model, the medical image interpretation method further includes the following steps:
s301: if the user type is the common user type, comparing each predicted probability value with a probability threshold value to acquire a target probability value larger than the probability threshold value and an original focus category corresponding to the target probability value.
The common user type refers to a user type which cannot know the focus category through the target thermodynamic diagram. The probability threshold is a preset threshold for determining whether the lesion category and the corresponding predicted probability value are displayed on the user side. The target probability value refers to a predicted probability value that is greater than the probability threshold.
Specifically, after the server acquires the predicted probability value corresponding to each original focus category, determining whether the user type is a common user type; if the user type is not the normal user type, executing the steps S40 and S50, and displaying the target thermodynamic diagram on a display interface of the user terminal; if the user type is the common user type, comparing each predicted probability value with a probability threshold, taking the predicted probability value larger than the probability threshold as a target probability value, and acquiring the target probability value and the original focus category corresponding to the target probability value.
S302: and displaying the target probability value and the original focus category corresponding to the target probability value on the user side.
Specifically, the obtained target probability value and the original focus category corresponding to the target probability value are in one-to-one correspondence and displayed on the user side, so that the user corresponding to the common user type can clearly know each focus category and the corresponding target probability value, and related review is performed, so that the review has pertinence.
Further, the target thermodynamic diagram, the target probability values and the original focus categories corresponding to the target probability values are displayed simultaneously on a user side display interface, so that a user can determine the position of the original focus category corresponding to each target probability value in the target medical image. It will be appreciated that the greater the target probability value, the higher the heat that is visually displayed in the target thermodynamic diagram.
In steps S301-S302, if the user type is a common user type, the server compares each predicted probability value with a probability threshold, so that the user side displays a target probability value greater than the probability threshold and an original focus category corresponding to the target probability value, thereby realizing the requirements of different crowds.
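The comparison in S301 amounts to a simple filter; a sketch, where the 0.3 threshold and the category names are illustrative placeholders rather than values from the patent:

```python
PROBABILITY_THRESHOLD = 0.3   # preset value, illustrative only

def filter_for_display(categories, probs, threshold=PROBABILITY_THRESHOLD):
    """Keep the (lesion category, probability) pairs above the threshold."""
    return [(c, p) for c, p in zip(categories, probs) if p > threshold]

shown = filter_for_display(["nodule", "effusion", "normal"], [0.72, 0.21, 0.07])
# -> [("nodule", 0.72)] is displayed for the common user type
```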
In one embodiment, as shown in fig. 4, after step S10, i.e. after the acquisition of the image analysis request, the medical image interpretation method further specifically includes the following steps:
s101: based on the target medical image, a gray scale image is acquired.
Wherein the gray image refers to a monochrome image having a gray gamut or level of 256 levels from black to white.
Specifically, when a target medical image is acquired, judging whether the target medical image is a gray image or not; and if the target medical image is a color image, carrying out graying treatment on the target medical image to obtain a gray image. The graying process is a process of converting a color image into a gray image. Specifically, gray scale processing can be performed on the color image by using a component maximum value method, an average value method, a weighted average method and the like, so as to obtain a gray scale image.
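A sketch of the three conversion methods named above, assuming an RGB array; the luminance coefficients in the weighted variant are the common ITU-R BT.601 values, an assumption rather than a value from the patent:

```python
import numpy as np

def to_gray(rgb, method="weighted"):
    """Grayscale conversion by the maximum, average, or weighted method."""
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    if method == "max":                        # component maximum value method
        return rgb.max(axis=-1)
    if method == "mean":                       # average value method
        return rgb.mean(axis=-1)
    return 0.299 * r + 0.587 * g + 0.114 * b   # weighted average method

gray = to_gray(np.random.rand(64, 64, 3))
```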
S102: and filtering the gray level image by using a Laplace variance algorithm, calculating the mean value and the variance value of the filtered image, and comparing the variance value with a preset threshold value.
The filtered image is an image obtained by filtering a gray-scale image. The preset threshold value is a preset value for judging whether the target medical image is a blurred image or not.
Specifically, after the gray image is obtained, a gray value is obtained from the gray image. The complete image is composed of R, G and B channels, the gray value is the value of R=G=B called gray value, and the gray range is 0-255. Performing convolution operation on gray values of gray images and a Laplace mask, wherein the Laplace mask isAnd (5) averaging the matrix after convolution operation, and calculating a variance value. After the variance value is obtained, a database is queried, wherein a preset threshold value is preset in the database, and the variance value is compared with the preset threshold value. And the subsequent image blurring detection is facilitated by carrying out convolution filtering on the gray level image by using a Laplacian mask. It will be appreciated that the Laplace algorithm is used to measure the second derivative of the image, highlighting areas of rapid intensity change in the image, and is based on the assumption that: if an image has a high variance value, then the image has a wide frequency response range, and the image represents a normal and accurately focused image. If an image has a small variance value, the image has a narrow frequency response range, meaning that the number of edges in the image is small, and the more blurred the image, the fewer edges are. Therefore, whether the target medical image is a blurred image is determined by the Laplace variance algorithm, and a proper preset threshold value needs to be set. The preset threshold is set too low to cause the target medical image to be mistakenly broken into a fuzzy graph, the preset threshold is too high to cause the fuzzy target medical image to be mistakenly broken into a normal image, and the preset threshold can be specifically set according to experience.
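A sketch of this check with OpenCV, whose Laplacian function applies the same 3×3 mask; the 100.0 threshold is purely an illustrative placeholder for the empirically chosen preset value:

```python
import cv2

PRESET_THRESHOLD = 100.0      # empirical preset value, illustrative here

def is_blurred(gray_image, threshold=PRESET_THRESHOLD):
    """Laplacian variance test: low response variance means few edges."""
    response = cv2.Laplacian(gray_image, cv2.CV_64F)   # 3x3 Laplacian filter
    return response.var() <= threshold                 # not greater -> blurred

# gray = cv2.imread("scan.png", cv2.IMREAD_GRAYSCALE)
# if is_blurred(gray): remind the user to input a clear image (step S104)
```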
S103: if the variance value is larger than a preset threshold value, performing recognition on the target medical image by adopting a pre-trained image recognition model, and obtaining a characteristic map output by a last layer of convolution layer in the image recognition model.
Specifically, if the variance value is greater than the preset threshold, it indicates that the target medical image is a normal and accurately focused image, then the method performs recognition on the target medical image by using a pre-trained image recognition model, and obtains a feature map output by the last layer of convolution layer in the image recognition model, that is, performs the step S20.
S104: if the variance value is not greater than the preset threshold value, generating reminding information and feeding the reminding information back to the user side.
Specifically, if the variance value is greater than a preset threshold value, indicating that the target medical image is a blurred image, generating reminding information, wherein the reminding information can be specifically that 'the input target medical image is a blurred image, please input again', feeding the reminding information back to the user side, and displaying a page on the user side so that a user can input a clear target medical image according to the reminding information.
In steps S101-S104, the server obtains a gray image based on the target medical image, filters the gray image with the Laplacian variance algorithm, calculates the mean and variance of the filtered image, and compares the variance value with the preset threshold, so that the Laplacian variance algorithm judges whether the target medical image is blurred, improving the accuracy of the subsequent model recognition.
In one embodiment, as shown in FIG. 5, before step S20, that is, before the target medical image is recognized with the pre-trained image recognition model, the medical image interpretation method further includes the following steps:
S201: Acquire historical medical images and perform lesion labeling on them, so that each historical medical image carries a corresponding lesion label.
A historical medical image is an image containing a lesion. A lesion label is a label representing the lesion category in a historical medical image.
Specifically, a large number of historical medical images corresponding to different lesion categories are obtained in advance, and the lesion position and lesion name of each historical medical image are labeled, so that the image recognition model can later be obtained by training on these historical medical images.
S202: and carrying out augmentation treatment on the historical medical image to obtain an augmented image.
Specifically, the image augmentation technology is adopted to amplify the historical medical image to obtain an augmented image, wherein the augmentation treatment refers to that the image augmentation technology (image augmentation) is adopted to carry out a series of changes on the historical medical image so as to generate similar but different augmented images, thereby expanding the scale of the training data set.
S203: and carrying out normalization processing on the amplified image to obtain an image training sample.
Specifically, the normalization processing is to process the amplified image data and limit the amplified image data to a certain range. Such as typically limited to the interval 0,1 or-1, 1.
Specifically, a gray value feature matrix corresponding to the augmented image is obtained, normalization processing is carried out on each gray value in the gray value feature matrix, and a normalized gray value feature matrix of the image is obtained, wherein the formula of normalization processing is thatMaxValue is the maximum value of the gray values in the gray value feature matrix of the image, minValue is the minimum value of the gray values in the gray value feature matrix of the image, x is the gray value before normalization, y is the gray value after normalization, and an image training sample is obtained according to the normalized gray value feature matrix corresponding to each augmented image. The gray value characteristic matrix is composed of brightness values of pixel points in the enhanced image. By carrying out normalization processing on the augmented image, the speed of gradient descent to solve the optimal solution during training of the image recognition model is increased, and the accuracy is improved.
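The normalization formula in a few lines of NumPy (the array contents are illustrative):

```python
import numpy as np

def normalize_gray(x):
    """y = (x - MinValue) / (MaxValue - MinValue): map values into [0, 1]."""
    min_v, max_v = x.min(), x.max()
    return (x - min_v) / (max_v - min_v)

sample = normalize_gray(np.array([[0.0, 64.0], [128.0, 255.0]]))
```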
S204: and inputting the image training sample into a convolutional neural network for training, and updating the weight and bias of the convolutional neural network by adopting a backward propagation algorithm with random gradient descent to obtain an image recognition model.
The convolutional neural network is a feedforward neural network, and an artificial neuron of the convolutional neural network can respond to surrounding units in a part of coverage area and can perform image processing and classification. Convolutional neural networks typically include a nonlinear trainable convolutional layer, a pooling layer and a fully connected layer, in this proposal, after the last convolutional layer, a global average pooling layer is employed and the fully connected layer is replaced, and further, an input layer and an output layer. Wherein, the layer is pooled by global average in order to reduce the data dimension and parameters. It can be understood that, by the global averaging pooling layer, the position information of each channel in the feature map output by the last convolution layer can be erased, so that each element in the feature vector output by the global averaging pooling layer is guided to relatively independent high-abstract semantic information through back propagation as far as possible, thereby avoiding overfitting, and meanwhile, the global averaging pooling layer can realize the input of any image size.
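As a concrete reading of this architecture, the sketch below builds such a network in PyTorch: 3×3 convolutions with channels doubling layer by layer, 2×2 stride-2 max pooling, no downsampling after the last convolutional block, global average pooling in place of fully connected layers, and a bias-free output layer. Channel counts and depth are illustrative assumptions, not values from the patent.

```python
import torch
import torch.nn as nn

class CAMNet(nn.Module):
    """CNN whose last conv output feeds both the classifier and the CAM."""
    def __init__(self, num_classes, channels=(16, 32, 64)):
        super().__init__()
        blocks, in_ch = [], 1                        # single-channel gray input
        for i, out_ch in enumerate(channels):
            blocks += [nn.Conv2d(in_ch, out_ch, 3, padding=1), nn.ReLU()]
            if i < len(channels) - 1:                # no pooling at the end
                blocks.append(nn.MaxPool2d(2, 2))
            in_ch = out_ch
        self.features = nn.Sequential(*blocks)
        self.gap = nn.AdaptiveAvgPool2d(1)           # global average pooling
        self.out = nn.Linear(channels[-1], num_classes, bias=False)

    def forward(self, x):
        f = self.features(x)                         # (N, K, H, H) feature maps
        v = self.gap(f).flatten(1)                   # (N, K) feature vector
        return self.out(v), f                        # logits + maps for the CAM
```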
Specifically, the image training samples are input into the convolutional neural network for training: features are extracted by the convolutional layers and dimensionality is reduced by the pooling layers. Because excessive fully-connected-layer parameters would cause overfitting, after the last convolutional layer a global average pooling layer performs mean pooling over each feature map output by the last convolutional layer to form one feature point, and these feature points form the feature vector. The feature vector output by the GAP layer is input into the output layer, whose softmax function forms a multi-class classifier: each lesion category corresponds to one classifier, and each category's classifier corresponds to one set of weights in the output layer. The feature vector is classified by the softmax function to obtain the output values, and the weights and biases of the network are updated with back-propagation and stochastic gradient descent until the model converges, yielding the image recognition model. Back-propagating the errors produced by the image training samples during training ensures that all generated errors adjust and update the network, so the convolutional neural network is trained comprehensively and an image recognition model with a high recognition rate is obtained.
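Training with back-propagation and stochastic gradient descent then takes the usual form; a sketch reusing the CAMNet from the block above, assuming a `train_loader` of labeled image training samples exists (the loader, learning rate and epoch count are illustrative assumptions):

```python
import torch
import torch.nn as nn

model = CAMNet(num_classes=3)                        # network sketched above
criterion = nn.CrossEntropyLoss()                    # softmax + log-loss
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

for epoch in range(10):
    for images, lesion_labels in train_loader:       # assumed DataLoader
        logits, _ = model(images)
        loss = criterion(logits, lesion_labels)
        optimizer.zero_grad()
        loss.backward()                              # back-propagate the error
        optimizer.step()                             # stochastic gradient step
```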
In steps S201-S204, the server acquires the historical medical images and performs lesion labeling on them, facilitating subsequent model training. Augmenting the historical medical images yields augmented images and expands the scale of the training data set. Normalizing the augmented images speeds up the gradient-descent search for the optimal solution during training and improves accuracy. The image training samples are input into the convolutional neural network for training, and the weights and biases are updated with back-propagation and stochastic gradient descent, yielding an image recognition model with improved recognition precision.
In one embodiment, as shown in FIG. 6, the augmentation of the historical medical images in step S202 to obtain augmented images specifically includes the following steps:
S2021: Acquire the preset augmentation conditions, and augment the historical medical images according to them to obtain images to be determined.
The preset augmentation conditions are conditions for translating the historical medical image left and right or up and down, randomly scaling it, and the like, within a certain range.
Specifically, after the historical medical images are obtained, the preset augmentation conditions are acquired first, and image augmentation is applied to the historical medical images according to these conditions to obtain the images to be determined, adding training samples and enriching the training data. When translating or randomly scaling a historical medical image within a certain range, the image content must remain within the valid range, and several preset augmentation conditions can be superimposed.
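A sketch of such condition-driven augmentation with SciPy, where the translation and zoom ranges are illustrative placeholders for the preset conditions:

```python
import numpy as np
from scipy import ndimage

def augment(img, rng):
    """Translate within a small range, then randomly zoom; conditions stack."""
    dy, dx = rng.integers(-10, 11, size=2)            # up/down, left/right shift
    shifted = ndimage.shift(img, (dy, dx), cval=0.0)  # pad with background
    scale = rng.uniform(0.9, 1.1)                     # random scaling factor
    return ndimage.zoom(shifted, scale, order=1)

candidate = augment(np.random.rand(64, 64), np.random.default_rng(0))
```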
S2022: and carrying out noise adding treatment on the image to be determined to obtain an amplified image.
Specifically, the image noise generally includes noise in a spatial domain and noise in a frequency domain, and specifically, the image to be determined may be subjected to noise adding processing by a MATLAB tool. For example, salt and pepper noise adding, gaussian noise adding and the like are performed to obtain an amplified image. And the accuracy of the subsequent recognition of the image recognition model is improved by carrying out noise adding processing on the image to be determined.
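The patent names a MATLAB tool for this step; an equivalent NumPy sketch of the two noise types, with illustrative noise amplitudes and values assumed to lie in [0, 1]:

```python
import numpy as np

def add_gaussian_noise(img, sigma=0.02, rng=None):
    """Additive Gaussian noise, clipped back into the [0, 1] value range."""
    rng = rng or np.random.default_rng()
    return np.clip(img + rng.normal(0.0, sigma, img.shape), 0.0, 1.0)

def add_salt_pepper(img, amount=0.01, rng=None):
    """Salt-and-pepper noise: a small fraction of pixels forced to 0 or 1."""
    rng = rng or np.random.default_rng()
    out, mask = img.copy(), rng.random(img.shape)
    out[mask < amount / 2] = 0.0                      # pepper
    out[mask > 1 - amount / 2] = 1.0                  # salt
    return out
```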
In steps S2021-S2022, the server augments the historical medical images according to the preset augmentation conditions, enriching the training data, and adds noise to the images to be determined, improving the accuracy of subsequent model recognition.
In one embodiment, as shown in FIG. 7, in step S204, inputting the image training samples into the convolutional neural network for training and updating the weights and biases with back-propagation and stochastic gradient descent to obtain the image recognition model specifically includes the following steps:
S2041: Initialize the convolutional neural network.
Specifically, initializing the convolutional neural network includes making its initialized weights satisfy $\forall l,\; S(W^l) = \frac{2}{n_l}$, where $n_l$ is the number of image-training-sample inputs at layer l, $S(\cdot)$ is the variance operation, $W^l$ is the weight of layer l, $\forall$ means "for all", and l denotes the l-th layer of the convolutional neural network.
In this embodiment, the server initializes the convolutional neural network by setting the weights and biases to preset values chosen empirically by developers. Initializing the weights and biases of the model with such preset values shortens training time and improves the recognition accuracy of the model when training on the image training samples.
S2042: and inputting the image training sample into a convolutional neural network for training, and obtaining a prediction result of the image training sample in the convolutional neural network.
The prediction result is an output result obtained by training the image training sample through a convolutional neural network model.
Specifically, after obtaining an image training sample, inputting each image training sample with focus labels into a convolutional neural network model for training, specifically, extracting features of the image training sample through a plurality of convolutional layers, performing dimension reduction processing through a pooling layer, but after the last layer of convolutional layer, adopting a local average pooling layer, removing downsampling, obtaining feature vectors, inputting the feature vectors into an output layer, and obtaining an output value of the convolutional neural network as a prediction result through calculation of the output layer. The convolutional neural network includes a large number of layers, and the functions of the layers are different, so that the outputs of the layers are different.
S2043: constructing an error function according to the prediction result and the focus label, wherein the expression of the error function is as followsWhere n represents the total number of image training samples, x i represents the predicted result of the ith image training sample, and y i represents the lesion label of the ith image training sample corresponding to x i.
Specifically, the service side passes throughThe convolutional neural network is trained, and weights and offsets are updated so that the predicted result is more similar to the real result. The error between the predicted result and the real result can be reflected well through the error function.
S2044: and calculating the gradient by adopting a back propagation algorithm according to the error function, and updating the weight and the bias in the convolutional neural network by adopting random gradient descent to obtain an image recognition model.
Specifically, after one training is completed, after the prediction results of a plurality of image training samples are obtained, an error function is constructed based on the prediction results and the real results, the error between each image training sample and the corresponding real result (focus category marked by focus labels) is calculated according to the error function, and the weight and bias in the convolutional neural network are updated based on the error, so that an image recognition model is obtained. Specifically, since the server only increases the weight value in the output layer, in the back propagation process, the weight value of the updated output layer is calculated first, and the error function is used to calculate the bias of the weight value W, so that a common factor, that is, the sensitivity delta L (L represents the output layer) of the output layer can be obtained, the sensitivity delta l of the first layer can be obtained in turn from the sensitivity delta L, the gradient of the first layer in the convolutional neural network is obtained according to delta l, that is, the gradient of each layer is obtained, and then the weight and bias of the convolutional neural network are updated by using the gradient.
If the current layer is a convolution layer, the sensitivity of the first layerHere, the x represents convolution operation, rot180 represents operation of turning the matrix 180 degrees, and the meaning of the rest parameters in the formula is explained with reference to the meaning of the parameters above, which is not described herein.
Wherein, the formula of the convolution layer updating weight in the convolution neural network is as followsWherein, W l' represents the weight after updating, W l represents the weight before updating, α represents the learning rate, m represents the number of image training samples, i represents the input ith image training sample, δ i,l represents the sensitivity of the input ith image training sample at the first layer, a i,l-1 represents the output of the input ith image training sample at the first-1 layer, rot180 represents the operation of turning the matrix 180 degrees; updating the formula of the bias to/>B l' denotes the updated bias, b l denotes the pre-update bias α denotes the learning rate, m denotes the number of video training samples, i denotes the input i-th video training sample, and δ i,l denotes the sensitivity of the input i-th video training sample at the first layer. Wherein, (u, v) refers to the position of a small block (element constituting the convolution feature map) in each convolution feature map obtained when the convolution operation is performed.
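A toy numeric check of the weight update, abstracting the per-sample term $\delta^{i,l} * \mathrm{rot180}(a^{i,l-1})$ into precomputed gradients (all numbers illustrative):

```python
import numpy as np

alpha, m = 0.1, 2                                  # learning rate, batch size
W = np.array([[0.5, -0.2], [0.1, 0.3]])            # toy 2x2 kernel
per_sample_grads = [np.full((2, 2), 0.1),          # stands in for the term
                    np.full((2, 2), 0.3)]          # delta * rot180(a), i = 1, 2
W_new = W - (alpha / m) * sum(per_sample_grads)    # W' = W - (alpha/m) * sum_i
```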
In steps S2041-S2044, the server constructs an error function from the prediction results obtained by the image training samples in the convolutional neural network, back-propagates according to it, and updates the weights and biases; an image recognition model is thus obtained whose learned deep features of the image training samples allow lesion categories to be recognized accurately.
It should be understood that the sequence numbers of the steps in the foregoing embodiments do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and does not constitute any limitation on the implementation of the embodiments of the present invention.
In one embodiment, a medical image interpretation device is provided, corresponding one-to-one to the medical image interpretation method of the above embodiments. As shown in FIG. 8, the medical image interpretation device includes an image analysis request acquisition module 10, a feature map acquisition module 20, a predicted probability value acquisition module 30, a heat map acquisition module 40 and a target thermodynamic diagram acquisition module 50. The functional modules are described in detail as follows:
The image analysis request acquisition module 10 is configured to acquire an image analysis request, the image analysis request comprising a target medical image.
The feature map acquisition module 20 is configured to recognize the target medical image with a pre-trained image recognition model and obtain the feature maps output by the last convolutional layer of the image recognition model.
The predicted probability value acquisition module 30 is configured to obtain, based on the feature maps, the predicted probability value corresponding to each original lesion category output by the image recognition model.
The heat map acquisition module 40 is configured to determine the original lesion category with the largest predicted probability value as the target lesion category, acquire the map weights corresponding to the target lesion category, and perform class activation mapping on the feature maps and the map weights using an activation mapping formula to obtain a heat map, wherein the activation mapping formula is $M_c(x,y)=\sum_{k=1}^{K} w_k^c f_k(x,y)$, where $c$ is the target lesion category, $M_c(x,y)$ is the heat map corresponding to the target lesion category, $w_k^c$ is the map weight corresponding to the $k$-th feature map, $K$ is the number of feature maps, and $f_k(x,y)$ is the $k$-th feature map.
The target thermodynamic diagram acquisition module 50 is configured to superimpose the heat map on the target medical image to generate a target thermodynamic diagram.
In one embodiment, the image analysis request further includes a user type.
After the predicted probability value acquisition module 30, the medical image interpretation device further includes an original lesion category acquisition unit and a data display unit.
The original lesion category acquisition unit 301 is configured to, if the user type is the common user type, compare each predicted probability value with the probability threshold and obtain the target probability values greater than the probability threshold and the original lesion categories corresponding to them.
The data display unit 302 is configured to display the target probability values and their corresponding original lesion categories on the user side.
In one embodiment, after the image analysis request acquisition module 10, the medical image interpretation apparatus includes: a gray image acquisition unit, a variance value comparison unit, a first processing unit and a second processing unit.
The gray image acquisition unit is configured to acquire a gray image based on the target medical image.
The variance value comparison unit is configured to perform filtering processing on the gray image by using the Laplace variance algorithm, calculate the mean value and the variance value of the filtered image, and compare the variance value with a preset threshold.
The first processing unit is configured to, if the variance value is greater than the preset threshold, perform the step of identifying the target medical image by using the pre-trained image recognition model and obtaining the feature maps output by the last convolution layer in the image recognition model.
The second processing unit is configured to, if the variance value is not greater than the preset threshold, generate reminder information and feed the reminder information back to the user side.
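A minimal sketch of this sharpness gate (assuming OpenCV; the threshold of 100.0 is illustrative — the source only specifies a "preset threshold"):

```python
import cv2
import numpy as np

def passes_sharpness_check(gray: np.ndarray, preset_threshold: float = 100.0) -> bool:
    """Decide whether a gray image is sharp enough for recognition."""
    filtered = cv2.Laplacian(gray, cv2.CV_64F)        # Laplace filtering of the gray image
    mean, variance = filtered.mean(), filtered.var()  # mean and variance of the filtered image
    # A blurry image has few edges, so the variance of its Laplacian is small;
    # only images above the preset threshold proceed to the recognition model,
    # otherwise reminder information is fed back to the user side.
    return variance > preset_threshold
```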
In an embodiment, before the feature map acquisition module 20, the medical image interpretation apparatus further includes a historical medical image acquisition unit, an augmented image acquisition unit, an image training sample acquisition unit, and an image recognition model acquisition unit.
The historical medical image acquisition unit is configured to acquire historical medical images and mark lesions on them, so that each historical medical image carries a corresponding lesion label.
The augmented image acquisition unit is configured to perform augmentation processing on the historical medical images to obtain augmented images.
The image training sample acquisition unit is configured to perform normalization processing on the augmented images to obtain image training samples.
The image recognition model acquisition unit is configured to input the image training samples into a convolutional neural network for training and update the weights and biases of the convolutional neural network by using a back-propagation algorithm with stochastic gradient descent to obtain the image recognition model.
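The normalization step mentioned above is not specified further in the source; a common min-max scheme, shown here purely as an assumption, scales each augmented image to [0, 1]:

```python
import numpy as np

def to_training_sample(augmented_image: np.ndarray) -> np.ndarray:
    """Normalize one augmented image into an image training sample in [0, 1]."""
    img = augmented_image.astype(np.float32)
    span = img.max() - img.min()
    # Guard against constant images, which would otherwise divide by zero.
    return (img - img.min()) / span if span > 0 else np.zeros_like(img)
```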
In one embodiment, the augmented image acquisition unit includes a to-be-determined image acquisition subunit and an augmented image acquisition subunit.
The to-be-determined image acquisition subunit is configured to acquire preset augmentation conditions and perform augmentation processing on the historical medical image according to the preset augmentation conditions to obtain a to-be-determined image.
The augmented image acquisition subunit is configured to perform noise-adding processing on the to-be-determined image to obtain an augmented image.
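A sketch of these two subunits (the flip/rotation conditions and the Gaussian noise level are illustrative assumptions; the source only speaks of "preset augmentation conditions" and "noise-adding processing"):

```python
import numpy as np

def to_be_determined_image(historical: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Apply preset augmentation conditions (here: random horizontal flip and 90-degree rotation)."""
    image = np.fliplr(historical) if rng.random() < 0.5 else historical
    return np.rot90(image) if rng.random() < 0.5 else image

def augmented_image(to_be_determined: np.ndarray, rng: np.random.Generator, sigma: float = 2.0) -> np.ndarray:
    """Add Gaussian noise to the to-be-determined image to obtain an augmented image."""
    noisy = to_be_determined.astype(np.float32) + rng.normal(0.0, sigma, size=to_be_determined.shape)
    return np.clip(noisy, 0, 255).astype(np.uint8)
```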
In one embodiment, the image recognition model acquisition unit includes a neural network initialization subunit, a prediction result acquisition subunit, an error function construction subunit, and an image recognition model acquisition subunit.
The neural network initialization subunit is configured to initialize the convolutional neural network.
The prediction result acquisition subunit is configured to input the image training samples into the convolutional neural network for training and obtain the prediction results of the image training samples in the convolutional neural network, where the last pooling layer of the convolutional neural network is a global average pooling layer.
The error function construction subunit is configured to construct an error function according to the prediction results and the lesion labels, where the expression of the error function is $E=\frac{1}{2n}\sum_{i=1}^{n}\left(y_i-x_i\right)^{2}$, where n represents the total number of image training samples, $x_i$ represents the prediction result of the ith image training sample, and $y_i$ represents the lesion label of the ith image training sample corresponding to $x_i$.
The image recognition model acquisition subunit is configured to calculate the gradients by using a back-propagation algorithm according to the error function and update the weights and biases in the convolutional neural network by stochastic gradient descent to obtain the image recognition model.
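Putting these subunits together, a condensed sketch (assuming PyTorch; the layer sizes and learning rate are illustrative, and the squared-error expression mirrors the error function given above — the source fixes only the global average pooling layer, the error function, and weight/bias updates via back-propagation with stochastic gradient descent):

```python
import torch
import torch.nn as nn

class LesionCNN(nn.Module):
    """Illustrative convolutional neural network whose last pooling layer is global average pooling."""
    def __init__(self, num_categories: int):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 32, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.gap = nn.AdaptiveAvgPool2d(1)        # global average pooling layer
        self.fc = nn.Linear(64, num_categories)   # its rows supply the map weights w_k^c

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.fc(self.gap(self.features(x)).flatten(1))

model = LesionCNN(num_categories=5)                        # initialize the network
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)   # stochastic gradient descent

def train_step(samples: torch.Tensor, labels_onehot: torch.Tensor) -> float:
    n = samples.shape[0]
    preds = torch.softmax(model(samples), dim=1)            # prediction results x_i
    error = ((labels_onehot - preds) ** 2).sum() / (2 * n)  # E = (1/2n) * sum_i (y_i - x_i)^2
    optimizer.zero_grad()
    error.backward()                                        # back-propagation computes the gradients
    optimizer.step()                                        # update weights and biases
    return error.item()
```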
For specific limitations of the medical image interpretation apparatus, reference may be made to the above limitations of the medical image interpretation method, which are not repeated here. Each module in the medical image interpretation apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in hardware form in, or independent of, a processor in the computer device, or may be stored in software form in a memory in the computer device, so that the processor can call and execute the operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server whose internal structure may be as shown in fig. 9. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. The processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operating system and the computer program in the non-volatile storage medium to run. The database of the computer device is used to store data generated or acquired during the medical image interpretation method, for example, the image recognition model. The network interface of the computer device is used to communicate with an external terminal through a network connection. The computer program, when executed by the processor, implements a medical image interpretation method.
In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the medical image interpretation method in the above embodiment, for example, steps S10 to S50 shown in fig. 2 or the steps shown in fig. 3 to 7. Alternatively, the processor, when executing the computer program, implements the functions of the modules in the medical image interpretation apparatus of the above embodiment, for example, the functions of the modules 10 to 50 shown in fig. 8. To avoid repetition, no further description is provided here.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the medical image interpretation method in the above embodiment, for example, steps S10 to S50 shown in fig. 2 or the steps shown in fig. 3 to 7. Alternatively, the computer program, when executed by a processor, implements the functions of the modules in the medical image interpretation apparatus of the above embodiment, for example, the functions of the modules 10 to 50 shown in fig. 8. To avoid repetition, no further description is provided here.
Those skilled in the art will appreciate that implementing all or part of the above-described methods may be accomplished by a computer program, which may be stored in a non-transitory computer-readable storage medium and which, when executed, may include the flows of the above method embodiments. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. The non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. The volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above division of the functional units and modules is illustrated by example; in practical applications, the above functions may be allocated to different functional units and modules as needed, i.e., the internal structure of the apparatus may be divided into different functional units or modules to perform all or part of the functions described above.
The above embodiments are only intended to illustrate the technical solutions of the present invention, not to limit them. Although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that the technical solutions described in the foregoing embodiments may still be modified, or some technical features thereof may be replaced by equivalents; such modifications and replacements do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention and are intended to be included in the protection scope of the present invention.

Claims (6)

1. A medical image interpretation method, comprising:
acquiring an image analysis request, wherein the image analysis request comprises a target medical image and a user type;
identifying the target medical image by using a pre-trained image recognition model, and obtaining a feature map output by the last convolution layer in the image recognition model;
based on the feature map, obtaining a predicted probability value corresponding to each original lesion category output by the image recognition model, wherein the predicted probability value refers to the probability that the feature map of the target medical image belongs to each original lesion category;
determining the original lesion category with the maximum predicted probability value as a target lesion category, acquiring map weights corresponding to the target lesion category, and performing classification activation on the feature map and the map weights by using an activation mapping formula to obtain a heat map, wherein the activation mapping formula is $M_c(x,y)=\sum_{k=1}^{K} w_k^c f_k(x,y)$, c refers to the target lesion category, $M_c(x,y)$ refers to the heat map corresponding to the target lesion category, $w_k^c$ refers to the feature map weight corresponding to the kth feature map, K is the number of feature maps, and $f_k(x,y)$ refers to the kth feature map;
superimposing the heat map on the target medical image to generate a target heat map;
after the obtaining of the predicted probability value corresponding to each original lesion category output by the image recognition model, the medical image interpretation method further comprises:
if the user type is a common user type, comparing each predicted probability value with a probability threshold to obtain target probability values greater than the probability threshold and the original lesion categories corresponding to the target probability values;
displaying the target probability values and the original lesion categories corresponding to the target probability values on a user side;
after the acquiring of the image analysis request, the medical image interpretation method further comprises:
acquiring a gray level image based on the target medical image;
filtering the gray level image by using a Laplace variance algorithm, calculating the mean value and the variance value of the filtered image, and comparing the variance value with a preset threshold value;
if the variance value is greater than the preset threshold, performing the step of identifying the target medical image by using the pre-trained image recognition model and obtaining the feature map output by the last convolution layer in the image recognition model;
if the variance value is not greater than the preset threshold, generating reminder information and feeding the reminder information back to the user side;
before the identifying of the target medical image by using the pre-trained image recognition model, the medical image interpretation method further comprises:
acquiring a historical medical image and marking lesions on the historical medical image, wherein the historical medical image carries a corresponding lesion label;
performing augmentation processing on the historical medical image to obtain an augmented image;
performing normalization processing on the augmented image to obtain an image training sample;
inputting the image training sample into a convolutional neural network for training, and updating the weights and biases of the convolutional neural network by using a back-propagation algorithm with stochastic gradient descent to obtain the image recognition model.
2. The medical image interpretation method according to claim 1, wherein the performing augmentation processing on the historical medical image to obtain an augmented image comprises:
acquiring preset augmentation conditions, and performing augmentation processing on the historical medical image according to the preset augmentation conditions to obtain a to-be-determined image;
performing noise-adding processing on the to-be-determined image to obtain the augmented image.
3. The medical image interpretation method according to claim 1, wherein the inputting the image training sample into a convolutional neural network for training, and updating the weights and biases of the convolutional neural network by using a back-propagation algorithm with stochastic gradient descent to obtain the image recognition model comprises:
initializing the convolutional neural network;
inputting the image training sample into the convolutional neural network for training, and obtaining a prediction result of the image training sample in the convolutional neural network;
constructing an error function according to the prediction result and the lesion label, wherein the expression of the error function is $E=\frac{1}{2n}\sum_{i=1}^{n}\left(y_i-x_i\right)^{2}$, where n represents the total number of image training samples, $x_i$ represents the prediction result of the ith image training sample, and $y_i$ represents the lesion label of the ith image training sample corresponding to $x_i$;
and calculating the gradients by using a back-propagation algorithm according to the error function, and updating the weights and biases in the convolutional neural network by stochastic gradient descent, to obtain the image recognition model.
4. A medical image interpretation apparatus, comprising:
the image analysis request acquisition module is used for acquiring an image analysis request, wherein the image analysis request comprises a target medical image and a user type;
the feature map acquisition module is used for identifying the target medical image by using a pre-trained image recognition model and acquiring a feature map output by the last convolution layer in the image recognition model;
the predicted probability value acquisition module is used for acquiring, based on the feature map, a predicted probability value corresponding to each original lesion category output by the image recognition model, wherein the predicted probability value refers to the probability that the feature map of the target medical image belongs to each original lesion category;
the heat map acquisition module is used for determining the original lesion category with the maximum predicted probability value as a target lesion category, acquiring map weights corresponding to the target lesion category, and performing classification activation on the feature map and the map weights by using an activation mapping formula to acquire a heat map, wherein the activation mapping formula is $M_c(x,y)=\sum_{k=1}^{K} w_k^c f_k(x,y)$, c refers to the target lesion category, $M_c(x,y)$ refers to the heat map corresponding to the target lesion category, $w_k^c$ refers to the feature map weight corresponding to the kth feature map, K is the number of feature maps, and $f_k(x,y)$ refers to the kth feature map;
the target heat map acquisition module is used for superimposing the heat map on the target medical image to generate a target heat map;
the original lesion category acquisition unit is used for, if the user type is a common user type, comparing each predicted probability value with a probability threshold and obtaining target probability values greater than the probability threshold and the original lesion categories corresponding to the target probability values;
the data display unit is used for displaying the target probability values and the original lesion categories corresponding to the target probability values on a user side;
a gray image acquisition unit for acquiring a gray image based on the target medical image;
the variance value comparison unit is used for performing filtering processing on the gray image by using the Laplace variance algorithm, calculating the mean value and the variance value of the filtered image, and comparing the variance value with a preset threshold;
the first processing unit is used for, if the variance value is greater than the preset threshold, performing the identification of the target medical image by using the pre-trained image recognition model and acquiring the feature map output by the last convolution layer in the image recognition model;
the second processing unit is used for, if the variance value is not greater than the preset threshold, generating reminder information and feeding the reminder information back to the user side;
the historical medical image acquisition unit is used for acquiring a historical medical image and marking lesions on the historical medical image, wherein the historical medical image carries a corresponding lesion label;
the augmented image acquisition unit is used for performing augmentation processing on the historical medical image to acquire an augmented image;
the image training sample acquisition unit is used for performing normalization processing on the augmented image to acquire an image training sample;
the image recognition model acquisition unit is used for inputting the image training sample into a convolutional neural network for training, and updating the weights and biases of the convolutional neural network by using a back-propagation algorithm with stochastic gradient descent to acquire the image recognition model.
5. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the medical image interpretation method as claimed in any one of claims 1 to 3 when executing the computer program.
6. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the medical image interpretation method as claimed in any one of claims 1 to 3.