WO2020215676A1 - Image recognition method, device, equipment and storage medium based on residual network - Google Patents

Image recognition method, device, equipment and storage medium based on residual network

Info

Publication number
WO2020215676A1
Authority
WO
WIPO (PCT)
Prior art keywords
image
recognized
residual network
preset
resolution
Prior art date
Application number
PCT/CN2019/117426
Other languages
English (en)
French (fr)
Inventor
任嘉祥
马进
王健宗
Original Assignee
平安科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司
Publication of WO2020215676A1

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2415 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Definitions

  • This application relates to the field of information technology, and in particular to an image recognition method, device, equipment and storage medium based on residual network.
  • Pneumonia is a high-risk disease for infants and young children, accounting for more than 15% of all child deaths. In 2015, about 900,000 children under the age of 5 died from the disease. Accurately diagnosing pneumonia is therefore an important task, and also a difficult one.
  • In the existing technology, pneumonia is mainly confirmed through examination of chest X-rays by well-trained experts, combined with clinical history, vital signs and laboratory examinations. X-ray examination is the most frequently performed radiographic diagnosis method, and its importance is self-evident. In X-rays, pneumonia usually appears as areas of increased opacity.
  • the embodiments of the present application provide a residual network-based image recognition method, device, equipment, and storage medium to solve the problem of low recognition accuracy of target images containing pneumonia signals in the prior art.
  • An image recognition method based on residual network including:
  • the recognition result is output according to the predicted values of the two blocks, where the recognition result includes that the image to be recognized is a target image and the image to be recognized is a non-target image.
  • the performing preprocessing on the image to be recognized includes:
  • the normalized image to be recognized is expanded into a three-layer image.
  • the adjusting the image to be recognized to a preset resolution includes:
  • the image to be recognized is up-sampled to the resolution threshold.
  • the output of the recognition result according to the predicted values of the two blocks, wherein the recognition result includes that the image to be recognized is a target image and the image to be recognized is a non-target image includes:
  • the output recognition result is a non-target image.
  • a residual network which includes an input layer, a convolutional layer, a maximum pooling layer, 16 residual modules, a fully connected layer, and an output layer;
  • the image information with the preset label in the test sample set is used as an input vector to pass into the residual network obtained by the iterative training for testing.
  • a pneumonia recognition device based on residual network including:
  • the training module is used to construct a residual network, and use preset training samples to train the residual network;
  • the acquisition module is used to acquire the image to be recognized
  • the recognition module is used to divide the preprocessed image to be recognized into two non-overlapping blocks, which are passed as input to the residual network in turn, to obtain the predicted value of each block after it passes through the residual network;
  • the output module is configured to output the recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized as a target image and the image to be recognized as a non-target image.
  • the preprocessing module includes:
  • An adjustment unit configured to adjust the image to be recognized to a preset resolution
  • a normalization unit configured to perform normalization processing on each pixel value in the image to be recognized after the resolution adjustment
  • the expansion unit is used to expand the normalized image to be recognized into a three-layer image.
  • the adjustment unit includes:
  • the comparison subunit is used to compare the resolution of the image to be recognized with a preset resolution threshold
  • the down-sampling subunit is configured to down-sample the image to be identified to the resolution threshold when the resolution of the image to be identified is higher than a preset resolution threshold;
  • the up-sampling subunit is configured to up-sample the image to be identified to the resolution threshold when the resolution of the image to be identified is lower than the preset resolution threshold.
  • a computer device includes a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor, and the processor implements the following steps when the processor executes the computer-readable instructions:
  • the recognition result is output according to the predicted values of the two blocks, where the recognition result includes that the image to be recognized is a target image and the image to be recognized is a non-target image.
  • One or more non-volatile readable storage media storing computer readable instructions.
  • the computer readable instructions execute the following steps:
  • the recognition result is output according to the predicted values of the two blocks, where the recognition result includes that the image to be recognized is a target image and the image to be recognized is a non-target image.
  • FIG. 1 is a flowchart of an image recognition method based on a residual network in an embodiment of the present application.
  • FIG. 2 is a flowchart of step S101 in the image recognition method based on a residual network in an embodiment of the present application.
  • FIG. 3 is a schematic structural diagram of a residual network provided by an embodiment of the present application.
  • FIG. 4 is a flowchart of step S103 in the image recognition method based on a residual network in an embodiment of the present application.
  • FIG. 5 is a flowchart of step S401 in the image recognition method based on a residual network in an embodiment of the present application.
  • FIG. 6 is a flowchart of step S105 in the image recognition method based on a residual network in an embodiment of the present application.
  • FIG. 7 is a functional block diagram of an image recognition device based on a residual network in an embodiment of the present application.
  • FIG. 8 is a schematic diagram of a computer device in an embodiment of the present application.
  • the image recognition method based on residual network provided by the embodiment of the present application is applied to a server.
  • the server can be implemented by an independent server or a server cluster composed of multiple servers.
  • an image recognition method based on residual network is provided, which includes the following steps:
  • In step S101, a residual network is constructed, and preset training samples are used to train the residual network.
  • The deep neural network selected in the embodiment of the present application is a residual network (Residual Network, ResNet for short) with excellent classification performance.
  • the training process of the residual network will be described in detail below.
  • the step S101 includes:
  • a residual network is constructed, the residual network including an input layer, a convolutional layer, a maximum pooling layer, 16 residual modules, a fully connected layer, and an output layer.
  • FIG. 3 is a schematic structural diagram of a residual network provided by an embodiment of this application.
  • As shown in FIG. 3, the residual network includes an input layer, a convolutional layer, a maximum pooling layer, 16 residual modules, a fully connected layer, and an output layer.
  • the convolution kernel of the convolutional layer is 7*7, and the number of channels is 64.
  • the window of the maximum pooling layer is 3*3, and the step size (stride) is 2.
  • the 16 residual modules have the same structure. Each includes three convolutional layers: a first convolutional layer with a 1*1 convolution kernel, a second convolutional layer with a 3*3 convolution kernel, and a third convolutional layer with a 1*1 convolution kernel. Each convolutional layer also includes a batch normalization layer and an activation layer.
  • the dimension of the fully connected layer is 2.
  • the 16 residual modules are divided into four groups according to different channel numbers.
  • the first group includes 3 residual modules. In each residual module, the number of channels in the first convolutional layer is 64, the number of channels in the second convolutional layer is 64, and the number of channels in the third convolutional layer is 256.
  • the second group includes 4 residual modules. In each residual module, the number of channels in the first convolutional layer is 128, the number of channels in the second convolutional layer is 128, and the number of channels in the third convolutional layer is 512.
  • the third group includes 6 residual modules. In each residual module, the number of channels in the first convolutional layer is 256, the number of channels in the second convolutional layer is 256, and the number of channels in the third convolutional layer is 1024.
  • the fourth group includes 3 residual modules. In each residual module, the number of channels in the first convolutional layer is 512, the number of channels in the second convolutional layer is 512, and the number of channels in the third convolutional layer is 2048.
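The four groups described above can be written down as a small configuration table. The following is only a sketch: the names `ResidualGroup`, `RESIDUAL_GROUPS`, and the helper functions are hypothetical and chosen for illustration, while the module counts and channel numbers are the ones stated in the text. The 3/4/6/3 grouping of three-layer modules appears to match the commonly used 50-layer ResNet layout.

```python
from typing import List, NamedTuple


class ResidualGroup(NamedTuple):
    """One group of identically shaped residual modules (hypothetical name)."""
    modules: int         # number of residual modules in the group
    conv1_channels: int  # channels of the first 1*1 convolutional layer
    conv2_channels: int  # channels of the second 3*3 convolutional layer
    conv3_channels: int  # channels of the third 1*1 convolutional layer


# Values taken from the description of the four groups.
RESIDUAL_GROUPS: List[ResidualGroup] = [
    ResidualGroup(3, 64, 64, 256),
    ResidualGroup(4, 128, 128, 512),
    ResidualGroup(6, 256, 256, 1024),
    ResidualGroup(3, 512, 512, 2048),
]


def total_residual_modules(groups: List[ResidualGroup]) -> int:
    """Sum the residual modules over all groups."""
    return sum(group.modules for group in groups)


def weighted_layer_count(groups: List[ResidualGroup]) -> int:
    """Count the 7*7 convolutional layer, three convolutional layers per
    residual module, and the fully connected layer."""
    return 1 + 3 * total_residual_modules(groups) + 1


if __name__ == "__main__":
    print(total_residual_modules(RESIDUAL_GROUPS))
    print(weighted_layer_count(RESIDUAL_GROUPS))
```

Keeping the groups in one table makes it easy to check that the per-group counts add up to the 16 residual modules named in the text.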
  • In step S202, multiple pieces of image information of multiple designated users are collected as a training sample set and a test sample set.
  • the image recognition method based on the residual network provided in the embodiment of the present application can be used to recognize the target image containing the pneumonia signal through the residual network. Therefore, the designated user may be a patient with pneumonia, and the image information may be an X-ray image of the lung. More than 5000 X-ray images of multiple pneumonia patients may be collected in advance as the training sample set, and 500 X-ray images different from those in the training sample set may be selected as the test sample set.
  • In step S203, preprocessing is performed on each piece of image information in the training sample set and the test sample set.
  • The preprocessing of each piece of image information in the training sample set and the test sample set is the same as in the subsequent step S103 and includes resolution adjustment, normalization processing, and expansion. For details, refer to the description of the subsequent embodiments, which is not repeated here.
  • Each image information after preprocessing is a three-layer image with 1024*1024 pixels.
  • In step S204, each piece of preprocessed image information is divided into two non-overlapping blocks, and each block is labeled with a preset label; the preset label includes a first label and a second label.
  • the preset label is used to distinguish whether the image information contains a key signal. The first label indicates that the image information contains the key signal, and the second label indicates that the image information does not contain the key signal.
  • For an X-ray image of the lungs, the two blocks respectively represent the left lung image and the right lung image. Each block includes three layers of images, and each layer is 512*1024 pixels.
  • the two blocks are labeled with preset labels according to actual conditions. The preset label indicates whether the left lung image or the right lung image shows pneumonia: the first label indicates that the image contains a pneumonia signal, and the second label indicates that the image does not contain a pneumonia signal. If the left/right lung image shows pneumonia, the corresponding left/right block is given the first label; if it shows no pneumonia, the corresponding block is given the second label.
  • the first label and the second label may be represented by binary numbers 0 and 1. It should be understood that the above is only an example of the present application and is not used to limit the present application. In other embodiments, the preset label may also be represented by other forms of labels.
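The block division and labeling described above can be sketched as follows. This is an illustration under stated assumptions, not the applicant's implementation: the function names are hypothetical, a single layer is represented as a list of pixel rows, and the mapping of the first label to 1 and the second label to 0 is assumed, since the text only says that binary 0 and 1 may be used. "Left" and "right" here refer to halves of the image as stored.

```python
from typing import List, Sequence, Tuple

Image = List[List[float]]

# Assumed encoding: the text allows binary 0 and 1 but does not say which
# value stands for which label.
FIRST_LABEL = 1   # block contains the key (pneumonia) signal
SECOND_LABEL = 0  # block does not contain the key signal


def split_left_right(image: Sequence[Sequence[float]]) -> Tuple[Image, Image]:
    """Divide an image into two equal, non-overlapping left and right blocks."""
    width = len(image[0])
    if width % 2 != 0:
        raise ValueError("image width must be even to split into equal blocks")
    half = width // 2
    left = [list(row[:half]) for row in image]
    right = [list(row[half:]) for row in image]
    return left, right


def label_blocks(image: Sequence[Sequence[float]],
                 left_has_signal: bool,
                 right_has_signal: bool) -> List[Tuple[Image, int]]:
    """Return (block, label) pairs for the left and right blocks."""
    left, right = split_left_right(image)
    return [
        (left, FIRST_LABEL if left_has_signal else SECOND_LABEL),
        (right, FIRST_LABEL if right_has_signal else SECOND_LABEL),
    ]


if __name__ == "__main__":
    tiny = [[1, 2, 3, 4], [5, 6, 7, 8]]
    for block, label in label_blocks(tiny, True, False):
        print(block, label)
```

For a 1024*1024 image, each block produced this way is 512 pixels wide and 1024 pixels high, which agrees with the 512*1024 block size given in the text.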
  • the embodiment of the present application uses the image information with the preset labels as the input of the residual network and trains the residual network over several iterations. Each iteration trains on several pieces of image information, such as 30 X-ray images.
  • In step S205, several pieces of image information are obtained from the training sample set, and the two blocks with preset labels of each piece of image information are respectively passed as input vectors to the residual network for training.
  • the labeled sub-block is used as an input vector and passed into the preset residual network for training, and the recognition result of each sub-block is obtained.
  • the input dimension of the residual network is 512*1024*3, which is the size of a block of image information.
  • each block first passes through a 7*7*64 convolutional layer and a 3*3 maximum pooling layer with a step size of 2 to obtain a 256*512*64 feature map.
  • The feature map then passes through the four groups of residual modules in sequence: the first group changes it to a 128*256*256 feature map, the second group to a 64*128*512 feature map, the third group to a 32*64*1024 feature map, and the fourth group to a 16*32*2048 feature map.
  • Finally, the feature map passes through the fully connected layer of dimension 2 to obtain the predicted value of the block, and the output layer then uses the predicted value to determine whether the block is the target image.
  • When the image information is an X-ray image of the lungs, the predicted value is the score of the block being a target image containing a pneumonia signal. The larger the score, the greater the probability that the block includes a pneumonia signal.
  • the embodiment of the present application sets a prediction threshold in the output layer and compares the predicted value of the block with the prediction threshold. If the predicted value is greater than or equal to the prediction threshold, it is determined that the block is a target image containing a pneumonia signal; otherwise the block is a non-target image that does not contain a pneumonia signal.
  • Step S205 is performed on each of the several pieces of image information in the current training iteration until all of them have been traversed, and then step S206 is executed.
  • In step S206, a preset loss function is used to calculate the error between the recognition result of each block after passing through the residual network and the corresponding preset label, and the parameters of the residual network are modified according to the error.
  • The error is propagated back to modify the parameters of the convolutional layer and the residual modules in the residual network.
  • the embodiment of the present application adopts a cross-entropy loss function and applies a back-propagation algorithm to transmit the error back to each convolutional layer, so as to encourage it to continuously learn features until convergence.
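To make the cross-entropy error concrete, the sketch below computes it for the two-dimensional output of the fully connected layer. It is a minimal illustration, assuming that the two outputs are turned into probabilities with a softmax and that labels are class indices 0 and 1; the function names are hypothetical, and back-propagation itself is not shown.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Convert raw outputs into probabilities that sum to 1."""
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [value / total for value in exps]


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Cross-entropy error between one block's output and its label index."""
    return -math.log(softmax(logits)[label])


def batch_loss(batch_logits: Sequence[Sequence[float]],
               labels: Sequence[int]) -> float:
    """Mean cross-entropy over all blocks in one training batch."""
    losses = [cross_entropy(logits, label)
              for logits, label in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two blocks: one confidently correct, one confidently wrong.
    print(batch_loss([[4.0, 0.0], [4.0, 0.0]], [0, 1]))
```

The error is small when the output assigned to the labeled class dominates and grows as the network favors the wrong class, which is the signal that back-propagation uses to adjust the parameters.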
  • In step S207, several pieces of image information are obtained from the training sample set, and the two blocks with preset labels of each piece of image information are respectively passed to the residual network with modified parameters to perform the next training iteration.
  • the residual network after modifying the parameters in step S206 is used for the next training.
  • the embodiment of the present application first trains for 3000 iterations at a learning rate of 0.01, each iteration using 30 pieces of image information, and then continues training for 1000 iterations at a learning rate of 0.001.
  • The learning rate is an important parameter for training the residual network; it determines the size of each update to the parameters in the network model. The greater the learning rate, the faster the parameters in the model change.
  • In each training iteration, 30 pieces of image information with preset labels are randomly selected from the training sample set, and the two blocks of each piece of image information are passed as input vectors to the residual network with modified parameters for training.
  • the training process is the same as that of step S205. For details, please refer to the above description, which will not be repeated here. Steps S205 and S206 are repeated until the training at a learning rate of 0.01 and the training at a learning rate of 0.001 are both completed, so that the residual network learns the key features in the image information, such as the key features of the pneumonia signal in the lung X-ray images of the previous example, and a converged model is finally obtained.
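The two-phase training schedule described above can be sketched as a lookup from iteration index to learning rate, together with random batch selection. The iteration counts, learning rates, and batch size come from the text; the names `PHASES`, `learning_rate`, and `sample_batch`, and the choice of 0-based iteration indices, are assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")

BATCH_SIZE = 30  # pieces of image information per training iteration
# (number of iterations, learning rate) for each phase, as stated in the text.
PHASES: List[Tuple[int, float]] = [(3000, 0.01), (1000, 0.001)]


def learning_rate(iteration: int) -> float:
    """Return the learning rate for a 0-based training iteration."""
    if iteration < 0:
        raise ValueError("iteration must be non-negative")
    start = 0
    for count, rate in PHASES:
        if iteration < start + count:
            return rate
        start += count
    raise ValueError("iteration is beyond the end of the schedule")


def sample_batch(training_set: Sequence[T], rng: random.Random) -> List[T]:
    """Randomly select one batch of distinct samples from the training set."""
    return rng.sample(list(training_set), BATCH_SIZE)


if __name__ == "__main__":
    rng = random.Random(0)
    sample_ids = list(range(5000))  # stand-ins for labeled images
    for iteration in (0, 2999, 3000, 3999):
        batch = sample_batch(sample_ids, rng)
        print(iteration, learning_rate(iteration), len(batch))
```

Because each image contributes two blocks, a batch of 30 images corresponds to 60 inputs to the residual network per iteration.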
  • In step S208, after the iterative training reaches a preset number of times, the image information with the preset labels in the test sample set is passed as input vectors to the residual network obtained by the iterative training for testing.
  • the test sample set is several pieces of image information that do not overlap with the training sample set.
  • each labeled image information in the test sample set is passed into the residual network as an input vector for testing.
  • the testing process is the same as that of steps S204 and S205; for details, see the above description, which is not repeated here.
  • the criterion for passing the test is that the accuracy of the residual network's recognition results on the test sample set reaches the specified accuracy threshold. That is, if the proportion of image information in the test sample set whose recognition result matches the corresponding preset label reaches the specified accuracy threshold, each parameter in the residual network has been adjusted in place.
  • the specified threshold may be 90%.
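The pass criterion for the test can be sketched as a simple accuracy check. The 90% threshold is the example value given in the text; the function names and the representation of recognition results and labels as plain integer lists are assumptions for illustration.

```python
from typing import Sequence


def accuracy(recognized: Sequence[int], labels: Sequence[int]) -> float:
    """Fraction of test samples whose recognition result matches the label."""
    if len(recognized) != len(labels) or not labels:
        raise ValueError("inputs must be non-empty and of equal length")
    matches = sum(1 for result, label in zip(recognized, labels)
                  if result == label)
    return matches / len(labels)


def passes_test(recognized: Sequence[int],
                labels: Sequence[int],
                accuracy_threshold: float = 0.90) -> bool:
    """True when the accuracy reaches the specified accuracy threshold."""
    return accuracy(recognized, labels) >= accuracy_threshold


if __name__ == "__main__":
    labels = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
    recognized = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]  # one mismatch
    print(accuracy(recognized, labels), passes_test(recognized, labels))
```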
  • the trained residual network can be used to extract features of image information, which can effectively reduce noise interference and determine the target image with high accuracy.
  • the image recognition method based on residual network includes:
  • In step S102, an image to be recognized is acquired.
  • the image to be recognized may be an X-ray image of the lung, including image information of the left lung and the right lung.
  • the server may obtain the image to be recognized according to actual needs or application scenarios. For example, the server obtains the image to be recognized from a preset database in which a large number of X-ray images of the lungs are collected in advance. The server may also obtain the image to be recognized through an imaging device connected to the hospital. It is understandable that the server can also obtain the image to be recognized in a variety of ways, which will not be repeated here.
  • In step S103, preprocessing is performed on the image to be recognized.
  • Before training or using the residual network, the embodiment of the present application first preprocesses the image to be recognized to improve the recognition speed and accuracy of the residual network.
  • the step S103 performing preprocessing on the image to be recognized includes:
  • In step S401, the image to be recognized is adjusted to a preset resolution.
  • the embodiment of the present application first adjusts the resolution of the image to be recognized, so that the image to be recognized conforms to the input vector of the residual network.
  • the step S401 adjusting the image to be recognized to a preset resolution includes:
  • In step S501, the resolution of the image to be recognized is compared with a preset resolution threshold.
  • the preset resolution threshold is related to the input dimension of the residual network.
  • the input of the residual network is a block representing the left lung or the right lung, with dimension 512*1024*3, and the image to be recognized usually includes two lungs. Therefore, the preset resolution threshold is preferably 1024*1024 pixels.
  • the resolution of the image to be recognized is compared with the preset resolution threshold of 1024*1024 to determine whether the resolution of the image to be recognized is too high or too low.
  • In step S502, when the resolution of the image to be recognized is higher than a preset resolution threshold, the image to be recognized is down-sampled to the resolution threshold.
  • Down-sampling, also known as subsampling, is a multi-rate digital signal processing technique, that is, a process of reducing the signal sampling rate; it is usually used to reduce the data transmission rate or data size.
  • If the down-sampling coefficient is k, one point is taken every k points in each row and each column of the original image to form a new image, so that the resolution of the new image reaches the preset resolution.
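The down-sampling rule above, taking one point every k points in each row and column, can be sketched with slicing. This is a minimal illustration that assumes the original size is an exact integer multiple of the target size and applies no smoothing before sampling; the function names are hypothetical.

```python
from typing import List, Sequence


def downsampling_factor(size: int, target: int) -> int:
    """Coefficient k such that keeping every k-th point yields `target` points.

    Assumes `size` is an exact multiple of `target`.
    """
    if target <= 0 or size % target != 0:
        raise ValueError("size must be a positive multiple of target")
    return size // target


def downsample(image: Sequence[Sequence[float]], k: int) -> List[List[float]]:
    """Keep one point every k points in each row and each column."""
    if k < 1:
        raise ValueError("k must be at least 1")
    return [list(row[::k]) for row in image[::k]]


if __name__ == "__main__":
    image = [[1, 2, 3, 4],
             [5, 6, 7, 8],
             [9, 10, 11, 12],
             [13, 14, 15, 16]]
    k = downsampling_factor(len(image), 2)
    print(downsample(image, k))
```

For example, a 2048*2048 image would use k = 2 to reach the 1024*1024 threshold. Sizes that are not integer multiples would need a different resampling method, which the text does not specify.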
  • In step S503, when the resolution of the image to be recognized is lower than the preset resolution threshold, the image to be recognized is up-sampled to the resolution threshold.
  • Both up-sampling and down-sampling resample a digital signal.
  • Up-sampling is interpolation: if the up-sampling coefficient is k, k-1 points are inserted between points n and n+1 of the original image, dividing the interval between them into k parts.
  • the bilinear interpolation method is used to up-sample the image to be recognized to the preset resolution; that is, after the interpolation of each row of the image to be recognized is completed, interpolation is also performed for each column.
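The row-then-column interpolation described above can be sketched in plain Python. This is an illustrative sketch, not the applicant's code: the function names are hypothetical, and the first and last samples of each row and column are assumed to be kept fixed while new points are placed evenly between them, which is one common convention among several.

```python
from typing import List, Sequence


def upsample_1d(values: Sequence[float], new_length: int) -> List[float]:
    """Linearly interpolate a row or column to `new_length` points."""
    count = len(values)
    if count == 1 or new_length == 1:
        return [float(values[0])] * new_length
    result = []
    for index in range(new_length):
        # Position of the new point on the original sample grid.
        position = index * (count - 1) / (new_length - 1)
        low = min(int(position), count - 2)
        fraction = position - low
        result.append(values[low] * (1.0 - fraction)
                      + values[low + 1] * fraction)
    return result


def upsample_bilinear(image: Sequence[Sequence[float]],
                      new_height: int,
                      new_width: int) -> List[List[float]]:
    """Interpolate every row first, then every column of the result."""
    rows = [upsample_1d(row, new_width) for row in image]
    columns = [upsample_1d(column, new_height) for column in zip(*rows)]
    return [list(row) for row in zip(*columns)]


if __name__ == "__main__":
    for row in upsample_bilinear([[0, 2], [4, 6]], 3, 3):
        print(row)
```

Interpolating rows and then columns with linear weights gives the same result as weighting the four nearest original pixels directly, which is why the two-pass procedure is called bilinear.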
  • the resolution of the image to be recognized is adjusted so that the image to be recognized conforms to the input dimensions of the residual network, which is beneficial to improve the speed of the residual network in recognizing the image to be recognized.
  • In step S402, normalization processing is performed on each pixel value in the image to be recognized after the resolution is adjusted.
  • After adjusting the resolution of the image to be recognized, the embodiment of the present application performs normalization processing on the value of each pixel in the image to be recognized, that is, transforms the value of each pixel into the range [-1, 1].
  • the calculation formula for normalization processing is:
  • x represents the value of any pixel in the X-ray film image
  • x′ represents the normalized value of the pixel
  • the embodiment of the present application normalizes the value of each pixel of the resolution-adjusted image to be recognized, so that the data distribution in the image is more uniform, which helps speed up the recognition process of the residual network.
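The normalization formula itself is not reproduced in this text, so the following is only one possible mapping into [-1, 1], not the formula of the application. It assumes pixel values in the range 0 to `max_value` (255 for 8-bit images, which is an assumption) and uses the linear map x′ = 2x / max_value - 1; the function names are hypothetical.

```python
from typing import List, Sequence


def normalize_pixel(x: float, max_value: float = 255.0) -> float:
    """Map a pixel value from [0, max_value] linearly into [-1, 1].

    Assumed mapping; the source formula is not available in this text.
    """
    return 2.0 * x / max_value - 1.0


def normalize_image(image: Sequence[Sequence[float]],
                    max_value: float = 255.0) -> List[List[float]]:
    """Apply the assumed normalization to every pixel of a single layer."""
    return [[normalize_pixel(value, max_value) for value in row]
            for row in image]


if __name__ == "__main__":
    print(normalize_image([[0, 127.5, 255]]))
```

Any mapping used in practice has to be the same for training, testing, and recognition, since the residual network learns on the normalized value range.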
  • In step S403, the normalized image to be recognized is expanded into a three-layer image.
  • the embodiment of the present application further copies the image to be recognized and expands it into a three-layer image.
  • the resolution of the image to be recognized after the normalization processing is 1024*1024, so the image to be recognized obtained through step S402 is 1024*1024*1, and the image to be recognized after step S403 is expanded to three layers, namely 1024*1024*3.
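The expansion step can be sketched as copying the single layer into three identical layers. The sketch assumes a layout in which each pixel holds its three layer values together (height, width, layers); the text gives the 1024*1024*3 size but not the memory layout, and the function names are hypothetical.

```python
from typing import List, Sequence, Tuple


def expand_to_three_layers(
        image: Sequence[Sequence[float]]) -> List[List[List[float]]]:
    """Copy a single-layer image into three identical layers per pixel."""
    return [[[value, value, value] for value in row] for row in image]


def shape(image: Sequence[Sequence[Sequence[float]]]) -> Tuple[int, int, int]:
    """Return (rows, columns, layers) of an expanded image."""
    return len(image), len(image[0]), len(image[0][0])


if __name__ == "__main__":
    expanded = expand_to_three_layers([[0.5, -1.0], [0.0, 1.0]])
    print(shape(expanded))
    print(expanded[0])
```

Copying the layer adds no new information; it makes a grayscale image fit a network whose first convolutional layer expects three input layers.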
  • In step S104, the preprocessed image to be recognized is divided into two non-overlapping blocks, which are successively passed as input to the residual network, and the predicted value of each block after passing through the residual network is obtained.
  • the three-layer image is divided into two non-overlapping blocks, left and right.
  • the two blocks should be equal halves.
  • When the image to be recognized is an X-ray image of the lungs, including the image information of the left lung and the right lung, the two blocks represent the left lung image and the right lung image; each block includes three layers of images, and each layer is 512*1024 pixels.
  • each block is passed into the trained residual network for independent recognition.
  • the residual network predicts each block, and outputs the predicted value of each block.
  • the predicted value represents the score of the target image. The larger the score, the greater the probability that the block is classified as the target image; the smaller the score, the smaller that probability.
  • For lung X-ray images, the predicted value represents the score of the target image containing the pneumonia signal: the larger the score, the greater the probability that the block is classified as a target image containing a pneumonia signal, and the smaller the score, the smaller that probability.
  • In step S105, a recognition result is output according to the predicted values of the two blocks, where the recognition result is either that the image to be recognized is a target image or that the image to be recognized is a non-target image.
  • the embodiment of the present application comprehensively analyzes the predicted values of the two blocks to obtain the predicted value of the image to be recognized, and outputs the recognition result according to the predicted value of the image to be recognized.
  • the step S105 outputting a recognition result according to the predicted values of the two blocks includes:
  • In step S601, the predicted values of the two blocks are compared, and the larger of the predicted values is selected as the predicted value of the image to be recognized.
  • the predicted value of each block represents the probability that the block is the target image containing the pneumonia signal.
  • the present application compares the predicted values of the two blocks, and selects the larger predicted value as the predicted value of the image to be recognized.
  • In step S602, the predicted value of the image to be recognized is compared with a preset prediction threshold.
  • the prediction threshold is set in advance based on experience.
  • the prediction threshold is a criterion for determining whether the image to be recognized is a target image containing a pneumonia signal. After the predicted value of the image to be recognized is obtained, the predicted value is compared with the predicted threshold.
  • In step S603, if the predicted value of the image to be recognized is greater than or equal to the prediction threshold, the recognition result is output as the target image.
  • the predicted value represents the score of the target image containing the pneumonia signal in the block, and the larger the score, the greater the probability that the block contains the pneumonia signal.
  • When the predicted value of the image to be recognized is greater than or equal to the prediction threshold, it is determined that pneumonia is suspected in the image to be recognized, and the recognition result is output as the target image.
  • In step S604, if the predicted value of the image to be recognized is less than the prediction threshold, the recognition result is output as a non-target image.
  • When the predicted value of the image to be recognized is less than the prediction threshold, it is determined that no pneumonia is suspected in the image to be recognized, and the recognition result is output as a non-target image.
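Steps S601 to S604 can be summarized in a few lines. This is a sketch with hypothetical names; the result strings and the threshold of 0.5 used in the example are assumptions, since the text only says that the prediction threshold is set in advance based on experience.

```python
TARGET = "target image"
NON_TARGET = "non-target image"


def recognize(left_prediction: float,
              right_prediction: float,
              prediction_threshold: float) -> str:
    """Combine the predicted values of the two blocks into one result."""
    # S601: the larger block prediction becomes the image prediction.
    image_prediction = max(left_prediction, right_prediction)
    # S602 and S603: at or above the threshold means a target image.
    if image_prediction >= prediction_threshold:
        return TARGET
    # S604: below the threshold means a non-target image.
    return NON_TARGET


if __name__ == "__main__":
    # 0.5 is an illustrative threshold, not a value from the text.
    print(recognize(0.2, 0.7, 0.5))
    print(recognize(0.2, 0.3, 0.5))
```

Taking the maximum means the image is flagged as soon as either block scores at or above the threshold, so a signal confined to one lung is not averaged away by a clear other lung.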
  • the embodiment of the present application divides the image to be recognized into two blocks and inputs them into the residual network for prediction, which, while retaining sufficient clarity, can reduce the calculation cost, reduce the training time, and improve the prediction efficiency of the residual network.
  • the embodiment of the present application preprocesses the image to be recognized, and then extracts the key features of the preprocessed image to be recognized through the residual network to predict the predicted value of the image to be recognized.
  • the predicted value represents the score of the target image containing the pneumonia signal in the block. The larger the score, the greater the probability that the block contains the pneumonia signal. Finally, the predicted value is compared with a preset threshold, and whether the image to be recognized is the target image is determined according to the result of the comparison. The target image containing the pneumonia signal is thus identified based on the residual network; extracting the key features through the residual network can reduce noise interference, determine with high accuracy whether pneumonia is present, and improve the accuracy of pneumonia prediction.
  • an image recognition device based on a residual network is provided, and the device corresponds one-to-one to the image recognition method based on the residual network in the foregoing embodiment.
  • the image recognition device based on residual network includes a training module, an acquisition module, a preprocessing module, a recognition module, and an output module. The detailed description of each functional module is as follows:
  • the training module 71 is configured to construct a residual network, and use preset training samples to train the residual network;
  • the obtaining module 72 is used to obtain the image to be recognized
  • the preprocessing module 73 is configured to perform preprocessing on the image to be recognized
  • the recognition module 74 is configured to divide the preprocessed image to be recognized into two non-overlapping blocks, which are successively passed into the residual network as input to obtain the image of each block after passing through the residual network Predictive value;
  • the output module 75 is configured to output a recognition result according to the predicted values of the two blocks, where the recognition result includes the image to be recognized as a target image and the image to be recognized as a non-target image.
  • the preprocessing module 73 includes:
  • An adjustment unit configured to adjust the image to be recognized to a preset resolution
  • a normalization unit configured to perform normalization processing on each pixel value in the image to be recognized after the resolution adjustment
  • the expansion unit is used to expand the normalized image to be recognized into a three-layer image.
  • the adjustment unit includes:
  • the comparison subunit is used to compare the resolution of the image to be recognized with a preset resolution threshold
  • the down-sampling subunit is configured to down-sample the image to be identified to the resolution threshold when the resolution of the image to be identified is higher than a preset resolution threshold;
  • the up-sampling subunit is configured to up-sample the image to be identified to the resolution threshold when the resolution of the image to be identified is lower than the preset resolution threshold.
  • the output module 75 includes:
  • a first comparison unit configured to compare the predicted values of the two blocks, and select the larger of the predicted values as the predicted value of the image to be recognized;
  • a second comparison unit configured to compare the prediction value of the image to be recognized with a preset prediction threshold
  • the first output unit is configured to output the recognition result as the target image if the predicted value of the image to be recognized is greater than or equal to the predicted threshold;
  • the second output unit is configured to output the recognition result as a non-target image if the predicted value of the image to be recognized is less than the predicted threshold.
  • the training module 71 includes:
  • a construction unit for constructing a residual network including an input layer, a convolution layer, a maximum pooling layer, 16 residual modules, a fully connected layer, and an output layer;
  • the collection unit is used to collect multiple image information of multiple designated users as a training sample set and a test sample set;
  • a preprocessing unit configured to perform preprocessing on each piece of image information in the training sample set and the test sample set;
  • the label unit is used to divide each piece of preprocessed image information into two non-overlapping blocks, and to label each block with a preset label, the preset label including a first label and a second label ;
  • a training unit configured to obtain several pieces of image information from the training sample set, and use two blocks of each piece of image information with preset labels as input vectors to the residual network for training;
  • the correction unit is configured to use a preset loss function to calculate the error between the recognition result of each block after it passes through the residual network and the corresponding preset label, and to modify the parameters of the residual network according to the error;
  • the iterative unit is used to obtain several pieces of image information from the training sample set, and pass the two blocks with preset labels of each piece of image information into the residual network after parameter modification to perform the next iteration training ;
  • the test unit is configured to, after the iterative training reaches the preset number of iterations, pass the labeled image information in the test sample set as input vectors into the residual network obtained by the iterative training for testing.
  • each module in the above-mentioned image recognition device based on a residual network can be implemented in whole or in part by software, hardware, or a combination thereof.
  • the foregoing modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to the foregoing modules.
  • a computer device is provided.
  • the computer device may be a server, and its internal structure diagram may be as shown in FIG. 8.
  • the computer equipment includes a processor, a memory, a network interface and a database connected through a system bus.
  • the processor of the computer device is used to provide calculation and control capabilities.
  • the memory of the computer device includes a non-volatile storage medium and an internal memory.
  • the non-volatile storage medium stores an operating system, computer readable instructions, and a database.
  • the internal memory provides an environment for the operation of the operating system and computer-readable instructions in the non-volatile storage medium.
  • the network interface of the computer device is used to communicate with an external terminal through a network connection.
  • the computer-readable instructions are executed by the processor to realize an image recognition method based on the residual network.
  • a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and capable of running on the processor; the processor implements the following steps when executing the computer-readable instructions:
  • the recognition result is output according to the predicted values of the two blocks, where the recognition result includes that the image to be recognized is a target image and the image to be recognized is a non-target image.
  • one or more non-volatile readable storage media storing computer-readable instructions are provided.
  • when the computer-readable instructions are executed by one or more processors, the one or more processors perform the following steps:
  • the recognition result is output according to the predicted values of the two blocks, where the recognition result includes that the image to be recognized is a target image and the image to be recognized is a non-target image.
  • Non-volatile memory may include read only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory.
  • Volatile memory may include random access memory (RAM) or external cache memory.
  • RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Image Analysis (AREA)

Abstract

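An image recognition method, apparatus, device, and storage medium based on a residual network. The method includes: constructing a residual network, and training the residual network with preset training samples (S101); acquiring an image to be recognized (S102); preprocessing the image to be recognized (S103); dividing the preprocessed image to be recognized into two non-overlapping blocks, passing the blocks into the residual network in turn as input, and obtaining a predicted value of each block after it passes through the residual network (S104); and outputting a recognition result according to the predicted values of the two blocks, the recognition result including the image to be recognized being a target image and the image to be recognized being a non-target image (S105). Extracting key features through the residual network reduces noise interference and allows target images to be identified with high accuracy. Applying the image recognition method to lung X-ray images makes it possible to identify target images containing pneumonia signals based on the residual network, effectively improving the accuracy of pneumonia prediction.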

Description

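Image recognition method, apparatus, device, and storage medium based on a residual network
This application is based on, and claims priority to, Chinese invention patent application No. 201910345031.3, filed on April 26, 2019 and entitled "Image recognition method, apparatus, device, and storage medium based on a residual network".
Technical Field
This application relates to the field of information technology, and in particular to an image recognition method, apparatus, device, and storage medium based on a residual network.
Background
Pneumonia is a high-risk disease for infants and young children, accounting for more than 15% of all child deaths. In 2015, about 900,000 children under the age of five died of the disease. Accurately diagnosing pneumonia is therefore a difficult task. In the prior art, a diagnosis is mainly confirmed by trained experts who review chest X-rays and combine them with the clinical history, vital signs, and laboratory tests. X-ray diagnosis is the most commonly performed form of radiological imaging diagnosis, and its importance is self-evident. On an X-ray, pneumonia usually appears as an area of increased opacity. However, many other lung conditions, such as pulmonary edema, bleeding, atelectasis or collapse, lung cancer, or changes after radiotherapy or surgery, also affect the assessment of pneumonia on an X-ray; outside the lungs, fluid in the pleural cavity, such as pleural effusion, likewise appears as increased opacity on an X-ray. All of this reduces the accuracy of identifying target images containing pneumonia signals from X-ray images.
Therefore, finding a method that improves the accuracy of identifying target images containing pneumonia signals from X-ray images has become an urgent problem for those skilled in the art.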
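Summary
The embodiments of this application provide an image recognition method, apparatus, device, and storage medium based on a residual network, to solve the problem that the prior art identifies target images containing pneumonia signals with low accuracy.
An image recognition method based on a residual network includes:
constructing a residual network, and training the residual network with preset training samples;
acquiring an image to be recognized;
preprocessing the image to be recognized;
dividing the preprocessed image to be recognized into two non-overlapping blocks, passing the blocks into the residual network in turn as input, and obtaining a predicted value of each block after it passes through the residual network;
outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.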
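Further, preprocessing the image to be recognized includes:
adjusting the image to be recognized to a preset resolution;
normalizing each pixel value in the resolution-adjusted image to be recognized;
expanding the normalized image to be recognized into a three-layer image.
Further, adjusting the image to be recognized to a preset resolution includes:
comparing the resolution of the image to be recognized with a preset resolution threshold;
when the resolution of the image to be recognized is higher than the preset resolution threshold, downsampling the image to be recognized to the resolution threshold;
when the resolution of the image to be recognized is lower than the preset resolution threshold, upsampling the image to be recognized to the resolution threshold.
Further, outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image, includes:
comparing the predicted values of the two blocks, and selecting the larger predicted value as the predicted value of the image to be recognized;
comparing the predicted value of the image to be recognized with a preset prediction threshold;
if the predicted value of the image to be recognized is greater than or equal to the prediction threshold, outputting the recognition result as a target image;
if the predicted value of the image to be recognized is less than the prediction threshold, outputting the recognition result as a non-target image.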
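Further, constructing a residual network and training the residual network with preset training samples includes:
constructing a residual network, the residual network including an input layer, a convolutional layer, a max pooling layer, 16 residual modules, a fully connected layer, and an output layer;
collecting multiple pieces of image information of multiple designated users as a training sample set and a test sample set;
preprocessing each piece of image information in the training sample set and the test sample set;
dividing each piece of preprocessed image information into two non-overlapping blocks, and labeling each block with a preset label, the preset label including a first label and a second label;
obtaining several pieces of image information from the training sample set, and passing the two labeled blocks of each piece of image information into the residual network as separate input vectors for training;
using a preset loss function to calculate the error between the recognition result of each block after it passes through the residual network and the corresponding preset label, and modifying the parameters of the residual network according to the error;
obtaining several pieces of image information from the training sample set, and passing the two labeled blocks of each piece of image information into the parameter-modified residual network to perform the next training iteration;
after the iterative training reaches a preset number of iterations, passing the labeled image information in the test sample set as input vectors into the residual network obtained by the iterative training for testing.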
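A pneumonia recognition apparatus based on a residual network includes:
a training module, configured to construct a residual network and train the residual network with preset training samples;
an acquisition module, configured to acquire an image to be recognized;
a preprocessing module, configured to preprocess the image to be recognized;
a recognition module, configured to divide the preprocessed image to be recognized into two non-overlapping blocks, pass the blocks into the residual network in turn as input, and obtain a predicted value of each block after it passes through the residual network;
an output module, configured to output a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
Further, the preprocessing module includes:
an adjustment unit, configured to adjust the image to be recognized to a preset resolution;
a normalization unit, configured to normalize each pixel value in the resolution-adjusted image to be recognized;
an expansion unit, configured to expand the normalized image to be recognized into a three-layer image.
Further, the adjustment unit includes:
a comparison subunit, configured to compare the resolution of the image to be recognized with a preset resolution threshold;
a downsampling subunit, configured to downsample the image to be recognized to the resolution threshold when the resolution of the image to be recognized is higher than the preset resolution threshold;
an upsampling subunit, configured to upsample the image to be recognized to the resolution threshold when the resolution of the image to be recognized is lower than the preset resolution threshold.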
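A computer device includes a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
constructing a residual network, and training the residual network with preset training samples;
acquiring an image to be recognized;
preprocessing the image to be recognized;
dividing the preprocessed image to be recognized into two non-overlapping blocks, passing the blocks into the residual network in turn as input, and obtaining a predicted value of each block after it passes through the residual network;
outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
One or more non-volatile readable storage media store computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
constructing a residual network, and training the residual network with preset training samples;
acquiring an image to be recognized;
preprocessing the image to be recognized;
dividing the preprocessed image to be recognized into two non-overlapping blocks, passing the blocks into the residual network in turn as input, and obtaining a predicted value of each block after it passes through the residual network;
outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
The details of one or more embodiments of this application are set forth in the drawings and description below; other features and advantages of this application will become apparent from the description, the drawings, and the claims.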
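Brief Description of the Drawings
To describe the technical solutions of the embodiments of this application more clearly, the drawings needed for describing the embodiments are briefly introduced below. Evidently, the drawings described below are merely some embodiments of this application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart of the image recognition method based on a residual network in an embodiment of this application;
FIG. 2 is a flowchart of step S101 of the image recognition method based on a residual network in an embodiment of this application;
FIG. 3 is a schematic structural diagram of the residual network provided by an embodiment of this application;
FIG. 4 is a flowchart of step S103 of the image recognition method based on a residual network in an embodiment of this application;
FIG. 5 is a flowchart of step S401 of the image recognition method based on a residual network in an embodiment of this application;
FIG. 6 is a flowchart of step S105 of the image recognition method based on a residual network in an embodiment of this application;
FIG. 7 is a functional block diagram of the image recognition apparatus based on a residual network in an embodiment of this application;
FIG. 8 is a schematic diagram of the computer device in an embodiment of this application.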
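Detailed Description
The technical solutions in the embodiments of this application are described clearly and completely below with reference to the drawings in the embodiments. Evidently, the described embodiments are some, rather than all, of the embodiments of this application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of this application without creative effort fall within the protection scope of this application.
The image recognition method based on a residual network provided by the embodiments of this application is applied to a server. The server may be implemented as a standalone server or as a server cluster composed of multiple servers. In one embodiment, as shown in FIG. 1, an image recognition method based on a residual network is provided, including the following steps:
In step S101, a residual network is constructed and trained with preset training samples.
Here, the deep neural network selected in the embodiments of this application is the residual network ResNet (Residual Network), which has excellent classification performance. For ease of understanding, the training process of the residual network is described in detail below. As shown in FIG. 2, step S101 includes:
In step S201, a residual network is constructed, the residual network including an input layer, a convolutional layer, a max pooling layer, 16 residual modules, a fully connected layer, and an output layer.
FIG. 3 is a schematic structural diagram of the residual network provided by the embodiments of this application. The residual network includes an input layer, a convolutional layer, a max pooling layer, 16 residual modules, a fully connected layer, and an output layer. The convolutional layer has a 7*7 convolution kernel and 64 channels. The max pooling layer has a 3*3 window and a stride of 2. The 16 residual modules have the same structure: each includes three convolutional layers, namely a first convolutional layer with a 1*1 kernel, a second convolutional layer with a 3*3 kernel, and a third convolutional layer with a 1*1 kernel, and each convolutional layer is followed by a batch normalization layer and an activation layer. The dimension of the fully connected layer is 2.
Further, the 16 residual modules are divided into four groups according to their channel counts. The first group includes 3 residual modules; in each, the first convolutional layer has 64 channels, the second has 64 channels, and the third has 256 channels. The second group includes 4 residual modules; in each, the first convolutional layer has 128 channels, the second has 128 channels, and the third has 512 channels. The third group includes 6 residual modules; in each, the first convolutional layer has 256 channels, the second has 256 channels, and the third has 1024 channels. The fourth group includes 3 residual modules; in each, the first convolutional layer has 512 channels, the second has 512 channels, and the third has 2048 channels.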
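The grouping described above can be written down as a small configuration table, which makes it easy to check that the four groups add up to 16 residual modules. This is a minimal sketch: the names `GROUPS`, `count_modules`, and `count_weighted_layers` are hypothetical and do not come from the application. The 3/4/6/3 grouping with these channel counts matches the layout commonly known as ResNet-50, although the application does not use that name.

```python
# Illustrative configuration only; all names here are hypothetical.
STEM = {"kernel": 7, "channels": 64}   # first convolutional layer
POOL = {"window": 3, "stride": 2}      # max pooling layer
KERNELS = (1, 3, 1)                    # kernel sizes inside every residual module

# (number of residual modules, channels of the three convolutional layers)
GROUPS = [
    (3, (64, 64, 256)),
    (4, (128, 128, 512)),
    (6, (256, 256, 1024)),
    (3, (512, 512, 2048)),
]


def count_modules(groups):
    """Total number of residual modules across all groups."""
    return sum(n for n, _ in groups)


def count_weighted_layers(groups):
    """One stem convolution + three convolutions per module + one fully connected layer."""
    return 1 + 3 * count_modules(groups) + 1


if __name__ == "__main__":
    print(count_modules(GROUPS), count_weighted_layers(GROUPS))
```

Counting only convolutional and fully connected layers, the table gives 50 weighted layers.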
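In step S202, multiple pieces of image information of multiple designated users are collected as a training sample set and a test sample set.
Optionally, as a preferred example of the invention, the image recognition method based on a residual network provided by the embodiments of this application can be used to identify target images containing pneumonia signals. Accordingly, the designated users may be pneumonia patients and the image information may be lung X-ray images; more than 5,000 X-ray images of multiple pneumonia patients may be collected in advance as the training sample set, and 500 X-ray images different from those in the training sample set may be selected as the test sample set.
In step S203, each piece of image information in the training sample set and the test sample set is preprocessed.
Here, the preprocessing of each piece of image information in the training sample set and the test sample set is the same as in the subsequent step S103 and includes resolution adjustment, normalization, and expansion; see the description of the subsequent embodiment for details, which are not repeated here. After preprocessing, each piece of image information is a three-layer image of 1024*1024 pixels.
In step S204, each piece of preprocessed image information is divided into two non-overlapping blocks, and each block is labeled with a preset label, the preset label including a first label and a second label.
After the three-layer image is obtained, it is divided into left and right halves to form two non-overlapping blocks. The two blocks should be two equal parts. Each block is then labeled with a preset label. The preset label distinguishes whether the image information contains the key signal and includes a first label and a second label: the first label indicates that the image information contains the key signal, and the second label indicates that it does not.
As described above, when the image information is a lung X-ray image, the two blocks represent the left lung image and the right lung image respectively; each block includes three layers, and each layer is 512*1024 pixels. The two blocks are then labeled according to the actual situation. The preset label indicates whether the left lung image and the right lung image show pneumonia. Here, the first label indicates that the image contains a pneumonia signal, and the second label indicates that it does not. If the left or right lung image shows pneumonia, the corresponding left or right block is given the first label; if it does not, the corresponding block is given the second label.
Optionally, the first label and the second label may be represented by the binary digits 0 and 1. It should be understood that the above is only an example of this application and is not intended to limit it; in other embodiments, the preset label may be represented in other forms.
After the labels are set, the embodiments of this application train the residual network several times using the labeled image information as its input. Each training pass uses several pieces of image information, for example 30 X-ray images.
In step S205, several pieces of image information are obtained from the training sample set, and the two labeled blocks of each piece of image information are passed into the residual network as separate input vectors for training.
During training, for the two blocks of each piece of image information, each labeled block is passed as one input vector into the preset residual network for training, yielding a recognition result for each block.
The input dimension of the residual network is 512*1024*3, that is, the size of one block of the image information. In the residual network, each block first passes through the 7*7*64 convolutional layer and the 3*3 max pooling layer with stride 2, yielding a 256*512*64 feature map. The feature map then passes through the four groups of residual modules in turn: it becomes a 128*256*256 feature map after the first group, a 64*128*512 feature map after the second group, a 32*64*1024 feature map after the third group, and a 16*32*2048 feature map after the fourth group. Finally, the fully connected layer of dimension 2 produces the predicted value of the block, and the output layer uses the predicted value to determine whether the block is a target image. Continuing the earlier example, when the image information is a lung X-ray image, the predicted value represents the score that the block is a target image containing a pneumonia signal; the larger the score, the greater the probability that the block contains a pneumonia signal. After the predicted value of the block is obtained, the embodiments of this application set a prediction threshold in the output layer and compare the predicted value of the block with it: if the predicted value is greater than or equal to the prediction threshold, the block is determined to be a target image containing a pneumonia signal; otherwise, the block is a non-target image that does not contain a pneumonia signal.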
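The feature-map sizes quoted above can be reproduced with simple arithmetic. The sketch below is only a consistency check under stated assumptions: the application does not give the stride or padding of the 7*7 convolution, so the code assumes that the convolution keeps the spatial size, that the stride-2 pooling halves it, and that each residual group halves it once more, because those choices reproduce the stated numbers. The function name `trace_shapes` is hypothetical.

```python
def trace_shapes(width=512, height=1024):
    """Return (stage, (width, height, channels)) pairs for one block.

    Assumptions (not stated in the source): the 7*7 convolution preserves the
    spatial size, the stride-2 max pooling halves it, and every residual group
    halves it again while changing the channel count.
    """
    shapes = [("input", (width, height, 3))]

    # 7*7 convolution with 64 channels, then 3*3 max pooling with stride 2.
    width, height = width // 2, height // 2
    shapes.append(("conv + max pool", (width, height, 64)))

    # Output channels of the four residual groups.
    for index, channels in enumerate((256, 512, 1024, 2048), start=1):
        width, height = width // 2, height // 2
        shapes.append((f"group {index}", (width, height, channels)))

    return shapes


if __name__ == "__main__":
    for stage, shape in trace_shapes():
        print(f"{stage:16s} {shape}")
```

Under these assumptions the last feature map is 16*32*2048, as in the text; a different stride layout with the same overall reduction factor of 32 would end at the same size.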
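Step S205 is performed on each of the pieces of image information in this training pass until all of them have been traversed. Step S206 is then performed.
In step S206, a preset loss function is used to calculate the error between the recognition result of each block after it passes through the residual network and the corresponding preset label, and the parameters of the residual network are modified according to the error.
After one training pass is completed and the recognition result for each block of the pieces of image information is obtained, a preset loss function is used to calculate the error between each block's recognition result and its preset label, and the error is propagated back to modify the parameters of the convolutional layers and residual modules in the residual network. Optionally, the embodiments of this application use a cross-entropy loss function and apply the backpropagation algorithm to pass the error back to each convolutional layer, prompting the network to keep learning features until it converges.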
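As an illustration of the cross-entropy loss mentioned above, the following sketch computes it for the two outputs of the fully connected layer. It rests on assumptions the application does not state: the two outputs are treated as logits passed through a softmax, and class index 1 is taken to mean "contains the signal". The function names are hypothetical, and backpropagation itself is not shown.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a short list of logits."""
    peak = max(logits)
    exps = [math.exp(value - peak) for value in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy(logits, label):
    """Cross-entropy of one block: -log(probability assigned to the true class)."""
    return -math.log(softmax(logits)[label])


def batch_loss(batch_logits, labels):
    """Mean cross-entropy over all blocks in one training pass."""
    losses = [cross_entropy(l, y) for l, y in zip(batch_logits, labels)]
    return sum(losses) / len(losses)


if __name__ == "__main__":
    # Two blocks: one confidently correct, one confidently wrong.
    print(batch_loss([[0.0, 4.0], [4.0, 0.0]], [1, 1]))
```

The loss is small when the network assigns a high probability to the labeled class and grows as that probability falls, which is the error signal that training then reduces.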
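In step S207, several pieces of image information are obtained from the training sample set, and the two labeled blocks of each piece of image information are passed into the parameter-modified residual network to perform the next training iteration.
The residual network whose parameters were modified in step S206 is used for the next training pass. Here, the embodiments of this application first train 3,000 times with a learning rate of 0.01, each pass including 30 pieces of image information, and then continue training 1,000 times with a learning rate of 0.001. The learning rate is an important parameter in training a residual network; it is defined as the magnitude of the parameter updates in the network model. The larger the learning rate, the faster the model's parameters change.
Therefore, in each training pass, 30 pieces of labeled image information are randomly selected from the training sample set, and the two blocks of each piece are passed in turn as input vectors into the parameter-modified residual network for training. The training process is the same as in step S205; see the description above, which is not repeated here. Steps S205 and S206 are iterated until the training with a learning rate of 0.01 and the training with a learning rate of 0.001 are both complete, so that the residual network learns the key features in the image information, such as the key features of pneumonia signals in the lung X-ray images of the earlier example, and a converged model is finally obtained.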
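The two-stage schedule above (3,000 passes at 0.01, then 1,000 passes at 0.001, with 30 randomly chosen images per pass) can be sketched as follows. The function names and the zero-based pass index are illustrative assumptions, and sampling without replacement within one pass is also an assumption, since the application only says the images are selected randomly.

```python
import random

BATCH_SIZE = 30          # images per training pass, from the description
STAGES = [(3000, 0.01),  # (number of passes, learning rate)
          (1000, 0.001)]


def learning_rate(iteration):
    """Learning rate for a zero-based training pass index."""
    start = 0
    for passes, rate in STAGES:
        if iteration < start + passes:
            return rate
        start += passes
    raise ValueError("iteration is beyond the end of the schedule")


def sample_batch(training_set, batch_size=BATCH_SIZE, rng=random):
    """Randomly pick the images for one pass (without replacement, an assumption)."""
    return rng.sample(training_set, batch_size)


if __name__ == "__main__":
    image_ids = list(range(5000))
    print(learning_rate(0), learning_rate(3000), len(sample_batch(image_ids)))
```

Each selected image contributes two labeled blocks, so one pass feeds 60 blocks through the network.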
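In step S208, after the iterative training reaches the preset number of iterations, the labeled image information in the test sample set is passed as input vectors into the residual network obtained by the iterative training for testing.
Here, the test sample set consists of several pieces of image information that do not overlap with the training sample set. After the residual network has been trained, each piece of labeled image information in the test sample set is passed as an input vector into the residual network for testing. The testing process is the same as in steps S204 and S205; see the description above, which is not repeated here. The test is passed when the accuracy of the residual network's recognition results on the test sample set reaches a specified accuracy threshold, that is, when the proportion of image information in the test sample set whose recognition result matches the corresponding preset label reaches the specified accuracy threshold; this indicates that the parameters of the residual network have been properly adjusted. Optionally, the specified threshold may be 90%.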
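The pass criterion described above amounts to comparing the fraction of matching results with a threshold. A minimal sketch, with hypothetical function names and the optional 90% value used as the default:

```python
def accuracy(predictions, labels):
    """Fraction of test items whose recognition result equals its preset label."""
    if not labels or len(predictions) != len(labels):
        raise ValueError("predictions and labels must be non-empty and equally long")
    correct = sum(1 for p, y in zip(predictions, labels) if p == y)
    return correct / len(labels)


def passes_test(predictions, labels, threshold=0.90):
    """True when the accuracy reaches the specified accuracy threshold."""
    return accuracy(predictions, labels) >= threshold


if __name__ == "__main__":
    results = [1, 0, 1, 1, 0, 1, 1, 0, 1, 1]
    truth = [1, 0, 1, 1, 0, 1, 1, 0, 1, 0]
    print(accuracy(results, truth), passes_test(results, truth))
```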
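The trained residual network can be used to extract features from image information; it effectively reduces noise interference and identifies target images with high accuracy. The image recognition method based on the residual network further includes:
In step S102, an image to be recognized is acquired.
As a preferred example of this application, the image to be recognized may be a lung X-ray image that includes image information of the left lung and the right lung. Optionally, the server may acquire the image to be recognized according to actual needs or the needs of the application scenario. For example, the server acquires the image from a preset database in which a large number of lung X-ray images have been collected in advance. The server may also obtain the image by connecting to a hospital's imaging equipment. It can be understood that the server may acquire the image to be recognized in many other ways, which are not detailed here.
In step S103, the image to be recognized is preprocessed.
Before training or using the residual network, the embodiments of this application first preprocess the image to be recognized to improve the speed and accuracy of recognition by the residual network. Optionally, as shown in FIG. 4, preprocessing the image to be recognized in step S103 includes:
In step S401, the image to be recognized is adjusted to a preset resolution.
Here, because the image to be recognized is an original image, it may have problems that affect recognition, such as differing pixel counts and sizes. In view of this, the embodiments of this application first adjust the resolution of the image to be recognized so that it matches the input vector of the residual network. Optionally, as shown in FIG. 5, adjusting the image to be recognized to a preset resolution in step S401 includes:
In step S501, the resolution of the image to be recognized is compared with a preset resolution threshold.
Optionally, the preset resolution threshold is related to the input dimension of the residual network. Continuing the earlier example, assume the input of the residual network is one block representing the left or right lung, with dimension 512*1024*3; since an image to be recognized usually includes both lungs, the preset resolution threshold is preferably 1024*1024 pixels. The embodiments of this application compare the resolution of the image to be recognized with the preset resolution threshold of 1024*1024 to determine whether the resolution is too high or too low.
In step S502, when the resolution of the image to be recognized is higher than the preset resolution threshold, the image to be recognized is downsampled to the resolution threshold.
In the field of digital signal processing, downsampling, also called decimation, is a multirate digital signal processing technique, or the process of reducing a signal's sampling rate; it is typically used to reduce the data transmission rate or the data size. For an N*M image with downsampling factor k, one point is taken every k points in each row and each column of the original image to form a new image, so that the resolution of the new image reaches the preset resolution.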
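Taking one point out of every k in each row and column is plain decimation, which list slicing expresses directly. The sketch below represents a single-layer image as a list of rows; the function name `downsample` is hypothetical. It applies no low-pass filtering before decimation, which is a simplification: practical resizing routines usually smooth first to limit aliasing.

```python
def downsample(image, k):
    """Keep every k-th pixel of every k-th row of a list-of-rows image."""
    if k < 1:
        raise ValueError("the downsampling factor must be a positive integer")
    return [row[::k] for row in image[::k]]


if __name__ == "__main__":
    # A 4*4 image whose pixel values are 0..15, reduced by a factor of 2.
    image = [[4 * r + c for c in range(4)] for r in range(4)]
    print(downsample(image, 2))
```

For example, a 2048*2048 radiograph with k = 2 would come out at 1024*1024.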
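In step S503, when the resolution of the image to be recognized is lower than the preset resolution threshold, the image to be recognized is upsampled to the resolution threshold.
Here, both upsampling and downsampling resample a digital signal. Upsampling is interpolation: if the upsampling factor is k, k-1 points are inserted between points n and n+1 of the original image, dividing that interval into k parts. The embodiments of this application use bilinear interpolation to upsample the image to be recognized to the preset resolution; that is, after interpolating along each row of the image, interpolation is also performed along each column.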
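A minimal sketch of bilinear interpolation on a single-layer image follows. It assumes the "align corners" convention, in which the first and last pixels of the output coincide with those of the input, because that matches the description of inserting k-1 points between neighbouring samples; the application does not say which convention it uses, and the function name is hypothetical.

```python
def bilinear_resize(image, new_height, new_width):
    """Resize a list-of-rows image with bilinear interpolation (align-corners)."""
    old_height, old_width = len(image), len(image[0])

    def source(index, new_size, old_size):
        # Map an output index to a fractional input coordinate.
        return 0.0 if new_size == 1 else index * (old_size - 1) / (new_size - 1)

    result = []
    for r in range(new_height):
        y = source(r, new_height, old_height)
        y0 = int(y)
        y1 = min(y0 + 1, old_height - 1)
        fy = y - y0
        row = []
        for c in range(new_width):
            x = source(c, new_width, old_width)
            x0 = int(x)
            x1 = min(x0 + 1, old_width - 1)
            fx = x - x0
            # Interpolate along the row first, then along the column.
            top = image[y0][x0] * (1 - fx) + image[y0][x1] * fx
            bottom = image[y1][x0] * (1 - fx) + image[y1][x1] * fx
            row.append(top * (1 - fy) + bottom * fy)
        result.append(row)
    return result


if __name__ == "__main__":
    print(bilinear_resize([[0, 2], [4, 6]], 3, 3))
```

With this convention, an upsampling factor k turns n samples along an axis into (n - 1) * k + 1 samples, so the 2*2 example above becomes 3*3 for k = 2.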
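By adjusting the resolution of the image to be recognized, the embodiments of this application make the image match the input dimension of the residual network, which helps speed up recognition of the image by the residual network.
In step S402, each pixel value in the resolution-adjusted image to be recognized is normalized.
After the resolution adjustment is complete, the embodiments of this application normalize each pixel value in the image to be recognized, that is, transform each pixel value into the range [-1, 1]. Optionally, the normalization formula is:
Figure PCTCN2019117426-appb-000001
In the above formula, x denotes the value of any pixel in the X-ray image, and x' denotes the value of that pixel after normalization.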
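The normalization formula itself is only available as a figure in this copy, so the sketch below does not claim to reproduce it. It shows one common mapping onto [-1, 1], x' = (x - 127.5) / 127.5, under the assumption of 8-bit pixel values in 0..255; both the formula and the function name `normalize` are assumptions made for illustration.

```python
def normalize(image, max_value=255.0):
    """Map pixel values from [0, max_value] onto [-1, 1].

    Assumed formula (not taken from the source figure):
        x' = (x - max_value / 2) / (max_value / 2)
    """
    half = max_value / 2.0
    return [[(pixel - half) / half for pixel in row] for row in image]


if __name__ == "__main__":
    print(normalize([[0, 51, 255]]))
```

Any other affine map whose range is [-1, 1], for example one based on the minimum and maximum of each image, would fit the description equally well.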
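Here, by normalizing each pixel value of the resolution-adjusted image to be recognized, the embodiments of this application make the data distribution in the image more uniform, which helps accelerate the recognition process of the residual network.
In step S403, the normalized image to be recognized is expanded into a three-layer image.
After normalization is complete, the embodiments of this application further replicate the image to be recognized to expand it into a three-layer image. For example, assuming the normalized image has a resolution of 1024*1024, the image obtained from step S402 is 1024*1024*1, and after step S403 it is expanded to three layers, that is, 1024*1024*3. Expanding the image to be recognized into a three-layer image makes it easier for the residual network to load preset parameters and avoids situations in which the network fails to converge.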
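Expanding a single-layer image into three layers by replication can be sketched as below, with the result stored as rows of pixels that each hold three identical values. The channel-last layout and the function name `to_three_layers` are assumptions; the application only states the resulting size.

```python
def to_three_layers(image):
    """Replicate each pixel value three times: H*W -> H*W*3 (channel-last)."""
    return [[[pixel, pixel, pixel] for pixel in row] for row in image]


if __name__ == "__main__":
    gray = [[0.5, -1.0], [0.0, 1.0]]
    rgb_like = to_three_layers(gray)
    print(len(rgb_like), len(rgb_like[0]), len(rgb_like[0][0]))
```

A three-layer input has the same shape as a colour image, which is the shape that publicly available ResNet weights expect; reading "preset parameters" as such pretrained weights is an interpretation, not something the application states.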
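In step S104, the preprocessed image to be recognized is divided into two non-overlapping blocks, the blocks are passed into the residual network in turn as input, and the predicted value of each block after it passes through the residual network is obtained.
After the three-layer image is obtained, it is divided into left and right halves to form two non-overlapping blocks. The two blocks should be two equal parts. Continuing the earlier example, if the image to be recognized is a lung X-ray image that includes image information of the left lung and the right lung, the two blocks represent the left lung image and the right lung image respectively; each block includes three layers, and each layer is 512*1024 pixels. Each block is then passed into the trained residual network for independent recognition. The residual network makes a prediction for each block and outputs the predicted value of each block.
Here, the predicted value represents the score that the block is a target image: the larger the score, the greater the probability that the block is classified as a target image, and the smaller the score, the smaller that probability. Continuing the earlier example, if the image to be recognized is a lung X-ray image, the predicted value represents the score that the block is a target image containing a pneumonia signal: the larger the score, the greater the probability that the block is classified as a target image containing a pneumonia signal, and the smaller the score, the smaller that probability.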
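Splitting an image into two equal, non-overlapping left and right blocks is a matter of slicing every row at the midpoint. A minimal sketch with a hypothetical function name; it works the same way whether each pixel is a single value or a list of three layer values.

```python
def split_left_right(image):
    """Split a list-of-rows image into equal left and right halves."""
    width = len(image[0])
    if width % 2 != 0:
        raise ValueError("the width must be even to obtain two equal blocks")
    half = width // 2
    left = [row[:half] for row in image]
    right = [row[half:] for row in image]
    return left, right


if __name__ == "__main__":
    left, right = split_left_right([[1, 2, 3, 4], [5, 6, 7, 8]])
    print(left, right)
```

Applied to a 1024*1024 image, this gives two blocks that are each 512 pixels wide and 1024 pixels high, matching the 512*1024 block size in the text.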
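In step S105, a recognition result is output according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
After the predicted value of each block is obtained from the residual network, the embodiments of this application analyze the predicted values of the two blocks together to obtain the predicted value of the image to be recognized, and output the recognition result according to it. Optionally, as shown in FIG. 6, outputting the recognition result according to the predicted values of the two blocks in step S105 includes:
In step S601, the predicted values of the two blocks are compared, and the larger predicted value is selected as the predicted value of the image to be recognized.
Here, the predicted value of each block represents the probability that the block is a target image containing a pneumonia signal. After the predicted values of the two blocks are obtained, this application compares them and selects the larger one as the predicted value of the image to be recognized.
In step S602, the predicted value of the image to be recognized is compared with a preset prediction threshold.
In the embodiments of this application, the prediction threshold is set in advance based on experience. The prediction threshold is the criterion for judging whether the image to be recognized is a target image containing a pneumonia signal. After the predicted value of the image to be recognized is obtained, it is compared with the prediction threshold.
In step S603, if the predicted value of the image to be recognized is greater than or equal to the prediction threshold, the recognition result is output as a target image.
As described above, the predicted value represents the score that the block is a target image containing a pneumonia signal; the larger the score, the greater the probability that the block contains a pneumonia signal. When the predicted value of the image to be recognized is greater than or equal to the prediction threshold, it is determined that pneumonia is suspected in the image, and the classification result is output as a target image.
In step S604, if the predicted value of the image to be recognized is less than the prediction threshold, the recognition result is output as a non-target image.
When the predicted value of the image to be recognized is less than the prediction threshold, it is determined that pneumonia is not present in the image, and the classification result is output as a non-target image.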
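Steps S601 to S604 reduce to taking the larger of the two block predictions and comparing it with the threshold. A minimal sketch; the function name and the string results are illustrative, and the threshold of 0.5 in the example is an arbitrary placeholder because the application only says the threshold is set from experience.

```python
def recognize(prediction_a, prediction_b, threshold):
    """Combine two block predictions into one image-level recognition result."""
    # S601: the larger block prediction becomes the image prediction.
    image_prediction = max(prediction_a, prediction_b)
    # S602 to S604: compare with the preset prediction threshold.
    if image_prediction >= threshold:
        return "target"
    return "non-target"


if __name__ == "__main__":
    print(recognize(0.2, 0.7, threshold=0.5))
    print(recognize(0.2, 0.4, threshold=0.5))
```

Using the maximum means the image is flagged as soon as either block reaches the threshold.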
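Here, by dividing the image to be recognized into two blocks and inputting them into the residual network for prediction, the embodiments of this application can reduce the computational cost, shorten the training time, and improve the prediction efficiency of the residual network while retaining sufficient image clarity.
In summary, the embodiments of this application preprocess the image to be recognized and then use the residual network to extract key features from the preprocessed image for prediction, obtaining the predicted value of the image to be recognized. The predicted value represents the score that the block is a target image containing a pneumonia signal; the larger the score, the greater the probability that the block contains a pneumonia signal. Finally, the predicted value is compared with a preset threshold, and whether the image to be recognized is a target image is determined from the comparison result. Target images containing pneumonia signals are thus identified based on the residual network; extracting key features through the residual network reduces noise interference, allows the presence of pneumonia to be judged with high accuracy, and improves the accuracy of pneumonia prediction.
It should be understood that the sequence numbers of the steps in the above embodiments do not imply an order of execution; the execution order of each process should be determined by its function and internal logic, and should not constitute any limitation on the implementation of the embodiments of this application.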
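In one embodiment, an image recognition apparatus based on a residual network is provided, which corresponds one-to-one to the image recognition method based on a residual network in the above embodiments. As shown in FIG. 7, the image recognition apparatus based on a residual network includes a training module, an acquisition module, a preprocessing module, a recognition module, and an output module. The functional modules are described in detail as follows:
the training module 71 is configured to construct a residual network and train the residual network with preset training samples;
the acquisition module 72 is configured to acquire an image to be recognized;
the preprocessing module 73 is configured to preprocess the image to be recognized;
the recognition module 74 is configured to divide the preprocessed image to be recognized into two non-overlapping blocks, pass the blocks into the residual network in turn as input, and obtain a predicted value of each block after it passes through the residual network;
the output module 75 is configured to output a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
Optionally, the preprocessing module 73 includes:
an adjustment unit, configured to adjust the image to be recognized to a preset resolution;
a normalization unit, configured to normalize each pixel value in the resolution-adjusted image to be recognized;
an expansion unit, configured to expand the normalized image to be recognized into a three-layer image.
Optionally, the adjustment unit includes:
a comparison subunit, configured to compare the resolution of the image to be recognized with a preset resolution threshold;
a downsampling subunit, configured to downsample the image to be recognized to the resolution threshold when the resolution of the image to be recognized is higher than the preset resolution threshold;
an upsampling subunit, configured to upsample the image to be recognized to the resolution threshold when the resolution of the image to be recognized is lower than the preset resolution threshold.
Optionally, the output module 75 includes:
a first comparison unit, configured to compare the predicted values of the two blocks and select the larger predicted value as the predicted value of the image to be recognized;
a second comparison unit, configured to compare the predicted value of the image to be recognized with a preset prediction threshold;
a first output unit, configured to output the recognition result as a target image if the predicted value of the image to be recognized is greater than or equal to the prediction threshold;
a second output unit, configured to output the recognition result as a non-target image if the predicted value of the image to be recognized is less than the prediction threshold.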
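Optionally, the training module 71 includes:
a construction unit, configured to construct a residual network, the residual network including an input layer, a convolutional layer, a max pooling layer, 16 residual modules, a fully connected layer, and an output layer;
a collection unit, configured to collect multiple pieces of image information of multiple designated users as a training sample set and a test sample set;
a preprocessing unit, configured to preprocess each piece of image information in the training sample set and the test sample set;
a labeling unit, configured to divide each piece of preprocessed image information into two non-overlapping blocks and label each block with a preset label, the preset label including a first label and a second label;
a training unit, configured to obtain several pieces of image information from the training sample set and pass the two labeled blocks of each piece of image information into the residual network as separate input vectors for training;
a correction unit, configured to use a preset loss function to calculate the error between the recognition result of each block after it passes through the residual network and the corresponding preset label, and to modify the parameters of the residual network according to the error;
an iteration unit, configured to obtain several pieces of image information from the training sample set and pass the two labeled blocks of each piece of image information into the parameter-modified residual network to perform the next training iteration;
a test unit, configured to, after the iterative training reaches a preset number of iterations, pass the labeled image information in the test sample set as input vectors into the residual network obtained by the iterative training for testing.
For the specific limitations of the image recognition apparatus based on a residual network, see the limitations of the image recognition method based on a residual network above, which are not repeated here. Each module in the above image recognition apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The above modules may be embedded in, or independent of, the processor of the computer device in hardware form, or may be stored in the memory of the computer device in software form, so that the processor can call and execute the operations corresponding to each module.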
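In one embodiment, a computer device is provided. The computer device may be a server, and its internal structure may be as shown in FIG. 8. The computer device includes a processor, a memory, a network interface, and a database connected through a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer-readable instructions, and a database. The internal memory provides an environment for running the operating system and the computer-readable instructions stored in the non-volatile storage medium. The network interface of the computer device is used to communicate with external terminals over a network connection. When executed by the processor, the computer-readable instructions implement an image recognition method based on a residual network.
In one embodiment, a computer device is provided, including a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, wherein the processor implements the following steps when executing the computer-readable instructions:
constructing a residual network, and training the residual network with preset training samples;
acquiring an image to be recognized;
preprocessing the image to be recognized;
dividing the preprocessed image to be recognized into two non-overlapping blocks, passing the blocks into the residual network in turn as input, and obtaining a predicted value of each block after it passes through the residual network;
outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
In one embodiment, one or more non-volatile readable storage media storing computer-readable instructions are provided; when executed by one or more processors, the computer-readable instructions cause the one or more processors to perform the following steps:
constructing a residual network, and training the residual network with preset training samples;
acquiring an image to be recognized;
preprocessing the image to be recognized;
dividing the preprocessed image to be recognized into two non-overlapping blocks, passing the blocks into the residual network in turn as input, and obtaining a predicted value of each block after it passes through the residual network;
outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.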
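A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by computer-readable instructions instructing the relevant hardware. The computer-readable instructions can be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the embodiments of the above methods. Any reference to memory, storage, a database, or another medium used in the embodiments provided in this application may include non-volatile and/or volatile memory. Non-volatile memory may include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory may include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in many forms, such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
Those skilled in the art can clearly understand that, for convenience and brevity of description, only the above division of functional units and modules is used as an example. In practical applications, the above functions may be assigned to different functional units or modules as needed; that is, the internal structure of the apparatus may be divided into different functional units or modules to complete all or part of the functions described above.
The above embodiments are only intended to illustrate the technical solutions of this application, not to limit them. Although this application has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be equivalently replaced; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of this application, and shall all fall within the protection scope of this application.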

Claims (20)

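  1. An image recognition method based on a residual network, characterized by comprising:
    constructing a residual network, and training the residual network with preset training samples;
    acquiring an image to be recognized;
    preprocessing the image to be recognized;
    dividing the preprocessed image to be recognized into two non-overlapping blocks, passing the blocks into the residual network in turn as input, and obtaining a predicted value of each block after it passes through the residual network;
    outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
  2. The image recognition method based on a residual network according to claim 1, characterized in that preprocessing the image to be recognized comprises:
    adjusting the image to be recognized to a preset resolution;
    normalizing each pixel value in the resolution-adjusted image to be recognized;
    expanding the normalized image to be recognized into a three-layer image.
  3. The image recognition method based on a residual network according to claim 2, characterized in that adjusting the image to be recognized to a preset resolution comprises:
    comparing the resolution of the image to be recognized with a preset resolution threshold;
    when the resolution of the image to be recognized is higher than the preset resolution threshold, downsampling the image to be recognized to the resolution threshold;
    when the resolution of the image to be recognized is lower than the preset resolution threshold, upsampling the image to be recognized to the resolution threshold.
  4. The image recognition method based on a residual network according to claim 1, characterized in that outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image, comprises:
    comparing the predicted values of the two blocks, and selecting the larger predicted value as the predicted value of the image to be recognized;
    comparing the predicted value of the image to be recognized with a preset prediction threshold;
    if the predicted value of the image to be recognized is greater than or equal to the prediction threshold, outputting the recognition result as a target image;
    if the predicted value of the image to be recognized is less than the prediction threshold, outputting the recognition result as a non-target image.
  5. The image recognition method based on a residual network according to claim 1, characterized in that constructing a residual network and training the residual network with preset training samples comprises:
    constructing a residual network, the residual network including an input layer, a convolutional layer, a max pooling layer, 16 residual modules, a fully connected layer, and an output layer;
    collecting multiple pieces of image information of multiple designated users as a training sample set and a test sample set;
    preprocessing each piece of image information in the training sample set and the test sample set;
    dividing each piece of preprocessed image information into two non-overlapping blocks, and labeling each block with a preset label, the preset label including a first label and a second label;
    obtaining several pieces of image information from the training sample set, and passing the two labeled blocks of each piece of image information into the residual network as separate input vectors for training;
    using a preset loss function to calculate the error between the recognition result of each block after it passes through the residual network and the corresponding preset label, and modifying the parameters of the residual network according to the error;
    obtaining several pieces of image information from the training sample set, and passing the two labeled blocks of each piece of image information into the parameter-modified residual network to perform the next training iteration;
    after the iterative training reaches a preset number of iterations, passing the labeled image information in the test sample set as input vectors into the residual network obtained by the iterative training for testing.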
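  6. An image recognition apparatus based on a residual network, characterized by comprising:
    a training module, configured to construct a residual network and train the residual network with preset training samples;
    an acquisition module, configured to acquire an image to be recognized;
    a preprocessing module, configured to preprocess the image to be recognized;
    a recognition module, configured to divide the preprocessed image to be recognized into two non-overlapping blocks, pass the blocks into the residual network in turn as input, and obtain a predicted value of each block after it passes through the residual network;
    an output module, configured to output a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
  7. The image recognition apparatus based on a residual network according to claim 6, characterized in that the preprocessing module comprises:
    an adjustment unit, configured to adjust the image to be recognized to a preset resolution;
    a normalization unit, configured to normalize each pixel value in the resolution-adjusted image to be recognized;
    an expansion unit, configured to expand the normalized image to be recognized into a three-layer image.
  8. The image recognition apparatus based on a residual network according to claim 7, characterized in that the adjustment unit comprises:
    a comparison subunit, configured to compare the resolution of the image to be recognized with a preset resolution threshold;
    a downsampling subunit, configured to downsample the image to be recognized to the resolution threshold when the resolution of the image to be recognized is higher than the preset resolution threshold;
    an upsampling subunit, configured to upsample the image to be recognized to the resolution threshold when the resolution of the image to be recognized is lower than the preset resolution threshold.
  9. The image recognition apparatus based on a residual network according to claim 6, characterized in that the output module comprises:
    a first comparison unit, configured to compare the predicted values of the two blocks and select the larger predicted value as the predicted value of the image to be recognized;
    a second comparison unit, configured to compare the predicted value of the image to be recognized with a preset prediction threshold;
    a first output unit, configured to output the recognition result as a target image if the predicted value of the image to be recognized is greater than or equal to the prediction threshold;
    a second output unit, configured to output the recognition result as a non-target image if the predicted value of the image to be recognized is less than the prediction threshold.
  10. The image recognition apparatus based on a residual network according to claim 6, characterized in that the training module comprises:
    a construction unit, configured to construct a residual network, the residual network including an input layer, a convolutional layer, a max pooling layer, 16 residual modules, a fully connected layer, and an output layer;
    a collection unit, configured to collect multiple pieces of image information of multiple designated users as a training sample set and a test sample set;
    a preprocessing unit, configured to preprocess each piece of image information in the training sample set and the test sample set;
    a labeling unit, configured to divide each piece of preprocessed image information into two non-overlapping blocks and label each block with a preset label, the preset label including a first label and a second label;
    a training unit, configured to obtain several pieces of image information from the training sample set and pass the two labeled blocks of each piece of image information into the residual network as separate input vectors for training;
    a correction unit, configured to use a preset loss function to calculate the error between the recognition result of each block after it passes through the residual network and the corresponding preset label, and to modify the parameters of the residual network according to the error;
    an iteration unit, configured to obtain several pieces of image information from the training sample set and pass the two labeled blocks of each piece of image information into the parameter-modified residual network to perform the next training iteration;
    a test unit, configured to, after the iterative training reaches a preset number of iterations, pass the labeled image information in the test sample set as input vectors into the residual network obtained by the iterative training for testing.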
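  11. A computer device, comprising a memory, a processor, and computer-readable instructions stored in the memory and executable on the processor, characterized in that the processor implements the following steps when executing the computer-readable instructions:
    constructing a residual network, and training the residual network with preset training samples;
    acquiring an image to be recognized;
    preprocessing the image to be recognized;
    dividing the preprocessed image to be recognized into two non-overlapping blocks, passing the blocks into the residual network in turn as input, and obtaining a predicted value of each block after it passes through the residual network;
    outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
  12. The computer device according to claim 11, characterized in that preprocessing the image to be recognized comprises:
    adjusting the image to be recognized to a preset resolution;
    normalizing each pixel value in the resolution-adjusted image to be recognized;
    expanding the normalized image to be recognized into a three-layer image.
  13. The computer device according to claim 12, characterized in that adjusting the image to be recognized to a preset resolution comprises:
    comparing the resolution of the image to be recognized with a preset resolution threshold;
    when the resolution of the image to be recognized is higher than the preset resolution threshold, downsampling the image to be recognized to the resolution threshold;
    when the resolution of the image to be recognized is lower than the preset resolution threshold, upsampling the image to be recognized to the resolution threshold.
  14. The computer device according to claim 11, characterized in that outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image, comprises:
    comparing the predicted values of the two blocks, and selecting the larger predicted value as the predicted value of the image to be recognized;
    comparing the predicted value of the image to be recognized with a preset prediction threshold;
    if the predicted value of the image to be recognized is greater than or equal to the prediction threshold, outputting the recognition result as a target image;
    if the predicted value of the image to be recognized is less than the prediction threshold, outputting the recognition result as a non-target image.
  15. The computer device according to claim 11, characterized in that constructing a residual network and training the residual network with preset training samples comprises:
    constructing a residual network, the residual network including an input layer, a convolutional layer, a max pooling layer, 16 residual modules, a fully connected layer, and an output layer;
    collecting multiple pieces of image information of multiple designated users as a training sample set and a test sample set;
    preprocessing each piece of image information in the training sample set and the test sample set;
    dividing each piece of preprocessed image information into two non-overlapping blocks, and labeling each block with a preset label, the preset label including a first label and a second label;
    obtaining several pieces of image information from the training sample set, and passing the two labeled blocks of each piece of image information into the residual network as separate input vectors for training;
    using a preset loss function to calculate the error between the recognition result of each block after it passes through the residual network and the corresponding preset label, and modifying the parameters of the residual network according to the error;
    obtaining several pieces of image information from the training sample set, and passing the two labeled blocks of each piece of image information into the parameter-modified residual network to perform the next training iteration;
    after the iterative training reaches a preset number of iterations, passing the labeled image information in the test sample set as input vectors into the residual network obtained by the iterative training for testing.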
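  16. One or more non-volatile readable storage media storing computer-readable instructions which, when executed by one or more processors, cause the one or more processors to perform the following steps:
    constructing a residual network, and training the residual network with preset training samples;
    acquiring an image to be recognized;
    preprocessing the image to be recognized;
    dividing the preprocessed image to be recognized into two non-overlapping blocks, passing the blocks into the residual network in turn as input, and obtaining a predicted value of each block after it passes through the residual network;
    outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image.
  17. The non-volatile readable storage medium according to claim 16, characterized in that preprocessing the image to be recognized comprises:
    adjusting the image to be recognized to a preset resolution;
    normalizing each pixel value in the resolution-adjusted image to be recognized;
    expanding the normalized image to be recognized into a three-layer image.
  18. The non-volatile readable storage medium according to claim 17, characterized in that adjusting the image to be recognized to a preset resolution comprises:
    comparing the resolution of the image to be recognized with a preset resolution threshold;
    when the resolution of the image to be recognized is higher than the preset resolution threshold, downsampling the image to be recognized to the resolution threshold;
    when the resolution of the image to be recognized is lower than the preset resolution threshold, upsampling the image to be recognized to the resolution threshold.
  19. The non-volatile readable storage medium according to claim 16, characterized in that outputting a recognition result according to the predicted values of the two blocks, wherein the recognition result includes the image to be recognized being a target image and the image to be recognized being a non-target image, comprises:
    comparing the predicted values of the two blocks, and selecting the larger predicted value as the predicted value of the image to be recognized;
    comparing the predicted value of the image to be recognized with a preset prediction threshold;
    if the predicted value of the image to be recognized is greater than or equal to the prediction threshold, outputting the recognition result as a target image;
    if the predicted value of the image to be recognized is less than the prediction threshold, outputting the recognition result as a non-target image.
  20. The non-volatile readable storage medium according to claim 16, characterized in that constructing a residual network and training the residual network with preset training samples comprises:
    constructing a residual network, the residual network including an input layer, a convolutional layer, a max pooling layer, 16 residual modules, a fully connected layer, and an output layer;
    collecting multiple pieces of image information of multiple designated users as a training sample set and a test sample set;
    preprocessing each piece of image information in the training sample set and the test sample set;
    dividing each piece of preprocessed image information into two non-overlapping blocks, and labeling each block with a preset label, the preset label including a first label and a second label;
    obtaining several pieces of image information from the training sample set, and passing the two labeled blocks of each piece of image information into the residual network as separate input vectors for training;
    using a preset loss function to calculate the error between the recognition result of each block after it passes through the residual network and the corresponding preset label, and modifying the parameters of the residual network according to the error;
    obtaining several pieces of image information from the training sample set, and passing the two labeled blocks of each piece of image information into the parameter-modified residual network to perform the next training iteration;
    after the iterative training reaches a preset number of iterations, passing the labeled image information in the test sample set as input vectors into the residual network obtained by the iterative training for testing.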
PCT/CN2019/117426 2019-04-26 2019-11-12 Image recognition method, apparatus, device, and storage medium based on a residual network WO2020215676A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910345031.3A CN110163260B (zh) 2019-04-26 2019-04-26 Image recognition method, apparatus, device, and storage medium based on a residual network
CN201910345031.3 2019-04-26

Publications (1)

Publication Number Publication Date
WO2020215676A1 true WO2020215676A1 (zh) 2020-10-29

Family

ID=67638758

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/117426 WO2020215676A1 (zh) 2019-04-26 2019-11-12 Image recognition method, apparatus, device, and storage medium based on a residual network

Country Status (2)

Country Link
CN (1) CN110163260B (zh)
WO (1) WO2020215676A1 (zh)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110163260B (zh) * 2019-04-26 2024-05-28 平安科技(深圳)有限公司 基于残差网络的图像识别方法、装置、设备及存储介质
CN110738235B (zh) * 2019-09-16 2023-05-30 平安科技(深圳)有限公司 肺结核判定方法、装置、计算机设备及存储介质
CN110751221A (zh) * 2019-10-24 2020-02-04 广东三维家信息科技有限公司 图片分类方法、装置、电子设备及计算机可读存储介质
CN111104967B (zh) * 2019-12-02 2023-12-22 精锐视觉智能科技(上海)有限公司 图像识别网络训练方法、图像识别方法、装置及终端设备
CN113052308B (zh) * 2019-12-26 2024-05-03 中国移动通信集团北京有限公司 训练目标小区识别模型的方法及目标小区识别方法
CN111581418B (zh) * 2020-04-29 2023-04-28 山东科技大学 一种基于图像关联人物信息的目标人员搜索方法
CN112232338B (zh) * 2020-10-13 2023-09-08 中国平安人寿保险股份有限公司 核保理赔过程的资料录入方法、装置、设备及存储介质
CN113379779B (zh) * 2021-06-07 2023-04-07 华南理工大学 堆叠宽度学习系统的边缘计算方法、装置、介质和设备
WO2023082103A1 (zh) * 2021-11-10 2023-05-19 深圳先进技术研究院 路面状态识别方法、装置、终端设备、存储介质及产品
CN114202746B (zh) * 2021-11-10 2024-04-12 深圳先进技术研究院 路面状态识别方法、装置、终端设备及存储介质
CN114998695B (zh) * 2022-07-18 2022-11-15 深圳市前海泽金产融科技有限公司 一种提高图像识别速度的方法及系统
CN116543789B (zh) * 2023-07-06 2023-09-29 中国电信股份有限公司 设备异常识别方法、装置、设备及介质

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106097379A (zh) * 2016-07-22 2016-11-09 宁波大学 一种使用自适应阈值的图像篡改检测与定位方法
CN109241967A (zh) * 2018-09-04 2019-01-18 青岛大学附属医院 基于深度神经网络的甲状腺超声图像自动识别系统、计算机设备、存储介质
US10187171B2 (en) * 2017-03-07 2019-01-22 The United States Of America, As Represented By The Secretary Of The Navy Method for free space optical communication utilizing patterned light and convolutional neural networks
CN109583369A (zh) * 2018-11-29 2019-04-05 北京邮电大学 一种基于目标区域分割网络的目标识别方法及装置
CN110163260A (zh) * 2019-04-26 2019-08-23 平安科技(深圳)有限公司 基于残差网络的图像识别方法、装置、设备及存储介质

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106228162B (zh) * 2016-07-22 2019-05-17 王威 一种基于深度学习的移动机器人快速物体识别方法
CN106874840B (zh) * 2016-12-30 2019-10-22 东软集团股份有限公司 车辆信息识别方法及装置
CN107944458A (zh) * 2017-12-08 2018-04-20 北京维大成科技有限公司 一种基于卷积神经网络的图像识别方法和装置
CN108229379A (zh) * 2017-12-29 2018-06-29 广东欧珀移动通信有限公司 图像识别方法、装置、计算机设备和存储介质
CN108596143B (zh) * 2018-05-03 2021-07-27 复旦大学 基于残差量化卷积神经网络的人脸识别方法及装置
CN109583297B (zh) * 2018-10-25 2020-10-02 清华大学 视网膜oct体数据识别方法及装置
CN109492556B (zh) * 2018-10-28 2022-09-20 北京化工大学 面向小样本残差学习的合成孔径雷达目标识别方法
CN109636780A (zh) * 2018-11-26 2019-04-16 深圳先进技术研究院 乳腺密度自动分级方法及装置

Cited By (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
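CN112801128A (zh) * 2020-12-14 2021-05-14 深圳云天励飞技术股份有限公司 Non-motor vehicle recognition method, device, electronic equipment, and storage medium
CN112801128B (zh) * 2020-12-14 2023-10-13 深圳云天励飞技术股份有限公司 Non-motor vehicle recognition method, device, electronic equipment, and storage medium
CN113449682B (zh) * 2021-07-15 2023-08-08 四川九洲电器集团有限责任公司 A method for identifying radio-frequency fingerprints in the civil aviation field based on a dynamic fusion model
CN113449682A (zh) * 2021-07-15 2021-09-28 四川九洲电器集团有限责任公司 A method for identifying radio-frequency fingerprints in the civil aviation field based on a dynamic fusion model
CN113673568A (zh) * 2021-07-19 2021-11-19 华南理工大学 Tampered image detection method, system, computer equipment, and storage medium
CN113673568B (zh) * 2021-07-19 2023-08-22 华南理工大学 Tampered image detection method, system, computer equipment, and storage medium
CN114092759A (zh) * 2021-10-27 2022-02-25 北京百度网讯科技有限公司 Training method and device for an image recognition model, electronic equipment, and storage medium
CN114359958A (zh) * 2021-12-14 2022-04-15 合肥工业大学 A pig face recognition method based on a channel attention mechanism
CN114359958B (zh) * 2021-12-14 2024-02-20 合肥工业大学 A pig face recognition method based on a channel attention mechanism
CN115001937B (zh) * 2022-04-11 2023-06-16 北京邮电大学 Fault prediction method and device for the smart-city Internet of Things
CN115001937A (zh) * 2022-04-11 2022-09-02 北京邮电大学 Fault prediction method and device for the smart-city Internet of Things
CN115462550A (zh) * 2022-10-24 2022-12-13 西昌学院 Tobacco leaf curing control method, device, electronic equipment, and readable storage medium
CN116524327A (zh) * 2023-06-25 2023-08-01 云账户技术(天津)有限公司 Training method and device for a face recognition model, electronic equipment, and storage medium
CN116524327B (zh) * 2023-06-25 2023-08-25 云账户技术(天津)有限公司 Training method and device for a face recognition model, electronic equipment, and storage medium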

Also Published As

Publication number Publication date
CN110163260B (zh) 2024-05-28
CN110163260A (zh) 2019-08-23

Similar Documents

Publication Publication Date Title
WO2020215676A1 (zh) Image recognition method, device, equipment, and storage medium based on a residual network
US10347010B2 (en) Anomaly detection in volumetric images using sequential convolutional and recurrent neural networks
US10706333B2 (en) Medical image analysis method, medical image analysis system and storage medium
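WO2020215557A1 (zh) Medical image interpretation method, device, computer equipment, and storage medium
CN111784671B (zh) Pathological image lesion region detection method based on multi-scale deep learning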
US11651850B2 (en) Computer vision technologies for rapid detection
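WO2019200753A1 (zh) Lesion monitoring method, device, computer equipment, and storage medium
CN109346159B (zh) Case image classification method, device, computer equipment, and storage medium
WO2021115084A1 (zh) A deep learning system for brain age prediction based on structural magnetic resonance images
WO2021017006A1 (zh) Image processing method and device, neural network and training method, and storage medium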
US20210287054A1 (en) System and method for identification and localization of images using triplet loss and predicted regions
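CN112085745A (zh) Retinal vessel image segmentation method using a multi-channel U-shaped fully convolutional neural network based on balanced sampling and stitching
WO2022178997A1 (zh) Medical image registration method, device, computer equipment, and storage medium
CN110570394A (zh) Medical image segmentation method, device, equipment, and storage medium
CN117272052B (zh) Large language model training method, device, equipment, and storage medium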
US11521323B2 (en) Systems and methods for generating bullseye plots
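CN111986242B (zh) Method and device for determining brain tissue partitions, storage medium, and electronic equipment
CN116309507A (zh) AIS lesion prediction method using feature fusion of CTP under an attention mechanism
CN113379770B (zh) Method for constructing a nasopharyngeal carcinoma MR image segmentation network, image segmentation method, and device
CN112734798B (zh) Online adaptive system and method for neural networks
CN113298827B (zh) An image segmentation method based on the DP-Net network
CN111210414B (zh) Medical image analysis method, computer equipment, and readable storage medium
CN115861150A (zh) Segmentation model training method, medical image segmentation method, electronic equipment, and medium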
US20230169659A1 (en) Image segmentation and tracking based on statistical shape model
Lu et al. An Alzheimer's disease classification method based on ConvNeXt

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application. Ref document number: 19926342; Country of ref document: EP; Kind code of ref document: A1
NENP Non-entry into the national phase. Ref country code: DE
122 EP: PCT application non-entry in European phase. Ref document number: 19926342; Country of ref document: EP; Kind code of ref document: A1