CN111242911A - Method and system for determining image definition based on deep learning algorithm - Google Patents


Info

Publication number
CN111242911A
CN111242911A
Authority
CN
China
Prior art keywords: image, face image, definition, processing, network
Legal status (assumed; not a legal conclusion)
Pending
Application number
CN202010017473.8A
Other languages
Chinese (zh)
Inventor
柴胜
杨强
刘华根
何韦澄
王玉鑫
Current Assignee
Laikang Technology Co Ltd
Original Assignee
Laikang Technology Co Ltd
Application filed by Laikang Technology Co Ltd filed Critical Laikang Technology Co Ltd
Priority claimed from application CN202010017473.8A
Publication of CN111242911A

Classifications

    • G06T 7/0012 Biomedical image inspection (G PHYSICS; G06 COMPUTING; CALCULATING OR COUNTING; G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL; G06T 7/00 Image analysis; G06T 7/0002 Inspection of images, e.g. flaw detection)
    • G06N 3/045 Combinations of networks (G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS; G06N 3/00 Computing arrangements based on biological models; G06N 3/02 Neural networks; G06N 3/04 Architecture, e.g. interconnection topology)
    • G06N 3/08 Learning methods (G06N 3/02 Neural networks)
    • G06T 5/00 Image enhancement or restoration
    • G06V 40/161 Detection; Localisation; Normalisation (G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING; G06V 40/00 Recognition of biometric, human-related or animal-related patterns; G06V 40/10 Human or animal bodies; G06V 40/16 Human faces, e.g. facial parts, sketches or expressions)
    • G06V 40/171 Local features and components; Facial parts; Occluding parts, e.g. glasses; Geometrical relationships (G06V 40/168 Feature extraction; Face representation)
    • G06T 2207/10028 Range image; Depth image; 3D point clouds (G06T 2207/00 Indexing scheme for image analysis or image enhancement; G06T 2207/10 Image acquisition modality)
    • G06T 2207/20081 Training; Learning (G06T 2207/20 Special algorithmic details)
    • G06T 2207/20172 Image enhancement details (G06T 2207/20 Special algorithmic details)
    • G06T 2207/30004 Biomedical image processing (G06T 2207/30 Subject of image; Context of image processing)

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Medical Informatics (AREA)
  • Nuclear Medicine, Radiotherapy & Molecular Imaging (AREA)
  • Radiology & Medical Imaging (AREA)
  • Quality & Reliability (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention discloses a method and a system for determining image definition based on a deep learning algorithm. The method comprises: acquiring an original face image data set containing a tongue region and labeling each image with its degree of definition; performing data enhancement on the labeled data set to obtain a data-enhanced face image data set; building a deep network on the framework of a residual network and establishing an association between an optimizer and the residual network to form an image definition judgment model, then training and testing the model with the data-enhanced face image data set to determine the trained image definition judgment model; and preprocessing the to-be-detected face image containing a tongue region and analyzing the preprocessed image with the trained model to determine the image definition of the to-be-detected face image.

Description

Method and system for determining image definition based on deep learning algorithm
Technical Field
The present invention relates to the field of deep learning technologies, and in particular, to a method and a system for determining image sharpness based on a deep learning algorithm.
Background
With the rapid development of computer vision and deep learning, progress in image analysis for the field of traditional Chinese medicine has also been promoted. Intelligent facial and tongue diagnosis places high demands on the definition of the face and tongue images captured by the practitioner, so images must be checked for blur before diagnosis in order to improve diagnostic accuracy. Traditional image-processing schemes, and schemes combining feature engineering with machine-learning classification, have the following problems: a. image features are difficult to extract; b. models trained by traditional machine learning on manually screened features generalize poorly and lack robustness; c. the accuracy of results obtained with machine-learning or traditional image-processing methods is low; d. thresholds are difficult to select during image processing.
The concept of deep learning stems from the study of artificial neural networks: a multi-layer perceptron with multiple hidden layers is a deep learning structure. Deep learning combines low-level features to form more abstract high-level representations of attribute categories or features, so as to discover distributed feature representations of the data. The concept was proposed by Hinton et al. in 2006, who presented an unsupervised greedy layer-by-layer training algorithm based on the deep belief network (DBN) and later the multilayer autoencoder deep structure, in the hope of solving the optimization problems associated with deep architectures. In addition, the convolutional neural network proposed by LeCun et al. was the first true multi-layer structure learning algorithm, using spatial relative relationships to reduce the number of parameters and improve training performance.
Therefore, a method for determining image sharpness based on a deep learning algorithm is needed.
Disclosure of Invention
The invention provides a method and a system for determining image definition based on a deep learning algorithm, which are used for solving the problem of how to automatically determine the image definition.
In order to solve the above problem, according to an aspect of the present invention, there is provided a method of determining sharpness of an image based on a deep learning algorithm, the method including:
acquiring an original face image data set containing a tongue region, and labeling each original face image according to its degree of definition;
performing data enhancement processing on the marked original face image data set to obtain a data-enhanced face image data set;
building a deep network on the framework of a residual network, establishing an association between an optimizer and the residual network to form an image definition judgment model, and training and testing the image definition judgment model with the data-enhanced face image data set to determine the trained image definition judgment model;
and preprocessing the to-be-detected face image containing the tongue region, and analyzing the preprocessed face image by utilizing the trained image definition judgment model to determine the image definition of the to-be-detected face image containing the tongue region.
Preferably, the data enhancement processing on the labeled original facial image data set to obtain the data enhanced facial image data set includes:
processing the face image data in the labeled original face image data set by using at least one processing mode of color space conversion processing, brightness adjustment processing, saturation adjustment processing, channel conversion processing, random clipping processing, horizontal mirroring processing and normalization processing to obtain a data-enhanced face image data set.
Preferably, the residual network is ResNet-18, consisting of 17 convolutional layers and 1 fully-connected layer; the deep network is SENet, which adds a channel attention mechanism on top of the ResNet-18 framework to select weights during the training process.
Preferably, the training and testing the image sharpness judging model by using the data-enhanced facial image data set to determine a trained image sharpness judging model includes:
initializing the weights of the convolutional layer group of the residual network ResNet with the network weights of ResNet-18, and randomly initializing the weights of the fully-connected layer of the ResNet structure;
setting the initial learning rate of the convolutional layer group and the fully-connected layer of the residual network ResNet to 0.01 and using cross-entropy as the loss function; during iterative training, keeping the learning rate unchanged for a first preset number of training iterations, and thereafter reducing it to 0.001 and then 0.0001, once per second preset number of iterations;
randomly selecting a face image with a preset percentage threshold value from the data-enhanced face image data set as a training set, and using the rest face images as a test set for training and testing;
and performing iterative training on the image definition judgment model by stochastic gradient descent, and selecting the network model with the smallest loss as the trained image definition judgment model.
Preferably, the preprocessing the to-be-detected face image including the tongue region, and analyzing the preprocessed face image by using the trained image sharpness determination model to determine the image sharpness of the to-be-detected face image including the tongue region includes:
carrying out scaling processing on the face image to be detected containing the tongue region according to a preset size, and subtracting the mean value of each channel to obtain the preprocessed face image to be detected containing the tongue region;
analyzing the preprocessed face image by using the trained image definition judging model to obtain probability values corresponding to different definition degrees, and selecting the definition degree corresponding to the maximum probability value as the image definition of the face image containing the tongue region to be detected;
wherein the degree of clarity comprises: clear, relatively clear, and unclear.
According to another aspect of the present invention, there is provided a system for determining sharpness of an image based on a deep learning algorithm, the system comprising:
the definition degree labeling unit is used for acquiring an original face image data set containing a tongue region and labeling the definition degree of each original face image according to the definition degree of the original face image;
the data enhancement processing unit is used for carrying out data enhancement processing on the marked original face image data set so as to obtain a data enhanced face image data set;
the image definition judgment model determining unit is used for building a deep network on the framework of a residual network, establishing an association between an optimizer and the residual network to form an image definition judgment model, and training and testing the image definition judgment model with the data-enhanced face image data set to determine the trained image definition judgment model;
and the image definition determining unit is used for preprocessing the to-be-detected face image containing the tongue region and analyzing the preprocessed face image by utilizing the trained image definition judging model so as to determine the image definition of the to-be-detected face image containing the tongue region.
Preferably, the data enhancement processing unit performs data enhancement processing on the labeled original face image data set to obtain a data-enhanced face image data set, and includes:
processing the face image data in the labeled original face image data set by using at least one processing mode of color space conversion processing, brightness adjustment processing, saturation adjustment processing, channel conversion processing, random clipping processing, horizontal mirroring processing and normalization processing to obtain a data-enhanced face image data set.
Preferably, the residual network is ResNet-18, consisting of 17 convolutional layers and 1 fully-connected layer; the deep network is SENet, which adds a channel attention mechanism on top of the ResNet-18 framework to select weights during the training process.
Preferably, the image sharpness determination model determining unit, which trains and tests the image sharpness determination model by using the data-enhanced face image data set to determine a trained image sharpness determination model, includes:
initializing the weights of the convolutional layer group of the residual network ResNet with the network weights of ResNet-18, and randomly initializing the weights of the fully-connected layer of the ResNet structure;
setting the initial learning rate of the convolutional layer group and the fully-connected layer of the residual network ResNet to 0.01 and using cross-entropy as the loss function; during iterative training, keeping the learning rate unchanged for a first preset number of training iterations, and thereafter reducing it to 0.001 and then 0.0001, once per second preset number of iterations;
randomly selecting a face image with a preset percentage threshold value from the data-enhanced face image data set as a training set, and using the rest face images as a test set for training and testing;
and performing iterative training on the image definition judgment model by stochastic gradient descent, and selecting the network model with the smallest loss as the trained image definition judgment model.
Preferably, the image sharpness determining unit preprocesses the to-be-detected face image including the tongue region, and analyzes the preprocessed face image by using the trained image sharpness determining model to determine the image sharpness of the to-be-detected face image including the tongue region, and includes:
carrying out scaling processing on the face image to be detected containing the tongue region according to a preset size, and subtracting the mean value of each channel to obtain the preprocessed face image to be detected containing the tongue region;
analyzing the preprocessed face image by using the trained image definition judging model to obtain probability values corresponding to different definition degrees, and selecting the definition degree corresponding to the maximum probability value as the image definition of the face image containing the tongue region to be detected;
wherein the degree of clarity comprises: clear, relatively clear, and unclear.
The invention provides a method and a system for determining image definition based on a deep learning algorithm, wherein the method comprises: labeling the definition of each original face image; performing data enhancement on the labeled original face image data set; building a deep network on the framework of a residual network, establishing an association between an optimizer and the residual network to form an image definition judgment model, and training and testing it to determine the trained model; and analyzing the face image to be detected with the trained image definition judgment model to determine its image definition. The method requires no manual feature screening and avoids the feature-engineering step of selecting classification features; through data enhancement, the original data derive a richer new data set as training data, so that the model adapts to different scenes; and judging the definition of the image lays a foundation for accurate facial and tongue diagnosis.
Drawings
A more complete understanding of exemplary embodiments of the present invention may be had by reference to the following drawings in which:
FIG. 1 is a flow diagram of a method 100 for determining sharpness of an image based on a deep learning algorithm according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a residual unit according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the deep network SENet according to an embodiment of the present invention; and
fig. 4 is a schematic structural diagram of a system 400 for determining image sharpness based on a deep learning algorithm according to an embodiment of the present invention.
Detailed Description
Exemplary embodiments of the present invention will now be described with reference to the accompanying drawings; however, the present invention may be embodied in many different forms and is not limited to the embodiments described herein, which are provided so that this disclosure will be thorough and complete and will fully convey the scope of the invention to those skilled in the art. The terminology used in the exemplary embodiments illustrated in the accompanying drawings is not intended to limit the invention. In the drawings, the same units/elements are denoted by the same reference numerals.
Unless otherwise defined, terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Further, it will be understood that terms, such as those defined in commonly used dictionaries, should be interpreted as having a meaning that is consistent with their meaning in the context of the relevant art and will not be interpreted in an idealized or overly formal sense.
Fig. 1 is a flow chart of a method 100 for determining image sharpness based on a deep learning algorithm according to an embodiment of the present invention. As shown in fig. 1, the method requires no manual feature screening, avoids the feature-engineering step of selecting classification features, and, through data enhancement, derives from the original data a richer new data set as training data, so that the model adapts to different scenes and the sharpness judgment lays a foundation for accurate facial and tongue diagnosis. The method 100 starts from step 101: an original face image data set containing a tongue region is obtained, and each original face image is labeled according to its degree of definition.
In the embodiment of the invention, images containing the face and tongue are collected from hospitals, communities, and similar sources to ensure the diversity and uniformity of the data. The data are then classified and labeled according to the degree of clarity: clear, relatively clear, and unclear.
At step 102, data enhancement processing is performed on the annotated raw facial image dataset to obtain a data enhanced facial image dataset.
Preferably, the data enhancement processing on the labeled original facial image data set to obtain the data enhanced facial image data set includes:
processing the face image data in the labeled original face image data set by using at least one processing mode of color space conversion processing, brightness adjustment processing, saturation adjustment processing, channel conversion processing, random clipping processing, horizontal mirroring processing and normalization processing to obtain a data-enhanced face image data set.
In the embodiment of the invention, to improve the robustness of the trained model, a richer new data set is derived from the original data as training data by data enhancement, so that the model can adapt to different scenes. The processing modes for enhancing the original face image data containing the tongue region include: color space conversion, transforming the picture from RGB space to HSV (hue-based) space; brightness adjustment, randomly adjusting the brightness of the image; saturation adjustment, randomly adjusting the saturation of the image; channel conversion, exchanging the positions of the three image channels to generate a new image; random cropping, cropping the original image at random positions to enrich the background; horizontal mirroring, to increase diversity; and normalization, to reduce the interference of the image's DC component. For any original face image, one or more of these processing modes can be selected and applied together.
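As an illustration, the following is a minimal NumPy sketch of four of the augmentation operations above (channel conversion, horizontal mirroring, random cropping, and mean normalization). The 64x64 test image, the 48x48 crop size, and the function names are assumptions made for the example, not values taken from the patent.

```python
import numpy as np

rng = np.random.default_rng(0)

def channel_swap(img: np.ndarray) -> np.ndarray:
    """Permute the three colour channels of an (H, W, 3) image to create a new image."""
    perm = rng.permutation(3)
    return img[:, :, perm]

def horizontal_mirror(img: np.ndarray) -> np.ndarray:
    """Flip the image left-to-right to increase diversity."""
    return img[:, ::-1, :]

def random_crop(img: np.ndarray, out_h: int, out_w: int) -> np.ndarray:
    """Crop a random out_h x out_w window from the original image."""
    h, w, _ = img.shape
    top = rng.integers(0, h - out_h + 1)
    left = rng.integers(0, w - out_w + 1)
    return img[top:top + out_h, left:left + out_w, :]

def normalize(img: np.ndarray) -> np.ndarray:
    """Scale to [0, 1] and subtract the per-channel mean (removes the DC component)."""
    x = img.astype(np.float32) / 255.0
    return x - x.mean(axis=(0, 1), keepdims=True)

# A synthetic test image; any subset of the operations can be chained.
img = rng.integers(0, 256, size=(64, 64, 3), dtype=np.uint8)
aug = normalize(random_crop(horizontal_mirror(img), 48, 48))
```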
In step 103, a deep network is built on the framework of a residual network, an association between an optimizer and the residual network is established to construct an image definition judgment model, and the image definition judgment model is trained and tested with the data-enhanced face image data set to determine the trained image definition judgment model.
Preferably, the residual network is ResNet-18, consisting of 17 convolutional layers and 1 fully-connected layer; the deep network is SENet, which adds a channel attention mechanism on top of the ResNet-18 framework to select weights during the training process.
Preferably, the training and testing the image sharpness judging model by using the data-enhanced facial image data set to determine a trained image sharpness judging model includes:
initializing the weights of the convolutional layer group of the residual network ResNet with the network weights of ResNet-18, and randomly initializing the weights of the fully-connected layer of the ResNet structure;
setting the initial learning rate of the convolutional layer group and the fully-connected layer of the residual network ResNet to 0.01 and using cross-entropy as the loss function; during iterative training, keeping the learning rate unchanged for a first preset number of training iterations, and thereafter reducing it to 0.001 and then 0.0001, once per second preset number of iterations;
randomly selecting a face image with a preset percentage threshold value from the data-enhanced face image data set as a training set, and using the rest face images as a test set for training and testing;
and performing iterative training on the image definition judgment model by stochastic gradient descent, and selecting the network model with the smallest loss as the trained image definition judgment model.
In the embodiment of the invention, the residual network is ResNet-18, composed of 17 convolutional layers and 1 fully-connected layer. Fig. 2 is a schematic diagram of a residual unit according to an embodiment of the present invention. As shown in fig. 2, the early layers of the network mainly extract low-level features of the image data, the middle layers extract progressively higher-level features, and the final layers combine features to form the feature representation of the whole model.
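The residual identity of the unit in fig. 2 can be sketched in a toy NumPy form. Here the two 3x3 convolutions of a ResNet-18 basic block are replaced by two small matrix multiplications purely to keep the example short; the identity shortcut works the same way in either case.

```python
import numpy as np

def relu(x):
    return np.maximum(x, 0.0)

def residual_unit(x, w1, w2):
    """Toy residual unit: y = relu(F(x) + x).
    F is modelled as two linear maps with a ReLU between them; in ResNet-18
    F would be two 3x3 convolutions, but the skip connection is identical."""
    f = w2 @ relu(w1 @ x)   # the residual branch F(x)
    return relu(f + x)      # identity shortcut added before the final ReLU

x = np.array([1.0, 2.0, 3.0])
w_zero = np.zeros((3, 3))
# With F == 0 the unit reduces to the identity (for non-negative inputs),
# which is what makes very deep residual networks easy to optimize.
y = residual_unit(x, w_zero, w_zero)
```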
Fig. 3 is a schematic diagram of the deep network SENet according to an embodiment of the present invention. As shown in fig. 3, C is 16 for the SENet classification network constructed in the embodiment of the present invention. Analysis of the acquired images shows that the blur distribution may be global or local, and that blur falls, by cause, into motion blur and out-of-focus blur. To handle these practical situations, an attention mechanism is introduced to improve classification accuracy.
For the SENet network, a channel attention mechanism is added on top of the ResNet-18 framework to address the difficulty of detecting local blur in images; its function is to select weights during the training process. When classifying blurred images, blurred pixel blocks vary in size, so screening the weights for those pixels during training and concentrating learning on the blurred regions can effectively improve classification accuracy.
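A minimal NumPy sketch of the squeeze-and-excitation computation described above: squeeze by global average pooling, excite through two fully-connected layers with a sigmoid, then reweight each channel. The feature-map size, the reduction ratio r = 4, and the random weights are illustrative assumptions; the patent specifies only C = 16.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def se_block(feat, w1, w2):
    """Squeeze-and-Excitation on a (C, H, W) feature map.
    squeeze:   global average pool per channel        -> (C,)
    excite:    FC reduction, ReLU, FC expansion, sigmoid -> C weights in (0, 1)
    scale:     channel-wise reweighting of the input."""
    s = feat.mean(axis=(1, 2))            # squeeze
    z = np.maximum(w1 @ s, 0.0)           # reduce, e.g. C -> C/r
    w = sigmoid(w2 @ z)                   # expand back to C attention weights
    return feat * w[:, None, None]        # scale each channel

rng = np.random.default_rng(0)
feat = rng.standard_normal((16, 8, 8))    # C = 16 as in the embodiment
w1 = rng.standard_normal((4, 16)) * 0.1   # reduction ratio r = 4 (assumed)
w2 = rng.standard_normal((16, 4)) * 0.1
out = se_block(feat, w1, w2)
```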
During training, an SGD optimizer is adopted with cross-entropy as the loss function. The initial learning rate is set to 0.01 and held constant for the first 30k iterations; in the later stage it is reduced to 0.001 and then 0.0001, once every 10k iterations. Then 80% of the face images in the data-enhanced face image data set are randomly selected as the training set and the remaining 20% as the test set, for training and testing respectively. The image definition judgment model is trained iteratively by stochastic gradient descent, the parameters are adjusted and training repeated until a usable accuracy is reached, and the network model with the smallest loss is selected as the trained image definition judgment model.
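The learning-rate schedule and the 80/20 split described above can be written down directly. This pure-Python sketch assumes, as the text suggests, that the schedule bottoms out at 0.0001; the function names are hypothetical.

```python
import random

def learning_rate(iteration: int) -> float:
    """Step schedule from the description: 0.01 for the first 30k iterations,
    then divided by 10 every 10k iterations, floored at 0.0001."""
    if iteration < 30_000:
        return 0.01
    drops = (iteration - 30_000) // 10_000 + 1
    return max(0.01 * 10 ** -drops, 0.0001)

def split_dataset(samples, train_fraction=0.8, seed=0):
    """Random train/test split of the augmented data set (80%/20% by default)."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train, test = split_dataset(list(range(100)))
```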
In step 104, preprocessing the to-be-detected face image including the tongue region, and analyzing the preprocessed face image by using the trained image sharpness judgment model to determine the image sharpness of the to-be-detected face image including the tongue region.
Preferably, the preprocessing the to-be-detected face image including the tongue region, and analyzing the preprocessed face image by using the trained image sharpness determination model to determine the image sharpness of the to-be-detected face image including the tongue region includes:
carrying out scaling processing on the face image to be detected containing the tongue region according to a preset size, and subtracting the mean value of each channel to obtain the preprocessed face image to be detected containing the tongue region;
analyzing the preprocessed face image by using the trained image definition judging model to obtain probability values corresponding to different definition degrees, and selecting the definition degree corresponding to the maximum probability value as the image definition of the face image containing the tongue region to be detected;
wherein the degree of clarity comprises: clear, relatively clear, and unclear.
In an embodiment of the invention, when a to-be-detected face image containing a tongue region is obtained, the image is first scaled to a preset size and the mean of each channel is subtracted; the preprocessed image is then input into the trained model, which outputs three probability values, and the class with the largest probability is selected as the classification result. When the classification result indicates that the image is not sufficiently clear, the user can be promptly reminded to retake the photograph.
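As a hedged NumPy sketch of this inference step: the 224x224 target size and the per-channel means are placeholders (the patent states only "a preset size" and "the mean value of each channel"), and the logits stand in for the output of the trained model.

```python
import numpy as np

LABELS = ("clear", "relatively clear", "unclear")

def preprocess(img: np.ndarray, size=(224, 224),
               channel_means=(0.485, 0.456, 0.406)) -> np.ndarray:
    """Nearest-neighbour resize to the preset size, then subtract per-channel means.
    Both the size and the means here are illustrative placeholders."""
    h, w, _ = img.shape
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    resized = img[rows][:, cols].astype(np.float32) / 255.0
    return resized - np.asarray(channel_means, dtype=np.float32)

def classify(logits: np.ndarray) -> str:
    """Softmax over the three sharpness classes; return the most probable label."""
    p = np.exp(logits - logits.max())
    p /= p.sum()
    return LABELS[int(p.argmax())]

# Made-up model output for one image; the largest logit decides the class.
label = classify(np.array([0.2, 2.5, -1.0]))
```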
Fig. 4 is a schematic structural diagram of a system 400 for determining image sharpness based on a deep learning algorithm according to an embodiment of the present invention. As shown in fig. 4, a system 400 for determining image sharpness based on a deep learning algorithm according to an embodiment of the present invention includes: a definition labeling unit 401, a data enhancement processing unit 402, an image definition judgment model determining unit 403, and an image definition determining unit 404.
Preferably, the definition labeling unit 401 is configured to obtain an original face image data set including a tongue region, and label the definition of each original face image according to the definition of the original face image.
Preferably, the data enhancement processing unit 402 is configured to perform data enhancement processing on the labeled original facial image data set to obtain a data enhanced facial image data set.
Preferably, the data enhancement processing unit 402 performs data enhancement processing on the labeled original facial image data set to obtain a data enhanced facial image data set, including:
processing the face image data in the labeled original face image data set with at least one of color space conversion, brightness adjustment, saturation adjustment, channel conversion, random cropping, horizontal mirroring, and normalization to obtain a data-enhanced face image data set.
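A minimal sketch of such a data enhancement step is given below. The application probabilities, brightness range, and crop size are illustrative assumptions; an implementation would tune these values and add the remaining operations (color space conversion, saturation adjustment, channel conversion).

```python
import numpy as np

rng = np.random.default_rng(seed=0)

def augment(image, crop=200):
    """Randomly apply a subset of the enhancement operations to one face image."""
    out = np.asarray(image, dtype=np.float32)
    if rng.random() < 0.5:  # brightness adjustment
        out = np.clip(out + rng.uniform(-25.0, 25.0), 0.0, 255.0)
    if rng.random() < 0.5:  # horizontal mirroring
        out = out[:, ::-1, :]
    if rng.random() < 0.5:  # random cropping to a fixed patch
        top = int(rng.integers(0, out.shape[0] - crop + 1))
        left = int(rng.integers(0, out.shape[1] - crop + 1))
        out = out[top:top + crop, left:left + crop, :]
    return out / 255.0      # normalization to [0, 1]
```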
Preferably, the image definition judgment model determining unit 403 is configured to build a deep network on the framework of a residual network, establish the association between an optimizer and the residual network to form an image definition judgment model, and train and test the image definition judgment model with the data-enhanced face image data set to determine the trained image definition judgment model.
Preferably, the residual network is ResNet-18, consisting of 17 convolutional layers and 1 fully-connected layer; the deep network is SENet, which adds a channel attention mechanism on top of the ResNet-18 framework and uses it to select channel weights during training.
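The channel attention mechanism referred to here is the Squeeze-and-Excitation (SE) block of Hu et al. (cited below). A NumPy sketch of its forward pass, assuming a given reduction ratio and weight shapes for illustration; a real SENet inserts one such block into each residual block:

```python
import numpy as np

def se_block(feature_map, w_reduce, w_expand):
    """Squeeze-and-Excitation forward pass.

    feature_map: (C, H, W) activations from a convolutional layer.
    w_reduce:    (C, C // r) weights of the dimensionality-reduction layer.
    w_expand:    (C // r, C) weights of the expansion layer.
    """
    squeezed = feature_map.mean(axis=(1, 2))            # squeeze: global average pool -> (C,)
    hidden = np.maximum(squeezed @ w_reduce, 0.0)       # excitation, step 1: ReLU
    gates = 1.0 / (1.0 + np.exp(-(hidden @ w_expand)))  # excitation, step 2: sigmoid in (0, 1)
    return feature_map * gates[:, None, None]           # reweight each channel
```

Each channel of the feature map is thus rescaled by a learned weight in (0, 1), which is the "selection of weights during training" described above.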
Preferably, the image definition judgment model determining unit 403 trains and tests the image definition judgment model with the data-enhanced face image data set to determine the trained image definition judgment model, including:
initializing the network weights of the convolutional layer group of the residual network ResNet with pretrained ResNet-18 weights, and randomly initializing the network weights of the fully-connected layer of the ResNet structure;
setting the initial learning rate of the convolutional layer group and the fully-connected layer of the residual network ResNet to 0.01, using cross entropy as the loss function, keeping the learning rate unchanged for a first preset number of training samples in the early stage of iterative training, and reducing it to 0.001 and then to 0.0001 after every second preset number of training samples in the later stage of iterative training;
randomly selecting a preset percentage of face images from the data-enhanced face image data set as the training set and using the remaining face images as the test set for training and testing;
and performing iterative training on the image definition judgment model with the stochastic gradient descent (SGD) algorithm, and selecting the network model with the minimum loss function as the trained image definition judgment model.
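The learning-rate schedule, loss, and update rule described above can be sketched as follows. This is a sketch only: the concrete iteration counts standing in for the "first preset number" and "second preset number" of training samples are hypothetical.

```python
import numpy as np

def learning_rate(step, hold_steps=1000, decay_every=2000):
    """Hold the initial rate of 0.01 early on, then drop by a factor of 10
    (to 0.001, then 0.0001) after each later block of iterations."""
    if step < hold_steps:
        return 0.01
    drops = min((step - hold_steps) // decay_every + 1, 2)
    return 0.01 * (0.1 ** drops)

def cross_entropy(probabilities, label):
    """Cross-entropy loss for one sample; `probabilities` are the softmax outputs."""
    return -np.log(probabilities[label] + 1e-12)

def sgd_step(weights, gradient, step):
    """One stochastic-gradient-descent update at the scheduled learning rate."""
    return weights - learning_rate(step) * gradient
```

The model snapshot with the minimum loss over the test set would then be kept as the trained image definition judgment model.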
Preferably, the image definition determining unit 404 is configured to preprocess the to-be-detected face image containing the tongue region, and analyze the preprocessed face image with the trained image definition judgment model to determine the image definition of the to-be-detected face image containing the tongue region.
Preferably, the image definition determining unit preprocesses the to-be-detected face image containing the tongue region and analyzes the preprocessed face image with the trained image definition judgment model to determine the image definition of the to-be-detected face image containing the tongue region, including:
scaling the to-be-detected face image containing the tongue region to a preset size, and subtracting the mean value of each channel to obtain the preprocessed to-be-detected face image containing the tongue region;
analyzing the preprocessed face image with the trained image definition judgment model to obtain the probability values corresponding to the different definition degrees, and selecting the definition degree with the maximum probability value as the image definition of the to-be-detected face image containing the tongue region;
wherein the definition degrees comprise: clear, relatively clear, and unclear.
The system 400 for determining image definition based on a deep learning algorithm according to this embodiment of the present invention corresponds to the method 100 for determining image definition based on a deep learning algorithm according to another embodiment of the present invention, and is not described again here.
The invention has been described with reference to a few embodiments. However, as is apparent to a person skilled in the art, other embodiments than those disclosed above are equally possible within the scope of the invention, as defined by the appended patent claims.
Generally, all terms used in the claims are to be interpreted according to their ordinary meaning in the technical field, unless explicitly defined otherwise herein. All references to "a/an/the [ device, component, etc ]" are to be interpreted openly as referring to at least one instance of said device, component, etc., unless explicitly stated otherwise. The steps of any method disclosed herein do not have to be performed in the exact order disclosed, unless explicitly stated.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solutions of the present invention and not for limiting the same, and although the present invention is described in detail with reference to the above embodiments, those of ordinary skill in the art should understand that: modifications and equivalents may be made to the embodiments of the invention without departing from the spirit and scope of the invention, which is to be covered by the claims.

Claims (10)

1. A method for determining image definition based on a deep learning algorithm, the method comprising:
acquiring an original face image data set containing a tongue region, and labeling each original face image according to its degree of definition;
performing data enhancement processing on the marked original face image data set to obtain a data-enhanced face image data set;
building a deep network on the framework of a residual network, establishing the association between an optimizer and the residual network to form an image definition judgment model, and training and testing the image definition judgment model with the data-enhanced face image data set to determine the trained image definition judgment model;
and preprocessing the to-be-detected face image containing the tongue region, and analyzing the preprocessed face image by utilizing the trained image definition judgment model to determine the image definition of the to-be-detected face image containing the tongue region.
2. The method of claim 1, wherein performing data enhancement processing on the labeled original face image data set to obtain a data-enhanced face image data set comprises:
processing the face image data in the labeled original face image data set with at least one of color space conversion, brightness adjustment, saturation adjustment, channel conversion, random cropping, horizontal mirroring, and normalization to obtain a data-enhanced face image data set.
3. The method of claim 1, wherein the residual network is ResNet-18, consisting of 17 convolutional layers and 1 fully-connected layer; the deep network is SENet, which adds a channel attention mechanism on top of the ResNet-18 framework and uses it to select channel weights during training.
4. The method of claim 3, wherein training and testing the image definition judgment model with the data-enhanced face image data set to determine the trained image definition judgment model comprises:
initializing the network weights of the convolutional layer group of the residual network ResNet with pretrained ResNet-18 weights, and randomly initializing the network weights of the fully-connected layer of the ResNet structure;
setting the initial learning rate of the convolutional layer group and the fully-connected layer of the residual network ResNet to 0.01, using cross entropy as the loss function, keeping the learning rate unchanged for a first preset number of training samples in the early stage of iterative training, and reducing it to 0.001 and then to 0.0001 after every second preset number of training samples in the later stage of iterative training;
randomly selecting a preset percentage of face images from the data-enhanced face image data set as the training set and using the remaining face images as the test set for training and testing;
and performing iterative training on the image definition judgment model with the stochastic gradient descent (SGD) algorithm, and selecting the network model with the minimum loss function as the trained image definition judgment model.
5. The method of claim 1, wherein preprocessing the to-be-detected face image containing the tongue region and analyzing the preprocessed face image with the trained image definition judgment model to determine the image definition of the to-be-detected face image containing the tongue region comprises:
scaling the to-be-detected face image containing the tongue region to a preset size, and subtracting the mean value of each channel to obtain the preprocessed to-be-detected face image containing the tongue region;
analyzing the preprocessed face image with the trained image definition judgment model to obtain the probability values corresponding to the different definition degrees, and selecting the definition degree with the maximum probability value as the image definition of the to-be-detected face image containing the tongue region;
wherein the definition degrees comprise: clear, relatively clear, and unclear.
6. A system for determining image definition based on a deep learning algorithm, the system comprising:
the definition labeling unit is used for acquiring an original face image data set containing a tongue region and labeling each original face image according to its degree of definition;
the data enhancement processing unit is used for carrying out data enhancement processing on the marked original face image data set so as to obtain a data enhanced face image data set;
the image definition judgment model determining unit is used for building a deep network on the framework of a residual network, establishing the association between an optimizer and the residual network to form an image definition judgment model, and training and testing the image definition judgment model with the data-enhanced face image data set to determine the trained image definition judgment model;
and the image definition determining unit is used for preprocessing the to-be-detected face image containing the tongue region and analyzing the preprocessed face image by utilizing the trained image definition judging model so as to determine the image definition of the to-be-detected face image containing the tongue region.
7. The system of claim 6, wherein the data enhancement processing unit performs data enhancement processing on the labeled original face image data set to obtain a data-enhanced face image data set, comprising:
processing the face image data in the labeled original face image data set with at least one of color space conversion, brightness adjustment, saturation adjustment, channel conversion, random cropping, horizontal mirroring, and normalization to obtain a data-enhanced face image data set.
8. The system of claim 6, wherein the residual network is ResNet-18, consisting of 17 convolutional layers and 1 fully-connected layer; the deep network is SENet, which adds a channel attention mechanism on top of the ResNet-18 framework and uses it to select channel weights during training.
9. The system of claim 8, wherein the image definition judgment model determining unit trains and tests the image definition judgment model with the data-enhanced face image data set to determine the trained image definition judgment model, comprising:
initializing the network weights of the convolutional layer group of the residual network ResNet with pretrained ResNet-18 weights, and randomly initializing the network weights of the fully-connected layer of the ResNet structure;
setting the initial learning rate of the convolutional layer group and the fully-connected layer of the residual network ResNet to 0.01, using cross entropy as the loss function, keeping the learning rate unchanged for a first preset number of training samples in the early stage of iterative training, and reducing it to 0.001 and then to 0.0001 after every second preset number of training samples in the later stage of iterative training;
randomly selecting a preset percentage of face images from the data-enhanced face image data set as the training set and using the remaining face images as the test set for training and testing;
and performing iterative training on the image definition judgment model with the stochastic gradient descent (SGD) algorithm, and selecting the network model with the minimum loss function as the trained image definition judgment model.
10. The system of claim 6, wherein the image definition determining unit preprocesses the to-be-detected face image containing the tongue region and analyzes the preprocessed face image with the trained image definition judgment model to determine the image definition of the to-be-detected face image containing the tongue region, comprising:
scaling the to-be-detected face image containing the tongue region to a preset size, and subtracting the mean value of each channel to obtain the preprocessed to-be-detected face image containing the tongue region;
analyzing the preprocessed face image with the trained image definition judgment model to obtain the probability values corresponding to the different definition degrees, and selecting the definition degree with the maximum probability value as the image definition of the to-be-detected face image containing the tongue region;
wherein the definition degrees comprise: clear, relatively clear, and unclear.
CN202010017473.8A (filed 2020-01-08): Method and system for determining image definition based on deep learning algorithm; status Pending; published as CN111242911A

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010017473.8A CN111242911A (en) 2020-01-08 2020-01-08 Method and system for determining image definition based on deep learning algorithm


Publications (1)

Publication Number Publication Date
CN111242911A (en) 2020-06-05

Family

ID=70874355

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010017473.8A Pending CN111242911A (en) 2020-01-08 2020-01-08 Method and system for determining image definition based on deep learning algorithm

Country Status (1)

Country Link
CN (1) CN111242911A (en)


Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018035794A1 (en) * 2016-08-22 2018-03-01 中国科学院深圳先进技术研究院 System and method for measuring image resolution value
CN108898579A (en) * 2018-05-30 2018-11-27 腾讯科技(深圳)有限公司 A kind of image definition recognition methods, device and storage medium
CN110309789A (en) * 2019-07-04 2019-10-08 北京维联众诚科技有限公司 Video monitoring human face clarity evaluation method and device based on deep learning
CN110533097A (en) * 2019-08-27 2019-12-03 腾讯科技(深圳)有限公司 A kind of image definition recognition methods, device, electronic equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Jie Hu et al., "Squeeze-and-Excitation Networks", arXiv:1709.01507v4 [cs.CV] *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111885297A (en) * 2020-06-16 2020-11-03 北京迈格威科技有限公司 Image definition determining method, image focusing method and device
CN113435470A (en) * 2021-05-10 2021-09-24 北京化工大学 Three-dimensional object feature region identification method based on semantic segmentation
CN113435470B (en) * 2021-05-10 2024-04-26 北京化工大学 Three-dimensional object feature region identification method based on semantic segmentation
CN117019883A (en) * 2023-08-25 2023-11-10 华北电力大学(保定) Strip rolling process plate shape prediction method based on deep learning
CN117019883B (en) * 2023-08-25 2024-02-13 华北电力大学(保定) Strip rolling process plate shape prediction method based on deep learning


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200605