CN110705600A - Cross-correlation entropy based multi-depth learning model fusion method, terminal device and readable storage medium - Google Patents

Cross-correlation entropy based multi-depth learning model fusion method, terminal device and readable storage medium

Info

Publication number
CN110705600A
Authority
CN
China
Prior art keywords
model
cross
classification
correlation entropy
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201910843623.8A
Other languages
Chinese (zh)
Inventor
杜少毅
宋盼盼
国雷达
许美凤
崔文婷
万腾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian Jiaotong University
Original Assignee
Xian Jiaotong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian Jiaotong University filed Critical Xian Jiaotong University
Priority to CN201910843623.8A priority Critical patent/CN110705600A/en
Publication of CN110705600A publication Critical patent/CN110705600A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • G06N3/084Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a multi-depth learning model fusion method based on cross-correlation entropy, a terminal device and a readable storage medium, belonging to the field of computer image processing. The invention trains and tests on collected data to obtain an image classification model. The invention can effectively complete accurate classification of images, and maintains good robustness and accuracy even when image features are extremely similar or images contain considerable noise. Image classification serves as the basis of high-level visual tasks such as image detection, image segmentation, object tracking and behavior analysis, and can be widely applied in many fields: face recognition and intelligent video analysis in the security field, scene recognition in the traffic field, content-based image retrieval and automatic image classification in the internet field, and image recognition in the medical field.

Description

Cross-correlation entropy based multi-deep learning model fusion method, terminal device and readable storage medium
[ technical field ]
The invention belongs to the field of computer image processing, and relates to a cross-correlation entropy-based multi-deep learning model fusion method, terminal equipment and a readable storage medium.
[ background of the invention ]
Computer vision is one of the most popular research areas in deep learning. As a cross-domain discipline, computer vision draws on computer science (graphics, algorithms, theory, systems, architecture), mathematics (information retrieval, machine learning), engineering (robotics, speech, natural language processing, image processing), physics (optics), biology (neuroscience) and psychology (cognitive science), and has opened the way for the development of artificial intelligence. Image classification is the most fundamental task in computer vision: given a set of images each labeled with a class, predict the classes of a new set of test images and measure the accuracy of the predictions, with the goal of minimizing classification error. However, one of the most important disadvantages of vision technology is its relatively strict requirement on image quality. Under viewpoint change, scale change, intra-class variation, image deformation, image occlusion, background clutter and similar conditions, the acquired image information contains considerable noise, which poses a great challenge to the accuracy of classification algorithms.
[ summary of the invention ]
The present invention is directed to overcoming the disadvantages of the prior art by providing a multi-deep-learning-model fusion method based on cross-correlation entropy, a terminal device and a readable storage medium.
In order to achieve this purpose, the invention adopts the following technical scheme:
a cross-correlation entropy based multi-deep learning model fusion method is suitable for being executed in computing equipment and comprises the following steps:
step 1: preprocessing the data set;
step 2: respectively training a plurality of single model classification convolution neural networks to enable each model to obtain an optimal result;
step 3: applying a multi-model integration method to the single models with optimal results, and learning the contribution weight of each single-model classification convolutional neural network to the final result by a multi-model fusion method based on the cross-correlation entropy criterion, with the objective function:

J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \exp\left( -\frac{\left( y_i - (w^T x_i + b) \right)^2}{2\sigma^2} \right)

where w = [w_1, w_2, …, w_n] represents the weights of the integrated model, b is the bias, m is the number of samples, σ is the kernel width, x_i represents the outputs of the i-th sample from the multiple single models, and y_i represents the true label of the i-th sample;

using a gradient ascent method to obtain the parameters w_j and b that bring the objective function close to its maximum;

step 4: training with the objective function of step 3 to obtain the optimal image classification result of the multi-depth convolutional neural network integrated model based on the cross-correlation entropy criterion.
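The cross-correlation entropy objective described above can be sketched numerically. The following is a minimal illustration (not from the patent; the function and variable names are our own), assuming each row of X stacks the n single-model outputs for one sample:

```python
import numpy as np

def correntropy_objective(w, b, X, y, sigma):
    """Cross-correlation entropy (correntropy) objective J(w, b):
    the mean of a Gaussian kernel applied to the fusion residuals.

    X: (m, n) array, row i = outputs of the n single models for sample i
    y: (m,) true labels; w: (n,) fusion weights; b: bias; sigma: kernel width
    """
    residuals = y - (X @ w + b)                 # e_i = y_i - (w^T x_i + b)
    return np.mean(np.exp(-residuals**2 / (2 * sigma**2)))
```

A perfect fusion (all residuals zero) attains the maximum value 1; because the Gaussian kernel decays as the residual grows, outlier samples contribute little to the objective, which is what gives the criterion its robustness to noise.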
The invention further improves the following steps:
the specific method of the step 2 is as follows:
step 2-1: training a plurality of deep convolutional neural network models on the same training set X built from the collected data; first initializing the model variables w and b, then obtaining predicted values by forward propagation through the network;
step 2-2: performing back propagation using a loss function and an optimization method to update the model weights w and b, calculated as:

w_j^{(k+1)} = w_j^{(k)} - \lambda \frac{\partial J(w, b)}{\partial w_j}

b^{(k+1)} = b^{(k)} - \lambda \frac{\partial J(w, b)}{\partial b}

where λ represents the learning rate, w_j represents the weight of the j-th parameter, k represents the k-th iteration, and w_j^{(k+1)} is the updated weight of the j-th parameter; J(w, b) is the objective function, and the gradient descent method updates the parameters w_j and b of the objective function;

step 2-3: repeating steps 2-1 to 2-2 until the set number of training iterations is reached or the training target is met, yielding the optimal model.
The multiple deep convolutional neural network models are one or more of VGG, Inception V4, Xception, ResNet or DenseNet.
The specific method of the step 3 is as follows:
step 3-1: for the multiple existing single models, learning the contribution weights w = [w_1, w_2, …, w_n] of each single model to the final result using an MCC-based multi-model integration method, where n is the number of single models;

step 3-2: taking the cross-correlation entropy criterion as the objective function, calculated as:

J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \exp\left( -\frac{\left( y_i - (w^T x_i + b) \right)^2}{2\sigma^2} \right)

step 3-3: maximizing the objective function defined by the cross-correlation entropy using the relation obtained in step 3-2, where the contribution of each single model to the final result is updated according to:

w_j^{(k+1)} = w_j^{(k)} + \lambda \frac{\partial J(w, b)}{\partial w_j}

b^{(k+1)} = b^{(k)} + \lambda \frac{\partial J(w, b)}{\partial b}

Gradient ascent is applied to the cross-correlation entropy criterion; after multiple iterations the objective function approaches its maximum, achieving the goal of learning the parameters w_j and b;

step 3-4: judging whether the classification accuracy of the current model meets the requirement; if so, outputting the parameters w_j and b; if not, continuing to adjust the model parameters, returning to step 3-2, and continuing the iterative computation until the optimal parameters are obtained, yielding the final model.
A memory device having stored therein a plurality of instructions adapted to be loaded and executed by a processor:
step 1: preprocessing the data set;
step 2: respectively training a plurality of single model classification convolution neural networks to enable each model to obtain an optimal result;
step 3: applying a multi-model integration method to the single models with optimal results, and learning the contribution weight of each single-model classification convolutional neural network to the final result by a multi-model fusion method based on the cross-correlation entropy criterion, with the objective function:

J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \exp\left( -\frac{\left( y_i - (w^T x_i + b) \right)^2}{2\sigma^2} \right)

where w = [w_1, w_2, …, w_n] represents the weights of the integrated model, b is the bias, m is the number of samples, σ is the kernel width, x_i represents the outputs of the i-th sample from the multiple single models, and y_i represents the true label of the i-th sample;

using a gradient ascent method to obtain the parameters w_j and b that bring the objective function close to its maximum;

step 4: training with the objective function of step 3 to obtain the optimal image classification result of the multi-depth convolutional neural network integrated model based on the cross-correlation entropy criterion.
A mobile terminal, comprising:
a processor adapted to implement instructions; and
a storage device adapted to store a plurality of instructions, the instructions adapted to be loaded and executed by a processor to:
step 1: preprocessing the data set;
step 2: respectively training a plurality of single model classification convolution neural networks to enable each model to obtain an optimal result;
step 3: applying a multi-model integration method to the single models with optimal results, and learning the contribution weight of each single-model classification convolutional neural network to the final result by a multi-model fusion method based on the cross-correlation entropy criterion, with the objective function:

J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \exp\left( -\frac{\left( y_i - (w^T x_i + b) \right)^2}{2\sigma^2} \right)

where w = [w_1, w_2, …, w_n] represents the weights of the integrated model, b is the bias, m is the number of samples, σ is the kernel width, x_i represents the outputs of the i-th sample from the multiple single models, and y_i represents the true label of the i-th sample;

using a gradient ascent method to obtain the parameters w_j and b that bring the objective function close to its maximum;

step 4: training with the objective function of step 3 to obtain the optimal image classification result of the multi-depth convolutional neural network integrated model based on the cross-correlation entropy criterion.
Compared with the prior art, the invention has the following beneficial effects:
the method comprises the steps of training a plurality of single model networks respectively by utilizing an acquired data set including scene recognition in the traffic field and skin disease classification in the medical field to enable the classification effect of each network to be optimal, then performing multi-model fusion by utilizing a maximum cross-correlation entropy criterion as a target function, and calculating the contribution weight of each single model network to a final result to obtain a final classification model.
The method uses a traffic scene data set acquired by unmanned vehicles of the Institute of Artificial Intelligence and Robotics of Xi'an Jiaotong University, together with pathological sections from the dermatology department of the Second Affiliated Hospital of Xi'an Jiaotong University, as data sources, and completes classification of traffic scenes and pathological sections through data preprocessing, training of multiple single-model networks, and multi-model integrated training based on maximum cross-correlation entropy. Because skin disease slices are extremely similar to one another, the judgment of each single model is unstable and of limited accuracy; the model integration approach therefore classifies the slices well with high accuracy. In addition, for the problem that slices contain considerable noise, the multi-model integration method based on maximum cross-correlation entropy in step 3.3) reduces noise interference and improves model accuracy. Robust and accurate classification of images can thus be realized.
[ description of the drawings ]
FIG. 1 is an overall structure of a convolutional neural network;
FIG. 2 is an overall framework of an integrated model based on a plurality of convolutional neural networks;
FIG. 3 shows the experimental results of different models.
[ detailed description ] embodiments
In order to make the technical solutions of the present invention better understood, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, not all of the embodiments, and are not intended to limit the scope of the present disclosure. Moreover, in the following description, descriptions of well-known structures and techniques are omitted so as to not unnecessarily obscure the concepts of the present disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Various structural schematics according to the disclosed embodiments of the invention are shown in the drawings. The figures are not drawn to scale, wherein certain details are exaggerated and possibly omitted for clarity of presentation. The shapes of various regions, layers and their relative sizes and positional relationships shown in the drawings are merely exemplary, and deviations may occur in practice due to manufacturing tolerances or technical limitations, and a person skilled in the art may additionally design regions/layers having different shapes, sizes, relative positions, according to actual needs.
In the context of the present disclosure, when a layer/element is referred to as being "on" another layer/element, it can be directly on the other layer/element or intervening layers/elements may be present. In addition, if a layer/element is "on" another layer/element in one orientation, then that layer/element may be "under" the other layer/element when the orientation is reversed.
It should be noted that the terms "first," "second," and the like in the description and claims of the present invention and in the drawings described above are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used is interchangeable under appropriate circumstances such that the embodiments of the invention described herein are capable of operation in sequences other than those illustrated or described herein. Furthermore, the terms "comprises," "comprising," and "having," and any variations thereof, are intended to cover a non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements expressly listed, but may include other steps or elements not expressly listed or inherent to such process, method, article, or apparatus.
The invention is described in further detail below with reference to the accompanying drawings:
referring to fig. 1, in order to accurately, stably and robustly classify images, the invention provides a cross-correlation entropy based multi-deep learning model fusion method and an application thereof in image classification, and the method specifically comprises three parts, namely data set collection and preprocessing, training of a plurality of single models, and multi-model fusion, as shown in fig. 1.
The method is specifically carried out according to the following steps:
step 1.1: to classify input images, the data must first be preprocessed. Preprocessing includes converting the data into a format readable by the neural network and normalizing it; augmentation by rotation, mirroring, cropping and similar means gives the trained model better robustness.
Step 1.2: in order to evaluate and select the model, the data set is sampled independently and identically and divided into a training set for model training and learning, a cross-validation set for preventing model overfitting, and a test set for assessing the model's discrimination ability on samples.
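The split described in step 1.2 can be sketched as follows (a hypothetical helper of our own; the 70/15/15 ratios are illustrative defaults, not specified by the invention):

```python
import numpy as np

def split_dataset(n_samples, train=0.7, val=0.15, seed=0):
    """I.i.d. split of sample indices into train / cross-validation / test sets."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)            # random i.i.d. ordering
    n_train = int(train * n_samples)
    n_val = int(val * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]
```

Shuffling before slicing keeps the three subsets identically distributed, which is what makes the cross-validation and test accuracies meaningful estimates.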
Step 2.1: training models of several deep convolutional neural networks, such as VGG, Inception V4, Xception, ResNet and DenseNet, separately on the same training set X built from the collected data, so that each model obtains its optimal result, and saving the model parameters w and b;
Step 2.2: performing back propagation using a loss function and an optimization method to update the model weights w and b, calculated as:

w_j^{(k+1)} = w_j^{(k)} - \lambda \frac{\partial J(w, b)}{\partial w_j}

b^{(k+1)} = b^{(k)} - \lambda \frac{\partial J(w, b)}{\partial b}

where λ represents the learning rate and w_j represents the weight of the j-th parameter; J(w, b) is the objective function, and the gradient descent method updates the parameters w_j and b of the objective function.

Step 2.3: repeating the above steps until the set number of training iterations is reached or the training target is met, yielding the optimal model.
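The weight-update rule of steps 2.2 and 2.3 can be illustrated on a toy objective. The sketch below runs gradient descent on a squared-error loss for a single linear layer; it stands in for full backpropagation through a deep network, and all names and hyperparameters are our own:

```python
import numpy as np

def train_linear_layer(X, y, lam=0.1, iters=5000):
    """Gradient descent on J(w, b) = mean squared error, applying
    w_j <- w_j - lambda * dJ/dw_j  and  b <- b - lambda * dJ/db."""
    m, n = X.shape
    w, b = np.zeros(n), 0.0
    for _ in range(iters):
        err = X @ w + b - y                     # forward-pass residual
        grad_w = (X.T @ err) / m                # dJ/dw_j
        grad_b = err.mean()                     # dJ/db
        w, b = w - lam * grad_w, b - lam * grad_b
    return w, b
```

On data generated by y = 2x + 1, the loop recovers weight 2 and bias 1 to within numerical tolerance, mirroring the repeat-until-target loop of step 2.3.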
Because the accuracy of a single model is very limited, the method uses a multi-model integration approach that learns each model's contribution to the final result, improving the sensitivity of the model; and, to cope with abundant image noise, the maximum cross-correlation entropy criterion is proposed to improve the model's noise resistance and obtain more accurate predictions.
Step 3.1: for the multiple existing single models, a multi-model integration method based on maximum cross-correlation entropy is used to learn the contribution weights w = [w_1, w_2, …, w_n] of each single model to the final result, where n is the number of trained single models.

Step 3.2: taking the cross-correlation entropy criterion as the objective function, calculated as:

J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \exp\left( -\frac{\left( y_i - (w^T x_i + b) \right)^2}{2\sigma^2} \right)

where there are m samples, w = [w_1, w_2, …, w_n] represents the weights of the integrated model, w_j represents the weight of the j-th model, x_i represents the outputs of the i-th sample from the multiple models, and y_i represents the true label of the i-th sample.

Step 3.3: maximizing the objective function defined by the cross-correlation entropy using the relation obtained in step 3.2), where the contribution of each single model to the final result is updated according to:

\frac{\partial J(w, b)}{\partial w_j} = \frac{1}{m\sigma^2} \sum_{i=1}^{m} \exp\left( -\frac{e_i^2}{2\sigma^2} \right) e_i x_{ij}, \qquad e_i = y_i - (w^T x_i + b)

w_j^{(k+1)} = w_j^{(k)} + \lambda \frac{\partial J(w, b)}{\partial w_j}

b^{(k+1)} = b^{(k)} + \lambda \frac{\partial J(w, b)}{\partial b}

where σ denotes the kernel width and λ denotes the learning rate. Gradient ascent is applied to the cross-correlation entropy criterion; after multiple iterations the objective function approaches its maximum, achieving the goal of learning the parameters w_j and b.

Step 3.4: judging whether the classification accuracy of the current model meets the requirement; if so, outputting the parameters w_j and b; if not, continuing to adjust the model parameters, returning to step 3.2), and continuing the iterative computation to obtain the optimal parameters.
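Steps 3.1 to 3.4 can be condensed into the following sketch of gradient ascent on the correntropy objective. This is an illustration under our own naming; the hyperparameters sigma, lam and the iteration count are arbitrary choices, and a real run would additionally check classification accuracy as in step 3.4:

```python
import numpy as np

def fuse_models_mcc(X, y, sigma=1.0, lam=0.5, iters=2000):
    """Learn fusion weights w and bias b by gradient *ascent* on
    J(w, b) = (1/m) sum_i exp(-(y_i - w.x_i - b)^2 / (2 sigma^2)).

    X: (m, n) single-model outputs per sample; y: (m,) true labels.
    """
    m, n = X.shape
    w, b = np.full(n, 1.0 / n), 0.0             # start from uniform weights
    for _ in range(iters):
        e = y - (X @ w + b)                     # residuals e_i
        # Gaussian-weighted residuals: exp(-e^2/(2 sigma^2)) * e / (m sigma^2)
        g = np.exp(-e**2 / (2 * sigma**2)) * e / (m * sigma**2)
        w += lam * (X.T @ g)                    # w_j += lambda * dJ/dw_j
        b += lam * g.sum()                      # b   += lambda * dJ/db
    return w, b
```

The Gaussian factor down-weights samples with large residuals, so a noisy or mislabeled sample barely moves the weights, which is the noise-robustness property the criterion is chosen for.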
Through the above training, a multi-model fusion based on cross-correlation entropy is obtained, and accurate classification results for input images can be produced. The experiment on the medical skin data set classifies pathological sections of basal cell carcinoma and seborrheic keratosis, whose image features are very similar and noisy. The image classification results of several single-model networks are compared with those of the cross-correlation entropy based multi-deep-learning-model fusion method of the invention, as shown in fig. 3, where basal cell carcinoma images appear above the middle line and seborrheic keratosis images below it. Dashed boxes indicate misclassified samples. In the figure, (a) is the original pathological image, (b) the ResNet classification result, (c) the Xception classification result, (d) the DenseNet classification result, and (e) the classification result of the MCC-based integration method.
The deep-learning-based image classification method of the invention learns hierarchical feature descriptions in a supervised or unsupervised manner, replacing the traditional work of manually designing or selecting image features. Convolutional neural networks in deep learning have achieved remarkable performance in the image domain: they take image pixel information as input, retain the information of the input image to the greatest extent, and extract features through convolution operations. This end-to-end learning approach achieves very good results and is widely applied.
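The convolution operation referred to above can be illustrated minimally. The following is a toy single-channel sketch of our own (not the networks actually used by the invention), computing one feature map from one kernel:

```python
import numpy as np

def conv2d_single(image, kernel):
    """Valid 2-D cross-correlation of one channel with one kernel — the core
    feature-extraction step of a convolutional layer."""
    h, w = kernel.shape
    H, W = image.shape
    out = np.empty((H - h + 1, W - w + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # inner product of the kernel with the image patch at (i, j)
            out[i, j] = np.sum(image[i:i+h, j:j+w] * kernel)
    return out
```

With a [-1, 1] kernel, for example, the output responds only where pixel intensity changes horizontally, showing how a learned kernel turns raw pixels into an edge-like feature.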
An embodiment of the invention further provides a mobile terminal. The mobile terminal of this embodiment includes: a processor, a memory, and a computer program stored in the memory and executable on the processor. When executing the computer program, the processor implements the steps of the above embodiment of the cross-correlation entropy based multi-deep-learning-model fusion method, such as the steps shown in fig. 1. Alternatively, the processor implements the functions of the modules/units in the above device embodiments when executing the computer program.
The computer program may be partitioned into one or more modules/units that are stored in the memory and executed by the processor to implement the invention.
The mobile terminal can be a desktop computer, a notebook, a palmtop computer, a cloud server or other computing equipment, and may include, but is not limited to, a processor and a memory.
The processor may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, etc.
The mobile terminal integrated module/unit, if implemented in the form of a software functional unit and sold or used as a separate product, may be stored in a computer-readable storage medium. Based on such understanding, all or part of the flow of the method according to the embodiments of the present invention may also be implemented by a computer program, which may be stored in a computer-readable storage medium, and when the computer program is executed by a processor, the steps of the method embodiments may be implemented. Wherein the computer program comprises computer program code, which may be in the form of source code, object code, an executable file or some intermediate form, etc. The computer-readable medium may include: any entity or device capable of carrying the computer program code, recording medium, usb disk, removable hard disk, magnetic disk, optical disk, computer memory, Read-only memory (ROM), Random Access Memory (RAM), electrical carrier wave signals, telecommunications signals, software distribution medium, etc. It should be noted that the computer readable medium may contain content that is subject to appropriate increase or decrease as required by legislation and patent practice in jurisdictions, for example, in some jurisdictions, computer readable media does not include electrical carrier signals and telecommunications signals as is required by legislation and patent practice.
The memory may be used to store the computer programs and/or modules, and the processor may implement various functions of the mobile terminal by executing or executing the computer programs and/or modules stored in the memory and calling data stored in the memory.
The above-mentioned contents are only for illustrating the technical idea of the present invention, and the protection scope of the present invention is not limited thereby, and any modification made on the basis of the technical idea of the present invention falls within the protection scope of the claims of the present invention.

Claims (6)

1. A cross-correlation entropy based multi-deep learning model fusion method suitable for being executed in a computing device is characterized by comprising the following steps of:
step 1: preprocessing the data set;
step 2: respectively training a plurality of single model classification convolution neural networks to enable each model to obtain an optimal result;
step 3: applying a multi-model integration method to the single models with optimal results, and learning the contribution weight of each single-model classification convolutional neural network to the final result by a multi-model fusion method based on the cross-correlation entropy criterion, with the objective function:

J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \exp\left( -\frac{\left( y_i - (w^T x_i + b) \right)^2}{2\sigma^2} \right)

where w = [w_1, w_2, …, w_n] represents the weights of the integrated model, b is the bias, m is the number of samples, σ is the kernel width, x_i represents the outputs of the i-th sample from the multiple single models, and y_i represents the true label of the i-th sample;

using a gradient ascent method to obtain the parameters w_j and b that bring the objective function close to its maximum;

step 4: training with the objective function of step 3 to obtain the optimal image classification result of the multi-depth convolutional neural network integrated model based on the cross-correlation entropy criterion.
2. The fusion method of the multi-deep learning model based on the cross-correlation entropy of claim 1, wherein the specific method of the step 2 is as follows:
step 2-1: training a plurality of deep convolutional neural network models on the same training set X built from the collected data; first initializing the model variables w and b, then obtaining predicted values by forward propagation through the network;

step 2-2: performing back propagation using a loss function and an optimization method to update the model weights w and b, calculated as:

w_j^{(k+1)} = w_j^{(k)} - \lambda \frac{\partial J(w, b)}{\partial w_j}

b^{(k+1)} = b^{(k)} - \lambda \frac{\partial J(w, b)}{\partial b}

where λ represents the learning rate, w_j represents the weight of the j-th parameter, k represents the k-th iteration, and w_j^{(k+1)} is the updated weight of the j-th parameter; J(w, b) is the objective function, and the gradient descent method updates the parameters w_j and b of the objective function;

step 2-3: repeating steps 2-1 to 2-2 until the set number of training iterations is reached or the training target is met, yielding the optimal model.
3. The fusion method of the multiple deep learning models based on the cross-correlation entropy of claim 2, wherein the multiple deep convolutional neural network models are one or more of VGG, Inception V4, Xception, ResNet or DenseNet.
4. The fusion method of the multi-deep learning model based on the cross-correlation entropy of claim 1, wherein the specific method of the step 3 is as follows:
step 3-1: for the multiple existing single models, learning the contribution weights w = [w_1, w_2, …, w_n] of each single model to the final result using an MCC-based multi-model integration method, where n is the number of single models;

step 3-2: taking the cross-correlation entropy criterion as the objective function, calculated as:

J(w, b) = \frac{1}{m} \sum_{i=1}^{m} \exp\left( -\frac{\left( y_i - (w^T x_i + b) \right)^2}{2\sigma^2} \right)

step 3-3: maximizing the objective function defined by the cross-correlation entropy using the relation obtained in step 3-2, where the contribution of each single model to the final result is updated according to:

\frac{\partial J(w, b)}{\partial w_j} = \frac{1}{m\sigma^2} \sum_{i=1}^{m} \exp\left( -\frac{e_i^2}{2\sigma^2} \right) e_i x_{ij}, \qquad e_i = y_i - (w^T x_i + b)

w_j^{(k+1)} = w_j^{(k)} + \lambda \frac{\partial J(w, b)}{\partial w_j}

b^{(k+1)} = b^{(k)} + \lambda \frac{\partial J(w, b)}{\partial b}

Gradient ascent is applied to the cross-correlation entropy criterion; after multiple iterations the objective function approaches its maximum, achieving the goal of learning the parameters w_j and b;

step 3-4: judging whether the classification accuracy of the current model meets the requirement; if so, outputting the parameters w_j and b; if not, continuing to adjust the model parameters, returning to step 3-2, and continuing the iterative computation to obtain the optimal parameters, yielding the final model.
5. A memory device having stored therein a plurality of instructions adapted to be loaded and executed by a processor:
step 1: preprocessing the data set;
step 2: respectively training a plurality of single model classification convolution neural networks to enable each model to obtain an optimal result;
step 3: applying a multi-model integration method to the single models with optimal results, and learning the contribution weight of each single-model classification convolutional neural network to the final result by a multi-model fusion method based on the cross-correlation entropy criterion, wherein the objective function is:
$$J(w,b)=\frac{1}{m}\sum_{i=1}^{m}\exp\left(-\frac{(w^{T}x_{i}+b-y_{i})^{2}}{2\sigma^{2}}\right)$$
where w = [w_1, w_2, …, w_n] represents the weights of the integrated model, b is the bias, m is the number of samples, σ is the kernel width, x_i represents the outputs of the plurality of single models for the ith sample, and y_i represents the true label of the ith sample;
obtaining, by a gradient-ascent method, the parameters w_j and b that bring the objective function close to its maximum;
step 4: training with the objective function of step 3 to obtain the optimal image classification result of the cross-correlation entropy based multi-depth convolutional neural network integrated model.
6. A mobile terminal, comprising:
a processor adapted to implement instructions; and
a storage device adapted to store a plurality of instructions, the instructions adapted to be loaded and executed by a processor to:
step 1: preprocessing the data set;
step 2: respectively training a plurality of single-model classification convolutional neural networks so that each model achieves its optimal result;
step 3: applying a multi-model integration method to the single models with optimal results, and learning the contribution weight of each single-model classification convolutional neural network to the final result by a multi-model fusion method based on the cross-correlation entropy criterion, wherein the objective function is:
$$J(w,b)=\frac{1}{m}\sum_{i=1}^{m}\exp\left(-\frac{(w^{T}x_{i}+b-y_{i})^{2}}{2\sigma^{2}}\right)$$
where w = [w_1, w_2, …, w_n] represents the weights of the integrated model, b is the bias, m is the number of samples, σ is the kernel width, x_i represents the outputs of the plurality of single models for the ith sample, and y_i represents the true label of the ith sample;
obtaining, by a gradient-ascent method, the parameters w_j and b that bring the objective function close to its maximum;
step 4: training with the objective function of step 3 to obtain the optimal image classification result of the cross-correlation entropy based multi-depth convolutional neural network integrated model.
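Once the fusion weights are learned, step 4's inference amounts to a weighted combination of the single models' class-probability outputs. A minimal sketch, with the function name and array layout as assumptions (the patent does not prescribe them): probs has shape (n_models, n_classes), where row j is model j's softmax output for the same input image.

```python
import numpy as np

def fuse_predictions(probs, w, b=0.0):
    """Fuse n models' class probabilities with learned weights w and bias b.

    probs: array of shape (n_models, n_classes), one softmax row per model.
    Returns the predicted class index and the fused score vector.
    """
    fused = w @ probs + b                  # weighted sum over the model axis
    return int(np.argmax(fused)), fused
```

Because each row of probs sums to 1, fused scores also sum to 1 whenever the weights sum to 1 and b = 0, so the fused vector can still be read as a class distribution.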
CN201910843623.8A 2019-09-06 2019-09-06 Cross-correlation entropy based multi-depth learning model fusion method, terminal device and readable storage medium Pending CN110705600A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910843623.8A CN110705600A (en) 2019-09-06 2019-09-06 Cross-correlation entropy based multi-depth learning model fusion method, terminal device and readable storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910843623.8A CN110705600A (en) 2019-09-06 2019-09-06 Cross-correlation entropy based multi-depth learning model fusion method, terminal device and readable storage medium

Publications (1)

Publication Number Publication Date
CN110705600A true CN110705600A (en) 2020-01-17

Family

ID=69194727

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910843623.8A Pending CN110705600A (en) 2019-09-06 2019-09-06 Cross-correlation entropy based multi-depth learning model fusion method, terminal device and readable storage medium

Country Status (1)

Country Link
CN (1) CN110705600A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111523592A (en) * 2020-04-21 2020-08-11 易拍全球(北京)科贸有限公司 Historical relic artwork field image similarity measurement algorithm based on deep learning
CN112101172A (en) * 2020-09-08 2020-12-18 平安科技(深圳)有限公司 Weight grafting-based model fusion face recognition method and related equipment
CN112434732A (en) * 2020-11-17 2021-03-02 西安交通大学 Deep learning classification method based on feature screening
CN113496442A (en) * 2020-03-19 2021-10-12 荷盛崧钜智财顾问股份有限公司 Graph representation generation system, graph representation generation method and graph representation intelligent module thereof
CN113537150A (en) * 2021-08-11 2021-10-22 西安交通大学 Hyperspectral image target anomaly detection method, system, terminal and storage medium

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Miao Shengjun: "Research on Deformation and Instability Prediction of Complex Rock Mass Slopes", 31 August 2016, Metallurgical Industry Press, Beijing *
Xu Meifeng et al.: "Skin Disease Recognition Method Based on Fusion of Multiple Convolutional Neural Network Models", Journal of Xi'an Jiaotong University *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113496442A (en) * 2020-03-19 2021-10-12 荷盛崧钜智财顾问股份有限公司 Graph representation generation system, graph representation generation method and graph representation intelligent module thereof
CN111523592A (en) * 2020-04-21 2020-08-11 易拍全球(北京)科贸有限公司 Historical relic artwork field image similarity measurement algorithm based on deep learning
CN111523592B (en) * 2020-04-21 2023-05-09 易元数字(北京)科技集团有限公司 Image similarity measurement method based on deep learning in cultural relic artwork field
CN112101172A (en) * 2020-09-08 2020-12-18 平安科技(深圳)有限公司 Weight grafting-based model fusion face recognition method and related equipment
WO2021155713A1 (en) * 2020-09-08 2021-08-12 平安科技(深圳)有限公司 Weight grafting model fusion-based facial recognition method, and related device
CN112101172B (en) * 2020-09-08 2023-09-08 平安科技(深圳)有限公司 Weight grafting-based model fusion face recognition method and related equipment
CN112434732A (en) * 2020-11-17 2021-03-02 西安交通大学 Deep learning classification method based on feature screening
CN113537150A (en) * 2021-08-11 2021-10-22 西安交通大学 Hyperspectral image target anomaly detection method, system, terminal and storage medium
CN113537150B (en) * 2021-08-11 2024-04-02 西安交通大学 Hyperspectral image target anomaly detection method, system, terminal and storage medium

Similar Documents

Publication Publication Date Title
Albattah et al. A novel deep learning method for detection and classification of plant diseases
Tang et al. Deepchart: Combining deep convolutional networks and deep belief networks in chart classification
US11836603B2 (en) Neural network method and apparatus with parameter quantization
CN110705600A (en) Cross-correlation entropy based multi-depth learning model fusion method, terminal device and readable storage medium
CN111507378A (en) Method and apparatus for training image processing model
US8954365B2 (en) Density estimation and/or manifold learning
Ahmed et al. Multi-objects detection and segmentation for scene understanding based on Texton forest and kernel sliding perceptron
Ashok Kumar et al. Enhanced facial emotion recognition by optimal descriptor selection with neural network
Shiloh-Perl et al. Introduction to deep learning
Wu et al. Research on lightweight infrared pedestrian detection model algorithm for embedded Platform
Xiao et al. Apple ripeness identification from digital images using transformers
Goumiri et al. A new hybrid model of convolutional neural networks and hidden Markov chains for image classification
US20220188605A1 (en) Recurrent neural network architectures based on synaptic connectivity graphs
Brahim et al. RNN-and CNN-based weed detection for crop improvement: An overview
Demertzis et al. Large-scale geospatial data analysis: Geographic object-based scene classification in remote sensing images by GIS and deep residual learning
Jeyalakshmi et al. Accurate liver disease prediction system using convolutional neural network
Zhao et al. Spectral–spatial classification of hyperspectral images using trilateral filter and stacked sparse autoencoder
Lin et al. LA-Net: LSTM and attention based point cloud down-sampling and its application
CN116152587A (en) Training method of expression recognition model, facial expression recognition method and facial expression recognition device
Shang et al. Real-time Accurate Object Counting for Smart Farms
CN114140524A (en) Closed loop detection system and method for multi-scale feature fusion
Zhang et al. Research on multimedia image classification technology based on chaos optimization machine learning algorithm
Struckmeier et al. Autoencoding slow representations for semi-supervised data-efficient regression
Asiri Unmanned aerial vehicles assisted rice seedling detection using shark smell optimization with deep learning model
Sreenivasulu et al. Adaptive inception based on transfer learning for effective visual recognition

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20200117