CN114863242A - Deep learning network optimization method and system for image recognition - Google Patents

Deep learning network optimization method and system for image recognition Download PDF

Info

Publication number
CN114863242A
CN114863242A
Authority
CN
China
Prior art keywords
label
distance
image data
image recognition
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210447416.2A
Other languages
Chinese (zh)
Other versions
CN114863242B (en)
Inventor
袁潮
赵月峰
Other inventors have requested that their names not be disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhuohe Technology Co Ltd
Original Assignee
Beijing Zhuohe Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhuohe Technology Co Ltd filed Critical Beijing Zhuohe Technology Co Ltd
Priority to CN202210447416.2A priority Critical patent/CN114863242B/en
Publication of CN114863242A publication Critical patent/CN114863242A/en
Application granted granted Critical
Publication of CN114863242B publication Critical patent/CN114863242B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A deep learning network optimization method and system for image recognition, applied in particular to the field of image recognition. Image data to be trained is collected, wherein the image data to be trained carries an initial label; a deep learning network model is constructed; the image data to be trained is input into the deep learning network model, and parameters of the deep learning network model are trained through an improved proportional-integral controller to construct a first image recognition model; a correction label of the image data to be trained is determined according to the first image recognition model; the distance between the initial label and the correction label is calculated by adopting a FASTA algorithm; the correction label of the image data to be trained is screened according to the distance; and the screened image data to be trained carrying the correction label is input into the first image recognition model, and parameters of the first image recognition model are trained through the improved proportional-integral controller to obtain a second image recognition model. Therefore, the recognition efficiency and accuracy of the image recognition model are improved.

Description

Deep learning network optimization method and system for image recognition
Technical Field
The present application relates to the field of image recognition, and more particularly, to an optimization method and system for a deep learning network for image recognition.
Background
Image recognition refers to the technology of using a computer to process, analyze and understand images in order to recognize various targets and objects; with the development of computer technology, deep learning algorithms have been put into practical application for this purpose. For example, image recognition includes pedestrian and face recognition, vehicle recognition, and the like.
With the development of deep learning technology, image recognition based on deep network models achieves satisfactory accuracy in controlled environments. However, as the quality and quantity of the acquired images decrease, the accuracy of such image recognition methods drops significantly. Furthermore, these methods lack the ability to discriminate features in the inference stage and cannot meet users' actual requirements.
Therefore, how to train a high-accuracy image recognition model from a small amount of low-quality image data when using a deep learning network model is an urgent problem to be solved.
Disclosure of Invention
The embodiments of the invention aim to provide a deep learning network optimization method and system for image recognition, in which the labels of the image data to be trained are corrected by a deep learning network model comprising a forward propagation network with four convolutional layers and four pooling layers and a backward propagation network with two convolutional layers, four expansion convolutional layers and four pooling layers, while a FASTA algorithm is introduced to calculate the distance between labels; this improves the label correction accuracy and further optimizes the recognition efficiency and accuracy of the image recognition model. The specific technical scheme is as follows:
In a first aspect of the embodiments of the present invention, an optimization method for an image recognition-oriented deep learning network is provided, including: collecting image data to be trained; wherein the image data to be trained carries an initial label; constructing a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolution layers and four pooling layers, and the backward propagation network comprises two convolution layers, four expansion convolution layers and four pooling layers; inputting the image data to be trained into the deep learning network model, and training parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model; determining a correction label of the image data to be trained according to the first image recognition model; calculating the distance between the initial label and the correction label by adopting a FASTA algorithm; screening a correction label of the image data to be trained according to the distance; inputting the screened image data to be trained carrying the correction label into the first image recognition model, and training parameters of the first image recognition model through the improved proportional-integral controller to obtain a second image recognition model.
Optionally, the training of parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model includes: updating the model parameters according to the estimated value of the gradient of the current step; judging whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; if so, performing parameter training by adopting a momentum optimization algorithm; if not, performing parameter training by adopting a stochastic gradient descent algorithm, until the model meets a preset training termination condition.
Optionally, whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is judged according to the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
Optionally, if the judgment value is 1, the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; and if the judgment value is -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
Optionally, the calculating, by using a FASTA algorithm, a distance between the initial tag and the revised tag includes: encoding the initial label and the corrected label by adopting a reference format of a deoxynucleotide sequence to obtain a first initial label sequence and a first corrected label sequence; calculating a distance between the first initial tag sequence and the first modified tag sequence as a first distance; encoding the initial label and the corrected label by one-hot to obtain a second initial label sequence and a second corrected label sequence; calculating a distance between the second initial tag sequence and the second modified tag sequence as a second distance; and performing weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
Optionally, said encoding said initial tag and said modified tag in a reference format of a deoxynucleotide sequence comprises: encoding the tones of the initial tag and the tones of the modified tag in the reference format of the deoxynucleotide sequence; wherein, in the reference format of the deoxynucleotide sequence, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone.
Optionally, the screening, according to the distance, the correction label of the image data to be trained includes: normalizing the distance to obtain a normalized distance; if the normalized distance is larger than a preset threshold value, removing the corresponding correction label of the image data to be trained; otherwise, the corresponding correction label of the image data to be trained is reserved.
Optionally, the normalized distance is calculated by the following formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding correction label j.
Optionally, the method further comprises: acquiring image data to be identified; and identifying the image data to be identified through the second image identification model to obtain an image identification result.
In another aspect of the embodiments of the present invention, there is provided an optimization system for an image recognition-oriented deep learning network, including: the data acquisition module is used for acquiring image data to be trained; wherein the image data to be trained carries an initial label; the deep learning network model building module is used for building a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolution layers and four pooling layers, and the backward propagation network comprises two convolution layers, four expansion convolution layers and four pooling layers; the first image recognition model building module is used for inputting the image data to be trained into the deep learning network model, training parameters of the deep learning network model through an improved proportional-integral controller and building a first image recognition model; a modified label determining module, configured to determine a modified label of the image data to be trained according to the first image recognition model; the distance calculation module is used for calculating the distance between the initial label and the correction label by adopting a FASTA algorithm; the correction label screening module is used for screening the correction label of the image data to be trained according to the distance; and the second image recognition model construction module is used for inputting the screened image data to be trained carrying the correction labels into the first image recognition model, and training parameters of the first image recognition model through an improved proportional-integral controller to obtain a second image recognition model.
Optionally, the first image recognition model building module is further configured to: update the model parameters according to the estimated value of the gradient of the current step; judge whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; if so, perform parameter training by adopting a momentum optimization algorithm; if not, perform parameter training by adopting a stochastic gradient descent algorithm, until the model meets a preset training termination condition.
Optionally, whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is judged according to the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
Optionally, if the judgment value is 1, the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; and if the judgment value is -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
Optionally, the distance calculating module is further configured to: encoding the initial label and the corrected label by adopting a reference format of a deoxynucleotide sequence to obtain a first initial label sequence and a first corrected label sequence; calculating a distance between the first initial tag sequence and the first modified tag sequence as a first distance; encoding the initial label and the corrected label by one-hot to obtain a second initial label sequence and a second corrected label sequence; calculating a distance between the second initial tag sequence and the second modified tag sequence as a second distance; and performing weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
Further, said encoding said initial tag and said modified tag in a reference format of a deoxynucleotide sequence comprises: encoding the tones of the initial tag and the tones of the modified tag in the reference format of the deoxynucleotide sequence; wherein, in the reference format of the deoxynucleotide sequence, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone.
Optionally, the revised tag screening module is further configured to: normalizing the distance to obtain a normalized distance; if the normalized distance is larger than a preset threshold value, removing the corresponding correction label of the image data to be trained; otherwise, the corresponding correction label of the image data to be trained is reserved.
Further, the normalized distance is calculated by the following formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding correction label j.
Optionally, the system further includes an image recognition module, configured to acquire image data to be recognized; and identifying the image data to be identified through the second image identification model to obtain an image identification result.
Beneficial effects:
(1) The image data to be trained carrying initial labels is used to train a first image recognition model; correction labels are identified by the first image recognition model and then fed back into it to further train a second image recognition model. In this process, the correction labels are also screened: FASTA encoding and one-hot encoding are introduced to calculate the distance between the initial label and the correction label, and a novel normalized-distance calculation is proposed for screening the correction labels. Through two rounds of training and label screening, the recognition efficiency and accuracy of the image recognition model are improved.
(2) Adopting a convolutional neural network model of a forward propagation network and a backward propagation network to construct a deep learning network model; the forward propagation network comprises four convolutional layers and four pooling layers, and the backward propagation network comprises two convolutional layers, four expansion convolutional layers and four pooling layers. The deep learning network architecture can improve the identification performance of the model without increasing the inference calculation overhead.
(3) Combining the advantages of the momentum optimization algorithm and the stochastic gradient descent algorithm, an improved proportional-integral controller is proposed for training the deep learning network model, which speeds up model convergence and prevents the model from falling into a local optimal solution.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of an optimization method of an image recognition-oriented deep learning network provided in an embodiment of the present application;
FIG. 2 is a flowchart of an image recognition method provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of an optimization system of a deep learning network facing image recognition provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides an optimization method and system for a deep learning network facing image recognition, which specifically comprise the following steps: collecting image data to be trained; wherein the image data to be trained carries an initial label; constructing a deep learning network model; inputting the image data to be trained into the deep learning network model, and training parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model; determining a correction label of the image data to be trained according to the first image recognition model; calculating the distance between the initial label and the correction label by adopting a FASTA algorithm; screening a correction label of the image data to be trained according to the distance; inputting the screened image data to be trained carrying the correction label into the first image recognition model, and training parameters of the first image recognition model through the improved proportional-integral controller to obtain a second image recognition model. Therefore, the recognition efficiency and accuracy of the image recognition model are improved.
The method and the system for optimizing the deep learning network facing the image recognition can be specifically integrated in electronic equipment, and the electronic equipment can be equipment such as a terminal and a server. The terminal can be a light field camera, a vehicle-mounted camera, a mobile phone, a tablet Computer, an intelligent Bluetooth device, a notebook Computer, or a Personal Computer (PC) and other devices; the server may be a single server or a server cluster composed of a plurality of servers.
It can be understood that the method and system for optimizing the deep learning network facing the image recognition in this embodiment may be executed on a terminal, may also be executed on a server, and may also be executed by both the terminal and the server. The above examples should not be construed as limiting the present application.
Artificial Intelligence (AI) is a theory, method, technique and application device that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
In recent years, with research and progress of artificial intelligence technology, artificial intelligence technology is widely applied in a plurality of fields, and the scheme provided by the embodiment of the disclosure relates to technologies such as computer vision technology and machine learning/deep learning of artificial intelligence, and is specifically described by the following embodiments:
referring to fig. 1, fig. 1 is a flowchart illustrating an optimization method for an image recognition-oriented deep learning network according to an embodiment of the present disclosure, where the method specifically includes the following steps:
s110, collecting image data to be trained; wherein the image data to be trained carries an initial label.
S120, constructing a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolutional layers and four pooling layers, and the backward propagation network comprises two convolutional layers, four expansion convolutional layers and four pooling layers.
The deep learning network architecture can improve the identification performance of the model without increasing the inference calculation overhead.
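As a rough, non-authoritative illustration of such an architecture, the following PyTorch sketch stacks the stated numbers of layers; the channel widths, kernel sizes, dilation rates, pooling choices and the classification head are assumptions added for illustration (they are not specified in the text), and "expansion convolution" is interpreted here as dilated convolution.

import torch
import torch.nn as nn

class ForwardNet(nn.Module):
    """Forward propagation network: four convolutional layers and four pooling layers."""
    def __init__(self, in_ch=3, width=32):
        super().__init__()
        layers, ch = [], in_ch
        for _ in range(4):                        # 4 x (conv + pool)
            layers += [nn.Conv2d(ch, width, 3, padding=1), nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
            ch, width = width, width * 2
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

class BackwardNet(nn.Module):
    """Backward propagation network: two convolutional layers,
    four dilated ('expansion') convolutional layers and four pooling layers."""
    def __init__(self, in_ch=256, num_classes=10):
        super().__init__()
        layers = [nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True),
                  nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True)]
        for d in (1, 2, 4, 8):                    # 4 dilated convs, each followed by a pool
            layers += [nn.Conv2d(in_ch, in_ch, 3, padding=d, dilation=d),
                       nn.ReLU(inplace=True), nn.AvgPool2d(2, ceil_mode=True)]
        self.body = nn.Sequential(*layers)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(in_ch, num_classes))

    def forward(self, x):
        return self.head(self.body(x))

model = nn.Sequential(ForwardNet(), BackwardNet())
logits = model(torch.randn(2, 3, 224, 224))       # shape (2, 10)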
S130, inputting the image data to be trained into the deep learning network model, and training parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model.
In an embodiment, the step S130 may specifically include the following steps:
and S131, updating the model parameters according to the estimated value of the current step gradient.
S132, judging whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step.
Specifically, whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is judged through the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
Further, if the judgment value is 1, that is, dic = 1, the direction of the previous step gradient is consistent with the direction of the current step gradient; if the judgment value is -1, that is, dic = -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
S133, if the direction of the gradient of the previous step is consistent with that of the gradient of the current step, performing parameter training by adopting a momentum optimization algorithm; and if the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step, performing parameter training by adopting a stochastic gradient descent algorithm.
Specifically, the iterative formula of the momentum optimization algorithm is as follows:
θ_{t+1} = θ_t − v_{t+1}
where θ_t represents the model parameters of the current step t, θ_{t+1} represents the model parameters of the next step t+1, and v_{t+1} represents the velocity of the next step t+1.
The iterative formula of the stochastic gradient descent algorithm is as follows:
θ_{t+1} = θ_t − r·∇_θ L(θ_t)
where θ_t represents the model parameters of the current step t, θ_{t+1} represents the model parameters of the next step t+1, ∇_θ L(θ_t) represents the gradient of the current step t, r represents the learning rate, i.e., the step size of each gradient update, and L(θ_t) represents the loss function of the current step t.
And S134, until the model meets the preset training termination condition.
A loss function is preset, and training is stopped once the loss data satisfies the preset loss function.
Therefore, the advantages of the momentum optimization algorithm and the stochastic gradient descent algorithm are combined to train the deep neural network model, the convergence speed of the model is increased, and the situation that the model falls into a local optimal solution is avoided.
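A minimal sketch of this switching rule for a single parameter tensor is given below; the velocity update v_{t+1} = μ·v_t + r·∇L(θ_t), the momentum coefficient μ and the learning rate are assumptions, since only the two parameter-update formulas are given above.

import torch

def hybrid_step(theta, v, grad, lr=0.01, mu=0.9):
    """One update of the improved proportional-integral controller described above:
    momentum when the current gradient agrees with the velocity, plain SGD otherwise."""
    # dic = sgn(<grad, v>): +1 means the directions are consistent, -1 means they are not
    dic = torch.sign(torch.dot(grad.flatten(), v.flatten()))
    if dic >= 0:                      # consistent (dic = 0 at the first step is treated as consistent here)
        v = mu * v + lr * grad        # assumed velocity rule v_{t+1} = mu*v_t + r*grad
        theta = theta - v             # theta_{t+1} = theta_t - v_{t+1}
    else:                             # inconsistent: stochastic gradient descent step
        theta = theta - lr * grad     # theta_{t+1} = theta_t - r*grad
    return theta, v

# toy usage: minimize L(theta) = ||theta||^2 / 2, whose gradient is theta itself
theta, v = torch.tensor([1.0, -2.0]), torch.zeros(2)
for _ in range(100):
    grad = theta.clone()
    theta, v = hybrid_step(theta, v, grad)
print(theta)                          # has moved close to the optimum at the origin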
S140, determining a correction label of the image data to be trained according to the first image recognition model.
S150, calculating the distance between the initial label and the correction label by adopting a FASTA algorithm.
In one embodiment, step S150 may specifically include the following steps:
and S151, encoding the initial tag and the corrected tag by adopting a reference format of a deoxynucleotide sequence to obtain a first initial tag sequence and a first corrected tag sequence.
The FASTA algorithm was proposed for genetic sequences and involves four deoxynucleotides; since Chinese characters likewise have four tones, it is proposed to encode the tones of a label in the reference format of a deoxynucleotide sequence, where the reference format refers to the encoding format of the label, namely the format of a deoxynucleotide sequence.
Specifically, the tones of the initial tag and the tones of the modified tag are encoded in the reference format of a deoxynucleotide sequence; in this reference format, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone. For example, if the initial tag is "male star" (男明星, tones 2-2-1), the first initial tag sequence obtained after encoding is {TTA}; the corresponding modified tag is "comedy male star" (喜剧男明星, tones 3-4-2-2-1), and the first modified tag sequence obtained after encoding is {CGTTA}.
S152, calculating the distance between the first initial label sequence and the first correction label sequence as a first distance.
For example, the edit distance between the first initial tag sequence {TTA} and the first modified tag sequence {CGTTA} is calculated.
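As a concrete illustration of steps S151 to S152, here is a small sketch; the tone sequences are taken as given inputs (automatic tone extraction, e.g. from pinyin, is assumed to be available elsewhere), and the first distance is computed as a plain Levenshtein edit distance.

TONE_TO_BASE = {1: "A", 2: "T", 3: "C", 4: "G"}    # A/T/C/G stand for the first..fourth tones

def encode_tones(tones):
    """Map a label's tone sequence (e.g. [2, 2, 1]) to a FASTA-style string ('TTA')."""
    return "".join(TONE_TO_BASE[t] for t in tones)

def edit_distance(a, b):
    """Standard Levenshtein edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

# example from the description: "male star" (tones 2-2-1) vs "comedy male star" (tones 3-4-2-2-1)
initial_seq = encode_tones([2, 2, 1])            # 'TTA'
modified_seq = encode_tones([3, 4, 2, 2, 1])     # 'CGTTA'
first_distance = edit_distance(initial_seq, modified_seq)
print(first_distance)                            # 2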
S153, encoding the initial label and the modified label by one-hot to obtain a second initial label sequence and a second modified label sequence.
And S154, calculating the distance between the second initial label sequence and the second corrected label sequence as a second distance.
The second distance may specifically be an edit distance.
S155, carrying out weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
Specifically, the distance between the initial tag i and its corresponding modified tag j can be calculated by the following formula:
|SIM_{i,j}| = α×SIM_1 + β×SIM_2
where α and β represent weight parameters, SIM_1 represents the first distance, and SIM_2 represents the second distance.
In this embodiment, the FASTA-style tone encoding is combined with the one-hot encoding, so that both the tone information and the character information of the label are taken into account, which improves the accuracy of correction-label screening, as sketched below.
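Continuing the sketch above (and reusing its encode_tones and edit_distance helpers), the combined distance of S153 to S155 might look as follows; the character-level one-hot vocabulary and the weights α = β = 0.5 are assumptions made purely for illustration.

def one_hot_indices(label, vocab):
    """Encode a label as a sequence of indices into a shared character vocabulary
    (each index stands for a one-hot vector over that vocabulary)."""
    return [vocab.index(ch) for ch in label]

def combined_distance(initial, modified, tones_initial, tones_modified, alpha=0.5, beta=0.5):
    # first distance: edit distance between the FASTA-style tone encodings
    sim1 = edit_distance(encode_tones(tones_initial), encode_tones(tones_modified))
    # second distance: edit distance between the one-hot encoded character sequences
    vocab = sorted(set(initial) | set(modified))
    sim2 = edit_distance(one_hot_indices(initial, vocab), one_hot_indices(modified, vocab))
    # |SIM_{i,j}| = alpha * SIM_1 + beta * SIM_2
    return alpha * sim1 + beta * sim2

# the labels from the example above ("male star" vs "comedy male star")
dist = combined_distance("男明星", "喜剧男明星", tones_initial=[2, 2, 1], tones_modified=[3, 4, 2, 2, 1])
print(dist)                                      # 0.5 * 2 + 0.5 * 2 = 2.0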
And S160, screening the correction label of the image data to be trained according to the distance.
In one embodiment, step S160 may specifically include the following steps:
and S161, normalizing the distance to obtain a normalized distance.
Specifically, the normalized distance may be calculated by the following formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding correction label j.
The normalized distance calculation mode can further improve the accuracy of the correction label screening.
S162, if the normalized distance is larger than a preset threshold value, removing a corresponding correction label of the image data to be trained; otherwise, the corresponding correction label of the image data to be trained is reserved.
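Since the normalization formula itself is shown only as an image above, the sketch below substitutes a simple frequency-weighted min-max normalization as a stand-in, purely to illustrate the thresholding rule of S162; the stand-in normalization, the record fields and the threshold value are assumptions, not the patented formula.

from collections import Counter

def screen_corrected_labels(records, eps=1e-6, threshold=0.8):
    """records: list of dicts with keys 'initial', 'corrected' and 'distance' (|SIM_{i,j}|).
    Returns only the records whose corrected label is kept."""
    tf = Counter(r["initial"] for r in records)              # tf(i): occurrences of the initial label i
    # stand-in normalization: frequency-weighted distance, min-max scaled to about [0, 1]
    weighted = [tf[r["initial"]] * r["distance"] for r in records]
    lo, hi = min(weighted), max(weighted)
    kept = []
    for r, w in zip(records, weighted):
        norm = (w - lo) / (hi - lo + eps)                    # normalized distance
        if norm <= threshold:                                # S162: discard when above the threshold
            kept.append(r)
    return kept

# toy usage with assumed field values
records = [{"initial": "male star", "corrected": "comedy male star", "distance": 2.0},
           {"initial": "male star", "corrected": "tall building", "distance": 9.0}]
print([r["corrected"] for r in screen_corrected_labels(records)])   # ['comedy male star']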
S170, inputting the screened image data to be trained carrying the correction labels into a first image recognition model, and training parameters of the first image recognition model through an improved proportional-integral controller to obtain a second image recognition model.
Therefore, through two times of training and label screening of the model, the recognition efficiency and accuracy of the image recognition model are improved.
Fig. 2 shows a flowchart of an image recognition method provided in an embodiment of the present application, please refer to fig. 2, which specifically includes the following steps:
and S210, acquiring image data to be identified.
In this embodiment, the image recognition device is used to obtain the image data to be recognized, and it can be understood that the image recognition device is disposed on the terminal device, and the image data to be recognized may be image data obtained by real-time shooting through a camera of the terminal device, or may be image data stored locally in the terminal device.
The image data to be recognized may be a static image or a dynamic image, and specifically may be face image data, animal image data, pedestrian image data, or vehicle image data.
S220, identifying the image data to be identified through the second image identification model to obtain an image identification result.
Optionally, this step is preceded by preprocessing the image data to be recognized, for example with preprocessing techniques such as graying, geometric transformation, and spatial-domain image enhancement, as in the sketch below.
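A minimal preprocessing sketch along these lines, using OpenCV; the target size and the choice of histogram equalization as the spatial-domain enhancement are illustrative assumptions, not requirements of the method.

import cv2

def preprocess(path, size=(224, 224)):
    """Graying, geometric transformation (resize) and spatial-domain enhancement."""
    img = cv2.imread(path)                         # BGR image read from disk
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # graying
    resized = cv2.resize(gray, size)               # geometric transformation
    enhanced = cv2.equalizeHist(resized)           # spatial-domain image enhancement
    return enhanced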
The second image recognition model may be obtained by the image recognition model training process.
In the embodiment, the accuracy of the image recognition result can be improved by performing image recognition through the second image recognition model.
The embodiment also provides an optimization system of a deep learning network facing image recognition, as shown in fig. 3, the system includes:
a data acquisition module 310, configured to acquire image data to be trained; wherein the image data to be trained carries an initial label.
A deep learning network model construction module 320, configured to construct a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolutional layers and four pooling layers, and the backward propagation network comprises two convolutional layers, four expansion convolutional layers and four pooling layers.
The first image recognition model building module 330 is configured to input the image data to be trained into the deep learning network model, and to build a first image recognition model by training parameters of the deep learning network model through an improved proportional-integral controller.
And a modified label determining module 340, configured to determine a modified label of the image data to be trained according to the first image recognition model.
A distance calculating module 350, configured to calculate a distance between the initial tag and the modified tag by using a FASTA algorithm.
And a revised label screening module 360, configured to screen a revised label of the image data to be trained according to the distance.
And a second image recognition model building module 370, configured to input the screened to-be-trained image data carrying the correction label into the first image recognition model, and train parameters of the first image recognition model through an improved proportional-integral controller, so as to obtain a second image recognition model.
Optionally, the first image recognition model building module 330 is further configured to: update the model parameters according to the estimated value of the gradient of the current step; judge whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; if so, perform parameter training by adopting a momentum optimization algorithm; if not, perform parameter training by adopting a stochastic gradient descent algorithm, until the model meets a preset training termination condition.
Optionally, whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is judged according to the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
Optionally, if the judgment value is 1, the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; and if the judgment value is -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
Optionally, the distance calculating module 350 is further configured to: encoding the initial label and the corrected label by adopting a reference format of a deoxynucleotide sequence to obtain a first initial label sequence and a first corrected label sequence; calculating a distance between the first initial tag sequence and the first modified tag sequence as a first distance; encoding the initial label and the corrected label by one-hot to obtain a second initial label sequence and a second corrected label sequence; calculating a distance between the second initial tag sequence and the second modified tag sequence as a second distance; and performing weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
Further, said encoding said initial tag and said modified tag in a reference format of a deoxynucleotide sequence comprises: encoding the tones of the initial tag and the tones of the modified tag in the reference format of the deoxynucleotide sequence; wherein, in the reference format of the deoxynucleotide sequence, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone.
Optionally, the revised tag screening module 360 is further configured to: normalizing the distance to obtain a normalized distance; if the normalized distance is larger than a preset threshold value, removing the corresponding correction label of the image data to be trained; otherwise, the corresponding correction label of the image data to be trained is reserved.
Further, the normalized distance is calculated by the following formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding correction label j.
Optionally, the system further includes an image recognition module, configured to acquire image data to be recognized, and to recognize the image data to be recognized through the second image recognition model to obtain an image recognition result.
Therefore, the system can improve the identification efficiency and accuracy of the image identification model.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the modules/units/sub-units/components in the above-described apparatus may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another apparatus, or some features may be omitted, or not executed. In addition, the shown or discussed coupling or direct coupling or communication connection between each other may be through some communication interfaces, indirect coupling or communication connection between devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus once an item is defined in one figure, it need not be further defined and explained in subsequent figures, and moreover, the terms "first", "second", "third", etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, used to illustrate the technical solutions of the present application rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can still modify the technical solutions described in the foregoing embodiments, or easily conceive of changes, or make equivalent substitutions for some technical features, within the technical scope disclosed in the present application; such modifications, changes or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present application, and are all intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. An optimization method for a deep learning network facing image recognition is characterized by comprising the following steps:
collecting image data to be trained; wherein the image data to be trained carries an initial label;
constructing a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolution layers and four pooling layers, and the backward propagation network comprises two convolution layers, four expansion convolution layers and four pooling layers;
inputting the image data to be trained into the deep learning network model, and training parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model;
determining a correction label of the image data to be trained according to the first image recognition model;
calculating the distance between the initial label and the correction label by adopting a FASTA algorithm;
screening a correction label of the image data to be trained according to the distance;
inputting the screened image data to be trained carrying the correction label into the first image recognition model, and training parameters of the first image recognition model through an improved proportional-integral controller to obtain a second image recognition model.
2. The method of claim 1, wherein the training of parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model comprises:
updating the model parameters according to the estimated value of the gradient of the current step, judging whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step, and if so, performing parameter training by adopting a momentum optimization algorithm; if not, performing parameter training by adopting a stochastic gradient descent algorithm until the model meets a preset training termination condition.
3. The method of claim 2, wherein whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is determined according to the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
4. The method according to claim 3, wherein if the judgment value is 1, the direction of the previous step gradient is consistent with the direction of the current step gradient; and if the judgment value is -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
5. The method of claim 1, wherein said calculating the distance between the initial tag and the revised tag using the FASTA algorithm comprises:
encoding the initial label and the corrected label by adopting a reference format of a deoxynucleotide sequence to obtain a first initial label sequence and a first corrected label sequence;
calculating a distance between the first initial tag sequence and the first modified tag sequence as a first distance;
encoding the initial label and the corrected label by one-hot to obtain a second initial label sequence and a second corrected label sequence;
calculating a distance between the second initial tag sequence and the second modified tag sequence as a second distance;
and performing weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
6. The method of claim 5, wherein said encoding said initial tag and said modified tag in a deoxynucleotide sequence reference format comprises:
encoding the tones of the initial tag and the tones of the modified tag in the reference format of the deoxynucleotide sequence; wherein, in the reference format of the deoxynucleotide sequence, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone.
7. The method according to claim 6, wherein the screening the revised label of the image data to be trained according to the distance comprises:
normalizing the distance to obtain a normalized distance;
if the normalized distance is larger than a preset threshold value, removing the corresponding correction label of the image data to be trained;
otherwise, the corresponding correction label of the image data to be trained is reserved.
8. The method of claim 7, wherein the normalized distance is calculated by the formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding modified label j.
9. The method of claim 1, further comprising:
acquiring image data to be identified;
and identifying the image data to be identified through the second image identification model to obtain an image identification result.
10. An optimization system of the deep learning network facing image recognition based on the method of any one of claims 1 to 9, characterized in that the system comprises:
the data acquisition module is used for acquiring image data to be trained; wherein the image data to be trained carries an initial label;
the deep learning network model building module is used for building a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolution layers and four pooling layers, and the backward propagation network comprises two convolution layers, four expansion convolution layers and four pooling layers;
the first image recognition model building module is used for inputting the image data to be trained into the deep learning network model, training parameters of the deep learning network model through an improved proportional-integral controller and building a first image recognition model;
a modified label determining module, configured to determine a modified label of the image data to be trained according to the first image recognition model;
the distance calculation module is used for calculating the distance between the initial label and the correction label by adopting a FASTA algorithm;
the correction label screening module is used for screening the correction label of the image data to be trained according to the distance;
and the second image recognition model construction module is used for inputting the screened image data to be trained carrying the correction labels into the first image recognition model, and training parameters of the first image recognition model through an improved proportional-integral controller to obtain a second image recognition model.
CN202210447416.2A 2022-04-26 2022-04-26 Deep learning network optimization method and system for image recognition Active CN114863242B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210447416.2A CN114863242B (en) 2022-04-26 2022-04-26 Deep learning network optimization method and system for image recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210447416.2A CN114863242B (en) 2022-04-26 2022-04-26 Deep learning network optimization method and system for image recognition

Publications (2)

Publication Number Publication Date
CN114863242A true CN114863242A (en) 2022-08-05
CN114863242B CN114863242B (en) 2022-11-29

Family

ID=82634184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210447416.2A Active CN114863242B (en) 2022-04-26 2022-04-26 Deep learning network optimization method and system for image recognition

Country Status (1)

Country Link
CN (1) CN114863242B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110543920A (en) * 2019-09-12 2019-12-06 北京达佳互联信息技术有限公司 Performance detection method and device of image recognition model, server and storage medium
CN111353549A (en) * 2020-03-10 2020-06-30 创新奇智(重庆)科技有限公司 Image tag verification method and device, electronic device and storage medium
WO2020243469A1 (en) * 2019-05-31 2020-12-03 Kiromic BioPharma, Inc. Methods for identifying and using disease-associated antigens
AU2020103613A4 (en) * 2020-11-23 2021-02-04 Agricultural Information and Rural Economic Research Institute of Sichuan Academy of Agricultural Sciences Cnn and transfer learning based disease intelligent identification method and system
CN112529210A (en) * 2020-12-09 2021-03-19 广州云从鼎望科技有限公司 Model training method, device and computer readable storage medium
CN112598020A (en) * 2020-11-24 2021-04-02 深兰人工智能(深圳)有限公司 Target identification method and system
US20210118559A1 (en) * 2019-10-22 2021-04-22 Tempus Labs, Inc. Artificial intelligence assisted precision medicine enhancements to standardized laboratory diagnostic testing
CN113469236A (en) * 2021-06-25 2021-10-01 江苏大学 Deep clustering image recognition system and method for self-label learning
CN113658643A (en) * 2021-07-22 2021-11-16 西安理工大学 Prediction method for lncRNA and mRNA based on attention mechanism
CN113903395A (en) * 2021-10-28 2022-01-07 聊城大学 BP neural network copy number variation detection method and system for improving particle swarm optimization
CN113963258A (en) * 2021-09-28 2022-01-21 上海东普信息科技有限公司 Worker card wearing identification method, device, equipment and storage medium
CN113971183A (en) * 2020-07-22 2022-01-25 阿里巴巴集团控股有限公司 Method and device for training entity marking model and electronic equipment
CN114119403A (en) * 2021-11-23 2022-03-01 北京拙河科技有限公司 Image defogging method and system based on red channel guidance

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020243469A1 (en) * 2019-05-31 2020-12-03 Kiromic BioPharma, Inc. Methods for identifying and using disease-associated antigens
CN110543920A (en) * 2019-09-12 2019-12-06 北京达佳互联信息技术有限公司 Performance detection method and device of image recognition model, server and storage medium
US20210118559A1 (en) * 2019-10-22 2021-04-22 Tempus Labs, Inc. Artificial intelligence assisted precision medicine enhancements to standardized laboratory diagnostic testing
CN111353549A (en) * 2020-03-10 2020-06-30 创新奇智(重庆)科技有限公司 Image tag verification method and device, electronic device and storage medium
CN113971183A (en) * 2020-07-22 2022-01-25 阿里巴巴集团控股有限公司 Method and device for training entity marking model and electronic equipment
AU2020103613A4 (en) * 2020-11-23 2021-02-04 Agricultural Information and Rural Economic Research Institute of Sichuan Academy of Agricultural Sciences Cnn and transfer learning based disease intelligent identification method and system
CN112598020A (en) * 2020-11-24 2021-04-02 深兰人工智能(深圳)有限公司 Target identification method and system
CN112529210A (en) * 2020-12-09 2021-03-19 广州云从鼎望科技有限公司 Model training method, device and computer readable storage medium
CN113469236A (en) * 2021-06-25 2021-10-01 江苏大学 Deep clustering image recognition system and method for self-label learning
CN113658643A (en) * 2021-07-22 2021-11-16 西安理工大学 Prediction method for lncRNA and mRNA based on attention mechanism
CN113963258A (en) * 2021-09-28 2022-01-21 上海东普信息科技有限公司 Worker card wearing identification method, device, equipment and storage medium
CN113903395A (en) * 2021-10-28 2022-01-07 聊城大学 BP neural network copy number variation detection method and system for improving particle swarm optimization
CN114119403A (en) * 2021-11-23 2022-03-01 北京拙河科技有限公司 Image defogging method and system based on red channel guidance

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QUAN ZOU ET AL.: "Gene2vec: gene subsequence embedding for prediction of mammalian N6-methyladenosine sites from mRNA", RNA *
FU HUA ET AL.: "Transmission line fault identification method based on VMD-PE combined with SNN", Journal of Electronic Measurement and Instrumentation *

Also Published As

Publication number Publication date
CN114863242B (en) 2022-11-29

Similar Documents

Publication Publication Date Title
CN110223292B (en) Image evaluation method, device and computer readable storage medium
CN110188829B (en) Neural network training method, target recognition method and related products
CN112784929B (en) Small sample image classification method and device based on double-element group expansion
CN112418292B (en) Image quality evaluation method, device, computer equipment and storage medium
CN110288513B (en) Method, apparatus, device and storage medium for changing face attribute
CN111126470B (en) Image data iterative cluster analysis method based on depth measurement learning
CN113902913A (en) Image semantic segmentation method and device
CN111709398A (en) Image recognition method, and training method and device of image recognition model
CN114282059A (en) Video retrieval method, device, equipment and storage medium
CN115690797A (en) Character recognition method, device, equipment and storage medium
CN117726884B (en) Training method of object class identification model, object class identification method and device
CN106446844B (en) Posture estimation method and device and computer system
CN114863242B (en) Deep learning network optimization method and system for image recognition
CN112329666A (en) Face recognition method and device, electronic equipment and storage medium
CN113572981A (en) Video dubbing method and device, electronic equipment and storage medium
CN111738059A (en) Non-sensory scene-oriented face recognition method
CN114170484B (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN116051924A (en) Divide-and-conquer defense method for image countermeasure sample
CN115423031A (en) Model training method and related device
CN113076963B (en) Image recognition method and device and computer readable storage medium
CN112084371B (en) Movie multi-label classification method and device, electronic equipment and storage medium
CN114882582A (en) Gait recognition model training method and system based on federal learning mode
CN111340329B (en) Actor evaluation method and device and electronic equipment
CN107609645B (en) Method and apparatus for training convolutional neural network
CN114724009B (en) Image identification method and device based on improved deep learning network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant