CN114863242A - Deep learning network optimization method and system for image recognition - Google Patents

Deep learning network optimization method and system for image recognition Download PDF

Info

Publication number
CN114863242A
CN114863242A
Authority
CN
China
Prior art keywords
label
distance
image data
image recognition
trained
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210447416.2A
Other languages
Chinese (zh)
Other versions
CN114863242B (en)
Inventor
袁潮
赵月峰
Other inventors have requested that their names not be disclosed
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zhuohe Technology Co Ltd
Original Assignee
Beijing Zhuohe Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zhuohe Technology Co Ltd filed Critical Beijing Zhuohe Technology Co Ltd
Priority to CN202210447416.2A priority Critical patent/CN114863242B/en
Publication of CN114863242A publication Critical patent/CN114863242A/en
Application granted granted Critical
Publication of CN114863242B publication Critical patent/CN114863242B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02TCLIMATE CHANGE MITIGATION TECHNOLOGIES RELATED TO TRANSPORTATION
    • Y02T10/00Road transport of goods or passengers
    • Y02T10/10Internal combustion engine [ICE] based vehicles
    • Y02T10/40Engine management systems

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Data Mining & Analysis (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

A deep learning network optimization method and system for image recognition, applied in particular to the field of image recognition. Image data to be trained is collected, wherein the image data to be trained carries an initial label; a deep learning network model is constructed; the image data to be trained is input into the deep learning network model, and parameters of the deep learning network model are trained through an improved proportional-integral controller to construct a first image recognition model; a correction label of the image data to be trained is determined according to the first image recognition model; the distance between the initial label and the correction label is calculated by adopting a FASTA algorithm; the correction label of the image data to be trained is screened according to the distance; and the screened image data to be trained carrying the correction label is input into the first image recognition model, and parameters of the first image recognition model are trained through the improved proportional-integral controller to obtain a second image recognition model. Therefore, the recognition efficiency and accuracy of the image recognition model are improved.

Description

Deep learning network optimization method and system for image recognition
Technical Field
The present application relates to the field of image recognition, and more particularly, to an optimization method and system for a deep learning network for image recognition.
Background
Image recognition refers to the technology of using a computer to process, analyze and understand images in order to recognize various targets and objects; with the development of computer technology, deep learning algorithms have been put into practical application for this purpose. For example, image recognition includes pedestrian and face recognition, vehicle recognition, and the like.
With the development of deep learning technology, image recognition based on deep network models achieves satisfactory accuracy in controlled environments. However, as the quality and quantity of the acquired images decrease, the accuracy of such image recognition methods drops significantly. Furthermore, these methods lack the ability to discriminate features in the inference stage and cannot meet users' actual requirements.
Therefore, how to train a high-accuracy image recognition model from a small amount of low-quality image data when using a deep learning network model is an urgent problem to be solved.
Disclosure of Invention
The embodiments of the invention aim to provide a deep learning network optimization method and system for image recognition, in which the labels of the image data to be trained are corrected by a deep learning network model comprising a forward propagation network with four convolutional layers and four pooling layers and a backward propagation network with two convolutional layers, four expansion convolutional layers and four pooling layers, while a FASTA algorithm is introduced to calculate the distance between labels; this improves the label correction accuracy and further optimizes the recognition efficiency and accuracy of the image recognition model. The specific technical scheme is as follows:
In a first aspect of the embodiments of the present invention, an optimization method for an image recognition-oriented deep learning network is provided, including: collecting image data to be trained; wherein the image data to be trained carries an initial label; constructing a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolution layers and four pooling layers, and the backward propagation network comprises two convolution layers, four expansion convolution layers and four pooling layers; inputting the image data to be trained into the deep learning network model, and training parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model; determining a correction label of the image data to be trained according to the first image recognition model; calculating the distance between the initial label and the correction label by adopting a FASTA algorithm; screening a correction label of the image data to be trained according to the distance; inputting the screened image data to be trained carrying the correction label into the first image recognition model, and training parameters of the first image recognition model through the improved proportional-integral controller to obtain a second image recognition model.
Optionally, the training of parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model includes: updating the model parameters according to the estimated value of the gradient of the current step; judging whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; if so, performing parameter training by adopting a momentum optimization algorithm; if not, performing parameter training by adopting a stochastic gradient descent algorithm, until the model meets a preset training termination condition.
Optionally, whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is judged according to the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
Optionally, if the judgment value is 1, the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; and if the judgment value is -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
Optionally, the calculating, by using a FASTA algorithm, a distance between the initial tag and the revised tag includes: encoding the initial label and the corrected label by adopting a reference format of a deoxynucleotide sequence to obtain a first initial label sequence and a first corrected label sequence; calculating a distance between the first initial tag sequence and the first modified tag sequence as a first distance; encoding the initial label and the corrected label by one-hot to obtain a second initial label sequence and a second corrected label sequence; calculating a distance between the second initial tag sequence and the second modified tag sequence as a second distance; and performing weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
Optionally, said encoding said initial tag and said modified tag in a reference format of a deoxynucleotide sequence comprises: encoding the tones of the initial tag and the tones of the modified tag in the reference format of the deoxynucleotide sequence; wherein, in the reference format of the deoxynucleotide sequence, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone.
Optionally, the screening, according to the distance, the correction label of the image data to be trained includes: normalizing the distance to obtain a normalized distance; if the normalized distance is larger than a preset threshold value, removing the corresponding correction label of the image data to be trained; otherwise, the corresponding correction label of the image data to be trained is reserved.
Optionally, the normalized distance is calculated by the following formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding correction label j.
Optionally, the method further comprises: acquiring image data to be identified; and identifying the image data to be identified through the second image identification model to obtain an image identification result.
In another aspect of the embodiments of the present invention, there is provided an optimization system for an image recognition-oriented deep learning network, including: the data acquisition module is used for acquiring image data to be trained; wherein the image data to be trained carries an initial label; the deep learning network model building module is used for building a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolution layers and four pooling layers, and the backward propagation network comprises two convolution layers, four expansion convolution layers and four pooling layers; the first image recognition model building module is used for inputting the image data to be trained into the deep learning network model, training parameters of the deep learning network model through an improved proportional-integral controller and building a first image recognition model; a modified label determining module, configured to determine a modified label of the image data to be trained according to the first image recognition model; the distance calculation module is used for calculating the distance between the initial label and the correction label by adopting a FASTA algorithm; the correction label screening module is used for screening the correction label of the image data to be trained according to the distance; and the second image recognition model construction module is used for inputting the screened image data to be trained carrying the correction labels into the first image recognition model, and training parameters of the first image recognition model through an improved proportional-integral controller to obtain a second image recognition model.
Optionally, the first image recognition model building module is further configured to: update the model parameters according to the estimated value of the gradient of the current step; judge whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; if so, perform parameter training by adopting a momentum optimization algorithm; if not, perform parameter training by adopting a stochastic gradient descent algorithm, until the model meets a preset training termination condition.
Optionally, whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is judged according to the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
Optionally, if the judgment value is 1, the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; and if the judgment value is -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
Optionally, the distance calculating module is further configured to: encoding the initial label and the corrected label by adopting a reference format of a deoxynucleotide sequence to obtain a first initial label sequence and a first corrected label sequence; calculating a distance between the first initial tag sequence and the first modified tag sequence as a first distance; encoding the initial label and the corrected label by one-hot to obtain a second initial label sequence and a second corrected label sequence; calculating a distance between the second initial tag sequence and the second modified tag sequence as a second distance; and performing weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
Further, said encoding said initial tag and said modified tag in a reference format of a deoxynucleotide sequence comprises: encoding the tones of the initial tag and the tones of the modified tag in the reference format of the deoxynucleotide sequence; wherein, in the reference format of the deoxynucleotide sequence, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone.
Optionally, the revised tag screening module is further configured to: normalizing the distance to obtain a normalized distance; if the normalized distance is larger than a preset threshold value, removing the corresponding correction label of the image data to be trained; otherwise, the corresponding correction label of the image data to be trained is reserved.
Further, the normalized distance is calculated by the following formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding correction label j.
Optionally, the system further includes an image recognition module, configured to acquire image data to be recognized; and identifying the image data to be identified through the second image identification model to obtain an image identification result.
Beneficial effects:
(1) The image data to be trained carrying initial labels is used to train a first image recognition model; correction labels are identified by the first image recognition model and then fed back into it to further train a second image recognition model. In this process, the correction labels are also screened: FASTA encoding and one-hot encoding are introduced to calculate the distance between the initial label and the correction label, and a novel normalized-distance calculation is proposed for screening the correction labels. Through two rounds of training and label screening, the recognition efficiency and accuracy of the image recognition model are improved.
(2) Adopting a convolutional neural network model of a forward propagation network and a backward propagation network to construct a deep learning network model; the forward propagation network comprises four convolutional layers and four pooling layers, and the backward propagation network comprises two convolutional layers, four expansion convolutional layers and four pooling layers. The deep learning network architecture can improve the identification performance of the model without increasing the inference calculation overhead.
(3) Combining the advantages of the momentum optimization algorithm and the stochastic gradient descent algorithm, an improved proportional-integral controller is proposed for training the deep learning network model, which speeds up model convergence and prevents the model from falling into a local optimal solution.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
Fig. 1 is a flowchart of an optimization method of an image recognition-oriented deep learning network provided in an embodiment of the present application;
FIG. 2 is a flowchart of an image recognition method provided in an embodiment of the present application;
fig. 3 is a schematic structural diagram of an optimization system of a deep learning network facing image recognition provided in an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present application clearer, the technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are some embodiments of the present application, but not all embodiments. The components of the embodiments of the present application, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations.
Thus, the following detailed description of the embodiments of the present application, presented in the accompanying drawings, is not intended to limit the scope of the claimed application, but is merely representative of selected embodiments of the application. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
The embodiment of the application provides an optimization method and system for a deep learning network facing image recognition, which specifically comprise the following steps: collecting image data to be trained; wherein the image data to be trained carries an initial label; constructing a deep learning network model; inputting the image data to be trained into the deep learning network model, and training parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model; determining a correction label of the image data to be trained according to the first image recognition model; calculating the distance between the initial label and the correction label by adopting a FASTA algorithm; screening a correction label of the image data to be trained according to the distance; inputting the screened image data to be trained carrying the correction label into the first image recognition model, and training parameters of the first image recognition model through the improved proportional-integral controller to obtain a second image recognition model. Therefore, the recognition efficiency and accuracy of the image recognition model are improved.
The method and the system for optimizing the deep learning network facing the image recognition can be specifically integrated in electronic equipment, and the electronic equipment can be equipment such as a terminal and a server. The terminal can be a light field camera, a vehicle-mounted camera, a mobile phone, a tablet Computer, an intelligent Bluetooth device, a notebook Computer, or a Personal Computer (PC) and other devices; the server may be a single server or a server cluster composed of a plurality of servers.
It can be understood that the method and system for optimizing the deep learning network facing the image recognition in this embodiment may be executed on a terminal, may also be executed on a server, and may also be executed by both the terminal and the server. The above examples should not be construed as limiting the present application.
Artificial Intelligence (AI) is a theory, method, technique and application device that uses a digital computer or a machine controlled by a digital computer to simulate, extend and expand human Intelligence, perceive the environment, acquire knowledge and use the knowledge to obtain the best results. The artificial intelligence software technology mainly comprises a computer vision technology, a voice processing technology, a natural language processing technology, machine learning/deep learning and the like.
In recent years, with research and progress of artificial intelligence technology, artificial intelligence technology is widely applied in a plurality of fields, and the scheme provided by the embodiment of the disclosure relates to technologies such as computer vision technology and machine learning/deep learning of artificial intelligence, and is specifically described by the following embodiments:
referring to fig. 1, fig. 1 is a flowchart illustrating an optimization method for an image recognition-oriented deep learning network according to an embodiment of the present disclosure, where the method specifically includes the following steps:
s110, collecting image data to be trained; wherein the image data to be trained carries an initial label.
S120, constructing a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolutional layers and four pooling layers, and the backward propagation network comprises two convolutional layers, four expansion convolutional layers and four pooling layers.
The deep learning network architecture can improve the identification performance of the model without increasing the inference calculation overhead.
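As a rough, non-authoritative illustration of such an architecture, the following PyTorch sketch stacks the stated numbers of layers; the channel widths, kernel sizes, dilation rates, pooling choices and the classification head are assumptions added for illustration (they are not specified in the text), and "expansion convolution" is interpreted here as dilated convolution.

import torch
import torch.nn as nn

class ForwardNet(nn.Module):
    """Forward propagation network: four convolutional layers and four pooling layers."""
    def __init__(self, in_ch=3, width=32):
        super().__init__()
        layers, ch = [], in_ch
        for _ in range(4):                        # 4 x (conv + pool)
            layers += [nn.Conv2d(ch, width, 3, padding=1), nn.ReLU(inplace=True),
                       nn.MaxPool2d(2)]
            ch, width = width, width * 2
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

class BackwardNet(nn.Module):
    """Backward propagation network: two convolutional layers,
    four dilated ('expansion') convolutional layers and four pooling layers."""
    def __init__(self, in_ch=256, num_classes=10):
        super().__init__()
        layers = [nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True),
                  nn.Conv2d(in_ch, in_ch, 3, padding=1), nn.ReLU(inplace=True)]
        for d in (1, 2, 4, 8):                    # 4 dilated convs, each followed by a pool
            layers += [nn.Conv2d(in_ch, in_ch, 3, padding=d, dilation=d),
                       nn.ReLU(inplace=True), nn.AvgPool2d(2, ceil_mode=True)]
        self.body = nn.Sequential(*layers)
        self.head = nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                  nn.Linear(in_ch, num_classes))

    def forward(self, x):
        return self.head(self.body(x))

model = nn.Sequential(ForwardNet(), BackwardNet())
logits = model(torch.randn(2, 3, 224, 224))       # shape (2, 10)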
S130, inputting the image data to be trained into the deep learning network model, and training parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model.
In an embodiment, the step S130 may specifically include the following steps:
and S131, updating the model parameters according to the estimated value of the current step gradient.
S132, judging whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step.
Specifically, whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is judged through the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
Further, if the judgment value is 1, that is, dic = 1, the direction of the previous step gradient is consistent with the direction of the current step gradient; if the judgment value is -1, that is, dic = -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
S133, if the direction of the gradient of the previous step is consistent with that of the gradient of the current step, performing parameter training by adopting a momentum optimization algorithm; and if the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step, performing parameter training by adopting a stochastic gradient descent algorithm.
Specifically, the iterative formula of the momentum optimization algorithm is as follows:
θ_{t+1} = θ_t − v_{t+1}
where θ_t represents the model parameters of the current step t, θ_{t+1} represents the model parameters of the next step t+1, and v_{t+1} represents the velocity of the next step t+1.
The iterative formula of the stochastic gradient descent algorithm is as follows:
θ_{t+1} = θ_t − r·∇_θ L(θ_t)
where θ_t represents the model parameters of the current step t, θ_{t+1} represents the model parameters of the next step t+1, ∇_θ L(θ_t) represents the gradient of the current step t, r represents the learning rate, i.e., the step size of each gradient update, and L(θ_t) represents the loss function of the current step t.
And S134, until the model meets the preset training termination condition.
A loss function is preset, and training is stopped once the loss data satisfies the preset loss function.
Therefore, the advantages of the momentum optimization algorithm and the stochastic gradient descent algorithm are combined to train the deep neural network model, the convergence speed of the model is increased, and the situation that the model falls into a local optimal solution is avoided.
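A minimal sketch of this switching rule for a single parameter tensor is given below; the velocity update v_{t+1} = μ·v_t + r·∇L(θ_t), the momentum coefficient μ and the learning rate are assumptions, since only the two parameter-update formulas are given above.

import torch

def hybrid_step(theta, v, grad, lr=0.01, mu=0.9):
    """One update of the improved proportional-integral controller described above:
    momentum when the current gradient agrees with the velocity, plain SGD otherwise."""
    # dic = sgn(<grad, v>): +1 means the directions are consistent, -1 means they are not
    dic = torch.sign(torch.dot(grad.flatten(), v.flatten()))
    if dic >= 0:                      # consistent (dic = 0 at the first step is treated as consistent here)
        v = mu * v + lr * grad        # assumed velocity rule v_{t+1} = mu*v_t + r*grad
        theta = theta - v             # theta_{t+1} = theta_t - v_{t+1}
    else:                             # inconsistent: stochastic gradient descent step
        theta = theta - lr * grad     # theta_{t+1} = theta_t - r*grad
    return theta, v

# toy usage: minimize L(theta) = ||theta||^2 / 2, whose gradient is theta itself
theta, v = torch.tensor([1.0, -2.0]), torch.zeros(2)
for _ in range(100):
    grad = theta.clone()
    theta, v = hybrid_step(theta, v, grad)
print(theta)                          # has moved close to the optimum at the origin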
S140, determining a correction label of the image data to be trained according to the first image recognition model.
S150, calculating the distance between the initial label and the correction label by adopting a FASTA algorithm.
In one embodiment, step S150 may specifically include the following steps:
and S151, encoding the initial tag and the corrected tag by adopting a reference format of a deoxynucleotide sequence to obtain a first initial tag sequence and a first corrected tag sequence.
The FASTA algorithm was proposed for genetic sequences and involves four deoxynucleotides; since Chinese characters likewise have four tones, it is proposed to encode the tones of a label in the reference format of a deoxynucleotide sequence, where the reference format refers to the encoding format of the label, namely the format of a deoxynucleotide sequence.
Specifically, the tones of the initial tag and the tones of the modified tag are encoded in the reference format of a deoxynucleotide sequence; in this reference format, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone. For example, if the initial tag is "male star" (男明星, tones 2-2-1), the first initial tag sequence obtained after encoding is {TTA}; the corresponding modified tag is "comedy male star" (喜剧男明星, tones 3-4-2-2-1), and the first modified tag sequence obtained after encoding is {CGTTA}.
S152, calculating the distance between the first initial label sequence and the first correction label sequence as a first distance.
For example, the edit distance between the first initial tag sequence {TTA} and the first modified tag sequence {CGTTA} is calculated.
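As a concrete illustration of steps S151 to S152, here is a small sketch; the tone sequences are taken as given inputs (automatic tone extraction, e.g. from pinyin, is assumed to be available elsewhere), and the first distance is computed as a plain Levenshtein edit distance.

TONE_TO_BASE = {1: "A", 2: "T", 3: "C", 4: "G"}    # A/T/C/G stand for the first..fourth tones

def encode_tones(tones):
    """Map a label's tone sequence (e.g. [2, 2, 1]) to a FASTA-style string ('TTA')."""
    return "".join(TONE_TO_BASE[t] for t in tones)

def edit_distance(a, b):
    """Standard Levenshtein edit distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

# example from the description: "male star" (tones 2-2-1) vs "comedy male star" (tones 3-4-2-2-1)
initial_seq = encode_tones([2, 2, 1])            # 'TTA'
modified_seq = encode_tones([3, 4, 2, 2, 1])     # 'CGTTA'
first_distance = edit_distance(initial_seq, modified_seq)
print(first_distance)                            # 2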
S153, encoding the initial label and the modified label by one-hot to obtain a second initial label sequence and a second modified label sequence.
And S154, calculating the distance between the second initial label sequence and the second corrected label sequence as a second distance.
The second distance may specifically be an edit distance.
S155, carrying out weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
Specifically, the distance between the initial tag i and its corresponding modified tag j can be calculated by the following formula:
|SIM_{i,j}| = α×SIM_1 + β×SIM_2
where α and β represent weight parameters, SIM_1 represents the first distance, and SIM_2 represents the second distance.
In this embodiment, the FASTA-style tone encoding is combined with the one-hot encoding, so that both the tone information and the character information of the label are taken into account, which improves the accuracy of correction-label screening, as sketched below.
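Continuing the sketch above (and reusing its encode_tones and edit_distance helpers), the combined distance of S153 to S155 might look as follows; the character-level one-hot vocabulary and the weights α = β = 0.5 are assumptions made purely for illustration.

def one_hot_indices(label, vocab):
    """Encode a label as a sequence of indices into a shared character vocabulary
    (each index stands for a one-hot vector over that vocabulary)."""
    return [vocab.index(ch) for ch in label]

def combined_distance(initial, modified, tones_initial, tones_modified, alpha=0.5, beta=0.5):
    # first distance: edit distance between the FASTA-style tone encodings
    sim1 = edit_distance(encode_tones(tones_initial), encode_tones(tones_modified))
    # second distance: edit distance between the one-hot encoded character sequences
    vocab = sorted(set(initial) | set(modified))
    sim2 = edit_distance(one_hot_indices(initial, vocab), one_hot_indices(modified, vocab))
    # |SIM_{i,j}| = alpha * SIM_1 + beta * SIM_2
    return alpha * sim1 + beta * sim2

# the labels from the example above ("male star" vs "comedy male star")
dist = combined_distance("男明星", "喜剧男明星", tones_initial=[2, 2, 1], tones_modified=[3, 4, 2, 2, 1])
print(dist)                                      # 0.5 * 2 + 0.5 * 2 = 2.0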
And S160, screening the correction label of the image data to be trained according to the distance.
In one embodiment, step S160 may specifically include the following steps:
and S161, normalizing the distance to obtain a normalized distance.
Specifically, the normalized distance may be calculated by the following formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding correction label j.
The normalized distance calculation mode can further improve the accuracy of the correction label screening.
S162, if the normalized distance is larger than a preset threshold value, removing a corresponding correction label of the image data to be trained; otherwise, the corresponding correction label of the image data to be trained is reserved.
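Since the normalization formula itself is shown only as an image above, the sketch below substitutes a simple frequency-weighted min-max normalization as a stand-in, purely to illustrate the thresholding rule of S162; the stand-in normalization, the record fields and the threshold value are assumptions, not the patented formula.

from collections import Counter

def screen_corrected_labels(records, eps=1e-6, threshold=0.8):
    """records: list of dicts with keys 'initial', 'corrected' and 'distance' (|SIM_{i,j}|).
    Returns only the records whose corrected label is kept."""
    tf = Counter(r["initial"] for r in records)              # tf(i): occurrences of the initial label i
    # stand-in normalization: frequency-weighted distance, min-max scaled to about [0, 1]
    weighted = [tf[r["initial"]] * r["distance"] for r in records]
    lo, hi = min(weighted), max(weighted)
    kept = []
    for r, w in zip(records, weighted):
        norm = (w - lo) / (hi - lo + eps)                    # normalized distance
        if norm <= threshold:                                # S162: discard when above the threshold
            kept.append(r)
    return kept

# toy usage with assumed field values
records = [{"initial": "male star", "corrected": "comedy male star", "distance": 2.0},
           {"initial": "male star", "corrected": "tall building", "distance": 9.0}]
print([r["corrected"] for r in screen_corrected_labels(records)])   # ['comedy male star']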
S170, inputting the screened image data to be trained carrying the correction labels into a first image recognition model, and training parameters of the first image recognition model through an improved proportional-integral controller to obtain a second image recognition model.
Therefore, through two times of training and label screening of the model, the recognition efficiency and accuracy of the image recognition model are improved.
Fig. 2 shows a flowchart of an image recognition method provided in an embodiment of the present application, please refer to fig. 2, which specifically includes the following steps:
and S210, acquiring image data to be identified.
In this embodiment, the image recognition device is used to obtain the image data to be recognized, and it can be understood that the image recognition device is disposed on the terminal device, and the image data to be recognized may be image data obtained by real-time shooting through a camera of the terminal device, or may be image data stored locally in the terminal device.
The image data to be recognized may be a static image or a dynamic image, and specifically may be face image data, animal image data, pedestrian image data, or vehicle image data.
S220, identifying the image data to be identified through the second image identification model to obtain an image identification result.
Optionally, this step is preceded by preprocessing the image data to be recognized, for example with preprocessing techniques such as graying, geometric transformation, and spatial-domain image enhancement, as in the sketch below.
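A minimal preprocessing sketch along these lines, using OpenCV; the target size and the choice of histogram equalization as the spatial-domain enhancement are illustrative assumptions, not requirements of the method.

import cv2

def preprocess(path, size=(224, 224)):
    """Graying, geometric transformation (resize) and spatial-domain enhancement."""
    img = cv2.imread(path)                         # BGR image read from disk
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)   # graying
    resized = cv2.resize(gray, size)               # geometric transformation
    enhanced = cv2.equalizeHist(resized)           # spatial-domain image enhancement
    return enhanced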
The second image recognition model may be obtained by the image recognition model training process.
In the embodiment, the accuracy of the image recognition result can be improved by performing image recognition through the second image recognition model.
The embodiment also provides an optimization system of a deep learning network facing image recognition, as shown in fig. 3, the system includes:
a data acquisition module 310, configured to acquire image data to be trained; wherein the image data to be trained carries an initial label.
A deep learning network model construction module 320, configured to construct a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolutional layers and four pooling layers, and the backward propagation network comprises two convolutional layers, four expansion convolutional layers and four pooling layers.
The first image recognition model building module 330 is configured to input the image data to be trained into the deep learning network model, and to build a first image recognition model by training parameters of the deep learning network model through an improved proportional-integral controller.
And a modified label determining module 340, configured to determine a modified label of the image data to be trained according to the first image recognition model.
A distance calculating module 350, configured to calculate a distance between the initial tag and the modified tag by using a FASTA algorithm.
And a revised label screening module 360, configured to screen a revised label of the image data to be trained according to the distance.
And a second image recognition model building module 370, configured to input the screened to-be-trained image data carrying the correction label into the first image recognition model, and train parameters of the first image recognition model through an improved proportional-integral controller, so as to obtain a second image recognition model.
Optionally, the first image recognition model building module 330 is further configured to: update the model parameters according to the estimated value of the gradient of the current step; judge whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; if so, perform parameter training by adopting a momentum optimization algorithm; if not, perform parameter training by adopting a stochastic gradient descent algorithm, until the model meets a preset training termination condition.
Optionally, whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is judged according to the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
Optionally, if the judgment value is 1, the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step; and if the judgment value is -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
Optionally, the distance calculating module 350 is further configured to: encoding the initial label and the corrected label by adopting a reference format of a deoxynucleotide sequence to obtain a first initial label sequence and a first corrected label sequence; calculating a distance between the first initial tag sequence and the first modified tag sequence as a first distance; encoding the initial label and the corrected label by one-hot to obtain a second initial label sequence and a second corrected label sequence; calculating a distance between the second initial tag sequence and the second modified tag sequence as a second distance; and performing weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
Further, said encoding said initial tag and said modified tag in a reference format of a deoxynucleotide sequence comprises: encoding the tones of the initial tag and the tones of the modified tag in the reference format of the deoxynucleotide sequence; wherein, in the reference format of the deoxynucleotide sequence, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone.
Optionally, the revised tag screening module 360 is further configured to: normalizing the distance to obtain a normalized distance; if the normalized distance is larger than a preset threshold value, removing the corresponding correction label of the image data to be trained; otherwise, the corresponding correction label of the image data to be trained is reserved.
Further, the normalized distance is calculated by the following formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding correction label j.
Optionally, the system further includes an image recognition module, configured to acquire image data to be recognized, and to recognize the image data to be recognized through the second image recognition model to obtain an image recognition result.
Therefore, the system can improve the identification efficiency and accuracy of the image identification model.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the modules/units/sub-units/components in the above-described apparatus may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another apparatus, or some features may be omitted, or not executed. In addition, the shown or discussed coupling or direct coupling or communication connection between each other may be through some communication interfaces, indirect coupling or communication connection between devices or units, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments provided in the present application may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present application or portions thereof that substantially contribute to the prior art may be embodied in the form of a software product stored in a storage medium and including instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present application. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
It should be noted that: like reference numbers and letters refer to like items in the following figures, and thus once an item is defined in one figure, it need not be further defined and explained in subsequent figures, and moreover, the terms "first", "second", "third", etc. are used merely to distinguish one description from another and are not to be construed as indicating or implying relative importance.
Finally, it should be noted that: the above-mentioned embodiments are only specific embodiments of the present application, used to illustrate the technical solutions of the present application rather than to limit them, and the protection scope of the present application is not limited thereto. Although the present application is described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person skilled in the art can still modify the technical solutions described in the foregoing embodiments, or easily conceive of changes, or make equivalent substitutions for some technical features, within the technical scope disclosed in the present application; such modifications, changes or substitutions do not cause the corresponding technical solutions to depart from the spirit and scope of the embodiments of the present application, and are all intended to be covered by the protection scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.

Claims (10)

1. An optimization method for a deep learning network facing image recognition is characterized by comprising the following steps:
collecting image data to be trained; wherein the image data to be trained carries an initial label;
constructing a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolution layers and four pooling layers, and the backward propagation network comprises two convolution layers, four expansion convolution layers and four pooling layers;
inputting the image data to be trained into the deep learning network model, and training parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model;
determining a correction label of the image data to be trained according to the first image recognition model;
calculating the distance between the initial label and the correction label by adopting a FASTA algorithm;
screening a correction label of the image data to be trained according to the distance;
inputting the screened image data to be trained carrying the correction label into the first image recognition model, and training parameters of the first image recognition model through an improved proportional-integral controller to obtain a second image recognition model.
2. The method of claim 1, wherein the training of parameters of the deep learning network model through an improved proportional-integral controller to construct a first image recognition model comprises:
updating the model parameters according to the estimated value of the gradient of the current step, judging whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step, and if so, performing parameter training by adopting a momentum optimization algorithm; if not, performing parameter training by adopting a stochastic gradient descent algorithm until the model meets a preset training termination condition.
3. The method of claim 2, wherein whether the direction of the gradient of the previous step is consistent with the direction of the gradient of the current step is determined according to the following formula:
dic = sgn(∇_θ L(θ_t) · v_t)
where dic represents the judgment value, sgn(·) represents the sign function, ∇_θ L(θ_t) represents the gradient of the current step t, and v_t represents the velocity of the current step t.
4. The method according to claim 3, wherein if the judgment value is 1, the direction of the previous step gradient is consistent with the direction of the current step gradient; and if the judgment value is -1, the direction of the gradient of the previous step is inconsistent with the direction of the gradient of the current step.
5. The method of claim 1, wherein said calculating the distance between the initial tag and the revised tag using the FASTA algorithm comprises:
encoding the initial label and the corrected label by adopting a reference format of a deoxynucleotide sequence to obtain a first initial label sequence and a first corrected label sequence;
calculating a distance between the first initial tag sequence and the first modified tag sequence as a first distance;
encoding the initial label and the corrected label by one-hot to obtain a second initial label sequence and a second corrected label sequence;
calculating a distance between the second initial tag sequence and the second modified tag sequence as a second distance;
and performing weighted summation processing on the first distance and the second distance, and taking the obtained result as the distance between the initial label and the corrected label.
6. The method of claim 5, wherein said encoding said initial tag and said modified tag in a deoxynucleotide sequence reference format comprises:
encoding the tones of the initial tag and the tones of the modified tag in the reference format of the deoxynucleotide sequence; wherein, in the reference format of the deoxynucleotide sequence, adenine deoxynucleotide A corresponds to the first tone, thymine deoxynucleotide T corresponds to the second tone, cytosine deoxynucleotide C corresponds to the third tone, and guanine deoxynucleotide G corresponds to the fourth tone.
7. The method according to claim 6, wherein the screening the revised label of the image data to be trained according to the distance comprises:
normalizing the distance to obtain a normalized distance;
if the normalized distance is larger than a preset threshold value, removing the corresponding correction label of the image data to be trained;
otherwise, the corresponding correction label of the image data to be trained is reserved.
8. The method of claim 7, wherein the normalized distance is calculated by the formula:
(normalized-distance formula, given as an image in the original publication)
where ε represents an adjustment coefficient, n is the number of initial labels, tf(i) represents the number of occurrences of the initial label i, and SIM_{i,j} represents the distance between the initial label i and its corresponding modified label j.
9. The method of claim 1, further comprising:
acquiring image data to be identified;
and identifying the image data to be identified through the second image identification model to obtain an image identification result.
10. An optimization system of the deep learning network facing image recognition based on the method of any one of claims 1 to 9, characterized in that the system comprises:
the data acquisition module is used for acquiring image data to be trained; wherein the image data to be trained carries an initial label;
the deep learning network model building module is used for building a deep learning network model; the deep learning network model is a convolutional neural network model and comprises a forward propagation network and a backward propagation network; the forward propagation network comprises four convolution layers and four pooling layers, and the backward propagation network comprises two convolution layers, four expansion convolution layers and four pooling layers;
the first image recognition model building module is used for inputting the image data to be trained into the deep learning network model, training parameters of the deep learning network model through an improved proportional-integral controller and building a first image recognition model;
a modified label determining module, configured to determine a modified label of the image data to be trained according to the first image recognition model;
the distance calculation module is used for calculating the distance between the initial label and the correction label by adopting a FASTA algorithm;
the correction label screening module is used for screening the correction label of the image data to be trained according to the distance;
and the second image recognition model construction module is used for inputting the screened image data to be trained carrying the correction labels into the first image recognition model, and training parameters of the first image recognition model through an improved proportional-integral controller to obtain a second image recognition model.
CN202210447416.2A 2022-04-26 2022-04-26 Deep learning network optimization method and system for image recognition Active CN114863242B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210447416.2A CN114863242B (en) 2022-04-26 2022-04-26 Deep learning network optimization method and system for image recognition

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210447416.2A CN114863242B (en) 2022-04-26 2022-04-26 Deep learning network optimization method and system for image recognition

Publications (2)

Publication Number Publication Date
CN114863242A true CN114863242A (en) 2022-08-05
CN114863242B CN114863242B (en) 2022-11-29

Family

ID=82634184

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210447416.2A Active CN114863242B (en) 2022-04-26 2022-04-26 Deep learning network optimization method and system for image recognition

Country Status (1)

Country Link
CN (1) CN114863242B (en)

Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110543920A (en) * 2019-09-12 2019-12-06 北京达佳互联信息技术有限公司 Performance detection method and device of image recognition model, server and storage medium
CN111353549A (en) * 2020-03-10 2020-06-30 创新奇智(重庆)科技有限公司 Image tag verification method and device, electronic device and storage medium
WO2020243469A1 (en) * 2019-05-31 2020-12-03 Kiromic BioPharma, Inc. Methods for identifying and using disease-associated antigens
AU2020103613A4 (en) * 2020-11-23 2021-02-04 Agricultural Information and Rural Economic Research Institute of Sichuan Academy of Agricultural Sciences Cnn and transfer learning based disease intelligent identification method and system
CN112529210A (en) * 2020-12-09 2021-03-19 广州云从鼎望科技有限公司 Model training method, device and computer readable storage medium
CN112598020A (en) * 2020-11-24 2021-04-02 深兰人工智能(深圳)有限公司 Target identification method and system
US20210118559A1 (en) * 2019-10-22 2021-04-22 Tempus Labs, Inc. Artificial intelligence assisted precision medicine enhancements to standardized laboratory diagnostic testing
CN113469236A (en) * 2021-06-25 2021-10-01 江苏大学 Deep clustering image recognition system and method for self-label learning
CN113658643A (en) * 2021-07-22 2021-11-16 西安理工大学 Prediction method for lncRNA and mRNA based on attention mechanism
CN113903395A (en) * 2021-10-28 2022-01-07 聊城大学 BP neural network copy number variation detection method and system for improving particle swarm optimization
CN113963258A (en) * 2021-09-28 2022-01-21 上海东普信息科技有限公司 Worker card wearing identification method, device, equipment and storage medium
CN113971183A (en) * 2020-07-22 2022-01-25 阿里巴巴集团控股有限公司 Method and device for training entity marking model and electronic equipment
CN114119403A (en) * 2021-11-23 2022-03-01 北京拙河科技有限公司 Image defogging method and system based on red channel guidance

Patent Citations (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020243469A1 (en) * 2019-05-31 2020-12-03 Kiromic BioPharma, Inc. Methods for identifying and using disease-associated antigens
CN110543920A (en) * 2019-09-12 2019-12-06 北京达佳互联信息技术有限公司 Performance detection method and device of image recognition model, server and storage medium
US20210118559A1 (en) * 2019-10-22 2021-04-22 Tempus Labs, Inc. Artificial intelligence assisted precision medicine enhancements to standardized laboratory diagnostic testing
CN111353549A (en) * 2020-03-10 2020-06-30 创新奇智(重庆)科技有限公司 Image tag verification method and device, electronic device and storage medium
CN113971183A (en) * 2020-07-22 2022-01-25 阿里巴巴集团控股有限公司 Method and device for training entity marking model and electronic equipment
AU2020103613A4 (en) * 2020-11-23 2021-02-04 Agricultural Information and Rural Economic Research Institute of Sichuan Academy of Agricultural Sciences Cnn and transfer learning based disease intelligent identification method and system
CN112598020A (en) * 2020-11-24 2021-04-02 深兰人工智能(深圳)有限公司 Target identification method and system
CN112529210A (en) * 2020-12-09 2021-03-19 广州云从鼎望科技有限公司 Model training method, device and computer readable storage medium
CN113469236A (en) * 2021-06-25 2021-10-01 江苏大学 Deep clustering image recognition system and method for self-label learning
CN113658643A (en) * 2021-07-22 2021-11-16 西安理工大学 Prediction method for lncRNA and mRNA based on attention mechanism
CN113963258A (en) * 2021-09-28 2022-01-21 上海东普信息科技有限公司 Worker card wearing identification method, device, equipment and storage medium
CN113903395A (en) * 2021-10-28 2022-01-07 聊城大学 BP neural network copy number variation detection method and system for improving particle swarm optimization
CN114119403A (en) * 2021-11-23 2022-03-01 北京拙河科技有限公司 Image defogging method and system based on red channel guidance

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
QUAN ZOU ET AL.: "Gene2vec: gene subsequence embedding for prediction of mammalian N6-methyladenosine sites from mRNA", RNA *
FU HUA ET AL.: "Transmission line fault identification method based on VMD-PE combined with SNN", Journal of Electronic Measurement and Instrumentation *

Also Published As

Publication number Publication date
CN114863242B (en) 2022-11-29

Similar Documents

Publication Publication Date Title
CN110223292B (en) Image evaluation method, device and computer readable storage medium
CN110188829B (en) Neural network training method, target recognition method and related products
CN112784929B (en) Small sample image classification method and device based on double-element group expansion
CN112418292B (en) Image quality evaluation method, device, computer equipment and storage medium
CN110288513B (en) Method, apparatus, device and storage medium for changing face attribute
CN111126470B (en) Image data iterative cluster analysis method based on depth measurement learning
CN113902913A (en) Image semantic segmentation method and device
CN111709398A (en) Image recognition method, and training method and device of image recognition model
CN114282059A (en) Video retrieval method, device, equipment and storage medium
CN115690797A (en) Character recognition method, device, equipment and storage medium
CN117726884B (en) Training method of object class identification model, object class identification method and device
CN106446844B (en) Posture estimation method and device and computer system
CN114863242B (en) Deep learning network optimization method and system for image recognition
CN112329666A (en) Face recognition method and device, electronic equipment and storage medium
CN113572981A (en) Video dubbing method and device, electronic equipment and storage medium
CN111738059A (en) Non-sensory scene-oriented face recognition method
CN114170484B (en) Picture attribute prediction method and device, electronic equipment and storage medium
CN116051924A (en) Divide-and-conquer defense method for image countermeasure sample
CN115423031A (en) Model training method and related device
CN113076963B (en) Image recognition method and device and computer readable storage medium
CN112084371B (en) Movie multi-label classification method and device, electronic equipment and storage medium
CN114882582A (en) Gait recognition model training method and system based on federal learning mode
CN111340329B (en) Actor evaluation method and device and electronic equipment
CN107609645B (en) Method and apparatus for training convolutional neural network
CN114724009B (en) Image identification method and device based on improved deep learning network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant