CN110598737B - Online learning method, device, equipment and medium of deep learning model - Google Patents

Info

Publication number
CN110598737B
Authority
CN
China
Prior art keywords
deep learning
learning model
training
layer
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201910722508.5A
Other languages
Chinese (zh)
Other versions
CN110598737A (en)
Inventor
石大明
刘露
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen University
Original Assignee
Shenzhen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen University
Priority to CN201910722508.5A
Publication of CN110598737A
Application granted
Publication of CN110598737B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The invention is applicable to the technical field of deep learning and provides an online learning method, device, equipment and medium of a deep learning model. The method comprises: training offline a deep learning model into which an inhibition signal is introduced, and deploying the trained model online; performing image recognition on each received online training image; cutting any unrecognizable online training image through a sliding window to obtain a corresponding basic feature set; performing similarity matching between the basic feature set and a training image set; setting the basic features whose similarity is lower than a similarity threshold as singular features; and retraining the deep learning model according to the singular feature set formed by the singular features and a preset model training algorithm, thereby completing the online learning of the deep learning model. Introducing the inhibition signal improves the noise robustness of the deep learning model, and the personalized training improves its recognition accuracy.

Description

Online learning method, device, equipment and medium of deep learning model
Technical Field
The invention belongs to the technical field of deep learning, and particularly relates to an online learning method, device, equipment and medium of a deep learning model.
Background
Online learning is not a model but a model training method. After a prediction model goes online, online learning can quickly optimize and adjust the original prediction model in real time according to online feedback data, so that the adjusted model reflects online changes promptly and online prediction accuracy improves. Data used online differs from clean offline test data and often contains some noise. Because of its dynamic learning nature and the complexity of the data, online learning places higher demands on a model's scalability, noise immunity and memory efficiency.
At present, most online learning algorithms are machine learning algorithms based on online convex optimization, which is designed for learning shallow models. Such methods cannot learn nonlinear functions in complex application scenarios and therefore cannot fit the data adequately. Deep learning has been widely applied in many fields because of its strong nonlinear expressive ability. However, it has a major drawback once a deep model is online: deep neural networks are trained in a batch learning setting, which requires the entire training data set to be prepared before the learning task starts. This is impossible for the many real tasks in which data arrives one item after another as a stream, and there may not be enough memory to store all of it. A new online deep learning method is therefore needed to overcome this drawback.
Disclosure of Invention
The invention aims to provide an online learning method, device, equipment and medium of a deep learning model, so as to solve the problems of poor noise resistance and low recognition accuracy of deep learning models, which arise because the prior art provides no effective online learning method for deep learning models.
In one aspect, the present invention provides an online learning method for a deep learning model, including the following steps:
performing image recognition on a received online training image through a deep learning model that has been trained offline in advance and into which an inhibition signal and an excitation signal are introduced, to obtain an image recognition result;
when the online training image is determined to be an unrecognizable image according to the image recognition result, cutting the online training image through a sliding window to obtain corresponding basic features having the same size as the receptive field of each layer of the deep learning model;
according to the size of the basic features, performing similarity matching between the basic feature set formed by the basic features and a pre-stored training image set to obtain the similarity corresponding to each basic feature, and setting the basic features whose similarity is lower than a preset similarity threshold as singular features; and
retraining the deep learning model according to a singular feature set formed by the singular features and a preset model training algorithm, so as to adjust the parameters of the deep learning model and complete the online learning of the deep learning model.
In another aspect, the present invention provides an online learning apparatus for deep learning models, the apparatus including:
an online image recognition unit, configured to perform image recognition on a received online training image through a deep learning model that has been trained offline in advance and into which an inhibition signal and an excitation signal are introduced, to obtain an image recognition result;
a basic feature extraction unit, configured to cut the online training image through a sliding window, when the online training image is determined to be an unrecognizable image according to the image recognition result, to obtain corresponding basic features having the same size as the receptive field of each layer of the deep learning model;
a similarity matching unit, configured to perform similarity matching between a basic feature set formed by the basic features and a pre-stored training image set according to the size of the basic features to obtain the similarity corresponding to each basic feature, and to set the basic features whose similarity is lower than a preset similarity threshold as singular features; and
a model training unit, configured to retrain the deep learning model according to a singular feature set formed by the singular features and a preset model training algorithm, so as to adjust the parameters of the deep learning model and complete the online learning of the deep learning model.
In another aspect, the present invention further provides a computing device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor implements the steps of the online learning method of the deep learning model when executing the computer program.
In another aspect, the present invention further provides a computer-readable storage medium, which stores a computer program, and when the computer program is executed by a processor, the computer program implements the steps of the online learning method of the deep learning model.
In the invention, a deep learning model into which an inhibition signal is introduced is first trained offline and, after training, deployed online. The model then performs image recognition on each received online training image. When the online training image is determined to be an unrecognizable image according to the image recognition result, the image is cut through a sliding window to obtain corresponding basic features. According to the size of the basic features, similarity matching is performed between the basic feature set formed by the basic features and a pre-stored training image set to obtain the similarity corresponding to each basic feature, and the basic features whose similarity is lower than a preset similarity threshold are set as singular features. The deep learning model is then retrained according to the singular feature set formed by the singular features and a preset model training algorithm, so as to adjust the parameters of the model and complete its online learning. Introducing the inhibition signal thus improves the noise robustness of the deep learning model, and the personalized training improves its recognition accuracy, so that the trained deep learning model is more consistent with the characteristics of the human visual cortex.
Drawings
FIG. 1 is a flowchart of an implementation of an online learning method of a deep learning model according to a first embodiment of the present invention;
FIG. 2 is a flowchart of an implementation of an online learning method of a deep learning model according to a second embodiment of the present invention;
FIG. 3 is a schematic structural diagram of an online learning apparatus for a deep learning model according to a third embodiment of the present invention;
FIG. 4 is a schematic structural diagram of an online learning apparatus for a deep learning model according to a fourth embodiment of the present invention; and
FIG. 5 is a schematic structural diagram of a computing device according to a fifth embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention.
The following detailed description of specific implementations of the invention is provided in conjunction with specific embodiments:
Example one:
FIG. 1 shows the implementation flow of the online learning method of a deep learning model provided in the first embodiment of the present invention. For convenience of description, only the parts related to this embodiment are shown, detailed as follows:
In step S101, image recognition is performed on the received online training image through a deep learning model that has been trained offline in advance and into which an inhibition signal and an excitation signal are introduced, to obtain an image recognition result.
The embodiment of the invention is applicable to computing equipment such as a personal computer or a server. Although a deep learning model trained offline in advance can accurately recognize most samples, a few images still cannot be recognized. Therefore, the deep learning model that has been trained offline in advance and into which the inhibition signal and the excitation signal are introduced is deployed online for online learning. The model receives online training images sent by online users and performs image recognition on each received image to obtain a corresponding image recognition result, which indicates either that recognition succeeded or that it failed.
When performing image recognition on the received online training image, feature extraction is preferably performed on the online training image input to the deep learning model using the feature extraction formula $V_l(n,k) = E_l(n,k) - I_l(n)$, which combines the inhibition signal and the excitation signal, and the image is recognized according to the extracted features. Here $V_l(n,k)$ is the feature matrix of the k-th plane on the l-th layer of the deep learning model, $E_l(n,k)$ is the excitation signal of the k-th plane on the l-th layer, $I_l(n)$ is the inhibition signal of the l-th layer, and n is the central cell on the k-th plane. Weakening the excitation signal with the inhibition signal eliminates the influence of noise and thereby improves the noise robustness of the deep learning model.
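As a minimal sketch of the subtraction in the feature extraction formula, the following Python snippet computes V = E - I for every plane of one layer. The list-based data layout, the function name `extract_features` and the numeric values are assumptions made for illustration; the patent text does not prescribe them.

```python
def extract_features(excitation, inhibition):
    """Return V_l(n, k) = E_l(n, k) - I_l(n) for every plane k and cell n.

    excitation: list of planes, each a list of per-cell values E_l(n, k)
    inhibition: list of per-cell values I_l(n), shared by all planes of layer l
    (This layout is an illustrative assumption, not taken from the patent.)
    """
    return [[e - i for e, i in zip(plane, inhibition)] for plane in excitation]


# Two planes (k = 0, 1) with three cells each; the values are made up.
E = [[0.5, 1.0, 0.25], [2.0, 0.5, 1.0]]
I = [0.25, 0.5, 0.25]
V = extract_features(E, I)
```

Because the inhibition signal depends only on the cell position n, the same values are subtracted from every plane of the layer.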
In step S102, when it is determined that the on-line training image is an unrecognizable image according to the image recognition result, the on-line training image is cut through the sliding window, and the corresponding basic features having the same receptive field size as that of each layer of the deep learning model are obtained.
In the embodiment of the invention, when the online training image is determined to be an unrecognizable image according to the image recognition result, that is, when the deep learning model fails to recognize it, the online training image is input into the deep learning model and cut through sliding windows of different sizes to obtain the corresponding basic features, whose sizes equal the receptive field sizes of the layers of the deep learning model.
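The cutting step can be sketched as follows, assuming the image is a two-dimensional list of pixel values and that the receptive field sizes of the layers are known. The function names, the stride parameter and the toy 4x4 image are hypothetical and serve only to illustrate the idea.

```python
def sliding_window_patches(image, size, stride=1):
    """Cut a 2-D image (list of rows) into size x size patches."""
    height, width = len(image), len(image[0])
    patches = []
    for top in range(0, height - size + 1, stride):
        for left in range(0, width - size + 1, stride):
            patch = [row[left:left + size] for row in image[top:top + size]]
            patches.append(patch)
    return patches


def basic_features(image, receptive_field_sizes, stride=1):
    """Map each receptive-field size to the patches cut at that size."""
    return {s: sliding_window_patches(image, s, stride)
            for s in receptive_field_sizes}


# Toy 4x4 image whose pixel value encodes its position (illustrative only).
image = [[r * 4 + c for c in range(4)] for r in range(4)]
features = basic_features(image, receptive_field_sizes=[2, 3])
```

Grouping the patches by window size mirrors the later matching step, which processes basic features of one size at a time.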
In step S103, according to the size of the basic features, similarity matching is performed between the basic feature set formed by the basic features and a pre-stored training image set to obtain a similarity corresponding to each basic feature, and the basic features corresponding to the similarities lower than a preset similarity threshold in all the similarities are set as singular features.
In the embodiment of the invention, the basic features are grouped by size. A basic feature set of one size is matched for similarity against the pre-stored training image set to obtain the similarity corresponding to each basic feature in that set, and the basic features whose similarity is lower than the preset similarity threshold are set as singular features. This operation is repeated for each size until all basic features have been matched, yielding singular features of the online training image at multiple sizes. The training image set is the set of samples used to pre-train the deep learning model.
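The thresholding described above might look like the sketch below. The patent text does not name a similarity measure, so cosine similarity over flattened patches is used here purely as an assumption, as is taking the best match against the training patches as the similarity of a basic feature.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two flattened patches (an assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def find_singular_features(basic_features, training_patches, threshold):
    """Keep the basic features whose best match falls below the threshold."""
    singular = []
    for feature in basic_features:
        best = max(cosine_similarity(feature, p) for p in training_patches)
        if best < threshold:
            singular.append(feature)
    return singular


# Flattened patches of one size; the values are made up for illustration.
training_patches = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]
candidates = [[2.0, 0.0, 0.0, 0.0], [0.0, 0.0, 3.0, 0.0]]
singular = find_singular_features(candidates, training_patches, threshold=0.8)
```

In this toy run the first candidate matches a stored patch exactly and is discarded, while the second resembles nothing in the training set and is kept as a singular feature.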
In step S104, the deep learning model is retrained according to the singular feature set formed by the singular features and a preset model training algorithm, so as to adjust the parameters of the deep learning model and complete the online learning of the deep learning model.
In the embodiment of the invention, the deep learning model is retrained according to the singular feature set formed by the singular features and a preset model training algorithm, so as to adjust the parameters of the deep learning model and enable it to correctly recognize the individual online training image. The singular feature set is then added to the training image set, which completes the online learning of the deep learning model.
Preferably, the retraining of the deep learning model is achieved by:
(1) Classifying all singular characteristics by adopting a similarity clustering algorithm to obtain a plurality of characteristic categories;
(2) Respectively carrying out weighted average calculation on all singular characteristics in each characteristic category to obtain target singular characteristics corresponding to the characteristic categories;
(3) According to a target singular characteristic set formed by the obtained target singular characteristics, local training is carried out on the hidden layer until the activation values of the connection domains corresponding to the seed cells arranged on the characteristic extraction surface and the down-sampling surface reach a preset activation threshold value;
(4) And performing classification training of a full connection layer on the deep learning model after the local training according to the on-line training image so as to realize correct recognition of the deep learning model on the single on-line training image and finish retraining of the deep learning model.
For the specific implementation of steps (1) to (4), refer to the description of the corresponding steps in Example two below, which is not repeated here. Steps (1) to (4) implement personalized training of the deep learning model and improve its recognition accuracy.
In the embodiment of the invention, image recognition is performed on the received online training image through a deep learning model pre-trained offline. When the online training image cannot be recognized, features are extracted from it using the feature extraction formula that combines the inhibition signal and the excitation signal to obtain the corresponding basic features. Similarity matching is performed between the basic feature set formed by the basic features and the training image set, and the basic features whose similarity is lower than the similarity threshold are set as singular features. The deep learning model is then retrained according to the singular feature set formed by the singular features and a preset model training algorithm, which completes its online learning. Introducing the inhibition signal improves the noise robustness of the deep learning model, and the personalized training improves its recognition accuracy, so that the trained deep learning model is more consistent with the characteristics of the human visual cortex.
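The control flow of steps S101 to S104 can be summarized in a short sketch. Every callable passed in (`recognize`, `extract_basic_features`, `find_singular`, `retrain`) is a hypothetical placeholder for the corresponding step, and the string-based toy stubs exist only so that the example runs; none of them comes from the patent.

```python
def online_learning_step(image, recognize, extract_basic_features,
                         find_singular, retrain, training_set):
    """One pass of the S101-S104 flow; returns (result, whether_retrained)."""
    result = recognize(image)                      # S101: image recognition
    if result is not None:                         # recognized: nothing to learn
        return result, False
    basic = extract_basic_features(image)          # S102: sliding-window cutting
    singular = find_singular(basic, training_set)  # S103: similarity matching
    retrain(singular, image)                       # S104: retraining
    training_set.extend(singular)                  # singular features join the set
    return recognize(image), True


# Toy stubs (assumptions): the "model" is just a set of known labels.
known = {"cat"}
training_set = []
result, learned = online_learning_step(
    "dog",
    recognize=lambda img: img if img in known else None,
    extract_basic_features=lambda img: [img + "_patch"],
    find_singular=lambda basic, stored: [b for b in basic if b not in stored],
    retrain=lambda singular, img: known.add(img),
    training_set=training_set,
)
```

Only unrecognizable images trigger retraining, which is what keeps the online update limited to the individual samples the offline model missed.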
Example two:
FIG. 2 shows the implementation flow of the online learning method of a deep learning model provided in the second embodiment of the present invention. For convenience of description, only the parts related to this embodiment are shown, detailed as follows:
Before image recognition is performed on the received online training image through the deep learning model that has been trained offline in advance and into which the inhibition signal and the excitation signal are introduced, the offline training of the deep learning model is carried out through the following steps:
in step S201, a deep learning model is constructed from the training image set.
In the embodiment of the invention, the structural hierarchy of the deep learning model is set according to the complexity of the received training image set input by the user (the complexity includes the number of image samples in the training image set, the size of each image sample, the image definition and the like), and a basic untrained deep learning model is constructed according to that structural hierarchy. The deep learning model comprises an input layer, a hidden layer and an output layer. The input layer consists of a single layer and directly receives a two-dimensional visual pattern. The output layer is a fully connected layer that integrates the local features extracted by the hidden layer and classifies the training image samples in the training image set according to the integration result. The hidden layer comprises a plurality of feature extraction layers (denoted S layers) and a downsampling layer (denoted C layer) corresponding to each feature extraction layer. Each S layer comprises a plurality of feature extraction surfaces (denoted S surfaces), and each C layer comprises a plurality of downsampling surfaces (denoted C surfaces); each S surface comprises a plurality of excitatory neurons and a plurality of inhibitory neurons, and each C surface comprises a plurality of complex neurons. The S layer extracts feature patterns, and the C layer, which uses an L2 pooling operation, addresses displacement and distortion of the features extracted by the S layer. Every intermediate stage of the hidden layer is an S layer and a C layer connected in series, that is, each S layer is followed by one C layer.
The structural hierarchy set for the deep learning model comprises the number of S layers and C layers in the hidden layer, the number of S surfaces or C surfaces forming each S layer or C layer, and the number of excitatory and inhibitory neurons forming each S surface and of complex neurons forming each C surface.
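To illustrate the L2 pooling performed by a C layer, the sketch below pools an S surface over non-overlapping windows by taking the square root of the sum of squares in each window. The window size, the non-overlapping layout and the use of a sum rather than a mean are assumptions; the patent text only states that L2 pooling is used.

```python
import math


def l2_pool(plane, size=2):
    """L2-pool a 2-D plane over non-overlapping size x size windows."""
    height, width = len(plane), len(plane[0])
    pooled = []
    for top in range(0, height - size + 1, size):
        row = []
        for left in range(0, width - size + 1, size):
            total = sum(plane[r][c] ** 2
                        for r in range(top, top + size)
                        for c in range(left, left + size))
            row.append(math.sqrt(total))
        pooled.append(row)
    return pooled


# Made-up S-surface responses; each 2x2 block collapses to one C-surface cell.
s_plane = [[3.0, 4.0, 0.0, 0.0],
           [0.0, 0.0, 0.0, 0.0],
           [1.0, 1.0, 2.0, 0.0],
           [1.0, 1.0, 0.0, 0.0]]
c_plane = l2_pool(s_plane)
```

The pooled value is the same wherever a response sits inside its window, which is why pooling makes the extracted features tolerant to small displacements.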
In step S202, according to the training image set, the corresponding target feature is extracted by using the built deep learning model and a preset feature extraction algorithm.
In the embodiment of the invention, according to a training image set, a constructed untrained deep learning model and a preset feature extraction algorithm are adopted to extract target features which can represent the basic features of the whole training image set.
The target features are preferably extracted through the following steps:
(1) Obtaining the initial features corresponding to each training image in the training image set by using the constructed deep learning model.
In the embodiment of the invention, the receptive field size of each layer in the deep learning model increases with depth: the receptive field of the first layer is set to be small, and the receptive field of the output layer is the whole image. The training image set is input into the untrained deep learning model, and the training pattern of the output layer is cut by sliding a window of the same size as the receptive field of the hidden layer of the deep learning model, to obtain the initial features corresponding to each training image in the training image set.
(2) Classifying all the initial features by a similarity clustering algorithm to obtain a plurality of feature categories.
In the embodiment of the invention, all the initial features are classified by a similarity clustering algorithm to obtain a plurality of feature categories, so that initial features within the same feature category are highly similar and initial features in different feature categories are only weakly similar.
(3) Performing a weighted average calculation on all the initial features in each feature category to obtain the target feature corresponding to that feature category.
Steps (1) to (3) extract target features that can represent the basic features of the whole training image set, which reduces the number of training samples, while weakening the excitation signal with the inhibition signal eliminates the influence of noise.
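Steps (2) and (3) can be sketched as below. The patent text does not specify the similarity clustering algorithm or the averaging weights, so a greedy threshold clustering on cosine similarity and uniform default weights are assumed here; all names are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two flattened features (an assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def cluster_by_similarity(features, threshold):
    """Greedy clustering: join the first cluster whose first member is similar."""
    clusters = []
    for feature in features:
        for cluster in clusters:
            if cosine_similarity(feature, cluster[0]) >= threshold:
                cluster.append(feature)
                break
        else:
            clusters.append([feature])  # no similar cluster found: start a new one
    return clusters


def weighted_average(cluster, weights=None):
    """Collapse one feature category into a single target feature."""
    weights = weights or [1.0] * len(cluster)
    total = sum(weights)
    return [sum(w * f[i] for w, f in zip(weights, cluster)) / total
            for i in range(len(cluster[0]))]


# Made-up initial features that fall into two obvious categories.
initial_features = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
clusters = cluster_by_similarity(initial_features, threshold=0.9)
target_features = [weighted_average(c) for c in clusters]
```

Four initial features collapse into two target features here, which shows how the averaging reduces the number of samples needed to train the hidden layer.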
In step S203, local training is performed on the hidden layer according to a target feature set formed by the extracted target features until the activation values of the connection domains corresponding to the seed cells set on the feature extraction plane and the downsampling plane reach a preset activation threshold.
In the embodiment of the invention, each feature extraction surface/down-sampling surface on a feature extraction layer/down-sampling layer represents the extraction of a feature, firstly, a seed cell (seed neuron) is arranged on a plane to be trained (namely, the feature extraction surface or the down-sampling surface), then, local training is carried out on a hidden layer according to a target feature set formed by all extracted target features until the activation value of a connection domain corresponding to the arranged seed cell reaches a preset activation threshold value, and then, the training is stopped, wherein the activation value of the connection domain corresponding to the seed cell is an input signal transmitted from a corresponding surface of a layer above the surface where the seed cell is located.
When the hidden layer is locally trained, preferably the feature extraction formula $V_l(n,k) = E_l(n,k) - I_l(n)$, which combines the inhibition signal and the excitation signal, is applied to the target feature set, and the hidden layer of the deep learning model is trained starting from its first layer, plane by plane and layer by layer, over each feature extraction surface and downsampling surface in each layer. The hidden layer is thus trained with a small number of training samples, which increases the training speed and improves the saliency of the features extracted by the trained hidden layer.
Further preferably, the excitation signal of the deep learning model is obtained by the formula
[Formula image BDA0002157716910000091]
where $E_l(n,k)$ is the activation value of the connection domain of the k-th plane (S plane or C plane) on the l-th layer of the deep neural network, that is, the excitation signal extracted from the k-th plane, and n is the central cell (the seed cell, or seed neuron) on the k-th plane;
[Formula image BDA0002157716910000092]
is the ReLU nonlinear activation function, which satisfies
[Formula image BDA0002157716910000093]
v denotes a cell (neuron) surrounding the central cell; $a_l(v,K,k)$ is the weight matrix between the k-th plane on the l-th layer and the K-th plane on the (l-1)-th layer (the layer preceding the l-th layer); $u_{C_{l-1}}(n+v,K)$ is the input signal from the C layer preceding the l-th layer; and $A_l$ is the receptive field window corresponding to the l-th layer. This improves the saliency of the extracted features.
Still preferably, the inhibition signal of the deep learning model is obtained by the formula
[Formula image BDA0002157716910000094]
where $I_l(n)$ is the inhibition signal of the l-th layer and $c_l(v)$ is the inhibition matrix of the l-th layer. Introducing the inhibition signal enhances the resistance of the deep learning model to noise and distortion.
When the hidden layer is trained plane by plane and layer by layer starting from its first layer, training begins with the first feature extraction layer and downsampling layer of the hidden layer. Each plane of these layers is trained in turn, and the weight matrix of the plane is updated according to the preset weight update formula $a_l'(v,K,k) = a_l(v,K,k) + q_l \cdot u_{C_{l-1}}(n+v,K)$ until the activation value of the connection domain corresponding to the seed cell set on the plane reaches the preset activation threshold. Training then proceeds to the second feature extraction layer and downsampling layer of the hidden layer, and continues layer by layer in sequence. Here $q_l$ is the weight learning rate set for the l-th layer being trained, and $a_l'(v,K,k)$ is the updated weight matrix between the k-th plane on the l-th layer and the K-th plane on the (l-1)-th layer (the layer preceding the l-th layer). This further improves the saliency of the features extracted by the trained hidden layer.
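The weight update and its stopping condition can be sketched for a single plane as follows. The update line follows the stated formula a' = a + q·u; the activation used for the stopping test (a ReLU over the weighted sum of the inputs in the receptive field) is an assumption, because the excitation formula itself is not reproduced in the text. The function name, the single input plane and the numbers are illustrative.

```python
def train_plane(weights, inputs, learning_rate, activation_threshold,
                max_steps=1000):
    """Repeat a' = a + q * u until the seed cell's activation reaches the threshold.

    weights, inputs: flat lists over receptive-field positions v, for a single
    input plane K (a simplification made for this sketch).
    """
    weights = list(weights)
    for step in range(max_steps):
        # Assumed activation: ReLU of the weighted sum over the receptive field.
        activation = max(0.0, sum(a * u for a, u in zip(weights, inputs)))
        if activation >= activation_threshold:
            return weights, activation, step
        weights = [a + learning_rate * u for a, u in zip(weights, inputs)]
    raise RuntimeError("activation threshold not reached")


# Made-up input pattern seen by the seed cell; q_l = 0.5, threshold = 5.0.
weights, activation, steps = train_plane(
    weights=[0.0, 0.0, 0.0], inputs=[1.0, 0.0, 2.0],
    learning_rate=0.5, activation_threshold=5.0)
```

Each update moves the weights toward the input pattern, so the seed cell responds more strongly to that pattern with every step until the threshold is met.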
In step S204, fully connected layer classification training is performed on the locally trained deep learning model according to a target image set selected in advance from the training image set, to complete the offline training of the deep learning model.
In the embodiment of the invention, representative training images are selected from the training image set in advance, and all the selected training images form a target image set. According to the target image set, classification training of the output layer (that is, the fully connected layer) is performed on the locally trained deep learning model to complete the offline training of the deep learning model.
When fully connected layer classification training is performed on the locally trained deep learning model, preferably a back propagation algorithm (BP algorithm) is used, according to the target image set, to train the fully connected layer of the deep learning model for classification and to adjust and correct the weights connecting the layers of the deep learning model, which completes the offline training. Specifically, the target image set is input into the deep learning model for forward propagation. According to the error between the predicted classification value output by the output layer and the actual classification value calibrated in advance for each target image in the target image set, the error is propagated backward from the output layer, and the weights connecting the layers of the deep learning model are updated iteratively until the error between the predicted and actual classification values falls within a preset target range. This refines the corrected weights, enables the locally trained deep learning model to recognize test samples effectively after offline training on a small number of samples, and improves the resistance of the whole deep learning model to rotation.
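The iterate-until-the-error-is-in-range loop can be illustrated with a deliberately simplified sketch that trains only a single softmax output layer by gradient descent on cross-entropy error. The patent describes back propagation through the weights between layers; restricting the example to one layer, the choice of loss, the learning rate and the toy data are all assumptions.

```python
import math


def softmax(logits):
    peak = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def train_output_layer(features, labels, num_classes, learning_rate=0.5,
                       target_error=0.1, max_epochs=10000):
    """Gradient descent on one fully connected layer until the mean error is small."""
    dim = len(features[0])
    weights = [[0.0] * dim for _ in range(num_classes)]
    for _ in range(max_epochs):
        grads = [[0.0] * dim for _ in range(num_classes)]
        error = 0.0
        for x, y in zip(features, labels):
            probs = softmax([sum(w * xi for w, xi in zip(row, x))
                             for row in weights])          # forward pass
            error -= math.log(probs[y])
            for c in range(num_classes):
                delta = probs[c] - (1.0 if c == y else 0.0)  # output error term
                for j in range(dim):
                    grads[c][j] += delta * x[j]
        error /= len(features)
        if error <= target_error:                            # error within range
            return weights, error
        for c in range(num_classes):
            for j in range(dim):
                weights[c][j] -= learning_rate * grads[c][j] / len(features)
    return weights, error


def predict(weights, x):
    scores = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return scores.index(max(scores))


# Made-up hidden-layer features for two target images of different classes.
weights, error = train_output_layer([[1.0, 0.0], [0.0, 1.0]], [0, 1],
                                    num_classes=2)
```

The loop stops as soon as the mean error falls to the target, which corresponds to the preset target range in the text.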
In the embodiment of the invention, target features are extracted from a received training image set using a preset feature extraction algorithm and a pre-constructed, untrained deep learning model, and the hidden layer of the deep learning model is locally trained on the target feature set formed by the extracted target features. After the local training is finished, the fully connected layer of the deep learning model is classification-trained on a target image set selected from the training image set, completing the offline training of the deep learning model. This reduces the number of samples needed to train the deep learning model, makes the trained model more consistent with the characteristics of the human visual cortex, improves its resistance to noise and displacement, and thereby improves both the training speed and the training effect.
Example three:
Fig. 3 shows the structure of an online learning apparatus for a deep learning model provided in a third embodiment of the present invention. For convenience of description, only the parts related to the third embodiment of the present invention are shown, and the structure includes:
an online image recognition unit 31, configured to perform image recognition on a received online training image through a deep learning model that has been trained offline in advance and into which an inhibitory signal and an excitatory signal are introduced, to obtain an image recognition result;
a basic feature extraction unit 32, configured to, when it is determined that the on-line training image is an unrecognizable image according to the image recognition result, cut the on-line training image through a sliding window, so as to obtain corresponding basic features having the same size as the receptive field of each layer of the deep learning model;
a similarity matching unit 33, configured to perform similarity matching between a basic feature set formed by basic features and a pre-stored training image set according to the size of the basic features to obtain a similarity corresponding to each basic feature, and set, as a singular feature, a basic feature corresponding to a similarity lower than a preset similarity threshold in all similarities; and
a model training unit 34, configured to retrain the deep learning model according to the singular feature set formed by the singular features and a preset model training algorithm, so as to adjust each parameter of the deep learning model and complete the online learning of the deep learning model.
In the embodiment of the present invention, each unit of the online learning apparatus for deep learning model may be implemented by a corresponding hardware or software unit, and each unit may be an independent software or hardware unit, or may be integrated into a software or hardware unit, which is not limited herein. Specifically, the implementation of each unit can refer to the description of the first embodiment, and is not repeated herein.
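The cooperation of the basic feature extraction unit and the similarity matching unit can be illustrated with a minimal sketch. It rests on assumptions that the embodiment does not fix: cosine similarity is used as the similarity measure, the pre-stored training image set is represented by stored patches of the same size as the window, and the names `sliding_window_patches`, `find_singular_features`, `stride` and `threshold` are hypothetical.

```python
import numpy as np

def sliding_window_patches(image, size, stride=1):
    """Cut a 2-D image into size x size patches with a sliding window."""
    h, w = image.shape
    return [image[r:r + size, c:c + size]
            for r in range(0, h - size + 1, stride)
            for c in range(0, w - size + 1, stride)]

def cosine_similarity(a, b):
    """Cosine similarity of two equally sized arrays (0.0 if either is all zero)."""
    a, b = a.ravel(), b.ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def find_singular_features(image, stored_patches, size, threshold=0.8, stride=1):
    """Return the patches whose best similarity to any stored patch of the
    same size is below `threshold`; these are treated as singular features."""
    singular = []
    for patch in sliding_window_patches(image, size, stride):
        best = max((cosine_similarity(patch, s)
                    for s in stored_patches if s.shape == patch.shape),
                   default=0.0)
        if best < threshold:
            singular.append(patch)
    return singular
```

To cover the receptive field of each layer, the function would be called once per receptive field size, with `size` set to that layer's receptive field.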
Example four:
fig. 4 shows a structure of an online learning apparatus for a deep learning model according to a fourth embodiment of the present invention, and for convenience of description, only parts related to the fourth embodiment of the present invention are shown, where the parts include:
a model construction unit 41, configured to construct a deep learning model according to the training image set;
a target feature extraction unit 42, configured to extract, according to the training image set, a corresponding target feature by using a built deep learning model and a preset feature extraction algorithm;
a local training unit 43, configured to perform local training on the hidden layer according to a target feature set formed by the extracted target features until a connection domain activation value corresponding to the seed cell set by the feature extraction plane and the downsampling plane reaches a preset activation threshold;
a global training unit 44, configured to perform classification training of the fully connected layer on the deep learning model that has completed local training, according to a target image set selected in advance from the training image set, so as to complete the offline training of the deep learning model;
an online image recognition unit 45, configured to perform image recognition on a received online training image through a deep learning model that has been trained offline in advance and into which an inhibitory signal and an excitatory signal are introduced, to obtain an image recognition result;
a basic feature extraction unit 46, configured to, when it is determined that the on-line training image is an unrecognizable image according to the image recognition result, cut the on-line training image through a sliding window to obtain corresponding basic features having the same size as the receptive field of each layer of the deep learning model;
a similarity matching unit 47, configured to perform similarity matching between a basic feature set formed by basic features and a pre-stored training image set according to the size of the basic features to obtain a similarity corresponding to each basic feature, and set, as a singular feature, a basic feature corresponding to a similarity lower than a preset similarity threshold in all similarities; and
a model training unit 48, configured to retrain the deep learning model according to the singular feature set formed by the singular features and a preset model training algorithm, so as to adjust each parameter of the deep learning model and complete the online learning of the deep learning model.
As shown in fig. 4, preferably, the target feature extraction unit 42 includes:
an initial feature obtaining unit 421, configured to obtain an initial feature corresponding to each training image in the training image set by using the constructed deep learning model;
a feature class obtaining unit 422, configured to classify all initial features by using a similarity clustering algorithm to obtain a plurality of feature classes; and
a target feature obtaining unit 423, configured to perform a weighted average calculation on all initial features in each feature category to obtain the target feature corresponding to that feature category.
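The clustering and weighted averaging performed by units 422 and 423 can be sketched as below. This is a hedged example rather than the embodiment's algorithm: the similarity clustering is approximated by a greedy scheme that compares each feature with the first member of every existing category using cosine similarity, the weights default to uniform because the embodiment does not specify them, and the names `cluster_by_similarity`, `target_features` and `threshold` are hypothetical.

```python
import numpy as np

def cluster_by_similarity(features, threshold=0.9):
    """Greedy similarity clustering (an assumed stand-in for the
    similarity clustering algorithm). Returns lists of row indices."""
    clusters = []
    for i, f in enumerate(features):
        for members in clusters:
            rep = features[members[0]]  # first member represents the category
            sim = f @ rep / (np.linalg.norm(f) * np.linalg.norm(rep) + 1e-12)
            if sim >= threshold:
                members.append(i)
                break
        else:
            clusters.append([i])  # no similar category found: open a new one
    return clusters

def target_features(features, weights=None, threshold=0.9):
    """Weighted average of the features in each category."""
    features = np.asarray(features, dtype=float)
    if weights is None:
        weights = np.ones(len(features))  # assumption: uniform weights
    weights = np.asarray(weights, dtype=float)
    return [np.average(features[idx], axis=0, weights=weights[idx])
            for idx in cluster_by_similarity(features, threshold)]
```

Each returned vector stands for one feature category, so the hidden layer is trained on far fewer items than the number of initial features.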
The local training unit 43 includes:
a local training subunit 431, configured to locally train the hidden layer according to the target feature set, starting from the first layer of the hidden layer, by training each feature extraction plane and down-sampling plane in each layer plane by plane and layer by layer.
The global training unit 44 includes:
a global training subunit 441, configured to perform classification training on the fully connected layer by using a back propagation algorithm according to the target image set, and to correct the weights between the layers of the deep learning model, so as to complete the offline training of the deep learning model.
The on-line image recognition unit 45 includes:
an image recognition subunit 451, configured to perform feature extraction on the online training image input to the deep learning model by using the feature extraction formula $V_l(n,k) = E_l(n,k) - I_l(n)$, which combines the suppression signal and the excitation signal, and to perform image recognition on the online training image according to the extracted features, where $V_l(n,k)$ is the feature matrix of the k-th plane on the l-th layer of the deep learning model, $E_l(n,k)$ is the excitation signal of the k-th plane on the l-th layer of the deep learning model, $I_l(n)$ is the suppression signal of the l-th layer of the deep learning model, and n is the median cell on the k-th plane.
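The subtraction of a layer-wide suppression signal from a per-plane excitation signal can be illustrated as follows. How the two signals are computed is not restated in this part of the description, so the sketch assumes that the excitation of plane k is a valid cross-correlation of the input with a plane-specific kernel and that the suppression is a scaled local mean of the input shared by all planes of the layer. The names `correlate_valid`, `layer_response` and `inhibition_gain` are hypothetical.

```python
import numpy as np

def correlate_valid(image, kernel):
    """Valid-mode 2-D cross-correlation of an image with a kernel."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.empty((h - kh + 1, w - kw + 1))
    for r in range(out.shape[0]):
        for c in range(out.shape[1]):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

def layer_response(image, kernels, inhibition_gain=0.5):
    """Compute V(n, k) = E(n, k) - I(n) for every plane k of one layer.

    Assumptions: E(n, k) is the correlation with kernel k, and I(n) is
    `inhibition_gain` times the local mean of the input over the same
    window, shared by all planes. All kernels must have the same shape.
    """
    kh, kw = kernels[0].shape
    mean_kernel = np.full((kh, kw), 1.0 / (kh * kw))
    inhibition = inhibition_gain * correlate_valid(image, mean_kernel)
    return np.stack([correlate_valid(image, k) - inhibition for k in kernels])
```

Because the suppression term does not depend on k, it lowers the response of every plane at the same position by the same amount, which is the property the formula expresses.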
In the embodiment of the present invention, each unit of the online learning apparatus for the deep learning model may be implemented by a corresponding hardware or software unit, and each unit may be an independent software or hardware unit, or may be integrated into a software or hardware unit, which is not limited herein. Specifically, the implementation of each unit can refer to the description of the foregoing method embodiment, and is not repeated herein.
Example five:
fig. 5 shows a structure of a computing device provided in a fifth embodiment of the present invention, and for convenience of description, only parts related to the embodiment of the present invention are shown.
The computing device 5 of an embodiment of the invention comprises a processor 50, a memory 51 and a computer program 52 stored in the memory 51 and executable on the processor 50. The processor 50, when executing the computer program 52, implements the steps in the above-described online learning method embodiment of the deep learning model, such as steps S101 to S104 shown in fig. 1. Alternatively, the processor 50, when executing the computer program 52, implements the functions of the units in the above-described device embodiments, such as the functions of the units 31 to 34 shown in fig. 3.
In the embodiment of the invention, a deep learning model into which an inhibition signal is introduced is trained offline and, after training, is deployed online, where it performs image recognition on received online training images. When an online training image is determined to be unrecognizable according to the image recognition result, the image is cut through a sliding window to obtain the corresponding basic features. The basic feature set formed by these basic features is matched for similarity against a pre-stored training image set according to the size of the basic features, yielding a similarity for each basic feature, and the basic features whose similarity is lower than a preset similarity threshold are set as singular features. The deep learning model is then retrained according to the singular feature set formed by the singular features and a preset model training algorithm, which adjusts the parameters of the deep learning model and completes its online learning. Introducing the inhibition signal improves the noise robustness of the deep learning model, the personalized training improves its recognition accuracy, and the trained deep learning model better matches the characteristics of the human visual cortex.
The computing device of the embodiment of the invention may be a personal computer or a server. For the steps implemented when the processor 50 of the computing device 5 executes the computer program 52 to carry out the online learning method of the deep learning model, refer to the description of the foregoing method embodiments; they are not repeated here.
Example six:
in an embodiment of the present invention, a computer-readable storage medium is provided, which stores a computer program, which when executed by a processor implements the steps in the above-described online learning method embodiment of the deep learning model, for example, steps S101 to S104 shown in fig. 1. Alternatively, the computer program may be adapted to, when executed by a processor, perform the functions of the units of the device embodiments described above, such as the functions of the units 31 to 34 shown in fig. 3.
In the embodiment of the invention, a deep learning model into which an inhibition signal is introduced is trained offline and, after training, is deployed online, where it performs image recognition on received online training images. When an online training image is determined to be unrecognizable according to the image recognition result, the image is cut through a sliding window to obtain the corresponding basic features. The basic feature set formed by these basic features is matched for similarity against a pre-stored training image set according to the size of the basic features, yielding a similarity for each basic feature, and the basic features whose similarity is lower than a preset similarity threshold are set as singular features. The deep learning model is then retrained according to the singular feature set formed by the singular features and a preset model training algorithm, which adjusts the parameters of the deep learning model and completes its online learning. Introducing the inhibition signal improves the noise robustness of the deep learning model, the personalized training improves its recognition accuracy, and the trained deep learning model better matches the characteristics of the human visual cortex.
The computer-readable storage medium of the embodiments of the present invention may include any entity, device or recording medium capable of carrying computer program code, such as a ROM/RAM, a magnetic disk, an optical disk, or a flash memory.
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and improvements made within the spirit and principle of the present invention are intended to be included within the scope of the present invention.

Claims (9)

1. An online learning method of a deep learning model, the method comprising the steps of:
performing image recognition on a received online training image through a deep learning model that has been trained offline in advance and into which a suppression signal and an excitation signal are introduced, to obtain an image recognition result, wherein the deep learning model comprises a hidden layer and a fully connected layer, the hidden layer comprises a plurality of feature extraction layers and down-sampling layers corresponding to the feature extraction layers, each feature extraction layer comprises a plurality of feature extraction surfaces, each down-sampling layer comprises a plurality of down-sampling surfaces, and performing image recognition on the received online training image specifically comprises: performing feature extraction on the online training image input to the deep learning model by using the feature extraction formula $V_l(n,k) = E_l(n,k) - I_l(n)$, which combines the suppression signal and the excitation signal, so as to perform image recognition on the online training image according to the extracted features, wherein $V_l(n,k)$ is the feature matrix of the k-th plane on the l-th layer of the deep learning model, $E_l(n,k)$ is the excitation signal of the k-th plane on the l-th layer of the deep learning model, $I_l(n)$ is the suppression signal of the l-th layer of the deep learning model, and n is the median cell on the k-th plane;
when the online training image is determined to be an unrecognizable image according to the image recognition result, cutting the online training image through a sliding window to obtain corresponding basic features having the same size as the receptive field of each layer of the deep learning model;
according to the size of the basic features, carrying out similarity matching on the basic feature set formed by the basic features and a pre-stored training image set to obtain the similarity corresponding to each basic feature, and setting the basic features corresponding to the similarities which are lower than a preset similarity threshold value in all the similarities as singular features;
retraining the deep learning model according to a singular feature set formed by the singular features and a preset model training algorithm, so as to adjust each parameter of the deep learning model and complete the online learning of the deep learning model;
wherein retraining the deep learning model according to the singular feature set formed by the singular features and the preset model training algorithm comprises:
classifying all singular characteristics by adopting a similarity clustering algorithm to obtain a plurality of characteristic categories;
respectively carrying out weighted average calculation on all singular characteristics in each characteristic category to obtain target singular characteristics corresponding to the characteristic categories;
according to a target singular characteristic set formed by the obtained target singular characteristics, local training is carried out on the hidden layer until the activation values of the connection domains corresponding to the seed cells arranged on the characteristic extraction surface and the down-sampling surface reach a preset activation threshold value;
and performing classification training of the full connection layer on the deep learning model which is subjected to the local training according to the on-line training image.
2. The method of claim 1, in which the fully-connected layer is an output layer of the deep-learning model.
3. The method of claim 2, wherein prior to the step of performing image recognition on the received on-line training image, the method further comprises:
constructing the deep learning model according to the training image set;
extracting corresponding target features by adopting the built deep learning model and a preset feature extraction algorithm according to the training image set;
according to a target feature set formed by the extracted target features, local training is carried out on the hidden layer until the activation values of the connection domains corresponding to the seed cells arranged on the feature extraction surface and the down-sampling surface reach a preset activation threshold value;
and performing classification training of the full connection layer on the deep learning model which is subjected to the local training according to a target image set selected from the training image set in advance so as to complete off-line training of the deep learning model.
4. The method of claim 3, wherein the step of extracting corresponding target features comprises:
obtaining initial characteristics corresponding to each training image in the training image set by using the built deep learning model;
classifying all the initial features by adopting a similarity clustering algorithm to obtain a plurality of feature categories;
and respectively carrying out weighted average calculation on all initial features in each feature category to obtain target features corresponding to the feature categories.
5. The method of claim 3, wherein the step of locally training the hidden layer of the deep learning model comprises:
locally training the hidden layer according to the target feature set, starting from the first layer of the hidden layer, by training each feature extraction surface and down-sampling surface in each layer surface by surface and layer by layer.
6. The method of claim 3, wherein the step of performing classification training of the fully-connected layer on the deep-learning model that has completed the local training comprises:
performing classification training on the fully connected layer by using a back propagation algorithm according to the target image set, and correcting the weights between the layers of the deep learning model, so as to complete the offline training of the deep learning model.
7. An apparatus for online learning of a deep learning model, the apparatus comprising:
an online image recognition unit, configured to perform image recognition on a received online training image through a deep learning model that has been trained offline in advance and into which a suppression signal and an excitation signal are introduced, to obtain an image recognition result, wherein the deep learning model comprises a hidden layer and a fully connected layer, the hidden layer comprises a plurality of feature extraction layers and down-sampling layers corresponding to the feature extraction layers, each feature extraction layer comprises a plurality of feature extraction surfaces, each down-sampling layer comprises a plurality of down-sampling surfaces, and performing image recognition on the received online training image specifically comprises: performing feature extraction on the online training image input to the deep learning model by using the feature extraction formula $V_l(n,k) = E_l(n,k) - I_l(n)$, which combines the suppression signal and the excitation signal, so as to perform image recognition on the online training image according to the extracted features, wherein $V_l(n,k)$ is the feature matrix of the k-th plane on the l-th layer of the deep learning model, $E_l(n,k)$ is the excitation signal of the k-th plane on the l-th layer of the deep learning model, $I_l(n)$ is the suppression signal of the l-th layer of the deep learning model, and n is the median cell on the k-th plane;
the basic feature extraction unit is used for cutting the on-line training image through a sliding window to obtain corresponding basic features with the same size as the receptive field of each layer of the deep learning model when the on-line training image is determined to be an unrecognizable image according to the image recognition result;
the similarity matching unit is used for matching the similarity of a basic feature set formed by the basic features with a pre-stored training image set according to the size of the basic features to obtain the similarity corresponding to each basic feature, and setting the basic features corresponding to the similarity lower than a preset similarity threshold in all the similarities as singular features; and
a model training unit, configured to retrain the deep learning model according to a singular feature set formed by the singular features and a preset model training algorithm, so as to adjust each parameter of the deep learning model and complete the online learning of the deep learning model, wherein retraining the deep learning model according to the singular feature set formed by the singular features and the preset model training algorithm specifically comprises:
classifying all singular characteristics by adopting a similarity clustering algorithm to obtain a plurality of characteristic categories;
respectively carrying out weighted average calculation on all singular characteristics in each characteristic category to obtain target singular characteristics corresponding to the characteristic categories;
according to a target singular characteristic set formed by the obtained target singular characteristics, local training is carried out on the hidden layer until the activation values of the connection domains corresponding to the seed cells arranged on the characteristic extraction surface and the down-sampling surface reach a preset activation threshold value;
and performing classification training of the full connection layer on the deep learning model which is subjected to the local training according to the on-line training image.
8. A computing device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the method according to any of claims 1 to 6 when executing the computer program.
9. A computer-readable storage medium, in which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to any one of claims 1 to 6.
CN201910722508.5A 2019-08-06 2019-08-06 Online learning method, device, equipment and medium of deep learning model Active CN110598737B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910722508.5A CN110598737B (en) 2019-08-06 2019-08-06 Online learning method, device, equipment and medium of deep learning model

Publications (2)

Publication Number Publication Date
CN110598737A CN110598737A (en) 2019-12-20
CN110598737B true CN110598737B (en) 2023-02-24

Family

ID=68853513

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910722508.5A Active CN110598737B (en) 2019-08-06 2019-08-06 Online learning method, device, equipment and medium of deep learning model

Country Status (1)

Country Link
CN (1) CN110598737B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112347893B (en) * 2020-11-02 2023-07-21 深圳大学 Model training method and device for video behavior recognition and computer equipment
CN112560338B (en) * 2020-12-10 2022-03-25 东北大学 Complex industrial system intelligent forecasting method, device, equipment and storage medium based on adaptive deep learning

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103473558A (en) * 2013-09-04 2013-12-25 深圳先进技术研究院 Image recognizing method and system based on neural network
CN107704859A (en) * 2017-11-01 2018-02-16 哈尔滨工业大学深圳研究生院 A kind of character recognition method based on deep learning training framework
CN108416370A (en) * 2018-02-07 2018-08-17 深圳大学 Image classification method, device based on semi-supervised deep learning and storage medium
CN108765373A (en) * 2018-04-26 2018-11-06 西安工程大学 A kind of insulator exception automatic testing method based on integrated classifier on-line study

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10043112B2 (en) * 2014-03-07 2018-08-07 Qualcomm Incorporated Photo management
US11288551B2 (en) * 2016-10-24 2022-03-29 International Business Machines Corporation Edge-based adaptive machine learning for object recognition
US20190095764A1 (en) * 2017-09-26 2019-03-28 Panton, Inc. Method and system for determining objects depicted in images
CN109543818A (en) * 2018-10-19 2019-03-29 中国科学院计算技术研究所 A kind of link evaluation method and system based on deep learning model
CN109754068A (en) * 2018-12-04 2019-05-14 中科恒运股份有限公司 Transfer learning method and terminal device based on deep learning pre-training model

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
"Online Unsupervised Kernel Learning Algorithms";Kuh, Anthony et al.;《2017 ASIA-PACIFIC SIGNAL AND INFORMATION PROCESSING ASSOCIATION ANNUAL SUMMIT AND CONFERENCE》;20171231;第1-48页 *
"基于深度学习的图像识别鲁棒性研究";李海涛;《中国优秀硕士学位论文全文数据库 (信息科技辑)》;20190215(第02期);第1019-1025页 *

Also Published As

Publication number Publication date
CN110598737A (en) 2019-12-20

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant