CN111062465A - Image recognition model and method with neural network structure self-adjusting function - Google Patents

Image recognition model and method with neural network structure self-adjusting function

Info

Publication number
CN111062465A
CN111062465A
Authority
CN
China
Prior art keywords
neural network
model
network structure
training
search space
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911259716.2A
Other languages
Chinese (zh)
Inventor
陈荣聪
林倞
王广润
王广聪
张吉祺
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sun Yat Sen University
National Sun Yat Sen University
Original Assignee
National Sun Yat Sen University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by National Sun Yat Sen University filed Critical National Sun Yat Sen University
Priority to CN201911259716.2A priority Critical patent/CN111062465A/en
Publication of CN111062465A publication Critical patent/CN111062465A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Abstract

The invention discloses an image recognition model and method with a neural network structure self-adjusting function. The model comprises: a pre-training model generating unit for constructing a neural network model structure and pre-training it on a source domain based on standard transfer learning to obtain a pre-trained model; a search space design unit for designing a search space of the neural network model structure such that the selected neural network model structure can be regarded as one instance in the search space; a joint fine-tuning unit for jointly fine-tuning the network parameters and the network structure in the search space, starting from the obtained pre-trained model, and obtaining a target neural network structure after optimization; and a network parameter fine-tuning unit for further fine-tuning the network parameters of the target neural network structure obtained by the joint fine-tuning unit.

Description

Image recognition model and method with neural network structure self-adjusting function
Technical Field
The invention relates to the technical field of computer vision based on deep learning, in particular to an image recognition model and method with a neural network structure self-adjusting function.
Background
There is a large body of evidence that, in deep learning, pre-trained features can be transferred across tasks, i.e., transfer learning. In the 1980s, Hinton introduced transfer learning into deep learning, especially unsupervised learning. In 2012, this technology began to attract the attention of academia when ImageNet was first introduced to the computer vision community. In tasks such as image recognition, object detection, semantic segmentation, video recognition, pedestrian re-identification and pedestrian attribute recognition, better results can be obtained by performing transfer learning from an ImageNet pre-trained model. Transferring to other data domains after ImageNet pre-training can not only improve the performance of the target task, but also accelerate the learning process and shorten the training time. Standard transfer learning is applied not only in computer vision, but also in other fields, such as natural language processing (NLP).
On the other hand, automating the human process of designing machine learning algorithms has attracted increasing interest. In particular, neural network structure adjustment is expected to reduce the time spent by human experts on neural network architecture design. However, there is an unresolved problem in neural network structure adjustment, namely how to search such a model space efficiently. The most accurate and reliable solution is to train every candidate architecture in the search space and compare their performance, taking the best-performing neural network structure as the final one. However, this approach is very time-consuming because the search space is typically large (e.g., more than 1e20 candidates). To address this problem, many researchers have explored training candidate architectures with reinforcement learning (RL) or evolutionary learning to guide the search direction. For example, in RL-based neural network structure adjustment, only the most promising candidate neural network structures with the greatest reward are trained, because the target neural network structure is assumed to be among them. These neural network structure adjustment algorithms achieve significant performance; however, they still require a large amount of computation. For example, to obtain a state-of-the-art neural network structure on CIFAR-10, reinforcement learning requires 1800 GPU days, while evolutionary learning requires 3150 GPU days. This indicates that training the candidate neural network structures even in a search subspace (e.g., 1 million neural network structures) is still impractical, because training just one neural network structure often takes a long time (e.g., training ResNet on ImageNet takes more than 10 GPU days).
At present, most standard image recognition systems follow this framework: (a) pre-train a neural network on a large-scale dataset (e.g., ImageNet); (b) fine-tune the network parameters on a smaller, task-specific dataset. This transfer learning process is intended to transfer the recognition capability of the network from one data domain to another through parameter adaptation, but it is based on the assumption that a fixed neural network structure is suitable for all data domains. However, data domains with different recognition targets may require different feature hierarchies, in which some neurons may become redundant while others are reactivated to form new network structures.
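By way of illustration only, the standard pre-train-then-fine-tune flow described above may be sketched in PyTorch as follows; the number of target classes and the optimizer hyperparameters are assumptions introduced for this sketch and are not values prescribed by the invention.

```python
import torch
import torch.nn as nn
from torchvision import models

# Step (a): start from a network pre-trained on a large-scale source dataset (ImageNet).
model = models.resnet50(pretrained=True)

# Step (b): adapt the classifier head to the target task and fine-tune the parameters.
# The network structure itself stays fixed; only the weights W are adjusted.
num_target_classes = 10  # illustrative: depends on the target dataset
model.fc = nn.Linear(model.fc.in_features, num_target_classes)

optimizer = torch.optim.SGD(model.parameters(), lr=1e-3, momentum=0.9, weight_decay=1e-4)
criterion = nn.CrossEntropyLoss()

def finetune_step(images, labels):
    """One fine-tuning step on the target domain (weights only, structure fixed)."""
    optimizer.zero_grad()
    loss = criterion(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```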
Disclosure of Invention
In order to overcome the above defects in the prior art, the invention aims to provide an image recognition model and method with a self-adjusting neural network structure, which combine the transfer learning technology with the neural network structure adjustment technology, utilize the features learned by pre-training in transfer learning, adaptively adjust the neural network structure for different tasks and data, and jointly optimize the neural network structure and the neural network parameters so as to improve the image recognition performance of the model.
To achieve the above object, the present invention provides an image recognition model with a neural network structure self-adjusting, comprising:
the pre-training model generating unit is used for constructing a neural network model structure and pre-training on a source domain based on standard transfer learning to obtain a pre-training model;
a search space design unit for designing a search space of the neural network model structure such that the selected neural network model structure can be regarded as one instance in the search space;
the combined fine tuning unit is used for combining and fine tuning network parameters and a network structure in a search space from the obtained pre-training model, and obtaining a target neural network structure after optimization;
and the network parameter fine tuning unit is used for further fine tuning the network parameters of the target neural network structure obtained by the combined fine tuning unit.
Preferably, in the pre-training model generation unit, after the neural network model structure is constructed, a network model α_0 is given and trained with the source data set ImageNet to obtain the model parameters W_0 of the pre-trained network.
Preferably, the neural network model structure adopts a ResNet50 neural network structure.
Preferably, the pre-training model is expressed as:
Φ_{W_0,α_0}(X) = C_K^{W_K} ∘ C_{K-1}^{W_{K-1}} ∘ … ∘ C_1^{W_1}(X)
where Φ(·) represents the nonlinear function of the neural network, X is the input of the neural network, W = {W_1, W_2, …, W_i, …, W_{K-1}, W_K} denotes the parameters of the neural network, K denotes the depth of the neural network, α_0 is the given network model, C_i^{W_i} denotes the convolution operation with W_i as its convolution kernel, and ∘ is the operation composition symbol.
Preferably, in the pre-training model generation unit, standard transfer learning is formulated as:
W* = arg min_W L(Φ_{W,α_0}(X_t), Y_t) + λ‖W‖²
where W* represents the optimal network parameters for the given network structure α_0, and X_t and Y_t denote the target-domain images and labels.
Preferably, the search space design unit expands the neural network structure selected in the pre-training model generation unit to a larger neural network space.
Preferably, the search space is represented as:
A = {O_1, O_2, …, O_i, …, O_{K-1}, O_K}
where O_i (1 ≤ i ≤ K) represents the set of candidate operations of the i-th layer.
Preferably, the joint fine-tuning unit obtains the discrete target neural network structure α* by using a soft selection method on the basis of the joint fine-tuning.
Preferably, the image recognition model is trained on an ImageNet data set, and the obtained trained image recognition model realizes the classification of the input images.
In order to achieve the above object, the present invention further provides an image recognition method for self-adjusting a neural network structure, comprising the following steps:
step S1, constructing a neural network model structure, and pre-training on a source domain based on standard transfer learning to obtain a pre-training model;
step S2, designing a search space of the neural network structure, so that the selected neural network structure can be regarded as an example in the search space;
step S3, starting from the obtained pre-training model, combining and fine-tuning network parameters and a network structure in a search space, and obtaining a target neural network structure after optimization;
step S4 is to perform fine adjustment of network parameters in the target domain for the target neural network structure obtained in step S3.
Compared with the prior art, the image recognition model and method with neural network structure self-adjustment of the present invention combine the transfer learning technology with the neural network structure adjustment technology, utilize the representations learned by pre-training in transfer learning, adaptively adjust the neural network structure for different tasks and data, and jointly optimize the neural network structure and the neural network parameters to improve the image recognition performance of the model. The invention thereby realizes an image recognition framework with neural network structure self-adjustment based on transfer learning, and obtains better performance by simultaneously exploiting the capabilities of standard transfer learning and network structure adjustment.
Drawings
FIG. 1 is a system architecture diagram of an image recognition model with a self-adjusting neural network architecture according to the present invention;
FIG. 2 is a diagram illustrating the relationship among the initial network structure α_0, the target network structure α* and the search space A according to an embodiment of the present invention;
FIG. 3 is a graph illustrating the comparison of the results of the present invention with standard transfer learning;
fig. 4 is a system architecture diagram of an image recognition method with a self-adjusting neural network structure according to the present invention.
Detailed Description
Other advantages and capabilities of the present invention will be readily apparent to those skilled in the art from the present disclosure by describing the embodiments of the present invention with specific embodiments thereof in conjunction with the accompanying drawings. The invention is capable of other and different embodiments and its several details are capable of modification in various other respects, all without departing from the spirit and scope of the present invention.
FIG. 1 is a system architecture diagram of an image recognition model with a self-adjusting neural network structure according to the present invention. As shown in fig. 1, the image recognition model with a neural network structure self-adjusting according to the present invention includes:
and the pre-training model generating unit 101 is configured to construct a neural network model structure, such as ResNet50, and pre-train the neural network model structure on the source domain based on standard transfer learning to obtain a pre-training model.
In an embodiment of the present invention, the mathematical representation of the constructed neural network model is as follows:
Φ_W(X) = ψ(W_K ⊗ ψ(W_{K-1} ⊗ … ψ(W_1 ⊗ X)))   (1)
It can be seen that the neural network is a complex nonlinear function Φ(·), X is the input of the neural network, i.e. the preprocessed input image, W = {W_1, W_2, …, W_i, …, W_{K-1}, W_K} represents the parameters of the neural network, K represents the depth of the neural network, ⊗ represents a convolution operation (e.g. ordinary convolution or dilated convolution), and ψ(·) represents a nonlinear activation function (e.g. ReLU). Equation (1) can also be expressed as:
Φ_{W,α}(X) = C_K^{W_K} ∘ C_{K-1}^{W_{K-1}} ∘ … ∘ C_1^{W_1}(X)   (2)
where C_i^{W_i} denotes the convolution operation with W_i as its convolution kernel, ∘ is the operation composition symbol, and α denotes the network structure. Abbreviating C_i^{W_i} as C_i, the network structure can be simplified as:
α = C_K ∘ C_{K-1} ∘ … ∘ C_1   (3)
The deep learning problem can then be expressed as the following optimization problem:
min_W L(Φ_{W,α}(X), Y) + λ‖W‖²   (4)
where L(·,·) represents the loss function and λ‖W‖² is the regularization term, with the hyperparameter λ typically set to 1e-4. To simplify the expressions, commonly used components of the network structure (e.g. batch normalization) are not written out in the formulas; by default, batch normalization is used for these operations.
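As an illustrative sketch of this formulation (and not of the actual ResNet50 used in the embodiments), a network of the form Φ_{W,α}(X) = C_K ∘ … ∘ C_1(X) with the regularized objective (4) might be written in PyTorch as follows, with λ‖W‖² realized as weight decay; the layer choices and class name ComposedNet are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class ComposedNet(nn.Module):
    """Phi_{W,alpha}(X) = C_K o ... o C_1(X): the structure alpha is the list of operations."""
    def __init__(self, operations):
        super().__init__()
        self.ops = nn.ModuleList(operations)  # [C_1, C_2, ..., C_K]

    def forward(self, x):
        for op in self.ops:  # C_1 is applied first, C_K last
            x = op(x)
        return x

# Illustrative 3-layer instance of alpha (not the actual structure of the patent).
alpha = [
    nn.Sequential(nn.Conv2d(3, 16, 3, padding=1), nn.BatchNorm2d(16), nn.ReLU()),
    nn.Sequential(nn.Conv2d(16, 32, 3, padding=1), nn.BatchNorm2d(32), nn.ReLU()),
    nn.Sequential(nn.AdaptiveAvgPool2d(1), nn.Flatten(), nn.Linear(32, 10)),
]
net = ComposedNet(alpha)

# min_W L(Phi_{W,alpha}(X), Y) + lambda * ||W||^2, with lambda = 1e-4 as weight decay.
optimizer = torch.optim.SGD(net.parameters(), lr=0.1, weight_decay=1e-4)
loss_fn = nn.CrossEntropyLoss()
```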
In the pre-training model generation unit 101, after the neural network model structure (e.g., ResNet50) is constructed, a network model α_0 is given (if ResNet50 is used, ResNet50 is α_0). Pre-training obtains the model parameters W_0 of the pre-trained network through training on the source data set ImageNet, so the pre-trained model can be expressed as:
Φ_{W_0,α_0}(X) = C_K^{W_K} ∘ C_{K-1}^{W_{K-1}} ∘ … ∘ C_1^{W_1}(X)   (5)
In computer vision, ImageNet has been proven to be useful for many other tasks and is considered the most commonly used source data set; therefore, the present invention adopts ImageNet as the source data set for pre-training.
In standard transfer learning, the network parameters are adjustable when migrating to the target task, while the network structure is fixed; the goal is to find the optimal network parameters W* for the given network structure α_0:
W* = arg min_W L(Φ_{W,α_0}(X_t), Y_t) + λ‖W‖²   (6)
where X_t and Y_t denote the images and labels of the target domain. Standard transfer learning can therefore be formalized as:
(α_0, W_0) → (α_0, W*)   (7)
where the network structure α_0 is unchanged before and after the optimization, i.e. the network structure does not change.
A search space design unit 102 for designing a search space of the neural network structure such that the selected neural network structure can be regarded as one instance in the search space.
In the embodiment of the present invention, the search space of the neural network structure is designed such that the selected neural network structure can be regarded as an example in the search space by expanding the neural network structure selected by the pre-training model generating unit 101 to a larger neural network space.
The invention aims, first, to utilize the transferable property of the initial network structure α_0 and, second, to explore network structures in the neighborhood of α_0. To this end, the invention designs a search space A of network structures such that the initial network structure α_0 can be regarded as one instance in the search space A (i.e., α_0 ∈ A). FIG. 2 shows the relationship among the initial network structure α_0, the target network structure α* and the search space A. According to the definition of the search space, the transfer-learning-based image recognition model with neural network structure self-adjustment of the invention can be further expressed as:
(W*, α*) = arg min_{W, α∈A} L(Φ_{W,α}(X_t), Y_t) + λ‖W‖²   (8)
where O_i (1 ≤ i ≤ K) denotes the set of candidate operations of the i-th layer. The search for the network structure can then be simplified to choosing an appropriate operation for each layer, and the search space can be expressed as:
A = {O_1, O_2, …, O_i, …, O_{K-1}, O_K}   (9)
To ensure α_0 ∈ A, it is required that
C_i^{W_i} ∈ O_i, 1 ≤ i ≤ K.
in particular, three types of candidate operations are exemplified below:
convolution: convolution with 5 × 5,3 × 3,1 × 1, respectively
Figure BDA0002311282750000072
To represent
Pooling: 3X 3 maximum pooling, 3X 3 average pooling, Global average pooling, individually
Figure BDA0002311282750000073
To represent
And others: equivalent transformation, noise disturbance operation, respectively
Figure BDA0002311282750000074
To represent
Wherein the noise perturbation operation is to add Gaussian noise to the input of the operation, i represents the ith layer,
It should be noted that the above eight candidate operations are only specific examples; more operations, such as a 7×7 convolution or a dilated convolution, can be specified in the design of the search space, which can be changed flexibly. A possible rendering of one layer's candidate operation set is sketched after this paragraph.
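The sketch below is a possible PyTorch rendering of one layer's candidate operation set O_i; the concrete module choices, the helper names candidate_ops and GaussianNoise, and the omission of global average pooling (which would change the spatial size) are assumptions made so that all candidates remain interchangeable in shape.

```python
import torch
import torch.nn as nn

class GaussianNoise(nn.Module):
    """Noise perturbation: add Gaussian noise to the input of the operation."""
    def __init__(self, std=0.1):
        super().__init__()
        self.std = std

    def forward(self, x):
        return x + self.std * torch.randn_like(x)

def candidate_ops(channels):
    """Candidate operation set O_i for one layer, following the examples above.
    All operations keep the spatial size and channel count so they are interchangeable.
    Global average pooling is omitted here only for shape compatibility of the sketch."""
    return nn.ModuleDict({
        "conv5x5": nn.Conv2d(channels, channels, 5, padding=2),
        "conv3x3": nn.Conv2d(channels, channels, 3, padding=1),
        "conv1x1": nn.Conv2d(channels, channels, 1),
        "maxpool3x3": nn.MaxPool2d(3, stride=1, padding=1),
        "avgpool3x3": nn.AvgPool2d(3, stride=1, padding=1),
        "identity": nn.Identity(),
        "noise": GaussianNoise(),
    })
```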
And the joint fine-tuning unit 103 is used for joint fine-tuning the network parameters and the network structure in the search space from the obtained pre-training model, and obtaining the target neural network structure after optimization.
In the present invention, starting from the obtained pre-trained model, the network parameters and the network structure are jointly fine-tuned in the search space. Unlike standard transfer learning, the invention adjusts the network parameters and the network structure in the search space at the same time, which can be expressed as:
(W*, α*) = arg min_{W, α} L(Φ_{W,α}(X_t), Y_t) + λ‖W‖²
where α ∈ A and the network parameters W are initialized with the pre-trained parameters W_0. In view of (9), the problem can be further simplified to choosing, for each layer, an operation C_i from its candidate set O_i:
α = C_K ∘ C_{K-1} ∘ … ∘ C_1, C_i ∈ O_i
The process of adjusting from the initial network structure α_0 to the target neural network structure α* can then be expressed as:
(α_0, W_0) → (α*, W*)
preferably, the joint fine tuning unit 103 further obtains the discrete target neural network structure α by using a soft selection method based on the joint fine tuning*
According to the above equation (8), finding the optimal neural network structure is simplified to selecting an appropriate operation from the candidate operation set of each layer. However, selecting an operation from a candidate set is a discrete choice and is not differentiable, so it cannot be optimized directly with the gradient-based methods used for DNNs. To address this problem, the present invention relaxes the hard selection problem into a soft selection problem. Specifically, each candidate operation in O_i is associated with a confidence value P ∈ [0,1], where P = 1 indicates that the corresponding operation is definitely taken. Assuming that P can be learned in a data-driven manner, for example, for the i-th layer (1 ≤ i ≤ K), the probability of selecting the 3×3 convolution is defined as:
p_i^{3×3} = exp(θ_i^{3×3}) / Σ_{o∈O_i} exp(θ_i^{o})
where θ_i^{3×3} is a learnable parameter that measures the probability that the i-th layer selects the 3×3 convolution; the probabilities of selecting the other candidate operations are defined in the same way. The neural network during the search can then be expressed as:
Φ_{W,θ}(X) = Ĉ_K ∘ Ĉ_{K-1} ∘ … ∘ Ĉ_1(X)
where Ĉ_i represents the weighted sum of the candidate operations of the i-th layer, namely:
Ĉ_i(X) = Σ_{o∈O_i} p_i^{o} · C_i^{o}(X)
Since the operation C_i^{W_i} of the initial network structure has already been pre-trained, it has a greater capacity at the start of the search than the other candidate operations; therefore the θ corresponding to C_i^{W_i} is initialized to 1 and the θ of the other operations is initialized to 0. The search then becomes:
(W*, θ*) = arg min_{W,θ} L(Φ_{W,θ}(X_t), Y_t) + λ‖W‖²
where Φ_{W,θ}(X) is:
Φ_{W,θ}(X) = Ĉ_K^{W_K,θ_K} ∘ Ĉ_{K-1}^{W_{K-1},θ_{K-1}} ∘ … ∘ Ĉ_1^{W_1,θ_1}(X)
where Ĉ_i^{W_i,θ_i} denotes the weighted-sum operation Ĉ_i with parameters W_i and θ_i.
The invention also adds an operation regularization term R(θ), which penalizes the probability p_i^{W_i} assigned to the operation C_i^{W_i} of the initial network structure, so as to suppress the operations of the initial network structure and encourage new network operations; in addition, the identity transformation and the noise perturbation operation can reduce the network complexity. Finally, the image recognition framework with neural network structure self-adjustment based on transfer learning can be expressed as:
(W*, θ*) = arg min_{W,θ} L(Φ_{W,θ}(X_t), Y_t) + λ‖W‖² + γR(θ)
where γ is the weight of the operation regularization term.
in a specific embodiment of the present invention, a random gradient descent method may be used to solve this problem.
In the present invention, obtaining the target network structure α* is reduced to selecting an operation C_i from the candidate operation set O_i of each layer. Through the search, θ is obtained, which measures the probability that each operation of the i-th layer is selected. Therefore, the optimal operation is the candidate operation with the highest probability:
C_i* = arg max_{o∈O_i} p_i^{o}
after the optimal operation is selected, the target neural network structure α is obtained*
Fig. 3 is a schematic diagram comparing the present invention with standard transfer learning. Part (a) represents standard transfer learning: after pre-training on the source domain, the model is transferred to the target domains target1, target2 and target3; only the weights of the network are changed, the network structure is not changed, and the network structure of the source domain and of each target domain is the same. Part (b) represents the present invention: after pre-training on the source domain, when transferring to the target domains target1, target2 and target3, the invention changes not only the weights but also the network structure (including the connection mode and the operations).
The network parameter fine-tuning unit 104 is used for performing fine-tuning of the network parameters in the target domain on the target neural network structure α* obtained by the joint fine-tuning unit 103, i.e. the weights W are adjusted on the basis of the network structure and the weights obtained in the previous step, and training is continued.
After the image recognition model with the self-adjusting neural network structure is trained, an image to be recognized can be input into the image recognition model for image processing to obtain an image recognition result; for example, if the image recognition model is trained on the ImageNet data set, it can classify the input images.
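For completeness, a minimal inference sketch is given below; the preprocessing uses the standard ImageNet statistics, which is an assumption of this sketch rather than a requirement of the invention.

```python
import torch
from torchvision import transforms
from PIL import Image

preprocess = transforms.Compose([
    transforms.Resize(256),
    transforms.CenterCrop(224),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

def recognize(model, image_path):
    """Classify a single image with the trained self-adjusted recognition model."""
    model.eval()
    x = preprocess(Image.open(image_path).convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        logits = model(x)
    return int(logits.argmax(dim=1))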
Fig. 4 is a system architecture diagram of an image recognition method with a self-adjusting neural network structure according to the present invention. As shown in fig. 4, the image recognition method for neural network structure self-adjustment of the present invention includes the following steps:
and step S1, constructing a neural network model structure such as ResNet50, and pre-training on a source domain based on standard transfer learning to obtain a pre-training model.
In an embodiment of the present invention, the mathematical representation of the constructed neural network model is as follows:
Φ_W(X) = ψ(W_K ⊗ ψ(W_{K-1} ⊗ … ψ(W_1 ⊗ X)))   (1)
It can be seen that the neural network is a complex nonlinear function Φ(·), X is the input of the neural network, W = {W_1, W_2, …, W_i, …, W_{K-1}, W_K} represents the parameters of the neural network, K represents the depth of the neural network, ⊗ represents a convolution operation (e.g. ordinary convolution or dilated convolution), and ψ(·) represents a nonlinear activation function (e.g. ReLU). Equation (1) can also be expressed as:
Φ_{W,α}(X) = C_K^{W_K} ∘ C_{K-1}^{W_{K-1}} ∘ … ∘ C_1^{W_1}(X)   (2)
where C_i^{W_i} denotes the convolution operation with W_i as its convolution kernel, ∘ is the operation composition symbol, and α denotes the network structure. Abbreviating C_i^{W_i} as C_i, the network structure can be simplified as:
α = C_K ∘ C_{K-1} ∘ … ∘ C_1   (3)
The deep learning problem can then be expressed as the following optimization problem:
min_W L(Φ_{W,α}(X), Y) + λ‖W‖²   (4)
where L(·,·) represents the loss function and λ‖W‖² is the regularization term; λ is a hyperparameter, typically set to 1e-4. Batch normalization is hidden in the formulas for a simplified representation.
In step S1, after the neural network model structure (e.g., ResNet50) is constructed, a network model α_0 is given. Pre-training obtains the model parameters W_0 of the pre-trained network through training on the source data set ImageNet, so the pre-trained model can be expressed as:
Φ_{W_0,α_0}(X) = C_K^{W_K} ∘ C_{K-1}^{W_{K-1}} ∘ … ∘ C_1^{W_1}(X)   (5)
In computer vision, ImageNet has been proven to be useful for many other tasks and is considered the most commonly used source data set; therefore, the present invention adopts ImageNet as the source data set for pre-training.
In standard transfer learning, the network parameters are adjustable when migrating to the target task, while the network structure is fixed; the goal is to find the optimal network parameters W* for the given network structure α_0:
W* = arg min_W L(Φ_{W,α_0}(X_t), Y_t) + λ‖W‖²   (6)
where X_t and Y_t denote the images and labels of the target domain. Standard transfer learning can therefore be formalized as:
(α_0, W_0) → (α_0, W*)   (7)
where the network structure α_0 is unchanged before and after the optimization, i.e. the network structure does not change.
Step S2, a search space of neural network structures is designed such that the selected neural network structure can be considered as one instance in the search space.
In the embodiment of the present invention, the search space of the neural network structure is designed such that the selected neural network structure can be regarded as one instance in the search space, by expanding the neural network structure selected in step S1 to a larger neural network space.
The invention aims, first, to utilize the transferable property of the initial network structure α_0 and, second, to explore network structures in the neighborhood of α_0. To this end, the invention designs a search space A of network structures such that the initial network structure α_0 can be regarded as one instance in the search space A (i.e., α_0 ∈ A). FIG. 2 shows the relationship among the initial network structure α_0, the target network structure α* and the search space A. According to the definition of the search space, the transfer-learning-based image recognition model with neural network structure self-adjustment of the invention can be further expressed as:
(W*, α*) = arg min_{W, α∈A} L(Φ_{W,α}(X_t), Y_t) + λ‖W‖²   (8)
where O_i (1 ≤ i ≤ K) denotes the set of candidate operations of the i-th layer. The search for the network structure can then be simplified to choosing an appropriate operation for each layer, and the search space can be expressed as:
A = {O_1, O_2, …, O_i, …, O_{K-1}, O_K}   (9)
To ensure α_0 ∈ A, it is required that
C_i^{W_i} ∈ O_i, 1 ≤ i ≤ K.
convolution: convolution with 5 × 5,3 × 3,1 × 1, respectively
Figure BDA0002311282750000113
To represent
Pooling: 3X 3 maximum pooling, 3X 3 average pooling, Global average pooling, individually
Figure BDA0002311282750000114
To represent
And others: equivalent transformation, noise disturbance operation, respectively
Figure BDA0002311282750000115
To represent
Wherein the noise perturbation operation is to add Gaussian noise to the input of the operation, i represents the ith layer,
and step S3, combining and fine-tuning network parameters and a network structure in a search space from the obtained pre-training model, and optimizing to obtain a target neural network structure.
Unlike standard transfer learning, the present invention adjusts both the network parameters and the network structure in the search space, which can be expressed as:
(W*, α*) = arg min_{W, α} L(Φ_{W,α}(X_t), Y_t) + λ‖W‖²
where α ∈ A and the network parameters W are initialized with the pre-trained parameters W_0. In view of (9), the problem can be further simplified to choosing, for each layer, an operation C_i from its candidate set O_i:
α = C_K ∘ C_{K-1} ∘ … ∘ C_1, C_i ∈ O_i
The process of adjusting from the initial network structure α_0 to the target neural network structure α* can then be expressed as:
(α_0, W_0) → (α*, W*)
Preferably, the present invention further obtains the discrete target neural network structure α* by using a soft selection method on the basis of the joint fine-tuning.
According to the above equation (8), finding the optimal neural network structure is simplified to selecting an appropriate operation from the candidate operation set of each layer. However, selecting an operation from a candidate set is a discrete choice and is not differentiable, so it cannot be optimized directly with the gradient-based methods used for DNNs. To address this problem, the present invention relaxes the hard selection problem into a soft selection problem. Specifically, each candidate operation in O_i is associated with a confidence value P ∈ [0,1], where P = 1 indicates that the corresponding operation is definitely taken. Assuming that P can be learned in a data-driven manner, for example, for the i-th layer (1 ≤ i ≤ K), the probability of selecting the 3×3 convolution is defined as:
p_i^{3×3} = exp(θ_i^{3×3}) / Σ_{o∈O_i} exp(θ_i^{o})
where θ_i^{3×3} is a learnable parameter that measures the probability that the i-th layer selects the 3×3 convolution; the probabilities of selecting the other candidate operations are defined in the same way. The neural network during the search can then be expressed as:
Φ_{W,θ}(X) = Ĉ_K ∘ Ĉ_{K-1} ∘ … ∘ Ĉ_1(X)
where Ĉ_i represents the weighted sum of the candidate operations of the i-th layer, namely:
Ĉ_i(X) = Σ_{o∈O_i} p_i^{o} · C_i^{o}(X)
Since the operation C_i^{W_i} of the initial network structure has already been pre-trained, it has a greater capacity at the start of the search than the other candidate operations; therefore the θ corresponding to C_i^{W_i} is initialized to 1 and the θ of the other operations is initialized to 0. The search then becomes:
(W*, θ*) = arg min_{W,θ} L(Φ_{W,θ}(X_t), Y_t) + λ‖W‖²
where Φ_{W,θ}(X) is:
Φ_{W,θ}(X) = Ĉ_K^{W_K,θ_K} ∘ Ĉ_{K-1}^{W_{K-1},θ_{K-1}} ∘ … ∘ Ĉ_1^{W_1,θ_1}(X)
where Ĉ_i^{W_i,θ_i} denotes the weighted-sum operation Ĉ_i with parameters W_i and θ_i.
The invention also adds an operation regularization term R(θ), which penalizes the probability p_i^{W_i} assigned to the operation C_i^{W_i} of the initial network structure, so as to suppress the operations of the initial network structure and encourage new network operations; in addition, the identity transformation and the noise perturbation operation can reduce the network complexity. Finally, the image recognition framework with neural network structure self-adjustment based on transfer learning can be expressed as:
(W*, θ*) = arg min_{W,θ} L(Φ_{W,θ}(X_t), Y_t) + λ‖W‖² + γR(θ)
where γ is the weight of the operation regularization term.
in a specific embodiment of the present invention, a random gradient descent method may be used to solve this problem.
In the present invention, obtaining the target network structure α* is reduced to selecting an operation C_i from the candidate operation set O_i of each layer. Through the search, θ is obtained, which measures the probability that each operation of the i-th layer is selected. Therefore, the optimal operation is the candidate operation with the highest probability:
C_i* = arg max_{o∈O_i} p_i^{o}
after the optimal operation is selected, the target neural network structure α is obtained*
In step S4, fine-tuning of the network parameters in the target domain is performed on the target neural network structure α* obtained in step S3, i.e. the weights are adjusted on the basis of the network structure and the weights obtained in the previous step, and training is continued.
Preferably, after step S4, the method further includes the following steps:
and inputting the image to be recognized into the trained image recognition model for image processing to obtain an image recognition result. If the image recognition model is trained on ImageNet datasets, it can classify the images.
In summary, the image recognition method and system with neural network structure self-adjustment of the present invention combine the transfer learning technology with the neural network structure adjustment technology, utilize the representations learned by pre-training in transfer learning, adaptively adjust the neural network structure for different tasks and data, and jointly optimize the neural network structure and the neural network parameters to improve the image recognition performance of the model.
The foregoing embodiments are merely illustrative of the principles and utilities of the present invention and are not intended to limit the invention. Modifications and variations can be made to the above-described embodiments by those skilled in the art without departing from the spirit and scope of the present invention. Therefore, the scope of the invention should be determined from the following claims.

Claims (10)

1. An image recognition model with a neural network structure self-adjusting, comprising:
the pre-training model generating unit is used for constructing a neural network model structure and pre-training on a source domain based on standard transfer learning to obtain a pre-training model;
a search space design unit for designing a search space of the neural network model structure such that the selected neural network model structure can be regarded as one instance in the search space;
the combined fine tuning unit is used for combining and fine tuning network parameters and a network structure in a search space from the obtained pre-training model, and obtaining a target neural network structure after optimization;
and the network parameter fine tuning unit is used for further fine tuning the network parameters of the target neural network structure obtained by the combined fine tuning unit.
2. The model of claim 1, wherein the pre-training model generation unit is configured, after constructing the neural network model structure, to take a given network model α_0 and train it with the source data set ImageNet to obtain the model parameters W_0 of the pre-trained network.
3. The self-adjusting image recognition model of claim 2, wherein the neural network model structure adopts a ResNet50 neural network structure.
4. The model for image recognition with self-adjustment of neural network structure as claimed in claim 2, wherein the pre-training model is expressed as:
Φ_{W_0,α_0}(X) = C_K^{W_K} ∘ C_{K-1}^{W_{K-1}} ∘ … ∘ C_1^{W_1}(X)
where Φ(·) represents the nonlinear function of the neural network, X is the input of the neural network, W = {W_1, W_2, …, W_i, …, W_{K-1}, W_K} denotes the parameters of the neural network, K denotes the depth of the neural network, α_0 is the given network model, C_i^{W_i} denotes the convolution operation with W_i as its convolution kernel, and ∘ is the operation composition symbol.
5. The self-adjusting image recognition model of neural network structure as claimed in claim 4, wherein, in the pre-training model generation unit, standard transfer learning is formulated as:
W* = arg min_W L(Φ_{W,α_0}(X_t), Y_t) + λ‖W‖²
where W* represents the optimal network parameters for the given network structure α_0, and X_t and Y_t denote the target-domain images and labels.
6. The model of claim 2, wherein the search space design unit expands the neural network structure selected by the pre-training model generation unit to a larger neural network space.
7. The model of claim 6, wherein the search space is represented by:
A = {O_1, O_2, …, O_i, …, O_{K-1}, O_K}
where O_i (1 ≤ i ≤ K) represents the set of candidate operations of the i-th layer.
8. The model for image recognition of self-adjustment of neural network structure as claimed in claim 1, wherein the joint fine-tuning unit obtains the discrete target neural network structure α by using soft-selection method based on the joint fine-tuning*
9. A self-adjusting image recognition model of neural network architecture as claimed in claim 1, wherein the image recognition model is trained on an ImageNet data set, and the trained image recognition model realizes the classification of the input images.
10. An image recognition method for self-adjusting a neural network structure comprises the following steps:
step S1, constructing a neural network model structure, and pre-training on a source domain based on standard transfer learning to obtain a pre-training model;
step S2, designing a search space of the neural network structure, so that the selected neural network structure can be regarded as an example in the search space;
step S3, starting from the obtained pre-training model, combining and fine-tuning network parameters and a network structure in a search space, and obtaining a target neural network structure after optimization;
step S4 is to perform fine adjustment of network parameters in the target domain for the target neural network structure obtained in step S3.
CN201911259716.2A 2019-12-10 2019-12-10 Image recognition model and method with neural network structure self-adjusting function Pending CN111062465A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911259716.2A CN111062465A (en) 2019-12-10 2019-12-10 Image recognition model and method with neural network structure self-adjusting function

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911259716.2A CN111062465A (en) 2019-12-10 2019-12-10 Image recognition model and method with neural network structure self-adjusting function

Publications (1)

Publication Number Publication Date
CN111062465A true CN111062465A (en) 2020-04-24

Family

ID=70300431

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911259716.2A Pending CN111062465A (en) 2019-12-10 2019-12-10 Image recognition model and method with neural network structure self-adjusting function

Country Status (1)

Country Link
CN (1) CN111062465A (en)

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111931854A (en) * 2020-08-12 2020-11-13 北京建筑大学 Method for improving portability of image recognition model
CN111931854B (en) * 2020-08-12 2021-03-23 北京建筑大学 Method for improving portability of image recognition model
CN113344932A (en) * 2021-06-01 2021-09-03 电子科技大学 Semi-supervised single-target video segmentation method
CN113887546A (en) * 2021-12-08 2022-01-04 军事科学院系统工程研究院网络信息研究所 Method and system for improving image recognition accuracy
CN113887546B (en) * 2021-12-08 2022-03-11 军事科学院系统工程研究院网络信息研究所 Method and system for improving image recognition accuracy

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20200424)