CN114972928A

CN114972928A - Image recognition model training method and device

Info

Publication number: CN114972928A
Application number: CN202210883039.7A
Authority: CN
Inventors: 钟雨崎; 凌明; 杨作兴; 艾国
Original assignee: Shenzhen MicroBT Electronics Technology Co Ltd
Current assignee: Shenzhen MicroBT Electronics Technology Co Ltd
Priority date: 2022-07-26
Filing date: 2022-07-26
Publication date: 2022-08-30
Anticipated expiration: 2042-07-26
Also published as: CN114972928B

Abstract

The invention provides an image recognition model training method and device, wherein the method comprises the following steps: acquiring a training data set of a target to be recognized; training an original model according to a training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of all original model parameters of each round of model, obtaining the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model, and performing model training; and the parameter values of the image recognition model parameters included by the image recognition model are the parameter values of the limiting model parameters adopted by the last round of model training. The device is used for executing the method. The image recognition model training method and device provided by the embodiment of the invention improve the training efficiency of the image recognition model.

Description

Image recognition model training method and device

Technical Field

The invention relates to the technical field of computers, in particular to an image recognition model training method and device.

Background

Deep learning is a research direction of machine learning, and is applied to image recognition, character recognition, voice recognition, semantic analysis and the like.

The deep learning technology can be applied to image recognition, the training of an image recognition model needs to be carried out firstly, and the trained image recognition model is put into practical application. In the training process of the image recognition model, it is found that the loss function is reduced more and more slowly in the later stage of model training, the training period is prolonged, and the training efficiency of the image recognition model is low.

Disclosure of Invention

For solving the problems in the prior art, embodiments of the present invention provide a method and an apparatus for training an image recognition model, which can at least partially solve the problems in the prior art.

In a first aspect, the present invention provides a training method for an image recognition model, including:

acquiring a training data set of a target to be recognized;

training an original model according to a training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of the original model parameters of each round of model, obtaining the parameter values of the limited model parameters corresponding to the original model parameters of each round of model, and performing model training; and the parameter values of the image recognition model parameters included by the image recognition model are the parameter values of the limiting model parameters adopted by the last round of model training.

In a second aspect, the present invention provides an image recognition model training apparatus, including:

the acquisition unit is used for acquiring a training data set of a target to be recognized;

the training unit is used for training the original model according to a training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of all original model parameters of each round of model, and obtaining the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model to train the model; and the parameter values of the image recognition model parameters included by the image recognition model are the parameter values of the limiting model parameters adopted by the last round of model training.

In a third aspect, the present invention provides a computer device, including a memory, a processor, and instructions stored on the memory and executable on the processor, where the processor implements the image recognition model training method according to any one of the above embodiments when executing the program.

In a fourth aspect, the present invention provides a computer-readable storage medium, having stored thereon instructions, which when executed by a processor, implement the image recognition model training method according to any of the above embodiments.

The image recognition model training method and device provided by the embodiment of the invention have the advantages that the training data set of the target to be recognized is obtained, the original model is trained according to the training data set to obtain the image recognition model, the parameter value of each original model parameter of each model is limited in each round of training process to obtain the limited model corresponding to each original model parameter of each model and perform model training, the parameter value of the image recognition model parameter included in the image recognition model is the parameter value of the limited model parameter adopted in the last round of model training, and the training efficiency of the image recognition model is improved due to the limited number of the parameter values of the limited model parameter.

Drawings

In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to the drawings without creative efforts.

Fig. 1 is a schematic flowchart of a training method for an image recognition model according to a first embodiment of the present invention.

Fig. 2 is a flowchart illustrating an image recognition model training method according to a second embodiment of the present invention.

Fig. 3 is a schematic structural diagram of an image recognition model training apparatus according to a third embodiment of the present invention.

Fig. 4 is a schematic structural diagram of an image recognition model training apparatus according to a fourth embodiment of the present invention.

Fig. 5 is a schematic physical structure diagram of an electronic device according to a fifth embodiment of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention. It should be noted that the embodiments and features of the embodiments in the present application may be arbitrarily combined with each other without conflict.

In order to facilitate understanding of the technical solutions provided in the present application, the following first describes relevant contents of the technical solutions in the present application.

For model training, the training speed of the model and whether the model is over-fitted need to be considered. The model training speed is related to the development period of the product, and whether the model is over-fitted or not influences the intelligence of the product.

The training range of the model parameters is 32-bit floating point number, and the numerical range is [ -3.4 × 10 [ ] ³⁸ ,3.4×10 ³⁸ ]The numerical representation is very broad. In actual training, the model weights are usually of the value [ -2.0,2.0 [ -2.0 [ ]]The expression range can be regarded as infinite, namely the expression granularity is 1/∞, and the information contained in each granularity is rare. When the number of training samples is insufficient, an overfitting phenomenon of the model easily occurs.

In many model training processes, it is found that after a model is trained for a certain period of time, because the expression range is infinite, the model parameters are updated only by changing the numerical values of the model parameters and the model parameters do not really go to the learning task target, so that the loss function is reduced more and more slowly and the training period is lengthened in the later stage of model training.

Therefore, the embodiment of the invention provides an image recognition model training method, which limits model parameters to accelerate convergence of the model parameters, improves the model training speed and can effectively avoid overfitting of the model.

The following describes a specific implementation process of the image recognition model training method provided by the embodiment of the present invention with a server as an execution subject. It can be understood that the execution subject of the image recognition model training method provided by the embodiment of the invention is not limited to the server.

Fig. 1 is a schematic flow chart of an image recognition model training method according to a first embodiment of the present invention, and as shown in fig. 1, the image recognition model training method according to the embodiment of the present invention includes:

s101, acquiring a training data set of a target to be recognized;

specifically, the server may obtain a training data set of the target to be recognized, where the training data set is used for model training. The target to be identified is selected according to actual needs, and the embodiment of the invention is not limited.

For example, in order to obtain a face recognition model through training, a preset number of face pictures can be collected and scaled to a uniform size, then each face picture is labeled, and the preset number of face pictures with uniform size and corresponding labels form a training data set.

S102, training an original model according to a training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of all original model parameters of each round of model, obtaining the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model, and performing model training; the image recognition model comprises image recognition model parameters which are parameter values for limiting model parameters adopted by the last round of model training.

After obtaining the image recognition model, the target to be recognized may be recognized according to the image recognition model. For example, if the training data set is a picture of a human face, the target to be recognized is a picture, and the human face is recognized by the image recognition model.

Specifically, the server trains the original model according to the training data set, and the image recognition model can be obtained through training. The model training process has multiple rounds of training, in each round of training process, the parameter values of all original model parameters of each round of model can be obtained, the parameter values of all the original model parameters are limited, the value number of the parameter values of the original model parameters is limited, the expression granularity of the parameter values of all the original model parameters is improved, the parameter values of the limited model parameters corresponding to all the original model parameters are obtained, model training is carried out through the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model, and the parameter values of all the original model parameters are updated. The training speed of the model is accelerated due to the limited number of parameter values for limiting the model parameters. And when the end condition of the model training is met, the server acquires the parameter value of the limiting model parameter used by the last round of model training as the parameter value of the image recognition model parameter of the image recognition model. The original model may be a convolutional neural network model, and is selected according to actual needs, which is not limited in the embodiments of the present invention. The first round model is an original model, and parameter values of original model parameters of the original model can be randomly generated or preset. From the second round of training, each round of model is the model with the updated parameter values of the original model parameters, and the parameter values of the original model parameters of each round of model are the updated parameter values of the original model parameters of the previous round of model. In the embodiment of the present invention, the original model parameters refer to parameters that need to be updated in the training process, and include, but are not limited to, parameters such as weight and offset.

The image recognition model training method provided by the embodiment of the invention comprises the steps of obtaining a training data set of a target to be recognized, training an original model according to the training data set to obtain an image recognition model, limiting parameter values of original model parameters of each round of model in each round of training process to obtain a limiting model corresponding to each original model parameter of each round of model, and performing model training, wherein the image recognition model comprises the parameter values of the image recognition model parameters which are the parameter values of the limiting model parameters adopted in the last round of model training, and the training efficiency of the image recognition model is improved because the quantity of the parameter values of the limiting model parameters is limited. In addition, due to the fact that the number of the parameter values limiting the model parameters is limited and the model parameters can express more information, the phenomenon of overfitting of the model can be effectively avoided.

Fig. 2 is a schematic flow chart of an image recognition model training method according to a second embodiment of the present invention, and as shown in fig. 2, in addition to the above embodiments, the performing model training by limiting parameter values of original model parameters of each model in each round of training to obtain parameter values of limited model parameters corresponding to the original model parameters of each model in each round of training further includes:

s201, obtaining the maximum value of the parameter values of all original model parameters of the current wheel model, and rounding to obtain an adjustment reference value;

specifically, the server may obtain a parameter value of each original model parameter of the current wheel model, compare the parameter values of the original model parameters, obtain a maximum value of the parameter values of the original model parameters, and then perform rounding on the maximum value of the parameter values of the original model parameters to obtain an adjustment reference value. The current round model refers to a currently trained model. Since the parameter values of the model are updated every round of training, the parameter values of the original model parameters of each round of model vary.

S202, obtaining an amplification parameter value corresponding to each original model parameter of the current wheel model according to the parameter value, the maximum expression particle size value and the adjustment reference value of each original model parameter of the current wheel model;

specifically, for each original model parameter of the current wheel model, the server may obtain an amplification parameter value corresponding to the original model parameter according to the parameter value, the maximum expression particle size value, and the adjustment reference value of the original model parameter. The amplification parameter value may be an integer value. The range of values of the amplification parameter value is predetermined and includes a limited number of values.

S203, restoring the amplification parameter value corresponding to each original model parameter of the current wheel model to obtain the parameter value of the limiting model parameter corresponding to each original model parameter of the current wheel model;

specifically, for the amplification parameter value corresponding to each original model parameter of the current wheel model, the server may restore the amplification parameter value corresponding to each original model parameter, so that the amplification parameter value corresponding to each original model parameter is reduced to the size range of the parameter value of the original model parameter, and the parameter value of the constraint model parameter corresponding to each original model parameter is obtained.

S204, performing current wheel model training according to the parameter values of the limiting model parameters corresponding to the original model parameters of the current wheel model, and updating the parameter values of the original model parameters of the current wheel model to obtain the parameter values of the original model parameters of the next wheel model.

Specifically, the server performs the current wheel model training based on the parameter values of the constraint model parameters corresponding to the original model parameters of the current wheel model, that is, the model obtained by replacing the parameter values of the constraint model parameters corresponding to the original model parameters of the current wheel model with the parameter values of the constraint model parameters corresponding to the original model parameters of the current wheel model is used for performing the model training. When the parameters are updated, the parameter values of all original model parameters of the current wheel model are updated, and the updated parameter values of all original model parameters of the current wheel model are used as the parameter values of all original model parameters of the next wheel model of the current wheel model. The calculation of loss functions, gradients, and the like involved in the model training process is the same as that in the prior art, and is not described herein again.

On the basis of the foregoing embodiments, further, the obtaining, according to the parameter value, the maximum expression particle size value, and the adjustment reference value of each original model parameter of the current wheel model, an amplification parameter value corresponding to each original model parameter of the current wheel model includes:

according to the formula

Calculating to obtain an amplification parameter value corresponding to the ith original model parameter of the current wheel model,wherein,

represents the amplification parameter value corresponding to the ith original model parameter of the current wheel model,

a parameter value representing an ith original model parameter of the current wheel model,

it is indicated that the reference value is adjusted,

the value of the maximum expressed particle size is expressed,

is a constant number of times, and is,

the method is characterized in that rounding is performed, i is a positive integer, and i is less than or equal to the number of original model parameters of the current wheel model.

Specifically, the server compares the parameter value of the ith original model parameter of the current wheel model

Adjusting the reference value

Maximum expressed particle size value

Into formulas

In the method, the amplification parameter value corresponding to the ith original model parameter of the current wheel model can be obtained through calculation

Obtained by calculation

Are integers. The maximum expression particle size value is set according to actual needs, and the embodiment of the invention is not limited. The parameter value of the original model parameter is amplified to be mapped into the range interval of the amplified parameter value, so that the value number of the parameter value is limited.

For example, setting the amplification parameter value to a range of [ -100,100], then the maximum expression particle size value is 100.

For example, the range of the amplification parameter value can be set according to the final task difficulty, such as image classification, and finally divided into 1000 classes, and if each class is expressed by 200 numerical values, the range of the amplification parameter value can be set to be [ -100 × 1000,100 × 1000], and the maximum expression particle size value is 100000.

On the basis of the foregoing embodiments, further, the image recognition model training method provided in the embodiment of the present invention further includes:

if judged to acquire

Is greater than

Then will be

Value replacement by

；

If judged to acquire

Is less than

Then will be

Value replacement by

。

Specifically, in calculating

Then, will

Absolute value of and

make a comparison if

Is greater than

Description of the invention

Out of the limit range if

At a positive value, then

Value replacement by

If, if

Negative values then will

Value replacement by-

. Limiting

Maximum and minimum values of the values are to prevent

After amplification, the calculation result is beyond the numerical range allowed by the computer.

For example, the amplification parameter value is set to a range of [ -100,100 [ ]]Then the maximum expression particle size value is 100. If obtained by calculation

105, then will

The value of (1) is changed to 100; if obtained by calculation

Is-108, then will

The value of (A) is changed to-100.

On the basis of the foregoing embodiments, further, the reducing the amplification parameter value corresponding to each original model parameter of the current wheel model, and obtaining the parameter value of the constraint model parameter corresponding to each original model parameter of the current wheel model includes:

according to the formula

Calculating to obtain the parameter value of the limited model parameter corresponding to the ith original model parameter of the current wheel model, wherein,

a parameter value of a constraint model parameter corresponding to an ith original model parameter representing a current wheel model,

is shown asThe amplification parameter value corresponding to the ith original model parameter of the front wheel model,

it is indicated that the reference value is adjusted,

the value of the maximum expressed particle size is expressed,

and the number of the original model parameters of the current wheel model is less than or equal to the number of the original model parameters of the current wheel model.

Specifically, the server amplifies a parameter value corresponding to the ith original model parameter of the current wheel model

Adjusting the reference value

Maximum expressed particle size value

Into formulas

In the method, the parameter value of the limited model parameter corresponding to the ith original model parameter of the current wheel model can be calculated and obtained

. Limiting parameter values of model parameters

For model training.

On the basis of the above embodiments, further, the original model parameters are weights and offsets.

For example, for a deep learning model, a plurality of neurons are included, each neuron corresponding to at least one weight and one bias. The proto-model parameters of the deep learning model are the weights and biases of all neurons.

The following describes a specific implementation process of the image recognition model training method provided by the embodiment of the present invention, taking a training process of a handwritten number recognition model as an example.

And collecting N handwritten digital pictures, and taking the number written in each handwritten digital picture as a label corresponding to the corresponding handwritten digital picture to obtain a training data set.

The original model adopts a Convolutional Neural Network (CNN) model, a Deep Neural network (Deep DNN) model, and the like, and is selected according to actual needs, which is not limited in the embodiments of the present invention. The following description will take the original model as an example, where the CNN model includes a first feature extraction layer, a second feature extraction layer and a classifier, and the first feature extraction layer includes weights

And bias

The second feature extraction layer includes weights

And bias

The classifier includes weights

And bias

。

Setting the range of the amplification parameter values to be [ -100,100], and setting the total number of the amplification parameter values to be 201. The maximum expression particle size value was 100.

In the process of model training based on the training data set, the first round of training is performed, the first round of model is a CNN model, and a primary model of the first round of modelForm parameter

、

、

、

And

the parameter values of (a) are randomly generated. Obtaining

、

、

、

And

and (4) rounding the maximum value of the parameter values of the six parameters to obtain an adjustment reference value of the first round of training.

According to the formula

Will be

、

、

、

And

the parameter values of the six parameters are respectively brought into the formula and can be obtained by calculation

、

、

、

And

the corresponding amplification parameter values are respectively recorded as

、

、

、

、

And

. According to the formula

Will be

、

、

、

、

And

are respectively substituted into the formula to obtain

、

、

、

And

the corresponding parameter values of the constraint model parameters are respectively recorded as

、

And

。

by using

、

And

replacement in CNN model

、

、

、

And

obtaining a training model corresponding to the first round of models, performing model training on the training model corresponding to the first round of models, performing correlation calculation, and updating after calculating to obtain a gradient

、

、

、

And

of the parameter values of

、

、

、

And

and taking the updated parameter value as the parameter value of the original model parameter of the second round model. The second round model is the CNN model is updated

、

、

And, and

a model of the parameter values of (a).

From the second round of training to the end of model training, the specific process of each round of training is similar to the process of the first round of training, which is not repeated herein, and when the calculated value of the loss function reaches the preset requirement, the model training is ended. After the model training is finished, the training model corresponding to the model in the last round can be used as a handwritten number recognition model.

The following describes a specific implementation process of the image recognition model training method provided by the embodiment of the present invention, taking a training process of an object recognition model for recognizing an object from a picture as an example.

And collecting M object pictures, and taking the object in each object picture as an object label corresponding to each object picture to obtain an object identification training data set.

The original model adopts a RESNET50 model, and the RESNET50 model comprises 49 convolutional layers and 1 full-connection layer. Each convolutional layer includes a weight and an offset.

Setting the range of the amplification parameter values to be [ -100000,100000], and setting the total number of the amplification parameter values to be 200001. The maximum expressed particle size value was 100000.

In the process of model training based on the object recognition training data set, the first round of training is conducted, the first round of model is a RESNET50 model, original model parameters of the first round of model are weights and offsets included by each convolutional layer, and parameter values of the weights and the offsets included by each convolutional layer are generated randomly. And acquiring the maximum value of the weight and the biased parameter value included in each convolutional layer, and rounding the maximum value to obtain an adjustment reference value of the first round of training.

According to the formula

The values of the weight and offset parameters included in each convolutional layer are respectively introduced into the formula, and the values of the amplification parameters corresponding to the weight and offset parameters included in each convolutional layer can be calculated and obtained and are respectively recorded as

、

、

、

、

……

. According to the formula

Will be

、

、

、

、

……

Respectively substituting into the formula, calculating to obtain the parameter values of the constraint model parameters corresponding to the weight and bias included in each convolution layer, and respectively recording as

、

……

。

By using

、

……

Replacing the weight and offset parameter values included in each convolution layer of the first round model to obtain a training model corresponding to the first round model, performing model training on a constraint model corresponding to the first round model, performing correlation calculation, after calculating to obtain a gradient, updating the weight and offset parameter values included in each convolution layer of the first round model, and taking the updated weight and offset parameter values included in each convolution layer of the first round model as the original model parameter values of the second round model, namely the weight and offset parameter values included in each convolution layer of the second round model. The second round model updates the model for the RESNET50 model with the parameter values for the weights and biases included in each convolutional layer.

From the second round of training to the end of model training, the specific process of each round of training is similar to the process of the first round of training, which is not repeated herein, and when the calculated value of the loss function reaches the preset requirement, the model training is ended. And after the model training is finished, the training model corresponding to the model in the last round can be used as the object recognition model.

According to the image recognition model training method provided by the embodiment of the invention, because the expressible range of the model parameters is limited, the information contained in the expression granularity of the model parameters is more, the model is regressed to the characteristic analysis of the sample instead of the meaningless value range change, the model convergence can be accelerated, the model training speed is increased, the lifting rate of the model training is in direct proportion to the number of the model parameters, and the lifting rate is increased when the number of the model parameters is more. Because the expression meaning of the model parameters is richer, the possibility of the overfitting of the model is reduced, and the generalization capability of the model is improved.

The image recognition model training method provided by the embodiment of the invention can be applied to tasks such as image classification, face recognition, target detection, human shape detection and the like in the image field; tasks such as voice awakening, voice recognition, voiceprint recognition, voice synthesis and the like in the voice field; the tasks of named body recognition, semantic analysis, text generation, text classification and the like in the field of natural language processing accelerate model training and prevent overfitting of the model.

Fig. 3 is a schematic structural diagram of an image recognition model training apparatus according to a third embodiment of the present invention, and as shown in fig. 3, the image recognition model training apparatus according to the embodiment of the present invention includes an obtaining unit 301 and a training unit 302, where:

the acquiring unit 301 is configured to acquire a training data set of a target to be recognized; the training unit 302 is configured to train the original model according to a training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of all original model parameters of each round of model, and obtaining the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model to train the model; and the parameter values of the image recognition model parameters included by the image recognition model are the parameter values of the limiting model parameters adopted by the last round of model training.

Specifically, the obtaining unit 301 may obtain a training data set of the target to be identified, where the training data set is used for model training. The target to be identified is selected according to actual needs, and the embodiment of the invention is not limited.

The training unit 302 trains the original model according to the training data set, and may train to obtain the image recognition model. The model training process has multiple rounds of training, in each round of training process, the parameter value of each original model parameter of each round of model can be obtained, the parameter value of each original model parameter is limited, the value number of the parameter value of the original model parameter is limited, the expression granularity of the parameter value of each original model parameter is improved, the parameter value of the limited model parameter corresponding to each original model parameter is obtained, model training is carried out through the parameter value of the limited model parameter corresponding to each original model parameter of each round of model, and the parameter value of each original model parameter is updated. The training speed of the model is accelerated due to the limited number of parameter values for limiting the model parameters. And when the end condition of the model training is met, the server acquires the parameter value of the limiting model parameter used by the last round of model training as the parameter value of the image recognition model parameter of the image recognition model. The original model may be a convolutional neural network model, and is selected according to actual needs, which is not limited in the embodiments of the present invention. The first round model is an original model, and parameter values of original model parameters of the original model can be randomly generated or preset. From the second round of training, each round of model is the model with the updated parameter values of the original model parameters, and the parameter values of the original model parameters of each round of model are the updated parameter values of the original model parameters of the previous round of model. In the embodiment of the present invention, the original model parameters refer to parameters that need to be updated in the training process, and include, but are not limited to, parameters such as weight and offset.

The image recognition model training device provided by the embodiment of the invention obtains a training data set of a target to be recognized, trains an original model according to the training data set to obtain an image recognition model, limits the parameter values of each original model parameter of each model in each round of training process to obtain a limited model corresponding to each original model parameter of each model in each round of training and performs model training, wherein the parameter values of the image recognition model parameters included in the image recognition model are the parameter values of the limited model parameters adopted in the last round of model training, and the training efficiency of the image recognition model is improved because the quantity of the parameter values of the limited model parameters is limited. In addition, the number of the parameter values limiting the model parameters is limited, and the model parameters can express more information, so that the phenomenon of overfitting of the model can be effectively avoided.

Fig. 4 is a schematic structural diagram of an image recognition model training apparatus according to a fourth embodiment of the present invention, and as shown in fig. 4, on the basis of the foregoing embodiments, further, the training unit 302 includes an acquiring subunit 3021, an amplifying subunit 3022, a restoring subunit 3023, and a training subunit 3024, where:

the obtaining subunit 3021 is configured to obtain a maximum value of parameter values of each original model parameter of the current wheel model, and perform rounding to obtain an adjustment reference value; the amplification subunit 3022 is configured to obtain an amplification parameter value corresponding to each original model parameter of the current wheel model according to the parameter value, the maximum expression particle size value, and the adjustment reference value of each original model parameter of the current wheel model; the atomic reduction unit 3023 is configured to reduce the amplification parameter value corresponding to each original model parameter of the current wheel model, and obtain a parameter value amplification parameter value of the constraint model parameter corresponding to each original model parameter of the current wheel model; the training subunit 3024 is configured to perform model training on the current wheel model according to the parameter values of the constraint model parameters corresponding to the original model parameters of the current wheel model, and update the parameter values of the original model parameters of the current wheel model to obtain the parameter values of the original model parameters of the next wheel model.

Specifically, the obtaining subunit 3021 may obtain the parameter value of each original model parameter of the current wheel model, compare the parameter values of the original model parameters, obtain the maximum value of the parameter value of each original model parameter, and then perform rounding on the maximum value of the parameter value of each original model parameter to obtain the adjustment reference value. The current round model refers to a currently trained model. Since the parameter values of the model are updated every round of training, the parameter values of the original model parameters of each round of model vary.

For each original model parameter of the current wheel model, the amplification subunit 3022 may obtain an amplification parameter value corresponding to the original model parameter according to the parameter value, the maximum expression particle size value, and the adjustment reference value of the original model parameter. The amplification parameter value may be an integer value. The range of values of the amplification parameter value is predetermined and includes a limited number of values.

For the amplification parameter value corresponding to each original model parameter of the current wheel model, the reduction subunit 3023 may reduce the amplification parameter value corresponding to each original model parameter, so that the amplification parameter value corresponding to each original model parameter is reduced to the size range of the parameter value of the original model parameter, and the parameter value of the limited model parameter corresponding to each original model parameter is obtained. The size range of the parameter values scaled to the original model parameters is to prevent the calculation result from exceeding the computer allowable value range after the parameter values are amplified.

The training subunit 3024 performs model training on the current wheel model based on the parameter values of the constraint model parameters corresponding to the original model parameters of the current wheel model, that is, performs model training by replacing the parameter values of the original model parameters of the current wheel model with the parameter values of the constraint model parameters corresponding to the original model parameters of the current wheel model. When the parameters are updated, the parameter values of all original model parameters of the current wheel model are updated, and the updated parameter values of all original model parameters of the current wheel model are used as the parameter values of all original model parameters of the next wheel model of the current wheel model. The calculation of loss functions, gradients, and the like involved in the model training process is the same as that in the prior art, and is not described herein again.

On the basis of the above embodiments, further, the amplifying subunit 3022 is specifically configured to:

according to the formula

Calculating to obtain an amplification parameter value corresponding to the ith original model parameter of the current wheel model, wherein,

it is indicated that the reference value is adjusted,

the value of the maximum expressed particle size is expressed,

is a constant number of times, and is,

Specifically, the amplification subunit 3022 converts the parameter value of the ith original model parameter of the current wheel model into the parameter value of the ith original model parameter

Adjusting the reference value

Maximum expressed particle size value

Into formulas

Obtained by calculation

Is an integer. The maximum expression particle size value is set according to actual needs, and the embodiment of the invention is not limited. The parameter value of the original model parameter is amplified to be mapped into the range interval of the amplified parameter value, so that the value number of the parameter value is limited.

On the basis of the above embodiments, further, the amplifying subunit 3022 is further configured to:

if judged to acquire

Is greater than

Then will be

Take a value of

；

If judged to acquire

Is less than

Then will be

Take a value of

。

Specifically, in calculating

Thereafter, the amplification subunit 3022 will amplify the signal

Absolute value of and

make a comparison if

Is greater than

Description of the invention

Out of the limit range if

At a positive value, then

Value replacement by

If, if

Negative values then will

Value replacement

. On the basis of the above embodiments, further, the reducing subunit 3023 is specifically configured to:

according to the formula

it is indicated that the reference value is adjusted,

the value of the maximum expressed particle size is expressed,

Specifically, the reduction subunit 3023 uses the amplification parameter value corresponding to the ith original model parameter of the current wheel model

Adjusting the reference value

Maximum expressed particle size value

Into formulas

. Limiting parameter values of model parameters

For model training.

The embodiment of the apparatus provided in the embodiment of the present invention may be specifically configured to execute the processing flows of the above method embodiments, and the functions of the apparatus are not described herein again, and refer to the detailed description of the above method embodiments.

Fig. 5 is a schematic physical structure diagram of an electronic device according to a fifth embodiment of the present invention, and as shown in fig. 5, the electronic device may include: a processor (processor)501, a communication Interface (Communications Interface)502, a memory (memory)503, and a communication bus 504, wherein the processor 501, the communication Interface 502, and the memory 503 are configured to communicate with each other via the communication bus 504. The processor 501 may call logic instructions in the memory 503 to perform the following method: acquiring a training data set of a target to be recognized; training an original model according to a training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of all original model parameters of each round of model, obtaining the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model, and performing model training; and the parameter values of the image recognition model parameters included by the image recognition model are the parameter values of the limiting model parameters adopted by the last round of model training.

In addition, the logic instructions in the memory 503 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.

The present embodiment discloses a computer program product comprising a computer program stored on a non-transitory computer readable storage medium, the computer program comprising program instructions which, when executed by a computer, enable the computer to perform the method provided by the above-mentioned method embodiments, for example, comprising: acquiring a training data set of a target to be recognized; training an original model according to a training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of all original model parameters of each round of model, obtaining the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model, and performing model training; and the parameter values of the image recognition model parameters included by the image recognition model are the parameter values of the limiting model parameters adopted by the last round of model training.

The present embodiment provides a computer-readable storage medium, which stores instructions that cause the computer to execute the method provided by the above method embodiments, for example, including: acquiring a training data set of a target to be recognized; training an original model according to a training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of all original model parameters of each round of model, obtaining the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model, and performing model training; and the parameter values of the image recognition model parameters included by the image recognition model are the parameter values of the limiting model parameters adopted by the last round of model training.

As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In the description herein, reference to the description of the terms "one embodiment," "a particular embodiment," "some embodiments," "for example," "an example," "a particular example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.

The above-mentioned embodiments are intended to illustrate the objects, technical solutions and advantages of the present invention in further detail, and it should be understood that the above-mentioned embodiments are only exemplary embodiments of the present invention, and are not intended to limit the scope of the present invention, and any modifications, equivalent substitutions, improvements and the like made within the spirit and principle of the present invention should be included in the scope of the present invention.

Claims

1. An image recognition model training method is characterized by comprising the following steps:

acquiring a training data set of a target to be recognized;

training an original model according to a training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of all original model parameters of each round of model, obtaining the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model, and performing model training; and the parameter values of the image recognition model parameters included by the image recognition model are the parameter values of the limiting model parameters adopted by the last round of model training.

2. The method according to claim 1, wherein the limiting of the parameter values of the original model parameters of each model in each round of training process, and the obtaining of the parameter values of the limited model parameters corresponding to the original model parameters of each model in each round of training and the model training comprises:

acquiring the maximum value of the parameter values of all original model parameters of the current wheel model, and rounding to obtain an adjustment reference value;

obtaining an amplification parameter value corresponding to each original model parameter of the current wheel model according to the parameter value, the maximum expression particle size value and the adjustment reference value of each original model parameter of the current wheel model;

restoring the amplification parameter value corresponding to each original model parameter of the current wheel model to obtain the parameter value of the limiting model parameter corresponding to each original model parameter of the current wheel model;

and performing current wheel model training according to the parameter values of the limiting model parameters corresponding to the original model parameters of the current wheel model, and updating the parameter values of the original model parameters of the current wheel model to obtain the parameter values of the original model parameters of the next wheel model.

3. The method of claim 2, wherein obtaining the amplified parameter value corresponding to each original model parameter of the current wheel model according to the parameter value, the maximum expression particle size value and the adjustment reference value of each original model parameter of the current wheel model comprises:

according to the formula

it is indicated that the reference value is adjusted,

the value of the maximum expressed particle size is expressed,

is a constant number of times, and is,

4. The method of claim 3, further comprising:

if judged to acquire

Is greater than

Then will be

Take a value of

；

If judged to acquire

Is less than

Then will be

Take a value of

。

5. The method of claim 2, wherein the restoring the amplified parameter value corresponding to each original model parameter of the current wheel model to obtain the parameter value of the constrained model parameter corresponding to each original model parameter of the current wheel model comprises:

according to the formula

ith original model parameter correspondence representing current wheel modelThe value of the amplification parameter of (a),

it is indicated that the reference value is adjusted,

the value of the maximum expressed particle size is expressed,

6. The method according to any of claims 1 to 5, wherein the proto-model parameters are weights and biases.

7. An image recognition model training apparatus, comprising:

the training unit is used for training the original model according to the training data set to obtain an image recognition model; in each round of training process, limiting the parameter values of all original model parameters of each round of model, obtaining the parameter values of the limited model parameters corresponding to all the original model parameters of each round of model, and performing model training; and the parameter values of the image recognition model parameters included by the image recognition model are the parameter values of the limiting model parameters adopted by the last round of model training.

8. The apparatus of claim 7, wherein the training unit comprises:

the acquisition subunit is used for acquiring the maximum value of the parameter values of all the original model parameters of the current wheel model and rounding the maximum value to obtain an adjustment reference value;

the amplifying subunit is used for obtaining an amplifying parameter value corresponding to each original model parameter of the current wheel model according to the parameter value, the maximum expression particle size value and the adjustment reference value of each original model parameter of the current wheel model;

the restoring subunit is used for restoring the amplification parameter value corresponding to each original model parameter of the current wheel model to obtain a parameter value amplification parameter value of the limited model parameter corresponding to each original model parameter of the current wheel model;

and the training subunit is used for performing model training on the current wheel model according to the parameter values of the limited model parameters corresponding to the original model parameters of the current wheel model, updating the parameter values of the original model parameters of the current wheel model, and obtaining the parameter values of the original model parameters of the next wheel model.

9. The apparatus of claim 8, wherein the amplification subunit is specifically configured to:

according to the formula

it is indicated that the reference value is adjusted,

the value of the maximum expressed particle size is expressed,

is a constant number of times, and is,

10. The apparatus of claim 9, wherein the amplification subunit is further configured to:

if judged to acquire

Is greater than

Then will be

Take a value of

；

If judged to acquire

Is less than

Then will be

Take a value of

。

11. The apparatus of claim 8, wherein the further atomic unit is specifically configured to:

according to the formula

Calculating to obtain the parameter value of the limited model parameter corresponding to the ith original model parameter of the current wheel model, wherein the parameter value of the limited model parameter corresponding to the ith original model parameter of the current wheel model is represented,

it is indicated that the reference value is adjusted,

the value of the maximum expressed particle size is expressed,

12. The apparatus according to any of the claims 7 to 11, characterized in that the proto-model parameters are weights and biases.

13. A computer device comprising a memory, a processor, and instructions stored on the memory and executable on the processor, wherein the processor implements the method of any of claims 1 to 6 when executing the instructions.

14. A computer-readable storage medium, characterized in that the computer-readable storage medium stores instructions that, when executed by a processor, implement the method of any of claims 1 to 6.