CN111814962B - Parameter acquisition method and device for identification model, electronic equipment and storage medium - Google Patents

Parameter acquisition method and device for identification model, electronic equipment and storage medium

Info

Publication number
CN111814962B
CN111814962B (application number CN202010656659.8A)
Authority
CN
China
Prior art keywords
standard
data set
identification model
loss function
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010656659.8A
Other languages
Chinese (zh)
Other versions
CN111814962A (en)
Inventor
凡金龙
刘莉红
刘玉宇
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN202010656659.8A
Publication of CN111814962A
Priority to PCT/CN2020/131974
Application granted
Publication of CN111814962B
Legal status: Active

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/04: Architecture, e.g. interconnection topology
    • G06N3/045: Combinations of networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The invention relates to data processing technology, and discloses a parameter acquisition method for an identification model, which comprises the following steps: acquiring a training data set containing noise labels, and performing data standardization processing on the training data set to obtain a standard data set; establishing an identification model, and training the identification model by using the standard data set to obtain a standard identification model containing initialization parameters; constructing a noise probability transition matrix of the standard data set; constructing a loss function based on the noise probability transition matrix; and calculating the update parameters of the standard identification model by using the loss function, and replacing the initialization parameters with the update parameters. In addition, the invention relates to blockchain technology, and the training data set may be stored in blockchain nodes. The invention also discloses a parameter acquisition device for the identification model, an electronic device, and a storage medium. The invention can improve the accuracy of the acquired model parameters.

Description

Parameter acquisition method and device for identification model, electronic equipment and storage medium
Technical Field
The present invention relates to the field of data processing technologies, and in particular, to a method and apparatus for acquiring parameters of an identification model, an electronic device, and a computer readable storage medium.
Background
With the rise of artificial intelligence, more and more technicians train models with labeled data to acquire the required model parameters, and then use those parameters so that the models can realize specific functions. However, training a model usually requires massive amounts of labeled data, and manual labeling is not only inefficient but also produces a large number of false labels, i.e. noise labels, during the labeling process; accurate model parameters cannot be obtained when a model is trained with noise-labeled data.
Therefore, how to train a model with noise-labeled data and still obtain accurate model parameters is a problem receiving increasing attention.
Disclosure of Invention
The invention provides a parameter acquisition method and device for an identification model, an electronic device, and a computer-readable storage medium, with the main aim of improving the accuracy of the acquired model parameters.
In order to achieve the above object, the present invention provides a method for obtaining parameters of an identification model, including:
acquiring a training data set containing noise labels, and performing data standardization processing on the training data set to obtain a standard data set;
establishing an identification model based on a multi-layer deep neural network, and training the identification model by utilizing the standard data set to obtain a standard identification model containing initialization parameters;
constructing a noise probability transition matrix of the standard data set;
constructing a loss function based on the noise probability transition matrix;
and calculating the update parameters of the standard identification model by using the loss function, and replacing the initialization parameters with the update parameters.
Optionally, the data normalization processing is performed on the training data set, including one or a combination of several of the following:
Removing unique attribute values in the training data set;
Filling the missing values of the training data set;
and carrying out data normalization on the training data set.
Optionally, the performing data normalization on the training data set includes:
Data normalization of the training data set is performed using the following normalization algorithm:

x = (S_old - S_min) / (S_max - S_min)

where x is the standard data after data normalization, S_old is the data in the training data set, S_max is the maximum of the S_old values, and S_min is the minimum of the S_old values.
Optionally, the noise probability transition matrix includes:
Q ∈ [0,1]^(c×c)

where c is the same as the number of standard data in the standard data set.
Optionally, the loss function includes a forward loss function, the forward loss function being:
l^→(h(x)) = ψ(Q^T h(x))

where Q^T is the transpose of the noise probability transition matrix, ψ is the error factor of the recognition model, h is the multi-layer deep neural network, and l^→(h(x)) is the loss value of the forward loss function.
Optionally, the loss function further includes a backward loss function, and the backward loss function is:
l(h) = E_(x,y)~p̂(x,y)[ Q^(-1) l(h(x), y) ]

where l(h) is the loss value of the backward loss function, y is the preset standard label of any standard data x in the standard data set, h(x) is the predictive label of the standard recognition model for x, Q is the noise probability transition matrix, p(x, y) is the joint distribution of the standard data x and the preset standard label y corresponding to x, and p̂(x, y) is the predicted value of p(x, y).
Optionally, the calculating the update parameter of the standard identification model by using the loss function includes:
Acquiring a preset standard label of standard data in the standard data set and a prediction label of the standard recognition model on the standard data in the standard data set;
calculating a difference value between the predictive label and the standard label by using a loss function;
When the difference value is within a preset threshold value interval, calculating an update parameter of the standard identification model by using a gradient descent algorithm;
When the difference value is larger than the upper limit of the threshold interval, calculating a probability value of the standard label as a noise label by using the loss function;
and when the probability value is smaller than a preset probability threshold value, calculating the updating parameters of the standard identification model by using a gradient descent algorithm.
In order to solve the above-mentioned problem, the present invention also provides a parameter acquisition apparatus for identifying a model, the apparatus comprising:
the training data acquisition module is used for acquiring a training data set containing a noise label, and carrying out data standardization processing on the training data set to obtain a standard data set;
The recognition model construction module is used for building a recognition model based on the multi-layer deep neural network, and training the recognition model by utilizing the standard data set to obtain a standard recognition model containing initialization parameters;
The transition matrix construction module is used for constructing a noise probability transition matrix of the standard data set;
the loss function construction module is used for constructing a loss function based on the noise probability transition matrix;
And the model parameter updating module is used for calculating the update parameters of the standard identification model by using the loss function, and replacing the initialization parameters with the update parameters.
In order to solve the above-mentioned problems, the present invention also provides an electronic apparatus including:
a memory storing at least one instruction; and
And a processor executing the instructions stored in the memory to implement the method for acquiring parameters of the identification model according to any one of the above.
In order to solve the above-mentioned problems, the present invention also provides a computer-readable storage medium including a storage data area storing created data and a storage program area storing a computer program; wherein the computer program, when executed by a processor, implements the method for obtaining parameters of the identification model according to any one of the above.
According to the embodiment of the invention, after the training data set containing noise labels is obtained, the training data set is standardized, which improves the efficiency of processing the training data; after a standard identification model containing initialization parameters is obtained, a noise probability transition matrix of the standard data set is constructed, which improves the applicability to the model of the loss function built from that matrix, so that more accurate model parameters can subsequently be trained with the loss function; and a loss function is constructed based on the noise probability transition matrix and used to calculate the update parameters of the standard identification model, so that more accurate model parameters are obtained, fulfilling the aim of improving the accuracy of the acquired model parameters. Therefore, the parameter acquisition method and device for the identification model and the computer-readable storage medium of the invention can improve the accuracy of acquiring model parameters.
Drawings
FIG. 1 is a flowchart of a method for obtaining parameters of an identification model according to an embodiment of the present invention;
FIG. 2 is a schematic block diagram of a device for obtaining parameters of an identification model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of the internal structure of an electronic device for implementing the method for obtaining parameters of an identification model according to an embodiment of the present invention;
The objects, functional features, and advantages of the present invention will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The execution body of the parameter acquisition method for an identification model provided by the embodiment of the application includes at least one electronic device, such as a server or a terminal, that can be configured to execute the method provided by the embodiment of the application. In other words, the parameter acquisition method for the identification model may be performed by software or hardware installed in a terminal device or a server device, and the software may be a blockchain platform. The server side includes but is not limited to: a single server, a server cluster, a cloud server, a cloud server cluster, and the like.
The invention provides a parameter acquisition method of an identification model. Referring to fig. 1, a flowchart of a method for obtaining parameters of an identification model according to an embodiment of the invention is shown. The method may be performed by an apparatus, which may be implemented in software and/or hardware.
In this embodiment, the method for acquiring parameters of the identification model includes:
s1, acquiring a training data set containing noise labels, and performing data standardization processing on the training data set to obtain a standard data set.
In the embodiment of the present invention, a training data set containing noise labels means that the training data set contains data whose preset standard labels do not correspond to their content, that is, data whose preset standard labels are noise labels.
The embodiment of the invention can acquire the training data set from a blockchain node by using a Python script with a data-scraping function, and can also acquire the training data set from a database.
Preferably, the training data sets are stored in different nodes of the blockchain, and the efficiency of acquiring the training data sets can be improved by utilizing the high data throughput of the blockchain.
Specifically, the data normalization processing is performed on the training data set, which comprises one or a combination of several of the following:
Removing unique attribute values in the training data set;
filling the missing values of the training data set; and carrying out data normalization on the training data set.
In detail, the unique attribute values include, but are not limited to: a data ID or a data number.
Unique attribute values cannot describe the distribution of the data, yet they add to its volume, so processing the data consumes additional computing resources and the data processing efficiency is reduced.
Preferably, the embodiment of the invention uses a high-dimensional mapping method to map the data in the training data set to a pre-constructed high-dimensional space, and then uses one-hot encoding to fill the missing data. The multidimensional nature of the high-dimensional space improves the efficiency of locating missing data in the training data set, and one-hot encoding improves the accuracy of the data filling.
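As an illustration, the sketch below shows one plausible reading of this step: categorical attributes are mapped into a high-dimensional one-hot space, and missing entries receive their own indicator column. The pandas usage and the column names are assumptions for illustration, not the prescribed implementation.

```python
# A minimal sketch, assuming tabular training data in a pandas DataFrame;
# the "color" column is hypothetical.
import pandas as pd

def onehot_fill(df: pd.DataFrame, categorical_cols: list) -> pd.DataFrame:
    # Map each categorical attribute into a high-dimensional one-hot space;
    # dummy_na=True gives missing values their own indicator column, so every
    # row stays representable and the gap is filled with an explicit code.
    return pd.get_dummies(df, columns=categorical_cols, dummy_na=True)

filled = onehot_fill(pd.DataFrame({"color": ["red", None, "blue"]}), ["color"])
```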
Specifically, the embodiment of the invention performs data normalization on the training data set using the following normalization algorithm:

x = (S_old - S_min) / (S_max - S_min)

where x is the standard data after data normalization, S_old is the data in the training data set, S_max is the maximum of the S_old values, and S_min is the minimum of the S_old values.
It is emphasized that S_max and S_min are preset values that define the range of the data in the training data set.
After the data normalization processing is completed, the standard data set is obtained.
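A minimal NumPy sketch of this normalization follows; the preset bounds S_min and S_max are passed in rather than computed from the batch, matching the note above that they are preset values.

```python
import numpy as np

def minmax_normalize(s_old: np.ndarray, s_min: float, s_max: float) -> np.ndarray:
    # x = (S_old - S_min) / (S_max - S_min): maps data into [0, 1]
    # whenever S_old lies within the preset range [S_min, S_max].
    return (s_old - s_min) / (s_max - s_min)

# Example with preset bounds 0 and 255 (e.g. pixel-valued data):
standard_data = minmax_normalize(np.array([0.0, 127.5, 255.0]), 0.0, 255.0)
# -> array([0. , 0.5, 1. ])
```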
S2, building an identification model based on the multi-layer deep neural network, and training the identification model by utilizing the standard data set to obtain a standard identification model containing initialization parameters.
In the embodiment of the invention, the multi-layer deep neural network is as follows:
h = h^(n) ∘ h^(n-1) ∘ … ∘ h^(1)

where h^(n) represents the network structure of the n-th layer of the multi-layer deep neural network.
When the multi-layer deep neural network is activated by a softmax function, it can output the predicted value p̂(x, y) of the joint distribution p(x, y) of standard data x and the preset standard label y corresponding to x, thereby obtaining predictive labels for the standard data in the standard data set.
The softmax function is an activation function that transforms the output of the multi-layer deep neural network into a preset form; in the embodiment of the present invention, the output of the multi-layer deep neural network is transformed into a probability form (i.e. p̂(x, y)). From the probability-form output of the multi-layer deep neural network, the difference between the preset standard label and the predictive label can be seen intuitively, and the model parameters are adjusted according to this difference, which improves the training efficiency of the model.
Specifically, the embodiment of the invention inputs the standard data set into the recognition model, trains the recognition model by using the standard data set to obtain the initialization parameters of the recognition model, and determines the recognition model containing the initialization parameters as the standard recognition model.
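For concreteness, a PyTorch sketch of such a composed network with a softmax head is given below; the layer count and widths are illustrative assumptions, since the text only specifies the composition h = h^(n) ∘ … ∘ h^(1) and the softmax activation.

```python
import torch
import torch.nn as nn

class RecognitionModel(nn.Module):
    # Composition h = h(n) ∘ ... ∘ h(1) with a softmax head;
    # layer widths below are illustrative, not prescribed.
    def __init__(self, in_dim: int, num_classes: int):
        super().__init__()
        self.h = nn.Sequential(
            nn.Linear(in_dim, 128), nn.ReLU(),  # h(1)
            nn.Linear(128, 64), nn.ReLU(),      # h(2)
            nn.Linear(64, num_classes),         # h(n)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # softmax converts raw outputs into the probability form, i.e. the
        # predicted value of the joint distribution p(x, y).
        return torch.softmax(self.h(x), dim=-1)

model = RecognitionModel(in_dim=20, num_classes=10)
probs = model(torch.randn(4, 20))  # one probability row per sample
```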
Further, in an embodiment of the present invention, before the identifying model is built based on the multi-layer deep neural network, the method further includes:
building a feature space, wherein the feature space is used for storing the standard data set;
constructing a label space corresponding to the feature space: Y = {e_i : i ∈ [c]}, where e_i is the preset standard label of standard data in the standard data set and [c] = {1, …, c} is a set of c positive integers, c being the same as the number of data in the standard data set; the label space is used for storing the preset standard labels corresponding to the standard data in the feature space.
In addition, the joint distribution of the standard data x stored in the feature space and the preset standard label y corresponding to the standard data x in the label space is p (x, y):
p(x,y)=p(y|x)p(x)
where p(x) is the frequency of any standard data x of the standard data set in the feature space, and p(y|x) is the frequency of the preset standard label y in the label space given that the standard data x appears.
In this embodiment, the feature space and the label space are constructed, and the joint distribution p(x, y) of the standard data x and the label y corresponding to the standard data x in the label space is calculated, so that the relationship between the standard data in the standard data set and its corresponding label can be better represented, improving the efficiency of data processing.
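To make the factorization concrete, the sketch below estimates p(x, y) = p(y|x)p(x) from frequencies; it assumes the standard data can be keyed (for example by an index or a discretized value), which is an assumption made for illustration.

```python
from collections import Counter

def empirical_joint(xs, ys):
    # Assumes xs entries are hashable keys (e.g. discretized standard data).
    # p(y|x) * p(x) = (count(x, y) / count(x)) * (count(x) / n)
    #               =  count(x, y) / n
    # so the factorized form collapses to the joint frequency.
    n = len(xs)
    return {pair: c / n for pair, c in Counter(zip(xs, ys)).items()}

p_xy = empirical_joint(["a", "a", "b"], [0, 1, 0])
# -> {('a', 0): 0.333..., ('a', 1): 0.333..., ('b', 0): 0.333...}
```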
S3, constructing a noise probability transition matrix of the standard data set.
In an embodiment of the present invention, the noise probability transition matrix of the standard dataset may be expressed as:
Q ∈ [0,1]^(c×c)

where c is the same as the number of standard data in the standard data set.
The noise probability transition matrix represents the distribution of noise tags in the data.
Specifically, the element in the i-th row and j-th column of the noise probability transition matrix Q represents the probability that the i-th class label occurs as the j-th class noise label.
In detail, in the embodiment of the present invention, the noise probability transition matrix of the standard data set is:

Q_ij = p(ŷ = β_j | y = β_i, x = α)

where Q is the noise probability transition matrix, α is any standard data in the standard data set, β_i is the preset standard label corresponding to α, ŷ is the predictive label generated by the standard recognition model for α, and β_j is the noise label of α.
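The text does not spell out how Q is estimated in practice, so the sketch below uses a common recipe from the noisy-label literature as an assumption: for each class i, average the standard recognition model's predicted probability rows over the samples labeled i, then renormalize each row.

```python
import numpy as np

def estimate_transition_matrix(probs: np.ndarray, labels: np.ndarray, c: int) -> np.ndarray:
    # Estimator is an assumption; the disclosure does not specify one.
    # probs:  (n, c) predicted probabilities from the standard recognition model
    # labels: (n,)   observed (possibly noisy) class indices
    q = np.zeros((c, c))
    for i in range(c):
        mask = labels == i
        if mask.any():
            # Q[i, j] ~ probability that a sample labeled i looks like class j
            q[i] = probs[mask].mean(axis=0)
        else:
            q[i, i] = 1.0  # no samples of class i: fall back to an identity row
    q = np.clip(q, 0.0, 1.0)
    return q / q.sum(axis=1, keepdims=True)  # each row is a probability vector
```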
S4, constructing a loss function based on the noise probability transition matrix.
In an embodiment of the present invention, the loss function includes, but is not limited to: a backward loss function and a forward loss function.
Specifically, the forward loss function is:
l^→(h(x)) = ψ(Q^T h(x))

where Q^T is the transpose of the noise probability transition matrix, ψ is the error factor of the recognition model, h is the multi-layer deep neural network, and l^→(h(x)) is the loss value of the forward loss function.
Specifically, the backward loss function is:

l(h) = E_(x,y)~p̂(x,y)[ Q^(-1) l(h(x), y) ]

where l(h) is the loss value of the backward loss function, y is the preset standard label of any standard data x in the standard data set, h(x) is the predictive label of the standard recognition model for x, Q is the noise probability transition matrix, p(x, y) is the joint distribution of the standard data x and the preset standard label y corresponding to x, and p̂(x, y) is the predicted value of p(x, y).
The backward loss function is used for calculating the probability that the label corresponding to the standard data x in the label space is a noise label, namely the likelihood that the preset standard label of the standard data x is erroneous.
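A PyTorch sketch of both corrections follows; it takes ψ to be the negative log-likelihood and the base loss l to be cross-entropy, both of which are assumptions since the text leaves ψ and l unspecified.

```python
import torch

def forward_loss(probs: torch.Tensor, targets: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    # Apply Q^T to each predicted probability vector: for row vectors this is
    # probs @ q, then take the negative log-likelihood at the observed label
    # (psi taken as negative log-likelihood, an assumption).
    corrected = probs @ q
    return torch.nn.functional.nll_loss(torch.log(corrected + 1e-12), targets)

def backward_loss(probs: torch.Tensor, targets: torch.Tensor, q: torch.Tensor) -> torch.Tensor:
    # Per-class losses l(h(x), y) for every candidate label y, weighted by
    # Q^{-1}; the corrected loss is read off at the observed label.
    per_class = -torch.log(probs + 1e-12)          # (n, c)
    corrected = per_class @ torch.linalg.inv(q).T  # column i = (Q^{-1} l)_i
    return corrected[torch.arange(len(targets)), targets].mean()
```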
S5, calculating the update parameters of the standard identification model by using the loss function, and replacing the initialization parameters with the update parameters.
In an embodiment of the present invention, the calculating, by using the loss function, an update parameter of the standard recognition model includes:
Acquiring a preset standard label of standard data in the standard data set and a prediction label of the standard recognition model on the standard data in the standard data set;
calculating a difference value between the predictive label and the standard label by using a loss function;
When the difference value is within a preset threshold value interval, calculating an update parameter of the standard identification model by using a gradient descent algorithm;
When the difference value is larger than the upper limit of the threshold interval, calculating a probability value of the standard label as a noise label by using the loss function;
and when the probability value is smaller than a preset probability threshold value, calculating the updating parameters of the standard identification model by using a gradient descent algorithm.
In the embodiment of the invention, the loss function is a forward loss function and/or a backward loss function.
In the embodiment of the invention, when the difference value is within the preset threshold interval, this indicates an error in the recognition result of the standard recognition model, and the parameters of the standard recognition model are updated by using a gradient descent algorithm so as to improve the accuracy of the standard recognition model.
In this embodiment, the gradient descent algorithm includes, but is not limited to, a batch gradient descent algorithm, a random gradient descent algorithm, and a small batch gradient descent algorithm.
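As a concrete instance of one such algorithm, a plain gradient descent update is sketched below in PyTorch; the learning rate is an illustrative assumption.

```python
import torch

def gradient_descent_step(model: torch.nn.Module, loss: torch.Tensor, lr: float = 0.01) -> None:
    # One update: theta <- theta - lr * dL/dtheta (lr is illustrative)
    model.zero_grad()
    loss.backward()
    with torch.no_grad():
        for p in model.parameters():
            if p.grad is not None:
                p -= lr * p.grad
```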
When the difference value is greater than the upper limit of the threshold interval, the cause may not be a recognition error of the standard recognition model. In practical applications, an erroneous preset standard label of the standard data, i.e. a noise label, can also cause the difference between the preset standard label and the predictive label to exceed the upper limit of the threshold interval. Therefore, when the difference value is greater than the upper limit of the threshold interval, the embodiment of the invention uses the loss function to calculate the probability that the preset standard label of the standard data is a noise label; when this probability is smaller than the preset probability threshold, the recognition result of the standard recognition model is shown to be wrong, and the update parameters of the standard recognition model are then calculated by using a gradient descent algorithm.
Further, when the probability value is greater than or equal to the probability threshold, the embodiment of the invention corrects the preset standard label of the standard data.
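Putting the branches of this update rule together, the sketch below expresses the decision flow; the interval bounds and the probability threshold are hypothetical preset values, not ones given by the text.

```python
def decide_update(diff: float, noise_prob: float,
                  low: float = 0.1, high: float = 0.5,
                  prob_threshold: float = 0.5) -> str:
    # diff:       loss-based difference between predictive and standard label
    # noise_prob: probability that the standard label is a noise label
    # low / high / prob_threshold are hypothetical preset values
    if low <= diff <= high:
        return "gradient_descent"      # model error: update the parameters
    if diff > high:
        if noise_prob < prob_threshold:
            return "gradient_descent"  # label trusted, so still a model error
        return "correct_label"         # the preset standard label is likely noise
    return "keep"                      # difference below the interval: no action
```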
Further, in the embodiment of the present invention, the initialization parameters are replaced by the update parameters, and a final recognition model can be obtained after the initialization parameters are replaced, where the final recognition model can be used to recognize input data, and the input data includes but is not limited to image data.
According to the embodiment of the invention, after the training data set containing noise labels is obtained, the training data set is standardized, which improves the efficiency of processing the training data; after a standard identification model containing initialization parameters is obtained, a noise probability transition matrix of the standard data set is constructed, which improves the applicability to the model of the loss function built from that matrix, so that more accurate model parameters can subsequently be trained with the loss function; and a loss function is constructed based on the noise probability transition matrix and used to calculate the update parameters of the standard identification model, so that more accurate model parameters are obtained, fulfilling the aim of improving the accuracy of the acquired model parameters. Therefore, the parameter acquisition method for the identification model of the invention can improve the accuracy of acquiring model parameters.
Fig. 2 is a schematic block diagram of a parameter acquisition device of the identification model according to the present invention.
The parameter acquisition apparatus 100 for the identification model according to the present invention may be installed in an electronic device. Depending on the implemented functions, the parameter acquisition apparatus of the recognition model may comprise a training data acquisition module 101, a recognition model construction module 102, a transition matrix construction module 103, a loss function construction module 104, and a model parameter update module 105. A module of the present invention may also be referred to as a unit, and refers to a series of computer program segments that are stored in the memory of the electronic device, can be executed by the processor of the electronic device, and perform fixed functions.
In the present embodiment, the functions concerning the respective modules/units are as follows:
the training data acquisition module 101 is configured to acquire a training data set including a noise tag, and perform data normalization processing on the training data set to obtain a standard data set;
The recognition model construction module 102 is configured to build a recognition model based on a multi-layer deep neural network, and train the recognition model by using the standard data set to obtain a standard recognition model including initialization parameters;
the transition matrix construction module 103 is configured to construct a noise probability transition matrix of the standard data set;
The loss function construction module 104 is configured to construct a loss function based on the noise probability transition matrix;
the model parameter updating module 105 is configured to calculate the update parameters of the standard identification model by using the loss function, and to replace the initialization parameters with the update parameters.
In detail, the specific implementation modes of each module of the parameter acquisition device of the identification model are as follows:
The training data obtaining module 101 is configured to obtain a training data set including noise labels, and perform data normalization processing on the training data set to obtain a standard data set.
In the embodiment of the present invention, a training data set containing noise labels means that the training data set contains data whose preset standard labels do not correspond to their content, that is, data whose preset standard labels are noise labels.
The embodiment of the invention can acquire the training data set from a blockchain node by using a Python script with a data-scraping function, and can also acquire the training data set from a database.
Preferably, the training data sets are stored in different nodes of the blockchain, and the efficiency of acquiring the training data sets can be improved by utilizing the high data throughput of the blockchain.
Specifically, the training data acquisition module 101 performs data normalization processing on the training data set, including one or a combination of several of the following:
Removing unique attribute values in the training data set;
Filling the missing values of the training data set;
and carrying out data normalization on the training data set.
In detail, the unique attribute values include, but are not limited to: a data ID or a data number.
Unique attribute values cannot describe the distribution of the data, yet they add to its volume, so processing the data consumes additional computing resources and the data processing efficiency is reduced.
Preferably, the embodiment of the invention uses high-dimensional mapping to map the data in the training data set to a pre-constructed high-dimensional space, and then uses one-hot encoding to fill the missing data. The multidimensional nature of the high-dimensional space improves the efficiency of locating missing data in the training data set, and one-hot encoding improves the accuracy of the data filling.
Specifically, the invention uses the following normalization algorithm to normalize the data of the training data set:

x = (S_old - S_min) / (S_max - S_min)

where x is the standard data after data normalization, S_old is the data in the training data set, S_max is the maximum of the S_old values, and S_min is the minimum of the S_old values.
It is emphasized that S_max and S_min are preset values that define the range of the data in the training data set.
After the data normalization processing is completed, the standard data set is obtained.
The recognition model construction module 102 is configured to build a recognition model based on a multi-layer deep neural network, and train the recognition model by using the standard data set to obtain a standard recognition model including initialization parameters.
In the embodiment of the invention, the multi-layer deep neural network is as follows:
h = h^(n) ∘ h^(n-1) ∘ … ∘ h^(1)

where h^(n) represents the network structure of the n-th layer of the multi-layer deep neural network.
When the multi-layer deep neural network is activated by a softmax function, it can output the predicted value p̂(x, y) of the joint distribution p(x, y) of standard data x and the preset standard label y corresponding to x, thereby obtaining predictive labels for the standard data in the standard data set.
The softmax function is an activation function that transforms the output of the multi-layer deep neural network into a preset form; in the embodiment of the present invention, the output of the multi-layer deep neural network is transformed into a probability form (i.e. p̂(x, y)). From the probability-form output of the multi-layer deep neural network, the difference between the preset standard label and the predictive label can be seen intuitively, and the model parameters are adjusted according to this difference, which improves the training efficiency of the model.
Specifically, the embodiment of the invention inputs the standard data set into the recognition model, trains the recognition model by using the standard data set to obtain the initialization parameters of the recognition model, and determines the recognition model containing the initialization parameters as the standard recognition model.
Further, in an embodiment of the present invention, before the identification model is built based on the multi-layer deep neural network, the method further includes:
building a feature space, wherein the feature space is used for storing the standard data set;
constructing a label space corresponding to the feature space: Y = {e_i : i ∈ [c]}, where e_i is the preset standard label of standard data in the standard data set and [c] = {1, …, c} is a set of c positive integers, c being the same as the number of data in the standard data set; the label space is used for storing the preset standard labels corresponding to the standard data in the feature space.
In addition, the joint distribution of the standard data x stored in the feature space and the preset standard label y corresponding to the standard data x in the label space is p (x, y):
p(x,y)=p(y|x)p(x)
where p(x) is the frequency of any standard data x of the standard data set in the feature space, and p(y|x) is the frequency of the preset standard label y in the label space given that the standard data x appears.
In this embodiment, the feature space and the label space are constructed, and the joint distribution p(x, y) of the standard data x and the label y corresponding to the standard data x in the label space is calculated, so that the relationship between the standard data in the standard data set and its corresponding label can be better represented, improving the efficiency of data processing.
The transition matrix construction module 103 is configured to construct a noise probability transition matrix of the standard dataset.
In an embodiment of the present invention, the noise probability transition matrix of the standard dataset may be expressed as:
Q ∈ [0,1]^(c×c)

where c is the same as the number of standard data in the standard data set.
The noise probability transition matrix represents the distribution of noise tags in the data.
Specifically, the element in the i-th row and j-th column of the noise probability transition matrix Q represents the probability that the i-th class label occurs as the j-th class noise label.
In detail, in the embodiment of the present invention, the noise probability transition matrix of the standard data set is:

Q_ij = p(ŷ = β_j | y = β_i, x = α)

where Q is the noise probability transition matrix, α is any standard data in the standard data set, β_i is the preset standard label corresponding to α, ŷ is the predictive label generated by the standard recognition model for α, and β_j is the noise label of α.
The loss function construction module 104 is configured to construct a loss function based on the noise probability transition matrix.
In an embodiment of the present invention, the loss function includes, but is not limited to: a backward loss function and a forward loss function.
Specifically, the forward loss function is:
l^→(h(x)) = ψ(Q^T h(x))

where Q^T is the transpose of the noise probability transition matrix, ψ is the error factor of the recognition model, h is the multi-layer deep neural network, and l^→(h(x)) is the loss value of the forward loss function.
Specifically, the backward loss function is:

l(h) = E_(x,y)~p̂(x,y)[ Q^(-1) l(h(x), y) ]

where l(h) is the loss value of the backward loss function, y is the preset standard label of any standard data x in the standard data set, h(x) is the predictive label of the standard recognition model for x, Q is the noise probability transition matrix, p(x, y) is the joint distribution of the standard data x and the preset standard label y corresponding to x, and p̂(x, y) is the predicted value of p(x, y).
The backward loss function is used for calculating the probability that the label corresponding to the standard data x in the label space is a noise label, namely the likelihood that the preset standard label of the standard data x is erroneous.
The model parameter updating module 105 is configured to calculate the update parameters of the standard identification model by using the loss function, and to replace the initialization parameters with the update parameters.
In the embodiment of the present invention, the model parameter updating module 105 calculates the updated parameters of the standard identification model by using the loss function, including:
Acquiring a preset standard label of standard data in the standard data set and a prediction label of the standard recognition model on the standard data in the standard data set;
calculating a difference value between the predictive label and the standard label by using a loss function;
When the difference value is within a preset threshold value interval, calculating an update parameter of the standard identification model by using a gradient descent algorithm;
When the difference value is larger than the upper limit of the threshold interval, calculating a probability value of the standard label as a noise label by using the loss function;
and when the probability value is smaller than a preset probability threshold value, calculating the updating parameters of the standard identification model by using a gradient descent algorithm.
In the embodiment of the invention, the loss function is a forward loss function and/or a backward loss function.
In the embodiment of the invention, when the difference value is within the preset threshold interval, this indicates an error in the recognition result of the standard recognition model, and the parameters of the standard recognition model are updated by using a gradient descent algorithm so as to improve the accuracy of the standard recognition model.
In this embodiment, the gradient descent algorithm includes, but is not limited to, a batch gradient descent algorithm, a random gradient descent algorithm, and a small batch gradient descent algorithm.
When the difference value is greater than the upper limit of the threshold interval, the cause may not be a recognition error of the standard recognition model. In practical applications, an erroneous preset standard label of the standard data, i.e. a noise label, can also cause the difference between the preset standard label and the predictive label to exceed the upper limit of the threshold interval. Therefore, when the difference value is greater than the upper limit of the threshold interval, the embodiment of the invention uses the loss function to calculate the probability that the preset standard label of the standard data is a noise label; when this probability is smaller than the preset probability threshold, the recognition result of the standard recognition model is shown to be wrong, and the update parameters of the standard recognition model are then calculated by using a gradient descent algorithm.
Further, when the probability value is greater than or equal to the probability threshold, the embodiment of the invention corrects the preset standard label of the standard data.
Further, in the embodiment of the present invention, the initialization parameters are replaced by the update parameters, and a final recognition model can be obtained after the initialization parameters are replaced, where the final recognition model can be used to recognize input data, and the input data includes but is not limited to image data.
According to the embodiment of the invention, after the training data set containing noise labels is obtained, the training data set is standardized, which improves the efficiency of processing the training data; after a standard identification model containing initialization parameters is obtained, a noise probability transition matrix of the standard data set is constructed, which improves the applicability to the model of the loss function built from that matrix, so that more accurate model parameters can subsequently be trained with the loss function; and a loss function is constructed based on the noise probability transition matrix and used to calculate the update parameters of the standard identification model, so that more accurate model parameters are obtained, fulfilling the aim of improving the accuracy of the acquired model parameters. Therefore, the parameter acquisition device for the identification model of the invention can improve the accuracy of acquiring model parameters.
Fig. 3 is a schematic structural diagram of an electronic device implementing a method for acquiring parameters of an identification model according to the present invention.
The electronic device 1 may comprise a processor 10, a memory 11 and a bus, and may further comprise a computer program stored in the memory 11 and executable on the processor 10, such as a parameter acquisition program 12 identifying a model.
The memory 11 includes at least one type of readable storage medium, including flash memory, a mobile hard disk, a multimedia card, a card memory (e.g., SD or DX memory), a magnetic memory, a magnetic disk, an optical disk, and the like. The memory 11 may in some embodiments be an internal storage unit of the electronic device 1, such as a mobile hard disk of the electronic device 1. The memory 11 may in other embodiments also be an external storage device of the electronic device 1, such as a plug-in mobile hard disk, a smart media card (SMC), a Secure Digital (SD) card, or a flash card provided on the electronic device 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the electronic device 1. The memory 11 may be used not only for storing application software installed in the electronic device 1 and various types of data, such as the code of the parameter acquisition program 12 of the identification model, but also for temporarily storing data that has been output or is to be output.
The processor 10 may in some embodiments be composed of integrated circuits, for example a single packaged integrated circuit, or of multiple integrated circuits packaged with the same or different functions, including one or more central processing units (CPUs), microprocessors, digital processing chips, graphics processors, combinations of various control chips, and the like. The processor 10 is the control unit of the electronic device; it connects the respective parts of the entire electronic device using various interfaces and lines, runs or executes the programs or modules stored in the memory 11 (for example, the parameter acquisition program of the identification model), and invokes the data stored in the memory 11 in order to perform the various functions of the electronic device 1 and process data.
The bus may be a Peripheral Component Interconnect (PCI) bus or an Extended Industry Standard Architecture (EISA) bus, among others. The bus may be classified as an address bus, a data bus, a control bus, etc. The bus is arranged to enable communication between the memory 11, the at least one processor 10, and other components.
Fig. 3 shows only an electronic device with components, it being understood by a person skilled in the art that the structure shown in fig. 3 does not constitute a limitation of the electronic device 1, and may comprise fewer or more components than shown, or may combine certain components, or may be arranged in different components.
For example, although not shown, the electronic device 1 may further include a power source (such as a battery) for supplying power to each component, and preferably, the power source may be logically connected to the at least one processor 10 through a power management device, so that functions of charge management, discharge management, power consumption management, and the like are implemented through the power management device. The power supply may also include one or more of any of a direct current or alternating current power supply, recharging device, power failure detection circuit, power converter or inverter, power status indicator, etc. The electronic device 1 may further include various sensors, bluetooth modules, wi-Fi modules, etc., which will not be described herein.
Further, the electronic device 1 may also comprise a network interface, optionally the network interface may comprise a wired interface and/or a wireless interface (e.g. WI-FI interface, bluetooth interface, etc.), typically used for establishing a communication connection between the electronic device 1 and other electronic devices.
The electronic device 1 may optionally further comprise a user interface, which may be a Display, an input unit, such as a Keyboard (Keyboard), or a standard wired interface, a wireless interface. Alternatively, in some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an OLED (Organic Light-Emitting Diode) touch, or the like. The display may also be referred to as a display screen or display unit, as appropriate, for displaying information processed in the electronic device 1 and for displaying a visual user interface.
It should be understood that the embodiments described are for illustrative purposes only, and the scope of the patent application is not limited to this configuration.
The parameter acquisition program 12 of the identification model stored in the memory 11 in the electronic device 1 is a combination of instructions that, when executed in the processor 10, can implement:
acquiring a training data set containing noise labels, and performing data standardization processing on the training data set to obtain a standard data set;
establishing an identification model based on a multi-layer deep neural network, and training the identification model by utilizing the standard data set to obtain a standard identification model containing initialization parameters;
constructing a noise probability transition matrix of the standard data set;
constructing a loss function based on the noise probability transition matrix;
and calculating the update parameters of the standard identification model by using the loss function, and replacing the initialization parameters with the update parameters.
Further, the modules/units integrated in the electronic device 1 may be stored in a computer readable storage medium if implemented in the form of software functional units and sold or used as separate products. The computer readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a U disk, a removable hard disk, a magnetic disk, an optical disk, a computer Memory, a Read-Only Memory (ROM).
Further, the computer-usable storage medium may mainly include a storage program area and a storage data area, wherein the storage program area may store an operating system, an application program required for at least one function, and the like; the storage data area may store data created from the use of blockchain nodes, and the like.
In the several embodiments provided in the present invention, it should be understood that the disclosed apparatus, device and method may be implemented in other manners. For example, the above-described apparatus embodiments are merely illustrative, and for example, the division of the modules is merely a logical function division, and there may be other manners of division when actually implemented.
The modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical units, may be located in one place, or may be distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional module in the embodiments of the present invention may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit. The integrated units can be realized in a form of hardware or a form of hardware and a form of software functional modules.
It will be evident to those skilled in the art that the invention is not limited to the details of the foregoing illustrative embodiments, and that the present invention may be embodied in other specific forms without departing from the spirit or essential characteristics thereof.
The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive, the scope of the invention being indicated by the appended claims rather than by the foregoing description, and all changes which come within the meaning and range of equivalency of the claims are therefore intended to be embraced therein. Any reference signs in the claims should not be construed as limiting the claim concerned.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanism, encryption algorithm and the like. The blockchain (Blockchain), essentially a de-centralized database, is a string of data blocks that are generated in association using cryptographic methods, each of which contains information from a batch of network transactions for verifying the validity (anti-counterfeit) of its information and generating the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
Furthermore, it is evident that the word "comprising" does not exclude other elements or steps, and that the singular does not exclude a plurality. A plurality of units or means recited in the system claims can also be implemented by one unit or means through software or hardware. Terms such as first and second are used to denote names and not any particular order.
Finally, it should be noted that the above-mentioned embodiments are merely for illustrating the technical solution of the present invention and not for limiting the same, and although the present invention has been described in detail with reference to the preferred embodiments, it should be understood by those skilled in the art that modifications and equivalents may be made to the technical solution of the present invention without departing from the spirit and scope of the technical solution of the present invention.

Claims (7)

1. A method for obtaining parameters of an identification model, the method comprising:
acquiring a training data set containing noise labels, and performing data standardization processing on the training data set to obtain a standard data set;
establishing an identification model based on a multi-layer deep neural network, and training the identification model by utilizing the standard data set to obtain a standard identification model containing initialization parameters;
constructing a noise probability transition matrix of the standard data set;
constructing a loss function based on the noise probability transition matrix;
calculating the update parameters of the standard identification model by using the loss function, and replacing the initialization parameters with the update parameters;
wherein the noise probability transition matrix comprises:
Q ∈ [0,1]^(c×c)

where c is the same as the number of standard data in the standard data set;
The loss function includes a forward loss function, the forward loss function being:
l^→(h(x)) = ψ(Q^T h(x))

where Q^T is the transpose of the noise probability transition matrix, ψ is the error factor of the identification model, h is the multi-layer deep neural network, and l^→(h(x)) is the loss value of the forward loss function;
The loss function further includes a backward loss function, the backward loss function being:
l(h) = E_(x,y)~p̂(x,y)[ Q^(-1) l(h(x), y) ]

where l(h) is the loss value of the backward loss function, y is the preset standard label of any standard data x in the standard data set, h(x) is the predictive label of the standard identification model for x, Q is the noise probability transition matrix, p(x, y) is the joint distribution of the standard data x and the preset standard label y corresponding to x, and p̂(x, y) is the predicted value of p(x, y).
2. The method for obtaining parameters of an identification model according to claim 1, wherein the data normalization processing is performed on the training data set, and the method comprises one or a combination of the following steps:
Removing unique attribute values in the training data set;
Filling the missing values of the training data set;
and carrying out data normalization on the training data set.
3. The method for obtaining parameters of an identification model according to claim 2, wherein said data normalizing said training data set comprises:
data normalization of the training data set is performed using the following normalization algorithm:

x = (S_old - S_min) / (S_max - S_min)

where x is the standard data after data normalization, S_old is the data in the training data set, S_max is the maximum of the S_old values, and S_min is the minimum of the S_old values.
4. A method of obtaining parameters of an identification model according to any one of claims 1 to 3, wherein said calculating updated parameters of the standard identification model using the loss function comprises:
Acquiring a preset standard label of standard data in the standard data set and a prediction label of the standard recognition model on the standard data in the standard data set;
calculating a difference value between the predictive label and the standard label by using a loss function;
When the difference value is within a preset threshold value interval, calculating an update parameter of the standard identification model by using a gradient descent algorithm;
When the difference value is larger than the upper limit of the threshold interval, calculating a probability value of the standard label as a noise label by using the loss function;
and when the probability value is smaller than a preset probability threshold value, calculating the updating parameters of the standard identification model by using a gradient descent algorithm.
5. A parameter acquisition apparatus of an identification model for realizing the parameter acquisition method of an identification model according to any one of claims 1 to 4, characterized in that the apparatus comprises:
the training data acquisition module is used for acquiring a training data set containing a noise label, and carrying out data standardization processing on the training data set to obtain a standard data set;
The recognition model construction module is used for building a recognition model based on the multi-layer deep neural network, and training the recognition model by utilizing the standard data set to obtain a standard recognition model containing initialization parameters;
The transition matrix construction module is used for constructing a noise probability transition matrix of the standard data set;
the loss function construction module is used for constructing a loss function based on the noise probability transition matrix;
And the model parameter updating module is used for calculating the update parameters of the standard identification model by using the loss function and replacing the initialization parameters with the update parameters.
6. An electronic device, the electronic device comprising:
at least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of parameter acquisition of an identification model as claimed in any one of claims 1 to 4.
7. A computer-readable storage medium comprising a data storage area and a program storage area, the data storage area storing created data and the program storage area storing a computer program, wherein the computer program, when executed by a processor, implements the parameter acquisition method of an identification model according to any one of claims 1 to 4.
CN202010656659.8A 2020-07-09 2020-07-09 Parameter acquisition method and device for identification model, electronic equipment and storage medium Active CN111814962B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202010656659.8A CN111814962B (en) 2020-07-09 2020-07-09 Parameter acquisition method and device for identification model, electronic equipment and storage medium
PCT/CN2020/131974 WO2021151345A1 (en) 2020-07-09 2020-11-26 Method and apparatus for parameter acquisition for recognition model, electronic device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010656659.8A CN111814962B (en) 2020-07-09 2020-07-09 Parameter acquisition method and device for identification model, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN111814962A (en) 2020-10-23
CN111814962B (en) 2024-05-10

Family

ID=72842855

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010656659.8A Active CN111814962B (en) 2020-07-09 2020-07-09 Parameter acquisition method and device for identification model, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN111814962B (en)
WO (1) WO2021151345A1 (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111814962B (en) * 2020-07-09 2024-05-10 平安科技(深圳)有限公司 Parameter acquisition method and device for identification model, electronic equipment and storage medium
CN112215238B (en) * 2020-10-29 2022-06-07 支付宝(杭州)信息技术有限公司 Method, system and device for constructing general feature extraction model
CN113158911A (en) * 2021-04-25 2021-07-23 北京华捷艾米科技有限公司 Data generation method and device
CN113902121B (en) * 2021-07-15 2023-07-21 陈九廷 Method, device, equipment and medium for verifying battery degradation estimation device
CN113706204B (en) * 2021-08-31 2024-04-05 中国平安财产保险股份有限公司 Deep learning-based rights issuing method, device, equipment and storage medium
CN113780473B (en) * 2021-09-30 2023-07-14 平安科技(深圳)有限公司 Depth model-based data processing method and device, electronic equipment and storage medium
US20230259762A1 (en) * 2022-02-14 2023-08-17 Samsung Electronics Co., Ltd. Machine learning with instance-dependent label noise
CN115270848B (en) * 2022-06-17 2023-09-29 合肥心之声健康科技有限公司 PPG and ECG automatic conversion intelligent algorithm, storage medium and computer system
CN115860574B (en) * 2023-02-06 2023-05-09 佰聆数据股份有限公司 Method and device for analyzing using effect of charging equipment
CN117077016B (en) * 2023-08-17 2024-03-19 中国自然资源航空物探遥感中心 Supermatrix rock identification method of support vector machine based on aviation magnetic release data
CN116908134B (en) * 2023-09-12 2023-11-24 津海威视技术(天津)有限公司 Semi-quantitative analysis method for plasticizer content and training method for analysis model
CN117349899B (en) * 2023-12-06 2024-04-05 湖北省楚天云有限公司 Sensitive data processing method, system and storage medium based on forgetting model

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110543898A (en) * 2019-08-16 2019-12-06 上海数禾信息科技有限公司 Supervised learning method for noise label, data classification processing method and device
CN110929733A (en) * 2019-12-09 2020-03-27 上海眼控科技股份有限公司 Denoising method and device, computer equipment, storage medium and model training method
CN111079836B (en) * 2019-12-16 2022-10-04 浙江大学 Process data fault classification method based on pseudo label method and weak supervised learning
CN111814962B (en) * 2020-07-09 2024-05-10 平安科技(深圳)有限公司 Parameter acquisition method and device for identification model, electronic equipment and storage medium

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
DE102015010602A1 (en) * 2015-08-18 2017-02-23 Hochschule Aschaffenburg Method for analyzing a data set of a time-of-flight mass spectrometry measurement and a device
CN107563567A (en) * 2017-09-18 2018-01-09 河海大学 Core extreme learning machine Flood Forecasting Method based on sparse own coding
CN109450830A (en) * 2018-12-26 2019-03-08 重庆大学 Channel estimation methods based on deep learning under a kind of high-speed mobile environment
CN111191726A (en) * 2019-12-31 2020-05-22 浙江大学 Fault classification method based on weak supervised learning multi-layer perceptron

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Image classification method based on improved DCNN combined with transfer learning; Yang Dongxu et al.; Journal of Xinjiang University (Natural Science Edition); 2018-05-07 (No. 02); pp. 196-202 *

Also Published As

Publication number Publication date
WO2021151345A1 (en) 2021-08-05
CN111814962A (en) 2020-10-23

Similar Documents

Publication Publication Date Title
CN111814962B (en) Parameter acquisition method and device for identification model, electronic equipment and storage medium
CN113157927B (en) Text classification method, apparatus, electronic device and readable storage medium
CN114822812A (en) Character dialogue simulation method, device, equipment and storage medium
CN114491047A (en) Multi-label text classification method and device, electronic equipment and storage medium
CN113658002B (en) Transaction result generation method and device based on decision tree, electronic equipment and medium
CN116821373A (en) Map-based prompt recommendation method, device, equipment and medium
CN112990374B (en) Image classification method, device, electronic equipment and medium
CN114840684A (en) Map construction method, device and equipment based on medical entity and storage medium
CN113157739B (en) Cross-modal retrieval method and device, electronic equipment and storage medium
CN116578696A (en) Text abstract generation method, device, equipment and storage medium
CN116720525A (en) Disease auxiliary analysis method, device, equipment and medium based on inquiry data
CN114596958B (en) Pathological data classification method, device, equipment and medium based on cascade classification
CN112215336B (en) Data labeling method, device, equipment and storage medium based on user behaviors
CN113515591B (en) Text defect information identification method and device, electronic equipment and storage medium
CN114610854A (en) Intelligent question and answer method, device, equipment and storage medium
CN111414452A (en) Search word matching method and device, electronic equipment and readable storage medium
CN113706019B (en) Service capability analysis method, device, equipment and medium based on multidimensional data
CN111783982B (en) Method, device, equipment and medium for acquiring attack sample
CN116486972A (en) Electronic medical record generation method, device, equipment and storage medium
CN116933779A (en) Policy address identification method and device, electronic equipment and readable storage medium
CN116451764A (en) Entity recognition model training method, device and equipment based on query set
CN116663503A (en) Sentence error correction method, device, equipment and medium based on self-attention weight graph
CN116701629A (en) Training method and device for text classification model, electronic equipment and storage medium
CN117195898A (en) Entity relation extraction method and device, electronic equipment and storage medium
CN116541723A (en) Image text matching method, device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant