CN114049539A - Collaborative target identification method, system and device based on decorrelation binary network - Google Patents


Publication number
CN114049539A
CN114049539A · Application CN202210023260.5A · Granted as CN114049539B
Authority
CN
China
Prior art keywords
network model
target
output result
tensor
binary network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210023260.5A
Other languages
Chinese (zh)
Other versions
CN114049539B (en)
Inventor
吕金虎
王滨
徐昇
张宝昌
李炎静
张峰
王星
Current Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Beihang University
Original Assignee
Hangzhou Hikvision Digital Technology Co Ltd
Beihang University
Priority date
Filing date
Publication date
Application filed by Hangzhou Hikvision Digital Technology Co Ltd and Beihang University
Priority to CN202210023260.5A
Publication of CN114049539A
Application granted
Publication of CN114049539B
Active legal status
Anticipated expiration legal status

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06F — ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 — Pattern recognition
    • G06F18/20 — Analysing
    • G06F18/21 — Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 — Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/04 — Architecture, e.g. interconnection topology
    • G06N3/045 — Combinations of networks
    • G06N3/08 — Learning methods
    • G06N3/084 — Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a collaborative target identification method, system, and device based on a decorrelated binary network. In this application, the target quantization error in the binary network model training process is minimized, and the minimized target quantization error is decomposed, based on Bayesian learning, into a maximum likelihood estimation and a maximum a posteriori estimation, which are used to determine the activation tensor binarization error Lc and the weight tensor binarization error Ld, respectively. As a result, when the binary network model is trained, Lc and Ld are considered in addition to the conventional prediction error Ls, which mitigates the noise introduced by neural network binarization and avoids the situation where training is disrupted because parameter gradients vanish or do not exist. The finally trained binary network model is therefore more stable and converges better, which in turn improves its accuracy in application, for example in identifying target objects (such as human faces) and/or abnormal behaviors (such as fire or robbery).

Description

Collaborative target identification method, system and device based on decorrelation binary network
Technical Field
The application relates to internet of things technology, and in particular to a collaborative target identification method, system, and device based on a decorrelated binary network.
Background
A Binary Neural Network (BNN), also called a binary network, reduces network scale and accelerates network training. In application, the weight tensor and the activation tensor of each layer of the binary network are converted into one of two values, such as +1 or -1. The core idea of a binary network is to substantially reduce multiply-add operations and the storage space of the weights at the cost of only a small loss of precision, providing practical feasibility for deployment on mobile devices.
However, since the weight tensor and the activation tensor of each layer in the binary network are simply scaled to binary values such as +1 or -1, this simple scaling can cause the gradients of parameters in the binary network, such as the weight tensor and the activation tensor, to vanish or not exist. Vanishing or absent parameter gradients affect the training of the binary network model and in turn make the finally trained model inaccurate in application, for example when identifying target objects (such as faces) and/or abnormal behaviors (such as fire, robbery, fighting, and theft).
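As a minimal illustration (not taken from the patent itself), the following NumPy sketch shows the binarization described above and why a naive backward pass through it propagates no gradient:

```python
import numpy as np

def binarize(x):
    """Binarize a tensor to +1/-1 with the sign function (0 mapped to +1)."""
    return np.where(x >= 0, 1.0, -1.0)

w = np.array([0.7, -0.3, 0.05, -1.2])
b_w = binarize(w)
print(b_w)  # [ 1. -1.  1. -1.]

# The derivative of sign(x) is 0 everywhere except at x = 0 (where it is
# undefined), so backpropagating directly through binarize() yields a
# zero gradient almost everywhere -- the vanishing-gradient problem the
# patent addresses.
```
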
Disclosure of Invention
The embodiment of the application provides a collaborative target identification method, a collaborative target identification system and a collaborative target identification device based on a decorrelation binary network, so that the target identification accuracy of a binary network model is improved.
The embodiment of the application provides a collaborative target identification method based on a decorrelation binary network, which comprises the following steps:
minimizing a target quantization error in the binary network model training process based on Bayesian learning, and decomposing it into two parts, one part represented by a maximum likelihood estimation and the other by a maximum a posteriori estimation; the target quantization error refers to the error caused by forward propagation and backward propagation in the training process of the binary network model;
determining an activation tensor binarization error Lc based on the maximum likelihood estimation, and a weight tensor binarization error Ld based on the maximum a posteriori estimation; training the binary network model with the objective of optimizing Lc, Ld, and the prediction error Ls, where Ls is the error between the output value obtained by feeding input data into the binary network model during training and the true value corresponding to that input data; and deploying the binary network model on an internet of things terminal device;
when the internet of things terminal device collects data for target identification, inputting the collected data into the binary network model to obtain a preliminary output result, and reporting the data and the preliminary output result to a central server, so that, when the central server determines from the preliminary output result that verification is needed, the central server inputs the data into a trained target network model to verify the preliminary output result and obtain the final target output result; the target network model is a trained non-binary network model.
The embodiment of the application provides a collaborative target identification system based on a decorrelation binary network, which comprises: the system comprises the Internet of things terminal equipment and a central server;
the terminal equipment of the Internet of things is provided with a binary network model; the binary network model is trained according to targets of an optimized activation tensor binarization error Lc, a weight tensor binarization error Ld and a prediction error Ls; the activation tensor binarization error Lc is determined based on maximum likelihood estimation, the weight tensor binarization error Ld is determined based on maximum posterior estimation, and the maximum likelihood estimation and the maximum posterior estimation are obtained by decomposing a target quantization error in a binary network model training process after the target quantization error is minimized based on Bayes learning; the target quantization error refers to an error caused by forward propagation and backward propagation in the training process of the binary network model;
when data used for target identification are collected, the terminal equipment of the Internet of things inputs the collected data into the binary network model to obtain a primary output result, and reports the data and the primary output result to a central server;
the central server is used for determining whether the primary output result needs to be verified or not according to the primary output result, if not, determining the primary output result as a target output result, and if so, inputting the data to a trained target network model and outputting a target final result; the target network model is a trained non-binary network model.
The embodiment of the application also provides the electronic equipment. The electronic device includes: a processor and a machine-readable storage medium;
the machine-readable storage medium stores machine-executable instructions executable by the processor;
the processor is configured to execute machine-executable instructions to implement the steps of the above-disclosed method.
According to the above technical solution, the target quantization error in the binary network model training process is minimized and decomposed, based on Bayesian learning, into a maximum likelihood estimation and a maximum a posteriori estimation, which determine the activation tensor binarization error Lc and the weight tensor binarization error Ld, respectively. When the binary network model is trained, Lc and Ld are considered in addition to the conventional prediction error Ls, which mitigates the noise introduced by neural network binarization, avoids disruption of training caused by vanishing or absent parameter gradients, and ensures that the finally trained binary network model is more stable and converges better. This further improves the model's accuracy in application, for example in identifying target objects (such as human faces) and/or abnormal behaviors (such as arson, robbery, fighting, and theft).
Further, in this embodiment, the internet of things terminal device performs target identification in cooperation with the central server. The terminal device identifies a preliminary result based on the binary network model, and the central server determines from that preliminary result whether verification is needed. When verification is needed, the central server inputs the data acquired by the terminal device into the trained target network model to verify the preliminary result and obtain the final target output result; when verification is not needed, the central server directly takes the preliminary result identified by the terminal device as the target output result. This reduces the load on the central server, improves the utilization of the internet of things terminal devices, and improves the robustness of target identification as a whole.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
FIG. 1 is a flowchart of a method provided by an embodiment of the present application;
FIG. 2 is a system configuration diagram provided by an embodiment of the present application;
FIG. 3 is a block diagram of an apparatus provided by an embodiment of the present application;
FIG. 4 is a structural diagram of an electronic device provided by an embodiment of the present application.
Detailed Description
Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present application. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present application, as detailed in the appended claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this application and the appended claims, the singular forms "a", "an", and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise.
In order to make the technical solutions provided in the embodiments of the present application better understood and make the above objects, features and advantages of the embodiments of the present application more comprehensible, the technical solutions in the embodiments of the present application are described in further detail below with reference to the accompanying drawings.
The two propagation passes most commonly used in training neural network models are forward propagation and backward propagation. Forward propagation pushes the input data through the network, layer by layer from the input layer to the output layer, using the parameters of each layer (such as the weight tensor and the activation tensor), and finally outputs y^ (also called the predicted value); a loss function L(y^, y) is then computed from the difference between y^ and the true value y.
In backward propagation, the partial derivatives (gradients) of the parameters of each layer, such as the weight tensor and the activation tensor, are computed in reverse from the loss function L(y^, y), and gradient optimization is performed based on gradient descent, updating the weight tensor, activation tensor, and other parameters of each layer.
When the neural network model is a binary network model, the parameters that participate in forward propagation are the binarized versions of each layer's parameters, such as the binarized weight tensor and the binarized activation tensor. For convenience of description, the parameters before binarization, such as the weight tensor and the activation tensor, are referred to as the original parameters, e.g., the original weight tensor and the original activation tensor.
In backward propagation, however, the parameters of each layer lose their gradient once binarized, so the binarized parameters cannot be used. What participates in backward propagation are the original parameters of each layer, such as the original weight tensor and the original activation tensor (i.e., not the binarized tensors mentioned above).
Because the per-layer parameters differ between forward propagation and backward propagation (the binarized parameters participate in forward propagation, while the original parameters participate in backward propagation), the gradients of the parameters of the finally trained binary network model may vanish or not exist. As a result, the target action behaviors recognized by the finally trained model in application, such as fire, robbery, fighting, and theft, become inaccurate, producing large errors.
On this basis, to improve the target identification accuracy of the binary network model and avoid inaccurate recognition of target actions (such as fire, robbery, fighting, and theft) caused by vanishing or absent parameter gradients, the embodiment of the present application provides a novel training method for the binary network model.
The binary network model training method provided by this embodiment combines the idea of Bayesian learning, using maximum likelihood estimation and maximum a posteriori estimation of the parameters during network model training to optimize the traditional training approach, and addresses the problems of easily vanishing gradients, difficult model convergence, and slow training in binary network model training. The method is described below with reference to the accompanying drawings:
referring to fig. 1, fig. 1 is a flowchart of a method provided in an embodiment of the present application. The flow is applied to the electronic equipment. Optionally, in this embodiment, the electronic device may be a terminal device with less computing resources, such as a video terminal, a door access device, and the like, to which the binary network model is applied.
As shown in fig. 1, the process may include the following steps:
step 101, optimizing and decomposing a target quantization error in a binary network model training process into two parts based on Bayes learning, wherein one part is represented by maximum likelihood estimation, and the other part is represented by maximum posterior estimation; the target quantization error refers to an error caused by forward propagation and backward propagation in the training process of the binary network model.
In a normal convolutional neural network model (also called a full-precision convolutional neural network), the basic operation is expressed as Formula 1:

$$z = w \otimes a \qquad \text{(Formula 1)}$$

In Formula 1, $w$ is the original weight tensor (i.e., a non-binarized parameter), $a$ is the original activation tensor, $z$ is the output tensor, and $\otimes$ denotes the convolution operation.
When applied to a binary network model, the parameters of each layer of the network are binarized, i.e., represented by 1 bit. Let $b_w$ denote the binarized weight tensor obtained by binarizing the original weight tensor $w$, and let $b_a$ denote the binarized activation tensor obtained by binarizing the original activation tensor $a$. The following expressions are obtained:

$$w \approx \alpha\, b_w, \qquad a \approx \beta\, b_a \qquad \text{(Formula 2)}$$

In Formula 2, $\alpha$ and $\beta$ are scale factors (also referred to as scaling parameters): $\alpha b_w$ represents the binarized weight tensor scaled by the scaling parameter $\alpha$, and $\beta b_a$ represents the binarized activation tensor scaled by the scaling parameter $\beta$.
For a binary network model, the forward propagation process can be expressed as Formula 3:

$$z \approx (\alpha b_w) \otimes (\beta b_a) = \alpha\beta\,(b_w \otimes b_a) \qquad \text{(Formula 3)}$$
In backward propagation, the parameters of each layer of the network lose their gradient after binarization, so the binarized parameters cannot be used. Instead, the gradient is computed on the original parameters (i.e., the parameters before binarization) of each layer of the binary network model, such as the original weight tensor and the original activation tensor. Taking the straight-through estimator as an example, backward propagation can be expressed as Formula 4:

$$\frac{\partial L}{\partial x} \;=\; \frac{\partial L}{\partial \operatorname{sign}(x)} \cdot \mathbf{1}_{|x| \le 1} \qquad \text{(Formula 4)}$$

In Formula 4, $x$ represents an input, specifically an original parameter such as the original weight tensor or the original activation tensor.
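A minimal NumPy sketch of the straight-through estimator described above (the exact clipping form is an assumption; implementations differ in details):

```python
import numpy as np

def ste_forward(x):
    """Forward pass: hard binarization to +1/-1."""
    return np.where(x >= 0, 1.0, -1.0)

def ste_backward(x, grad_output):
    """Backward pass: pass the upstream gradient through unchanged,
    but only where |x| <= 1 (the clipped straight-through estimator
    of Formula 4)."""
    return grad_output * (np.abs(x) <= 1.0)

x = np.array([0.2, -0.5, 1.7, -3.0])
g = np.ones_like(x)            # pretend the upstream gradient is all ones
print(ste_forward(x))          # [ 1. -1.  1. -1.]
print(ste_backward(x, g))      # [1. 1. 0. 0.]
```

Note the design choice: the forward pass is non-differentiable, so the backward pass simply substitutes an identity gradient inside the clipping region rather than differentiating the sign function.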
Because forward propagation and backward propagation use different per-layer parameters in the binary network model (the binarized parameters participate in forward propagation, while the original parameters participate in backward propagation), an error exists between the two passes, namely the target quantization error. It can be seen that the target quantization error is caused by forward propagation and backward propagation during training of the binary network model.
Optionally, in this embodiment, optimization of the binary network model may be achieved by minimizing the above target quantization error, which can be expressed as Formula 5:

$$\min_{\alpha,\, b_w} \;\| w - \alpha\, b_w \|^2 \qquad \text{(Formula 5)}$$

In Formula 5, $w$ represents the original weight tensor, $b_w$ represents the binarized weight tensor obtained by binarizing $w$, and $\alpha$ represents the scaling parameter. In this embodiment, if Lc and Ld (described below) are determined by minimizing the target quantization error, decorrelation between the parameters of each layer of the network can be achieved, and finally a decorrelated binary network can be realized.
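For the quantization error of Formula 5 with $b_w = \operatorname{sign}(w)$, the scale that minimizes $\|w - \alpha b_w\|^2$ has the closed form $\alpha = \operatorname{mean}(|w|)$ (a standard result from the binary-network literature, e.g. XNOR-Net; shown here as a hedged sketch rather than the patent's own derivation):

```python
import numpy as np

w = np.array([0.5, -1.5, 1.0, -2.0])
b_w = np.sign(w)

# Closed-form minimizer of ||w - alpha * b_w||^2 over alpha
# (since b_w has entries +/-1, the quadratic in alpha is minimized
# at the mean absolute value of w):
alpha = np.abs(w).mean()

def quant_err(a):
    """Quantization error for a candidate scale a."""
    return np.sum((w - a * b_w) ** 2)

print(alpha)                                       # 1.25
print(quant_err(alpha) <= quant_err(alpha + 0.1))  # True
print(quant_err(alpha) <= quant_err(alpha - 0.1))  # True
```
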
As an example, this embodiment introduces the idea of Bayesian learning and re-expresses Formula 5 via type-II maximum likelihood estimation, obtaining Formula 6:

$$\max_{e} \; p(e \mid X, \Sigma) \qquad \text{(Formula 6)}$$

In Formula 6, $p$ denotes probability, $\Sigma$ denotes the binarized activation tensor matrix formed by the binarized activation tensors, $X$ denotes specified convolution kernel information (for example, $X$ is formed by combining the convolution kernels in the binary network model), and $e$ denotes the target binarized weight tensor to be obtained.
As another example, this embodiment characterizes the minimized target quantization error by maximum a posteriori (MAP) estimation; see Formula 7:

$$\max_{e}\; p(e \mid X, \Sigma) \;=\; \max_{e}\; p(X, \Sigma \mid e)\, p(e) \qquad \text{(Formula 7)}$$
Step 102, determining an activation tensor binarization error Lc based on the maximum likelihood estimation, and determining a weight tensor binarization error Ld based on the maximum posterior estimation; training a binary network model according to the objectives of optimizing the Lc, the Ld and the prediction error Ls; and deploying the binary network model on the terminal equipment of the Internet of things.
Optionally, in this embodiment, the specific form of the maximum likelihood function in Formula 6 is given by Formula 8. [Formula 8: equation image not recoverable from the source.] In Formula 8, $I$ denotes the original activation tensor matrix formed by the original activation tensors, and $c$ denotes a constant.

Based on Formulas 6 and 8, Lc is finally obtained, as given by Formula 9. [Formula 9: equation image not recoverable from the source.]
The determination of the activation tensor binarization error Lc based on maximum likelihood estimation is described above. The determination of the weight tensor binarization error Ld based on maximum a posteriori estimation is described as follows:
One of the probability terms in Formula 7 can be expressed by Formula 10. [Formula 10: equation image not recoverable from the source.] In Formula 10, $\sigma^2$ denotes the variance of the original weight tensor (in practice, the weights follow a Gaussian distribution).

For a binary network model such as a binary neural network, the weights are usually quantized to two numbers with the same absolute value; that is, the weights are modeled as a Gaussian mixture distribution with two peaks. Based on this, the weight prior term in Formula 7 can be expressed by Formula 11:

$$p(w) \;=\; \prod_{i=1}^{N} \frac{1}{2}\Big(\mathcal{N}(w_i;\, \mu,\, \sigma^2) + \mathcal{N}(w_i;\, -\mu,\, \sigma^2)\Big) \qquad \text{(Formula 11)}$$

In Formula 11, $N$ denotes the dimension of the weight tensor, and $\pm\mu$ denote the two peaks of the mixture, i.e., the common absolute value to which the weights are quantized.
Finally, Ld is obtained based on Formulas 7, 10, and 11; see Formula 12. [Formula 12: equation image not recoverable from the source.] In Formula 12, $\sigma^2$ denotes the variance of the original weight tensor, $\alpha$ denotes the scaling parameter, the summation runs over the possible dimensions of $\alpha$, $w^{+}$ denotes the positive entries of the binarized weight tensor, and $w^{-}$ denotes the negative entries.
Determining the weight tensor binarization error Ld based on the maximum a posteriori estimate is described above.
For any network model, there is a prediction error Ls, i.e., the error between the output value and the true value. Taking a binary network model as an example, Ls is the error between the output value obtained by feeding input data into the binary network model during training and the true value corresponding to that input data. Ls arises routinely in network model training and is not described further here.
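As a generic illustration of the prediction error Ls (the patent does not fix a particular loss function; cross-entropy is a common choice and is assumed here):

```python
import numpy as np

def cross_entropy(predicted_probs, true_label):
    """Ls for one sample: negative log-probability assigned to the true class."""
    return -np.log(predicted_probs[true_label])

probs = np.array([0.7, 0.2, 0.1])  # model output over 3 classes
print(round(cross_entropy(probs, 0), 4))  # 0.3567
```
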
Optionally, in this embodiment, after Lc, Ld, and Ls are determined as described above, a corresponding Bayesian framework can be constructed. The Bayesian framework defines the optimization targets of the binary network model training process, namely Lc, Ld, and Ls, and can be expressed by Formula 13:

$$L \;=\; L_s + \lambda L_c + \nu L_d \qquad \text{(Formula 13)}$$

In Formula 13, $L$ represents the final optimization target (the overall weighted loss), and $\lambda$ and $\nu$ are preset constants.
A binary network model is then trained based on this Bayesian framework, finally yielding the trained binary network model.
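A hedged sketch of the combined optimization target of Formula 13 (the symbols Ls, Lc, Ld and the preset constants are from the description above; the default values for the constants are hypothetical, since the patent only states that they are preset):

```python
def total_loss(Ls, Lc, Ld, lam=1e-4, nu=1e-4):
    """Overall training objective of Formula 13: the prediction error plus
    the two binarization error terms, weighted by preset constants.

    lam and nu are illustrative placeholder values, not values specified
    by the patent.
    """
    return Ls + lam * Lc + nu * Ld

print(total_loss(Ls=0.5, Lc=10.0, Ld=20.0))
```
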
It should be noted that, in this embodiment, when the binary network model is trained, Lc and Ld are considered in addition to the conventional prediction error Ls, so as to mitigate the noise in the gradient optimization process. This avoids disruption of training caused by vanishing or absent parameter gradients, ensures that the finally trained binary network model is more stable and converges better, and further improves its accuracy in application, such as the accuracy of recognized target objects (e.g., a human face) and/or abnormal behaviors (e.g., fire, robbery, fighting, and theft).
In this embodiment, the finally trained binary network model is deployed on an internet of things terminal device. Such a device can be one with limited computing resources, such as a video terminal or an access control device, better realizing intelligent capabilities on internet of things terminal devices.
Step 103: when the internet of things terminal device collects data for target identification, it inputs the collected data into the binary network model to obtain a preliminary output result, and then reports the collected data and the preliminary output result to the central server. When the central server determines from the preliminary output result that verification is needed, it inputs the collected data into a trained target network model to verify the preliminary output result and obtain the final target output result. The target network model is a trained non-binary network model.
The data collected by the internet of things terminal device for target identification differs across applications. For face recognition, the device collects data such as face pictures; for identifying targets such as vehicles, it collects data such as vehicle pictures; for identifying abnormal behaviors, it collects data such as video images. This embodiment imposes no particular limitation.
After collecting the data, the internet of things terminal device inputs it into the binary network model to obtain a preliminary output result. Note that at this point the device has only a preliminary result; the central server must still decide whether that preliminary result can serve as the final target output result. The device therefore reports the collected data and the preliminary output result to the central server, which decides whether to verify the preliminary output result.
Optionally, in this embodiment, the central server determines, according to the preliminary output result, whether it needs to be verified: for example, the server checks whether the preliminary output result meets a preset verification condition; if so, verification is needed; if not, it is not.
In this embodiment, the central server is connected to a plurality of different internet of things terminal devices. Based on this, the preset verification condition may optionally include: within a preset time period, another internet of things terminal device in the same designated area as the device that reported the preliminary output result has reported an output result that does not match the preliminary output result (for example, an output result that is not similar to it).
In this embodiment, the preliminary output result may carry a confidence of the target recognition result, for example, the confidence that an abnormal behavior such as robbery has been recognized. Based on this, the preset verification condition may include: the confidence of the target recognition result in the preliminary output result is smaller than a set confidence threshold. When the confidence of the target recognition result is smaller than the set confidence threshold, the central server must further verify the preliminary output result to ensure the accuracy of target recognition. Conversely, when the confidence is greater than or equal to the set confidence threshold, further verification of the preliminary output result may be unnecessary.
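The server-side decision described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function name `needs_verification`, the `match` predicate, and the value of `CONF_THRESHOLD` are all hypothetical.

```python
CONF_THRESHOLD = 0.8  # assumed confidence threshold; the patent only says "a set threshold"


def needs_verification(result_conf, peer_results, own_result,
                       match=lambda a, b: a == b):
    """Decide whether the central server must re-check a preliminary result.

    result_conf:  confidence the device attached to its recognition result
    peer_results: results reported by other devices in the same designated
                  area within the preset time period
    own_result:   the device's preliminary recognition result
    """
    # Condition 1: a peer device in the same area reported a mismatching result
    if any(not match(own_result, r) for r in peer_results):
        return True
    # Condition 2: the result's confidence falls below the preset threshold
    if result_conf < CONF_THRESHOLD:
        return True
    return False
```

Either condition alone is sufficient, matching the "and/or" wording of the verification condition in the claims.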
Thus, the flow shown in fig. 1 is completed.
As can be seen from the flow shown in fig. 1, in this embodiment, the target quantization error in the binary network model training process is minimized based on Bayesian learning, and the minimized target quantization error is then decomposed into a maximum likelihood estimate and a maximum a posteriori estimate, which determine the activation tensor binarization error Lc and the weight tensor binarization error Ld, respectively. When the binary network model is trained, Lc and Ld are therefore considered in addition to the conventional prediction error Ls. This mitigates the influence of noise during gradient optimization and prevents vanishing or missing parameter gradients from disrupting the training of the binary network model, so that the finally trained binary network model is more stable and converges better, which in turn improves the recognition of target objects (such as human faces) and/or abnormal behaviors (such as arson, robbery, fighting, and theft).
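The training objective summarized above, the prediction error Ls plus the two binarization error terms Lc and Ld, can be sketched as a weighted sum. The weighting coefficients `lambda_c` and `lambda_d` are assumptions; the patent does not state how the three terms are balanced.

```python
def total_loss(ls, lc, ld, lambda_c=1.0, lambda_d=1.0):
    """Combine the conventional prediction error Ls with the activation
    tensor binarization error Lc and the weight tensor binarization
    error Ld, as in the training objective described above."""
    return ls + lambda_c * lc + lambda_d * ld
```

In an actual training loop, `ls`, `lc`, and `ld` would each be differentiable tensors, so that gradients flow through all three terms during back-propagation.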
Further, in this embodiment, the internet of things terminal device performs target identification in cooperation with the central server. The terminal device identifies a preliminary result based on the binary network model; the central server then determines, from that preliminary result, whether it needs to be verified. When verification is needed, the central server inputs the data collected by the terminal device into the trained target network model to verify the preliminary result and finally obtain the target output result; when verification is not needed, the central server directly takes the preliminary result identified by the terminal device as the target output result. This reduces the load on the central server, improves the utilization of the internet of things terminal devices, and improves the robustness of the overall target identification.
It should be noted that, in this embodiment, the trained target network model and the trained binary network model are also updated on demand. For example, when the preliminary output result of the internet of things terminal device is inconsistent with the result further identified by the central server, and the number of such inconsistencies reaches a preset threshold, the target network model and the binary network model need to be updated. As another example, the target network model and the binary network model may be updated based on external triggers, such as actions by maintenance personnel.
In this embodiment, the manner of updating the binary network model is similar to the manner of training the binary network model in step 101 and step 102, and details are not repeated here.
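The first update trigger described above, retraining once device/server disagreements reach a preset threshold, can be sketched as a simple counter. The class name `UpdateTrigger`, the method names, and the default threshold are illustrative assumptions, not taken from the patent.

```python
class UpdateTrigger:
    """Flag the models for retraining after enough device/server mismatches."""

    def __init__(self, threshold=10):
        self.threshold = threshold  # assumed preset inconsistency threshold
        self.mismatches = 0

    def record(self, device_result, server_result):
        """Record one comparison; return True when retraining should start."""
        if device_result != server_result:
            self.mismatches += 1
        if self.mismatches >= self.threshold:
            self.mismatches = 0  # reset the counter after triggering an update
            return True
        return False
```

An external trigger (e.g. maintenance personnel) would bypass the counter entirely and start the update directly.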
The following describes a system provided in an embodiment of the present application:
referring to fig. 2, fig. 2 is a system structure diagram provided in the embodiment of the present application. As shown in fig. 2, the system may include: internet of things terminal equipment and central server.
In this embodiment, the internet of things terminal device deploys a binary network model. The binary network model is trained according to targets of an optimized activation tensor binarization error Lc, a weight tensor binarization error Ld and a prediction error Ls; the activation tensor binarization error Lc is determined based on maximum likelihood estimation, the weight tensor binarization error Ld is determined based on maximum posterior estimation, and the maximum likelihood estimation and the maximum posterior estimation are obtained by decomposing a target quantization error in a binary network model training process after the target quantization error is minimized based on Bayes learning; the target quantization error refers to an error caused by forward propagation and backward propagation in the training process of the binary network model;
when data used for target identification are collected, the terminal equipment of the Internet of things inputs the collected data into the binary network model to obtain a primary output result, and reports the data and the primary output result to a central server;
the central server is used for determining, according to the preliminary output result, whether the preliminary output result needs to be verified; if not, determining the preliminary output result as the target output result; if so, inputting the data to a trained target network model to obtain the final target output result; the target network model is a trained non-binary network model.
Optionally, in this embodiment, the determining, by the central server, whether the preliminary output result needs to be verified according to the preliminary output result includes:
identifying whether the preliminary output result meets a preset verification condition, wherein the preset verification condition at least comprises the following steps: reporting other output results which are not matched with the primary output result by other internet of things terminal equipment in the same designated area with the internet of things terminal equipment within a preset time period, and/or enabling the confidence degree of the target identification result in the primary output result to be smaller than a set confidence degree threshold value;
if so, determining that the preliminary output result needs to be verified, and if not, determining that the preliminary output result does not need to be verified.
The following describes the apparatus provided in the embodiments of the present application:
referring to fig. 3, fig. 3 is a structural diagram of an apparatus according to an embodiment of the present disclosure. As shown in fig. 3, the apparatus may include:
a model unit which runs a binary network model; the binary network model is trained according to targets of an optimized activation tensor binarization error Lc, a weight tensor binarization error Ld and a prediction error Ls; the activation tensor binarization error Lc is determined based on maximum likelihood estimation, the weight tensor binarization error Ld is determined based on maximum posterior estimation, and the maximum likelihood estimation and the maximum posterior estimation are obtained by decomposing a target quantization error in a binary network model training process after the target quantization error is minimized based on Bayes learning; the target quantization error refers to an error caused by forward propagation and backward propagation in the training process of the binary network model; the Ls refers to an error between an output value obtained by inputting input data into a binary network model in a training process of the binary network model and a true value corresponding to the input data;
the identification unit is used for inputting the collected data for target identification into the binary network model to obtain a preliminary output result; reporting the data and the preliminary output result to a central server, so that the central server inputs the data to a trained target network model to verify the preliminary output result to finally obtain a target output result when determining that the preliminary output result needs to be verified according to the preliminary output result; the target network model is a trained non-binary network model.
Optionally, the minimized target quantization error is represented by the following formula:

min ‖w − αŵ‖²

where w represents the original weight tensor, ŵ represents the binarized weight tensor obtained by binarizing the original weight tensor, and α represents the scaling parameter.
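The binarization step implied by the quantization error above can be sketched numerically. Taking the binarized tensor as sign(w) and the scaling parameter as the mean absolute value of w is the common closed-form choice in the binary-network literature; the patent itself does not state how α is computed, so this is an assumption.

```python
def binarize(w):
    """Return (alpha, w_hat) approximately minimizing ||w - alpha * w_hat||^2
    with the entries of w_hat restricted to {-1, +1}."""
    alpha = sum(abs(x) for x in w) / len(w)          # scaling parameter
    w_hat = [1.0 if x >= 0 else -1.0 for x in w]     # binarized weight tensor
    return alpha, w_hat


def quant_error(w):
    """Quantization error ||w - alpha * w_hat||^2 for a flat weight list."""
    alpha, w_hat = binarize(w)
    return sum((x - alpha * h) ** 2 for x, h in zip(w, w_hat))
```

For a real network the same computation would run per layer (or per channel) on the weight tensors rather than on flat Python lists.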
Optionally, the maximum likelihood estimate is represented by the following formula:

[equation image]

where p represents probability, Σ represents the binarized activation tensor matrix formed by binarizing the activation tensor, X represents the specified convolution kernel information, and e represents the target binarized weight tensor to be obtained;

said Lc is represented by the following formula:

[equation image]

where I denotes the original activation tensor matrix formed from the original activation tensor.
Optionally, the maximum a posteriori estimate is represented by the following formula:

[equation image]

where p represents probability, X represents the specified convolution kernel information, w represents the original weight tensor, and e represents the target binarized weight tensor to be obtained;

the Ld is represented by the following formula:

[equation image]

where the remaining symbols (rendered only as images in the original) denote, respectively, the variance of the original weight tensor, the dimension of the scaling parameter α, the scaling parameter α itself, and the positive and negative entries of the binarized weight tensor.
Optionally, the original activation tensor is an activation tensor of the target network model.
Optionally, the original weight tensor is a weight tensor of the target network model.
Optionally, the training data and the test data of the target network model and the binarization network model are the same.
Thus, the description of the structure of the device shown in fig. 3 is completed.
The embodiment of the application also provides a hardware structure of the device shown in fig. 3. Referring to fig. 4, fig. 4 is a structural diagram of an electronic device according to an embodiment of the present application. As shown in fig. 4, the hardware structure may include: a processor and a machine-readable storage medium having stored thereon machine-executable instructions executable by the processor; the processor is configured to execute machine-executable instructions to implement the methods disclosed in the above examples of the present application.
Based on the same application concept as the method, embodiments of the present application further provide a machine-readable storage medium, where several computer instructions are stored, and when the computer instructions are executed by a processor, the method disclosed in the above example of the present application can be implemented.
The machine-readable storage medium may be any electronic, magnetic, optical, or other physical storage device that can contain or store information such as executable instructions and data. For example, the machine-readable storage medium may be: a RAM (Random Access Memory), a volatile memory, a non-volatile memory, a flash memory, a storage drive (e.g., a hard drive), a solid state drive, any type of storage disk (e.g., an optical disk or DVD), a similar storage medium, or a combination thereof.
The systems, devices, modules or units illustrated in the above embodiments may be implemented by a computer chip or an entity, or by a product with certain functions. A typical implementation device is a computer, which may take the form of a personal computer, laptop computer, cellular telephone, camera phone, smart phone, personal digital assistant, media player, navigation device, email messaging device, game console, tablet computer, wearable device, or a combination of any of these devices.
For convenience of description, the above devices are described as being divided into various units by function, and are described separately. Of course, the functionality of the units may be implemented in one or more software and/or hardware when implementing the present application.
As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
Furthermore, these computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The above description is only an example of the present application and is not intended to limit the present application. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims (10)

1. A collaborative target identification method based on a decorrelation binary network is characterized by comprising the following steps:
decomposing a target quantization error in a binary network model training process into two parts after minimizing the target quantization error based on Bayes learning, wherein one part is represented by maximum likelihood estimation, and the other part is represented by maximum posterior estimation; the target quantization error refers to an error caused by forward propagation and backward propagation in the training process of the binary network model;
determining an activation tensor binarization error Lc based on the maximum likelihood estimation, and determining a weight tensor binarization error Ld based on the maximum a posteriori estimation; training a binary network model according to the objectives of optimizing the Lc, the Ld and the prediction error Ls; the Ls refers to an error between an output value obtained by inputting input data into a binary network model in a training process of the binary network model and a true value corresponding to the input data; deploying the binary network model on the terminal equipment of the Internet of things;
when the terminal equipment of the Internet of things collects data used for target identification, the collected data are input into the binary network model to obtain a primary output result; reporting the data and the preliminary output result to a central server, so that the central server inputs the data to a trained target network model to verify the preliminary output result to finally obtain a target output result when determining that the preliminary output result needs to be verified according to the preliminary output result; the target network model is a trained non-binary network model.
2. The method of claim 1,
the minimized target quantization error is represented by the following equation:
min ‖w − αŵ‖²

where w represents the original weight tensor, ŵ represents the binarized weight tensor obtained by binarizing the original weight tensor, and α represents the scaling parameter.
3. The method of claim 1 or 2, wherein the maximum likelihood estimate is represented by:
[equation image]

wherein p represents probability, Σ represents a binarized activation tensor matrix formed by binarizing the activation tensor, X represents specified convolution kernel information in the binary network model, and e represents the target binarized weight tensor to be obtained;

said Lc is represented by the following formula:

[equation image]

where I denotes an original activation tensor matrix formed from the original activation tensor.
4. The method according to claim 1 or 2, wherein the maximum a posteriori estimate is represented by:
[equation image];
wherein p represents probability, X represents specified convolution kernel information in the binary network model, w represents an original weight tensor, and e represents a target binarization weight tensor to be obtained;
the Ld is represented by the following formula:
[equation image]

where the remaining symbols (rendered only as images in the original) denote, respectively, the variance of the original weight tensor, the dimension of the scaling parameter α, the scaling parameter α itself, and the positive and negative entries of the binarized weight tensor.
5. The method of claim 3, wherein the original activation tensor is an activation tensor for the target network model.
6. The method of claim 4, wherein the original weight tensor is a weight tensor of the target network model.
7. The method according to claim 1, wherein the training data and the testing data of the target network model and the binarized network model are the same.
8. A collaborative target identification system based on a decorrelation binary network is characterized by comprising: the system comprises the Internet of things terminal equipment and a central server;
the terminal equipment of the Internet of things is provided with a binary network model; the binary network model is trained according to targets of an optimized activation tensor binarization error Lc, a weight tensor binarization error Ld and a prediction error Ls; the activation tensor binarization error Lc is determined based on maximum likelihood estimation, the weight tensor binarization error Ld is determined based on maximum posterior estimation, and the maximum likelihood estimation and the maximum posterior estimation are obtained by decomposing a target quantization error in a binary network model training process after the target quantization error is minimized based on Bayes learning; the target quantization error refers to an error caused by forward propagation and backward propagation in the training process of the binary network model;
when data used for target identification are collected, the terminal equipment of the Internet of things inputs the collected data into the binary network model to obtain a primary output result, and reports the data and the primary output result to a central server;
the central server is used for determining, according to the preliminary output result, whether the preliminary output result needs to be verified; if not, determining the preliminary output result as the target output result; if so, inputting the data to a trained target network model to obtain the final target output result; the target network model is a trained non-binary network model.
9. The system of claim 8, wherein the central server determining whether the preliminary output result needs to be verified based on the preliminary output result comprises:
identifying whether the preliminary output result meets a preset verification condition, wherein the preset verification condition at least comprises the following steps: reporting other output results which are not matched with the primary output result by other internet of things terminal equipment in the same designated area with the internet of things terminal equipment within a preset time period, and/or enabling the confidence degree of the target identification result in the primary output result to be smaller than a set confidence degree threshold value;
if so, determining that the preliminary output result needs to be verified, and if not, determining that the preliminary output result does not need to be verified.
10. An electronic device, comprising: a processor and a machine-readable storage medium;
the machine-readable storage medium stores machine-executable instructions executable by the processor;
the processor is configured to execute machine executable instructions to implement the method steps of any of claims 1-7.
CN202210023260.5A 2022-01-10 2022-01-10 Collaborative target identification method, system and device based on decorrelation binary network Active CN114049539B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210023260.5A CN114049539B (en) 2022-01-10 2022-01-10 Collaborative target identification method, system and device based on decorrelation binary network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210023260.5A CN114049539B (en) 2022-01-10 2022-01-10 Collaborative target identification method, system and device based on decorrelation binary network

Publications (2)

Publication Number Publication Date
CN114049539A true CN114049539A (en) 2022-02-15
CN114049539B CN114049539B (en) 2022-04-26

Family

ID=80213539

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210023260.5A Active CN114049539B (en) 2022-01-10 2022-01-10 Collaborative target identification method, system and device based on decorrelation binary network

Country Status (1)

Country Link
CN (1) CN114049539B (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612750A (en) * 2022-05-09 2022-06-10 杭州海康威视数字技术股份有限公司 Target identification method and device for adaptive learning rate collaborative optimization and electronic equipment

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170286830A1 (en) * 2016-04-04 2017-10-05 Technion Research & Development Foundation Limited Quantized neural network training and inference
CN108537147A (en) * 2018-03-22 2018-09-14 东华大学 A kind of gesture identification method based on deep learning
CN109871461A (en) * 2019-02-13 2019-06-11 华南理工大学 The large-scale image sub-block search method to be reordered based on depth Hash network and sub-block
CN110956263A (en) * 2019-11-14 2020-04-03 深圳华侨城文化旅游科技集团有限公司 Construction method of binarization neural network, storage medium and terminal equipment
WO2020243922A1 (en) * 2019-06-05 2020-12-10 Intel Corporation Automatic machine learning policy network for parametric binary neural networks
CN113537462A (en) * 2021-06-30 2021-10-22 华为技术有限公司 Data processing method, neural network quantization method and related device
CN113592065A (en) * 2021-07-07 2021-11-02 中国人民解放军国防科技大学 Training method and training device for binary neural network

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170286830A1 (en) * 2016-04-04 2017-10-05 Technion Research & Development Foundation Limited Quantized neural network training and inference
CN108537147A (en) * 2018-03-22 2018-09-14 东华大学 A kind of gesture identification method based on deep learning
CN109871461A (en) * 2019-02-13 2019-06-11 华南理工大学 The large-scale image sub-block search method to be reordered based on depth Hash network and sub-block
WO2020243922A1 (en) * 2019-06-05 2020-12-10 Intel Corporation Automatic machine learning policy network for parametric binary neural networks
CN110956263A (en) * 2019-11-14 2020-04-03 深圳华侨城文化旅游科技集团有限公司 Construction method of binarization neural network, storage medium and terminal equipment
CN113537462A (en) * 2021-06-30 2021-10-22 华为技术有限公司 Data processing method, neural network quantization method and related device
CN113592065A (en) * 2021-07-07 2021-11-02 中国人民解放军国防科技大学 Training method and training device for binary neural network

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
DAVID LEAL ET AL.: "Training Ensembles of Quantum Binary Neural Networks", 《2021 INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS》 *
HAOTONG QIN ET AL.: "Forward and Backward Information Retention for Accurate Binary Neural Networks", 《CVPR》 *
JIAXIN GU ET AL.: "Projection Convolutional Neural Networks for 1-bit CNNs via Discrete Back Propagation", 《ARXIV》 *
X MENG ET AL.: "Training Binary Neural Networks Using the Bayesian Learning Rule", 《ARXIV》 *
XU LI: "Research on Conditional Random Field Models for Image Annotation", 《China Doctoral Dissertations Full-text Database, Information Science and Technology》 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114612750A (en) * 2022-05-09 2022-06-10 杭州海康威视数字技术股份有限公司 Target identification method and device for adaptive learning rate collaborative optimization and electronic equipment
CN114612750B (en) * 2022-05-09 2022-08-19 杭州海康威视数字技术股份有限公司 Target identification method and device for adaptive learning rate collaborative optimization and electronic equipment

Also Published As

Publication number Publication date
CN114049539B (en) 2022-04-26

Similar Documents

Publication Publication Date Title
CN108776834B (en) System reinforcement learning method and device, electronic equipment and computer storage medium
CN110956255A (en) Difficult sample mining method and device, electronic equipment and computer readable storage medium
JP2017097585A (en) Learning device, program, and learning method
US20230274150A1 (en) Performing Inference And Training Using Sparse Neural Network
CN111046027A (en) Missing value filling method and device for time series data
US20200065664A1 (en) System and method of measuring the robustness of a deep neural network
JP2021039640A (en) Learning device, learning system, and learning method
CN108885787A (en) Method for training image restoration model, image restoration method, device, medium, and apparatus
CN112101207A (en) Target tracking method and device, electronic equipment and readable storage medium
WO2021012263A1 (en) Systems and methods for end-to-end deep reinforcement learning based coreference resolution
CN114049539B (en) Collaborative target identification method, system and device based on decorrelation binary network
CN111611390B (en) Data processing method and device
CN112966754A (en) Sample screening method, sample screening device and terminal equipment
CN108875502B (en) Face recognition method and device
CN114091594A (en) Model training method and device, equipment and storage medium
KR20220030108A (en) Method and system for training artificial neural network models
CN115358485A (en) Traffic flow prediction method based on graph self-attention mechanism and Hox process
CN115439708A (en) Image data processing method and device
CN114581966A (en) Method, electronic device and computer program product for information processing
CN114155388A (en) Image recognition method and device, computer equipment and storage medium
CN112906728A (en) Feature comparison method, device and equipment
CN111353428A (en) Action information identification method and device, electronic equipment and storage medium
CN111062468A (en) Training method and system for generating network, and image generation method and equipment
CN113850302B (en) Incremental learning method, device and equipment
CN113836438B (en) Method, electronic device, and storage medium for post recommendation

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant