WO2018068421A1 - Method and device for optimizing neural network - Google Patents

Method and device for optimizing neural network

Info

Publication number
WO2018068421A1
Authority
WO
WIPO (PCT)
Prior art keywords
neural network
weight parameter
parameter matrix
target
deleted
Application number
PCT/CN2016/113271
Other languages
French (fr)
Chinese (zh)
Inventor
张玉兵
Original Assignee
广州视源电子科技股份有限公司
Application filed by 广州视源电子科技股份有限公司
Publication of WO2018068421A1

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G06N3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Definitions

  • Embodiments of the present invention relate to the field of artificial neural network technologies, and in particular, to a method and an apparatus for optimizing a neural network.
  • Existing optimization methods for neural network models cannot completely solve the above problems.
  • For example, optimization in the form of Huffman coding can preserve the processing accuracy of the optimized neural network model and effectively reduce the storage space of a deep neural network model, but it cannot reduce the complexity of the processing operations or shorten the running time, nor can it reduce the memory or video memory occupied during processing.
  • Embodiments of the invention provide a method and a device for optimizing a neural network, which can optimize the neural network so as to shorten the running time and reduce the device resources occupied.
  • An embodiment of the present invention provides a method for optimizing a neural network, including: acquiring an initial neural network that meets a set precision condition, and determining a weight parameter matrix between two adjacent layers of unit nodes in the initial neural network; processing the weight parameter matrix according to a set deletion threshold, and determining the unit nodes to be deleted in the initial neural network; and deleting the unit nodes to form an optimized target neural network.
  • An embodiment of the present invention provides an apparatus for optimizing a neural network, including:
  • a parameter matrix determining module, configured to acquire an initial neural network that meets a set precision condition, and determine a weight parameter matrix between two adjacent layers of unit nodes in the initial neural network;
  • a to-be-deleted node determining module, configured to process the weight parameter matrix according to a set deletion threshold, and determine the unit nodes to be deleted in the initial neural network; and
  • a target network determining module, configured to delete the unit nodes to form an optimized target neural network.
  • A method and a device for optimizing a neural network are provided. The method first obtains an initial neural network that meets a set precision condition and determines the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network; it then processes the determined weight parameter matrix according to a set deletion threshold to determine the unit nodes to be deleted in the initial neural network; finally, it deletes the determined unit nodes from the initial neural network to form the optimized target neural network. With this method, a neural network can be compressed simply and efficiently, achieving optimization of the neural network; as a result, when face recognition is performed based on the optimized neural network, recognition is faster, recognition time is shorter, and less storage, running memory, and video memory are occupied.
  • FIG. 1 is a schematic flowchart of a method for optimizing a neural network according to Embodiment 1 of the present invention;
  • FIG. 2a is a schematic flowchart of a method for optimizing a neural network according to Embodiment 2 of the present invention;
  • FIG. 2b is a structural diagram of a trained initial neural network according to Embodiment 2 of the present invention;
  • FIG. 2c is a structural diagram of the target neural network formed by optimizing the initial neural network in Embodiment 2 of the present invention;
  • FIG. 3a is a schematic flowchart of a method for optimizing a neural network according to Embodiment 3 of the present invention;
  • FIG. 3b is a structural diagram of the target neural network formed by depth optimization of the original target neural network in Embodiment 3 of the present invention;
  • FIG. 4 is a structural block diagram of an apparatus for optimizing a neural network according to Embodiment 4 of the present invention.
  • FIG. 1 is a schematic flowchart of a method for optimizing a neural network according to Embodiment 1 of the present invention. The method is applicable to compression optimization of a neural network after training; it may be performed by a neural network optimization device, which can be implemented in software and/or hardware and is generally integrated on the terminal device or server platform where the neural network model resides.
  • Generally, a neural network here means an artificial neural network, which can be regarded as an algorithmic mathematical model that imitates the behavior of animal neural networks and performs distributed parallel information processing. The unit nodes of a neural network are divided into at least three layers, comprising an input layer, a hidden layer, and an output layer; the input layer and the output layer each comprise a single layer of unit nodes, the hidden layer comprises at least one layer of unit nodes, and the number of unit nodes in each layer can be set for different applications.
  • The input layer of the neural network receives the input data and distributes it to the hidden layer; the hidden layer computes on the received data and passes the results to the output layer; the output layer outputs the computed result. It can be understood that data transmission and processing in a neural network are realized mainly through the connections between unit nodes of adjacent layers and the weight parameter values corresponding to those connections.
  • At present, pattern recognition (such as face recognition) can be performed based on a neural network. Before pattern recognition, the created neural network must be trained, and pattern recognition can be performed only once the processing precision of the neural network is determined to satisfy the application requirements. It should be noted that, in practical pattern recognition, the trained neural network is generally very large, which not only lengthens the running time of the recognition process but also occupies more storage space, running memory, and video memory; the trained neural network can therefore be optimized with the neural network optimization method provided in this embodiment to solve these problems.
  • The method for optimizing a neural network according to Embodiment 1 of the present invention includes the following operations:
  • S101. Acquire an initial neural network that meets a set precision condition, and determine the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network.
  • In this embodiment, the set precision condition can be understood as the range of processing precision that the neural network must reach in actual application processing after training. Generally, the set precision condition may be a system default range or a manually set range. The trained neural network can determine its current processing precision by processing the sample data of a standard test set; when the currently determined processing precision meets the set precision condition, the neural network is considered ready for actual application processing and may be called the initial neural network.
  • Generally, the training of a neural network is implemented with a chosen training learning algorithm; since such algorithms are mature technology, they are not detailed here. It can be understood that the training process of a neural network is, in effect, the process by which the weight parameter values corresponding to the connections between unit nodes of adjacent layers are continually updated and finally fixed.
  • In this embodiment, after the initial neural network is acquired, the weight parameter matrix between two adjacent layers of unit nodes can be determined from the weight parameter values corresponding to the connections between those unit nodes. If there is only one connection between any unit node of one layer and any unit node of the other layer, a single weight parameter value exists between the two nodes, and a two-dimensional weight parameter matrix can be determined between the two layers. If there are at least two connections between any unit node of one layer and any unit node of the other layer, or the connection relationship between the two nodes must be represented by a function, a weight parameter array (possibly a one-dimensional or a two-dimensional array) exists between the two nodes, and a multi-dimensional weight parameter matrix can be determined between the two layers.
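  • As an illustration (not part of the patent text), the two cases above correspond to familiar layer types in modern frameworks; the minimal sketch below, assuming a PyTorch-style model with made-up layer sizes, shows the shape each kind of weight parameter matrix takes:

```python
# Illustrative sketch only: weight shapes for the two connection types.
import torch.nn as nn

# Full connection: one weight value per pair of unit nodes, i.e. a
# two-dimensional weight parameter matrix. nn.Linear stores it as
# (out_features, in_features), the transpose of the patent's
# (first layer x second layer) layout.
fc = nn.Linear(5, 4)
print(fc.weight.shape)    # torch.Size([4, 5])

# Convolution connection: each node (channel) pair is linked by a 2-D
# kernel, i.e. a weight parameter array, giving a four-dimensional
# weight parameter matrix overall.
conv = nn.Conv2d(in_channels=3, out_channels=8, kernel_size=3)
print(conv.weight.shape)  # torch.Size([8, 3, 3, 3])
```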
  • S102. Process the weight parameter matrix according to the set deletion threshold, and determine the unit nodes to be deleted in the initial neural network.
  • In this embodiment, the initial neural network is optimized by deleting unit nodes from it, so the unit nodes to be deleted must be determined. Specifically, the set deletion threshold is first obtained; then the relationship between the absolute value of each element of the weight parameter matrix and the deletion threshold is determined; finally, the unit nodes to be deleted in the initial neural network are determined from that relationship.
  • The deletion threshold may be set based on the basic distribution of the element values in the weight parameter matrix, and the set deletion threshold is a real number greater than 0. In practice, the element values of the weight parameter matrix are distributed around the value 0, generally within the range (-0.05, 0.05); in that case a deletion threshold of 0.001 is preferred.
  • The deletion threshold only needs to be set when the neural network optimization is performed for the first time; in subsequent optimization cycles it may be kept unchanged, or changed automatically based on other rules. In addition, weight parameter matrices of different dimensions call for different deletion thresholds, so the threshold should be set according to the specific situation.
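  • As a sketch of how a threshold might be derived from that distribution (the helper below is hypothetical, not part of the patent; the 95th-percentile rule and the divisor are assumptions):

```python
# Hypothetical helper: choose a deletion threshold from the basic
# distribution of the weights (patent example: weights within
# (-0.05, 0.05), threshold 0.001).
import numpy as np

def choose_deletion_threshold(weights, default=1e-3):
    bulk = np.percentile(np.abs(weights), 95)  # magnitude of the bulk
    # Stay well below the bulk of the distribution; use the patent's
    # example value 0.001 as a ceiling/fallback.
    return min(default, bulk / 50.0) if bulk > 0 else default

rng = np.random.default_rng(0)
W = rng.normal(0.0, 0.02, size=(5, 4))  # roughly within (-0.05, 0.05)
print(choose_deletion_threshold(W))     # a value at or below 0.001
```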
  • S103. Delete the unit nodes to form the optimized target neural network: once the unit nodes to be deleted are determined, they are deleted from the initial neural network together with the relationships and connections involving them, and the network finally formed after optimization is called the target neural network.
  • The method for optimizing a neural network according to Embodiment 1 of the present invention first obtains an initial neural network that meets a set precision condition and determines the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network; it then processes the determined weight parameter matrix according to a set deletion threshold to determine the unit nodes to be deleted; finally, it deletes the determined unit nodes from the initial neural network to form the optimized target neural network. With this method, a neural network can be compressed simply and efficiently, so that face recognition based on the optimized neural network is faster, takes less recognition time, and occupies less storage, running memory, and video memory.
  • FIG. 2a is a schematic flowchart of a method for optimizing a neural network according to Embodiment 2 of the present invention.
  • This embodiment of the present invention is optimized on the basis of the above embodiment. In this embodiment, determining the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network is refined as follows: if the connection between two adjacent layers of unit nodes in the initial neural network is a full connection, a two-dimensional weight parameter matrix is formed based on the weight parameter values corresponding to the connections between the unit nodes; if the connection is a convolution connection, a multi-dimensional weight parameter matrix is formed based on the weight parameter arrays corresponding to the connections between the unit nodes.
  • Further, processing the weight parameter matrix according to the set deletion threshold and determining the unit nodes to be deleted in the initial neural network is refined as follows: if the weight parameter matrix is a two-dimensional weight parameter matrix, the unit nodes to be deleted are determined based on the column vectors of the two-dimensional weight parameter matrix; if the weight parameter matrix is a multi-dimensional weight parameter matrix, the unit nodes to be deleted are determined based on the dimension-reduced multi-dimensional weight parameter matrix.
  • The method for optimizing a neural network according to Embodiment 2 of the present invention specifically includes the following operations:
  • S201. Acquire an initial neural network that meets the set precision condition: after the created neural network is trained, if its current processing precision meets the set precision condition, the neural network may be determined as the initial neural network to be acquired.
  • S202. If the connection between two adjacent layers of unit nodes in the initial neural network is a full connection, form a two-dimensional weight parameter matrix based on the weight parameter values corresponding to the connections between the unit nodes, and then perform step S204.
  • When the connection between two adjacent layers of unit nodes consists only of single-line connections, the connection may be called a full connection; in this case there is exactly one weight parameter value between any unit node of one layer and any unit node of the other layer, so the two-dimensional weight parameter matrix can be constructed from the weight parameter values corresponding to the connections between the two layers of unit nodes.
  • FIG. 2b is a structural diagram of a trained initial neural network according to Embodiment 2 of the present invention. As shown in FIG. 2b, the initial neural network has four layers of unit nodes: the first layer serves as the input layer, the second and third layers serve as hidden layers, and the fourth layer serves as the output layer; the connections between adjacent layers of unit nodes in this network are all full connections. Taking the connections between the first-layer and second-layer unit nodes as an example, denote by w_mn the weight parameter value corresponding to the connection between the m-th unit node of the first layer and the n-th unit node of the second layer, where 1 ≤ m ≤ 5 and 1 ≤ n ≤ 4; a 5 × 4 two-dimensional weight parameter matrix W_{5×4} = (w_mn) can then be formed between the unit nodes of the first and second layers.
  • S203. If the connection between two adjacent layers of unit nodes in the initial neural network is a convolution connection, form a multi-dimensional weight parameter matrix based on the weight parameter arrays corresponding to the connections between the unit nodes, and then perform step S205.
  • When the connection between two adjacent layers of unit nodes is a multi-line connection (at least two connections exist) or the connection relationship is represented by a function, the connection may be called a convolution connection. In this case, a weight parameter array (generally a one-dimensional or a two-dimensional array) may exist between any unit node of one layer and any unit node of the other layer, so a multi-dimensional weight parameter matrix can be constructed from the weight parameter arrays corresponding to the connections between the two layers: when the weight parameter array is one-dimensional, a three-dimensional weight parameter matrix is formed; when it is two-dimensional, a four-dimensional weight parameter matrix is formed.
  • S204. Determining the unit nodes to be deleted in the initial neural network based on the column vectors of the two-dimensional weight parameter matrix includes: acquiring the column vector of the i-th column of the two-dimensional weight parameter matrix; and, if the weight parameter values in that column vector are all smaller than the set first deletion threshold, determining the i-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.
  • If the formed two-dimensional weight parameter matrix is of order g × h, it can be determined that the matrix has h column vectors, and that the weight parameter values in the i-th column relate to the i-th unit node of the second of the two adjacent layers.
  • The determination process for a unit node to be deleted may be expressed as follows: acquire the set first deletion threshold, select the column vector of the i-th column of the two-dimensional weight parameter matrix, and determine whether the g weight parameter values it contains are all smaller than the first deletion threshold; if they all are, the i-th unit node of the second of the two adjacent layers may be determined as a unit node to be deleted in the initial neural network.
  • For example, denote a weight parameter value in the g × h two-dimensional weight parameter matrix as w_ai and the acquired first deletion threshold as t_1, where 1 ≤ a ≤ g and 1 ≤ i ≤ h; determining whether the g weight parameter values in the i-th column vector are all smaller than t_1 then amounts to checking whether w_ai < t_1 holds for every a from 1 to g.
  • When the first deletion threshold is set for the first time, it may preferably be set based on the basic distribution of the weight parameter values in the two-dimensional weight parameter matrix.
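  • A minimal NumPy sketch of this column rule follows (illustration only, not the patent's code), using a made-up 5 × 4 matrix in the layout of FIG. 2b (rows: first-layer units, columns: second-layer units); absolute values are compared against t1, matching the absolute-value comparison described in Embodiment 1:

```python
# Column rule for a fully connected layer: mark the i-th unit node of
# the second layer for deletion when every weight in column i of the
# g x h matrix is below the first deletion threshold t1.
import numpy as np

def nodes_to_delete(W, t1):
    below = np.abs(W) < t1  # elementwise comparison against t1
    return [i for i in range(W.shape[1]) if below[:, i].all()]

W = np.array([[ 0.02,  0.0004, -0.03,  0.01],
              [ 0.01, -0.0002,  0.04, -0.02],
              [-0.05,  0.0006,  0.02,  0.03],
              [ 0.03, -0.0001, -0.01,  0.04],
              [ 0.02,  0.0003,  0.05, -0.01]])
print(nodes_to_delete(W, t1=0.001))  # [1]: the 2nd unit of layer 2
```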
  • For two adjacent layers of unit nodes connected by convolution, a multi-dimensional weight parameter matrix may be formed. In that case the multi-dimensional weight parameter matrix must first undergo dimensionality reduction, forming a two-dimensional target weight parameter matrix after the reduction, after which the unit nodes to be deleted can be determined from the target weight parameter matrix.
  • S205. Determining the unit nodes to be deleted in the initial neural network based on the dimension-reduced multi-dimensional weight parameter matrix includes: performing dimensionality reduction on the multi-dimensional weight parameter matrix to form a two-dimensional target weight parameter matrix; acquiring the column vector of the j-th column of the target weight parameter matrix; and, if the element values in that column vector are all smaller than the set second deletion threshold, determining the j-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.
  • The dimensionality reduction of the multi-dimensional weight parameter matrix into a two-dimensional target weight parameter matrix may be described as follows. First determine the dimension of the multi-dimensional weight parameter matrix and its element values. If the matrix is a three-dimensional weight parameter matrix, each element is usually a one-dimensional array; the data in each array can be summed directly, and the resulting sum (or the average of the sum) is used as the corresponding element value of the dimension-reduced target weight parameter matrix. If the matrix is a four-dimensional weight parameter matrix, each element is usually a two-dimensional array; the data in each column of the array are first summed to obtain the column sums, the column sums are then summed, and the final sum (or the average of the sum) is used as the corresponding element value of the dimension-reduced target weight parameter matrix.
  • The unit nodes to be deleted can then be determined in the formed target weight parameter matrix; the procedure is the same as the unit node determination of step S204 and is not repeated here. When the second deletion threshold is set for the first time, it may preferably be set based on the basic distribution of the element values in the target weight parameter matrix.
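  • The following sketch (illustration only; shapes and names are assumptions) reduces a four-dimensional weight parameter matrix of shape (g, h, kh, kw) to a g × h target matrix by summing each kernel, optionally averaging as the description allows, and then applies the j-th-column rule with the second deletion threshold t2:

```python
# Dimensionality reduction for a convolution connection, then the
# column rule on the resulting two-dimensional target matrix.
import numpy as np

def reduce_to_target(W4, average=True):
    sums = W4.sum(axis=(2, 3))  # sum each kernel's columns, then the sums
    return sums / (W4.shape[2] * W4.shape[3]) if average else sums

def conv_nodes_to_delete(W4, t2):
    T = reduce_to_target(W4)
    below = np.abs(T) < t2
    return [j for j in range(T.shape[1]) if below[:, j].all()]

W4 = np.random.default_rng(1).normal(0.0, 0.02, size=(4, 3, 3, 3))
W4[:, 1, :, :] *= 1e-4  # make the 2nd unit's kernels negligible
print(conv_nodes_to_delete(W4, t2=0.001))  # typically prints [1]
```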
  • After the above steps, all the unit nodes to be deleted in the initial neural network can be determined; all of the determined unit nodes are then deleted from the initial neural network, and the connections and relationships involving those unit nodes are deleted at the same time.
  • FIG. 2c is a structural diagram of the target neural network formed by optimizing the initial neural network in Embodiment 2 of the present invention. As shown in FIG. 2c, compared with the initial neural network of FIG. 2b, one unit node has been removed from the second layer; the optimization method according to Embodiment 2 thus realizes the optimization of deleting the second unit node of the second layer of the initial neural network of FIG. 2b.
  • The method for optimizing a neural network according to Embodiment 2 of the present invention specifically describes how the corresponding weight parameter matrix is determined for the different connection modes between adjacent layers of unit nodes, and how the unit nodes to be deleted are determined for the different kinds of weight parameter matrix. With this method, neural networks with different unit node connection modes can be compressed and optimized, so that face recognition based on the optimized neural network is faster, takes less recognition time, and occupies less storage, running memory, and video memory.
  • FIG. 3a is a schematic flowchart of a method for optimizing a neural network according to Embodiment 3 of the present invention.
  • This embodiment of the present invention is optimized on the basis of the foregoing embodiments. The following operation is added: determining whether the current processing precision of the target neural network meets the set precision condition, and performing training learning or depth optimization on the target neural network based on the determination result.
  • Specifically, performing training learning or depth optimization based on the determination result is refined as follows: if the current processing precision does not meet the set precision condition, the target neural network is trained until the condition is met or the set number of training iterations is reached; otherwise, the deletion threshold is self-incremented and the target neural network is used as a new initial neural network on which the neural network optimization operations are re-executed, where the deletion threshold is the first deletion threshold or the second deletion threshold.
  • The optimization also covers another case: if the target neural network after training meets the set precision condition and the number of training iterations does not exceed the set number, the deletion threshold is likewise self-incremented and the target neural network is used as a new initial neural network on which the neural network optimization operations are re-executed.
  • The method for optimizing a neural network according to Embodiment 3 of the present invention specifically includes the following operations:
  • In this embodiment, the acquired initial neural network is specifically used for face recognition, and the set precision condition can be regarded as the processing precision required of the neural network in face recognition. The optimized target neural network is likewise used for face recognition, so it must also be tested whether the current processing precision of the target neural network meets the set precision condition.
  • The test of the current processing precision of the target neural network may be described as follows. First, the sample images required for the test are selected according to the rules of an international standard face verification test set; preferably, 3000 positive sample image pairs (a positive pair is two images containing the same face) and 3000 negative sample image pairs (a negative pair is two images containing different faces) may be selected. The positive and negative sample image pairs are then used as the input data of the target neural network. Finally, the current processing precision of the target neural network can be determined from the values of the face recognition output X and the calculation formula for the current processing precision.
  • When the target neural network determines that the two images in a positive sample pair are the same person, the output X has the value 1, and otherwise 0; when the target neural network determines that the two images in a negative sample pair are not the same person, the output X has the value 1, and otherwise 0.
  • The calculation formula for the current processing precision can be expressed as:
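  • The formula itself is not reproduced in this text; the following is an assumed reconstruction. Given 3000 positive and 3000 negative sample pairs and the 0/1 output X_k for the k-th pair, a natural reading is the fraction of correct outputs:

```latex
% Assumed reconstruction of the missing formula
\text{current processing precision} = \frac{1}{6000}\sum_{k=1}^{6000} X_k
```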
  • S304. Determine whether the current processing precision meets the set precision condition; if it does, step S307 may be performed directly; otherwise, the operation of step S305 is performed.
  • S305. Perform training learning on the target neural network, and then perform step S306.
  • When the determined current processing precision does not meet the set precision condition, the target neural network may be trained based on the set training learning method; the training learning method used is not detailed here.
  • S306. Determine whether the number of training iterations performed on the target neural network has reached the set number; if not, return to step S304; if so, perform step S308.
  • S307. Perform a self-increment operation on the deletion threshold, use the target neural network as a new initial neural network, and return to step S301.
  • When the current processing precision of the target neural network meets the set precision condition, the current target neural network may continue to be depth-optimized based on steps S301 to S303. Before the depth optimization is performed, the deletion threshold is self-incremented; the target neural network is then used as the initial neural network, and the flow returns to step S301 to restart the optimization of the neural network.
  • Since the deletion threshold in this embodiment is the first deletion threshold or the second deletion threshold, the increment used in the self-increment operation is mainly set according to the specific situation. It can be understood that when the neural network optimization is performed with the increased deletion threshold, more unit nodes to be deleted can be determined, so the target neural network finally suitable for face recognition can be determined more quickly.
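  • The control flow of steps S301-S308 can be sketched as follows (illustration only, not the patent's code); the callables prune, train, and precision are placeholders supplied by the caller, and the default goal and budget values are assumptions:

```python
# High-level sketch of the Embodiment-3 loop (S301-S308).
def deep_optimize(network, prune, train, precision,
                  threshold=0.001, step=0.001,
                  precision_goal=0.99, max_train_rounds=10):
    best = network                                    # last network meeting the goal
    while True:
        candidate = prune(best, threshold)            # S301-S303
        rounds = 0
        while precision(candidate) < precision_goal:  # S304
            if rounds >= max_train_rounds:            # S306 -> S308: end training
                return best
            candidate = train(candidate)              # S305: training learning
            rounds += 1
        best = candidate                              # meets the set precision condition
        threshold += step                             # S307: self-increment, re-optimize
```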
  • FIG. 3b is a structural diagram of the target neural network formed by depth optimization of the original target neural network in Embodiment 3 of the present invention. The target neural network shown in FIG. 3b can be regarded as a further depth optimization of the target neural network shown in FIG. 2c: compared with the target neural network of FIG. 2c, one unit node has been removed from the third layer, i.e., the depth optimization deletes the second unit node of the third layer of the target neural network provided in FIG. 2c.
  • S308. End the training of the target neural network. When the set number of training iterations is reached, the training may be ended; the target neural network that was last optimized as the initial neural network is then the neural network required for face recognition.
  • The method for optimizing a neural network according to Embodiment 3 of the present invention further adds the operation of training learning or depth optimization for the optimized neural network, thereby maintaining the processing precision of the optimized neural network. Compression optimization of the neural network can thus be realized without reducing the computational accuracy of its processing, so that face recognition based on the optimized neural network is faster, takes less face recognition processing time, and occupies less storage, running memory, and video memory.
  • FIG. 4 is a structural block diagram of an apparatus for optimizing a neural network according to Embodiment 4 of the present invention.
  • The device is suitable for compression optimization of a neural network after training. The device can be implemented in software and/or hardware and is generally integrated on the terminal device or server platform where the neural network model resides.
  • As shown in FIG. 4, the optimization apparatus includes a parameter matrix determining module 41, a to-be-deleted node determining module 42, and a target network determining module 43.
  • The parameter matrix determining module 41 is configured to acquire an initial neural network that meets a set precision condition and determine the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network;
  • the to-be-deleted node determining module 42 is configured to process the weight parameter matrix according to the set deletion threshold to determine the unit nodes to be deleted in the initial neural network;
  • the target network determining module 43 is configured to delete the unit nodes to form the optimized target neural network.
  • In this embodiment, the optimization device first obtains, through the parameter matrix determining module 41, an initial neural network that meets the set precision condition and determines the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network; the to-be-deleted node determining module 42 then processes the weight parameter matrix according to the set deletion threshold to determine the unit nodes to be deleted in the initial neural network; finally, the target network determining module 43 deletes these unit nodes to form the optimized target neural network.
  • The neural network optimization device provided in Embodiment 4 of the present invention can compress a neural network simply and efficiently and realize its optimization, so that face recognition based on the optimized neural network is faster, takes less recognition time, and occupies less storage, running memory, and video memory.
  • Further, the parameter matrix determining module 41 is specifically configured to: form a two-dimensional weight parameter matrix based on the weight parameter values corresponding to the connections between unit nodes if the connection between two adjacent layers of unit nodes in the initial neural network is a full connection; and form a multi-dimensional weight parameter matrix based on the weight parameter arrays corresponding to the connections between unit nodes if the connection is a convolution connection.
  • Further, the to-be-deleted node determining module 42 includes:
  • a first determining unit, configured to determine, when the weight parameter matrix is a two-dimensional weight parameter matrix, the unit nodes to be deleted in the initial neural network based on the column vectors of the two-dimensional weight parameter matrix; and
  • a second determining unit, configured to determine, when the weight parameter matrix is a multi-dimensional weight parameter matrix, the unit nodes to be deleted in the initial neural network based on the dimension-reduced multi-dimensional weight parameter matrix.
  • Further, the first determining unit is specifically configured to: acquire the column vector of the i-th column of the two-dimensional weight parameter matrix; and, if the weight parameter values in that column vector are all smaller than the set first deletion threshold, determine the i-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.
  • Further, the second determining unit is specifically configured to: perform dimensionality reduction on the multi-dimensional weight parameter matrix to form a two-dimensional target weight parameter matrix; acquire the column vector of the j-th column of the target weight parameter matrix; and, if the element values in that column vector are all smaller than the set second deletion threshold, determine the j-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.
  • Further, the optimization apparatus includes a target network processing module 44, configured to determine whether the current processing precision of the target neural network meets the set precision condition and, based on the determination result, perform training learning or depth optimization on the target neural network.
  • The target network processing module 44 is specifically configured to: if the current processing precision does not meet the set precision condition, perform training learning on the target neural network until the condition is met or the set number of training iterations is reached; otherwise, self-increment the deletion threshold and use the target neural network as a new initial neural network on which the neural network optimization operations are re-executed, where the deletion threshold is the first deletion threshold or the second deletion threshold.
  • In addition, if the target neural network after training meets the set precision condition and the number of training iterations does not exceed the set number, the self-increment operation is likewise performed on the deletion threshold, and the target neural network is used as a new initial neural network on which the optimization of the neural network is re-executed.
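  • Structurally, the apparatus can be sketched as three cooperating modules wired together in the order of Embodiment 1 (illustration only; the class and method names below mirror the description but are otherwise assumptions):

```python
# Structural sketch of the Embodiment-4 apparatus (modules 41-43).
class ParameterMatrixModule:   # module 41
    def determine(self, network):
        """Acquire an initial network meeting the set precision condition
        and return its per-layer weight parameter matrices."""
        ...

class NodeToDeleteModule:      # module 42
    def determine(self, matrices, threshold):
        """Process the matrices with the deletion threshold and return
        the unit nodes to be deleted."""
        ...

class TargetNetworkModule:     # module 43
    def build(self, network, nodes_to_delete):
        """Delete the nodes and their connections and return the
        optimized target neural network."""
        ...

class NeuralNetworkOptimizer:
    """Wires modules 41-43 together, mirroring the method of Embodiment 1."""
    def __init__(self, threshold=0.001):
        self.matrices = ParameterMatrixModule()
        self.nodes = NodeToDeleteModule()
        self.target = TargetNetworkModule()
        self.threshold = threshold

    def optimize(self, network):
        matrices = self.matrices.determine(network)
        nodes = self.nodes.determine(matrices, self.threshold)
        return self.target.build(network, nodes)
```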

Abstract

Disclosed are a method and device for optimizing a neural network. The method comprises: obtaining an initial neural network meeting a set precision condition, and determining a weight parameter matrix between two adjacent layers of unit nodes in the initial neural network; processing the weight parameter matrix according to a set deletion threshold and determining the unit nodes to be deleted in the initial neural network; and deleting the unit nodes to form an optimized target neural network. The method can easily and efficiently compress a neural network, thereby optimizing it. Therefore, during face recognition based on an optimized neural network, the recognition processing speed can be increased, the recognition processing time can be shortened, and the occupation of storage, running memory, and video memory can be reduced.

Description

Method and device for optimizing a neural network
Technical field
Embodiments of the present invention relate to the field of artificial neural network technologies, and in particular to a method and an apparatus for optimizing a neural network.
Background
At present, face recognition is usually performed based on a trained neural network model, such as a deep convolutional neural network model. When a neural network model is used for face recognition, the following problems arise: 1. the computational complexity of the image data processing is high, which lengthens the running time (for example, processing a face image on an electronic device equipped with a Core i7 processor often takes more than one second); 2. a large amount of memory or graphics card memory is occupied during processing; 3. a large amount of storage space is needed to store the entire neural network model.
Existing optimization methods for neural network models cannot completely solve these problems. For example, optimization in the form of Huffman coding can preserve the processing accuracy of the optimized neural network model and effectively reduce the storage space of a deep neural network model, but it cannot reduce the complexity of the processing operations or shorten the running time, nor can it reduce the memory or video memory occupied during processing.
Summary of the invention
Embodiments of the present invention provide a method and a device for optimizing a neural network, which can optimize the neural network so as to shorten the running time and reduce the device resources occupied.
In one aspect, an embodiment of the present invention provides a method for optimizing a neural network, including:
obtaining an initial neural network that meets a set precision condition, and determining a weight parameter matrix between two adjacent layers of unit nodes in the initial neural network;
processing the weight parameter matrix according to a set deletion threshold, and determining the unit nodes to be deleted in the initial neural network; and
deleting the unit nodes to form an optimized target neural network.
In another aspect, an embodiment of the present invention provides an apparatus for optimizing a neural network, including:
a parameter matrix determining module, configured to acquire an initial neural network that meets a set precision condition, and determine a weight parameter matrix between two adjacent layers of unit nodes in the initial neural network;
a to-be-deleted node determining module, configured to process the weight parameter matrix according to a set deletion threshold, and determine the unit nodes to be deleted in the initial neural network; and
a target network determining module, configured to delete the unit nodes to form an optimized target neural network.
Embodiments of the present invention provide a method and a device for optimizing a neural network. The method first obtains an initial neural network that meets a set precision condition and determines the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network; it then processes the determined weight parameter matrix according to a set deletion threshold to determine the unit nodes to be deleted in the initial neural network; finally, it deletes the determined unit nodes from the initial neural network to form the optimized target neural network. With this method, a neural network can be compressed simply and efficiently, achieving optimization of the neural network; as a result, when face recognition is performed based on the optimized neural network, recognition is faster, recognition time is shorter, and less storage, running memory, and video memory are occupied.
Brief description of the drawings
FIG. 1 is a schematic flowchart of a method for optimizing a neural network according to Embodiment 1 of the present invention;
FIG. 2a is a schematic flowchart of a method for optimizing a neural network according to Embodiment 2 of the present invention;
FIG. 2b is a structural diagram of a trained initial neural network according to Embodiment 2 of the present invention;
FIG. 2c is a structural diagram of the target neural network formed by optimizing the initial neural network in Embodiment 2 of the present invention;
FIG. 3a is a schematic flowchart of a method for optimizing a neural network according to Embodiment 3 of the present invention;
FIG. 3b is a structural diagram of the target neural network formed by depth optimization of the original target neural network in Embodiment 3 of the present invention;
FIG. 4 is a structural block diagram of an apparatus for optimizing a neural network according to Embodiment 4 of the present invention.
Detailed description
The present invention is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here merely explain the present invention and do not limit it. It should also be noted that, for ease of description, the drawings show only the parts related to the present invention rather than the entire structure.
Embodiment 1
FIG. 1 is a schematic flowchart of a method for optimizing a neural network according to Embodiment 1 of the present invention. The method is applicable to compression optimization of a neural network after training; it may be performed by a neural network optimization device, which can be implemented in software and/or hardware and is generally integrated on the terminal device or server platform where the neural network model resides.
Generally, a neural network here means an artificial neural network, which can be regarded as an algorithmic mathematical model that imitates the behavior of animal neural networks and performs distributed parallel information processing. The unit nodes of a neural network are divided into at least three layers, comprising an input layer, a hidden layer, and an output layer; the input layer and the output layer each comprise a single layer of unit nodes, while the hidden layer comprises at least one layer of unit nodes, and the number of unit nodes in each layer can be set for different applications. Specifically, the input layer receives the input data and distributes it to the hidden layer; the hidden layer computes on the received data and passes the results to the output layer; the output layer outputs the computed result. It can be understood that data transmission and processing in a neural network are realized mainly through the connections between unit nodes of adjacent layers and the weight parameter values corresponding to those connections.
At present, pattern recognition (such as face recognition) can be performed based on a neural network. Before pattern recognition, the created neural network must be trained, and pattern recognition can be performed only once the processing precision of the neural network is determined to satisfy the application requirements. It should be noted that, in practical pattern recognition, the trained neural network is generally very large, which not only lengthens the running time of the recognition process but also occupies more storage space, running memory, and video memory; the trained neural network can therefore be optimized with the neural network optimization method provided in this embodiment to solve these problems.
As shown in FIG. 1, the method for optimizing a neural network according to Embodiment 1 of the present invention includes the following operations:
S101. Acquire an initial neural network that meets a set precision condition, and determine the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network.
In this embodiment, the set precision condition can be understood as the range of processing precision that the neural network must reach in actual application processing after training. Generally, the set precision condition may be a system default range or a manually set range. The trained neural network can determine its current processing precision by processing the sample data of a standard test set; when the currently determined processing precision meets the set precision condition, the neural network is considered ready for actual application processing and may be called the initial neural network.
Generally, the training of a neural network is implemented with a chosen training learning algorithm; since such algorithms are mature technology, they are not detailed here. It can be understood that the training process of a neural network is, in effect, the process by which the weight parameter values corresponding to the connections between unit nodes of adjacent layers are continually updated and finally fixed. In this embodiment, after the initial neural network is acquired, the weight parameter matrix between two adjacent layers of unit nodes can be determined from the weight parameter values corresponding to the connections between those unit nodes.
In this embodiment, for unit nodes of two adjacent layers: if there is only one connection between any unit node of one layer and any unit node of the other layer, a single weight parameter value exists between the two unit nodes, and a two-dimensional weight parameter matrix can be determined between the two layers; if there are at least two connections between any unit node of one layer and any unit node of the other layer, or the connection relationship between the two unit nodes must be represented by a function, a weight parameter array (possibly a one-dimensional or a two-dimensional array) exists between the two unit nodes, and a multi-dimensional weight parameter matrix can be determined between the two layers.
S102. Process the weight parameter matrix according to the set deletion threshold, and determine the unit nodes to be deleted in the initial neural network.
In this embodiment, the optimization of the initial neural network is realized by deleting unit nodes from it, so the unit nodes to be deleted must be determined. Specifically, the set deletion threshold is first obtained; then the relationship between the absolute value of each element of the weight parameter matrix and the deletion threshold is determined; finally, the unit nodes to be deleted in the initial neural network are determined from that relationship.
In this embodiment, the deletion threshold may be set based on the basic distribution of the element values in the weight parameter matrix, and the set deletion threshold is a real number greater than 0. For example, practical analysis shows that the element values of the weight parameter matrix are distributed around the value 0, generally within the range (-0.05, 0.05); in that case a deletion threshold of 0.001 is preferred. It should be noted that the deletion threshold only needs to be set when the neural network optimization is performed for the first time; in subsequent optimization cycles it may be kept unchanged, or changed automatically based on other rules. In addition, weight parameter matrices of different dimensions call for different deletion thresholds, so the threshold should be set according to the specific situation.
S103. Delete the unit nodes to form the optimized target neural network.
In this embodiment, once the unit nodes to be deleted are determined, they can be deleted from the initial neural network; when a unit node is deleted, the relationships and connections involving it are deleted as well. The network finally formed after optimization is called the target neural network.
The method for optimizing a neural network according to Embodiment 1 of the present invention first obtains an initial neural network that meets a set precision condition and determines the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network; it then processes the determined weight parameter matrix according to a set deletion threshold to determine the unit nodes to be deleted; finally, it deletes the determined unit nodes from the initial neural network to form the optimized target neural network. With this method, a neural network can be compressed simply and efficiently, achieving optimization of the neural network; when face recognition is then performed based on the optimized neural network, recognition is faster, recognition time is shorter, and less storage, running memory, and video memory are occupied.
Embodiment 2
FIG. 2a is a schematic flowchart of a method for optimizing a neural network according to Embodiment 2 of the present invention. This embodiment is optimized on the basis of the above embodiment. In this embodiment, determining the weight parameter matrix between two adjacent layers of unit nodes in the initial neural network is refined as follows: if the connection between two adjacent layers of unit nodes in the initial neural network is a full connection, a two-dimensional weight parameter matrix is formed based on the weight parameter values corresponding to the connections between the unit nodes; if the connection between two adjacent layers of unit nodes in the initial neural network is a convolution connection, a multi-dimensional weight parameter matrix is formed based on the weight parameter arrays corresponding to the connections between the unit nodes.
Further, processing the weight parameter matrix according to the set deletion threshold and determining the unit nodes to be deleted in the initial neural network is refined as follows: if the weight parameter matrix is a two-dimensional weight parameter matrix, the unit nodes to be deleted in the initial neural network are determined based on the column vectors of the two-dimensional weight parameter matrix; if the weight parameter matrix is a multi-dimensional weight parameter matrix, the unit nodes to be deleted in the initial neural network are determined based on the dimension-reduced multi-dimensional weight parameter matrix.
As shown in FIG. 2a, the method for optimizing a neural network according to Embodiment 2 of the present invention specifically includes the following operations:
S201. Acquire an initial neural network that meets the set precision condition.
For example, after the created neural network is trained, if its current processing precision meets the set precision condition, the neural network may be determined as the initial neural network to be acquired.
S202. If the connection between two adjacent layers of unit nodes in the initial neural network is a full connection, form a two-dimensional weight parameter matrix based on the weight parameter values corresponding to the connections between the unit nodes, and then perform step S204.
In this embodiment, when the connection between two adjacent layers of unit nodes consists only of single-line connections, the connection may be called a full connection. In this case there is exactly one weight parameter value between any unit node of one layer and any unit node of the other layer, so the two-dimensional weight parameter matrix can be constructed from the weight parameter values corresponding to the connections between the two layers of unit nodes.
示例性地,图2b为本发明实施例二提供的一个训练后的初始神经网络的结 构图,如图2b所示,该初始神经网络共有4层单元节点,其中,第1层作为输入层,第2层和第3层作为隐藏层,第4层作为输出层,同时可以确定该初始神经网络中相邻两层单元节点间的连接为全连接。以第1层和第2层单元节点间的连接为例,首先将第1层中第m个单元节点与第2层中第n个单元节点的连线对应的权重参数值用wmn表示,其中,1≤m≤5,1≤n≤4;则第1层和第2层中单元节点间可以形成一个5×4阶的二维权重参数矩阵,该二维权重参数矩阵W5×4可以表示为:Illustratively, FIG. 2b is a structural diagram of a trained initial neural network according to Embodiment 2 of the present invention. As shown in FIG. 2b, the initial neural network has four layers of unit nodes, wherein the first layer serves as an input layer. Layers 2 and 3 act as hidden layers, and layer 4 acts as an output layer, and it can be determined that the connections between adjacent two-layer unit nodes in the initial neural network are fully connected. Taking the connection between the first layer and the second layer unit node as an example, first, the weight parameter value corresponding to the connection between the mth unit node in the first layer and the nth unit node in the second layer is represented by w mn . Wherein, 1≤m≤5, 1≤n≤4; then a 5×4 order two-dimensional weight parameter matrix can be formed between the unit nodes in the first layer and the second layer, and the two-dimensional weight parameter matrix W 5×4 It can be expressed as:
$$W_{5\times 4} = \begin{pmatrix} w_{11} & w_{12} & w_{13} & w_{14} \\ w_{21} & w_{22} & w_{23} & w_{24} \\ w_{31} & w_{32} & w_{33} & w_{34} \\ w_{41} & w_{42} & w_{43} & w_{44} \\ w_{51} & w_{52} & w_{53} & w_{54} \end{pmatrix}$$
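As a concrete illustration of this construction (a minimal NumPy sketch, not part of the patent; the layer sizes and the propagation step are assumptions used only for illustration), the 5×4 matrix above can be held as a plain array, and the second-layer inputs follow from multiplying the first-layer outputs by it:

```python
import numpy as np

# W[m-1, n-1] holds w_mn: the weight of the single line from unit node m
# of layer 1 to unit node n of layer 2 (full connection, one value per line)
W = np.random.randn(5, 4)

x = np.random.randn(5)   # outputs of the 5 first-layer unit nodes
z = x @ W                # inputs to the 4 second-layer unit nodes
print(z.shape)           # (4,)
```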
S203. If the connection between two adjacent layers of unit nodes in the initial neural network is a convolutional connection, form a multi-dimensional weight parameter matrix from the weight parameter arrays corresponding to the connections between the unit nodes, and then perform step S205.

In this embodiment, when the connection between two adjacent layers of unit nodes is a multi-line connection (at least two lines) or the connection relationship is expressed by a function, the connection between those two layers is called a convolutional connection. In this case, there may be a weight parameter array (generally a one-dimensional or two-dimensional array) between any unit node of one layer and any unit node of the other layer, so a multi-dimensional weight parameter matrix can be constructed from the weight parameter arrays corresponding to the connections between the two adjacent layers of unit nodes. When the weight parameter array is a one-dimensional array, a three-dimensional weight parameter matrix is formed; when the weight parameter array is a two-dimensional array, a four-dimensional weight parameter matrix is formed.
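The three cases can be pictured with array shapes (an illustrative sketch; the node counts and array sizes are assumptions, not taken from the patent):

```python
import numpy as np

# Full connection: one scalar per pair of unit nodes -> 2-D matrix
W_full = np.random.randn(5, 4)          # shape (g, h)

# Convolutional connection, 1-D weight array per node pair -> 3-D matrix
W_conv3 = np.random.randn(5, 4, 3)      # shape (g, h, k)

# Convolutional connection, 2-D weight array per node pair -> 4-D matrix
W_conv4 = np.random.randn(5, 4, 3, 3)   # shape (g, h, kh, kw)
```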
S204. Determine the unit nodes to be deleted in the initial neural network from the column vectors of the two-dimensional weight parameter matrix.

Specifically, determining the unit nodes to be deleted in the initial neural network from the column vectors of the two-dimensional weight parameter matrix includes: acquiring the column vector of the i-th column of the two-dimensional weight parameter matrix; and, if the weight parameter values included in the column vector of the i-th column are all smaller than the set first deletion threshold, determining the i-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.

In this embodiment, for two adjacent layers of fully connected unit nodes, suppose the resulting two-dimensional weight parameter matrix is of order g×h; the matrix then has h column vectors, and the weight parameter values in its i-th column are all associated with the i-th unit node of the second of the two adjacent layers. Specifically, the process of determining the unit nodes to be deleted can be described as follows: acquire the set first deletion threshold, select the column vector of the i-th column of the two-dimensional weight parameter matrix, and determine whether the g weight parameter values contained in the i-th column vector are all smaller than the first deletion threshold; if they all are, the i-th unit node of the second of the two adjacent layers can be determined as a unit node to be deleted in the initial neural network.
In this embodiment, denote the weight parameter values of the g×h two-dimensional weight parameter matrix by $w_{ai}$ and the acquired first deletion threshold by $t_1$, where $1 \le a \le g$ and $1 \le i \le h$. The process of determining whether the g weight parameter values contained in the i-th column vector are all smaller than the first deletion threshold $t_1$ can then be described specifically as follows: if the weight parameter value $w_{ai}$ in the i-th column vector is smaller than $t_1$, set the indicator function $f(w_{ai}) = 0$; otherwise, set $f(w_{ai}) = 1$. Then determine whether

$$\sum_{a=1}^{g} f(w_{ai})$$

equals 0; if it does, the g weight parameter values contained in the i-th column vector can all be considered smaller than $t_1$. It should be noted that when the first deletion threshold is set for the first time, it is preferably set according to the overall distribution of the weight parameter values in the two-dimensional weight parameter matrix.
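A minimal sketch of this column test (not the patent's implementation; the helper name is hypothetical, and the comparison is on raw values exactly as stated above, whereas a practical variant might compare absolute values):

```python
import numpy as np

def nodes_to_delete(W, t1):
    """Return the indices i of second-layer unit nodes to delete: the
    columns of the g x h matrix W whose g weight values are all < t1."""
    # f(w_ai) = 0 when w_ai < t1, else 1; a column whose indicator sum
    # equals 0 marks the corresponding unit node for deletion
    f = (W >= t1).astype(int)
    return [i for i in range(W.shape[1]) if f[:, i].sum() == 0]

W = np.random.randn(5, 4)
print(nodes_to_delete(W, t1=0.01))
```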
S205. Determine the unit nodes to be deleted in the initial neural network from the multi-dimensional weight parameter matrix after dimensionality reduction.

In this embodiment, two adjacent layers of convolutionally connected unit nodes form a multi-dimensional weight parameter matrix. This multi-dimensional weight parameter matrix must first undergo dimensionality reduction, which yields a two-dimensional target weight parameter matrix; the unit nodes to be deleted can then be determined from this target weight parameter matrix.

Further, determining the unit nodes to be deleted in the initial neural network from the multi-dimensional weight parameter matrix after dimensionality reduction includes: performing dimensionality reduction on the multi-dimensional weight parameter matrix to form a two-dimensional target weight parameter matrix; acquiring the column vector of the j-th column of the target weight parameter matrix; and, if the element values included in the column vector of the j-th column are all smaller than the set second deletion threshold, determining the j-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.

In this embodiment, the process of reducing the multi-dimensional weight parameter matrix to a two-dimensional target weight parameter matrix can be described as follows: determine the dimensionality of the multi-dimensional weight parameter matrix and the element values in the matrix. If the multi-dimensional weight parameter matrix is a three-dimensional weight parameter matrix, each element of the matrix is generally a one-dimensional array; the data in each one-dimensional array can be summed directly, and the resulting sum, or its average, is taken as the corresponding element value of the reduced target weight parameter matrix. If the multi-dimensional weight parameter matrix is a four-dimensional weight parameter matrix, each element of the matrix is generally a two-dimensional array; the data in each column of the two-dimensional array are first summed to obtain the column sums, the column sums are then summed in turn, and the resulting sum, or its average, is taken as the corresponding element value of the reduced target weight parameter matrix.
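Since summing the column sums of a two-dimensional array is the same as summing all of its entries, the reduction can be sketched in a few lines (a sketch under assumptions: the (g, h, ...) axis layout and the helper name are illustrative, not the patent's API):

```python
import numpy as np

def target_weight_matrix(W, use_mean=False):
    """Reduce a 3-D (g, h, k) or 4-D (g, h, kh, kw) weight parameter
    matrix to a 2-D g x h target weight parameter matrix by summing
    each per-connection weight array (or averaging, if use_mean)."""
    extra_axes = tuple(range(2, W.ndim))   # axes of the weight array
    return W.mean(axis=extra_axes) if use_mean else W.sum(axis=extra_axes)

W4 = np.random.randn(5, 4, 3, 3)           # four-dimensional case
print(target_weight_matrix(W4).shape)      # (5, 4)
```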
In this embodiment, after the dimensionality reduction of the multi-dimensional weight parameter matrix has produced the target weight parameter matrix, the unit nodes to be deleted can be determined within the resulting target weight parameter matrix. The subsequent operation for determining the unit nodes to be deleted is the same as the unit node determination based on step S204, so it is not repeated here. It should be noted that when the second deletion threshold is set for the first time, it is preferably set according to the overall distribution of the element values in the target weight parameter matrix.

S206. Delete the determined unit nodes from the neural network to form the optimized target neural network.

In this embodiment, through the operations of steps S201 to S205 above, all the unit nodes to be deleted in the initial neural network can be determined; all the determined unit nodes can then be deleted from the initial neural network, together with the connections associated with those unit nodes.
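In matrix terms, deleting unit node i of the second layer removes column i of its incoming weight matrix and row i of its outgoing weight matrix; a brief sketch under that reading (the fully connected layout is an assumption):

```python
import numpy as np

def delete_unit_node(W_in, W_out, i):
    """Delete unit node i together with all of its connections:
    column i of the incoming matrix and row i of the outgoing matrix."""
    return np.delete(W_in, i, axis=1), np.delete(W_out, i, axis=0)

W12, W23 = np.random.randn(5, 4), np.random.randn(4, 4)
W12, W23 = delete_unit_node(W12, W23, i=1)   # drop the 2nd layer-2 node
print(W12.shape, W23.shape)                  # (5, 3) (3, 4)
```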
Illustratively, FIG. 2c is a structural diagram of the target neural network formed after the initial neural network is optimized in Embodiment 2 of the present invention. Continuing the example from step S202 above, as shown in FIG. 2c, the second layer contains one unit node fewer than in the initial neural network of FIG. 2b. It follows that the method for optimizing a neural network provided by Embodiment 2 of the present invention has optimized the initial neural network of FIG. 2b by deleting the second unit node of its second layer.

The method for optimizing a neural network provided by Embodiment 2 of the present invention specifically describes how the weight parameter matrix is determined when adjacent layers of unit nodes in the neural network are connected in different ways, and how the unit nodes to be deleted are determined under the different weight parameter matrices. With this method, neural networks with different unit node connection modes can be compressed and optimized, so that when face recognition is performed on the basis of the optimized neural network, the recognition processing speed is increased, the recognition processing time is shortened, and the space occupied in storage, running memory and video memory is reduced.
Embodiment 3
FIG. 3a is a schematic flowchart of a method for optimizing a neural network according to Embodiment 3 of the present invention. This embodiment is optimized on the basis of the above embodiments. In this embodiment, the following is additionally provided: determining whether the current processing accuracy of the target neural network meets the set accuracy condition, and performing training or deeper optimization on the target neural network according to the result of the determination.

On the basis of the above optimization, performing training or deeper optimization on the target neural network according to the result of the determination is further refined as follows: if the current processing accuracy does not meet the set accuracy condition, the target neural network is trained until the set accuracy condition is met or the set number of training rounds is reached; otherwise, the deletion threshold is incremented, the target neural network is taken as a new initial neural network, and the optimization of the neural network is performed again, where the deletion threshold is the first deletion threshold or the second deletion threshold.

On the basis of the above optimization, a further case is also covered: if the target neural network after training meets the set accuracy condition and the number of training rounds does not exceed the set number of training rounds, the deletion threshold is incremented, the target neural network is taken as a new initial neural network, and the optimization of the neural network is performed again.

As shown in FIG. 3a, the method for optimizing a neural network provided by Embodiment 3 of the present invention specifically includes the following operations:
S301. Acquire an initial neural network that meets the set accuracy condition, and determine the weight parameter matrix between adjacent layers of unit nodes in the initial neural network.

In this embodiment, the acquired initial neural network can be regarded as being used specifically for face recognition, so the set accuracy condition can be regarded as the processing accuracy that the neural network is required to reach for face recognition.

S302. Process the weight parameter matrix according to the set deletion threshold, and determine the unit nodes to be deleted in the initial neural network.

S303. Delete the unit nodes to form the optimized target neural network.

S304. Determine whether the current processing accuracy of the target neural network meets the set accuracy condition; if not, perform step S305; if so, perform step S307.

In this embodiment, the optimized target neural network can likewise be regarded as being used specifically for face recognition; it is therefore also necessary to test whether the current processing accuracy of the target neural network meets the set accuracy condition.

Specifically, the process of testing the current processing accuracy of the target neural network can be described as follows. First, the sample images required for testing the target neural network are selected according to the rules of an international standard face verification test set; preferably, 3000 positive sample pairs (a positive sample pair being two images of the same face) and 3000 negative sample pairs (a negative sample pair being two images of different faces) are selected. The positive sample pairs and the negative sample pairs are then fed to the target neural network as input data. Finally, the current processing accuracy of the target neural network is determined from the values of the face recognition output X and the formula for the current processing accuracy.
Illustratively, when the target neural network judges the two images of a positive sample pair to be the same person, the output X takes the value 1, and otherwise 0; when the target neural network judges the two images of a negative sample pair to be different people, the output X takes the value 1, and otherwise 0. The formula for the current processing accuracy can then be expressed as

$$\text{current processing accuracy} = \frac{1}{6000}\sum_{k=1}^{6000} X_k,$$

that is, the proportion of the 6000 sample pairs that are judged correctly.
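A sketch of this test harness (every name here is a hypothetical stand-in; `same_person` denotes whatever judgement the target neural network produces for a pair of face images):

```python
def current_processing_accuracy(same_person, positive_pairs, negative_pairs):
    """Mean of X over all sample pairs: X = 1 when a positive pair is
    judged the same person or a negative pair is judged different people."""
    xs = [1 if same_person(a, b) else 0 for a, b in positive_pairs]
    xs += [0 if same_person(a, b) else 1 for a, b in negative_pairs]
    return sum(xs) / len(xs)
```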
In this embodiment, if the determined current processing accuracy meets the set accuracy condition, the operation of step S307 can be performed directly; otherwise, the operation of step S305 is performed.

S305. Train the target neural network, and then perform step S306.

In this embodiment, when the determined current processing accuracy does not meet the set accuracy condition, the target neural network can be trained on the basis of a set training method. The training method used is not detailed here.

S306. Determine whether the number of training rounds performed on the target neural network has reached the set number of training rounds; if not, return to step S304; if so, perform step S308.

In this embodiment, after the target neural network has been trained, it is necessary to determine whether the number of training rounds performed on it has reached the set number of training rounds; different operation steps are then performed according to the result of the determination.

S307. Increment the deletion threshold, take the target neural network as a new initial neural network, and return to step S301.

In this embodiment, if the current processing accuracy of the target neural network meets the set accuracy condition, the current target neural network can be further deeply optimized on the basis of steps S301 to S303.

Before the target neural network is deeply optimized, the deletion threshold can be incremented; the target neural network is then taken as the initial neural network, and the procedure returns to step S301 to restart the optimization of the neural network. It should be noted that, since the deletion threshold in this embodiment is the first deletion threshold or the second deletion threshold, the increment used in the threshold self-increase operation is mainly set according to the specific situation. It will be appreciated that when the neural network is optimized with the increased deletion threshold, more unit nodes to be deleted can be identified, so that the target neural network finally suited to face recognition can be determined more quickly.
Illustratively, FIG. 3b is a structural diagram of the target neural network formed by deeply optimizing the original target neural network in Embodiment 3 of the present invention. Continuing the example from step S206 in Embodiment 2 above, the target neural network shown in FIG. 3b can be regarded as a further deep optimization of the target neural network shown in FIG. 2c. As shown in FIG. 3b, the third layer contains one unit node fewer than in the target neural network of FIG. 2c. It follows that the method for optimizing a neural network provided by Embodiment 3 of the present invention has deeply optimized the target neural network of FIG. 2c by deleting the second unit node of its third layer.

S308. End the training of the target neural network.

In this embodiment, if the number of training rounds performed on the target neural network has reached the set number of training rounds and the corresponding current processing accuracy still does not meet the set accuracy condition, the training of the target neural network can be ended; at the same time, the target neural network that last served as the initial neural network for deep optimization is preferably determined as the neural network to be used for face recognition.
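Putting steps S301 to S308 together, the overall flow can be sketched as a loop (all names, helpers and parameters here are hypothetical stand-ins for the steps they are commented with, not the patent's API):

```python
def optimize_neural_network(net, delete_threshold, meets_accuracy,
                            prune, train_one_round,
                            max_train_rounds, threshold_increment):
    """Iterative prune / test / train / deepen loop of Embodiment 3."""
    while True:
        candidate = prune(net, delete_threshold)      # S301-S303
        rounds = 0
        while not meets_accuracy(candidate):          # S304
            if rounds >= max_train_rounds:            # S306
                return net    # S308: keep the last accepted network
            train_one_round(candidate)                # S305
            rounds += 1
        net = candidate                               # accept the pruning
        delete_threshold += threshold_increment       # S307: deepen
```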
The method for optimizing a neural network provided by Embodiment 3 of the present invention further adds the operation of training or deeply optimizing the optimized neural network again, thereby maintaining the processing accuracy of the optimized neural network. Compared with existing optimization methods, it can compress and optimize the neural network without lowering the computational accuracy of the neural network, so that when face recognition is performed on the basis of the optimized neural network, the face recognition processing speed is increased, the face recognition processing time is shortened, and the space occupied in storage, running memory and video memory is reduced.
Embodiment 4
FIG. 4 is a structural block diagram of an apparatus for optimizing a neural network according to Embodiment 4 of the present invention. The apparatus is suitable for compressing and optimizing a trained neural network; it may be implemented in software and/or hardware, and is generally integrated on the terminal device or server platform where the neural network model resides. As shown in FIG. 4, the optimization apparatus includes a parameter matrix determining module 41, a to-be-deleted node determining module 42, and a target network determining module 43.

The parameter matrix determining module 41 is configured to acquire an initial neural network that meets the set accuracy condition, and to determine the weight parameter matrix between adjacent layers of unit nodes in the initial neural network.

The to-be-deleted node determining module 42 is configured to process the weight parameter matrix according to the set deletion threshold, and to determine the unit nodes to be deleted in the initial neural network.

The target network determining module 43 is configured to delete the unit nodes to form the optimized target neural network.

In this embodiment, the optimization apparatus first acquires, through the parameter matrix determining module 41, an initial neural network that meets the set accuracy condition, and determines the weight parameter matrix between adjacent layers of unit nodes in the initial neural network; it then processes the weight parameter matrix according to the set deletion threshold through the to-be-deleted node determining module 42, determining the unit nodes to be deleted in the initial neural network; finally, it deletes the unit nodes through the target network determining module 43 to form the optimized target neural network.

The apparatus for optimizing a neural network provided by Embodiment 4 of the present invention can compress the neural network simply and efficiently, realizing the optimization of the neural network, so that when face recognition is performed on the basis of the optimized neural network, the recognition processing speed is increased, the recognition processing time is shortened, and the space occupied in storage, running memory and video memory is reduced.
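The cooperation of the three modules can be sketched as follows (an illustrative composition only; the class name is an assumption, and the three injected callables stand in for the module logic described above):

```python
class NeuralNetworkOptimizationApparatus:
    """Modules 41-43 of FIG. 4 composed into one optimization pass."""

    def __init__(self, weight_matrices, nodes_below_threshold, delete_nodes,
                 delete_threshold):
        self.weight_matrices = weight_matrices              # module 41
        self.nodes_below_threshold = nodes_below_threshold  # module 42
        self.delete_nodes = delete_nodes                    # module 43
        self.delete_threshold = delete_threshold

    def optimize(self, initial_net):
        matrices = self.weight_matrices(initial_net)
        nodes = self.nodes_below_threshold(matrices, self.delete_threshold)
        return self.delete_nodes(initial_net, nodes)
```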
Further, the parameter matrix determining module 41 is specifically configured to:

after acquiring an initial neural network that meets the set accuracy condition, form a two-dimensional weight parameter matrix from the weight parameter values corresponding to the connections between unit nodes if the connection between two adjacent layers of unit nodes in the initial neural network is a full connection; and form a multi-dimensional weight parameter matrix from the weight parameter arrays corresponding to the connections between unit nodes if the connection between two adjacent layers of unit nodes in the initial neural network is a convolutional connection.

Further, the to-be-deleted node determining module 42 includes:

a first determining unit, configured to determine the unit nodes to be deleted in the initial neural network from the column vectors of the two-dimensional weight parameter matrix when the weight parameter matrix is a two-dimensional weight parameter matrix; and

a second determining unit, configured to determine the unit nodes to be deleted in the initial neural network from the multi-dimensional weight parameter matrix after dimensionality reduction when the weight parameter matrix is a multi-dimensional weight parameter matrix.

On the basis of the above optimization, the first determining unit is specifically configured to:

when the weight parameter matrix is a two-dimensional weight parameter matrix, acquire the column vector of the i-th column of the two-dimensional weight parameter matrix; and, if the weight parameter values included in the column vector of the i-th column are all smaller than the set first deletion threshold, determine the i-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.

Further, the second determining unit is specifically configured to:

when the weight parameter matrix is a multi-dimensional weight parameter matrix, perform dimensionality reduction on the multi-dimensional weight parameter matrix to form a two-dimensional target weight parameter matrix; acquire the column vector of the j-th column of the target weight parameter matrix; and, if the element values included in the column vector of the j-th column are all smaller than the set second deletion threshold, determine the j-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.
Further, the optimization apparatus also includes a target network processing module 44, configured to determine whether the current processing accuracy of the target neural network meets the set accuracy condition, and to perform training or deeper optimization on the target neural network according to the result of the determination.

On the basis of the above optimization, the target network processing module 44 is specifically configured to:

determine whether the current processing accuracy of the target neural network meets the set accuracy condition; if the current processing accuracy does not meet the set accuracy condition, train the target neural network until the set accuracy condition is met or the set number of training rounds is reached; otherwise, increment the deletion threshold, take the target neural network as a new initial neural network, and perform the optimization of the neural network again, where the deletion threshold is the first deletion threshold or the second deletion threshold.

Further, if the target neural network after training meets the set accuracy condition and the number of training rounds does not exceed the set number of training rounds, the deletion threshold is incremented, the target neural network is taken as a new initial neural network, and the optimization of the neural network is performed again.

Note that the above are merely preferred embodiments of the present invention and the technical principles applied. Those skilled in the art will understand that the present invention is not limited to the specific embodiments described here, and that various obvious changes, readjustments and substitutions can be made without departing from the scope of protection of the present invention. Therefore, although the present invention has been described in some detail through the above embodiments, the present invention is not limited to the above embodiments and may include further equivalent embodiments without departing from the inventive concept; the scope of the present invention is determined by the scope of the appended claims.

Claims (10)

  1. A method for optimizing a neural network, comprising:
    acquiring an initial neural network that meets a set accuracy condition, and determining a weight parameter matrix between adjacent layers of unit nodes in the initial neural network;
    processing the weight parameter matrix according to a set deletion threshold, and determining unit nodes to be deleted in the initial neural network; and
    deleting the unit nodes to form an optimized target neural network.
  2. The method according to claim 1, wherein determining the weight parameter matrix between adjacent layers of unit nodes in the initial neural network comprises:
    if the connection between two adjacent layers of unit nodes in the initial neural network is a full connection, forming a two-dimensional weight parameter matrix from the weight parameter values corresponding to the connections between the unit nodes; and
    if the connection between two adjacent layers of unit nodes in the initial neural network is a convolutional connection, forming a multi-dimensional weight parameter matrix from the weight parameter arrays corresponding to the connections between the unit nodes.
  3. The method according to claim 2, wherein processing the weight parameter matrix according to the set deletion threshold and determining the unit nodes to be deleted in the initial neural network comprises:
    if the weight parameter matrix is a two-dimensional weight parameter matrix, determining the unit nodes to be deleted in the initial neural network from the column vectors of the two-dimensional weight parameter matrix; and
    if the weight parameter matrix is a multi-dimensional weight parameter matrix, determining the unit nodes to be deleted in the initial neural network from the multi-dimensional weight parameter matrix after dimensionality reduction.
  4. The method according to claim 3, wherein determining the unit nodes to be deleted in the initial neural network from the column vectors of the two-dimensional weight parameter matrix comprises:
    acquiring the column vector of the i-th column of the two-dimensional weight parameter matrix; and
    if the weight parameter values included in the column vector of the i-th column are all smaller than a set first deletion threshold, determining the i-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.
  5. The method according to claim 3, wherein determining the unit nodes to be deleted in the initial neural network from the multi-dimensional weight parameter matrix after dimensionality reduction comprises:
    performing dimensionality reduction on the multi-dimensional weight parameter matrix to form a two-dimensional target weight parameter matrix;
    acquiring the column vector of the j-th column of the target weight parameter matrix; and
    if the element values included in the column vector of the j-th column are all smaller than a set second deletion threshold, determining the j-th unit node of the second of the two adjacent layers as a unit node to be deleted in the initial neural network.
  6. The method according to any one of claims 1-5, further comprising:
    determining whether the current processing accuracy of the target neural network meets the set accuracy condition, and performing training or deeper optimization on the target neural network according to the result of the determination.
  7. The method according to claim 6, wherein performing training or deeper optimization on the target neural network according to the result of the determination comprises:
    if the current processing accuracy does not meet the set accuracy condition, training the target neural network until the set accuracy condition is met or a set number of training rounds is reached; otherwise,
    incrementing the deletion threshold, taking the target neural network as a new initial neural network, and performing the optimization of the neural network again, wherein the deletion threshold is the first deletion threshold or the second deletion threshold.
  8. The method according to claim 7, wherein, if the target neural network after training meets the set accuracy condition and the number of training rounds does not exceed the set number of training rounds, the deletion threshold is incremented, the target neural network is taken as a new initial neural network, and the optimization of the neural network is performed again.
  9. An apparatus for optimizing a neural network, comprising:
    a parameter matrix determining module, configured to acquire an initial neural network that meets a set accuracy condition, and to determine a weight parameter matrix between adjacent layers of unit nodes in the initial neural network;
    a to-be-deleted node determining module, configured to process the weight parameter matrix according to a set deletion threshold, and to determine unit nodes to be deleted in the initial neural network; and
    a target network determining module, configured to delete the unit nodes to form an optimized target neural network.
  10. The apparatus according to claim 9, further comprising:
    a target network processing module, configured to determine whether the current processing accuracy of the target neural network meets the set accuracy condition, and to perform training or deeper optimization on the target neural network according to the result of the determination.
PCT/CN2016/113271 2016-10-11 2016-12-29 Method and device for optimizing neural network WO2018068421A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610887709.7 2016-10-11
CN201610887709.7A CN106650928A (en) 2016-10-11 2016-10-11 Method and device for optimizing neural network

Publications (1)

Publication Number Publication Date
WO2018068421A1 true WO2018068421A1 (en) 2018-04-19

Family

ID=58855232

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2016/113271 WO2018068421A1 (en) 2016-10-11 2016-12-29 Method and device for optimizing neural network

Country Status (2)

Country Link
CN (1) CN106650928A (en)
WO (1) WO2018068421A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP3649582A4 (en) * 2017-07-07 2021-07-21 Darwinai Corporation System and method for automatic building of learning machines using learning machines

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109102074B (en) * 2017-06-21 2021-06-01 上海寒武纪信息科技有限公司 Training device
CN107247991A (en) * 2017-06-15 2017-10-13 北京图森未来科技有限公司 A kind of method and device for building neutral net
CN109492762A (en) * 2017-09-11 2019-03-19 北京博越世纪科技有限公司 A kind of technology of Optimal Neural Network Architectures
CN107862380A (en) * 2017-10-19 2018-03-30 珠海格力电器股份有限公司 Artificial neural network computing circuit
CN107807387B (en) * 2017-10-31 2019-08-27 中国科学技术大学 Acquisition methods when seismic first break neural network based is walked
CN109754077B (en) * 2017-11-08 2022-05-06 杭州海康威视数字技术股份有限公司 Network model compression method and device of deep neural network and computer equipment
CN112836792A (en) * 2017-12-29 2021-05-25 华为技术有限公司 Training method and device of neural network model
CN109791628B (en) * 2017-12-29 2022-12-27 清华大学 Neural network model block compression method, training method, computing device and system
CN108038546B (en) * 2017-12-29 2021-02-09 百度在线网络技术(北京)有限公司 Method and apparatus for compressing neural networks
CN108510058B (en) * 2018-02-28 2021-07-20 中国科学院计算技术研究所 Weight storage method in neural network and processor based on method
CN109472359B (en) * 2018-10-23 2021-06-04 深圳和而泰数据资源与云技术有限公司 Network structure processing method of deep neural network and related product
CN110991659B (en) * 2019-12-09 2024-03-08 北京奇艺世纪科技有限公司 Abnormal node identification method, device, electronic equipment and storage medium
CN111324860B (en) * 2020-02-11 2024-01-23 无锡北邮感知技术产业研究院有限公司 Lightweight CNN calculation method and device based on random matrix approximation
TWI778822B (en) * 2021-09-30 2022-09-21 鴻海精密工業股份有限公司 Image processing method based on neural network model, electronic equipment, and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100017351A1 (en) * 2008-07-17 2010-01-21 Hench John J Neural network based hermite interpolator for scatterometry parameter estimation
CN104504442A (en) * 2014-12-30 2015-04-08 湖南强智科技发展有限公司 Neural network optimization method
CN104751228A (en) * 2013-12-31 2015-07-01 安徽科大讯飞信息科技股份有限公司 Method and system for constructing deep neural network
CN105160396A (en) * 2015-07-06 2015-12-16 东南大学 Method utilizing field data to establish nerve network model
CN105373830A (en) * 2015-12-11 2016-03-02 中国科学院上海高等研究院 Prediction method and system for error back propagation neural network and server


Also Published As

Publication number Publication date
CN106650928A (en) 2017-05-10


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 16918498

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 16918498

Country of ref document: EP

Kind code of ref document: A1