WO2019091401A1 - Network model compression method and apparatus for deep neural network, and computer device - Google Patents


Info

Publication number
WO2019091401A1
Authority
WO
WIPO (PCT)
Prior art keywords
network
operation unit
neural network
deep neural
importance
Application number
PCT/CN2018/114357
Other languages
English (en)
French (fr)
Inventor
张渊
陈伟杰
谢迪
浦世亮
Original Assignee
杭州海康威视数字技术股份有限公司 (Hangzhou Hikvision Digital Technology Co., Ltd.)
Application filed by 杭州海康威视数字技术股份有限公司 (Hangzhou Hikvision Digital Technology Co., Ltd.)
Publication of WO2019091401A1

Classifications

    • G — PHYSICS
    • G06 — COMPUTING; CALCULATING OR COUNTING
    • G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 — Computing arrangements based on biological models
    • G06N3/02 — Neural networks
    • G06N3/08 — Learning methods

Definitions

  • The present application relates to the field of data processing technologies, and in particular to a network model compression method, apparatus, and computer device for a deep neural network.
  • DNN (Deep Neural Network), an emerging field of machine learning research, parses data by imitating the mechanism of the human brain; it is an intelligent model that analyzes and learns by building and simulating the human brain. Currently popular DNNs include CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), LSTM (Long Short-Term Memory), and the like. Because a DNN can recognize and detect targets quickly and accurately through the operations of the multiple network layers in its network model, it has been widely applied to target detection and segmentation, behavior detection and recognition, speech recognition, and other fields.
  • With the development of target recognition and target detection technology, target features have become increasingly complex and more and more target features need to be extracted. As a result, in the design of DNN network models, the number of network layers, and of operation units within each network layer, has grown substantially. This increases the computational complexity of target recognition and target detection, and the large number of network layers and operation units consumes excessive memory and bandwidth resources, affecting the efficiency of target recognition and target detection.
  • The purpose of the embodiments of the present application is to provide a network model compression method, apparatus, and computer device for a deep neural network, so as to improve the efficiency of target recognition and target detection. The specific technical solutions are as follows:
  • In a first aspect, an embodiment of the present application provides a network model compression method for a deep neural network, where the method includes: acquiring an original deep neural network; analyzing the importance of each operation unit in a network layer of the original deep neural network, and determining the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted; and deleting the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.
  • Optionally, the analyzing the importance of each operation unit in the network layer of the original deep neural network and determining the operation units of the network layer whose importance is lower than the preset importance as the operation units to be deleted includes: extracting the absolute weight value of each operation unit in the network layer of the original deep neural network; configuring the importance of each operation unit according to its absolute weight value, the absolute weight value of each operation unit being proportional to the configured importance; and, based on the importance of each operation unit, determining the operation units whose importance is lower than the preset importance as the operation units to be deleted.
  • Optionally, before the determining of the operation units to be deleted, the method further includes: analyzing, with a rank analysis tool, the network layer of the original deep neural network to obtain, under the condition that a preset error tolerance is satisfied, a first number of operation units to be deleted in the network layer. The determining of the operation units to be deleted then includes: analyzing the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer; and selecting the first number of operation units in ascending order of the importance of the operation units in the network layer, taking the selected operation units as the operation units to be deleted.
  • Optionally, after the network-model-compressed deep neural network is obtained, the method further includes: obtaining an output result of operations performed by the network-model-compressed deep neural network; and, if the output result cannot satisfy a preset effect, using the difference between the output result of the original deep neural network and the output result of the network-model-compressed deep neural network to adjust, by a preset algorithm, the weights in the operation units of each network layer of the network-model-compressed deep neural network until the output result satisfies the preset effect.
  • Optionally, after the network-model-compressed deep neural network is obtained, the method further includes: obtaining the correlation between the operation units of any network layer in the network-model-compressed deep neural network; determining whether the correlation is less than a preset correlation; and, if not, adjusting the weights in the operation units of that network layer with a preset regularization term, and stopping the adjustment of the weights in the operation units when the correlation is less than the preset correlation.
  • In a second aspect, an embodiment of the present application provides a network model compression apparatus for a deep neural network, where the apparatus includes:
  • a first obtaining module, configured to acquire an original deep neural network;
  • a first determining module, configured to analyze the importance of each operation unit in a network layer of the original deep neural network and determine the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted;
  • a deleting module, configured to delete the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.
  • Optionally, the first determining module is specifically configured to: extract the absolute weight value of each operation unit in the network layer of the original deep neural network; configure the importance of each operation unit according to its absolute weight value, the absolute weight value of each operation unit being proportional to the configured importance; and, based on the importance of each operation unit, determine the operation units whose importance is lower than the preset importance as the operation units to be deleted.
  • Optionally, the apparatus further includes an analysis module, configured to analyze, with a rank analysis tool, the network layer of the original deep neural network to obtain, under the condition that a preset error tolerance is satisfied, a first number of operation units to be deleted in the network layer. The first determining module is then specifically configured to: analyze the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer; and select the first number of operation units in ascending order of the importance of the operation units in the network layer, taking the selected operation units as the operation units to be deleted.
  • Optionally, the apparatus further includes: a second obtaining module, configured to obtain an output result of operations performed by the network-model-compressed deep neural network; and a first adjustment module, configured to, if the output result cannot satisfy a preset effect, use the difference between the output result of the original deep neural network and the output result of the network-model-compressed deep neural network to adjust, by a preset algorithm, the weights in the operation units of each network layer of the network-model-compressed deep neural network until the output result satisfies the preset effect.
  • Optionally, the apparatus further includes: a third obtaining module, configured to obtain the correlation between the operation units of any network layer in the network-model-compressed deep neural network; a determining module, configured to determine whether the correlation is less than a preset correlation; and a second adjustment module, configured to, if the determination result of the determining module is negative, adjust the weights in the operation units of that network layer with a preset regularization term, and stop adjusting the weights in the operation units when the correlation is less than the preset correlation.
  • In a third aspect, an embodiment of the present application provides a computer-readable storage medium for storing executable code, the executable code being used to perform, at runtime, the network model compression method for a deep neural network provided by the first aspect of the embodiments of the present application.
  • In a fourth aspect, an embodiment of the present application provides an application program for performing, at runtime, the network model compression method for a deep neural network provided by the first aspect of the embodiments of the present application.
  • In a fifth aspect, an embodiment of the present application provides a computer device, including a processor and a computer-readable storage medium, where the computer-readable storage medium is configured to store executable code, and the processor is configured to implement, when executing the executable code stored on the computer-readable storage medium, the steps of the network model compression method for a deep neural network provided by the first aspect of the embodiments of the present application.
  • In the solutions provided by the embodiments of the present application, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting them yields the network-model-compressed deep neural network. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on the results of target recognition and target detection is relatively small, deleting the operation units to be deleted does not affect the recognition and detection of the target. By deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and memory and bandwidth resource consumption is reduced, thereby improving the efficiency of target recognition and target detection.
  • FIG. 1 is a schematic flowchart of a network model compression method for a deep neural network according to an embodiment of the present application;
  • FIG. 2 is a schematic flowchart of a network model compression method for a deep neural network according to another embodiment of the present application;
  • FIG. 3 is a schematic flowchart of a network model compression method for a deep neural network according to yet another embodiment of the present application;
  • FIG. 4 is a schematic flowchart of a network model compression method for a deep neural network according to still another embodiment of the present application;
  • FIG. 5 is a schematic structural diagram of a network model compression apparatus for a deep neural network according to an embodiment of the present application;
  • FIG. 6 is a schematic structural diagram of a network model compression apparatus for a deep neural network according to another embodiment of the present application;
  • FIG. 7 is a schematic structural diagram of a network model compression apparatus for a deep neural network according to yet another embodiment of the present application;
  • FIG. 8 is a schematic structural diagram of a network model compression apparatus for a deep neural network according to still another embodiment of the present application;
  • FIG. 9 is a schematic structural diagram of a computer device according to an embodiment of the present application.
  • To improve the efficiency of target detection, the embodiments of the present application provide a network model compression method, apparatus, and computer device for a deep neural network.
  • The execution body of the network model compression method for a deep neural network provided by the embodiments of the present application may be a computer device that implements functions such as image classification, speech recognition, and target detection; it may also be a camera with functions such as image classification and target detection, or a microphone with a speech recognition function. The execution body includes at least a core processing chip with data processing capability.
  • The network model compression method for a deep neural network provided by the embodiments of the present application may be implemented by at least one of software, hardware circuits, and logic circuits disposed in the execution body.
  • As shown in FIG. 1, a network model compression method for a deep neural network provided by an embodiment of the present application may include the following steps:
  • S101: Acquire an original deep neural network.
  • The original deep neural network is a deep neural network that implements target recognition and target detection functions such as image classification, speech recognition, and target detection, and is designed according to the target features that need to be recognized and detected. By acquiring the original deep neural network, its network model can be obtained, that is, the network layers of the original deep neural network, the operation units of each network layer, and the network parameters of each network layer, where the network parameters include the number of operation units contained in the network layer and the specific values in each operation unit.
  • In current target recognition and target detection technology, target features are complex and the target features to be extracted are numerous. This makes the structure of the network model of the original deep neural network complicated, with a large number of network layers and of operation units in each network layer, and the many network layers and operation units consume excessive memory and bandwidth resources, so the computational complexity of target recognition and target detection is high. Therefore, in this embodiment, the original deep neural network needs to be analyzed and its network model compressed, so as to reduce the computational complexity and thereby improve the efficiency of target recognition and target detection.
  • S102: Analyze the importance of each operation unit in a network layer of the original deep neural network, and determine the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted.
  • Each operation unit in a network layer of the original deep neural network may be used to extract a different target feature. For example, in a deep neural network for face recognition, a network layer may include an operation unit for extracting eye features, an operation unit for extracting nose features, an operation unit for extracting ear features, an operation unit for extracting the facial contour, and so on. In the actual feature-extraction process, some features have a great influence on the results of target recognition and target detection, while other features have essentially no influence on those results.
  • Analyzing the importance of each operation unit can be done by analyzing the degree to which each operation unit influences the results of target recognition and target detection: the stronger the influence, the higher the operation unit's importance. Taking face recognition as an example, the operation units that extract features such as eyes, nose, and ears influence the result more strongly than the operation units that extract features such as hair color, glasses, and earrings; that is, the operation units extracting eye, nose, and ear features are more important. By analyzing the importance of each operation unit, the importance of each operation unit is obtained; for example, an importance can be configured according to the degree of influence of each operation unit on the target recognition and target detection results.
  • After the importance of each operation unit is obtained, it may be compared with the preset importance; if an operation unit's importance is lower than the preset importance, that operation unit is determined as an operation unit to be deleted.
  • The preset importance is a pre-set importance level for operation units, generally set according to how much the features of the target to be recognized and detected influence target recognition and target detection. For example, the importance may be divided into a first importance, a second importance, a third importance, and a fourth importance, ordered by degree of influence on target recognition and target detection: the first importance is stronger than the second, the second stronger than the third, and the third stronger than the fourth. Suppose the features extracted by the operation units of the first, second, and third importance are indispensable for recognizing and detecting the target, while the features extracted by the operation units of the fourth importance have little influence on the final result; the third importance may then be set as the preset importance, and an operation unit whose importance is the fourth importance, being below the preset importance, may be determined as an operation unit to be deleted.
  • The preset importance may also be an importance determined after obtaining, through analysis, the number of operation units that can be deleted from the network layer. For example, for a certain network layer, if analysis shows that five operation units can be deleted while the layer contains twelve operation units in total, the minimum importance among the remaining seven operation units may be taken as the preset importance. In general, the importance of each of the five deletable operation units is smaller than this preset importance, so the five operation units whose importance is lower than the preset importance can be determined as operation units to be deleted. Performing step S102 for each network layer in the original deep neural network yields the operation units to be deleted of each network layer.
  • Optionally, S102 may specifically be: first, extract the absolute weight value of each operation unit in the network layer of the original deep neural network; second, configure the importance of each operation unit according to its absolute weight value, where the absolute weight value of each operation unit is proportional to the configured importance; third, based on the importance of each operation unit, determine the operation units whose importance is lower than the preset importance as the operation units to be deleted.
  • the absolute value of the weight of each operation unit in the network layer of the original deep neural network represents the degree of influence of the operation unit on the result of target recognition and target detection.
  • the degree of influence of the results of the target detection is stronger. Therefore, the importance degree of each operation unit can be configured according to the absolute value of the weight of each operation unit.
  • the absolute value of the weight can be directly used as the importance degree, or the weight value in a certain interval can be determined according to the absolute value of the weight value.
  • the absolute value is configured as the higher importance, the medium importance, and the lower importance, and the absolute value of the weight is proportional to the importance, that is, the greater the absolute value of the weight, the higher the importance.
  • the importance can be further divided according to the requirements, for example, divided into the first importance degree, the second importance degree, the third importance degree, the fourth importance degree, and the like.
  • the operation unit whose importance is lower than the preset importance degree may be determined as the operation unit to be deleted. For example, if the preset importance is medium importance, the operation unit of lower importance may be used. Determined as the unit to be deleted.
  • S103: Delete the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.
  • The operation units to be deleted in each network layer of the original deep neural network are exactly the operation units that have little influence on the results of target recognition and target detection. Because their influence on the results is small, they can be deleted directly; in this way, the network model of the deep neural network can be compressed without affecting the results of target recognition and target detection, reducing the computational complexity of target recognition and target detection and the consumption of memory and bandwidth resources, and thereby improving the efficiency of target recognition and target detection.
  • By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than the preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting them yields the network-model-compressed deep neural network. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on the results of target recognition and target detection is relatively small, deleting them does not affect the recognition and detection of the target. By deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and memory and bandwidth resource consumption is reduced, thereby improving the efficiency of target recognition and target detection.
  • Based on the embodiment shown in FIG. 1, an embodiment of the present application further provides a network model compression method for a deep neural network. As shown in FIG. 2, the method may include the following steps:
  • S201: Acquire an original deep neural network.
  • S202: Analyze, with a rank analysis tool, the network layer of the original deep neural network to obtain, under the condition that a preset error tolerance is satisfied, a first number of operation units to be deleted in the network layer.
  • For the i-th network layer Layer_i of the original deep neural network, which has m_i operation units, the rank of the matrix composed of the m_i operation units characterizes how many important operation units Layer_i contains. For example, if a rank analysis tool finds that the rank of the matrix composed of the m_i operation units is 3 while the actual total number of operation units in Layer_i is 8, then Layer_i contains 3 important operation units and 5 operation units that are not important, so the maximum number of operation units that can be deleted is 5. To ensure that the results of target recognition and detection stay within a certain error range, the number of operation units to be deleted must be determined based on the preset error tolerance ε: if deleting 4 operation units would make the result error exceed ε while deleting 3 operation units keeps the result error below ε, the first number of operation units to be deleted may be determined as 3. Usually, to simplify the network structure of the deep neural network as much as possible, the first number may be determined as the maximum number of operation units that can be deleted under the condition that the preset error tolerance is satisfied; of course, a first number smaller than this maximum, for example 2 or 1 in the above example, also simplifies the network structure.
  • Illustratively, the rank analysis tool may be the PCA (Principal Component Analysis) method; in general, the rank analysis tool may be any method that obtains the rank of a matrix through analysis, which is not enumerated here.
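  • The patent names PCA as one rank analysis tool but fixes no exact procedure. The sketch below is one assumed realization: it uses singular values (numerically equivalent to a PCA of the unit matrix) and takes the relative energy of the discarded singular values as the error measure, returning the largest number of deletable units for which the error stays within the preset tolerance ε. The name deletable_count and the error measure are assumptions:

```python
import numpy as np

def deletable_count(layer_weights: np.ndarray, eps: float) -> int:
    """Treat the layer's m_i units as matrix rows and return the largest
    number of units that can be deleted while the relative reconstruction
    error (energy of the dropped singular values) stays within eps."""
    s = np.linalg.svd(layer_weights, compute_uv=False)  # descending order
    total = np.sum(s ** 2)
    m = len(s)
    for k in range(m, 0, -1):                  # try deleting k units
        err = np.sqrt(np.sum(s[m - k:] ** 2) / total)
        if err <= eps:
            return k                           # error tolerance satisfied
    return 0

# Usage: 3 strong (important) units plus 5 near-redundant ones.
rng = np.random.default_rng(1)
weights = np.vstack([rng.normal(size=(3, 64)),
                     0.01 * rng.normal(size=(5, 64))])
print("first number:", deletable_count(weights, eps=0.05))  # about 5
```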
  • S203: Analyze the importance of each operation unit in the network layer to obtain the importance of each operation unit in the network layer. The importance of each operation unit can be analyzed according to step S102 of the embodiment shown in FIG. 1, which is not repeated here.
  • S204: Select n_i operation units in ascending order of the importance of the operation units in the network layer, and take these n_i operation units as the operation units to be deleted, where n_i is the first number of operation units to be deleted for network layer Layer_i.
  • After the first number of operation units to be deleted is determined and the importance of each operation unit in the network layer is obtained, the operation units with the lowest importance can be determined as the operation units to be deleted. For network layer Layer_i, if the number of operation units to be deleted is n_i and the total number of operation units is m_i, the n_i operation units with the lowest importance are determined as the operation units to be deleted, so that Layer_i finally has m_i - n_i operation units. For example, suppose the first number of operation units to be deleted is 3 and the ten operation units of the i-th network layer Layer_i, in ascending order of importance, are: the fifth operation unit, the second operation unit, the seventh operation unit, the first operation unit, the eighth operation unit, the tenth operation unit, the sixth operation unit, the third operation unit, the fourth operation unit, and the ninth operation unit. After the first number and the importance ordering are determined, the preset importance may be set to the importance of the first operation unit, so that the fifth, second, and seventh operation units, whose importance is lower, can be determined as the operation units to be deleted.
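  • A minimal sketch of this selection and deletion step (illustrative only; prune_layer is an assumed helper, and importance is again taken as the mean absolute weight):

```python
import numpy as np

def prune_layer(layer_weights: np.ndarray, n_delete: int) -> np.ndarray:
    """Delete the n_delete operation units with the lowest importance,
    leaving m_i - n_i units in the layer."""
    importance = np.abs(layer_weights).mean(axis=1)
    order = np.argsort(importance)        # ascending importance
    keep = np.sort(order[n_delete:])      # drop the n_delete least important
    return layer_weights[keep]

rng = np.random.default_rng(2)
layer = rng.normal(size=(10, 64))            # m_i = 10 operation units
print(prune_layer(layer, n_delete=3).shape)  # (7, 64): m_i - n_i units left
```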
  • By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than the preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting them yields the network-model-compressed deep neural network. Here the preset importance may be set according to the first number and the importance of each operation unit. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting them does not affect the recognition and detection of the target. By deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and memory and bandwidth resource consumption is reduced, thereby improving the efficiency of target recognition and target detection. Moreover, the first number of operation units with the lowest importance can be deleted while the deep neural network obtained after the deletion still satisfies the preset error tolerance, ensuring that the error of the target recognition and target detection results stays within a certain range, with high accuracy.
  • Based on the embodiment shown in FIG. 1, an embodiment of the present application further provides a network model compression method for a deep neural network. As shown in FIG. 3, the method may include the following steps:
  • S302: Analyze the importance of each operation unit in a network layer of the original deep neural network, and determine the operation units of the network layer whose importance is lower than the preset importance as operation units to be deleted.
  • If the output result obtained by performing operations with the network-model-compressed deep neural network cannot satisfy the preset effect, the difference between the output result of the original deep neural network and the output result of the network-model-compressed deep neural network is used to adjust, by a preset algorithm, the weights in the operation units of each network layer of the network-model-compressed deep neural network until the output result satisfies the preset effect.
  • The preset effect is the effect that target recognition and target detection need to achieve; that is, there is a certain deviation between the actual output result and the required effect. To reduce this deviation, the difference between the output result of the original deep neural network and that of the compressed network can be used to adjust the weights until the output of the adjusted, network-model-compressed deep neural network satisfies the preset effect. The preset algorithm may be a common gradient back-propagation algorithm, such as the BP algorithm, which is not described in detail here.
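  • The patent specifies only that the output difference is reduced by a preset back-propagation-style algorithm. Below is one assumed realization in PyTorch: the compressed network is fine-tuned so that its outputs approach those of the original network, with mean squared error standing in for the unspecified difference measure; finetune_compressed and the training hyperparameters are illustrative:

```python
import torch
import torch.nn as nn

def finetune_compressed(original: nn.Module, compressed: nn.Module,
                        data_loader, steps: int = 1000, lr: float = 1e-3):
    """Adjust the compressed network's weights by back-propagation until
    its outputs approach those of the original network."""
    original.eval()
    criterion = nn.MSELoss()  # difference between the two output results
    optimizer = torch.optim.SGD(compressed.parameters(), lr=lr)
    done = 0
    for inputs in data_loader:            # one pass; loop again if needed
        with torch.no_grad():
            target = original(inputs)     # reference output of original net
        loss = criterion(compressed(inputs), target)
        optimizer.zero_grad()
        loss.backward()                   # BP algorithm
        optimizer.step()
        done += 1
        if done >= steps:
            break
    return compressed
```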
  • By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than the preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting them yields the network-model-compressed deep neural network. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting them does not affect the recognition and detection of the target; compressing the network model in this way reduces the computational complexity of target recognition and target detection and reduces memory and bandwidth resource consumption, thereby improving the efficiency of target recognition and target detection. Moreover, if the output result of operations performed with the network-model-compressed deep neural network cannot satisfy the preset effect, the difference between the output result of the original deep neural network and that of the compressed network is used, through the preset algorithm, to adjust the weights in the operation units until the output result satisfies the preset effect, effectively avoiding the situation in which the output result after compression cannot satisfy the required effect and ensuring the accuracy of the target recognition and target detection results.
  • Based on the embodiment shown in FIG. 1, an embodiment of the present application further provides a network model compression method for a deep neural network. As shown in FIG. 4, the method may include the following steps:
  • S402: Analyze the importance of each operation unit in a network layer of the original deep neural network, and determine the operation units of the network layer whose importance is lower than the preset importance as operation units to be deleted.
  • After the operation units to be deleted are deleted, the correlation between the operation units in a network layer of the network-model-compressed deep neural network may still be high. When the correlation is high, redundant information remains between the operation units, resulting in poor performance of the network model. If the correlation between the operation units is greater than or equal to the preset correlation, the network layer still contains considerable redundant information and the network structure is not sufficiently streamlined.
  • In this case, a preset regularization term, such as an orthogonal regularization term, is used to adjust the weights in each operation unit of the network layer until the correlation is less than the preset correlation. If a conventional L2 regularization term is used in the original deep neural network, it may be replaced with the preset regularization term, such as an orthogonal regularization term, to reduce the correlation between the operation units.
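  • A common form of orthogonal regularization (assumed here; the patent only names the term) penalizes the deviation of the Gram matrix of the layer's unit weight vectors from the identity, which drives the pairwise correlation between operation units down:

```python
import torch
import torch.nn as nn

def orthogonal_regularization(weight: torch.Tensor,
                              strength: float = 1e-4) -> torch.Tensor:
    """Penalize correlation between operation units: push the Gram matrix
    of the row-normalized unit weight vectors toward the identity."""
    w = nn.functional.normalize(weight.flatten(1), dim=1)
    gram = w @ w.t()                       # cosine similarities of units
    eye = torch.eye(gram.size(0), device=w.device)
    return strength * ((gram - eye) ** 2).sum()

# Usage: add the term to the training loss during weight adjustment.
layer = nn.Linear(64, 10)
loss = orthogonal_regularization(layer.weight)
loss.backward()
```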
  • By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than the preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting them yields the network-model-compressed deep neural network. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting them does not affect the recognition and detection of the target; compressing the network model in this way reduces the computational complexity of target recognition and target detection and reduces memory and bandwidth resource consumption, thereby improving the efficiency of target recognition and target detection. In addition, the correlation between the operation units of each network layer in the compressed network is obtained, and when it is not less than the preset correlation, the preset regularization term is used to adjust the weights in the operation units of that network layer until the correlation is less than the preset correlation, which effectively reduces the influence of redundant information on the accuracy of the results and ensures the accuracy of the target recognition and target detection results.
  • Based on the embodiments shown in FIG. 3 and FIG. 4, an embodiment of the present application further provides a network model compression method for a deep neural network that includes all the steps of both embodiments; that is, the weights in the operation units are adjusted not only with the preset regularization term but also, by monitoring the output result, whenever the output result cannot satisfy the preset effect, so as to meet the high-precision and high-accuracy requirements of target recognition and target detection results. This is not described in detail here.
  • An embodiment of the present application provides a network model compression apparatus for a deep neural network. As shown in FIG. 5, the apparatus may include:
  • a first obtaining module 510, configured to acquire an original deep neural network;
  • a first determining module 520, configured to analyze the importance of each operation unit in a network layer of the original deep neural network and determine the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted;
  • a deleting module 530, configured to delete the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.
  • Optionally, the first determining module 520 is specifically configured to: extract the absolute weight value of each operation unit in the network layer of the original deep neural network; configure the importance of each operation unit according to its absolute weight value, the absolute weight value of each operation unit being proportional to the configured importance; and, based on the importance of each operation unit, determine the operation units whose importance is lower than the preset importance as the operation units to be deleted.
  • By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than the preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting them yields the network-model-compressed deep neural network. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting them does not affect the recognition and detection of the target. By deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and memory and bandwidth resource consumption is reduced, thereby improving the efficiency of target recognition and target detection.
  • An embodiment of the present application further provides a network model compression apparatus for a deep neural network. As shown in FIG. 6, the apparatus may include:
  • a first obtaining module 610, configured to acquire an original deep neural network;
  • an analysis module 620, configured to analyze, with a rank analysis tool, the network layer of the original deep neural network to obtain, under the condition that a preset error tolerance is satisfied, a first number of operation units to be deleted in the network layer;
  • a first determining module 630, configured to analyze the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer, select the first number of operation units in ascending order of the importance of the operation units in the network layer, and take the selected operation units as the operation units to be deleted;
  • a deleting module 640, configured to delete the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.
  • By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than the preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting them yields the network-model-compressed deep neural network. Here the preset importance may be set according to the first number and the importance of each operation unit. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting them does not affect the recognition and detection of the target. By deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and memory and bandwidth resource consumption is reduced, thereby improving the efficiency of target recognition and target detection. Moreover, the first number of operation units with the lowest importance can be deleted while the deep neural network obtained after the deletion still satisfies the preset error tolerance, ensuring that the error of the target recognition and target detection results stays within a certain range, with high accuracy.
  • An embodiment of the present application further provides a network model compression apparatus for a deep neural network. As shown in FIG. 7, the apparatus may include:
  • a first obtaining module 710, configured to acquire an original deep neural network;
  • a first determining module 720, configured to analyze the importance of each operation unit in a network layer of the original deep neural network and determine the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted;
  • a deleting module 730, configured to delete the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network;
  • a second obtaining module 740, configured to obtain an output result of operations performed by the network-model-compressed deep neural network;
  • a first adjustment module 750, configured to, if the output result cannot satisfy a preset effect, use the difference between the output result of the original deep neural network and the output result of the network-model-compressed deep neural network to adjust, by a preset algorithm, the weights in the operation units of each network layer of the network-model-compressed deep neural network until the output result satisfies the preset effect.
  • By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than the preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting them yields the network-model-compressed deep neural network. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting them does not affect the recognition and detection of the target; compressing the network model in this way reduces the computational complexity of target recognition and target detection and reduces memory and bandwidth resource consumption, thereby improving the efficiency of target recognition and target detection. Moreover, if the output result of operations performed with the network-model-compressed deep neural network cannot satisfy the preset effect, the difference between the output result of the original deep neural network and that of the compressed network is used, through the preset algorithm, to adjust the weights in the operation units until the output result satisfies the preset effect, effectively avoiding the situation in which the output result after compression cannot satisfy the required effect and ensuring the accuracy of the target recognition and target detection results.
  • An embodiment of the present application further provides a network model compression apparatus for a deep neural network. As shown in FIG. 8, the apparatus may include:
  • a first obtaining module 810, configured to acquire an original deep neural network;
  • a first determining module 820, configured to analyze the importance of each operation unit in a network layer of the original deep neural network and determine the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted;
  • a deleting module 830, configured to delete the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network;
  • a third obtaining module 840, configured to obtain the correlation between the operation units of any network layer in the network-model-compressed deep neural network;
  • a determining module 850, configured to determine whether the correlation is less than a preset correlation;
  • a second adjustment module 860, configured to, if the determination result of the determining module 850 is negative, adjust the weights in the operation units of that network layer with a preset regularization term, and stop adjusting the weights in the operation units when the correlation is less than the preset correlation.
  • By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than the preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting them yields the network-model-compressed deep neural network. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting them does not affect the recognition and detection of the target; compressing the network model in this way reduces the computational complexity of target recognition and target detection and reduces memory and bandwidth resource consumption, thereby improving the efficiency of target recognition and target detection. In addition, the correlation between the operation units of each network layer in the compressed network is obtained, and when it is not less than the preset correlation, the preset regularization term is used to adjust the weights in the operation units of that network layer until the correlation is less than the preset correlation, which effectively reduces the influence of redundant information on the accuracy of the results and ensures the accuracy of the target recognition and target detection results.
  • Based on the embodiments shown in FIG. 7 and FIG. 8, an embodiment of the present application further provides a network model compression apparatus for a deep neural network that includes all the modules of both embodiments, so as to meet the high-precision and high-accuracy requirements of target recognition and target detection results. This is not described in detail here.
  • An embodiment of the present application provides a computer-readable storage medium for storing executable code which, when run, performs the network model compression method for a deep neural network provided by the embodiments of the present application. Specifically, the network model compression method for a deep neural network may include: acquiring an original deep neural network; analyzing the importance of each operation unit in a network layer of the original deep neural network, and determining the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted; and deleting the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.
  • In this embodiment, the computer-readable storage medium stores executable code that, at runtime, performs the network model compression method for a deep neural network provided by the embodiments of the present application. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting the operation units to be deleted does not affect the recognition and detection of the target. By deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and memory and bandwidth resource consumption is reduced, thereby improving the efficiency of target recognition and target detection.
  • An embodiment of the present application provides an application program for performing, at runtime, the network model compression method for a deep neural network provided by the embodiments of the present application. Specifically, the network model compression method for a deep neural network may include: acquiring an original deep neural network; analyzing the importance of each operation unit in a network layer of the original deep neural network, and determining the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted; and deleting the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.
  • In this embodiment, the application program performs, at runtime, the network model compression method for a deep neural network provided by the embodiments of the present application. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting the operation units to be deleted does not affect the recognition and detection of the target. By deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and memory and bandwidth resource consumption is reduced, thereby improving the efficiency of target recognition and target detection.
  • An embodiment of the present application provides a computer device, as shown in FIG. 9, including a processor 901 and a computer-readable storage medium 902, where:
  • the computer-readable storage medium 902 is configured to store executable code;
  • the processor 901 is configured to perform the following steps when executing the executable code stored on the computer-readable storage medium 902: acquiring an original deep neural network; analyzing the importance of each operation unit in a network layer of the original deep neural network, and determining the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted; and deleting the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.
  • Optionally, in the step in which the processor 901 analyzes the importance of each operation unit in the network layer of the original deep neural network and determines the operation units of the network layer whose importance is lower than the preset importance as the operation units to be deleted, the specific implementation may be: extracting the absolute weight value of each operation unit in the network layer of the original deep neural network; configuring the importance of each operation unit according to its absolute weight value, the absolute weight value of each operation unit being proportional to the configured importance; and, based on the importance of each operation unit, determining the operation units whose importance is lower than the preset importance as the operation units to be deleted.
  • Optionally, the processor 901 is further configured to: analyze, with a rank analysis tool, the network layer of the original deep neural network to obtain, under the condition that a preset error tolerance is satisfied, a first number of operation units to be deleted in the network layer. In the step in which the processor 901 determines the operation units to be deleted, the specific implementation may then be: analyzing the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer; and selecting the first number of operation units in ascending order of the importance of the operation units in the network layer, and taking the selected operation units as the operation units to be deleted.
  • Optionally, the processor 901 is further configured to: obtain an output result of operations performed by the network-model-compressed deep neural network; and, if the output result cannot satisfy a preset effect, use the difference between the output result of the original deep neural network and the output result of the network-model-compressed deep neural network to adjust, by a preset algorithm, the weights in the operation units of each network layer of the network-model-compressed deep neural network until the output result satisfies the preset effect.
  • Optionally, the processor 901 is further configured to: obtain the correlation between the operation units of any network layer in the network-model-compressed deep neural network; determine whether the correlation is less than a preset correlation; and, if not, adjust the weights in the operation units of that network layer with a preset regularization term, stopping the adjustment of the weights in the operation units when the correlation is less than the preset correlation.
  • The computer-readable storage medium 902 and the processor 901 may transmit data through a wired or wireless connection, and the computer device may communicate with other devices through a wired or wireless communication interface.
  • The above computer-readable storage medium may include a RAM (Random Access Memory) and may also include an NVM (Non-Volatile Memory), for example at least one disk storage. Optionally, the computer-readable storage medium may also be at least one storage device located remotely from the aforementioned processor.
  • The above processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application-Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
  • The processor of the computer device, by reading and executing the executable code stored in the computer-readable storage medium, runs the program corresponding to the executable code, and the program, when run, performs the network model compression method for a deep neural network provided by the embodiments of the present application, which can achieve the following: because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on target recognition and target detection is relatively small, deleting the operation units to be deleted does not affect the recognition and detection of the target; by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and memory and bandwidth resource consumption is reduced, thereby improving the efficiency of target recognition and target detection.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

A network model compression method and apparatus for a deep neural network, and a computer device. The network model compression method for a deep neural network includes: acquiring an original deep neural network (S101); analyzing the importance of each operation unit in a network layer of the original deep neural network, and determining the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted (S102); and deleting the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network (S103). The method can improve the efficiency of target recognition and target detection.

Description

Network model compression method and apparatus for deep neural network, and computer device

This application claims priority to Chinese patent application No. 201711092273.3, filed with the Chinese Patent Office on November 8, 2017 and entitled "Network model compression method and apparatus for deep neural network, and computer device", which is incorporated herein by reference in its entirety.

Technical Field

The present application relates to the field of data processing technologies, and in particular to a network model compression method, apparatus, and computer device for a deep neural network.

Background

DNN (Deep Neural Network), an emerging field of machine learning research, parses data by imitating the mechanism of the human brain; it is an intelligent model that analyzes and learns by building and simulating the human brain. Currently popular DNNs include CNN (Convolutional Neural Network), RNN (Recurrent Neural Network), LSTM (Long Short-Term Memory), and the like. Because a DNN can recognize and detect targets quickly and accurately through the operations of the multiple network layers in its network model, it has been widely applied to target detection and segmentation, behavior detection and recognition, speech recognition, and other fields.

With the development of target recognition and target detection technology, target features have become increasingly complex and more and more target features need to be extracted. As a result, in the design of DNN network models, the number of network layers, and of operation units within each network layer, has grown substantially. This increases the computational complexity of target recognition and target detection, and the large number of network layers and operation units consumes excessive memory and bandwidth resources, affecting the efficiency of target recognition and target detection.
Summary

The purpose of the embodiments of the present application is to provide a network model compression method, apparatus, and computer device for a deep neural network, so as to improve the efficiency of target recognition and target detection. The specific technical solutions are as follows:

In a first aspect, an embodiment of the present application provides a network model compression method for a deep neural network, the method including:

acquiring an original deep neural network;

analyzing the importance of each operation unit in a network layer of the original deep neural network, and determining the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted;

deleting the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.

Optionally, the analyzing the importance of each operation unit in the network layer of the original deep neural network and determining the operation units of the network layer whose importance is lower than the preset importance as the operation units to be deleted includes:

extracting the absolute weight value of each operation unit in the network layer of the original deep neural network;

configuring the importance of each operation unit according to the absolute weight value of each operation unit in the network layer, where the absolute weight value of each operation unit is proportional to the configured importance;

based on the importance of each operation unit, determining the operation units whose importance is lower than the preset importance as the operation units to be deleted.

Optionally, before the analyzing and determining, the method further includes:

analyzing, with a rank analysis tool, the network layer of the original deep neural network to obtain, under the condition that a preset error tolerance is satisfied, a first number of operation units to be deleted in the network layer;

the analyzing and determining then includes:

analyzing the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer;

selecting the first number of operation units in ascending order of the importance of the operation units in the network layer, and taking the selected operation units as the operation units to be deleted.

Optionally, after the deleting of the operation units to be deleted of each network layer in the original deep neural network to obtain the network-model-compressed deep neural network, the method further includes:

obtaining an output result of operations performed by the network-model-compressed deep neural network;

if the output result cannot satisfy a preset effect, using the difference between the output result of the original deep neural network and the output result of the network-model-compressed deep neural network to adjust, by a preset algorithm, the weights in the operation units of each network layer of the network-model-compressed deep neural network until the output result satisfies the preset effect.

Optionally, after the deleting of the operation units to be deleted of each network layer in the original deep neural network to obtain the network-model-compressed deep neural network, the method further includes:

obtaining the correlation between the operation units of any network layer in the network-model-compressed deep neural network;

determining whether the correlation is less than a preset correlation;

if not, adjusting the weights in the operation units of that network layer with a preset regularization term, and stopping the adjustment of the weights in the operation units when the correlation is less than the preset correlation.
In a second aspect, an embodiment of the present application provides a network model compression apparatus for a deep neural network, the apparatus including:

a first obtaining module, configured to acquire an original deep neural network;

a first determining module, configured to analyze the importance of each operation unit in a network layer of the original deep neural network and determine the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted;

a deleting module, configured to delete the operation units to be deleted of each network layer in the original deep neural network to obtain a network-model-compressed deep neural network.

Optionally, the first determining module is specifically configured to:

extract the absolute weight value of each operation unit in the network layer of the original deep neural network;

configure the importance of each operation unit according to the absolute weight value of each operation unit in the network layer, where the absolute weight value of each operation unit is proportional to the configured importance;

based on the importance of each operation unit, determine the operation units whose importance is lower than the preset importance as the operation units to be deleted.

Optionally, the apparatus further includes:

an analysis module, configured to analyze, with a rank analysis tool, the network layer of the original deep neural network to obtain, under the condition that a preset error tolerance is satisfied, a first number of operation units to be deleted in the network layer;

the first determining module is specifically configured to:

analyze the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer;

select the first number of operation units in ascending order of the importance of the operation units in the network layer, and take the selected operation units as the operation units to be deleted.

Optionally, the apparatus further includes:

a second obtaining module, configured to obtain an output result of operations performed by the network-model-compressed deep neural network;

a first adjustment module, configured to, if the output result cannot satisfy a preset effect, use the difference between the output result of the original deep neural network and the output result of the network-model-compressed deep neural network to adjust, by a preset algorithm, the weights in the operation units of each network layer of the network-model-compressed deep neural network until the output result satisfies the preset effect.

Optionally, the apparatus further includes:

a third obtaining module, configured to obtain the correlation between the operation units of any network layer in the network-model-compressed deep neural network;

a determining module, configured to determine whether the correlation is less than a preset correlation;

a second adjustment module, configured to, if the determination result of the determining module is negative, adjust the weights in the operation units of that network layer with a preset regularization term, and stop adjusting the weights in the operation units when the correlation is less than the preset correlation.
In a third aspect, an embodiment of the present application provides a computer-readable storage medium for storing executable code, the executable code being used to perform, at runtime, the network model compression method for a deep neural network provided by the first aspect of the embodiments of the present application.

In a fourth aspect, an embodiment of the present application provides an application program for performing, at runtime, the network model compression method for a deep neural network provided by the first aspect of the embodiments of the present application.

In a fifth aspect, an embodiment of the present application provides a computer device, including a processor and a computer-readable storage medium, where:

the computer-readable storage medium is configured to store executable code;

the processor is configured to implement, when executing the executable code stored on the computer-readable storage medium, the steps of the network model compression method for a deep neural network provided by the first aspect of the embodiments of the present application.

In summary, in the solutions provided by the embodiments of the present application, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, the operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer in the original deep neural network are thereby obtained, and deleting the operation units to be deleted of each network layer yields the network-model-compressed deep neural network. Because the importance of the operation units to be deleted is lower than the preset importance, that is, their influence on the results of target recognition and target detection is relatively small, deleting the operation units to be deleted does not affect the recognition and detection of the target. In this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and memory and bandwidth resource consumption is reduced, thereby improving the efficiency of target recognition and target detection.
Brief Description of the Drawings

To describe the technical solutions of the embodiments of the present application and of the prior art more clearly, the following briefly introduces the drawings required for the embodiments and the prior art. Obviously, the drawings in the following description are only some embodiments of the present application; those of ordinary skill in the art may derive other drawings from these drawings without creative effort.

FIG. 1 is a schematic flowchart of a network model compression method for a deep neural network according to an embodiment of the present application;

FIG. 2 is a schematic flowchart of a network model compression method for a deep neural network according to another embodiment of the present application;

FIG. 3 is a schematic flowchart of a network model compression method for a deep neural network according to yet another embodiment of the present application;

FIG. 4 is a schematic flowchart of a network model compression method for a deep neural network according to still another embodiment of the present application;

FIG. 5 is a schematic structural diagram of a network model compression apparatus for a deep neural network according to an embodiment of the present application;

FIG. 6 is a schematic structural diagram of a network model compression apparatus for a deep neural network according to another embodiment of the present application;

FIG. 7 is a schematic structural diagram of a network model compression apparatus for a deep neural network according to yet another embodiment of the present application;

FIG. 8 is a schematic structural diagram of a network model compression apparatus for a deep neural network according to still another embodiment of the present application;

FIG. 9 is a schematic structural diagram of a computer device according to an embodiment of the present application.
Detailed Description

To make the objectives, technical solutions, and advantages of the present application clearer, the present application is further described in detail below with reference to the accompanying drawings and embodiments. Obviously, the described embodiments are only some of the embodiments of the present application rather than all of them. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.

The present application is described in detail below through specific embodiments.

To improve the efficiency of target detection, the embodiments of the present application provide a network model compression method, apparatus, and computer device for a deep neural network.

The network model compression method for a deep neural network provided by the embodiments of the present application is introduced first.

The execution body of the network model compression method for a deep neural network provided by the embodiments of the present application may be a computer device that implements functions such as image classification, speech recognition, and target detection; it may also be a camera with functions such as image classification and target detection, or a microphone with a speech recognition function. The execution body includes at least a core processing chip with data processing capability. The network model compression method for a deep neural network provided by the embodiments of the present application may be implemented by at least one of software, hardware circuits, and logic circuits disposed in the execution body.

As shown in FIG. 1, a network model compression method for a deep neural network provided by an embodiment of the present application may include the following steps:

S101: Acquire an original deep neural network.

The original deep neural network is a deep neural network that implements target recognition and target detection functions such as image classification, speech recognition, and target detection, and is designed according to the target features that need to be recognized and detected. By acquiring the original deep neural network, its network model can be obtained, that is, the network layers of the original deep neural network, the operation units of each network layer, and the network parameters of each network layer, where the network parameters include the number of operation units contained in the network layer and the specific values in each operation unit.

In current target recognition and target detection technology, target features are complex and the target features to be extracted are numerous. This makes the structure of the network model of the original deep neural network complicated, with a large number of network layers and of operation units in each network layer, and the many network layers and operation units consume excessive memory and bandwidth resources, so the computational complexity of target recognition and target detection is high. Therefore, in this embodiment, the original deep neural network needs to be analyzed and its network model compressed, so as to reduce the computational complexity and thereby improve the efficiency of target recognition and target detection.

S102: Analyze the importance of each operation unit in a network layer of the original deep neural network, and determine the operation units of the network layer whose importance is lower than a preset importance as operation units to be deleted.

Each operation unit in a network layer of the original deep neural network may be used to extract a different target feature. For example, in a deep neural network for face recognition, a network layer contains an operation unit for extracting eye features, an operation unit for extracting nose features, an operation unit for extracting ear features, an operation unit for extracting the facial contour, and so on. In the actual feature-extraction process, some features have a great influence on the results of target recognition and target detection, while others have essentially no influence on those results. For example, in a deep neural network for face recognition, eye, nose, and ear features strongly influence the result: without extracting them, a face cannot be correctly detected and recognized. Features such as hair color, whether glasses are worn, or whether earrings are worn influence the result relatively little: not extracting them does not affect the result of detecting and recognizing a face.

Analyzing the importance of each operation unit in a network layer of the original deep neural network can be done by analyzing the degree to which each operation unit influences the results of target recognition and target detection: the stronger the influence, the higher the operation unit's importance. The degree of influence may be an attribute parameter that characterizes importance, such as the weights of each operation unit. During sample training, the weights of the operation units of a deep neural network are continually adjusted. Taking face recognition as an example, after sample training, the absolute weight values of the operation units that extract features such as eyes, nose, and ears are larger than those of the operation units that extract features such as hair color, glasses, and earrings, indicating that the former influence the face recognition result more strongly, that is, the operation units extracting eye, nose, and ear features are more important. Another such parameter is the proportion of the feature elements (pixels, etc.) extracted by each operation unit among the total elements of the target to be recognized and detected, which can be obtained through feature extraction and analysis of the feature elements. Again taking face recognition as an example, the elements of the eye, nose, and ear features extracted by the operation units account for a larger proportion of the target's total elements than the elements of features such as hair color, glasses, and earrings, again indicating that the operation units extracting eye, nose, and ear features are more important. By analyzing the importance of each operation unit, the importance of each operation unit is obtained; for example, a corresponding importance can be configured according to the degree of influence of each operation unit on the target recognition and target detection results.
在得到各运算单元的重要度后,可以分别与预设重要度进行比较,如果重要度低于预设重要度,则将该运算单元确定为待删除运算单元。其中,预设重要度为预先设定的运算单元的重要程度,一般根据需要识别和检测的目标的特征对目标识别和目标检测的影响设定,例如,重要度分为第一重要度、第二重要度、第三重要度、第四重要度,并且对目标识别与目标检测的影响程度的顺序依次为第一重要度强于第二重要度、第二重要度强于第三重要度、第三重要度强于第四重要度。假设第一重要度、第二重要度及第三重要度对应的运算单元所提取的特征对于识别和检测的目标来说不可或缺,即如果没有这些特征,无法正确识别和检测目标,而第四重要度对应的运算单元所提 取的特征对于最终的目标识别和检测结果影响不大,则可以将第三重要度设定为预设重要度,如果一个运算单元的重要度为第四重要度,由于低于预设重要度,则可以将该运算单元确定为待删除运算单元。
预设重要度还可以是根据通过分析得到网络层中可以删除的运算单元的个数后确定的重要度,例如,针对某一网络层,通过分析得到该网络层可删除的运算单元的个数为5,而该网络层中运算单元的总数为12,则可以将其余7个运算单元中最小的重要度确定为预设重要度,一般情况下,5个可删除的运算单元的重要度均小于该预设重要度,这样,即可将重要度低于预设重要度的5个运算单元确定为待删除运算单元。对于原始深度神经网络中每个网络层均执行S102的步骤,则可以得到每个网络层的待删除运算单元。
Optionally, S102 may specifically be as follows.
In a first step, the absolute value of the weight of each operation unit in the network layer of the original deep neural network is extracted.
In a second step, the importance of each operation unit is configured according to the absolute value of the weight of each operation unit in the network layer, wherein the absolute value of the weight of each operation unit is in direct proportion to the configured importance.
In a third step, based on the importance of each operation unit, an operation unit whose importance is lower than the preset importance is determined as an operation unit to be deleted.
The absolute value of the weight of each operation unit in the network layer of the original deep neural network represents the degree to which that operation unit influences the results of target recognition and target detection: the larger the absolute value of the weight, the stronger the influence. Therefore, the importance of each operation unit can be configured according to the absolute value of its weight. Specifically, the absolute value of the weight may be used directly as the importance, or absolute weight values falling within certain intervals may be configured as high, medium, or low importance, the absolute value of the weight being in direct proportion to the importance, that is, the larger the absolute value of the weight, the higher the importance. Of course, the importance may also be divided in more detail as required, for example into a first importance, a second importance, a third importance, a fourth importance, and so on. Based on the importance of each operation unit, an operation unit whose importance is lower than the preset importance can then be determined as an operation unit to be deleted; for example, if the preset importance is the medium importance, operation units of low importance can be determined as the operation units to be deleted.
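For illustration only (not part of the original disclosure), the weight-based importance analysis above can be sketched in Python; the per-unit score below — the sum of the absolute values of a unit's weights — is one possible aggregate consistent with the stated proportionality, and the function and variable names are hypothetical:

```python
import numpy as np

def select_units_to_delete(layer_weights, preset_importance):
    """Rank the operation units of one layer by the absolute values of
    their weights and mark those below the preset importance for deletion.

    layer_weights: array of shape (num_units, ...), one weight slice per
    operation unit (e.g. one convolution filter per unit).
    """
    num_units = layer_weights.shape[0]
    # Importance proportional to the absolute weight values of each unit;
    # the sum of |w| per unit serves as the aggregate score here.
    importance = np.abs(layer_weights.reshape(num_units, -1)).sum(axis=1)
    to_delete = np.flatnonzero(importance < preset_importance)
    return importance, to_delete

# Toy layer with 8 units, each holding a 3x3 weight kernel.
rng = np.random.default_rng(0)
weights = rng.normal(size=(8, 3, 3))
scores, to_delete = select_units_to_delete(weights, preset_importance=7.0)
```

Units whose score falls below the preset importance are exactly the candidates that S102 marks for deletion.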
S103: deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
The operation units to be deleted of each network layer of the original deep neural network are the operation units with little influence on the results of target recognition and target detection. Since their influence on the results is small, they can be deleted directly from each network layer of the original deep neural network. In this way, the network model of the deep neural network can be compressed without affecting the results of target recognition and target detection, so that the computational complexity of target recognition and target detection is reduced and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection.
By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer of the original deep neural network are thereby obtained, and deleting them yields a deep neural network with a compressed network model. Since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target. In this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection.
Based on the embodiment shown in Fig. 1, an embodiment of the present application further provides a network model compression method for a deep neural network. As shown in Fig. 2, the method may include the following steps.
S201: acquiring an original deep neural network.
S202: analyzing, with a rank analysis tool, a network layer of the original deep neural network to obtain a first number of the operation units to be deleted in the network layer under the condition that a preset error tolerance is satisfied.
For the i-th network layer Layer_i of the original deep neural network having m_i operation units, the rank of the matrix formed by the m_i operation units characterizes how many important operation units Layer_i contains. For example, if the rank analysis tool finds that the rank of the matrix formed by the m_i operation units is 3 while the actual total number of operation units in Layer_i is 8, then Layer_i contains 3 important operation units and 5 operation units that are not important, so the maximum number of deletable operation units is 5. To ensure that the results of target recognition and detection stay within a certain error range, the number of operation units to be deleted must be determined based on a preset error tolerance ε. As another example, if deleting 4 operation units would make the result error exceed the preset error tolerance ε while deleting 3 keeps it below ε, the first number of operation units to be deleted may be determined as 3. Usually, to simplify the network structure of the deep neural network as much as possible, the first number may be determined as the maximum number of operation units deletable under the condition that the preset error tolerance is satisfied. Of course, the first number may also be a value smaller than this maximum — in the above example, 2 or 1 — which also simplifies the network structure and therefore likewise falls within the scope of protection of the embodiments of the present application. Exemplarily, the rank analysis tool may be the PCA (Principal Component Analysis) method; of course, the rank analysis tool may be any method that obtains the rank of a matrix through analysis, which is not enumerated here.
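A minimal sketch of such a rank analysis, assuming the layer's m_i operation units are flattened into the rows of a matrix and using singular values (the machinery underlying PCA); the relative-energy error criterion is an assumption, since the disclosure fixes only the tolerance ε:

```python
import numpy as np

def first_number_to_delete(layer_weights, error_tolerance):
    """Estimate how many of a layer's m_i operation units can be deleted
    while the relative truncation error stays within the preset error
    tolerance epsilon, using singular values as a simple rank analysis."""
    m = layer_weights.shape[0]
    s = np.linalg.svd(layer_weights.reshape(m, -1), compute_uv=False)
    energy = s ** 2
    total = energy.sum()
    for rank in range(1, m + 1):
        # Relative error of keeping only the top `rank` components.
        err = np.sqrt(energy[rank:].sum() / total)
        if err <= error_tolerance:
            return m - rank   # maximum deletable units under epsilon
    return 0
```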
S203: analyzing the importance of each operation unit in the network layer to obtain the importance of each operation unit in the network layer.
The importance of each operation unit in the network layer may be analyzed following step S102 of the embodiment shown in Fig. 1 to obtain the importance of each operation unit, which is not repeated here.
S204: selecting n_i operation units in ascending order of the importance of the operation units in the network layer, and taking these n_i operation units as the operation units to be deleted, where n_i is the first number of the operation units to be deleted in the network layer Layer_i.
After the first number of operation units to be deleted is determined and the importance of each operation unit in the network layer is obtained, the operation units with the lowest importance can be determined as the operation units to be deleted. For the network layer Layer_i, if the number of operation units to be deleted is n_i and the total number of operation units is m_i, the n_i operation units with the lowest importance can be determined as the operation units to be deleted, so that the final number of operation units of Layer_i is m_i - n_i. For example, suppose the first number of operation units to be deleted is 3 and the 10 operation units of the i-th network layer Layer_i, in ascending order of importance, are the fifth, second, seventh, first, eighth, tenth, sixth, third, fourth, and ninth operation units. After the first number and the importance ranking are determined, the preset importance can be set to the importance of the first operation unit, so that the fifth, second, and seventh operation units, whose importance is lower, are determined as the operation units to be deleted.
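The ascending-order selection of S204 is a one-liner; below is a sketch with hypothetical importance values reproducing the example above (0-based indices, so units 4, 1, and 6 are the fifth, second, and seventh operation units of the text):

```python
import numpy as np

def pick_lowest_importance(importance, first_number):
    """Return the indices of the first_number operation units with the
    smallest importance, i.e. selection in ascending order of importance."""
    return np.argsort(importance)[:first_number]

importance = np.array([0.40, 0.15, 0.80, 0.85, 0.05,
                       0.65, 0.20, 0.55, 0.95, 0.60])
print(pick_lowest_importance(importance, 3))   # -> [4 1 6]
```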
S205: deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer are thereby obtained, and deleting them yields a deep neural network with a compressed network model. Once the first number of operation units to be deleted is determined with the rank analysis tool, the preset importance can be set according to this first number and the importance of each operation unit. Since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target. In this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection. The several operation units with the lowest importance, corresponding to the first number, can be deleted; the deep neural network after deletion satisfies the preset error tolerance condition, which guarantees that the error of the results of target recognition and target detection stays within a certain range and provides high accuracy.
Based on the embodiment shown in Fig. 1, an embodiment of the present application further provides a network model compression method for a deep neural network. As shown in Fig. 3, the method may include the following steps.
S301: acquiring an original deep neural network.
S302: determining, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted.
S303: deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
S304: acquiring an output result of an operation performed with the deep neural network with the compressed network model.
S305: if the output result fails to satisfy a preset effect, adjusting, by a preset algorithm and using the difference between the output result of the original deep neural network and the output result of the deep neural network with the compressed network model, the weights in the operation units of each network layer of the deep neural network with the compressed network model until the output result satisfies the preset effect.
Since the operation units are not entirely uncorrelated, deleting some operation units may affect the feature extraction performance of other operation units, so that the output result of an operation performed with the deep neural network with the compressed network model fails to satisfy the preset effect. The preset effect is the target recognition and target detection effect that needs to be achieved; that is, a certain deviation exists between the actual output result and the required effect. To reduce this deviation, the difference between the output result of the original deep neural network and that of the deep neural network with the compressed network model may be used to adjust, by a preset algorithm, the weights in the operation units of each network layer of the deep neural network with the compressed network model until the output result of the adjusted network satisfies the preset effect. The preset algorithm may be a currently common backward gradient propagation algorithm, for example the BP algorithm, which is not detailed here.
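A hedged PyTorch-style sketch of this adjustment step follows; the MSE loss, the SGD optimizer, and the loop structure are illustrative assumptions — the disclosure requires only that the difference between the two outputs drive a BP-style weight update:

```python
import torch
import torch.nn as nn

def finetune_compressed(original_net, compressed_net, data_loader,
                        epochs=1, lr=1e-4):
    """Drive the compressed network's output toward the original network's
    output on the same inputs, updating weights by back-propagation."""
    original_net.eval()
    criterion = nn.MSELoss()   # one way to measure the output difference
    optimizer = torch.optim.SGD(compressed_net.parameters(), lr=lr)
    for _ in range(epochs):
        for inputs, _ in data_loader:
            with torch.no_grad():
                reference = original_net(inputs)   # original DNN's output
            loss = criterion(compressed_net(inputs), reference)
            optimizer.zero_grad()
            loss.backward()    # BP through the compressed network
            optimizer.step()
```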
By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer are thereby obtained, and deleting them yields a deep neural network with a compressed network model. Since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target. In this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection. Furthermore, if the output result of an operation performed with the deep neural network with the compressed network model fails to satisfy the preset effect, the difference between the output result of the original deep neural network and that of the deep neural network with the compressed network model is used to adjust, by the preset algorithm, the weights in the operation units until the output result satisfies the preset effect, which effectively avoids the situation in which a high correlation between operation units causes the output result to fall short of the required effect, and guarantees the accuracy of the results of target recognition and target detection.
Based on the embodiment shown in Fig. 1, an embodiment of the present application further provides a network model compression method for a deep neural network. As shown in Fig. 4, the method may include the following steps.
S401: acquiring an original deep neural network.
S402: determining, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted.
S403: deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
S404: acquiring the correlation between the operation units of any network layer of the deep neural network with the compressed network model.
S405: judging whether the correlation is smaller than a preset correlation; if yes, performing S406; otherwise, performing S407.
S406: stopping adjusting the weights in the operation units.
S407: adjusting the weights in the operation units of the network layer with a preset regularization term.
After the operation units to be deleted are deleted, a relatively high correlation may still exist between the operation units of a network layer of the deep neural network with the compressed network model. When the correlation is high, a certain amount of redundant information still exists between the operation units, resulting in poor performance of the network model. If the correlation between the operation units is greater than or equal to the preset correlation, the network layer contains considerable redundant information and its structure is not sufficiently streamlined; a preset regularization term, for example an orthogonal regularization term, may then be used to adjust the operation units of the network layer until the correlation is smaller than the preset correlation. If the original deep neural network uses, for example, a conventional L2 regularization term, this conventional L2 regularization term may be replaced with a preset regularization term such as an orthogonal regularization term, so as to reduce the correlation between the operation units.
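As a sketch (assuming PyTorch, and taking the normalized inner product of flattened unit weights as the correlation measure, which is an assumption not fixed by the disclosure), the correlation check and an orthogonal regularization term could look like:

```python
import torch
import torch.nn.functional as F

def max_unit_correlation(weight):
    """Largest absolute pairwise correlation between the units (rows) of a
    layer; compared against the preset correlation to decide whether the
    weights still need adjusting."""
    w = F.normalize(weight.flatten(1), dim=1)
    gram = (w @ w.t()).abs()
    gram.fill_diagonal_(0)
    return gram.max().item()

def orthogonal_penalty(weight):
    """Orthogonal regularization term: penalize off-diagonal entries of
    W W^T so that the remaining units become less correlated."""
    w = F.normalize(weight.flatten(1), dim=1)
    gram = w @ w.t()
    off_diag = gram - torch.eye(gram.size(0), device=gram.device)
    return (off_diag ** 2).sum()
```

In training, `orthogonal_penalty` would be scaled by a coefficient and added to the task loss in place of, e.g., a conventional L2 term, and adjustment stops once `max_unit_correlation` drops below the preset correlation.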
By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer are thereby obtained, and deleting them yields a deep neural network with a compressed network model. Since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target. In this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection. Furthermore, the correlation between the operation units of a network layer of the deep neural network with the compressed network model is judged; if the correlation is greater than or equal to the preset correlation, the weights in the operation units of the network layer are adjusted with a preset regularization term until the correlation is smaller than the preset correlation, which effectively reduces the influence of redundant information on the precision of the results and guarantees the precision of the results of target recognition and target detection.
Based on the embodiments shown in Figs. 3 and 4, an embodiment of the present application further provides a network model compression method for a deep neural network, which may include all the steps of the embodiments shown in Figs. 3 and 4; that is, not only is the preset regularization term used to adjust the weights in the operation units, but the output result is also monitored, and when the output result fails to satisfy the preset effect, the weights in the operation units are adjusted, thereby meeting the high-precision and high-accuracy requirements of the results of target recognition and target detection, which is not detailed here.
Corresponding to the above method embodiments, an embodiment of the present application provides a network model compression apparatus for a deep neural network. As shown in Fig. 5, the apparatus may include:
a first acquiring module 510, configured to acquire an original deep neural network;
a first determining module 520, configured to determine, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted;
a deleting module 530, configured to delete the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
Optionally, the first determining module 520 may specifically be configured to:
extract the absolute value of the weight of each operation unit in the network layer of the original deep neural network;
configure the importance of each operation unit according to the absolute value of the weight of each operation unit in the network layer, wherein the absolute value of the weight of each operation unit is in direct proportion to the configured importance;
determine, based on the importance of each operation unit, an operation unit whose importance is lower than the preset importance as an operation unit to be deleted.
By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer of the original deep neural network are thereby obtained, and deleting them yields a deep neural network with a compressed network model. Since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target. In this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection.
Based on the embodiment shown in Fig. 5, an embodiment of the present application further provides a network model compression apparatus for a deep neural network. As shown in Fig. 6, the apparatus may include:
a first acquiring module 610, configured to acquire an original deep neural network;
an analyzing module 620, configured to analyze, with a rank analysis tool, a network layer of the original deep neural network to obtain a first number of the operation units to be deleted in the network layer under the condition that a preset error tolerance is satisfied;
a first determining module 630, configured to analyze the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer, select the first number of operation units in ascending order of the importance of the operation units in the network layer, and take the selected operation units as the operation units to be deleted;
a deleting module 640, configured to delete the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer are thereby obtained, and deleting them yields a deep neural network with a compressed network model. Once the first number of operation units to be deleted is determined with the rank analysis tool, the preset importance can be set according to this first number and the importance of each operation unit. Since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target. In this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection. The several operation units with the lowest importance, corresponding to the first number, can be deleted; the deep neural network after deletion satisfies the preset error tolerance condition, which guarantees that the error of the results of target recognition and target detection stays within a certain range and provides high accuracy.
Based on the embodiment shown in Fig. 5, an embodiment of the present application further provides a network model compression apparatus for a deep neural network. As shown in Fig. 7, the apparatus may include:
a first acquiring module 710, configured to acquire an original deep neural network;
a first determining module 720, configured to determine, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted;
a deleting module 730, configured to delete the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model;
a second acquiring module 740, configured to acquire an output result of an operation performed with the deep neural network with the compressed network model;
a first adjusting module 750, configured to, if the output result fails to satisfy a preset effect, adjust, by a preset algorithm and using the difference between the output result of the original deep neural network and the output result of the deep neural network with the compressed network model, the weights in the operation units of each network layer of the deep neural network with the compressed network model until the output result satisfies the preset effect.
By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer are thereby obtained, and deleting them yields a deep neural network with a compressed network model. Since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target. In this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection. Furthermore, if the output result of an operation performed with the deep neural network with the compressed network model fails to satisfy the preset effect, the difference between the output result of the original deep neural network and that of the deep neural network with the compressed network model is used to adjust, by the preset algorithm, the weights in the operation units until the output result satisfies the preset effect, which effectively avoids the situation in which a high correlation between operation units causes the output result to fall short of the required effect, and guarantees the accuracy of the results of target recognition and target detection.
Based on the embodiment shown in Fig. 5, an embodiment of the present application further provides a network model compression apparatus for a deep neural network. As shown in Fig. 8, the apparatus may include:
a first acquiring module 810, configured to acquire an original deep neural network;
a first determining module 820, configured to determine, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted;
a deleting module 830, configured to delete the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model;
a third acquiring module 840, configured to acquire the correlation between the operation units of any network layer of the deep neural network with the compressed network model;
a judging module 850, configured to judge whether the correlation is smaller than a preset correlation;
a second adjusting module 860, configured to, if the judging result of the judging module 850 is no, adjust the weights in the operation units of the network layer with a preset regularization term, and stop adjusting the weights in the operation units once the correlation is smaller than the preset correlation.
By applying this embodiment, the importance of each operation unit in a network layer of the acquired original deep neural network is analyzed, operation units of the network layer whose importance is lower than a preset importance are determined as operation units to be deleted, the operation units to be deleted of each network layer are thereby obtained, and deleting them yields a deep neural network with a compressed network model. Since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target. In this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection. Furthermore, the correlation between the operation units of a network layer of the deep neural network with the compressed network model is judged; if the correlation is greater than or equal to the preset correlation, the weights in the operation units of the network layer are adjusted with a preset regularization term until the correlation is smaller than the preset correlation, which effectively reduces the influence of redundant information on the precision of the results and guarantees the precision of the results of target recognition and target detection.
Based on the embodiments shown in Figs. 7 and 8, an embodiment of the present application further provides a network model compression apparatus for a deep neural network, which may include all the modules of the embodiments shown in Figs. 7 and 8, so as to meet the high-precision and high-accuracy requirements of the results of target recognition and target detection, which is not detailed here.
In addition, corresponding to the network model compression method for a deep neural network provided by the above embodiments, an embodiment of the present application provides a computer-readable storage medium for storing executable code, the executable code being configured to perform, at runtime, the network model compression method for a deep neural network provided by the embodiments of the present application. Specifically, the network model compression method for a deep neural network may include:
acquiring an original deep neural network;
determining, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted;
deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
In this embodiment, the computer-readable storage medium stores executable code that performs, at runtime, the network model compression method for a deep neural network provided by the embodiments of the present application, and can therefore achieve the following: since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target; in this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection.
In addition, corresponding to the network model compression method for a deep neural network provided by the above embodiments, an embodiment of the present application provides an application program configured to perform, at runtime, the network model compression method for a deep neural network provided by the embodiments of the present application. Specifically, the method may include:
acquiring an original deep neural network;
determining, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted;
deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
In this embodiment, the application program performs, at runtime, the network model compression method for a deep neural network provided by the embodiments of the present application, and can therefore achieve the following: since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target; in this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection.
In addition, corresponding to the network model compression method for a deep neural network provided by the above embodiments, an embodiment of the present application provides a computer device, as shown in Fig. 9, including a processor 901 and a computer-readable storage medium 902, wherein
the computer-readable storage medium 902 is configured to store executable code; and
the processor 901 is configured to implement the following steps when executing the executable code stored on the computer-readable storage medium 902:
acquiring an original deep neural network;
determining, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted;
deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
Optionally, in implementing the step of determining, by analyzing the importance of each operation unit in the network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted, the processor 901 may specifically implement:
extracting the absolute value of the weight of each operation unit in the network layer of the original deep neural network;
configuring the importance of each operation unit according to the absolute value of the weight of each operation unit in the network layer, wherein the absolute value of the weight of each operation unit is in direct proportion to the configured importance;
determining, based on the importance of each operation unit, an operation unit whose importance is lower than the preset importance as an operation unit to be deleted.
Optionally, the processor 901 may further implement:
analyzing, with a rank analysis tool, the network layer of the original deep neural network to obtain a first number of the operation units to be deleted in the network layer under the condition that a preset error tolerance is satisfied;
in implementing the step of determining, by analyzing the importance of each operation unit in the network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted, the processor 901 may specifically implement:
analyzing the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer;
selecting the first number of operation units in ascending order of the importance of the operation units in the network layer, and taking the selected operation units as the operation units to be deleted.
Optionally, the processor 901 may further implement:
acquiring an output result of an operation performed with the deep neural network with the compressed network model;
if the output result fails to satisfy a preset effect, adjusting, by a preset algorithm and using the difference between the output result of the original deep neural network and the output result of the deep neural network with the compressed network model, the weights in the operation units of each network layer of the deep neural network with the compressed network model until the output result satisfies the preset effect.
Optionally, the processor 901 may further implement:
acquiring the correlation between the operation units of any network layer of the deep neural network with the compressed network model;
judging whether the correlation is smaller than a preset correlation;
if not, adjusting the weights in the operation units of the network layer with a preset regularization term, and stopping adjusting the weights in the operation units once the correlation is smaller than the preset correlation.
Data may be transmitted between the computer-readable storage medium 902 and the processor 901 through a wired or wireless connection, and the computer device may communicate with other devices through a wired or wireless communication interface.
The above computer-readable storage medium may include a RAM (Random Access Memory) or an NVM (Non-volatile Memory), for example at least one disk memory. Optionally, the computer-readable storage medium may also be at least one storage apparatus located remotely from the aforementioned processor.
The above processor may be a general-purpose processor, including a CPU (Central Processing Unit), an NP (Network Processor), and the like; it may also be a DSP (Digital Signal Processor), an ASIC (Application Specific Integrated Circuit), an FPGA (Field-Programmable Gate Array) or another programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component.
In this embodiment, the processor of the computer device reads the executable code stored in the computer-readable storage medium to run the program corresponding to the executable code, and the program performs, at runtime, the network model compression method for a deep neural network provided by the embodiments of the present application, so the following can be achieved: since the importance of an operation unit to be deleted is lower than the preset importance, that is, its influence on the results of target recognition and target detection is relatively small, deleting it does not affect the recognition and detection of the target; in this way, by deleting the operation units to be deleted of each network layer, the network model of the deep neural network is compressed, the computational complexity of target recognition and target detection is reduced, and the consumption of memory and bandwidth resources is decreased, thereby improving the efficiency of target recognition and target detection.
For the computer device, application program, and computer-readable storage medium embodiments, since the method content involved is substantially similar to the foregoing method embodiments, the description is relatively brief; for related points, reference may be made to the partial description of the method embodiments.
It should be noted that, in this document, relational terms such as first and second are used only to distinguish one entity or operation from another, and do not necessarily require or imply any such actual relationship or order between these entities or operations. Moreover, the terms "comprise", "include", or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article, or device that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, method, article, or device. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the existence of additional identical elements in the process, method, article, or device that includes the element.
The embodiments in this specification are described in a related manner; for identical or similar parts among the embodiments, reference may be made to one another, and each embodiment focuses on its differences from the other embodiments. In particular, for the apparatus, computer device, application program, and computer-readable storage medium embodiments, since they are substantially similar to the method embodiments, the description is relatively brief; for related points, reference may be made to the partial description of the method embodiments.
The above are only preferred embodiments of the present application and are not intended to limit the present application. Any modification, equivalent replacement, improvement, or the like made within the spirit and principles of the present application shall be included in the scope of protection of the present application.

Claims (13)

  1. A network model compression method for a deep neural network, wherein the method comprises:
    acquiring an original deep neural network;
    determining, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted;
    deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
  2. The method according to claim 1, wherein the determining, by analyzing the importance of each operation unit in the network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted comprises:
    extracting the absolute value of the weight of each operation unit in the network layer of the original deep neural network;
    configuring the importance of each operation unit according to the absolute value of the weight of each operation unit in the network layer, wherein the absolute value of the weight of each operation unit is in direct proportion to the configured importance;
    determining, based on the importance of each operation unit, an operation unit whose importance is lower than the preset importance as an operation unit to be deleted.
  3. The method according to claim 1, wherein before the determining, by analyzing the importance of each operation unit in the network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted, the method further comprises:
    analyzing, with a rank analysis tool, the network layer of the original deep neural network to obtain a first number of the operation units to be deleted in the network layer under the condition that a preset error tolerance is satisfied;
    the determining, by analyzing the importance of each operation unit in the network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted comprises:
    analyzing the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer;
    selecting the first number of operation units in ascending order of the importance of the operation units in the network layer, and taking the selected operation units as the operation units to be deleted.
  4. The method according to claim 1, wherein after the deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model, the method further comprises:
    acquiring an output result of an operation performed with the deep neural network with the compressed network model;
    if the output result fails to satisfy a preset effect, adjusting, by a preset algorithm and using the difference between the output result of the original deep neural network and the output result of the deep neural network with the compressed network model, the weights in the operation units of each network layer of the deep neural network with the compressed network model until the output result satisfies the preset effect.
  5. The method according to claim 1, wherein after the deleting the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model, the method further comprises:
    acquiring the correlation between the operation units of any network layer of the deep neural network with the compressed network model;
    judging whether the correlation is smaller than a preset correlation;
    if not, adjusting the weights in the operation units of the network layer with a preset regularization term, and stopping adjusting the weights in the operation units once the correlation is smaller than the preset correlation.
  6. A network model compression apparatus for a deep neural network, wherein the apparatus comprises:
    a first acquiring module, configured to acquire an original deep neural network;
    a first determining module, configured to determine, by analyzing the importance of each operation unit in a network layer of the original deep neural network, an operation unit of the network layer whose importance is lower than a preset importance as an operation unit to be deleted;
    a deleting module, configured to delete the operation units to be deleted from each network layer of the original deep neural network to obtain a deep neural network with a compressed network model.
  7. The apparatus according to claim 6, wherein the first determining module is specifically configured to:
    extract the absolute value of the weight of each operation unit in the network layer of the original deep neural network;
    configure the importance of each operation unit according to the absolute value of the weight of each operation unit in the network layer, wherein the absolute value of the weight of each operation unit is in direct proportion to the configured importance;
    determine, based on the importance of each operation unit, an operation unit whose importance is lower than the preset importance as an operation unit to be deleted.
  8. The apparatus according to claim 6, wherein the apparatus further comprises:
    an analyzing module, configured to analyze, with a rank analysis tool, the network layer of the original deep neural network to obtain a first number of the operation units to be deleted in the network layer under the condition that a preset error tolerance is satisfied;
    the first determining module is specifically configured to:
    analyze the importance of each operation unit in the network layer of the original deep neural network to obtain the importance of each operation unit in the network layer;
    select the first number of operation units in ascending order of the importance of the operation units in the network layer, and take the selected operation units as the operation units to be deleted.
  9. The apparatus according to claim 6, wherein the apparatus further comprises:
    a second acquiring module, configured to acquire an output result of an operation performed with the deep neural network with the compressed network model;
    a first adjusting module, configured to, if the output result fails to satisfy a preset effect, adjust, by a preset algorithm and using the difference between the output result of the original deep neural network and the output result of the deep neural network with the compressed network model, the weights in the operation units of each network layer of the deep neural network with the compressed network model until the output result satisfies the preset effect.
  10. The apparatus according to claim 6, wherein the apparatus further comprises:
    a third acquiring module, configured to acquire the correlation between the operation units of any network layer of the deep neural network with the compressed network model;
    a judging module, configured to judge whether the correlation is smaller than a preset correlation;
    a second adjusting module, configured to, if the judging result of the judging module is no, adjust the weights in the operation units of the network layer with a preset regularization term, and stop adjusting the weights in the operation units once the correlation is smaller than the preset correlation.
  11. A computer-readable storage medium for storing executable code, wherein the executable code is configured to perform, at runtime, the network model compression method for a deep neural network according to any one of claims 1-5.
  12. An application program, configured to perform, at runtime, the network model compression method for a deep neural network according to any one of claims 1-5.
  13. A computer device, comprising a processor and a computer-readable storage medium, wherein
    the computer-readable storage medium is configured to store executable code; and
    the processor is configured to implement, when executing the executable code stored on the computer-readable storage medium, the network model compression method for a deep neural network according to any one of claims 1-5.