WO2023102844A1 - Method and apparatus for determining a pruning module, and computer-readable storage medium - Google Patents

Method and apparatus for determining a pruning module, and computer-readable storage medium

Info

Publication number
WO2023102844A1
Authority
WO
WIPO (PCT)
Prior art keywords
input
task
determining
module
tasks
Prior art date
Application number
PCT/CN2021/136849
Other languages
English (en)
French (fr)
Inventor
高伟
郭洋
李革
Original Assignee
北京大学深圳研究生院
Priority date
Filing date
Publication date
Application filed by 北京大学深圳研究生院 filed Critical 北京大学深圳研究生院
Priority to CN202180003874.0A priority Critical patent/CN114514539A/zh
Priority to PCT/CN2021/136849 priority patent/WO2023102844A1/zh
Priority to US17/681,243 priority patent/US20230186091A1/en
Publication of WO2023102844A1 publication Critical patent/WO2023102844A1/zh

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/0495Quantised networks; Sparse networks; Compressed networks
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • G06N3/045Combinations of networks
    • G06N3/0464Convolutional networks [CNN, ConvNet]
    • G06N3/08Learning methods
    • G06N3/082Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections

Definitions

  • the present application relates to the technical field of neural network compression, and in particular to a method, device and computer-readable storage medium for determining a pruning module.
  • Neural networks have made major breakthroughs in many fields such as computer vision and natural language processing.
  • However, the computational complexity and parameter storage requirements of neural networks are relatively high, which makes it impossible to deploy them on some devices with limited resources.
  • Existing neural network pruning schemes usually evaluate the importance of different parts of the network structure based on the weight parameters in the neural network or on the data characteristics of the output feature map, and then perform pruning operations according to that importance. In this way, when the importance of different parts of the network structure is evaluated in a data-driven manner, only the data characteristics of the output data itself are considered, so the pruning accuracy achieved when pruning according to the estimated importance still has room for improvement.
  • The embodiments of the present application provide a method, device, and computer-readable storage medium for determining a pruning module, aiming to solve the technical problem that, when the importance of different parts of the network structure is evaluated in a data-driven manner, only the data characteristics of the output data itself are considered, so the pruning accuracy achieved when pruning according to the estimated importance still has room for improvement.
  • An embodiment of the present application provides a method for determining a pruning module, the method for determining a pruning module includes:
  • a pruning module in the neural network to be pruned is determined according to the importance index value.
  • the step of determining the task relevance of the component modules according to the input tasks, the number of tasks and the output information includes:
  • the task relevance of the component modules is determined according to the task quantity, the input task retention and the sum of the input and output information.
  • the input task includes at least one input task map
  • the output information includes an output feature map
  • the step of determining the input task retention and the sum of input and output information of each target component module according to the output information of each of the target component modules and the input tasks includes:
  • a union of the processed input task map and the processed output feature map is taken to obtain the sum of the input and output information of each of the target component modules.
  • the step of determining the task relevance of the component modules according to the number of tasks, the retention of the input tasks and the sum of the input and output information includes:
  • the step of determining the input information retention degree of the component module according to the number of tasks and the output information includes:
  • the input information retention degree of the component module is determined according to the number of tasks, the first energy value and the second energy value.
  • the first energy value of each target component module in the current network level is determined according to the output information of each target component module
  • the second energy value is determined according to the output information of each component module in the previous network level.
  • the step of determining the input information retention degree of the component module according to the number of tasks, the first energy value and the second energy value includes:
  • the step of determining the importance index value of the component module according to the task relevance degree and the input information retention degree includes:
  • the product of the normalized task relevance degree and the normalized input information retention degree is used as the importance index value of the component module.
  • The present application also provides a device for determining a pruning module. The device for determining a pruning module includes a memory, a processor, and a determination program of a pruning module that is stored in the memory and executable on the processor; when the determination program of the pruning module is executed by the processor, the steps of the method for determining a pruning module described above are implemented.
  • The present application also provides a computer-readable storage medium. The computer-readable storage medium stores the determination program of the pruning module, and when the determination program of the pruning module is executed by a processor, the steps of the method for determining the pruning module as described above are implemented.
  • The method, device, and computer-readable storage medium for determining the pruning module obtain the input tasks and the number of tasks of the neural network to be pruned, as well as the output information of the component modules of the neural network to be pruned; determine the task relevance of each component module according to the input tasks, the number of tasks, and the output information; determine the input information retention of the component modules according to the number of tasks and the output information; and then determine the importance index value of each component module according to the task relevance and the input information retention, so as to determine the pruning module in the neural network to be pruned according to the importance index value. In this way, the output information of each component module can be associated with the input task to realize task-driven neural network pruning. This avoids the situation in which important modules associated with the input task are easily pruned when the importance of each component module is evaluated only from the data characteristics of the output data itself, which would affect the network's processing of the input task, thereby improving the accuracy of pruning-module determination and the pruning accuracy of the neural network.
  • FIG. 1 is a schematic structural diagram of a device for determining the pruning module of the hardware operating environment involved in the solution of the embodiment of the present application;
  • FIG. 2 is a schematic flowchart of the first embodiment of the method for determining the pruning module of the present application;
  • FIG. 3 is a schematic flow diagram of the second embodiment of the method for determining the pruning module of the present application
  • FIG. 4 is a schematic flowchart of the third embodiment of the method for determining the pruning module of the present application;
  • FIG. 5 is a schematic flowchart of a fourth embodiment of a method for determining a pruning module of the present application.
  • the main solution of this application is: to obtain the input tasks and the number of tasks of the neural network to be pruned, and the output information of the constituent modules of the neural network to be pruned; according to the input tasks, the number of tasks and the output information to determine the task relevance of the component modules; determine the input information retention of the component modules according to the number of tasks and the output information; determine the component modules according to the task relevance and the input information retention The importance index value; determine the pruning module in the neural network to be pruned according to the importance index value.
  • FIG. 1 is a schematic structural diagram of an apparatus for determining a pruning module of a hardware operating environment involved in the solution of an embodiment of the present application.
  • the device for determining the pruning module may include: a communication bus 1002 , a processor 1001 such as a CPU, a user interface 1003 , a network interface 1004 , and a memory 1005 .
  • the communication bus 1002 is used to realize connection and communication between these components.
  • the user interface 1003 may include a display screen (Display), an input unit such as a keyboard (Keyboard), and the optional user interface 1003 may also include a standard wired interface and a wireless interface.
  • the network interface 1004 may include a standard wired interface and a wireless interface (such as a WI-FI interface).
  • the memory 1005 can be a high-speed RAM memory, or a non-volatile memory, such as a disk memory.
  • the memory 1005 may also be a storage device independent of the aforementioned processor 1001 .
  • the network interface 1004 is mainly used to connect to the background server and perform data communication with the background server;
  • the user interface 1003 is mainly used to connect to the client and perform data communication with the client; and the processor 1001 can be used to call the control program of the device for determining the pruning module stored in the memory 1005, and to execute the relevant steps of the following embodiments of the method for determining the pruning module.
  • The method for determining the pruning module includes the following steps:
  • Step S10 Obtain the input tasks and the number of tasks of the neural network to be pruned, and the output information of the constituent modules of the neural network to be pruned;
  • Step S20 Determine the task relevance of the component modules according to the input tasks, the number of tasks and the output information
  • Step S30 Determine the input information retention degree of the component modules according to the number of tasks and the output information
  • Step S40 Determine the importance index value of the component module according to the task relevance and the input information retention
  • Step S50 Determine the pruning module in the neural network to be pruned according to the importance index value.
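As a rough end-to-end sketch of steps S10-S50, the following Python fragment (illustrative only; the helper names, toy values, and the min-max normalization are our assumptions, not the patent's exact formulas) combines per-module task relevance and input information retention into importance index values and selects pruning modules by a preset pruning rate:

```python
import numpy as np

def importance_scores(task_relevance, info_retention):
    # One combination option: min-max normalize each indicator,
    # then take their product as the importance index value.
    r = np.asarray(task_relevance, dtype=float)
    k = np.asarray(info_retention, dtype=float)
    def norm(v):
        return (v - v.min()) / (v.max() - v.min() + 1e-12)
    return norm(r) * norm(k)

def select_pruning_modules(scores, prune_rate):
    # A preset pruning rate fixes how many modules to prune; here we
    # assume the lowest-scoring modules are the ones pruned.
    n_prune = int(round(prune_rate * len(scores)))
    return np.argsort(scores)[:n_prune]

relevance = [0.9, 0.2, 0.6, 0.1]   # task relevance per component module (toy)
retention = [0.8, 0.3, 0.7, 0.2]   # input information retention per module (toy)
scores = importance_scores(relevance, retention)
pruned = select_pruning_modules(scores, 0.5)
```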
  • the neural network to be pruned refers to the neural network that needs to be pruned to achieve neural network compression, for example, convolutional neural network (Convolutional Neural Networks, CNN), deep neural network (Deep Neural Networks, DNN) And recurrent neural network (Recurrent Neural Network, RNN), etc.;
  • the input task of the neural network to be pruned refers to the input data fed into the neural network to be pruned for information processing, such as a data table, a sequence table, or an image;
  • the output information of the constituent modules of the neural network to be pruned refers to the output data of those constituent modules, such as output feature maps.
  • each neural network to be pruned may include a plurality of component modules, and the component modules may include at least one of filters, channels, and parameters; each input task may include at least one piece of input data, such as at least one input image.
  • In the related art, the importance evaluation of the components of a neural network is usually data-driven. For example, the L1/L2 (or other) regularization term of each channel's output feature map is computed and the results are sorted numerically to determine importance; or all elements in each weight parameter matrix are sorted from small to large by absolute value; or the norm of each filter (L1/L2 or another regularization term) is computed to determine the filter's contribution, with the norm size proportional to the contribution.
  • In contrast, the method for determining the pruning module proposed in this application evaluates the importance of the neural network to be pruned by combining the connection between the output data of each component module and the input task, so that the pruning module among the component modules can be determined more accurately, improving the pruning accuracy when the neural network to be pruned is pruned.
  • Specifically, the output information of each component module of the neural network to be pruned can be obtained, together with the input task fed into the neural network to be pruned and the number of tasks contained in that input task, so that the importance of each component module can be determined according to the input task, the number of tasks, and the output information.
  • Determining the importance of each component module may specifically be: determining the task relevance of each component module according to the input tasks, the number of tasks, and the output information of the component modules; determining the input information retention of each component module according to the number of tasks and the output information; and then determining the importance index value of each component module according to its task relevance and input information retention.
  • the task correlation degree of each component module refers to the correlation degree between the output information of each component module and the input task of the neural network to be pruned;
  • the input information retention of each component module refers to the amount of information from its input information that is contained in the output information of that component module.
  • When determining the task relevance, it may specifically be: the intersection of the input task of the neural network to be pruned and the output information of each of the first N component modules in the current network level, including the current component module N (N is a positive integer), is taken to obtain ratios for the first N component modules; the ratios are summed, and the sum is then averaged over the number of tasks to obtain the task relevance of the current component module.
  • When determining the input information retention of each component module according to the number of tasks of the neural network to be pruned and the output information of each component module, it may specifically be: considering the information interaction among component modules of the same network level, the energy values corresponding to the output information of the first N component modules (including the current component module N, where N is a positive integer) in the current network level are determined, the N energy values are summed, and the sum is averaged over the number of tasks to obtain the input information retention of the current component module. Alternatively, considering the information flow between adjacent network levels as well as among component modules of the same network level, the energy ratio of each component module's output information to the output information of the previous network level is determined; then, according to the sequence number N of the current component module in the current network level, the energy ratios of the first N component modules are summed and averaged over the number of tasks to obtain the input information retention of the current component module, and so on.
  • When determining the importance index value of each component module according to its task relevance and input information retention, it may specifically be: taking the sum of the task relevance and the input information retention as the importance index value; or taking the product of the task relevance and the input information retention as the importance index value; or taking a weighted sum of the task relevance and the input information retention as the importance index value, etc., which is not specifically limited here.
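The three combination options above can be sketched as follows (a minimal illustration; the function name and example values are ours, not taken from the patent):

```python
import numpy as np

def combine(relevance, retention, mode="product", w=(0.5, 0.5)):
    # The three combination options described: sum, product,
    # or a weighted sum of the two indicators.
    r = np.asarray(relevance, dtype=float)
    k = np.asarray(retention, dtype=float)
    if mode == "sum":
        return r + k
    if mode == "product":
        return r * k
    if mode == "weighted":
        return w[0] * r + w[1] * k
    raise ValueError(f"unknown mode: {mode}")

r, k = [0.9, 0.2], [0.5, 0.4]  # toy relevance / retention values
```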
  • the pruning module in the neural network to be pruned can be determined according to the importance index value.
  • the pruning module in the neural network to be pruned refers to a component module that can be pruned among the component modules of the neural network to be pruned, such as at least one of a filter, a channel, and a parameter.
  • the pruning module can be pruned to reduce the computational complexity and data storage requirements of the neural network to be pruned.
  • The manner of determining the pruning module in the neural network to be pruned according to the importance index value may be: determining the component modules whose importance index values fall within a preset value range as pruning modules; or, after determining from a preset pruning rate the number of pruning modules that need to be pruned, selecting that number of component modules according to their importance index values and determining them as pruning modules, etc., which is not specifically limited here.
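Both selection strategies can be sketched as follows (illustrative names; we assume the lowest-ranked importance index values are the ones pruned, which the embodiment leaves open):

```python
import numpy as np

def by_value_range(scores, low, high):
    # Strategy 1: modules whose importance index value falls within a
    # preset value range are determined as pruning modules.
    s = np.asarray(scores, dtype=float)
    return np.flatnonzero((s >= low) & (s <= high))

def by_prune_rate(scores, rate):
    # Strategy 2: a preset pruning rate fixes how many modules to prune;
    # we assume the lowest-scoring modules are then selected.
    s = np.asarray(scores, dtype=float)
    n = int(round(rate * s.size))
    return np.argsort(s)[:n]
```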
  • the preset value range and the preset pruning rate can be set according to actual needs, and are not specifically limited here.
  • In this embodiment, the task relevance of each component module is determined from the input tasks, the number of tasks, and the output information; the input information retention of each component module is determined from the number of tasks and the output information; the importance index value of each component module is then determined from the task relevance and the input information retention; and the pruning module in the neural network to be pruned is determined according to the importance index value. In this way, the output information of each component module is associated with the input task of the neural network to be pruned, so that importance is evaluated in a task-driven manner. This avoids evaluating importance only from the data characteristics of the output data itself while ignoring the role the input task plays in the network's information processing, which can easily cause component modules highly correlated with the input task to be pruned by mistake and affect the processing of the input task. Associating the output information of each component module with the input task therefore improves the accuracy of pruning-module determination and, in turn, the pruning accuracy of the neural network.
  • step S20 may include:
  • Step S21 Obtain the target sequence number of the component modules in the current network hierarchy
  • Step S22 Determine the target component module whose sequence number is less than or equal to the target sequence number in the current network hierarchy
  • Step S23 Determine the input task retention and the sum of input and output information of each target component module according to the output information of each target component module and the input task;
  • Step S24 Determine the task relevance of the component modules according to the number of tasks, the retention of the input tasks and the sum of the input and output information.
  • the neural network to be pruned may include multiple network levels, and each network level may include multiple constituent modules.
  • The different constituent modules in the same network level can be numbered based on the connection relationship or the processing order of each constituent module, so as to distinguish them. For example, the modules can be numbered in ascending order following the connection relationship, or in ascending order of processing order, with earlier-processed modules receiving smaller numbers.
  • Before determining the task relevance of the current component module, the target component modules in the current network level that affect the information processing of the current component module can be determined first, and the influence of those target component modules can then be combined with the processing of the current component module to determine its task relevance.
  • Specifically, the target sequence number of the current component module in its current network level may be determined first, i.e., the position of the current filter within the current network level.
  • the constituent modules whose sequence number is smaller than the target serial number are considered to be the constituent modules that may affect the task correlation degree of the current constituent modules.
  • Therefore, the component modules whose sequence numbers are less than or equal to the target sequence number in the current network level can be determined as the target component modules. For example, when the target sequence number is 1, the first component module is determined as the target component module; when the target sequence number is 3, the first, second, and third component modules are determined as the target component modules. The task relevance of the current component module can then be determined according to the output information of the target component modules together with the input tasks and the number of tasks of the neural network to be pruned.
  • The input task retention and the sum of input and output information of each target component module can be determined according to the output information of each target component module and the input task of the neural network to be pruned; the task relevance of the current component module can then be determined according to the number of tasks input to the neural network to be pruned, the input task retention, and the sum of input and output information of each target component module.
  • The input task retention of each target component module refers to the amount of information from the input task contained in the output information of that target component module; the sum of input and output information of each target component module refers to the total of the information in the output information of that target component module and the task information contained in the input task of the neural network to be pruned.
  • When determining the input task retention and the sum of input and output information of each target component module according to the output information of each target component module and the input task of the neural network to be pruned, it may specifically be: the intersection of the output information of each target component module and the input task of the neural network to be pruned is taken to obtain the input task retention of that target component module; and the union of the output information of each target component module and the input task of the neural network to be pruned is taken to obtain the sum of input and output information of that target component module.
  • Taking image classification as an example of the input task:
  • the input task of the neural network to be pruned includes at least one input task map
  • the output information of each component module includes at least one output feature map
  • the resolution of the output feature map of each target component module can be adjusted first, so that it matches the resolution of the input task map.
  • the output feature map and the input task map can also be binarized.
  • The specific method may be: first, the ratio of the input task retention of each target component module to its sum of input and output information is calculated to obtain the task retention ratio; then the task retention ratios of the target component modules are summed to obtain the task retention ratio sum; and then the task retention ratio sum is divided by the number of tasks to obtain the average task retention per task, which is taken as the task relevance of the current component module.
  • Specifically, the resolution of the output feature map of each target component module may first be adjusted to match the resolution of the input task map. The input task map is then binarized to obtain a binarized input task map, and the adjusted output feature map is binarized to obtain a binarized output feature map. The intersection of the binarized input task map and the binarized output feature map gives the input task retention of each target component module, and their union gives the sum of the input and output information of each target component module. The ratio of the input task retention of each target component module to its sum of input and output information then gives its task retention ratio; these ratios are summed to obtain the task retention ratio sum, and the quotient of this sum and the number of tasks gives the task relevance of the current component module.
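The per-module computation just described (binarize, intersect, union, take ratios, average over tasks) can be sketched as follows; the threshold value and function names are our assumptions:

```python
import numpy as np

def task_relevance(input_task_maps, target_output_maps, thresh=0.5):
    # input_task_maps: one binarizable task map per input task.
    # target_output_maps: output feature maps of the first N target
    #   component modules, already resized to the input resolution.
    num_tasks = len(input_task_maps)
    ratio_sum = 0.0
    for task_map in input_task_maps:
        t = np.asarray(task_map) > thresh            # binarized input task map
        for feat_map in target_output_maps:
            f = np.asarray(feat_map) > thresh        # binarized output feature map
            retained = np.logical_and(t, f).sum()    # input task retention
            io_sum = np.logical_or(t, f).sum()       # sum of input/output info
            if io_sum:
                ratio_sum += retained / io_sum       # task retention ratio
    return ratio_sum / num_tasks                     # average over the tasks
```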
  • Denote the network weight of the i-th component module of the t-th layer in the network as W_t^i, and denote its corresponding output feature map as O_t^i. Then calculating the task relevance of the i-th component module of the t-th layer in the neural network to be pruned may include the following steps:
  • each component module can be regarded as a semantic extractor related to the input task
  • the output feature map of each component module can be regarded as a feature map containing the semantic information of the input task.
  • The task relevance of the current component module is determined from the number of tasks corresponding to the input tasks of the neural network to be pruned, the input task retention of the target component modules, and the sum of input and output information. In this way, the output information of the current component module can be combined with the input tasks, and the influence of the earlier-ranked component modules in the current network level on the task relevance of the current component module can be fully considered. This improves the accuracy with which the task relevance of the current component module is determined, and in turn the accuracy of its importance index value, thereby improving pruning accuracy.
  • step S30 may include:
  • Step S31 Obtain the target serial number of the component module in the current network level, and the number of images contained in the input task;
  • Step S32 Determine the target component module whose sequence number is less than or equal to the target sequence number in the current network hierarchy
  • Step S33 Determine the first energy value of each target component module in the current network level according to the output information of each target component module, and determine the composition in the previous network level according to the output information of each component module in the previous network level the second energy value of the module;
  • Step S34 Determine the input information retention degree of the component module according to the number of tasks, the first energy value and the second energy value.
  • the neural network to be pruned may include multiple network levels, and each network level may include multiple constituent modules.
  • The different constituent modules in the same network level can be numbered based on the connection relationship or the processing order of each constituent module, so as to distinguish them. For example, the modules can be numbered in ascending order following the connection relationship, or in ascending order of processing order, with earlier-processed modules receiving smaller numbers.
  • The input information retention of the current constituent module can be assessed by analyzing the flow of information between the constituent modules of adjacent layers. Specifically, before determining the input information retention of the current component module, the target component modules in the current network level that affect it may be determined first. For example, the target sequence number of the current component module in its current network level can be determined first, i.e., the position of the current filter within the current network level.
  • the component modules whose sequence number is smaller than the target sequence number are considered the target component modules that affect the input information retention degree of the current component module.
  • accordingly, the component modules whose sequence number is less than or equal to the target sequence number in the current network level can be determined as the target component modules, and the input information retention degree of the current component module can then be determined according to the output information of the target component modules and the number of tasks corresponding to the input tasks of the neural network to be pruned.
  • when determining the input information retention degree of the current component module according to the output information of the target component modules and the number of tasks of the neural network to be pruned, the procedure may specifically be: first determine the first energy value of each target component module in the current network level according to its output information, and determine the second energy value of the component modules in the previous network level according to their output information; then determine the input information retention degree of the current component module according to the number of tasks, the first energy value and the second energy value.
  • the first energy value refers to the energy value corresponding to the output information of each target component module, and the second energy value refers to the energy value corresponding to the output information of the previous network level.
  • the norm of the output information of each target component module may be calculated, and the square of the calculated norm may be used as the first energy value of each target component module.
  • the norm here refers especially to the L2 norm, in order to improve the accuracy of the calculation of the first energy value.
  • other norms, such as the L1 norm, may also be used in some other embodiments, which is not specifically limited here.
  • the L2 norm refers to the square root of the sum of the squares of the elements in the vector
  • the L1 norm refers to the sum of the absolute values of the elements in the vector, also known as the "sparse rule operator".
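  • the two norms described above can be illustrated with the following short sketch (NumPy; the variable names are for illustration only):

```python
import numpy as np

v = np.array([3.0, -4.0])

# L2 norm: square root of the sum of squared elements
l2 = np.sqrt(np.sum(v ** 2))   # equivalent to np.linalg.norm(v, ord=2)

# L1 norm: sum of the absolute values of the elements
l1 = np.sum(np.abs(v))         # equivalent to np.linalg.norm(v, ord=1)

print(l2)  # 5.0
print(l1)  # 7.0
```

  The squared L2 norm used as the first energy value below is simply `l2 ** 2`.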
  • the variance of the output information of each component module in the previous network level may be calculated as the second energy value corresponding to the previous network level.
  • the ratio between the first energy value of each target component module and the second energy value can be obtained first, to obtain the energy proportion corresponding to each target component module; each energy proportion is then summed to obtain an energy proportion sum, and the obtained sum is divided by the number of tasks to obtain the input information retention degree of the current component module.
  • taking image classification as an example of the input task, the input task of the neural network to be pruned includes at least one input task map, and the output information of each component module correspondingly includes at least one output feature map. Denote the network weight parameter of the i-th component module of the t-th layer in the network as W_i^t, its corresponding output feature map as O_i^t(x), and the energy proportion of each target component module as b_j^t(x). Averaging the energy proportions over the number d of tasks input to the neural network to be pruned yields the input information retention degree B_i^t of the i-th component module of the t-th layer. The specific calculation formula is as follows:

  B_i^t = (1/d) Σ_x Σ_{j=1}^{i} b_j^t(x),  with  b_j^t(x) = ||O_j^t(x)||₂² / σ²(O^{t-1}(x))

  where σ²(O^{t-1}(x)) denotes the variance of all output feature maps of the (t-1)-th layer.
  • the input information retention degree of the constituent modules is determined according to the number of tasks corresponding to the input tasks of the neural network to be pruned together with the first energy value and the second energy value, so that the information flow between network levels is used to evaluate the input information retention degree of the constituent modules. This improves the accuracy with which the input information retention degree is determined, which in turn improves the accuracy of the importance index value of the current constituent module and improves pruning accuracy.
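  • the computation described above can be sketched as follows (a minimal NumPy illustration under the stated definitions, not the patented implementation; the array shapes and function name are assumptions of this example):

```python
import numpy as np

def input_info_retention(curr_maps, prev_maps, i):
    """Input information retention degree of the i-th filter (0-based) of layer t.

    curr_maps: shape (d, n_t, H, W)    - output feature maps of layer t
    prev_maps: shape (d, n_prev, H, W) - output feature maps of layer t-1
    """
    d = curr_maps.shape[0]
    total = 0.0
    for x in range(d):                      # loop over the d input task maps
        second_energy = prev_maps[x].var()  # variance of all (t-1)-layer maps
        for j in range(i + 1):              # target modules: sequence number <= i
            first_energy = np.sum(curr_maps[x, j] ** 2)  # squared L2 norm
            total += first_energy / second_energy        # energy proportion
    return total / d                        # average over the number of tasks

# toy example with random feature maps
rng = np.random.default_rng(0)
curr = rng.normal(size=(4, 8, 6, 6))  # d=4 task maps, 8 filters in layer t
prev = rng.normal(size=(4, 8, 6, 6))
score = input_info_retention(curr, prev, i=2)
```

  Note that, because each energy proportion is non-negative, the score is non-decreasing in the filter's sequence number i.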
  • step S40 may include:
  • Step S41: Perform normalization processing on the task relevance degree and the input information retention degree;
  • Step S42: Take the sum of the normalized task relevance degree and the normalized input information retention degree as the importance index value of the component module; or,
  • Step S43: Take the product of the normalized task relevance degree and the normalized input information retention degree as the importance index value of the component module.
  • the importance index value of each component module can be determined according to the task relevance degree and the input information retention degree, so as to evaluate the importance of each component module.
  • for example, the task relevance degree and input information retention degree of each component module can be summed, and the resulting sum used as the importance index value of each component module; or, the product of the task relevance degree and the input information retention degree of each component module can be taken as its importance index value; or, after summing the task relevance degree and the input information retention degree of each component module and also calculating their product, different weight values can be assigned to the obtained sum and product, a weighted sum of the two computed with the assigned weights, and the weighted sum used as the importance index value of each component module, and so on.
  • in an embodiment, the task relevance degree and input information retention degree of each component module can be normalized first, and the sum or product of the normalized task relevance degree and the normalized input information retention degree then used as the importance index value of each component module.
  • specifically, the maximum task relevance degree among the task relevance degrees of all constituent modules in the current network level, and the maximum input information retention degree among the input information retention degrees of all constituent modules in the current network level, can be obtained first. Then, the quotient of the task relevance degree of the current component module and the maximum task relevance degree is computed to obtain the normalized task relevance degree, and the quotient of the input information retention degree of the current component module and the maximum input information retention degree is computed to obtain the normalized input information retention degree. Then, the normalized task relevance degree and the normalized input information retention degree are summed to obtain the importance index value of the corresponding component module; alternatively, their product can be computed to obtain the importance index value of the corresponding component module.
  • the importance index value of each component module is obtained by summing or multiplying the normalized task relevance degree and the normalized input information retention degree, so that the importance index value of each component module can be determined in a task-driven manner. This improves the accuracy of the importance index value and thereby the pruning accuracy of the neural network to be pruned.
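  • this normalization-and-combination step can be sketched as follows (illustrative NumPy code; `A` and `B` stand for the per-layer vectors of task relevance degrees and input information retention degrees, which are assumed to have been computed already):

```python
import numpy as np

def importance(A, B, mode="sum"):
    """Combine task relevance A and input information retention B
    (both of length n_t) into per-filter importance index values
    after max-normalization within the layer."""
    A = np.asarray(A, dtype=float)
    B = np.asarray(B, dtype=float)
    A_n = A / A.max()        # normalized task relevance degree
    B_n = B / B.max()        # normalized input information retention degree
    if mode == "sum":
        return A_n + B_n     # Step S42: sum of the normalized values
    return A_n * B_n         # Step S43: product of the normalized values

gamma = importance([0.2, 0.5, 0.1], [3.0, 1.0, 2.0], mode="sum")
best = int(np.argmax(gamma))  # filter with the largest importance index value
```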
  • in a specific application example, if the component modules are filters in the neural network to be pruned, the input information of the neural network to be pruned includes at least one input task map, and the output information of each component module includes an output feature map, then the importance index value of each filter can be evaluated by assessing the connection between the input task and the output feature map (task relevance degree) and by measuring the information flow in the filters between adjacent network levels (input information retention degree), and the pruning modules in the neural network to be pruned can then be determined according to the importance index values.
  • each filter can be regarded as a semantic extractor related to the input task, and the output feature map of the filter can be regarded as a feature map containing the semantic information of the input task. The amount of task-related semantic information contained in the output feature map is used to evaluate the connection between the input task and the feature map, so as to determine the semantic extraction capability, with respect to the input task, of the filter that outputs this feature map; this capability is used as an indicator for evaluating the importance of the filter.
  • assume the input task map is x and obtain its grayscale image G(x). Input the task map into the neural network to be pruned, and assume the network weight parameter of the i-th filter in the t-th layer of the neural network to be pruned is W_i^t, with corresponding output feature map O_i^t(x). Then, the output feature map is expanded by bilinear interpolation into an image Õ_i^t(x) with the same resolution as the input task map, and, based on a set threshold Δ, the following formula (1) is used to realize the binarization of G(x) and Õ_i^t(x):

  L(x) = 1 if G(x) ≥ Δ, otherwise 0;  L_i^t(x) = 1 if Õ_i^t(x) ≥ Δ, otherwise 0   (1)
  • d represents the number of input task maps in the input task, that is, the number of tasks, and the per-task score a_i^t(x) can be obtained by calculating the connection between the input task and the output feature map. The specific calculation is the intersection-over-union of L(x) and L_i^t(x), i.e., a_i^t(x) = |L(x) ∩ L_i^t(x)| / |L(x) ∪ L_i^t(x)|. The larger the resulting task relevance degree A_i^t, the stronger the ability of the i-th filter in the t-th layer of the network to extract the semantic information of the input task, and the more important it is to the input task.
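  • the per-filter score described above can be sketched as follows (an illustrative NumPy implementation under the stated definitions; the threshold value and the assumption that the feature map is already upsampled are simplifications of this example):

```python
import numpy as np

def binarize(img, delta):
    """Formula (1): threshold an image at delta."""
    return (img >= delta).astype(np.uint8)

def task_relevance_score(gray, feat, delta=0.5):
    """IoU between the binarized grayscale input L(x) and the binarized
    (already upsampled) feature map L_i^t(x) for one input task map."""
    L = binarize(gray, delta)
    L_it = binarize(feat, delta)
    inter = np.logical_and(L, L_it).sum()   # input task retention degree
    union = np.logical_or(L, L_it).sum()    # input-output information sum
    return inter / union if union > 0 else 0.0

# toy example: the feature map covers half of the input foreground
gray = np.zeros((4, 4)); gray[:2, :] = 1.0   # |L| = 8
feat = np.zeros((4, 4)); feat[:1, :] = 1.0   # |L_it| = 4, contained in L
iou = task_relevance_score(gray, feat)       # 4 / 8 = 0.5
```

  Averaging such scores over the d input task maps gives A_i^t as in formula (2).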
  • during forward inference, the information flow of the input task flows between adjacent layers, level by level, from shallow layers to deep layers.
  • a filter of a specific layer takes the output feature maps of the previous layer as input and outputs the output feature map of the current layer, which means that filters in non-adjacent layers have no influence on each other. Therefore, the filters of each layer can be regarded as a finite state machine, and the information flow in the filters between adjacent layers can be determined by analyzing the amount of information acquired between the filters of adjacent layers.
  • O_i^t(x) denotes the output feature map of the i-th filter in the t-th layer of the network (with network weight parameter W_i^t) when the task image x is input; ||O_i^t(x)||₂² denotes the square of its L2 norm; and σ²(O^{t-1}(x)) denotes the variance of all output feature maps of the (t-1)-th layer. The per-task score b_i^t(x) can be obtained by calculating the information acquisition capability of the i-th filter in the t-th layer when the task image x is input, specifically b_i^t(x) = ||O_i^t(x)||₂² / σ²(O^{t-1}(x)). The larger the resulting input information retention degree B_i^t, the stronger the information acquisition capability of the i-th filter in the t-th layer for the input task, and the more important it is to the input task.
  • max(A^t) and max(B^t) respectively denote the maximum values of A_i^t and B_i^t over all filters in the t-th layer of the neural network to be pruned, 1 ≤ i ≤ n_t, where n_t is the number of filters in the t-th layer of the network.
  • to verify the performance of the scheme, at least four typical neural networks, such as VGG-16, ResNet-56, ResNet-110 and ResNet-50, are used as the neural network to be pruned, and network model pruning tests are performed on two test data sets, CIFAR-10 and ImageNet (ILSVRC2012). The experimental results show that the task-driven pruning scheme proposed in this application achieves better network pruning performance than data-driven pruning schemes, including a higher compression ratio, lower parameter storage requirements and lower computational complexity.
  • both of the above combination schemes for calculating the importance index value can achieve good results, and either can be selected according to the specific situation.
  • assume the loss function of the task is min_W L(W) = Σ_{(x,y)} l(f(x, W), y), where f(x, W) is the output of the network, l(·) is the loss function for each input training example x (which can be the cross-entropy loss or another specific formula), and y is the ground truth corresponding to the training example x.
  • the specific training framework is as follows:
  • step 3 is the task-driven neural network model pruning process, and steps 4-5 are the retraining and fine-tuning process. ||W̃^j − W̃^{j−1}||₂ denotes the L2 norm of the difference between the pruned model parameters W̃^j obtained when the number of training iterations is j and the pruned model parameters W̃^{j−1} obtained when the number of training iterations is (j−1). While ||W̃^j − W̃^{j−1}||₂ > ε, the parameters of the pruned network model are continuously and iteratively retrained and fine-tuned.
  • in addition, an embodiment of the present application also provides a device for determining a pruning module, which includes a memory, a processor, and a determination program of the pruning module stored in the memory and executable on the processor; when the processor executes the determination program of the pruning module, the steps of the method for determining a pruning module as described above are implemented.
  • an embodiment of the present application also provides a computer-readable storage medium storing a determination program of the pruning module; when the determination program of the pruning module is executed by a processor, the steps of the method for determining a pruning module as described above are implemented.
  • the terms "comprises", "includes" or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, method, article or system comprising a set of elements includes not only those elements but also other elements not expressly listed, or elements inherent in the process, method, article or system. Without further limitation, an element qualified by the phrase "comprising a..." does not preclude the presence of additional identical elements in the process, method, article or system comprising that element.
  • the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, but in many cases the former is the better implementation.
  • the technical solution of the present application, in essence or in the part that contributes to the prior art, can be embodied in the form of a software product; the computer software product is stored in a storage medium (such as ROM/RAM, magnetic disk or optical disk) and includes several instructions to enable a terminal device (which may be a mobile phone, computer, server, television, network device, etc.) to execute the methods described in the various embodiments of the present application.

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Health & Medical Sciences (AREA)
  • Image Analysis (AREA)

Abstract

The present application discloses a method and device for determining a pruning module, and a computer-readable storage medium. The method for determining a pruning module includes: acquiring the input tasks and the number of tasks of a neural network to be pruned, as well as the output information of the component modules of the neural network to be pruned; determining the task relevance degree of the component modules according to the input tasks, the number of tasks and the output information; determining the input information retention degree of the component modules according to the number of tasks and the output information; determining the importance index value of the component modules according to the task relevance degree and the input information retention degree; and determining the pruning modules in the neural network to be pruned according to the importance index value.

Description

Method and device for determining a pruning module, and computer-readable storage medium

Technical Field

The present application relates to the technical field of neural network compression, and in particular to a method and device for determining a pruning module and a computer-readable storage medium.
Background Art

Neural networks have achieved major breakthroughs in many fields such as computer vision and natural language processing. However, in specific applications, the computational complexity and parameter storage requirements of neural networks are both relatively high, making them impossible to deploy on some resource-limited devices. To expand the applicable range of neural networks, neural network pruning is usually used to compress a neural network and reduce its computational complexity and parameter storage requirements.

However, existing neural network pruning schemes usually evaluate the importance of different parts of the network structure based on the data characteristics of the weight parameters or output feature maps in the neural network, and then perform the pruning operation according to this importance. In this way, when the importance of different parts of the network structure is evaluated in a data-driven manner, only the data characteristics of the output data themselves are considered, so the pruning accuracy achieved when pruning according to the evaluated importance remains to be further improved.
Technical Problem

By providing a method and device for determining a pruning module and a computer-readable storage medium, the embodiments of the present application aim to solve the technical problem that, when the importance of different parts of a network structure is evaluated in a data-driven manner, only the data characteristics of the output data themselves are considered, so that the pruning accuracy achieved when pruning according to the evaluated importance remains to be further improved.
Technical Solution

An embodiment of the present application provides a method for determining a pruning module, the method including:
acquiring the input tasks and the number of tasks of a neural network to be pruned, as well as the output information of the component modules of the neural network to be pruned;

determining the task relevance degree of the component modules according to the input tasks, the number of tasks and the output information;

determining the input information retention degree of the component modules according to the number of tasks and the output information;

determining the importance index value of the component modules according to the task relevance degree and the input information retention degree;

determining the pruning modules in the neural network to be pruned according to the importance index value.
In an embodiment, the step of determining the task relevance degree of the component modules according to the input tasks, the number of tasks and the output information includes:

acquiring the target sequence number of the component module in the current network level;

determining the target component modules in the current network level whose sequence number is less than or equal to the target sequence number;

determining the input task retention degree and the input-output information sum of each target component module according to the output information of each target component module and the input tasks;

determining the task relevance degree of the component module according to the number of tasks, the input task retention degree and the input-output information sum.
In an embodiment, the input tasks include at least one input task map and the output information includes output feature maps, and the step of determining the input task retention degree and the input-output information sum of each target component module according to the output information of each target component module and the input tasks includes:

adjusting the resolution of the output feature map of each target component module so that the resolution of the output feature map is consistent with that of the input task map;

binarizing the input task map and the adjusted output feature map;

taking the intersection of the processed input task map and the processed output feature map to obtain the input task retention degree of each target component module;

taking the union of the processed input task map and the processed output feature map to obtain the input-output information sum of each target component module.
In an embodiment, the step of determining the task relevance degree of the component module according to the number of tasks, the input task retention degree and the input-output information sum includes:

obtaining the ratio of the input task retention degree to the input-output information sum to obtain the task retention proportion of each target component module;

summing the task retention proportions to obtain a task retention proportion sum;

determining the quotient of the task retention proportion sum and the number of tasks as the task relevance degree of the component module.
In an embodiment, the step of determining the input information retention degree of the component modules according to the number of tasks and the output information includes:

acquiring the target sequence number of the component module in the current network level and the number of images included in the input tasks;

determining the target component modules in the current network level whose sequence number is less than or equal to the target sequence number;

determining the first energy value of each target component module in the current network level according to the output information of each target component module, and determining the second energy value of the component modules in the previous network level according to the output information of each component module in the previous network level;

determining the input information retention degree of the component module according to the number of tasks, the first energy value and the second energy value.
In an embodiment, the step of determining the first energy value of each target component module in the current network level according to the output information of each target component module, and determining the second energy value of the component modules in the previous network level according to the output information of each component module in the previous network level, includes:

obtaining the norm of the output information of each target component module, and taking the square of the norm as the first energy value of each target component module;

obtaining the variance of the output information of the component modules in the previous network level to obtain the second energy value.
In an embodiment, the step of determining the input information retention degree of the component module according to the number of tasks, the first energy value and the second energy value includes:

obtaining the ratio between each first energy value and the second energy value to obtain the energy proportion corresponding to each target component module;

summing the energy proportions to obtain an energy proportion sum;

determining the quotient of the energy proportion sum and the number of tasks as the input information retention degree of the component module.
In an embodiment, the step of determining the importance index value of the component module according to the task relevance degree and the input information retention degree includes:

normalizing the task relevance degree and the input information retention degree;

taking the sum of the normalized task relevance degree and the normalized input information retention degree as the importance index value of the component module;

or, taking the product of the normalized task relevance degree and the normalized input information retention degree as the importance index value of the component module.
In addition, to achieve the above object, the present application further provides a device for determining a pruning module, the device including a memory, a processor, and a determination program of the pruning module stored in the memory and executable on the processor; when the determination program of the pruning module is executed by the processor, the steps of the method for determining a pruning module as described above are implemented.

In addition, to achieve the above object, the present application further provides a computer-readable storage medium storing a determination program of the pruning module; when the determination program of the pruning module is executed by a processor, the steps of the method for determining a pruning module as described above are implemented.
Beneficial Effects

In the method and device for determining a pruning module and the computer-readable storage medium provided in the embodiments of the present application, the input tasks and the number of tasks of the neural network to be pruned and the output information of the component modules of the neural network to be pruned are acquired; the task relevance degree of each component module is then determined according to the input tasks, the number of tasks and the output information, and the input information retention degree of the component modules is determined according to the number of tasks and the output information; subsequently, the importance index value of the component modules is determined according to the task relevance degree and the input information retention degree, so that the pruning modules in the neural network to be pruned are determined according to the importance index value. In this way, the output information of each component module can be associated with the input tasks to realize task-driven neural network pruning. This avoids the situation where, when the importance of each component module is evaluated only according to the data characteristics of the output data themselves, important modules associated with the input tasks are easily pruned away, affecting the processing of the input tasks by the neural network; the accuracy of determining the pruning modules can thereby be improved, so as to improve the pruning accuracy of the neural network.
Brief Description of the Drawings

Fig. 1 is a schematic structural diagram of a device for determining a pruning module in a hardware operating environment involved in the solutions of the embodiments of the present application;

Fig. 2 is a schematic flowchart of a first embodiment of the method for determining a pruning module of the present application;

Fig. 3 is a schematic flowchart of a second embodiment of the method for determining a pruning module of the present application;

Fig. 4 is a schematic flowchart of a third embodiment of the method for determining a pruning module of the present application;

Fig. 5 is a schematic flowchart of a fourth embodiment of the method for determining a pruning module of the present application.
The realization of the object, functional characteristics and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments; the above drawings show only one embodiment and not all of the present application.
Embodiments of the Invention

For a better understanding of the above technical solutions, exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. Although exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure can be implemented in various forms and should not be limited by the embodiments set forth herein. Rather, these embodiments are provided so that the present disclosure will be understood more thoroughly, and so that the scope of the present disclosure can be fully conveyed to those skilled in the art.
The main solution of the present application is: acquiring the input tasks and the number of tasks of a neural network to be pruned, as well as the output information of the component modules of the neural network to be pruned; determining the task relevance degree of the component modules according to the input tasks, the number of tasks and the output information; determining the input information retention degree of the component modules according to the number of tasks and the output information; determining the importance index value of the component modules according to the task relevance degree and the input information retention degree; and determining the pruning modules in the neural network to be pruned according to the importance index value.

Since most current neural network pruning schemes evaluate the importance of the component modules in a data-driven manner, only the data characteristics of the output data themselves are considered while the association between the output data and the input tasks is ignored, so component modules related to the input tasks are easily pruned away, which is not conducive to improving pruning accuracy. Therefore, the above solution proposed in the present application aims to improve pruning accuracy.
Referring to Fig. 1, Fig. 1 is a schematic structural diagram of a device for determining a pruning module in a hardware operating environment involved in the solutions of the embodiments of the present application.

As shown in Fig. 1, the device for determining a pruning module may include: a communication bus 1002, a processor 1001 such as a CPU, a user interface 1003, a network interface 1004, and a memory 1005. The communication bus 1002 is used to realize connection and communication between these components. The user interface 1003 may include a display and an input unit such as a keyboard, and the optional user interface 1003 may also include standard wired and wireless interfaces. The network interface 1004 may optionally include standard wired and wireless interfaces (such as a WI-FI interface). The memory 1005 may be a high-speed RAM memory or a stable non-volatile memory, such as a magnetic disk memory. The memory 1005 may optionally also be a storage device independent of the aforementioned processor 1001.

Those skilled in the art can understand that the structure of the device for determining a pruning module shown in Fig. 1 does not constitute a limitation on the device, which may include more or fewer components than shown, or combine certain components, or have a different component arrangement.

In the device for determining a pruning module shown in Fig. 1, the network interface 1004 is mainly used to connect to a background server and perform data communication with it; the user interface 1003 is mainly used to connect to a client (user side) and perform data communication with it; and the processor 1001 can be used to call the control program of the device for determining a pruning module stored in the memory 1005 and execute the relevant steps of the following embodiments of the method for determining a pruning module.
Based on the system architecture of the above device for determining a pruning module, a first embodiment of the method for determining a pruning module of the present application is proposed. Referring to Fig. 2, in this embodiment the method for determining a pruning module includes the following steps:

Step S10: acquire the input tasks and the number of tasks of the neural network to be pruned, as well as the output information of the component modules of the neural network to be pruned;

Step S20: determine the task relevance degree of the component modules according to the input tasks, the number of tasks and the output information;

Step S30: determine the input information retention degree of the component modules according to the number of tasks and the output information;

Step S40: determine the importance index value of the component modules according to the task relevance degree and the input information retention degree;

Step S50: determine the pruning modules in the neural network to be pruned according to the importance index value.
It should be noted that the neural network to be pruned refers to a neural network that needs to be pruned to realize neural network compression, such as a convolutional neural network (CNN), a deep neural network (DNN) or a recurrent neural network (RNN). The input tasks of the neural network to be pruned refer to the input data fed into it for information processing, such as data tables, sequence tables or image tables; the output information of the component modules of the neural network to be pruned refers to the output data of those component modules, such as output feature maps. Each neural network to be pruned may include multiple component modules, and each component module may include at least one of a filter, a channel and a parameter; each input task may include at least one piece of input data, such as at least one input image.
Since the computational complexity and parameter storage requirements of neural networks are relatively high, in order to improve processing efficiency and expand the applicable range of neural networks, the networks need to be compressed, and neural network pruning is currently a commonly used compression method. However, some implementations usually evaluate the importance of the components of a neural network in a data-driven manner: for example, the importance degree is determined by numerically ranking the results of computing an L1/L2/other regularization term over the channel output feature maps; or all elements in each weight parameter matrix are sorted by absolute value from small to large to determine the importance degree; or the contribution of a filter is determined by computing its norm (L1/L2/other regularization term), the norm being proportional to the contribution. In this way, when the importance of different component modules in the network structure is evaluated, only the output characteristics of the output data themselves are considered while the connection between the output data and the input tasks is ignored, so the pruning accuracy achieved when pruning according to the evaluated importance remains to be further improved. Therefore, the method for determining a pruning module proposed in the present application can evaluate importance by combining the connection between the output data of each component module and the input tasks, so as to determine the pruning modules among the component modules of the neural network to be pruned more accurately and thereby improve the pruning accuracy when the network is pruned.
Specifically, after the input tasks are fed into the neural network to be pruned, the output information of each component module can be correspondingly acquired, and the input tasks and the corresponding number of tasks can be recorded, so that the importance of each component module can be determined according to the input tasks, the number of tasks and the output information of the component modules.

When determining the importance of each component module of the neural network to be pruned, the procedure may specifically be: determining the task relevance degree of each component module according to the input tasks, the number of tasks and the output information of the component modules; determining the input information retention degree of each component module according to the number of tasks and the output information of each component module; and then determining the importance index value of each component module according to its task relevance degree and input information retention degree. The task relevance degree of a component module refers to the degree of association between its output information and the input tasks of the neural network to be pruned; the input information retention degree of a component module refers to the amount of its input information contained in its output information.
Optionally, when determining the task relevance degree of each component module according to the input tasks, the number of tasks and the output information of the component modules, considering the influence of information transmission between component modules in the same network level, the procedure may specifically be: taking the intersection of the input tasks with the output information of the first N component modules of the current network level (including the current component module N, N being a positive integer) to obtain the amount of input task information contained in the first N component modules, summing these amounts, and then averaging the sum over the number of tasks to obtain the task relevance degree of the current component module N; or, after determining the intersection and union of the output information of each component module with the input tasks, summing the intersection-to-union ratios of the first N component modules according to the sequence number N of the current component module in the current network level, and then averaging the sum over the number of tasks to obtain the task relevance degree of the current component module, and so on.

It can be understood that the task relevance degree of each component module can also be determined in other ways; the above are merely examples and are not specifically limited here.
Optionally, when determining the input information retention degree of each component module according to the number of tasks and the output information of each component module, the procedure may specifically be: considering the mutual influence between component modules in the same network level, determining the energy values corresponding to the output information of the first N component modules of the current network level (including the current component module N, N being a positive integer), summing the N determined energy values, and averaging the sum over the number of tasks to obtain the input information retention degree of the current component module; or, considering the information flow between adjacent network levels as well as the mutual influence between component modules in the same level, after determining the energy proportion of the output information of each component module relative to the output information of the previous network level, summing the energy proportions of the first N component modules according to the sequence number N of the current component module in the current network level, and averaging over the number of tasks to obtain the input information retention degree of the current component module, and so on.

It can be understood that the input information retention degree of each component module can also be determined in other ways; the above are merely examples and are not specifically limited here.
Optionally, when determining the importance index value of each component module according to its task relevance degree and input information retention degree, the importance index value may specifically be the sum of the task relevance degree and the input information retention degree, or their product, or a weighted sum of the two, etc., which is not specifically limited here.
After the importance index value of each component module is determined, the pruning modules in the neural network to be pruned can be determined according to the importance index values. The pruning modules refer to the component modules of the neural network to be pruned on which pruning can be performed, such as at least one of filters, channels and parameters. After the pruning modules are determined, pruning can be performed on them to reduce the computational complexity and data storage requirements of the neural network to be pruned.
Optionally, the manner of determining the pruning modules according to the importance index values may be: determining the component modules whose importance index values fall within a preset value range as the pruning modules; or, after determining the number of modules to be pruned according to a preset pruning rate, determining the corresponding number of component modules according to their importance index values as the pruning modules, etc., which is not specifically limited here. The preset value range and the preset pruning rate can be set according to actual needs and are not specifically limited here.
In this embodiment, the input tasks and number of tasks of the neural network to be pruned and the output information of its component modules are acquired; the task relevance degree of the component modules is determined according to the input tasks, the number of tasks and the output information; the input information retention degree is determined according to the number of tasks and the output information; the importance index value of each component module is then determined according to the task relevance degree and the input information retention degree, and the pruning modules in the neural network to be pruned are determined according to the importance index values. Thus the output information of each component module can be associated with the input tasks of the neural network to be pruned so that the importance of each component module is evaluated in a task-driven manner. This avoids the situation where importance is evaluated only according to the data characteristics of the output data while the role of the input tasks in the information processing of the neural network is ignored, in which case component modules highly associated with the input tasks are easily pruned by mistake, affecting the processing of the input tasks. Associating the output information of each component module with the input tasks of the neural network to be pruned and evaluating importance in a task-driven manner can therefore improve the pruning accuracy of the neural network to be pruned.
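The selection of pruning modules by a preset pruning rate can be sketched as follows (illustrative code; the function and variable names are not from the patent, and, following the convention used later in the application example, the filters with the smallest importance scores are the ones removed):

```python
import numpy as np

def select_pruning_modules(gamma, prune_rate):
    """Given per-filter importance index values `gamma` for one layer,
    return the indices of the filters to prune: the lowest-scoring
    fraction `prune_rate` of filters is removed and the rest are kept."""
    gamma = np.asarray(gamma, dtype=float)
    n = gamma.size
    n_keep = int(round(n * (1.0 - prune_rate)))  # filters that survive
    order = np.argsort(gamma)                    # ascending by importance
    return sorted(order[: n - n_keep].tolist())  # least important -> pruned

pruned = select_pruning_modules([0.9, 0.1, 0.5, 0.7, 0.2], prune_rate=0.4)
# prunes the two filters with the smallest importance values
```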
Based on the above first embodiment, a second embodiment of the method for determining a pruning module of the present application is proposed. Referring to Fig. 3, in this embodiment step S20 may include:

Step S21: acquire the target sequence number of the component module in the current network level;

Step S22: determine the target component modules in the current network level whose sequence number is less than or equal to the target sequence number;

Step S23: determine the input task retention degree and the input-output information sum of each target component module according to the output information of each target component module and the input tasks;

Step S24: determine the task relevance degree of the component module according to the number of tasks, the input task retention degree and the input-output information sum.
It should be noted that the neural network to be pruned may include multiple network levels, and each network level may include multiple component modules. On this basis, the different component modules in the same network level can be numbered based on their connection relationship or processing order, so as to distinguish them. For example, they can be sorted level by level in ascending order according to the connection relationship, or sorted in ascending order according to the processing order, where an earlier processing position corresponds to a smaller sequence number.

Since component modules in the same level influence one another, thereby affecting the task relevance degree of the current component module, before determining the task relevance degree of the current component module, the target component modules in the current network level that affect the information processing of the current component module can be determined first, and the task relevance degree of the current component module is then determined by combining the influence of the target component modules with the processing capability of the current component module itself. Specifically, the target sequence number of the current component module in its current network level can be determined first, e.g., which component module the current filter is in the current network level. Component modules whose sequence number is smaller than the target sequence number are all considered component modules that may influence the task relevance degree of the current component module. Thus, the component modules whose sequence number is less than or equal to the target sequence number in the current network level can be determined as the target component modules: for example, when the target sequence number is 1, the first component module is determined as the target component module; when the target sequence number is 3, the first, second and third component modules are all determined as target component modules. The task relevance degree of the current component module can then be determined according to the output information of the target component modules and the input tasks and number of tasks of the neural network to be pruned.
Specifically, when determining the task relevance degree of the current component module, the input task retention degree and the input-output information sum of each target component module can first be determined according to the output information of each target component module and the input tasks of the neural network to be pruned, and the task relevance degree of the current component module is then determined according to the number of tasks input to the neural network to be pruned together with the input task retention degree and input-output information sum of each target component module. The input task retention degree of a target component module refers to the amount of input task information contained in its output information; the input-output information sum of a target component module refers to the total of the information in its output information and the task information contained in the input tasks of the neural network to be pruned.

Optionally, when determining the input task retention degree and the input-output information sum of each target component module according to its output information and the input tasks of the neural network to be pruned, the procedure may specifically be: taking the intersection of the output information of the target component module with the input tasks to obtain the input task retention degree of each target component module; and taking the union of the output information of the target component module with the input tasks to obtain the input-output information sum of each target component module.
Optionally, taking image classification as an example of the input task, if the input tasks of the neural network to be pruned include at least one input task map and, correspondingly, the output information of each component module includes at least one output feature map, then, to facilitate information comparison, before determining the input task retention degree and input-output information sum of each target component module, the resolution of the output feature map of each target component module can first be adjusted to be consistent with that of the input task map.

Optionally, in order to reduce the amount of computation and highlight the contour of the object of interest, the output feature maps and the input task map can also be binarized before determining the input task retention degree and input-output information sum of each target component module.

Optionally, when determining the task relevance degree of the current component module according to the number of tasks input to the neural network to be pruned and the input task retention degree and input-output information sum of each target component module, the procedure may specifically be: first computing, for each target component module, the ratio of its input task retention degree to its input-output information sum to obtain its task retention proportion; then summing the task retention proportions of the target component modules to obtain a task retention proportion sum; and then dividing the task retention proportion sum by the number of tasks to obtain the average task retention per task, which is taken as the task relevance degree of the current component module.
In a specific application example, the resolution of the output feature map of each target component module can first be adjusted to be consistent with that of the input task map; the input task map is then binarized to obtain a binarized input task map, and the adjusted output feature map is binarized to obtain a binarized output feature map; the intersection of the binarized input task map and the binarized output feature map then gives the input task retention degree of each target component module, and their union gives the input-output information sum of each target component module. Next, the ratio of the input task retention degree to the input-output information sum of each target component module is obtained to give its task retention proportion, and the obtained task retention proportions are summed to give the task retention proportion sum; the quotient of this sum and the number of tasks then gives the task relevance degree of the current component module.
For example, for the i-th component module of the t-th layer in the neural network to be pruned, suppose the grayscale image of the input task map x is denoted G(x), the network weight parameter of the i-th component module of the t-th layer is denoted W_i^t, and its corresponding output feature map is denoted O_i^t(x). Calculating the task relevance degree of the i-th component module of the t-th layer may then include the following steps:

1) The resolution of O_i^t(x) can be adjusted by bilinear interpolation to be the same as that of the input task map x, and the adjusted output feature map is denoted Õ_i^t(x).

2) Based on a set threshold Δ, G(x) and Õ_i^t(x) can be binarized to obtain the binarized input task map, denoted L(x), and the binarized output feature map, denoted L_i^t(x). The calculation formula is as follows:

L(x) = 1 if G(x) ≥ Δ, otherwise 0;  L_i^t(x) = 1 if Õ_i^t(x) ≥ Δ, otherwise 0   (1)

It is worth noting that different settings of the threshold Δ do not affect the finally calculated importance index value.

3) If the task retention proportion of each target component module with sequence number j ≤ i is denoted a_j^t(x) = |L(x) ∩ L_j^t(x)| / |L(x) ∪ L_j^t(x)|, then averaging over the number of tasks d gives the task relevance degree of the i-th component module of the t-th layer, denoted A_i^t. The specific calculation formula is as follows:

A_i^t = (1/d) Σ_x Σ_{j=1}^{i} a_j^t(x)   (2)
It is worth noting that the larger A_i^t is, the stronger the ability of the i-th component module of the t-th layer to extract the semantic information of the input task, and the more important it is to the input task. That is, each component module can be regarded as a semantic extractor related to the input task, and its output feature map can be regarded as a feature map containing the semantic information of the input task; by judging how much task-related semantic information the output feature map contains, the connection between the input task and the feature map is evaluated, thereby determining the semantic extraction capability, with respect to the task, of the filter that outputs this feature map.
In this embodiment, the task relevance degree of the current component module is determined from the number of tasks corresponding to the input tasks of the neural network to be pruned and the input task retention degree and input-output information sum of the target component modules, so that not only can the output information of the current component module be associated with the input tasks, but the influence of the preceding component modules in the current network level on the task relevance degree of the current component module is also fully taken into account. This improves the accuracy with which the task relevance degree of the current component module is determined, which in turn improves the accuracy of its importance index value and thus the pruning accuracy.
Based on the above first embodiment, a third embodiment of the method for determining a pruning module of the present application is proposed. Referring to Fig. 4, in this embodiment step S30 may include:

Step S31: acquire the target sequence number of the component module in the current network level and the number of images included in the input tasks;

Step S32: determine the target component modules in the current network level whose sequence number is less than or equal to the target sequence number;

Step S33: determine the first energy value of each target component module in the current network level according to the output information of each target component module, and determine the second energy value of the component modules in the previous network level according to the output information of each component module in the previous network level;

Step S34: determine the input information retention degree of the component module according to the number of tasks, the first energy value and the second energy value.
It should be noted that the neural network to be pruned may include multiple network levels, and each network level may include multiple component modules. On this basis, the different component modules in the same network level can be numbered based on their connection relationship or processing order, so as to distinguish them. For example, they can be sorted level by level in ascending order according to the connection relationship, or sorted in ascending order according to the processing order, where an earlier processing position corresponds to a smaller sequence number.

Since component modules in the same level influence one another while component modules in non-adjacent levels do not, the information flow in the component modules of adjacent levels can be measured by analyzing the amount of information acquired between them, so as to evaluate the input information retention degree of the current component module. Specifically, before determining the input information retention degree of the current component module, the target component modules in the current network level that affect this retention degree can be determined first. For example, the target sequence number of the current component module in its current network level can be determined first, e.g., which filter the current filter is in the current network level. Component modules whose sequence number is smaller than the target sequence number are all considered target component modules that affect the input information retention degree of the current component module. Thus, the component modules whose sequence number is less than or equal to the target sequence number in the current network level can be determined as the target component modules, and the input information retention degree of the current component module can then be determined according to the output information of the target component modules and the number of tasks corresponding to the input tasks of the neural network to be pruned.
When determining the input information retention degree of the current component module according to the output information of the target component modules and the number of tasks of the neural network to be pruned, the procedure may specifically be: first determining the first energy value of each target component module in the current network level according to its output information, and determining the second energy value of the component modules in the previous network level according to their output information; and then determining the input information retention degree of the current component module according to the number of tasks, the first energy value and the second energy value. The first energy value refers to the energy value corresponding to the output information of each target component module, and the second energy value refers to the energy value corresponding to the output information of the previous network level.
Optionally, the norm of the output information of each target component module can be computed, and the square of the computed norm taken as the first energy value of each target component module. The norm here refers in particular to the L2 norm, in order to improve the accuracy of the calculation of the first energy value. Of course, other norms, such as the L1 norm, can also be used in some other embodiments, which is not specifically limited here. The L2 norm of a vector is the square root of the sum of the squares of its elements; the L1 norm is the sum of the absolute values of its elements, also known as the "sparse rule operator".

Optionally, the variance of the output information of the component modules in the previous network level can be computed as the second energy value corresponding to the previous network level.

Optionally, when determining the input information retention degree of the current component module according to the number of tasks, the first energy value and the second energy value, the ratio between the first energy value of each target component module and the second energy value can first be obtained to give the energy proportion corresponding to each target component module; the energy proportions are then summed to give an energy proportion sum, and the obtained sum is divided by the number of tasks to give the input information retention degree of the current component module.
In a specific application example, taking image classification as the input task, if the input tasks of the neural network to be pruned include at least one input task map and the output information of each component module correspondingly includes at least one output feature map, then, for the i-th component module of the t-th layer in the neural network to be pruned, suppose the network weight parameter of the i-th component module of the t-th layer is denoted W_i^t, its corresponding output feature map is denoted O_i^t(x), and the energy proportion of each target component module is denoted b_j^t(x). Averaging the energy proportions over the number d of tasks input to the neural network to be pruned gives the input information retention degree of the i-th component module of the t-th layer. The specific calculation formula is as follows:

B_i^t = (1/d) Σ_x Σ_{j=1}^{i} b_j^t(x),  with  b_j^t(x) = ||O_j^t(x)||₂² / σ²(O^{t-1}(x))

where O_i^t(x) denotes the output feature map of the i-th filter of the t-th layer of the network (with network weight parameter W_i^t) when the input task map is x; ||O_i^t(x)||₂² denotes the square of the L2 norm of O_i^t(x); and σ²(O^{t-1}(x)) denotes the variance of all output feature maps of the (t-1)-th layer. B_i^t can be used to characterize the information acquisition capability of the i-th filter of the t-th layer in the neural network to be pruned. The larger the computed B_i^t, the stronger the information acquisition capability of the i-th filter of the t-th layer for the input task, and the more important it is to the input task.
In this embodiment, the input information retention degree of the component modules is determined from the number of tasks corresponding to the input tasks of the neural network to be pruned and the first and second energy values, so that the information flow between different network levels is used to evaluate the input information retention degree of the component modules. This improves the accuracy with which the input information retention degree is determined, which in turn improves the accuracy of the importance index value of the current component module and improves pruning accuracy.
Based on the above first embodiment, a fourth embodiment of the method for determining a pruning module of the present application is proposed. Referring to Fig. 5, in this embodiment step S40 may include:

Step S41: normalize the task relevance degree and the input information retention degree;

Step S42: take the sum of the normalized task relevance degree and the normalized input information retention degree as the importance index value of the component module; or,

Step S43: take the product of the normalized task relevance degree and the normalized input information retention degree as the importance index value of the component module.
Since the importance of each component module in the neural network to be pruned is related to its task relevance degree and input information retention degree, the importance index value of each component module can be determined according to the task relevance degree and the input information retention degree, so as to evaluate the importance of each component module.

For example, the task relevance degree and input information retention degree of each component module can be summed and the resulting sum used as its importance index value; or the product of the two can be taken as its importance index value; or, after computing both the sum and the product of the task relevance degree and the input information retention degree of each component module, different weight values can be assigned to the obtained sum and product, a weighted sum computed with the assigned weights, and the weighted sum used as the importance index value of each component module, and so on.
In an embodiment, in order to unify magnitudes and reduce interference from abnormal data, the task relevance degree and input information retention degree of each component module can first be normalized, and the sum or product of the normalized values then used as the importance index value of each component module. Specifically, the maximum task relevance degree among all component modules in the current network level and the maximum input information retention degree among all component modules in the current network level can be obtained first; then the quotient of the task relevance degree of the current component module and the maximum task relevance degree gives the normalized task relevance degree, and the quotient of the input information retention degree of the current component module and the maximum input information retention degree gives the normalized input information retention degree; the sum of the normalized task relevance degree and the normalized input information retention degree then gives the importance index value of the corresponding component module; alternatively, their product gives the importance index value of the corresponding component module.

In this embodiment, the importance index value of each component module is obtained by summing or multiplying the normalized task relevance degree and the normalized input information retention degree, so that the importance index value of each component module can be determined in a task-driven manner, improving the accuracy of the importance index value and thereby the pruning accuracy of the neural network to be pruned.
In a specific application example, if the component modules are filters in the neural network to be pruned, the input information of the neural network to be pruned includes at least one input task map, and the output information of each component module includes an output feature map, then the importance index value of each filter can be evaluated by assessing the connection between the input task and the output feature map (task relevance degree) and by measuring the information flow in the filters between adjacent network levels (input information retention degree), and the pruning modules in the neural network to be pruned can then be determined according to the importance index values. The specific procedure is as follows:

1. Evaluate the connection between the input task and the feature map (task relevance degree).

Each filter can be regarded as a semantic extractor related to the input task, and its output feature map can be regarded as a feature map containing the semantic information of the input task. By judging how much task-related semantic information the output feature map contains, the connection between the input task and the feature map is evaluated, thereby determining the semantic extraction capability, with respect to the input task, of the filter that outputs this feature map; this capability is used as an indicator for evaluating the importance of the filter.
Specifically: suppose the input task map is x, and obtain its grayscale image G(x). Input the task map into the neural network to be pruned, and suppose the network weight parameter of the i-th filter of the t-th layer is W_i^t; the corresponding output feature map can be written O_i^t(x). Then the output feature map is expanded by bilinear interpolation into an image Õ_i^t(x) with the same resolution as the input task map, and, based on a set threshold, the binarization of G(x) and Õ_i^t(x) is realized by the following formula (1):

L(x) = 1 if G(x) ≥ Δ, otherwise 0;  L_i^t(x) = 1 if Õ_i^t(x) ≥ Δ, otherwise 0   (1)

where L(x) and L_i^t(x) are the results obtained by binarizing G(x) and Õ_i^t(x) respectively; moreover, setting different values of the threshold Δ has no effect on the finally determined importance index value.
Suppose the importance quantification score (task retention proportion) of the i-th filter of the t-th layer in the neural network to be pruned is a_i^t(x). Averaging the a_i^t(x) obtained for all input tasks gives the final importance quantification score (task relevance degree) A_i^t of the i-th filter of the t-th layer, which corresponds to the semantic extraction capability, with respect to the input task, of the filter corresponding to the current output feature map and can be used to characterize the connection between the input task and the output feature map. The specific calculation process is shown in formula (2):

A_i^t = (1/d) Σ_x a_i^t(x)   (2)

where d represents the number of input task maps in the input tasks, i.e., the number of tasks, and a_i^t(x) can be obtained by calculating the connection between the input task and the output feature map; the specific calculation is the intersection-over-union of L(x) and L_i^t(x), i.e., a_i^t(x) = |L(x) ∩ L_i^t(x)| / |L(x) ∪ L_i^t(x)|. The larger the computed A_i^t, the stronger the ability of the i-th filter of the t-th layer of the network to extract the semantic information of the input task, and the more important it is to the input task.
2. Measure the information flow in the filters between adjacent layers (input information retention degree).

During the forward inference of the neural network to be pruned, the information flow of the input task flows between adjacent layers, level by level, from shallow to deep. A filter of a specific layer takes the output feature maps of the previous layer as input and outputs the output feature map of the current layer; that is, filters in non-adjacent layers have no influence on each other. Therefore, the filters of each layer can be regarded as a finite state machine, and the information flow in the filters between adjacent layers can be measured by analyzing the amount of information acquired between the filters of adjacent layers.
Specifically, let $b_i^t(x)$ denote the importance score (energy ratio) of the i-th filter in layer t of the network to be pruned when the input task image is x. Averaging $b_i^t(x)$ over all input task images gives the final importance score (input-information retention) $B_i^t$ of that filter, which characterizes its information-acquisition capability. The computation is given in Eq. (3):

$$B_i^t=\frac{1}{d}\sum_{x}b_i^t(x)\tag{3}$$

where $O_i^t(x)$ denotes the output feature map of the i-th filter in layer t (with network weight $W_i^t$) for input image x, $\lVert O_i^t(x)\rVert_2^2$ denotes the square of its L2 norm, and $\sigma^2(O^{t-1}(x))$ denotes the variance of all output feature maps of layer (t-1). The energy ratio $b_i^t(x)$ is obtained by computing the information-acquisition capability of the i-th filter in layer t for input image x:

$$b_i^t(x)=\frac{\lVert O_i^t(x)\rVert_2^2}{\sigma^2(O^{t-1}(x))}$$

The larger the computed $B_i^t$, the stronger the information-acquisition capability of the i-th filter in layer t for the input task, and the more important that filter is to the input task.
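A minimal sketch of the energy-ratio computation of Eq. (3), under the assumption that feature maps are available as NumPy arrays (function and argument names are illustrative):

```python
import numpy as np

def input_info_retention(filter_maps, prev_layer_maps):
    """Input-information retention B_i^t of one filter (Eq. (3)).
    filter_maps[k]     : feature map O_i^t(x) of the filter for image k
    prev_layer_maps[k] : all output feature maps O^{t-1}(x) of layer t-1
                         for image k."""
    ratios = []
    for o, prev in zip(filter_maps, prev_layer_maps):
        energy = float(np.sum(o ** 2))    # squared L2 norm of O_i^t(x)
        variance = float(np.var(prev))    # sigma^2(O^{t-1}(x))
        ratios.append(energy / variance)  # energy ratio b_i^t(x)
    return float(np.mean(ratios))         # average over the d images: B_i^t
```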
3. Combining the two scores into the final task-driven importance evaluation strategy
The scores $A_i^t$ and $B_i^t$ obtained above are combined to compute the importance value of each filter, as in Eq. (4):

$$\Gamma_i^t=\frac{A_i^t}{\max(A^t)}+\frac{B_i^t}{\max(B^t)}\tag{4}$$

where $\max(A^t)$ and $\max(B^t)$ denote, respectively, the maxima of $\{A_1^t,\dots,A_{n_t}^t\}$ and of $\{B_1^t,\dots,B_{n_t}^t\}$ over all filters of layer t of the network to be pruned ($n_t$ being the number of filters in layer t).
To verify the performance of the invention, pruning tests were run on at least four typical neural networks, namely VGG-16, ResNet-56, ResNet-110 and ResNet-50, as networks to be pruned, using the CIFAR-10 and ImageNet (ILSVRC2012) test datasets. The experimental results show that the task-driven pruning scheme proposed in this application achieves better network-pruning performance than data-driven schemes, including a higher compression rate, lower parameter-storage requirements and lower computational complexity.
That is, both combination schemes for the importance value (the sum of the normalized scores, as in Eq. (4), and their product) achieve good results, and either can be chosen according to the situation at hand.
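The two combination schemes just mentioned can be sketched as follows (illustrative names; each score is normalized by its per-layer maximum as described for Eq. (4)):

```python
import numpy as np

def importance_scores(A_t, B_t, combine="sum"):
    """Combine task relevance A_t and input-information retention B_t for
    all filters of one layer into importance values Gamma (Eq. (4)).
    combine="sum" adds the normalized scores; any other value multiplies
    them, matching the alternative scheme mentioned above."""
    A_t = np.asarray(A_t, dtype=float)
    B_t = np.asarray(B_t, dtype=float)
    a = A_t / A_t.max()   # A_i^t / max(A^t)
    b = B_t / B_t.max()   # B_i^t / max(B^t)
    return a + b if combine == "sum" else a * b
```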
4. Joint training framework
Assume the loss function of the task is given by Eq. (5):

$$\mathcal{L}(W)=\sum_{(x,y)}l\bigl(f(x,W),y\bigr)\tag{5}$$

where f(x, W) is the output of the network, l(·) is the loss computed for each input training example x (it may be the cross-entropy loss or another concrete form), and y is the ground truth corresponding to training example x.
The pruning procedure is as follows. Suppose the pruning rate of layer t is preset to $r_t$. After obtaining the importance value $\Gamma_i^t$ of every filter in layer t ($n_t$ being the number of filters in that layer), the filters of layer t with smaller importance scores are removed according to $r_t$ while those with larger scores are retained, thereby achieving pruning. For example, if the sparsity of layer t is preset to $r_t=0.1$, then 10% of the filters of layer t are retained while the remaining 90% are pruned away: according to the importance value $\Gamma_i^t$ of each filter in layer t, the 10% of filters with the largest $\Gamma_i^t$ are kept and the rest are removed, which completes the pruning of layer t of the network to be pruned.
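The layer-wise selection just described can be sketched as follows (illustrative names; `r_t` is interpreted, as in the example above, as the fraction of filters to keep):

```python
import numpy as np

def filters_to_keep(gamma_t, r_t):
    """Return the indices of the filters of one layer that survive pruning:
    the fraction r_t with the largest importance Gamma_i^t is kept
    (r_t = 0.1 keeps the top 10%), the rest are pruned."""
    gamma_t = np.asarray(gamma_t, dtype=float)
    n_keep = max(1, int(round(r_t * gamma_t.size)))
    keep = np.argsort(gamma_t)[::-1][:n_keep]  # indices of the largest values
    return np.sort(keep)
```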
The training framework is as follows:

1) Input: training data pairs (x, y); pretrained model parameters W = {W^t, 1 ≤ t ≤ T} (T is the number of layers of the network to be pruned); preset network pruning rates r = {r_t, 1 ≤ t ≤ T}; learning rate η; number of training iterations j; training stopping condition ε.

2) Output: the final pruned network model parameters $\hat W$.

Compute the task-driven importance values Γ = {Γ^t, 1 ≤ t ≤ T} of the network to be pruned via Eqs. (1)-(4);

3) Let $\hat W^{(0)}=W$;

4) Prune the network layer by layer using Γ and r;

5) While $\lVert\hat W^{(j)}-\hat W^{(j-1)}\rVert_2>\epsilon$, update $\hat W^{(j)}\leftarrow\hat W^{(j-1)}-\eta\nabla\mathcal{L}(\hat W^{(j-1)})$;

6) Output the pruned network model parameters $\hat W$.
It should be noted that steps 3) and 4) constitute the task-driven pruning process of the neural network model, while step 5) is the retraining and fine-tuning process. $\lVert\hat W^{(j)}-\hat W^{(j-1)}\rVert_2$ denotes the L2 norm of the difference between the pruned model parameters $\hat W^{(j)}$ obtained after j training iterations and the pruned model parameters $\hat W^{(j-1)}$ obtained after (j-1) iterations. As long as $\lVert\hat W^{(j)}-\hat W^{(j-1)}\rVert_2>\epsilon$, iterative retraining and fine-tuning of the pruned network model parameters continues.
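The framework above can be sketched as a toy Python loop. All names are illustrative; `grad_fn(W, t)` stands in for the gradient of the loss of Eq. (5) with respect to the parameters of layer t, and each layer's parameters are simplified to a 1-D array indexed by filter.

```python
import numpy as np

def prune_and_finetune(W, gamma, r, eta, grad_fn, eps=1e-3, max_iter=100):
    """Sketch of the joint framework: layer-wise task-driven pruning
    (steps 3-4) followed by gradient-descent fine-tuning (step 5) that
    stops once the parameter change ||W_j - W_{j-1}||_2 reaches eps."""
    # steps 3)-4): zero out the filters with the smallest Gamma in each layer
    for t, (g_t, r_t) in enumerate(zip(gamma, r)):
        n_keep = max(1, int(round(r_t * len(g_t))))
        pruned = np.argsort(g_t)[: len(g_t) - n_keep]
        W[t][pruned] = 0.0
    # step 5): retrain / fine-tune the surviving weights
    for _ in range(max_iter):
        W_prev = [w.copy() for w in W]
        for t in range(len(W)):
            W[t] = W[t] - eta * grad_fn(W, t)  # gradient step on Eq. (5)
        delta = np.sqrt(sum(np.sum((w - p) ** 2) for w, p in zip(W, W_prev)))
        if delta <= eps:                       # stopping condition reached
            break
    return W
```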
In addition, an embodiment of this application further provides a device for determining a pruning module, the device comprising a memory, a processor, and a pruning-module determination program stored in the memory and runnable on the processor; when the processor executes the pruning-module determination program, the steps of the method for determining a pruning module described above are implemented.
In addition, an embodiment of this application further provides a computer-readable storage medium on which a pruning-module determination program is stored; when executed by a processor, the program implements the steps of the method for determining a pruning module described above.
It should be noted that, as used herein, the terms "comprising", "including" or any other variants thereof are intended to cover non-exclusive inclusion, so that a process, method, article or system that includes a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article or system. Without further limitation, an element defined by the phrase "including a ..." does not exclude the presence of additional identical elements in the process, method, article or system that includes that element.
The numbering of the above embodiments of this application is for description only and does not indicate the superiority of one embodiment over another.
From the description of the above embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by means of software plus the necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of this application, in its essence or in the part contributing over the prior art, can be embodied in the form of a software product stored in a storage medium as described above (such as ROM/RAM, a magnetic disk, or an optical disc) and including several instructions that cause a terminal device (which may be a mobile phone, a computer, a server, a television, a network device, or the like) to execute the methods described in the embodiments of this application.
The above are only preferred embodiments of this application and do not thereby limit its patent scope; any equivalent structural or process transformation made using the contents of the specification and drawings of this application, whether applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of this application.

Claims (10)

  1. A method for determining a pruning module, wherein the method comprises the following steps:
    obtaining an input task and a task count of a neural network to be pruned, and output information of constituent modules of the network to be pruned;
    determining a task relevance of a constituent module according to the input task, the task count and the output information;
    determining an input-information retention of the constituent module according to the task count and the output information;
    determining an importance value of the constituent module according to the task relevance and the input-information retention;
    determining a pruning module in the network to be pruned according to the importance value.
  2. The method for determining a pruning module according to claim 1, wherein the step of determining the task relevance of the constituent module according to the input task, the task count and the output information comprises:
    obtaining a target index of the constituent module in the current network layer;
    determining target constituent modules in the current network layer whose indices are less than or equal to the target index;
    determining an input-task retention and an input-output information total of each target constituent module according to its output information and the input task;
    determining the task relevance of the constituent module according to the task count, the input-task retention and the input-output information total.
  3. The method for determining a pruning module according to claim 2, wherein the input task includes at least one input task image, the output information includes an output feature map, and the step of determining the input-task retention and the input-output information total of each target constituent module according to its output information and the input task comprises:
    adjusting the resolution of the output feature map of each target constituent module so that it matches the resolution of the input task image;
    binarizing the input task image and the adjusted output feature map;
    taking the intersection of the processed input task image and the processed output feature map to obtain the input-task retention of each target constituent module;
    taking the union of the processed input task image and the processed output feature map to obtain the input-output information total of each target constituent module.
  4. The method for determining a pruning module according to claim 2, wherein the step of determining the task relevance of the constituent module according to the task count, the input-task retention and the input-output information total comprises:
    obtaining the ratio of the input-task retention to the input-output information total to obtain a task retention ratio of each target constituent module;
    summing the task retention ratios to obtain a task-retention-ratio sum;
    determining the quotient of the task-retention-ratio sum and the task count as the task relevance of the constituent module.
  5. The method for determining a pruning module according to claim 1, wherein the step of determining the input-information retention of the constituent module according to the task count and the output information comprises:
    obtaining a target index of the constituent module in the current network layer, and the number of images included in the input task;
    determining target constituent modules in the current network layer whose indices are less than or equal to the target index;
    determining a first energy value of each target constituent module in the current network layer according to its output information, and determining a second energy value of the constituent modules of the previous network layer according to their output information;
    determining the input-information retention of the constituent module according to the task count, the first energy values and the second energy value.
  6. The method for determining a pruning module according to claim 5, wherein the step of determining the first energy value of each target constituent module in the current network layer according to its output information, and determining the second energy value of the constituent modules of the previous network layer according to their output information, comprises:
    taking the norm of the output information of each target constituent module and using the square of the norm as the first energy value of that target constituent module;
    taking the variance of the output information of the constituent modules of the previous network layer to obtain the second energy value.
  7. The method for determining a pruning module according to claim 5, wherein the step of determining the input-information retention of the constituent module according to the task count, the first energy values and the second energy value comprises:
    obtaining the ratio of each first energy value to the second energy value to obtain an energy ratio of each target constituent module;
    summing the energy ratios to obtain an energy-ratio sum;
    determining the quotient of the energy-ratio sum and the task count as the input-information retention of the constituent module.
  8. The method for determining a pruning module according to claim 1, wherein the step of determining the importance value of the constituent module according to the task relevance and the input-information retention comprises:
    normalizing the task relevance and the input-information retention;
    using the sum of the normalized task relevance and the normalized input-information retention as the importance value of the constituent module;
    or, using the product of the normalized task relevance and the normalized input-information retention as the importance value of the constituent module.
  9. A device for determining a pruning module, wherein the device comprises a memory, a processor, and a network-model pruning program stored in the memory and runnable on the processor; when the processor executes the network-model pruning program, the steps of the network-model pruning method according to any one of claims 1 to 8 are implemented.
  10. A computer-readable storage medium, wherein a network-model pruning program is stored on the computer-readable storage medium; when executed by a processor, the network-model pruning program implements the steps of the network-model pruning method according to any one of claims 1 to 8.
PCT/CN2021/136849 2021-12-09 2021-12-09 Method and device for determining pruning module, and computer-readable storage medium WO2023102844A1 (zh)


Publications (1)

Publication Number Publication Date
WO2023102844A1 true WO2023102844A1 (zh) 2023-06-15



Also Published As

Publication number Publication date
US20230186091A1 (en) 2023-06-15
CN114514539A (zh) 2022-05-17


Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21966766

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE