CN111340223A - Neural network compression method, target detection method, driving control method and device - Google Patents


Info

Publication number
CN111340223A
Authority
CN
China
Prior art keywords
neural network
convolution kernel
convolution
determining
compressed
Prior art date
Legal status
Pending
Application number
CN202010123508.6A
Other languages
Chinese (zh)
Inventor
游山
苏修
王飞
钱晨
Current Assignee
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority to CN202010123508.6A
Publication of CN111340223A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00: Computing arrangements based on biological models
    • G06N 3/02: Neural networks
    • G06N 3/08: Learning methods
    • G06N 3/082: Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
    • G06N 3/04: Architecture, e.g. interconnection topology
    • G06N 3/045: Combinations of networks


Abstract

The present disclosure provides a neural network compression method, a target detection method, a driving control method, and corresponding devices. The compression method includes: acquiring convolution kernel parameter information of a trained convolutional layer of a neural network to be compressed; determining, based on the convolution kernel parameter information, a convolution kernel to be deleted from the plurality of convolution kernels that the information indicates, and deleting that convolution kernel from the plurality of convolution kernels; and determining the compressed neural network based on the convolutional layer after the deletion.

Description

Neural network compression method, target detection method, driving control method and device
Technical Field
The disclosure relates to the technical field of deep learning, in particular to a neural network compression method, a target detection method, a driving control method and a device.
Background
To guarantee performance, neural networks are typically given complex structures, which in turn require the devices running them to be highly capable. Such networks are therefore ill-suited to lower-performance devices such as mobile phones and tablet computers, and this limits the development of neural networks.
How to compress a neural network so as to reduce its complexity has therefore become one of the important problems in the field of artificial intelligence.
Disclosure of Invention
In view of the above, the present disclosure provides at least a neural network compression method, a target detection method, a driving control method and a driving control device.
In a first aspect, the present disclosure provides a neural network compression method, including:
acquiring convolution kernel parameter information of a trained convolution layer of the neural network to be compressed;
determining a convolution kernel to be deleted from a plurality of convolution kernels indicated by the convolution kernel parameter information based on the convolution kernel parameter information, and deleting the convolution kernel to be deleted from the plurality of convolution kernels;
and determining the compressed neural network based on the deleted convolutional layer.
With this method, the convolution kernels to be deleted are determined based on the convolution kernel parameter information and removed from the plurality of convolution kernels. Reducing the number of convolution kernels in the convolutional layer simplifies the structure of the neural network to be compressed, so the network is compressed at the level of individual convolution kernels: its complexity is reduced, and the processing efficiency of the compressed network during operations such as image recognition is improved.
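The kernel-level deletion described above can be sketched as follows. The layer shape and the mean-absolute-weight importance measure are illustrative assumptions only; the disclosure determines importance with a separate intermediate network, described later.

```python
import numpy as np

def prune_conv_layer(weights, num_to_prune):
    """Delete the least important kernels from one convolutional layer.

    weights: array of shape (out_channels, in_channels, kH, kW); each
    slice along axis 0 is one convolution kernel. Importance here is the
    mean absolute weight, an illustrative stand-in for the patent's
    learned importance score.
    """
    importance = np.abs(weights).mean(axis=(1, 2, 3))      # one score per kernel
    keep = np.sort(np.argsort(importance)[num_to_prune:])  # drop the lowest scores
    return weights[keep]                                   # layer with fewer output channels

layer = np.random.randn(8, 3, 3, 3)               # a layer with 8 kernels
pruned = prune_conv_layer(layer, num_to_prune=2)
print(pruned.shape)                               # (6, 3, 3, 3)
```

Deleting a kernel removes one output channel, so a full implementation would also drop the corresponding input-channel slices of the following layer.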
In one possible embodiment, determining a compressed neural network based on the removed convolutional layer includes:
determining the operation speed of the corresponding neural network after deleting the convolution kernel;
and under the condition that the operation speed does not meet the set compression termination condition, taking the neural network corresponding to the deleted convolution kernel as a new neural network to be compressed, returning to the step of acquiring the convolution kernel parameter information of the trained convolution layer of the neural network to be compressed until the operation speed meets the set compression termination condition, and determining the compressed neural network based on the convolution layer subjected to the last deletion.
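The loop in this embodiment (prune, re-measure the operation speed, repeat until the termination condition holds) can be sketched generically; all four arguments below are hypothetical callables and values used only for illustration.

```python
def compress(network, prune_step, operation_speed, threshold):
    """Repeatedly apply one round of kernel deletion until the network's
    operation speed (e.g. its FLOPs count) drops below the set threshold.
    `prune_step` returns the network after one deletion round."""
    while operation_speed(network) >= threshold:
        network = prune_step(network)
    return network  # compressed network from the last deletion round

# Toy illustration: the "network" is just a kernel count, and the
# operation speed is proportional to it.
result = compress(10, lambda n: n - 1, lambda n: n * 100, 500)
print(result)   # 4
```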
In one possible embodiment, determining a compressed neural network based on the removed convolutional layer includes:
determining the operation speed of the corresponding neural network after deleting the convolution kernel;
and under the condition that the operation speed meets the set compression termination condition, determining the compressed neural network based on the convolution layer subjected to the current deletion processing.
In one possible embodiment, when the operation speed is measured by the number of floating-point operations (FLOPs), the operation speed satisfies the set compression termination condition when:
the FLOPs are less than a set threshold.
In the above embodiment, because FLOPs accurately characterize the complexity of the compressed neural network, whether to terminate the compression process is decided based on the FLOPs of the compressed network.
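As a sketch of this termination test, the standard per-layer FLOPs estimate for a convolution (multiply-accumulates per output element times number of output elements) can be summed over the layers and compared to the threshold. The formula and the layer tuples are assumptions; the disclosure does not specify how FLOPs are computed.

```python
def conv_flops(out_channels, in_channels, k, out_h, out_w):
    # Each output element costs k*k*in_channels multiply-accumulates;
    # the layer produces out_channels * out_h * out_w output elements.
    return out_channels * out_h * out_w * k * k * in_channels

def compression_done(layers, threshold):
    """layers: list of (out_channels, in_channels, k, out_h, out_w) tuples."""
    return sum(conv_flops(*layer) for layer in layers) < threshold

layers = [(64, 3, 3, 224, 224), (128, 64, 3, 112, 112)]
print(compression_done(layers, 2_000_000_000))   # True
```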
In one possible embodiment, after deleting the convolution kernel to be deleted from the plurality of convolution kernels, before determining an operation speed of a neural network corresponding to the current deletion of the convolution kernel, the method further includes:
and adjusting the network parameters of the neural network corresponding to the deleted convolution kernel based on the training sample data.
In the above embodiment, adjusting the network parameters of the neural network after the current kernel deletion, based on training sample data, compensates for the information lost through compression. The convolution kernels to delete can then be determined accurately when the compressed network is compressed again, preserving the performance of the recompressed neural network.
In one possible embodiment, after determining the compressed neural network, the method further comprises:
carrying out parameter initialization processing on the compressed network structure of the neural network;
and training the compressed neural network based on training sample data and the network structure of the compressed neural network obtained after parameter initialization processing.
In this embodiment, training the finally obtained compressed neural network guarantees its performance.
In one possible embodiment, determining a convolution kernel to be pruned from a plurality of convolution kernels indicated by the convolution kernel parameter information based on the convolution kernel parameter information includes:
determining the importance degree corresponding to each convolution kernel indicated by the convolution kernel parameter information based on the convolution kernel parameter information;
determining a convolution kernel to be pruned from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel.
Determining the convolution kernels to delete by their importance ensures that only kernels of low importance are removed. This avoids the drop in accuracy that deleting a highly important kernel would cause and safeguards the performance of the compressed neural network.
In one possible embodiment, determining the convolution kernel to be pruned from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel includes:
and determining the convolution kernels to be deleted from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel and the preset number of the convolution kernels to be deleted at each time.
In the above manner, the number of convolution kernels to delete per round is preset, and each round removes only that many kernels from the convolutional layer. This prevents the performance loss that deleting too many kernels in a single round would cause, guarantees the performance of the compressed neural network, and avoids over-deleting the kernels of the network to be compressed.
In one possible embodiment, determining the importance degree corresponding to each convolution kernel indicated by the convolution kernel parameter information based on the convolution kernel parameter information includes:
inputting the convolution kernel parameter information into an average pooling layer for feature extraction to obtain first feature data corresponding to the convolution kernel parameter information;
and analyzing and normalizing the first characteristic data to obtain the importance degree corresponding to each convolution kernel.
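This embodiment names only an average-pooling feature extraction followed by analysis and normalization. A minimal sketch, assuming global average pooling over each kernel's weights as the first feature data and softmax as the normalization (both are guesses at details the text leaves unspecified):

```python
import numpy as np

def kernel_importance(weights):
    """weights: (out_channels, in_channels, kH, kW).

    Step 1, average pooling: collapse each kernel to a single scalar
    feature by averaging over its weights (axes 1-3).
    Step 2, normalization: softmax, so the importance scores are
    positive and sum to 1, making kernels directly comparable.
    """
    features = weights.mean(axis=(1, 2, 3))      # first feature data, one per kernel
    exp = np.exp(features - features.max())      # numerically stable softmax
    return exp / exp.sum()

scores = kernel_importance(np.zeros((3, 1, 2, 2)))
print(scores)   # three equal scores summing to 1: identical kernels, equal importance
```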
In a second aspect, the present disclosure provides a target detection method, including:
acquiring a target image;
and performing target detection on the target image by using a target detection neural network, wherein the target detection neural network is obtained using the neural network compression method of the first aspect or any one of its embodiments and has completed training.
In a third aspect, the present disclosure provides a running control method including:
acquiring a road image acquired by a driving device in the driving process;
carrying out target detection on the road image by using a target detection neural network, and determining a target object included in the road image; the target detection neural network is obtained using the neural network compression method of the first aspect or any one of its embodiments and has completed training;
controlling the travel device based on a target object included in the road image.
For the effects of the apparatuses, the electronic device, and the media described below, refer to the description of the methods above; they are not repeated here.
In a fourth aspect, the present disclosure provides a neural network compression apparatus, comprising:
the acquisition module is used for acquiring the convolutional kernel parameter information of the trained convolutional layer of the neural network to be compressed;
a deleting module, configured to determine a convolution kernel to be deleted from the plurality of convolution kernels indicated by the convolution kernel parameter information based on the convolution kernel parameter information, and delete the convolution kernel to be deleted from the plurality of convolution kernels;
and the compression module is used for determining the compressed neural network based on the deleted convolutional layer.
In a possible implementation, the compression module is configured to:
determining the operation speed of the corresponding neural network after deleting the convolution kernel;
and under the condition that the operation speed does not meet the set compression termination condition, taking the neural network corresponding to the deleted convolution kernel as a new neural network to be compressed, returning to the step of acquiring the convolution kernel parameter information of the trained convolution layer of the neural network to be compressed until the operation speed meets the set compression termination condition, and determining the compressed neural network based on the convolution layer subjected to the last deletion.
In a possible implementation, the compression module is further configured to:
determining the operation speed of the corresponding neural network after deleting the convolution kernel;
and under the condition that the operation speed meets the set compression termination condition, determining the compressed neural network based on the convolution layer subjected to the current deletion processing.
In a possible implementation, when the operation speed is measured by the number of floating-point operations (FLOPs), the compression module determines that the operation speed satisfies the set compression termination condition when:
the FLOPs are less than a set threshold.
In a possible embodiment, the apparatus further comprises:
and the adjusting module is used for adjusting the network parameters of the neural network corresponding to the deleted convolution kernel based on the training sample data.
In a possible embodiment, the apparatus further comprises:
the initialization processing module is used for carrying out parameter initialization processing on the compressed network structure of the neural network;
and the training module is used for training the compressed neural network based on training sample data and the network structure of the compressed neural network obtained after parameter initialization processing.
In a possible implementation manner, the deleting module is configured to:
determining the importance degree corresponding to each convolution kernel indicated by the convolution kernel parameter information based on the convolution kernel parameter information;
determining a convolution kernel to be pruned from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel.
In a possible implementation manner, the deleting module is further configured to:
and determining the convolution kernels to be deleted from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel and the preset number of the convolution kernels to be deleted at each time.
In a possible implementation manner, the deleting module is further configured to:
inputting the convolution kernel parameter information into an average pooling layer for feature extraction to obtain first feature data corresponding to the convolution kernel parameter information;
and analyzing and normalizing the first characteristic data to obtain the importance degree corresponding to each convolution kernel.
In a fifth aspect, the present disclosure provides an object detection apparatus, comprising:
the image acquisition module is used for acquiring a target image;
and the detection module is used for performing target detection on the target image by using a target detection neural network, wherein the target detection neural network is obtained using the neural network compression method of the first aspect or any one of its embodiments and has completed training.
In a sixth aspect, the present disclosure provides a running control apparatus comprising:
the road image acquisition module is used for acquiring a road image acquired by the driving device in the driving process;
the target object detection module is used for carrying out target detection on the road image by utilizing a target detection neural network and determining a target object included in the road image; the target detection neural network is obtained using the neural network compression method of the first aspect or any one of its embodiments and has completed training;
a control module for controlling the travel device based on a target object included in the road image.
In a seventh aspect, the present disclosure provides an electronic device, comprising: a processor, a memory and a bus, the memory storing machine readable instructions executable by the processor, the processor and the memory communicating via the bus when the electronic device is operating, the machine readable instructions being executable by the processor to perform the steps of a neural network compression method as set forth in the first aspect or any of the embodiments, the steps of an object detection method as set forth in the second aspect or any of the embodiments, or the steps of a driving control method as set forth in the third aspect or any of the embodiments.
In an eighth aspect, the present disclosure provides a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of a neural network compression method as set forth in the first aspect or any one of the embodiments, the steps of an object detection method as set forth in the second aspect or any one of the embodiments, or the steps of a travel control method as set forth in the third aspect or any one of the embodiments.
In order to make the aforementioned objects, features and advantages of the present disclosure more comprehensible, preferred embodiments accompanied with figures are described in detail below.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required by the embodiments are briefly described below. The drawings, incorporated in and forming a part of the specification, illustrate embodiments consistent with the present disclosure and, together with the description, serve to explain its technical solutions. The following drawings depict only certain embodiments of the disclosure and are therefore not to be considered limiting of its scope; those skilled in the art can derive further related drawings from them without inventive effort.
Fig. 1 illustrates a flow chart of a neural network compression method provided by an embodiment of the present disclosure;
fig. 2 is a schematic flowchart illustrating a process of determining a convolution kernel to be deleted from a plurality of convolution kernels indicated by convolution kernel parameter information based on the convolution kernel parameter information in a neural network compression method provided by an embodiment of the present disclosure;
fig. 3 is a schematic diagram illustrating an architecture of an intermediate neural network for determining importance in a neural network compression method provided by an embodiment of the present disclosure;
fig. 4 is a schematic flowchart illustrating a process of determining, based on convolution kernel parameter information, an importance degree corresponding to each convolution kernel indicated by the convolution kernel parameter information in a neural network compression method provided by an embodiment of the present disclosure;
fig. 5 is a schematic flowchart illustrating determining a compressed neural network based on a convolutional layer after deletion processing in a neural network compression method provided by an embodiment of the present disclosure;
fig. 6 is a schematic flow chart illustrating a target detection method provided by an embodiment of the present disclosure;
fig. 7 is a flowchart illustrating a driving control method according to an embodiment of the present disclosure;
fig. 8 is a schematic diagram illustrating an architecture of a neural network compression apparatus provided by an embodiment of the present disclosure;
fig. 9 is a schematic diagram illustrating an architecture of an object detection apparatus provided in an embodiment of the present disclosure;
fig. 10 is a schematic diagram illustrating an architecture of a driving control device provided in an embodiment of the present disclosure;
fig. 11 shows a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure;
fig. 12 is a schematic structural diagram of an electronic device provided by an embodiment of the present disclosure;
fig. 13 shows a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present disclosure more clear, the technical solutions of the embodiments of the present disclosure will be described clearly and completely with reference to the drawings in the embodiments of the present disclosure, and it is obvious that the described embodiments are only a part of the embodiments of the present disclosure, not all of the embodiments. The components of the embodiments of the present disclosure, generally described and illustrated in the figures herein, can be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure, presented in the figures, is not intended to limit the scope of the claimed disclosure, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the disclosure without making creative efforts, shall fall within the protection scope of the disclosure.
Neural network compression reduces the parameters or storage space of a network by changing its structure or by quantization and approximation, cutting computation cost and storage without compromising the network's performance, so that the compressed network can run on lower-performance devices. For example, once a neural network for target detection and tracking is compressed and deployed to portable mobile devices of lower performance, such as mobile phones and tablet computers, users can detect and track objects in real time. This improves detection and tracking efficiency, expands the range of applications of the neural network, and improves its practicability.
Therefore, in order to enable the neural network to operate on a device with lower performance, the embodiment of the disclosure provides a neural network compression method, which implements compression on the neural network.
For the understanding of the embodiments of the present disclosure, a neural network compression method disclosed in the embodiments of the present disclosure will be described in detail first.
Referring to fig. 1, a schematic flow chart of a neural network compression method provided by an embodiment of the present disclosure is shown, and the method includes S101-S103.
S101, obtaining convolution kernel parameter information of a trained convolution layer of the neural network to be compressed;
S102, determining a convolution kernel to be deleted from a plurality of convolution kernels indicated by convolution kernel parameter information based on the convolution kernel parameter information, and deleting the convolution kernel to be deleted from the plurality of convolution kernels;
S103, determining the compressed neural network based on the deleted convolutional layer.
In the method, the convolution kernels to be deleted are determined based on the convolution kernel parameter information and removed from the plurality of convolution kernels. Reducing the number of convolution kernels in the convolutional layer simplifies the structure of the neural network to be compressed, so the network is compressed at the level of individual convolution kernels: its complexity is reduced, and the processing efficiency of the compressed network during operations such as image recognition is improved.
S101-S103 are described in detail below.
For S101:
In the embodiment of the disclosure, a convolutional layer includes a plurality of convolution kernels, each with a corresponding size. The number of convolution kernels in the layer is the number of output channels of the layer; the size of each convolution kernel can be expressed as filter height × filter width × number of input channels, and the convolution kernel parameter information can be the matrix information of each convolution kernel.
In the embodiment of the present disclosure, the neural network to be compressed may be any trained neural network, for example a neural network for image detection, image segmentation, image classification, or target detection and tracking. The neural network for image detection may in turn be a neural network for face recognition, gait recognition, fingerprint recognition, or the like. The neural network to be compressed may include a plurality of convolutional layers. Specifically, the convolution kernel parameter information of each convolutional layer of the trained network is obtained; then, for each convolutional layer, the convolution kernels to be deleted among the plurality of kernels in that layer are determined based on that layer's convolution kernel parameter information and are deleted.
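S101 amounts to reading out, per convolutional layer, the weight matrix of each kernel. A sketch using a hypothetical two-layer network stored as a dict of NumPy arrays (in a real framework these weights would come from the trained model's parameters):

```python
import numpy as np

# Hypothetical trained network: layer name -> weights of shape
# (out_channels, in_channels, kH, kW).
network = {
    "conv1": np.random.randn(64, 3, 3, 3),
    "conv2": np.random.randn(128, 64, 3, 3),
}

def kernel_parameter_info(network):
    """Per convolutional layer, the list of per-kernel weight matrices (S101)."""
    return {name: [w[i] for i in range(w.shape[0])]
            for name, w in network.items()}

info = kernel_parameter_info(network)
print(len(info["conv1"]), info["conv1"][0].shape)   # 64 (3, 3, 3)
```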
For S102:
in the embodiment of the disclosure, a convolution kernel to be deleted is determined from a plurality of convolution kernels based on convolution kernel parameter information, and the determined convolution kernel to be deleted is deleted.
In one possible implementation, referring to fig. 2, determining a convolution kernel to be pruned from a plurality of convolution kernels indicated by convolution kernel parameter information based on convolution kernel parameter information includes:
s201, determining the importance degree corresponding to each convolution kernel indicated by the convolution kernel parameter information based on the convolution kernel parameter information.
S202, determining the convolution kernels to be deleted from the plurality of convolution kernels based on the corresponding importance degree of each convolution kernel.
In the embodiment of the present disclosure, the importance degree of each convolution kernel may be determined based on a set intermediate neural network for determining the importance degree, where a structure and parameters of the intermediate neural network for determining the importance degree may be determined according to actual needs, and this is not specifically limited in the embodiment of the present disclosure. For example, the convolution kernel parameter information may be input into the intermediate neural network, and the importance degree corresponding to each convolution kernel indicated by the convolution kernel parameter information is obtained.
In the embodiment of the present disclosure, the importance degree of each convolution kernel may be determined based on the acquired parameter information of that kernel, and the convolution kernels to be deleted determined from those importance degrees. For example, a deletion count may be set: if the set number of deletions is 2, the 2 convolution kernels with the lowest importance are selected as the kernels to be deleted, and those 2 kernels are deleted. Alternatively, an importance threshold may be set: if the threshold is 0.9, every convolution kernel whose importance is below 0.9 is determined to be a kernel to be deleted, and those kernels are deleted.
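The two selection rules in this paragraph (a fixed deletion count, or an importance threshold) can be sketched directly; the scores below are made up to mirror the 0.9-threshold example.

```python
import numpy as np

def prune_by_count(importance, count):
    # indices of the `count` lowest-importance kernels
    return sorted(np.argsort(importance)[:count].tolist())

def prune_by_threshold(importance, threshold):
    # indices of every kernel whose importance falls below the threshold
    return [i for i, s in enumerate(importance) if s < threshold]

scores = np.array([0.95, 0.40, 0.88, 0.10])
print(prune_by_count(scores, 2))         # [1, 3]
print(prune_by_threshold(scores, 0.9))   # [1, 2, 3]
```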
In the embodiment of the disclosure, the convolution kernel to be deleted is determined based on the importance degree of the convolution kernel, the convolution kernel with low importance degree can be deleted, the situation that the accuracy of the compressed neural network is low due to the deletion of the convolution kernel with high importance degree is avoided, and the performance of the compressed neural network is guaranteed.
In one possible embodiment, determining a convolution kernel to be pruned from a plurality of convolution kernels based on the importance degree corresponding to each convolution kernel includes:
and determining the convolution kernels to be deleted from the plurality of convolution kernels based on the corresponding importance degree of each convolution kernel and the preset number of the convolution kernels to be deleted at each time.
In the embodiment of the present disclosure, the number of convolution kernels to delete each time may be set according to actual needs; this is not specifically limited here. For example, if a convolutional layer contains many convolution kernels, the preset per-round deletion count may be larger: with 96 kernels, the count may be set to 5. Conversely, if the layer contains few kernels, the per-round deletion count may be smaller: with 36 kernels, it may be set to 2.
In the embodiment of the present disclosure, if the neural network to be compressed includes a plurality of convolutional layers, the number of convolution kernels may differ from layer to layer. One per-deletion number may be set for all convolutional layers of the network; for example, if that number is 2, then 2 convolution kernels are deleted from each convolutional layer per iteration. Alternatively, a separate per-deletion number may be set for each convolutional layer, so that layers with different numbers of convolution kernels delete different numbers of kernels per iteration. For example, if neural network A includes a first convolutional layer with 64 convolution kernels and a second convolutional layer with 128 convolution kernels, the per-deletion number for the first convolutional layer may be 2 and that for the second convolutional layer may be 4.
In the embodiment of the disclosure, based on the importance degree corresponding to each convolution kernel and the number of the deletion convolution kernels at each time, a preset number of convolution kernels with lower importance degrees are determined as the convolution kernels to be deleted.
In the above embodiment, the number of convolution kernels deleted each time is preset, and each deletion pass over the convolutional layer removes exactly that preset number. This prevents a single pass from deleting too many convolution kernels, which would degrade the performance of the compressed neural network; it thus guarantees the performance of the compressed neural network while avoiding excessive deletion of convolution kernels from the neural network to be compressed.
Illustratively, referring to fig. 3, the intermediate neural network for determining the importance level may include an average pooling layer 301, at least one fully-connected layer 302, and an activation function layer 303, and the number of fully-connected layers may be determined according to actual needs. The number of fully-connected layers may be two, that is, the intermediate neural network may include an average pooling layer, a first fully-connected layer, a second fully-connected layer, and an activation function layer. Alternatively, the intermediate neural network for determining the degree of importance may further include an average pooling layer, at least one convolution layer, and an activation function layer. The intermediate neural network may also include other structural forms, and the embodiments of the present disclosure are only exemplary.
Illustratively, referring to fig. 4, determining the importance level corresponding to each convolution kernel indicated by the convolution kernel parameter information based on the convolution kernel parameter information includes:
s401, inputting convolution kernel parameter information into an average pooling layer for feature extraction to obtain first feature data corresponding to the convolution kernel parameter information;
s402, analyzing and normalizing the first characteristic data to obtain the importance degree corresponding to each convolution kernel.
For example, the initial importance degree of each convolution kernel indicated by the convolution kernel parameters may be set to 1. The convolution kernel parameters carrying these initial importance degrees are then input into the average pooling layer, which performs feature extraction on the convolution kernel parameter information to obtain the first feature data corresponding to it. The first feature data may then be analyzed and normalized to obtain the importance degree of each convolution kernel, where each importance degree is a value between 0 and 1. In a specific application, the first feature data may be input into at least one fully-connected layer for analysis, and the second feature data obtained after that analysis input into the activation function layer for processing, yielding the importance degree of each convolution kernel. For example, the resulting importance degree of convolution kernel A may be 0.95 and that of convolution kernel B may be 0.90, indicating that convolution kernel A is more important than convolution kernel B: the importance degree of A decreased less from its initial value of 1, while that of B decreased more.
In the embodiment of the present disclosure, when determining the convolution kernels to be deleted from the plurality of convolution kernels based on the importance degree of each convolution kernel, a convolution kernel whose importance degree has decreased more may be determined to be a convolution kernel to be deleted. For example, an importance-reduction threshold may be set, and each convolution kernel whose importance-degree reduction is greater than that threshold determined to be a convolution kernel to be deleted; or the number of convolution kernels deleted each time may be preset, and that number of the least-important convolution kernels determined to be the convolution kernels to be deleted.
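A minimal NumPy sketch of the fig. 3 structure (average pooling layer, two fully-connected layers, activation function layer) might look as follows. The weight shapes, the ReLU after the first fully-connected layer, and the sigmoid as the activation are assumptions made for illustration; the disclosure fixes only the pooling / fully-connected / activation ordering.

```python
import numpy as np

def kernel_importance(kernels, w1, b1, w2, b2):
    """S401: average pooling over each kernel's parameters gives the
    first feature data; S402: two fully-connected layers plus a sigmoid
    normalize each score into a value between 0 and 1."""
    pooled = kernels.mean(axis=(1, 2, 3))        # first feature data, one value per kernel
    hidden = np.maximum(pooled @ w1 + b1, 0.0)   # first fully-connected layer (ReLU assumed)
    logits = hidden @ w2 + b2                    # second fully-connected layer
    return 1.0 / (1.0 + np.exp(-logits))         # activation function layer (sigmoid)

rng = np.random.default_rng(0)
kernels = rng.standard_normal((8, 3, 3, 3))      # 8 kernels of a 3-channel 3x3 conv layer
w1, b1 = rng.standard_normal((8, 16)), np.zeros(16)
w2, b2 = rng.standard_normal((16, 8)), np.zeros(8)
scores = kernel_importance(kernels, w1, b1, w2, b2)
print(scores.shape)  # (8,) -- one importance degree per convolution kernel
```

Because the sigmoid output is strictly between 0 and 1, each kernel's score can be compared directly against an importance threshold such as the 0.9 used in the example above.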
For S103:
in the embodiment of the present disclosure, after deleting the convolution kernel to be deleted, the compressed neural network is determined based on the deleted convolution layer.
In one possible embodiment, referring to fig. 5, the determining a compressed neural network based on the removed convolutional layer includes:
s501, determining the operation speed of the corresponding neural network after the convolution kernel is deleted.
And S502, under the condition that the operation speed does not meet the set compression termination condition, taking the neural network corresponding to the deleted convolution kernel as a new neural network to be compressed, returning to the step of acquiring the convolution kernel parameter information of the trained convolution layer of the neural network to be compressed until the operation speed meets the set compression termination condition, and determining the compressed neural network based on the convolution layer subjected to the last deletion.
In one possible embodiment, determining a compressed neural network based on the removed convolutional layer includes:
s501, determining the operation speed of the corresponding neural network after the convolution kernel is deleted.
And S503, under the condition that the operation speed meets the set compression termination condition, determining the compressed neural network based on the convolution layer subjected to the current deletion processing.
In the embodiment of the present disclosure, the operation speed of the neural network may be measured in floating-point operations per second (FLOPs); alternatively, it may be measured by the number of multiply-accumulate operations (MACC), for example.
In one possible embodiment, when the operation speed is the number of floating-point operations per second FLOPs, the operation speed satisfies the set compression termination condition, including:
FLOPs are less than a set threshold.
In the embodiment of the present disclosure, the neural network to be compressed first undergoes one pass of convolution-kernel deletion; if the FLOPs of the resulting neural network do not satisfy the set compression termination condition, further deletion passes are performed until the FLOPs of the network obtained after deletion are smaller than the set threshold. Specifically, after each deletion of the convolution kernels to be deleted from the plurality of convolution kernels, the FLOPs of the corresponding compressed neural network are determined. If the determined FLOPs are smaller than the set threshold, compression of the neural network is complete, and the compressed neural network is determined based on the convolutional layer after the current deletion. If the determined FLOPs are greater than or equal to the set threshold, the parameters of the compressed neural network may be adjusted, the parameter-adjusted network taken as the new neural network to be compressed, the convolution kernel parameter information of its convolutional layers acquired, the convolution kernels to be deleted determined based on that information, and deletion performed again, until the determined FLOPs are smaller than the set threshold.
For example, the FLOPs of each layer included in the neural network may be determined according to the structural parameters of the compressed neural network (i.e., parameter information of the convolutional layer included in the neural network, and/or parameter information of the fully-connected layer, and/or parameter information of the pooling layer, etc.), and then the FLOPs of the compressed neural network may be obtained based on the FLOPs of each layer.
In the embodiment of the present disclosure, the threshold of the FLOPs may be set according to actual needs, for example, the threshold of the FLOPs may be determined according to the performance of the device that operates the neural network to be compressed, or the threshold of the FLOPs may also be any value that is set by a user and corresponds to the performance of the device.
In the above embodiment, since the FLOPs can more accurately characterize the complexity of the compressed neural network, whether to terminate the compression process on the neural network is determined based on the FLOPs of the compressed neural network.
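As a rough illustration of the per-layer estimate described above, the multiply-accumulate count of a convolutional layer can be computed from its structural parameters and summed over the layers. The formula and the two-layer example are assumptions for illustration only; they also show why deleting kernels from one layer shrinks both that layer's output channels and the next layer's input channels, so both layers become cheaper.

```python
def conv_flops(out_h, out_w, out_c, in_c, kh, kw):
    # multiply-accumulate count of one convolutional layer
    # (conventions vary; some definitions double this figure)
    return out_h * out_w * out_c * in_c * kh * kw

# hypothetical network: (out_h, out_w, out_c, in_c, kh, kw) per layer
before = [(112, 112, 64, 3, 7, 7), (56, 56, 128, 64, 3, 3)]
# deleting 2 kernels from layer 1 reduces its out_c and layer 2's in_c
after = [(112, 112, 62, 3, 7, 7), (56, 56, 128, 62, 3, 3)]

total_before = sum(conv_flops(*layer) for layer in before)
total_after = sum(conv_flops(*layer) for layer in after)
print(total_after < total_before)  # True: both layers got cheaper
```

The compression loop then simply compares `total_after` against the set threshold to decide whether another deletion pass is needed.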
In one possible embodiment, after deleting a convolution kernel to be deleted from a plurality of convolution kernels, before determining an operation speed of a neural network corresponding to the deletion of the current convolution kernel, the method further includes:
and adjusting the network parameters of the neural network corresponding to the deleted convolution kernel based on the training sample data.
In the embodiment of the present disclosure, after deleting the convolution kernel to be deleted from the plurality of convolution kernels each time, the network parameter of the neural network corresponding to the deletion of the convolution kernel at this time may be adjusted.
Illustratively, training sample data is input into the compressed neural network corresponding to the current deletion, and that network is trained until it satisfies a preset condition, for example until its accuracy is greater than a first accuracy threshold, or until its loss value is less than a first loss threshold. When the network parameters of the compressed neural network are adjusted in this way, the adjusted parameters are those of the network structure that remains after the convolution kernels were deleted. The training sample data used to adjust the network parameters of the compressed neural network may be the same as the training sample data used to train the neural network to be compressed.
In the above embodiment, based on the training sample data, the network parameters of the compressed neural network corresponding to the deletion of the convolution kernel at this time are adjusted, and the lost information in the compressed neural network can be compensated, so that the convolution kernel to be deleted can be accurately determined when the compressed neural network is recompressed next time, and the performance of the recompressed neural network is ensured.
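The compensating effect of this parameter adjustment can be seen on a toy problem. Here a linear layer whose columns play the role of convolution kernels stands in for a real convolutional layer; this is a deliberate simplification for illustration, not the disclosure's actual network or training procedure.

```python
import numpy as np

rng = np.random.default_rng(1)

# toy "layer": y = X @ w, where the columns of X act as kernels
X = rng.standard_normal((64, 4))
w_true = np.array([1.5, 0.02, -2.0, 0.03])   # kernels 1 and 3 contribute little
y = X @ w_true

keep = [0, 2]                   # delete the two low-importance kernels
Xk = X[:, keep]
wk = rng.standard_normal(2)     # parameters of the remaining structure

# adjust the remaining parameters on the training sample data to
# compensate for the information lost by the deletion
for _ in range(300):
    grad = Xk.T @ (Xk @ wk - y) / len(y)
    wk -= 0.1 * grad

loss = float(np.mean((Xk @ wk - y) ** 2))
print(loss < 0.01)  # True: fine-tuning recovered most of the accuracy
```

After fine-tuning, the surviving parameters absorb most of what the deleted kernels contributed, which is why the next importance estimate on the adjusted network is more reliable.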
In one possible embodiment, after determining the compressed neural network, the method further comprises:
carrying out parameter initialization processing on the compressed network structure of the neural network;
and training the compressed neural network based on the training sample data and the network structure of the compressed neural network obtained after parameter initialization processing.
In the embodiment of the present disclosure, if the determined operation speed satisfies the set compression termination condition, that is, when the determined FLOPs is smaller than the set threshold, it is characterized that the compression of the neural network to be compressed is completed, and further, the finally obtained compressed neural network may be trained based on training sample data and the finally obtained network structure of the compressed neural network.
For example, parameter initialization processing may be performed on the finally obtained parameter information of the network structure of the compressed neural network, and training sample data may be input into the compressed neural network after the parameter initialization processing, and the finally obtained compressed neural network may be trained until the finally obtained compressed neural network meets the set condition. For example, until the accuracy of the finally obtained compressed neural network is greater than a second accuracy threshold, or until the loss value of the finally obtained compressed neural network is less than a second loss threshold, and the like, where the second accuracy threshold may be the same as or different from the first accuracy threshold; and the second loss threshold may or may not be the same as the first loss threshold. Further, the training sample data used for training the compressed neural network obtained finally and the training sample data used for adjusting the network parameters of the compressed neural network may be the same training sample data.
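Parameter initialization of the final compressed structure might look like the following. The He-style fan-in scaling is an assumed choice; the disclosure only requires that the compressed network structure be re-initialized before this final training run.

```python
import numpy as np

rng = np.random.default_rng(2)

def reinit_conv(num_kernels, in_c, kh, kw):
    """He-style re-initialization (an assumed scheme) of a pruned
    convolutional layer: zero-mean weights scaled by the fan-in."""
    fan_in = in_c * kh * kw
    return rng.standard_normal((num_kernels, in_c, kh, kw)) * np.sqrt(2.0 / fan_in)

# the compressed structure kept 62 of the original 64 kernels
w = reinit_conv(62, 32, 3, 3)
print(w.shape)  # (62, 32, 3, 3); training then proceeds on this structure
```

Training from these fresh parameters, rather than from the fine-tuned ones, gives the compressed structure a full training run of its own, which is the step the following paragraph says guarantees its final performance.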
In the above embodiment, the performance of the finally obtained compressed neural network is ensured by training the finally obtained compressed neural network.
It will be understood by those skilled in the art that, in the method of the present disclosure, the order in which the steps are written does not imply a strict order of execution or impose any limitation on the implementation; the specific order of execution of the steps should be determined by their function and possible inherent logic.
In addition, the embodiment of the present disclosure further provides a target detection method, which is shown in fig. 6 and includes S601-S602. Specifically, the method comprises the following steps:
s601, acquiring a target image;
and S602, performing target detection on the target image by using a target detection neural network, wherein the target detection neural network is obtained by adopting the neural network compression method provided by the embodiment of the disclosure, and training is completed.
In addition, the embodiment of the disclosure also provides a driving control method, which includes S701-S703, as shown in fig. 7. Specifically, the method comprises the following steps:
s701, acquiring a road image acquired by a driving device in the driving process.
S702, carrying out target detection on the road image by using a target detection neural network, and determining a target object included in the road image; the target detection neural network is obtained by adopting the neural network compression method provided by the embodiment of the disclosure, and training is completed.
S703, controlling the running device based on a target object included in the road image.
For example, the traveling device may be an autonomous vehicle, a vehicle equipped with an Advanced Driving Assistance System (ADAS), a robot, or the like. The road image may be an image acquired by the driving device in real time during driving. The target object may be any sign, and/or any object that may be present in the road. For example, the target object may be a road sign such as a zebra crossing or a right turn, a signboard on which parking is prohibited, an animal or a pedestrian appearing on the road, or another vehicle on the road.
When the driving device is controlled, the driving device can be controlled to accelerate, decelerate, turn, brake and the like, or voice prompt information can be played to prompt a driver to control the driving device to accelerate, decelerate, turn, brake and the like.
Based on the same concept, an embodiment of the present disclosure further provides a neural network compression apparatus, as shown in fig. 8, which is an architecture schematic diagram of the neural network compression apparatus provided in the embodiment of the present disclosure, and includes an obtaining module 801, a deleting module 802, and a compressing module 803, specifically:
an obtaining module 801, configured to obtain convolutional kernel parameter information of a convolutional layer of a trained neural network to be compressed;
a deleting module 802, configured to determine a convolution kernel to be deleted from the plurality of convolution kernels indicated by the convolution kernel parameter information based on the convolution kernel parameter information, and delete the convolution kernel to be deleted from the plurality of convolution kernels;
a compressing module 803, configured to determine a compressed neural network based on the removed convolutional layer.
In a possible implementation, the compressing module 803 is configured to:
determining the operation speed of the corresponding neural network after deleting the convolution kernel;
and under the condition that the operation speed does not meet the set compression termination condition, taking the neural network corresponding to the deleted convolution kernel as a new neural network to be compressed, returning to the step of acquiring the convolution kernel parameter information of the trained convolution layer of the neural network to be compressed until the operation speed meets the set compression termination condition, and determining the compressed neural network based on the convolution layer subjected to the last deletion.
In a possible implementation, the compressing module 803 is further configured to:
determining the operation speed of the corresponding neural network after deleting the convolution kernel;
and under the condition that the operation speed meets the set compression termination condition, determining the compressed neural network based on the convolution layer subjected to the current deletion processing.
In a possible implementation, the compressing module 803 is further configured to:
the FLOPs are less than a set threshold.
In a possible embodiment, the apparatus further comprises:
and the adjusting module is used for adjusting the network parameters of the neural network corresponding to the deleted convolution kernel based on the training sample data.
In a possible embodiment, the apparatus further comprises:
the initialization processing module is used for carrying out parameter initialization processing on the compressed network structure of the neural network;
and the training module is used for training the compressed neural network based on training sample data and the network structure of the compressed neural network obtained after parameter initialization processing.
In a possible implementation manner, the deleting module 802 is configured to:
determining the importance degree corresponding to each convolution kernel indicated by the convolution kernel parameter information based on the convolution kernel parameter information;
determining a convolution kernel to be pruned from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel.
In a possible implementation manner, the deleting module 802 is further configured to:
and determining the convolution kernels to be deleted from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel and the preset number of the convolution kernels to be deleted at each time.
In a possible implementation manner, the deleting module 802 is further configured to:
inputting the convolution kernel parameter information into an average pooling layer for feature extraction to obtain first feature data corresponding to the convolution kernel parameter information;
and analyzing and normalizing the first characteristic data to obtain the importance degree corresponding to each convolution kernel.
The embodiment of the present disclosure further provides a target detection apparatus, as shown in fig. 9, which is an architecture schematic diagram of the target detection apparatus provided in the embodiment of the present disclosure, and the target detection apparatus includes an image acquisition module 901 and a detection module 902, specifically:
an image obtaining module 901, configured to obtain a target image;
a detection module 902, configured to perform target detection on the target image by using a target detection neural network, where the target detection neural network is obtained by using the neural network compression method described in the first aspect or any embodiment of the first aspect, and the training is completed.
The embodiment of the present disclosure further provides a driving device control device, as shown in fig. 10, which is a schematic structural diagram of the driving device control device provided in the embodiment of the present disclosure, and includes a road image acquisition module 1001, a target object detection module 1002, and a control module 1003, specifically:
a road image acquisition module 1001 configured to acquire a road image acquired by a driving device during driving;
a target object detection module 1002, configured to perform target detection on the road image by using a target detection neural network, and determine a target object included in the road image; the target detection neural network is obtained by adopting the neural network compression method of the first aspect or any embodiment of the first aspect, and training is completed;
a control module 1003 for controlling the running device based on a target object included in the road image.
In some embodiments, the functions of the apparatus provided in the embodiments of the present disclosure, or the modules it includes, may be used to execute the method described in the above method embodiments; for their specific implementation, refer to the description of the above method embodiments, which, for brevity, is not repeated here.
Based on the same technical concept, the embodiment of the disclosure also provides an electronic device. Referring to fig. 11, a schematic structural diagram of an electronic device provided in the embodiment of the present disclosure includes a processor 1101, a memory 1102, and a bus 1103. The storage 1102 is used for storing execution instructions and includes a memory 11021 and an external storage 11022; the memory 11021 is also referred to as an internal memory, and temporarily stores operation data in the processor 1101 and data exchanged with an external memory 11022 such as a hard disk, the processor 1101 exchanges data with the external memory 11022 through the memory 11021, and when the electronic device 1100 operates, the processor 1101 communicates with the memory 1102 through the bus 1103, so that the processor 1101 executes the following instructions:
acquiring convolution kernel parameter information of a trained convolution layer of the neural network to be compressed;
determining a convolution kernel to be deleted from a plurality of convolution kernels indicated by the convolution kernel parameter information based on the convolution kernel parameter information, and deleting the convolution kernel to be deleted from the plurality of convolution kernels;
and determining the compressed neural network based on the deleted convolutional layer.
Based on the same technical concept, the embodiment of the disclosure also provides an electronic device. Referring to fig. 12, a schematic structural diagram of an electronic device provided in the embodiment of the present disclosure includes a processor 1201, a memory 1202, and a bus 1203. The storage 1202 is used for storing execution instructions, and includes a memory 12021 and an external storage 12022; the memory 12021 is also referred to as an internal memory, and is used to temporarily store operation data in the processor 1201 and data exchanged with an external memory 12022 such as a hard disk, the processor 1201 exchanges data with the external memory 12022 through the memory 12021, and when the electronic apparatus 1200 is operated, the processor 1201 and the memory 1202 communicate with each other through the bus 1203 to cause the processor 1201 to execute the following instructions:
acquiring a target image;
and performing target detection on the target image by using a target detection neural network, wherein the target detection neural network is obtained by adopting the neural network compression method of the first aspect or any embodiment of the first aspect, and training is completed.
Based on the same technical concept, the embodiment of the disclosure also provides an electronic device. Referring to fig. 13, a schematic structural diagram of an electronic device provided in the embodiment of the present disclosure includes a processor 1301, a memory 1302, and a bus 1303. The storage 1302 is used for storing execution instructions and includes a memory 13021 and an external storage 13022; the memory 13021 is also referred to as an internal memory, and is used for temporarily storing operation data in the processor 1301 and data exchanged with an external storage 13022 such as a hard disk, the processor 1301 exchanges data with the external storage 13022 through the memory 13021, and when the electronic device 1300 operates, the processor 1301 and the storage 1302 communicate through the bus 1303, so that the processor 1301 executes the following instructions:
acquiring a road image acquired by a driving device in the driving process;
carrying out target detection on the road image by using a target detection neural network, and determining a target object included in the road image; the target detection neural network is obtained by adopting the neural network compression method of the first aspect or any embodiment of the first aspect, and training is completed;
controlling the travel device based on a target object included in the road image.
Furthermore, an embodiment of the present disclosure also provides a computer-readable storage medium, on which a computer program is stored, where the computer program is executed by a processor to perform the steps of the neural network compression method described in the above-mentioned method embodiment, or the steps of the target detection method described in the above-mentioned embodiment, or the steps of the driving control method described in the above-mentioned embodiment.
The computer program product of the neural network compression method provided in the embodiments of the present disclosure includes a computer-readable storage medium storing a program code, where instructions included in the program code may be used to execute the steps of the neural network compression method described in the above method embodiments, which may be referred to in the above method embodiments specifically, and are not described herein again.
It is clear to those skilled in the art that, for convenience and brevity of description, the specific working processes of the system and the apparatus described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed system, apparatus, and method may be implemented in other ways. The above-described embodiments of the apparatus are merely illustrative, and for example, the division of the units is only one logical division, and there may be other divisions when actually implemented, and for example, a plurality of units or components may be combined or integrated into another system, or some features may be omitted, or not executed. In addition, the shown or discussed mutual coupling or direct coupling or communication connection may be an indirect coupling or communication connection of devices or units through some communication interfaces, and may be in an electrical, mechanical or other form.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units are integrated into one unit.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a non-volatile computer-readable storage medium executable by a processor. Based on such understanding, the technical solution of the present disclosure may be embodied in the form of a software product, which is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present disclosure. And the aforementioned storage medium includes: various media capable of storing program codes, such as a usb disk, a removable hard disk, a Read-only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk.
The above are only specific embodiments of the present disclosure, but the scope of the present disclosure is not limited thereto, and any person skilled in the art can easily conceive of changes or substitutions within the technical scope of the present disclosure, and shall be covered by the scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (16)

1. A neural network compression method, comprising:
acquiring convolution kernel parameter information of a trained convolution layer of the neural network to be compressed;
determining a convolution kernel to be deleted from a plurality of convolution kernels indicated by the convolution kernel parameter information based on the convolution kernel parameter information, and deleting the convolution kernel to be deleted from the plurality of convolution kernels;
and determining the compressed neural network based on the deleted convolutional layer.
2. The method of claim 1, wherein determining the compressed neural network based on the pruned convolutional layer comprises:
determining the operation speed of the corresponding neural network after deleting the convolution kernel;
and under the condition that the operation speed does not meet a set compression termination condition, taking the neural network corresponding to the current deletion of the convolution kernel as a new neural network to be compressed and returning to the step of acquiring the convolution kernel parameter information of the trained convolutional layer of the neural network to be compressed, until the operation speed meets the set compression termination condition, and determining the compressed neural network based on the convolutional layer subjected to the last deletion processing.
3. The method of claim 1, wherein determining the compressed neural network based on the convolutional layer subjected to the deletion processing comprises:
determining the operation speed of the corresponding neural network after deleting the convolution kernel;
and under the condition that the operation speed meets the set compression termination condition, determining the compressed neural network based on the convolution layer subjected to the current deletion processing.
4. The method according to claim 2 or 3, wherein, in a case where the operation speed is measured in floating-point operations (FLOPs), the operation speed meeting the set compression termination condition comprises:
the FLOPs being less than a set threshold.
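The iterative loop of claims 2 to 4 can be sketched as repeated kernel deletion until an estimated operation count falls below a set threshold. The per-layer FLOPs formula (multiply-accumulates of a standard convolution) and the one-kernel step size below are illustrative assumptions, not details fixed by the claims.

```python
# Illustrative sketch of the claims 2-4 loop: prune until estimated
# FLOPs drop below a set threshold (formula and step size assumed).

def conv_flops(out_ch, in_ch, k, h, w):
    # multiply-accumulates for one k x k conv layer on an h x w feature map
    return out_ch * in_ch * k * k * h * w

def prune_until(out_ch, in_ch, k, h, w, flops_threshold, step=1):
    """Delete `step` kernels (output channels) per iteration until the
    estimated operation count meets the termination condition."""
    while conv_flops(out_ch, in_ch, k, h, w) >= flops_threshold and out_ch > step:
        out_ch -= step
    return out_ch

channels = prune_until(out_ch=64, in_ch=32, k=3, h=28, w=28,
                       flops_threshold=10_000_000)
```

With these numbers each output channel costs 32 × 9 × 784 = 225,792 operations, so the loop stops at 44 channels, the largest count below the 10 M threshold.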
5. The method according to claim 2 or 3, wherein, after the convolution kernel to be deleted is deleted from the plurality of convolution kernels and before the operation speed of the neural network corresponding to the current deletion is determined, the method further comprises:
adjusting network parameters of the neural network corresponding to the current deletion of the convolution kernel based on training sample data.
6. The method of any one of claims 1 to 5, wherein after determining the compressed neural network, the method further comprises:
performing parameter initialization processing on the network structure of the compressed neural network;
and training the compressed neural network based on training sample data and the network structure of the compressed neural network obtained after parameter initialization processing.
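The re-initialization step of claim 6 can be sketched as follows: the pruned architecture is kept, but its parameters are reset before retraining. The uniform initialization range and fixed seed are assumptions made for illustration.

```python
# Illustrative sketch of claim 6: keep the pruned structure, reset the
# parameters (uniform init range and seed are assumed, not claimed).
import random

def reinitialize(kernels, seed=0, bound=0.1):
    """Return same-shaped kernels with freshly drawn uniform weights."""
    rng = random.Random(seed)
    return [[[rng.uniform(-bound, bound) for _ in row] for row in k]
            for k in kernels]

pruned = [[[1.0, 2.0], [3.0, 4.0]]]   # architecture surviving the pruning
fresh = reinitialize(pruned)           # same shape, new parameters to train
```

Training then proceeds on `fresh` with the sample data; only the structure, not the old weights, carries over from the compression stage.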
7. The method according to any one of claims 1 to 6, wherein determining a convolution kernel to be deleted from a plurality of convolution kernels indicated by the convolution kernel parameter information based on the convolution kernel parameter information comprises:
determining the importance degree corresponding to each convolution kernel indicated by the convolution kernel parameter information based on the convolution kernel parameter information;
determining the convolution kernel to be deleted from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel.
8. The method of claim 7, wherein determining the convolution kernel to be deleted from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel comprises:
determining the convolution kernels to be deleted from the plurality of convolution kernels based on the importance degree corresponding to each convolution kernel and a preset number of convolution kernels to be deleted each time.
9. The method of claim 7, wherein determining the importance level corresponding to each convolution kernel indicated by the convolution kernel parameter information based on the convolution kernel parameter information comprises:
inputting the convolution kernel parameter information into an average pooling layer for feature extraction to obtain first feature data corresponding to the convolution kernel parameter information;
and performing analysis and normalization processing on the first feature data to obtain the importance degree corresponding to each convolution kernel.
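A minimal sketch of the claim-9 pipeline, assuming the average pooling reduces each kernel to the mean of its absolute weights and the normalization makes the scores sum to one; both details are assumptions for illustration, as the claim leaves the pooling and normalization specifics open.

```python
# Illustrative sketch of claim 9 (assumed details: mean-of-absolute-
# weights pooling and sum-to-one normalization).

def kernel_importance(kernels):
    feats = []
    for kernel in kernels:
        flat = [w for row in kernel for w in row]
        # "average pooling": collapse each kernel to one feature value
        feats.append(sum(abs(w) for w in flat) / len(flat))
    total = sum(feats)
    # normalization: importance degrees across kernels sum to one
    return [f / total for f in feats]

scores = kernel_importance([[[0.2, 0.2], [0.2, 0.2]],
                            [[0.6, 0.6], [0.6, 0.6]]])
```

The resulting scores (0.25 and 0.75 here) can feed directly into the top-k selection of claim 8.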
10. A target detection method, comprising:
acquiring a target image;
and performing target detection on the target image by using a target detection neural network, wherein the target detection neural network is a trained neural network obtained by using the neural network compression method according to any one of claims 1 to 9.
11. A driving control method, characterized by comprising:
acquiring a road image acquired by a driving device in the driving process;
performing target detection on the road image by using a target detection neural network, and determining a target object included in the road image, wherein the target detection neural network is a trained neural network obtained by using the neural network compression method according to any one of claims 1 to 9;
and controlling the driving device based on the target object included in the road image.
12. A neural network compression device, comprising:
the acquisition module is used for acquiring the convolution kernel parameter information of the trained convolutional layer of the neural network to be compressed;
a deleting module, configured to determine a convolution kernel to be deleted from the plurality of convolution kernels indicated by the convolution kernel parameter information based on the convolution kernel parameter information, and delete the convolution kernel to be deleted from the plurality of convolution kernels;
and the compression module is used for determining the compressed neural network based on the convolutional layer subjected to the deletion processing.
13. A target detection device, comprising:
the image acquisition module is used for acquiring a target image;
a detection module, configured to perform target detection on the target image by using a target detection neural network, wherein the target detection neural network is a trained neural network obtained by using the neural network compression method according to any one of claims 1 to 9.
14. A driving control device, characterized by comprising:
the road image acquisition module is used for acquiring a road image acquired by the driving device in the driving process;
the target object detection module is used for performing target detection on the road image by using a target detection neural network and determining a target object included in the road image, wherein the target detection neural network is a trained neural network obtained by using the neural network compression method according to any one of claims 1 to 9;
and a control module for controlling the driving device based on the target object included in the road image.
15. An electronic device, comprising: a processor, a memory, and a bus, wherein the memory stores machine-readable instructions executable by the processor; when the electronic device operates, the processor communicates with the memory via the bus; and the machine-readable instructions, when executed by the processor, perform the steps of the neural network compression method according to any one of claims 1 to 9, the steps of the target detection method according to claim 10, or the steps of the driving control method according to claim 11.
16. A computer-readable storage medium having a computer program stored thereon, wherein the computer program, when executed by a processor, performs the steps of the neural network compression method according to any one of claims 1 to 9, the steps of the target detection method according to claim 10, or the steps of the driving control method according to claim 11.
CN202010123508.6A 2020-02-27 2020-02-27 Neural network compression method, target detection method, driving control method and device Pending CN111340223A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010123508.6A CN111340223A (en) 2020-02-27 2020-02-27 Neural network compression method, target detection method, driving control method and device


Publications (1)

Publication Number Publication Date
CN111340223A 2020-06-26

Family

ID=71183809

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010123508.6A Pending CN111340223A (en) 2020-02-27 2020-02-27 Neural network compression method, target detection method, driving control method and device

Country Status (1)

Country Link
CN (1) CN111340223A (en)


Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108898168A (en) * 2018-06-19 2018-11-27 清华大学 The compression method and system of convolutional neural networks model for target detection
CN109522949A (en) * 2018-11-07 2019-03-26 北京交通大学 Model of Target Recognition method for building up and device
CN109635939A (en) * 2019-01-07 2019-04-16 北京邮电大学 A kind of determination method and device of the convolutional neural networks based on cutting
CN110378239A (en) * 2019-06-25 2019-10-25 江苏大学 A kind of real-time traffic marker detection method based on deep learning
CN110647990A (en) * 2019-09-18 2020-01-03 无锡信捷电气股份有限公司 Cutting method of deep convolutional neural network model based on grey correlation analysis
CN110826684A (en) * 2018-08-08 2020-02-21 北京交通大学 Convolutional neural network compression method, convolutional neural network compression device, electronic device, and medium


Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021179631A1 (en) * 2020-09-23 2021-09-16 平安科技(深圳)有限公司 Convolutional neural network model compression method, apparatus and device, and storage medium
CN112949827A (en) * 2021-02-25 2021-06-11 商汤集团有限公司 Neural network generation, data processing and intelligent driving control method and device
CN112949827B (en) * 2021-02-25 2024-05-21 商汤集团有限公司 Neural network generation, data processing and intelligent driving control method and device
CN114647234A (en) * 2022-05-23 2022-06-21 医链数科医疗科技(江苏)有限公司 Medical equipment monitoring method and device based on Internet of things and storage medium
CN114647234B (en) * 2022-05-23 2022-08-09 医链数科医疗科技(江苏)有限公司 Medical equipment monitoring method and device based on Internet of things and storage medium

Similar Documents

Publication Publication Date Title
CN108805185B (en) Face recognition method and device, storage medium and computer equipment
CN111340223A (en) Neural network compression method, target detection method, driving control method and device
EP3617946A1 (en) Context acquisition method and device based on voice interaction
EP3215981B1 (en) Nonparametric model for detection of spatially diverse temporal patterns
CN111783882B (en) Key point detection method and device, electronic equipment and storage medium
CN111597884A (en) Facial action unit identification method and device, electronic equipment and storage medium
EP4322056A1 (en) Model training method and apparatus
CN110222607B (en) Method, device and system for detecting key points of human face
EP4273754A1 (en) Neural network training method and related device
US11636712B2 (en) Dynamic gesture recognition method, device and computer-readable storage medium
US20230334893A1 (en) Method for optimizing human body posture recognition model, device and computer-readable storage medium
CN113869282A (en) Face recognition method, hyper-resolution model training method and related equipment
CN113706502A (en) Method and device for evaluating quality of face image
CN112749576B (en) Image recognition method and device, computing equipment and computer storage medium
CN110717407A (en) Human face recognition method, device and storage medium based on lip language password
CN114168768A (en) Image retrieval method and related equipment
CN116227573B (en) Segmentation model training method, image segmentation device and related media
CN113160987B (en) Health state prediction method, apparatus, computer device and storage medium
CN111523548A (en) Image semantic segmentation and intelligent driving control method and device
CN115526310A (en) Network model quantification method, device and equipment
CN115439848A (en) Scene recognition method, device, equipment and storage medium
CN112215840A (en) Image detection method, image detection device, driving control method, driving control device, electronic equipment and storage medium
CN113855020A (en) Method and device for emotion recognition, computer equipment and storage medium
CN112489687A (en) Speech emotion recognition method and device based on sequence convolution
CN112183669A (en) Image classification method and device, equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination