CN111291866B - Neural network generation, image processing and intelligent driving control method and device - Google Patents


Info

Publication number
CN111291866B
CN111291866B · CN202010075144.9A
Authority
CN
China
Prior art keywords
network
neural network
layer
sub
networks
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010075144.9A
Other languages
Chinese (zh)
Other versions
CN111291866A (en)
Inventor
俞海宝
韩琪
李建波
程光亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Original Assignee
Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Sensetime Lingang Intelligent Technology Co Ltd filed Critical Shanghai Sensetime Lingang Intelligent Technology Co Ltd
Priority to CN202010075144.9A priority Critical patent/CN111291866B/en
Publication of CN111291866A publication Critical patent/CN111291866A/en
Application granted granted Critical
Publication of CN111291866B publication Critical patent/CN111291866B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

The disclosed method and device for neural network generation, image processing, and intelligent driving control work as follows: after a super neural network to be compressed is generated, it is trained on acquired sample images. During training, loss information is generated based on the prediction result and annotation result of each sample image, on index values characterizing the importance of every sub-network in each layer of the super neural network, on the in-network parameters of the different sub-networks corresponding to each layer, and on a computational cost constraint. Based on this loss information, the in-network parameters of the different sub-networks and the importance index values are adjusted. Because the loss function is determined under the limitation of the computational cost constraint, the sub-network screening process is confined to the effective space defined by that constraint, which improves the compression efficiency of the super neural network to be compressed.

Description

Neural network generation, image processing and intelligent driving control method and device
Technical Field
The disclosure relates to the technical field of image processing, and in particular to a neural network generation, image processing, and intelligent driving control method and device.
Background
As neural networks are applied ever more widely, they have enabled a series of intelligent products. To achieve better results, networks keep growing deeper and each layer keeps gaining parameters; in an image-processing network, for example, each layer often convolves its input with dozens of kernels to extract more features. As a result, most neural-network products depend heavily on a powerful operating environment, which limits where a network to be compressed can be deployed. In the autonomous driving field, for instance, the network must be deployed on edge devices such as field programmable gate arrays (FPGAs), which often lack the operating environment that a large-scale network requires. To enable such embedded applications, the network to be compressed must be reduced below a certain size.
Quantization is currently a common method of neural network compression. It converts a floating-point network into a fixed-point one: by reducing the number of bits used to represent the parameters in each network layer, it lowers the computing resources and data bandwidth the network needs at inference time, and thus relaxes its operating-environment requirements. However, current quantization methods take a long time to compress a network, so compression efficiency is low.
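The fixed-point quantization described above can be sketched with a minimal uniform quantizer. The min-max scaling scheme below is an illustrative assumption; the patent does not specify which quantization function is used:

```python
import numpy as np

def quantize_uniform(w, bits):
    """Map an array onto 2**bits evenly spaced levels over its own range,
    then map back to floats (simulated quantization). Illustrative only."""
    levels = 2 ** bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / levels if hi > lo else 1.0
    codes = np.round((w - lo) / scale)      # integer codes in [0, levels]
    return codes * scale + lo               # dequantized values

w = np.array([-1.0, -0.3, 0.2, 0.9])
w4 = quantize_uniform(w, 4)                 # 16 levels over [-1.0, 0.9]
# Quantization error is bounded by half a quantization step (scale / 2).
assert np.all(np.abs(w4 - w) <= ((w.max() - w.min()) / (2 ** 4 - 1)) / 2 + 1e-12)
```

Fewer bits mean coarser levels and larger reconstruction error, which is why layers with different sensitivities benefit from different bit widths.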
Disclosure of Invention
The embodiment of the disclosure at least provides a method and a device for generating a neural network, processing images and controlling intelligent driving.
In a first aspect, an embodiment of the present disclosure provides a method for generating a neural network, including: generating a super neural network to be compressed, where the super neural network includes a plurality of sub-networks corresponding to each layer of a multi-layer network, and the in-network parameters of the different sub-networks corresponding to the same layer have different bit widths; and performing each of multiple rounds of training on the super neural network, and generating the compressed neural network based on the super neural network after the multiple rounds of training, in the following manner: inputting an acquired sample image into the current super neural network for processing to obtain a prediction result corresponding to the sample image; generating loss information based on the prediction result, the annotation result, index values characterizing the importance of each sub-network in each layer of the current super neural network, the in-network parameters of the different sub-networks corresponding to each layer, and a computational cost constraint; and adjusting, based on the loss information, the in-network parameters of the different sub-networks in the current super neural network and the index values characterizing the importance of each sub-network.
In this way, after the super neural network is generated, it is trained on the acquired sample images. During training, loss information is generated from the prediction result and annotation result of each sample image, the index values characterizing the importance of each sub-network in each layer of the super neural network, the in-network parameters of the different sub-networks corresponding to each layer, and the computational cost constraint; the in-network parameters of the different sub-networks and the importance index values are then adjusted based on this loss information. The loss function is therefore determined under the limitation of the computational cost constraint, and the screening of sub-networks in the super neural network is confined to the effective space defined by that constraint. Schemes that do not satisfy the constraint need not be considered when screening quantization schemes, so the final quantization scheme can be selected from the candidate schemes more quickly, further improving the compression efficiency of the neural network to be compressed.
In an alternative embodiment, the generating the superneural network to be compressed includes: for each layer of network in the multi-layer network in the initial neural network, carrying out multiple times of quantization on parameters in the network in the layer to obtain the parameters in the network corresponding to each time of quantization in the multiple times of quantization; based on parameters in the network corresponding to each quantization in multiple times and the network structure of the layer of network, obtaining a plurality of sub-networks corresponding to the layer of network; and forming the superneural network to be compressed based on a plurality of sub-networks corresponding to each layer of network in the multi-layer network.
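The construction described in this embodiment, quantizing each layer several times at different bit widths to obtain its candidate sub-networks, can be sketched as follows. The layer shapes, the candidate bit widths, and the `quantize_uniform` helper are all illustrative assumptions, not taken from the patent:

```python
import numpy as np

def quantize_uniform(w, bits):
    # Simple min-max uniform quantizer (an assumed stand-in for the
    # unspecified quantization step in the patent).
    levels = 2 ** bits - 1
    lo, hi = float(w.min()), float(w.max())
    scale = (hi - lo) / levels if hi > lo else 1.0
    return np.round((w - lo) / scale) * scale + lo

rng = np.random.default_rng(0)
# Hypothetical 3-layer initial network, one weight matrix per layer.
initial_net = [rng.standard_normal((4, 4)) for _ in range(3)]
candidate_bits = [3, 4, 5, 6]   # illustrative candidate bit widths

# Super neural network: for every layer, one sub-network per bit width,
# all sharing the layer's structure but with differently quantized weights.
supernet = [{b: quantize_uniform(w, b) for b in candidate_bits}
            for w in initial_net]

assert len(supernet) == len(initial_net)
assert all(set(layer) == set(candidate_bits) for layer in supernet)
```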
In this way, a superneural network including a plurality of sub-networks can be generated.
In an optional implementation manner, the forming the super neural network to be compressed based on a plurality of sub-networks corresponding to each layer of the multi-layer network includes: initializing the importance index values corresponding to the plurality of sub-networks of each network layer, and forming the super neural network to be compressed based on the initialization result and the plurality of sub-networks corresponding to each network layer.
Thus, initial index values of importance degrees respectively corresponding to a plurality of sub-networks respectively corresponding to each layer of network in the multi-layer network are obtained.
In an alternative embodiment, the adjusting the parameters in the network of the different sub-networks in the current super-neural network and the index value characterizing the importance degree of each sub-network includes: generating first loss information based on a prediction result and a labeling result of the sample image, index values representing importance degrees of each sub-network in each layer of network in the current super-neural network, and network internal parameters of different sub-networks corresponding to each layer of network in the current super-neural network, and adjusting the network internal parameters of the different sub-networks in the current super-neural network based on the first loss information; generating second loss information based on the prediction result, the labeling result, an index value representing the importance degree of each sub-network in each layer of network in the current super-neural network, the adjusted network parameters of different sub-networks corresponding to each layer of network and the calculation cost constraint condition, and adjusting the index value representing the importance degree of each sub-network in the current super-neural network based on the second loss information.
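On a toy scale, the two-phase update in this embodiment — the first loss adjusts the sub-network parameters, while the second loss, which adds the cost constraint, adjusts the importance index values — might look like the following. The scalar "sub-networks", the softmax relaxation of the importance indices, the learning rate, and the linear cost term are all assumptions for illustration; the patent does not give these details:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

# One layer with two candidate sub-networks, modeled as two scalar weights.
params = np.array([0.5, 2.0])   # in-network parameters of the sub-networks
alpha = np.zeros(2)             # importance index values (logits)
cost = np.array([0.0, 1.0])     # assumed extra cost of the higher-bit sub-network
x, target, lr = 1.0, 1.0, 0.1

for _ in range(100):
    # Phase 1: first loss (prediction error only) -> update sub-network params.
    p = softmax(alpha)
    pred = p @ (params * x)
    params -= lr * 2 * (pred - target) * p * x
    # Phase 2: second loss (prediction error + cost term) -> update alpha.
    p = softmax(alpha)
    pred = p @ (params * x)
    g = 2 * (pred - target) * (params * x) + cost   # dLoss/dp
    alpha -= lr * p * (g - p @ g)                   # softmax-Jacobian chain rule

p = softmax(alpha)
assert p[0] > 0.6                        # the cheaper sub-network gains importance
assert abs(p @ (params * x) - target) < 0.2   # while the prediction still fits
```

The alternation lets the constraint steer only the architecture choice (alpha), while the parameters keep fitting the data.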
In this way, an adjustment of the in-network parameters of the different sub-networks in the current superneural network, and of the index values of the degree of importance of the current superneural network characterizing each of the sub-networks, is achieved.
In an alternative embodiment, the computational cost constraint includes one or more of the following: a maximum number of training iterations; a maximum threshold on the average bit width of the in-network parameters of each layer of the super neural network; and a preset barrier penalty coefficient.
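The "preset barrier penalty coefficient" listed here suggests a penalty term that activates only when the average bit width exceeds its threshold. The exact functional form is not stated in the patent; the quadratic hinge below is an assumption:

```python
def barrier_penalty(avg_bits, max_avg_bits, coeff):
    """Zero while the average-bit-width constraint holds; grows
    quadratically with the violation otherwise (assumed form)."""
    return coeff * max(0.0, avg_bits - max_avg_bits) ** 2

# A satisfied constraint contributes nothing to the loss...
assert barrier_penalty(5.0, 6.0, 10.0) == 0.0
# ...while a violation is penalized in proportion to the coefficient.
assert barrier_penalty(7.0, 6.0, 10.0) == 10.0
```

Adding such a term to the loss keeps the sub-network search inside the effective space the constraint defines, without hard-pruning candidates.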
In an alternative embodiment, generating the compressed neural network based on the super neural network after multiple rounds of training includes: determining the weight of each sub-network in each layer of the super-neural network after multiple rounds of training based on the index value of the importance degree of each sub-network in each layer of the network after multiple rounds of training; determining target subnetworks corresponding to each layer of network from a plurality of subnetworks corresponding to each layer of network respectively based on the weight of each subnetwork in each layer of network in the super-neural network after multiple rounds of training; and forming the compressed neural network based on the target sub-networks respectively corresponding to the networks of all layers.
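The selection step in this embodiment — turning trained importance index values into per-layer weights and keeping one target sub-network per layer — can be sketched as below. The logit values and the use of softmax followed by argmax are illustrative assumptions:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - np.max(x))
    return e / e.sum()

candidate_bits = [3, 4, 5, 6]
# Assumed trained importance logits: one row per layer, one column per
# candidate sub-network (bit width).
alpha = np.array([[ 0.1, 2.0, 0.3, -1.0],
                  [ 1.5, 0.2, 0.2,  0.1],
                  [-0.5, 0.0, 2.2,  0.4]])

weights = np.apply_along_axis(softmax, 1, alpha)  # per-layer sub-network weights
chosen = [candidate_bits[i] for i in weights.argmax(axis=1)]

assert np.allclose(weights.sum(axis=1), 1.0)
assert chosen == [4, 3, 5]   # the compressed net keeps one bit width per layer
```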
In a second aspect, an embodiment of the present disclosure further provides an image processing method, including: acquiring an image to be processed; and processing the image to be processed by using the neural network generated based on the first aspect or the method in any possible implementation manner of the first aspect, so as to obtain a processing result of the image to be processed.
In a third aspect, an embodiment of the present disclosure further provides an intelligent driving control method, including: acquiring a road image; processing the road image by using a neural network generated based on the first aspect or any one of the possible implementation manners of the first aspect, to obtain a processing result of the road image; and controlling the intelligent running equipment for acquiring the road image according to the processing result.
In a fourth aspect, an embodiment of the present disclosure further provides a generating device of a neural network, where the generating device of the neural network includes: the generation module is used for generating the superneural network to be compressed; the super neural network comprises a plurality of sub-networks corresponding to each layer of network in the multi-layer network respectively; wherein, the bit number of the parameters in the networks of different sub-networks corresponding to the same layer of network is different; the training module is used for carrying out each round of training in multiple rounds of training on the super neural network; the determining module is used for generating the compressed neural network based on the super neural network after multiple rounds of training; the training module is used for carrying out each training in a plurality of training rounds on the super neural network by adopting the following mode: inputting the obtained sample image into a current super neural network for processing to obtain a prediction result corresponding to the sample image; generating loss information based on a prediction result, a labeling result, an index value representing the importance degree of each sub-network in each layer of network in the current super-neural network, network parameters of different sub-networks corresponding to each layer of network in the current super-neural network and a calculation cost constraint condition, and adjusting the network parameters of the different sub-networks in the current super-neural network and the index value representing the importance degree of each sub-network based on the loss information.
In a possible implementation manner, the generating module is configured to generate the to-be-compressed superneural network in the following manner: aiming at each layer of network in the multi-layer network in the initial neural network, carrying out multiple times of quantization on parameters in the network in the layer to obtain the parameters in the network which correspond to each time of quantization in the multiple times of quantization; based on parameters in the network corresponding to each quantization in multiple times and the network structure of the layer of network, obtaining a plurality of sub-networks corresponding to the layer of network; and forming the superneural network to be compressed based on a plurality of sub-networks corresponding to each layer of network in the multi-layer network.
In a possible implementation manner, the generating module is configured to form the super neural network to be compressed based on a plurality of sub-networks corresponding to each layer of the multi-layer network in the following manner: initializing the importance index values corresponding to the plurality of sub-networks of each network layer, and forming the super neural network to be compressed based on the initialization result and the plurality of sub-networks corresponding to each network layer.
In a possible implementation manner, the training module is configured to adjust parameters in the network of the different sub-networks in the current super-neural network and index values representing importance degrees of the sub-networks in the current super-neural network in the following manner: generating first loss information based on a prediction result and a labeling result of the sample image, index values representing importance degrees of each sub-network in each layer of network in the current super-neural network, and network internal parameters of different sub-networks corresponding to each layer of network in the current super-neural network, and adjusting the network internal parameters of the different sub-networks in the current super-neural network based on the first loss information; generating second loss information based on the prediction result, the labeling result, an index value representing the importance degree of each sub-network in each layer of network in the current super-neural network, the adjusted network parameters of different sub-networks corresponding to each layer of network and the calculation cost constraint condition, and adjusting the index value representing the importance degree of each sub-network in the current super-neural network based on the second loss information.
In a possible implementation manner, the computational cost constraint includes one or more of the following: a maximum number of training iterations; a maximum threshold on the average bit width of the in-network parameters of each layer of the super neural network; and a preset barrier penalty coefficient.
In a possible implementation manner, the determining module is configured to generate the compressed neural network based on the super neural network after multiple rounds of training in the following manner: determining the weight of each sub-network in each layer of the super-neural network after multiple rounds of training based on the index value of the importance degree of each sub-network in each layer of the network after multiple rounds of training; determining target subnetworks corresponding to each layer of network from a plurality of subnetworks corresponding to each layer of network respectively based on the weight of each subnetwork in each layer of network in the super-neural network after multiple rounds of training; and forming the compressed neural network based on the target sub-networks respectively corresponding to the networks of all layers.
In a fifth aspect, an embodiment of the present disclosure further provides an image processing apparatus, including: the first acquisition module is used for acquiring an image to be processed; the first processing module is used for processing the image to be processed by utilizing the neural network generated based on any one of the first aspects to obtain a processing result of the image to be processed.
In a sixth aspect, an embodiment of the present disclosure further provides an intelligent travel control apparatus, including: the second acquisition module is used for acquiring road images; a second processing module, configured to process the road image by using the neural network generated based on any one of the first aspects, to obtain a processing result of the road image; and the control module is used for controlling the intelligent driving equipment for acquiring the road image according to the processing result.
In a seventh aspect, embodiments of the present disclosure further provide a hardware platform based on a field programmable gate array FPGA, including: the device comprises a processor, an external memory, a memory and an FPGA operation unit;
the external memory is used for storing network parameters of the neural network generated by the method in the first aspect or any possible implementation manner of the first aspect; wherein, the bit number of the network parameters of at least two network layers in the neural network is different;
the processor is used for reading the network parameters of the neural network into the memory and inputting the data on the memory and the image to be processed into the FPGA operation unit;
and the FPGA operation unit is used for performing operation processing according to the image to be processed and the network parameters to obtain a processing result of the image.
In an eighth aspect, an embodiment of the present disclosure further provides an electronic device, including: a processor, a memory and a bus, the memory storing machine-readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is running, the machine-readable instructions when executed by the processor performing the steps of the first aspect, or any of the possible implementations of the first aspect, or performing the steps of the second aspect, or performing the steps of the third aspect.
In a ninth aspect, the disclosed embodiments further provide a computer readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the first aspect, or any of the possible implementations of the first aspect, or performs the steps of the second aspect, or performs the steps of the third aspect.
The foregoing objects, features and advantages of the disclosure will be more readily apparent from the following detailed description of the preferred embodiments taken in conjunction with the accompanying drawings.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present disclosure, the drawings required for the embodiments are briefly described below, which are incorporated in and constitute a part of the specification, these drawings showing embodiments consistent with the present disclosure and together with the description serve to illustrate the technical solutions of the present disclosure. It is to be understood that the following drawings illustrate only certain embodiments of the present disclosure and are therefore not to be considered limiting of its scope, for the person of ordinary skill in the art may admit to other equally relevant drawings without inventive effort.
FIG. 1 illustrates a schematic diagram of a specific example of a hybrid bit neural network provided by embodiments of the present disclosure;
FIG. 2 illustrates a flow chart of a method of generating a neural network provided by an embodiment of the present disclosure;
FIG. 3 is a schematic diagram showing a specific structural example of a superneural network to be compressed according to an embodiment of the present disclosure;
fig. 4 is a schematic diagram showing a specific method for adjusting parameters in networks of different sub-networks in a current super-neural network and an index value representing importance degree of each sub-network in a neural network generation method according to an embodiment of the present disclosure;
FIG. 5 illustrates a schematic diagram of a specific example of generating a compressed neural network based on a multi-round trained superneural network, provided by embodiments of the present disclosure;
FIG. 6 shows a flowchart of an image processing method provided by an embodiment of the present disclosure;
FIG. 7 shows a flowchart of an intelligent driving control method provided by an embodiment of the present disclosure;
fig. 8 is a schematic diagram of a neural network generating apparatus according to an embodiment of the present disclosure;
fig. 9 shows a schematic diagram of an image processing apparatus provided by an embodiment of the present disclosure;
Fig. 10 shows a schematic diagram of an intelligent driving control apparatus according to an embodiment of the present disclosure;
FIG. 11 illustrates a schematic diagram of a Field Programmable Gate Array (FPGA) -based hardware platform provided by an embodiment of the present disclosure;
fig. 12 shows a schematic diagram of an electronic device provided by an embodiment of the disclosure.
Detailed Description
For the purposes of making the objects, technical solutions and advantages of the embodiments of the present disclosure more apparent, the technical solutions in the embodiments of the present disclosure will be clearly and completely described below with reference to the drawings in the embodiments of the present disclosure, and it is apparent that the described embodiments are only some embodiments of the present disclosure, but not all embodiments. The components of the embodiments of the present disclosure, which are generally described and illustrated in the figures herein, may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present disclosure provided in the accompanying drawings is not intended to limit the scope of the disclosure, as claimed, but is merely representative of selected embodiments of the disclosure. All other embodiments, which can be made by those skilled in the art based on the embodiments of this disclosure without making any inventive effort, are intended to be within the scope of this disclosure.
Research shows that when a neural network is compressed with current quantization-based methods, different network layers have different sensitivities to the quantization bit width; a suitable bit width can therefore be selected for each layer of the network to be compressed, finally yielding a mixed-bit version of that network. A mixed-bit neural network is one in which the in-network parameters of different network layers have different bit widths.
As shown in fig. 1, a specific example of a hybrid bit neural network is provided.
In this example, the neural network to be compressed includes a 4-layer network. In the first layer, the weights are quantized to 3 bits and the activation values (the outputs of the layer; a per-layer hyperparameter specifies the bit width to which the activation values are quantized) to 5 bits; in the second layer, the weights to 6 bits and the activations to 7 bits; in the third layer, the weights to 4 bits and the activations to 6 bits; and in the fourth layer, the weights to 5 bits and the activations to 6 bits. Because the parameter bit widths differ across layers, the network as a whole is a mixed-bit neural network.
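For the Fig. 1 example, the resulting storage saving on the weights is easy to check (assuming, purely for illustration, equal parameter counts per layer):

```python
weight_bits = [3, 6, 4, 5]   # per-layer weight bit widths from the example
act_bits = [5, 7, 6, 6]      # per-layer activation bit widths

avg_weight_bits = sum(weight_bits) / len(weight_bits)
assert avg_weight_bits == 4.5
# Versus a uniform 8-bit quantization, the mixed-bit weights would need only:
assert avg_weight_bits / 8 == 0.5625   # about 56% of the storage
```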
When the neural network to be compressed is compressed with a quantization-based method, the large number of network layers means each layer admits several candidate quantization schemes. If the network has M network layers and each layer can take N quantization choices, there are N^M possible schemes in total, and determining one quantization scheme from these N^M candidates to generate the mixed-bit neural network consumes substantial resources. The commonly adopted approach is a gradient-based neural architecture search (Neural Architecture Search, NAS) algorithm that determines a quantization scheme from the candidates. This approach requires a great deal of time for trial-and-error experiments to tune the quantization precision, so compression efficiency is low.
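The search space grows exponentially with depth, which is why exhaustively evaluating quantization schemes is infeasible. With illustrative numbers (not from the patent):

```python
M = 10   # network layers (illustrative)
N = 8    # candidate quantization schemes per layer (illustrative)

num_schemes = N ** M   # N choices made independently for each of M layers
assert num_schemes == 1_073_741_824   # over a billion mixed-bit candidates
```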
Based on the above study, the present disclosure provides a method for generating a neural network: after a super neural network to be compressed is generated, it is trained on acquired sample images. During training, loss information is generated based on the prediction result and annotation result of each sample image, the index values characterizing the importance of each sub-network in each layer of the super neural network, the in-network parameters of the different sub-networks corresponding to each layer, and a computational cost constraint; the in-network parameters of the different sub-networks and the importance index values are then adjusted based on this loss information. The loss function is thus determined under the limitation of the computational cost constraint, and the screening of sub-networks in the super neural network is confined to the effective space defined by that constraint. Schemes that violate the constraint need not be considered when screening quantization schemes, so the final quantization scheme can be selected from the candidates more quickly, further improving the compression efficiency of the neural network to be compressed.
It should be noted that: like reference numerals and letters denote like items in the following figures, and thus once an item is defined in one figure, no further definition or explanation thereof is necessary in the following figures.
For ease of understanding the present embodiment, a method for generating a neural network disclosed in an embodiment of the present disclosure is first described in detail. The execution body of the method for generating a neural network provided in the embodiment of the present disclosure is generally an electronic device with certain computing capability, for example a terminal device, a server, or another processing device; the terminal device may be a User Equipment (UE), a mobile device, a user terminal, a cellular phone, a cordless phone, a personal digital assistant (Personal Digital Assistant, PDA), a handheld device, a computing device, a vehicle-mounted device, a wearable device, or the like. In some possible implementations, the method for generating the neural network may be implemented by a processor invoking computer-readable instructions stored in a memory.
The following describes a method for generating a neural network according to an embodiment of the present disclosure, taking an execution body as a server as an example.
Referring to fig. 2, a flowchart of a method for generating a neural network according to an embodiment of the disclosure is shown, where the method includes steps S201 to S203, and S2021 to S2022, where:
S201: generating a super neural network to be compressed; the super neural network comprises a plurality of sub-networks corresponding to each layer of network in the multi-layer network respectively; wherein the bit numbers of the parameters in the networks of different sub-networks corresponding to the same layer of network are different;
S202: each of the multiple rounds of training is performed on the superneural network in the manner described in S2021 to S2022 below.
S2021: inputting the obtained sample image into a current super neural network for processing to obtain a prediction result corresponding to the sample image;
S2022: generating loss information based on a prediction result, a labeling result, an index value representing the importance degree of each sub-network in each layer of network in the current super-neural network, network parameters of different sub-networks corresponding to each layer of network in the current super-neural network and a calculation cost constraint condition, and adjusting the network parameters of the different sub-networks in the current super-neural network and the index value representing the importance degree of each sub-network based on the loss information.
S203: and generating a compressed neural network based on the super neural network after multiple rounds of training.
The above steps are described in detail below, respectively.
I: in S201, the superneural network to be compressed is generated based on the neural network to be compressed. The neural network to be compressed is an image processing neural network which is trained based on sample images and can complete certain image processing tasks. The image processing tasks include, for example, one or more of the following: face recognition, emotion recognition, face key point recognition, recognition of road images in automatic driving or assisted driving, and the like.
After training, each layer of the neural network to be compressed includes in-network parameters, such as weight parameters, activation values and the like.
Specifically, the superneural network to be compressed may be generated in the following manner:
aiming at each layer of network in the multi-layer network in the initial neural network, carrying out multiple times of quantization on parameters in the network in the layer to obtain the parameters in the network corresponding to each time of quantization in the multiple times of quantization;
based on parameters in the network corresponding to each quantization in multiple times and the network structure of the layer of network, obtaining a plurality of sub-networks corresponding to the layer of network;
and forming the superneural network to be compressed based on a plurality of sub-networks corresponding to each layer of network in the multi-layer network.
In a specific implementation, the in-network parameters to be quantized include, for example, weight parameters, and activation parameters.
Taking the example of quantizing the weight parameter, when the weight parameter is quantized, for example, the following formula (1) may be used:
wherein, in formula (1), i denotes the i-th convolution kernel in the network layer; W_i denotes the weights to be quantized; w_b_i denotes the bit width of the weights; [·] denotes the rounding operation; T_i denotes a cutoff threshold; and clip(W_i, T_i) denotes a cutoff operation performed on the weights W_i based on T_i;
the parameter s in the above formula (1) satisfies the formula:
The cutoff threshold T_i satisfies the formula: T_i = k · ||W_i||_1 / n.
Here, ||W_i||_1 denotes the l1 norm of W_i; n denotes the total number of weights corresponding to the i-th convolution kernel; and k is an adjustment coefficient. By adjusting the value of k and quantizing the weight parameters multiple times, the weight parameters are converted from floating-point data into data with different bit numbers.
In the case of quantifying the activation parameter, for example, a similar method may be used for quantification, which is not described here.
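As an illustrative sketch of the quantization described above, the following Python code quantizes a weight tensor using the cutoff threshold T_i = k · ||W_i||_1 / n; the symmetric form of the step size s is an assumption, since the formula for s is not reproduced in the text:

```python
import numpy as np

def quantize_weights(w, bits, k=1.0):
    # Cutoff threshold T_i = k * ||W_i||_1 / n, as described in the text.
    t = k * np.abs(w).sum() / w.size
    # Step size s; this symmetric form is an assumption, since the
    # exact formula for s is not reproduced in the text.
    s = t / (2 ** (bits - 1) - 1)
    w_clip = np.clip(w, -t, t)          # clip(W_i, T_i)
    return np.round(w_clip / s) * s     # [.] rounding onto the quantized grid

rng = np.random.default_rng(0)
w = rng.normal(size=64)
w4 = quantize_weights(w, bits=4)        # 4-bit quantization of one kernel
```

After quantization, the weights take at most 2^bits − 1 distinct values and are bounded in magnitude by the cutoff threshold.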
In addition, when the super neural network is formed based on a plurality of sub-networks corresponding to each layer of network in the multi-layer network, the importance index values corresponding to the plurality of sub-networks corresponding to each layer of network are initialized.
When initializing the importance index values corresponding to the sub-networks of each layer, random initialization may be adopted; alternatively, the index values of all sub-networks of each layer may be set to the same value, or initialized based on the bit numbers of the in-network parameters of the sub-networks: for example, the larger the bit number, the smaller the initialized importance index value; the smaller the bit number, the larger the initialized importance index value. The specific initialization mode may be set according to actual needs and is not limited here.
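The initialization options above can be sketched as follows; the 1/b mapping from bit number b to initial index value is purely illustrative, since the text only requires that a larger bit number yield a smaller initial value:

```python
def init_importance(bit_widths, mode="by_bits"):
    """Initialize the importance index values for one layer's sub-networks.
    'same'   : all sub-networks start with the same value;
    'by_bits': larger bit number -> smaller initial index value
               (the 1/b mapping is illustrative, not from the disclosure)."""
    if mode == "same":
        return [0.0] * len(bit_widths)
    return [1.0 / b for b in bit_widths]

theta_layer = init_importance([2, 4, 8])   # e.g. 2-, 4- and 8-bit sub-networks
```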
For example, fig. 3 shows a specific structural example of a superneural network to be compressed; in this example, the neural network to be compressed includes a 3-layer network, and m times of quantization is performed on parameters in the network of each layer of network, so as to generate m sub-networks respectively corresponding to each layer of network, where:
After the in-network parameters of the layer-1 network of the neural network to be compressed are quantized m times, m sub-networks op11, op12, op13, …, op1m are generated, and the importance index values corresponding to op11, op12, op13, …, op1m are initialized to θ_{11}, θ_{12}, θ_{13}, …, θ_{1m} respectively.
The in-network parameters of the layer-2 network of the neural network to be compressed are quantized m times to generate m sub-networks op21, op22, op23, …, op2m; the importance index values corresponding to op21, op22, op23, …, op2m are initialized to θ_{21}, θ_{22}, θ_{23}, …, θ_{2m} respectively.
The in-network parameters of the layer-3 network of the neural network to be compressed are quantized m times to generate m sub-networks op31, op32, op33, …, op3m; the importance index values corresponding to op31, op32, op33, …, op3m are initialized to θ_{31}, θ_{32}, θ_{33}, …, θ_{3m} respectively.
V1 indicates that, based on θ_{11}, θ_{12}, θ_{13}, …, θ_{1m}, the feature data output by op11, op12, op13, …, op1m after processing the sample image are weighted and summed to obtain the first-layer feature data of the sample image; the first-layer feature data are then input into op21, op22, op23, …, op2m respectively.
V2 indicates that, based on θ_{21}, θ_{22}, θ_{23}, …, θ_{2m}, the feature data output by op21, op22, op23, …, op2m after processing the first-layer feature data are weighted and summed to obtain the second-layer feature data of the sample image; the second-layer feature data are then input into op31, op32, op33, …, op3m respectively.
V3 indicates that, based on θ_{31}, θ_{32}, θ_{33}, …, θ_{3m}, the feature data output by op31, op32, op33, …, op3m after processing the second-layer feature data are weighted and summed to obtain the third-layer feature data of the sample image.
Then, a prediction result of the sample image can be obtained based on the third layer characteristic data.
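The weighted-sum forward pass V1 to V3 described above can be sketched as follows; using a softmax over the importance index values as the weights is an assumption, consistent with the later discussion of the per-layer weights summing to 1:

```python
import numpy as np

def softmax(theta):
    e = np.exp(theta - np.max(theta))
    return e / e.sum()

def supernet_forward(x, layers, thetas):
    """layers[i] is the list of sub-network functions op_i1..op_im of layer i;
    thetas[i] holds their importance index values. Each layer outputs the
    importance-weighted sum (V1/V2/V3 in fig. 3) of its sub-network outputs;
    softmax(theta) as the weights is an assumption."""
    for ops, theta in zip(layers, thetas):
        p = softmax(np.asarray(theta, dtype=float))
        x = sum(w * op(x) for w, op in zip(p, ops))
    return x

# Toy example: each "sub-network" just scales its input.
def scale_by(c):
    return lambda x: c * x

layers = [[scale_by(1.0), scale_by(3.0)], [scale_by(2.0), scale_by(2.0)]]
thetas = [[0.0, 0.0], [0.0, 0.0]]          # equal importance everywhere
y = supernet_forward(np.array([1.0]), layers, thetas)
```

With equal importance values, each layer averages its sub-network outputs; as one importance value dominates, the layer's output approaches that of a single sub-network.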
II: in S202, the superneural network is trained in each of the multiple rounds of training according to S2021 to S2022.
A: in S2021, the processing of the sample image includes at least one of classification processing, regression processing, and the like, for example.
The current superneural network refers to the superneural network aimed at when the superneural network is trained for one round.
Specifically, for the case that the current training is the first training, the current superneural network is the superneural network constructed in S201 above.
Aiming at the condition that the current training is not the first training, the current super-neural network is the super-neural network obtained by the previous training.
The process of obtaining the prediction result corresponding to the sample image may be specifically described with reference to the example corresponding to fig. 2, which is not described herein.
B: in the above-mentioned S2022, referring to fig. 4, specifically, the intra-network parameters of the different sub-networks in the current super-neural network and the index values characterizing the importance degree of each sub-network may be adjusted in the following manner:
s401: generating first loss information based on a prediction result and a labeling result of the sample image, index values representing importance degrees of each sub-network in each layer of network in the current super-neural network, and network internal parameters of different sub-networks corresponding to each layer of network in the current super-neural network, and adjusting the network internal parameters of the different sub-networks in the current super-neural network based on the first loss information.
In implementation, the first loss information L_1 satisfies, for example, the following formula (2):
L_1 = L_val(W; θ) + α · ||W||_2   (2)
where W denotes the in-network parameters of the different sub-networks corresponding to each layer of the current superneural network; θ denotes the importance index values of the sub-networks in each layer of the current superneural network; and L_val(·,·) denotes a loss function related to θ and W, determined from the prediction result and the labeling result of the sample image.
Here, L_val(·,·) being a loss function related to θ and W in fact means that the prediction result of the sample image is affected by θ and W. L_val(·,·) is, for example, a cross-entropy loss.
α is a hyperparameter that can be set according to the requirements of the training task; for example, α may be set to 0.0001. ||W||_2 denotes the l2 norm of W.
The regularization term plays a role in preventing overfitting in the process of training the superneural network.
Without taking overfitting into account, in some embodiments the first loss information L_1 satisfies, for example: L_1 = L_val(W; θ).
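A minimal sketch of computing the first loss information, assuming an additive combination of the cross-entropy L_val and the α · ||W||_2 regularizer described around formula (2):

```python
import numpy as np

def first_loss(pred_probs, label, weights, alpha=1e-4):
    """First loss L_1: cross-entropy task loss plus alpha times the l2 norm
    of the in-network parameters W. The additive combination is an
    assumption based on the term definitions around formula (2);
    alpha = 0.0001 follows the setting suggested in the text."""
    ce = -np.log(pred_probs[label] + 1e-12)              # L_val: cross-entropy
    l2 = np.sqrt(sum(np.sum(w ** 2) for w in weights))   # ||W||_2
    return ce + alpha * l2

loss = first_loss(np.array([0.7, 0.3]), 0, [np.ones((2, 2))])
```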
S402: generating second loss information based on the prediction result, the labeling result, an index value representing the importance degree of each sub-network in each layer of network in the current super-neural network, the adjusted network parameters of different sub-networks corresponding to each layer of network and the calculation cost constraint condition, and adjusting the index value representing the importance degree of each sub-network in the current super-neural network based on the second loss information.
In implementation, the second loss information L_2 satisfies, for example, the following formula (3):
where the second term of formula (3) is a barrier penalty function determined based on the computation cost constraint condition.
Specifically, the computation cost constraint condition includes, for example: the maximum iteration round number epoch_max, the maximum threshold C on the average bit number of the in-network parameters of each layer in the compressed neural network, preset barrier penalty coefficients, and the like.
The specific meanings of L_val(W; θ) and the related symbols are the same as in formula (2) above and are not repeated here.
The barrier penalty function determined based on the computation cost constraint condition satisfies, for example, the following formula (4):
where μ denotes a target barrier penalty coefficient, which satisfies the formula: μ = μ_0 · exp(−epoch/β_0);
C_slack characterizes the barrier boundary margin, which satisfies the formula: C_slack = μ_1 · exp(−epoch/β_1) + ε_slack.
Here, epoch characterizes the round number of the current iteration; μ_0, β_0, μ_1 and β_1 are predetermined barrier penalty coefficients; and ε_slack denotes a predetermined relaxation value used to determine the barrier boundary margin. Specifically, since there is a large gap between the expected cost of the compressed neural network to be generated and the actual cost of the superneural network when training starts, C_slack can be used to allow the cost to exceed C while the search space is being reduced; over multiple rounds of training of the superneural network, the search space is then gradually narrowed to the region bounded by the barrier. The search space refers to the set formed by the different quantization schemes of the neural network to be compressed; searching the search space is the process of determining the final desired quantization scheme from the multiple different quantization schemes; and the effective space refers to the set formed by the quantization schemes that satisfy the computation cost constraint condition.
SN denotes the current superneural network; θ denotes the importance index values of the sub-networks in each layer of the current superneural network; and F(SN, θ) denotes the cost function of the current superneural network under θ.
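The decay schedules for μ and C_slack given above can be computed as below; the numeric defaults are illustrative, not values from the disclosure:

```python
import math

def barrier_schedule(epoch, mu0=1.0, beta0=30.0, mu1=1.0, beta1=30.0,
                     eps_slack=0.05):
    """Exponential decay schedules from the text; mu0, beta0, mu1, beta1
    and eps_slack are illustrative defaults, not values from the patent."""
    mu = mu0 * math.exp(-epoch / beta0)                   # mu = mu0*exp(-epoch/beta0)
    c_slack = mu1 * math.exp(-epoch / beta1) + eps_slack  # C_slack = mu1*exp(-epoch/beta1) + eps_slack
    return mu, c_slack
```

As training proceeds, μ decays toward 0 and C_slack decays toward ε_slack, so the search space gradually tightens to the barrier-bounded region.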
In addition, in another embodiment of the present disclosure, the second loss information L_2 satisfies, for example, the following formula (5):
where L_prob-1 denotes a regularization term based on the importance index values θ of the sub-networks in each layer of the current superneural network.
L_prob-1 satisfies the following formula (6):
where λ_i denotes an adaptive scale factor that satisfies the formula:
p_{i,j} denotes the weight of the j-th sub-network in the i-th layer, determined based on the importance index value θ_{i,j} of the j-th sub-network in the i-th layer of the current superneural network; p_{i,j} satisfies the following formula (7):
wherein m represents m sub-networks corresponding to the ith layer of network.
In formula (6), the λ_i factor serves to adaptively adjust the regularization of the weights of the multiple sub-networks corresponding to each layer. As the number of training rounds increases, this term in formula (6) also increases.
The other parameters are similar to those in the above formula (3), and will not be described here again.
By adding the regularization term L_prob-1 to the second loss function, that is, by applying Prob-1 regularization to the probabilities p_{i,j}, the optimization of the second loss function drives more and more p_{i,j} values toward 0 or 1.
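A hedged sketch of the sub-network weights p_{i,j} and a stand-in for the Prob-1 regularization: the softmax form of p_{i,j} and the p·(1−p) penalty are assumptions, since the bodies of formulas (6) and (7) are not reproduced in the text:

```python
import numpy as np

def subnet_weights(theta_layer):
    """p_{i,j}: softmax over the layer's importance index values theta_{i,j}.
    The softmax form is an assumption consistent with the weights of one
    layer summing to 1."""
    e = np.exp(theta_layer - np.max(theta_layer))
    return e / e.sum()

def prob1_regularizer(thetas, lambdas):
    """Illustrative stand-in for L_prob-1: a per-layer p*(1-p) penalty,
    scaled by the adaptive factors lambda_i, that is minimized when each
    p_{i,j} reaches 0 or 1. The exact form of formula (6) is not
    reproduced in the text."""
    loss = 0.0
    for theta_layer, lam in zip(thetas, lambdas):
        p = subnet_weights(np.asarray(theta_layer, dtype=float))
        loss += lam * float(np.sum(p * (1.0 - p)))
    return loss
```

A sharply peaked θ (one sub-network dominating) yields a smaller penalty than a uniform θ, which is the intended pressure toward 0/1 weights.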
After obtaining a prediction result, a labeling result, an index value representing the importance degree of each sub-network in each layer of network in the current super-neural network, the adjusted network parameters of different sub-networks corresponding to each layer of network and a calculation cost constraint condition of the sample image, substituting the information into a second loss function to obtain second loss information.
And then, based on a gradient descent method, adjusting index values of importance degrees respectively corresponding to a plurality of sub-networks respectively corresponding to each layer of network in the multi-layer network in the super-neural network.
III: in S203, after the multiple rounds of training of the superneural network, since the weights of the multiple sub-networks corresponding to the same layer of the superneural network sum to 1, in the superneural network obtained after the multiple rounds of training, the weight of one sub-network among the multiple sub-networks corresponding to each layer tends to 1 while the weights of the other sub-networks tend to 0. Because the weights of the sub-networks corresponding to each layer of the multi-layer network in the superneural network are determined by formula (7), one sub-network can be selected from the multiple sub-networks corresponding to each layer, based on formula (7) and the importance index values of the sub-networks in each layer after the multiple rounds of training, to serve as the corresponding layer of the compressed neural network.
Specifically, the compressed neural network may be generated in the following manner:
determining the weight of each sub-network in each layer of the super-neural network after multiple rounds of training based on the index value of the importance degree of each sub-network in each layer of the network after multiple rounds of training;
determining target subnetworks corresponding to each layer of network from a plurality of subnetworks corresponding to each layer of network respectively based on the weight of each subnetwork in each layer of network in the super-neural network after multiple rounds of training;
and forming the compressed neural network based on the target sub-networks respectively corresponding to the networks of all layers.
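The selection of target sub-networks described above can be sketched as follows; the softmax form of the weights is assumed, consistent with per-layer weights that sum to 1:

```python
import numpy as np

def select_subnets(thetas):
    """Per layer, compute the sub-network weights from the trained
    importance index values (softmax assumed) and keep the index of the
    sub-network whose weight is largest; these target sub-networks form
    the compressed neural network."""
    chosen = []
    for theta_layer in thetas:
        t = np.asarray(theta_layer, dtype=float)
        e = np.exp(t - t.max())
        chosen.append(int(np.argmax(e / e.sum())))
    return chosen

picked = select_subnets([[0.1, 2.0], [3.0, 0.0], [0.0, 0.0, 5.0]])
```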
Referring to fig. 5, the present disclosure provides a specific example of generating a compressed neural network corresponding to the neural network to be compressed based on the multi-round-trained superneural network. In this example, if, based on the importance index values of the sub-networks in each layer of the multi-round-trained superneural network, the weights p_{12}, p_{23} and p_{31} are determined to be 1 among the sub-network weights, the compressed neural network is finally formed from the sub-networks op12, op23 and op31.
In another embodiment of the present disclosure, after obtaining the compressed neural network, some sample images may be reused to fine tune the compressed neural network, and the fine-tuned compressed neural network may be determined as the final compressed neural network.
According to the embodiments of the present disclosure, after the superneural network of the neural network to be compressed is generated, it is trained on the obtained sample images. During training, the loss function can be determined based on the limitation of the computation cost constraint condition, so that the screening of sub-networks within the superneural network is confined to the effective space defined by that constraint; schemes that do not satisfy the constraint need not be considered when screening quantization schemes, the final quantization scheme can be screened from the multiple quantization schemes more quickly, and the compression efficiency of the neural network to be compressed is thereby improved.
Referring to fig. 6, an embodiment of the present disclosure further provides an image processing method, including:
S601: acquiring an image to be processed;
s602: and processing the image to be processed by utilizing the neural network generated based on any embodiment of the disclosure to obtain a processing result of the image to be processed.
Here, the image processing task includes, for example, at least one of: motion recognition, face emotion recognition, face key point recognition, living body recognition, recognition of road images in automatic driving or assisted driving.
Sample images and the labels corresponding to them are determined based on a predetermined image processing task; a neural network to be compressed is then trained based on the sample images and their labels, and compressed using the method for generating a neural network provided by any embodiment of the present disclosure; the resulting neural network may then be fine-tuned with some sample images to obtain the final neural network.
Referring to fig. 7, an embodiment of the present disclosure further provides an intelligent driving control method, including:
s701: acquiring a road image;
s702: processing the road image by using a neural network generated based on the neural network generation method provided by the embodiment of the disclosure to obtain a processing result of the road image;
S703: and controlling the intelligent running equipment for acquiring the road image according to the processing result.
In a specific implementation, the intelligent running apparatus includes, for example: an autonomous vehicle, a vehicle equipped with advanced auxiliary driving systems, a robot, etc.
The neural network generated by the neural network generation method provided by the embodiments of the present disclosure can complete image processing tasks with fewer computing resources without loss of precision, and is therefore more suitable for deployment in intelligent driving devices.
It will be appreciated by those skilled in the art that, in the above methods of the specific embodiments, the written order of the steps does not imply a strict order of execution; the specific execution order should be determined by the functions and possible inherent logic of the steps.
Based on the same inventive concept, the embodiments of the present disclosure further provide a device for generating a neural network, where the device in the embodiments of the present disclosure is similar to the method for generating a neural network according to the embodiments of the present disclosure in terms of the principle of solving the problem, so that the implementation of the device may refer to the implementation of the method, and the repetition is omitted.
Referring to fig. 8, a schematic diagram of a generating device of a neural network according to an embodiment of the disclosure is shown, where the device includes: a generating module 81, a training module 82, a determining module 83; wherein:
a generating module 81, configured to generate a superneural network to be compressed; the super neural network comprises a plurality of sub-networks corresponding to each layer of network in the multi-layer network respectively; wherein, the bit number of the parameters in the networks of different sub-networks corresponding to the same layer of network is different;
a training module 82, configured to perform each of a plurality of training rounds on the super neural network;
a determining module 83, configured to generate the compressed neural network based on the super neural network after multiple rounds of training;
the training module is used for carrying out each training in a plurality of training rounds on the super neural network by adopting the following mode:
inputting the obtained sample image into a current super neural network for processing to obtain a prediction result corresponding to the sample image; generating loss information based on a prediction result, a labeling result, an index value representing the importance degree of each sub-network in each layer of network in the current super-neural network, network parameters of different sub-networks corresponding to each layer of network in the current super-neural network and a calculation cost constraint condition, and adjusting the network parameters of the different sub-networks in the current super-neural network and the index value representing the importance degree of each sub-network based on the loss information.
In a possible implementation manner, the generating module 81 is configured to generate the superneural network to be compressed in the following manner:
aiming at each layer of network in the initial neural network, carrying out multiple times of quantization on parameters in the network in the layer to obtain the parameters in the network corresponding to each time of quantization in the multiple times of quantization;
based on parameters in the network corresponding to each quantization in multiple times and the network structure of the layer of network, obtaining a plurality of sub-networks corresponding to the layer of network;
and forming the superneural network to be compressed based on a plurality of sub-networks corresponding to each layer of network in the multi-layer network.
In a possible implementation manner, the generating module 81 is configured to form the to-be-compressed superneural network based on a plurality of sub-networks corresponding to each layer of the multi-layer network respectively in the following manner;
initializing index values of importance degrees corresponding to a plurality of sub-networks corresponding to each network layer, and forming the to-be-compressed superneural network based on an initialization result and the plurality of sub-networks corresponding to each network layer in the multi-layer network.
In a possible implementation manner, the training module 82 is configured to adjust parameters in the network of the different sub-networks in the current super-neural network, and the index value characterizing the importance degree of each sub-network in the following manner:
Generating first loss information based on a prediction result and a labeling result of the sample image, index values representing importance degrees of each sub-network in each layer of network in the current super-neural network, and network internal parameters of different sub-networks corresponding to each layer of network in the current super-neural network, and adjusting the network internal parameters of the different sub-networks in the current super-neural network based on the first loss information;
generating second loss information based on the prediction result, the labeling result, an index value representing the importance degree of each sub-network in each layer of network in the current super-neural network, the adjusted network parameters of different sub-networks corresponding to each layer of network and the calculation cost constraint condition, and adjusting the index value representing the importance degree of each sub-network in the current super-neural network based on the second loss information.
In a possible implementation manner, the calculation cost constraint condition includes one or more of the following: maximum iteration round number, maximum average bit number threshold of parameters in each layer of network in the super-neural network, and preset barrier penalty coefficient.
In a possible implementation manner, the determining module 83 is configured to generate the compressed neural network based on the super neural network after multiple rounds of training in the following manner:
determining the weight of each sub-network in each layer of the super-neural network after multiple rounds of training based on the index value of the importance degree of each sub-network in each layer of the network after multiple rounds of training;
determining target subnetworks corresponding to each layer of network from a plurality of subnetworks corresponding to each layer of network respectively based on the weight of each subnetwork in each layer of network in the super-neural network after multiple rounds of training;
and forming the compressed neural network based on the target sub-networks respectively corresponding to the networks of all layers.
The process flow of each module in the apparatus and the interaction flow between the modules may be described with reference to the related descriptions in the above method embodiments, which are not described in detail herein.
In addition, referring to fig. 9, an embodiment of the present disclosure further provides an image processing apparatus including:
a first acquiring module 91, configured to acquire an image to be processed;
the first processing module 92 is configured to process the image to be processed by using a neural network generated by using the method for generating a neural network according to any embodiment of the present disclosure, so as to obtain a processing result of the image to be processed.
Referring to fig. 10, an embodiment of the present disclosure further provides an intelligent driving control apparatus, including:
a second acquisition module 101 for acquiring a road image;
the second processing module 102 is configured to process the road image by using the neural network generated by the neural network generating method provided by any embodiment of the present disclosure, so as to obtain a processing result of the road image;
and the control module 103 is used for controlling the intelligent driving equipment for acquiring the road image according to the processing result.
Referring to FIG. 11, embodiments of the present disclosure also provide a Field programmable gate array (Field-Programmable Gate Array, FPGA) based hardware platform, comprising: a processor 111, an external memory 112, a memory 113, and an FPGA operation unit 114;
the external memory 112 is configured to store network parameters of the neural network generated by using the method for generating the neural network provided in any embodiment of the present disclosure; wherein, the bit number of the network parameters of at least two network layers in the neural network is different;
the processor 111 is configured to read network parameters of the neural network into the memory 113, and input data on the memory 113 and an image to be processed into the FPGA operation unit 114;
The FPGA operation unit 114 is configured to perform operation processing according to the image to be processed and the network parameter, so as to obtain a processing result of the image.
The embodiment of the present disclosure further provides an electronic device 120. As shown in fig. 12, which is a schematic structural diagram of the electronic device 120 provided in the embodiment of the present disclosure, the electronic device includes: a processor 121, a memory 122, and a bus 123. The memory 122 stores machine-readable instructions executable by the processor 121 (for example, execution instructions corresponding to a device executing a face recognition task); when the electronic device 120 is running, the processor 121 communicates with the memory 122 via the bus 123, and the machine-readable instructions, when executed by the processor 121, perform the following processing:
generating a super neural network for the neural network to be compressed; the super neural network comprises a plurality of sub-networks corresponding to each network layer in the multi-layer network, wherein the bit widths of the in-network parameters differ among the different sub-networks corresponding to the same network layer;
performing each round of multiple rounds of training on the super neural network, and generating the compressed neural network based on the super neural network after the multiple rounds of training, in the following manner:
inputting an obtained sample image into the current super neural network for processing, to obtain a prediction result corresponding to the sample image;
generating loss information based on the prediction result, a labeling result, the index values characterizing the importance degree of each sub-network in each network layer of the current super neural network, the in-network parameters of the different sub-networks corresponding to each network layer of the current super neural network, and a computation cost constraint, and adjusting, based on the loss information, the in-network parameters of the different sub-networks in the current super neural network and the index values characterizing the importance degree of each sub-network.
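The training round described above can be sketched in miniature. In the toy below, the softmax weighting of sub-network outputs, the mean-squared task loss, the barrier-style cost term, and all names and constants are illustrative assumptions, not the disclosure's exact formulation; the sketch only shows how importance index values, in-network parameters, and a computation cost constraint can jointly enter one loss, and how target sub-networks can be selected afterwards.

```python
import numpy as np

np.random.seed(0)
BIT_CHOICES = (2, 4, 8)     # hypothetical candidate bit widths per layer
BIT_BUDGET = 5.0            # illustrative cap on the average bit width (cost constraint)
PENALTY = 0.1               # stand-in for the preset barrier penalty coefficient

def softmax(a):
    e = np.exp(a - np.max(a))
    return e / e.sum()

def mixed_layer(x, weight_candidates, alpha):
    """Layer output: importance-weighted sum over the quantized sub-network outputs."""
    p = softmax(alpha)
    return sum(pk * (x @ Wk) for pk, Wk in zip(p, weight_candidates))

def expected_bits(alphas):
    """Importance-weighted average bit width across all layers."""
    per_layer = [softmax(a) @ np.array(BIT_CHOICES, dtype=float) for a in alphas]
    return float(np.mean(per_layer))

def cost_penalty(alphas):
    """Barrier-style term: zero until the expected average bit width exceeds the budget."""
    over = expected_bits(alphas) - BIT_BUDGET
    return PENALTY * max(over, 0.0) ** 2

def total_loss(prediction, label, alphas):
    task_loss = float(np.mean((prediction - label) ** 2))   # stand-in task loss
    return task_loss + cost_penalty(alphas)

def select_subnets(alphas):
    """After training, keep the most important sub-network (bit width) per layer."""
    return [BIT_CHOICES[int(np.argmax(a))] for a in alphas]
```

In an actual implementation, gradients of `total_loss` would adjust both the candidate in-network parameters and the importance indices `alphas` in each round, and `select_subnets` would then assemble the compressed network from the winning sub-network of every layer.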
Optionally, the machine-readable instructions, when executed by the processor 121, perform the following: acquiring an image to be processed; and processing the image to be processed by using the neural network generated by the neural network generation method provided in any embodiment of the present disclosure, to obtain a processing result of the image to be processed.
Optionally, the machine-readable instructions, when executed by the processor 121, perform the following: acquiring a road image; processing the road image by using the neural network generated by the neural network generation method provided in any embodiment of the present disclosure, to obtain a processing result of the road image; and controlling, according to the processing result, an intelligent driving device that acquires the road image.
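A minimal, hypothetical sketch of that acquire-process-control loop follows; all class and method names are stand-ins invented for illustration, not APIs from the disclosure.

```python
class Camera:
    """Stand-in image source for a road-facing camera."""
    def capture(self):
        return [[0.0] * 4] * 4               # placeholder for a road image

class CompressedNetwork:
    """Stand-in for the compressed neural network produced by the method."""
    def process(self, image):
        return {"obstacle": False}           # placeholder processing result

class Controller:
    """Stand-in for the intelligent driving device's controller."""
    def __init__(self):
        self.last_command = None
    def apply(self, result):
        self.last_command = "brake" if result["obstacle"] else "keep_lane"

def control_loop(camera, network, controller):
    """Acquire a road image, run the compressed network, act on the result."""
    image = camera.capture()
    result = network.process(image)
    controller.apply(result)
    return result
```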
The disclosed embodiments also provide a computer-readable storage medium having stored thereon a computer program which, when executed by a processor, performs the steps of the neural network generation method described in the method embodiments described above, or performs the steps of the image processing method described in the method embodiments described above.
The computer program product of the method provided by the embodiments of the present disclosure includes a computer readable storage medium storing a program code, where the program code includes instructions for executing the steps of the method described in the foregoing method embodiments, and specifically reference may be made to the foregoing method embodiments, which are not repeated herein.
The disclosed embodiments also provide a computer program which, when executed by a processor, implements any of the methods of the foregoing embodiments. The corresponding computer program product may be realized by hardware, by software, or by a combination thereof. In an alternative embodiment, the computer program product is embodied as a computer storage medium; in another alternative embodiment, it is embodied as a software product, such as a software development kit (SDK).
It will be clear to those skilled in the art that, for convenience and brevity of description, the specific working procedures of the system and apparatus described above may refer to the corresponding procedures in the foregoing method embodiments, and are not described here again. In the several embodiments provided in the present disclosure, it should be understood that the disclosed systems, devices, and methods may be implemented in other manners. The apparatus embodiments described above are merely illustrative; for example, the division into units is merely a division by logical function, and other divisions are possible in actual implementation; likewise, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some communication interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described as separate units may or may not be physically separate, and units shown as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, each functional unit in each embodiment of the present disclosure may be integrated in one processing unit, or each unit may exist alone physically, or two or more units may be integrated in one unit.
If the functions are implemented in the form of software functional units and sold or used as a stand-alone product, they may be stored in a processor-executable non-volatile computer-readable storage medium. Based on such understanding, the technical solution of the present disclosure, in essence, or the part thereof contributing to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing an electronic device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present disclosure. The aforementioned storage medium includes various media capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disk.
Finally, it should be noted that the foregoing examples are merely specific embodiments of the present disclosure, used to illustrate rather than to limit its technical solutions, and the protection scope of the present disclosure is not limited thereto. Although the present disclosure has been described in detail with reference to the foregoing embodiments, those skilled in the art should understand that any person familiar with the art may, within the technical scope disclosed herein, still modify the technical solutions described in the foregoing embodiments, readily conceive of changes to them, or make equivalent substitutions for some of their technical features; such modifications, changes, or substitutions do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present disclosure, and shall all be covered by the protection scope of the present disclosure. Therefore, the protection scope of the present disclosure shall be subject to the protection scope of the claims.

Claims (13)

1. A method for generating a neural network, the method comprising:
generating a super neural network to be compressed; the super neural network comprises a plurality of sub-networks corresponding to each network layer in the multi-layer network, wherein the bit widths of the in-network parameters differ among the different sub-networks corresponding to the same network layer;
each round of multiple rounds of training is performed on the super neural network, and a compressed neural network is generated based on the super neural network after the multiple rounds of training, in the following manner:
inputting the obtained sample image into a current super neural network for processing to obtain a prediction result corresponding to the sample image; wherein the sample image comprises a road image;
generating loss information based on the prediction result, a labeling result, index values characterizing the importance degree of each sub-network in each network layer of the current super neural network, the in-network parameters of the different sub-networks corresponding to each network layer of the current super neural network, and a computation cost constraint, and adjusting, based on the loss information, the in-network parameters of the different sub-networks in the current super neural network and the index values characterizing the importance degree of each sub-network;
The compressed neural network is used for processing road images in the driving process and controlling intelligent driving equipment for acquiring the road images based on processing results;
the generating of the super neural network to be compressed comprises the following steps:
for each network layer in the multi-layer network of the initial neural network, quantizing the in-network parameters of that layer multiple times, to obtain the in-network parameters corresponding to each of the multiple quantizations;
obtaining the plurality of sub-networks corresponding to that layer based on the in-network parameters corresponding to each of the multiple quantizations and the network structure of that layer; and
forming the super neural network to be compressed based on the plurality of sub-networks corresponding to each network layer in the multi-layer network.
2. The method for generating a neural network according to claim 1, wherein forming the super neural network to be compressed based on the plurality of sub-networks respectively corresponding to each network layer in the multi-layer network comprises:
initializing the index values of the importance degrees corresponding to the plurality of sub-networks of each network layer, and forming the super neural network to be compressed based on the initialization result and the plurality of sub-networks corresponding to each network layer in the multi-layer network.
3. The method for generating a neural network according to claim 1 or 2, wherein the adjusting the in-network parameters of the different sub-networks in the current super neural network and the index values characterizing the importance degree of each sub-network comprises:
generating first loss information based on the prediction result and the labeling result of the sample image, the index values characterizing the importance degree of each sub-network in each network layer of the current super neural network, and the in-network parameters of the different sub-networks corresponding to each network layer of the current super neural network, and adjusting the in-network parameters of the different sub-networks in the current super neural network based on the first loss information; and
generating second loss information based on the prediction result, the labeling result, the index values characterizing the importance degree of each sub-network in each network layer of the current super neural network, the adjusted in-network parameters of the different sub-networks corresponding to each network layer, and the computation cost constraint, and adjusting the index values characterizing the importance degree of each sub-network in the current super neural network based on the second loss information.
4. The method for generating a neural network according to claim 1 or 2, wherein the computation cost constraint comprises one or more of: a maximum number of iteration rounds, a maximum threshold of the average bit width of the in-network parameters of each network layer in the super neural network, and a preset barrier penalty coefficient.
5. The method for generating a neural network according to claim 1 or 2, wherein generating the compressed neural network based on the super neural network after the multiple rounds of training comprises:
determining the weight of each sub-network in each network layer of the super neural network after the multiple rounds of training, based on the index value of the importance degree of each sub-network in each network layer after the multiple rounds of training;
determining, from the plurality of sub-networks corresponding to each network layer, a target sub-network corresponding to that layer, based on the weight of each sub-network in each network layer of the super neural network after the multiple rounds of training; and
forming the compressed neural network based on the target sub-networks respectively corresponding to the network layers.
6. An image processing method, comprising:
acquiring an image to be processed;
Processing the image to be processed by using the neural network generated based on any one of the methods of claims 1-5 to obtain a processing result of the image to be processed.
7. An intelligent travel control method is characterized by comprising the following steps:
acquiring a road image;
processing the road image by using a neural network generated based on any one of the methods of claims 1-5 to obtain a processing result of the road image;
and controlling, according to the processing result, an intelligent driving device that acquires the road image.
8. A neural network generation apparatus, comprising:
the generation module is used for generating the super neural network to be compressed; the super neural network comprises a plurality of sub-networks corresponding to each network layer in the multi-layer network, wherein the bit widths of the in-network parameters differ among the different sub-networks corresponding to the same network layer;
the training module is used for carrying out each round of training in multiple rounds of training on the super neural network;
the determining module is used for generating a compressed neural network based on the super neural network after the multiple rounds of training; the compressed neural network is used for processing road images in the driving process and controlling, based on the processing results, an intelligent driving device that acquires the road images;
the training module is specifically used for performing each round of the multiple rounds of training on the super neural network in the following manner:
inputting the obtained sample image into a current super neural network for processing to obtain a prediction result corresponding to the sample image; wherein the sample image comprises a road image;
generating loss information based on the prediction result, a labeling result, index values characterizing the importance degree of each sub-network in each network layer of the current super neural network, the in-network parameters of the different sub-networks corresponding to each network layer of the current super neural network, and a computation cost constraint, and adjusting, based on the loss information, the in-network parameters of the different sub-networks in the current super neural network and the index values characterizing the importance degree of each sub-network;
the generation module, when generating the super neural network to be compressed, is used for:
for each network layer in the multi-layer network of the initial neural network, quantizing the in-network parameters of that layer multiple times, to obtain the in-network parameters corresponding to each of the multiple quantizations;
obtaining the plurality of sub-networks corresponding to that layer based on the in-network parameters corresponding to each of the multiple quantizations and the network structure of that layer; and
forming the super neural network to be compressed based on the plurality of sub-networks corresponding to each network layer in the multi-layer network.
9. An image processing apparatus, comprising:
the first acquisition module is used for acquiring an image to be processed;
the first processing module is used for processing the image to be processed by utilizing the neural network generated by the method according to any one of claims 1-5 to obtain a processing result of the image to be processed.
10. An intelligent travel control device, comprising:
the second acquisition module is used for acquiring road images;
a second processing module, configured to process the road image by using the neural network generated by any one of the methods according to claims 1-5, to obtain a processing result of the road image;
and the control module is used for controlling, according to the processing result, an intelligent driving device that acquires the road image.
11. A hardware platform based on a field programmable gate array FPGA, comprising: the device comprises a processor, an external memory, a memory and an FPGA operation unit;
the external memory is used for storing the network parameters of the neural network generated by the method of any one of claims 1-5, wherein the bit widths of the network parameters of at least two network layers in the neural network differ;
the processor is used for reading the network parameters of the neural network into the memory, and for inputting the data in the memory and the image to be processed into the FPGA operation unit;
and the FPGA operation unit is used for performing operation processing according to the image to be processed and the network parameters to obtain a processing result of the image.
12. An electronic device, comprising: a processor, a memory and a bus, the memory storing machine readable instructions executable by the processor, the processor and the memory communicating over the bus when the electronic device is running, the machine readable instructions when executed by the processor performing the method of generating a neural network according to any one of claims 1 to 5, or performing the image processing method according to claim 6, or performing the intelligent travel control method according to claim 7.
13. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a computer program which, when executed by a processor, performs the method of generating a neural network according to any one of claims 1 to 5, or performs the image processing method according to claim 6, or performs the intelligent travel control method according to claim 7.
CN202010075144.9A 2020-01-22 2020-01-22 Neural network generation, image processing and intelligent driving control method and device Active CN111291866B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010075144.9A CN111291866B (en) 2020-01-22 2020-01-22 Neural network generation, image processing and intelligent driving control method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010075144.9A CN111291866B (en) 2020-01-22 2020-01-22 Neural network generation, image processing and intelligent driving control method and device

Publications (2)

Publication Number Publication Date
CN111291866A CN111291866A (en) 2020-06-16
CN111291866B true CN111291866B (en) 2024-03-26

Family

ID=71029926

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010075144.9A Active CN111291866B (en) 2020-01-22 2020-01-22 Neural network generation, image processing and intelligent driving control method and device

Country Status (1)

Country Link
CN (1) CN111291866B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106709875A (en) * 2016-12-30 2017-05-24 北京工业大学 Compressed low-resolution image restoration method based on combined deep network
CN106791836A (en) * 2016-12-02 2017-05-31 深圳市唯特视科技有限公司 It is a kind of to be based on a pair of methods of the reduction compression of images effect of Multi net voting
WO2019042139A1 (en) * 2017-08-29 2019-03-07 京东方科技集团股份有限公司 Image processing method, image processing apparatus, and a neural network training method

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107767343B (en) * 2017-11-09 2021-08-31 京东方科技集团股份有限公司 Image processing method, processing device and processing equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106791836A (en) * 2016-12-02 2017-05-31 深圳市唯特视科技有限公司 It is a kind of to be based on a pair of methods of the reduction compression of images effect of Multi net voting
CN106709875A (en) * 2016-12-30 2017-05-24 北京工业大学 Compressed low-resolution image restoration method based on combined deep network
WO2019042139A1 (en) * 2017-08-29 2019-03-07 京东方科技集团股份有限公司 Image processing method, image processing apparatus, and a neural network training method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Zhang Na; Qin Pinle; Zeng Jianchao; Li Qi. Grayscale image colorization algorithm based on dense neural networks. Journal of Computer Applications. 2019, (06), full text. *
Geng Lili; Niu Baoning. Survey of deep neural network model compression. Journal of Frontiers of Computer Science and Technology. (09), full text. *

Also Published As

Publication number Publication date
CN111291866A (en) 2020-06-16

Similar Documents

Publication Publication Date Title
EP3543917B1 (en) Dynamic adaptation of deep neural networks
US11429862B2 (en) Dynamic adaptation of deep neural networks
Choi et al. Pact: Parameterized clipping activation for quantized neural networks
Jin et al. Adabits: Neural network quantization with adaptive bit-widths
Sakr et al. Analytical guarantees on numerical precision of deep neural networks
Faraone et al. Syq: Learning symmetric quantization for efficient deep neural networks
WO2022006919A1 (en) Activation fixed-point fitting-based method and system for post-training quantization of convolutional neural network
US12039448B2 (en) Selective neural network pruning by masking filters using scaling factors
Roth et al. Resource-efficient neural networks for embedded systems
CN113326930B (en) Data processing method, neural network training method, related device and equipment
Boo et al. Structured sparse ternary weight coding of deep neural networks for efficient hardware implementations
CN105447498A (en) A client device configured with a neural network, a system and a server system
CN110298446B (en) Deep neural network compression and acceleration method and system for embedded system
US11914670B2 (en) Methods and systems for product quantization-based compression of a matrix
CN114071141A (en) Image processing method and equipment
Kundu et al. Ternary residual networks
CN111291866B (en) Neural network generation, image processing and intelligent driving control method and device
Roth et al. Resource-Efficient Neural Networks for Embedded Systems
Paul et al. Non-iterative online sequential learning strategy for autoencoder and classifier
Bilal et al. Fast codebook generation using pattern based masking algorithm for image compression
Marusic et al. Adaptive prediction for lossless image compression
CN115905546B (en) Graph convolution network literature identification device and method based on resistive random access memory
Aref et al. Robust deep reinforcement learning for interference avoidance in wideband spectrum
US12033070B2 (en) Low resource computational block for a trained neural network
CN115759192A (en) Neural network acceleration method, device, equipment, chip and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant