US20170337472A1 - Passive pruning of filters in a convolutional neural network - Google Patents
- Publication number
- US20170337472A1 (U.S. patent application Ser. No. 15/595,049)
- Authority
- US
- United States
- Prior art keywords
- weights
- neural network
- layer
- block
- training
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
- G06N3/082—Learning methods modifying the architecture, e.g. adding, deleting or silencing nodes or connections
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G06N3/0635—
Description
- This application claims priority to U.S. Patent Application Nos. 62/338,573 and 62/338,797, both filed on May 19, 2016 and incorporated herein by reference in their entirety.
- The present invention relates to neural networks and, more particularly, to filter pruning in convolutional neural networks.
- As convolutional neural networks (CNNs) grow deeper (i.e., involve progressively more layers), the cost of computing inferences increases with the number of parameters and convolution operations involved. These computational costs are particularly relevant when dealing with embedded sensors and mobile devices where computational and power resources are limited. High inference costs pose a similar barrier in contexts where high responsiveness and low latency are needed.
- Existing approaches to reducing the storage and computation costs involve model compression by pruning weights with small magnitudes and then retraining the model. However, pruning parameters does not necessarily reduce computation time, because the majority of the parameters that are removed are from fully connected layers where the computation cost is low. In addition, the resulting sparse models lack optimizations that make computations practical.
- A method of training a neural network includes training a neural network based on training data. Weights of a layer of the neural network are multiplied by an attrition factor. A block of weights is pruned from the layer if the block of weights in the layer has a contribution to an output of the layer that is below a threshold.
- A method of training a neural network includes training a convolutional neural network based on training data. Weights of a layer of the neural network are multiplied by a number less than one. A block of weights in the layer is pruned, a filter corresponding to the block of weights in a subsequent layer in the neural network is pruned, and a block of weights that corresponds to the pruned filter in a subsequent layer in the neural network is pruned if the block of weights in the layer has a contribution to an output of the layer that is below a threshold. The contribution of a block of weights to the output of the layer is calculated as the percentage of the sum of absolute weight values in the layer that is made up by the sum of absolute weight values in the block of weights.
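The contribution calculation recited above can be sketched in a few lines; the function name and the example threshold are illustrative assumptions, not claim language:

```python
def block_contribution(layer_weights, block):
    """Percentage of the layer's total absolute weight that is
    contributed by one block of weights (e.g., one column)."""
    layer_total = sum(abs(w) for row in layer_weights for w in row)
    block_total = sum(abs(w) for w in block)
    return 100.0 * block_total / layer_total

# A 3x3 weight array whose last column has decayed toward zero.
W = [[ 0.5, -0.4,  0.001],
     [ 0.3,  0.6, -0.002],
     [-0.2,  0.1,  0.003]]
column = [row[2] for row in W]

contribution = block_contribution(W, column)
prune = contribution < 10.0  # hypothetical 10% threshold
```

With these illustrative values, the block contributes well under one percent of the layer's total absolute weight, so it would be pruned.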
- A system for training a neural network includes a neural network. A training module is configured to train the neural network based on training data. A pruning module is configured to multiply weights of a layer of the neural network by an attrition factor and to prune a block of weights from the layer if the block of weights in the layer has a contribution to an output of the layer that is below a threshold.
- These and other features and advantages will become apparent from the following detailed description of illustrative embodiments thereof, which is to be read in connection with the accompanying drawings.
- The disclosure will provide details in the following description of preferred embodiments with reference to the following figures wherein:
-
FIG. 1 is a diagram illustrating the pruning of a block of weights and corresponding filter from a convolutional neural network in accordance with an embodiment of the present invention; -
FIG. 2 is a block/flow diagram of a method for pruning weights and filters from a convolutional neural network in accordance with an embodiment of the present invention; -
FIG. 3 is a block diagram of a convolutional neural network system that prunes the convolutional neural network in accordance with an embodiment of the present invention; and -
FIG. 4 is a block diagram of a security system based on pruned convolutional neural network classifiers in accordance with an embodiment of the present invention. - In accordance with the present principles, systems and methods are provided for active pruning of filters in convolutional neural networks (CNNs). During training, the present embodiments reduce the magnitude of all weights between iterations, driving the weight values toward zero. Once a set of weights falls below a threshold, the weights are removed from the CNN along with associated kernels, thereby reducing the computational cost of using the pruned CNN without increasing the sparsity of the CNN. Because sparsity does not increase, the present embodiments do not necessitate the use of sparse libraries or specialized hardware. The number of filters that are pruned correlates directly with computational acceleration, because each pruned filter reduces the number of matrix multiplications.
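The effect of multiplying every weight by an attrition factor on each iteration is geometric decay; a quick sketch using the factor 0.9999 from the detailed description (the iteration count here is an arbitrary illustration):

```python
a = 0.9999   # attrition factor, slightly less than one
w = 1.0      # a weight that training never reinforces

for _ in range(100_000):   # one multiplication per training iteration
    w *= a

# After 100,000 iterations the unreinforced weight has shrunk by a
# factor of roughly e**-10, far below any reasonable pruning threshold,
# while weights that training keeps pushing back up survive.
```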
- CNNs are extensively used in image and video recognition, natural language processing, and other machine learning processes. CNNs use multi-dimensional layers of weights to create filters that have small spatial coverage but that extend through the full depth of an input volume. To use the example of an image input, the individual pixels represent the width and height of the input, while the number of colors (e.g., red, green, and blue) represents the depth. Thus, a filter in a CNN being used to process image data would apply to a limited number of pixels, but to all of the color information for those pixels. The filter is convolved across the width and height of the input volume, with dot products being calculated between entries of the filter and the input at each position.
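The sliding dot product described above can be made concrete with a small pure-Python sketch (no padding, stride 1; the shapes and values are illustrative):

```python
def convolve2d_full_depth(volume, filt):
    """Slide a filter across the width and height of an input volume,
    taking a dot product over the filter window and the full depth
    at each position (no padding, stride 1)."""
    H, W, D = len(volume), len(volume[0]), len(volume[0][0])
    fh, fw = len(filt), len(filt[0])  # the filter spans the full depth D
    out = []
    for i in range(H - fh + 1):
        row = []
        for j in range(W - fw + 1):
            s = 0.0
            for di in range(fh):
                for dj in range(fw):
                    for d in range(D):
                        s += volume[i + di][j + dj][d] * filt[di][dj][d]
            row.append(s)
        out.append(row)
    return out

# 3x3 input with depth 2 (e.g., two color channels), 2x2 filter
vol = [[[1, 0], [2, 1], [0, 1]],
       [[0, 1], [1, 0], [1, 1]],
       [[1, 1], [0, 0], [2, 1]]]
f = [[[1, 0], [0, 1]],
     [[0, 1], [1, 0]]]
feature_map = convolve2d_full_depth(vol, f)  # 2x2 output
```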
- The present embodiments actively drive weights to zero during training and prune convolutional filters associated with low-magnitude weights. Thus, convolutional filters that are not maintained during training are driven down to zero. This results in an efficient network that involves fewer convolutional operations.
- Referring now in detail to the figures in which like numerals represent the same or similar elements and initially to
FIG. 1, a diagram of active pruning of a CNN is shown. An input layer 102 is provided with neurons that perform a processing function on an input volume 102 that may represent, for example, an image, a frame of video, a document, or any other appropriate set of multi-dimensional input data. The input layer 102 in this example includes three dimensions (e.g., x, y, and n). The neurons can be grouped into n filters, each filter having dimensions x and y. The outputs of the input layer 102 are provided to a first array of weights 104. - During training, the values of the
weights 104 are multiplied by an attrition factor a that is less than 1 (e.g., a=0.9999). Thus, during each iteration of training, those weights which are not enhanced by the training process will eventually decrease in magnitude until they fall below a threshold. In this example, a column 106 has fallen below the threshold, representing weights which do not contribute to the accuracy of the output. This column 106 is pruned from the first array of weights 104. - The first array of
weights 104 provides its output to a layer of hidden neurons 108. The pruned column 106 corresponds to one filter 110 that is pruned from the layer of hidden neurons 108. The layer of hidden neurons 108 performs a computational function and provides an output to a second array of weights 112. - The
pruned filter 110 in turn corresponds to a row 114 of the second array of weights 112, which is also pruned. The second array of weights 112 provides its output to a layer of output neurons 116 (or, alternatively, additional hidden layers), which performs a computational function and provides the output of the CNN. The active pruning of weights significantly reduces the number of computations needed to produce the output. - Referring now to
FIG. 2, a method for training a CNN is shown. The present embodiments perform training by performing a forward pass 202 through a CNN to provide a calculation. This forward pass can be expressed as, e.g., K=I×W, where W represents an array of weights, I represents the data input to the array of weights, and K represents the output of the array of weights. Block 204 then performs a backward pass, B=G×W, where G is a gradient input to the array of weights W, providing the back-propagated gradient B. Block 206 performs a learning pass ΔW=I×G, which adjusts the weights in the weight array by ΔW. - After performing the learning pass, the weights are driven toward zero in
block 208 by multiplying the weights by an attrition factor a that is less than one (e.g., a=0.9999). Block 210 determines whether any block of weight values has dropped below a threshold contribution to the output (e.g., 10%). The contribution to the output of a block of weights (e.g., a column or row of weights) may be determined as a sum of absolute weight values. The sum of absolute weight values for each block of weights can then be compared to the total sum of absolute weight values to determine the contribution of that block of weights. - If a block of weight values has dropped below the threshold level of contribution toward the output, block 212 prunes those weights, and any associated filters and weights on other layers.
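The patent states the three passes abstractly as K=I×W, B=G×W, and ΔW=I×G; in a concrete implementation the backward and learning passes involve transposes so that the shapes line up. A minimal pure-Python sketch of one iteration, with an assumed learning rate, followed by the block 208 attrition step:

```python
def matmul(A, B):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*B)]
            for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

I = [[1.0, 2.0], [0.5, -1.0]]   # two inputs with two features each
W = [[0.3, -0.2], [0.1, 0.4]]   # 2x2 weight array
G = [[0.1, 0.0], [0.0, 0.1]]    # gradient arriving at the output of W

K = matmul(I, W)                 # forward pass (block 202): K = I x W
B = matmul(G, transpose(W))      # backward pass (block 204): back-propagated gradient
dW = matmul(transpose(I), G)     # learning pass (block 206): weight adjustment
lr, a = 0.1, 0.9999              # assumed learning rate; attrition factor
W = [[(w - lr * d) * a for w, d in zip(w_row, d_row)]
     for w_row, d_row in zip(W, dW)]   # update, then drive toward zero (block 208)
```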
Block 214 then determines whether training is complete (e.g., whether the output of the trained CNN matches an expected output). If so, training completes. If not, processing returns to block 202. - For simpler CNNs, any of the filters in any convolutional layer can be easily pruned. However, for complex network architectures, pruning may not be straightforward. Complex architectures may impose restrictions, such that filters need to be pruned carefully. In one example, correspondences between filters in different layers may necessitate pruning those filters together to permit pruning of a given convolutional layer.
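The correspondence described above — a pruned filter removes a column from the weight array before it and a row from the weight array after it — can be sketched as follows (the array sizes are illustrative, and the filter index is assumed to have been selected by the threshold test):

```python
def prune_filter(W1, W2, k):
    """Remove filter k: drop column k of the preceding weight array and
    row k of the following one. Both arrays stay dense, just smaller."""
    W1_pruned = [row[:k] + row[k + 1:] for row in W1]
    W2_pruned = W2[:k] + W2[k + 1:]
    return W1_pruned, W2_pruned

# 3 inputs -> 4 hidden filters -> 2 outputs (illustrative sizes)
W1 = [[0.5, 0.1, 0.0001, -0.3],
      [0.4, 0.2, -0.0002, 0.6],
      [-0.1, 0.3, 0.0003, 0.2]]
W2 = [[0.2, 0.4],
      [0.1, -0.1],
      [0.05, 0.0],
      [-0.6, 0.3]]

W1p, W2p = prune_filter(W1, W2, 2)   # filter 2 has decayed toward zero
```

Because whole columns and rows are removed, the pruned arrays remain dense matrices, which is why no sparse-matrix library or specialized hardware is needed afterward.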
- The above-described method is applied on a per-layer basis until the entire CNN is trained. At each layer, pruning may be repeated until the validation error rises beyond some threshold. When training neural networks, the validation set is a set of data that have not been used in training, but instead are processed by the network in forward passes to determine how accurately the network classifies data it has not seen during training. During training, the error rate on the validation set generally decreases. Once the error rate stops decreasing, the neural net is generally considered to be trained. Thus, pruning as part of the training process stops at this point as well.
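The stopping rule — prune until the validation error rises beyond some threshold — reduces to a small amount of bookkeeping. A sketch, where the error history and the tolerance are hypothetical values:

```python
def pruning_passes_to_keep(val_errors, tolerance=0.01):
    """Given validation error measured after each successive pruning pass,
    return how many passes to keep: stop at the first pass whose error
    rises more than `tolerance` above the best error seen so far."""
    best = float("inf")
    for step, err in enumerate(val_errors):
        if err > best + tolerance:
            return step
        best = min(best, err)
    return len(val_errors)

# Hypothetical validation error after 0, 1, 2, ... pruning passes
history = [0.080, 0.079, 0.079, 0.081, 0.095]
passes = pruning_passes_to_keep(history)   # stops when error jumps to 0.095
```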
- Embodiments described herein may be entirely hardware, entirely software or including both hardware and software elements. In a preferred embodiment, the present invention is implemented in software, which includes but is not limited to firmware, resident software, microcode, etc.
- Embodiments may include a computer program product accessible from a computer-usable or computer-readable medium providing program code for use by or in connection with a computer or any instruction execution system. A computer-usable or computer readable medium may include any apparatus that stores, communicates, propagates, or transports the program for use by or in connection with the instruction execution system, apparatus, or device. The medium can be magnetic, optical, electronic, electromagnetic, infrared, or semiconductor system (or apparatus or device) or a propagation medium. The medium may include a computer-readable storage medium such as a semiconductor or solid state memory, magnetic tape, a removable computer diskette, a random access memory (RAM), a read-only memory (ROM), a rigid magnetic disk and an optical disk, etc.
- Each computer program may be tangibly stored in a machine-readable storage media or device (e.g., program memory or magnetic disk) readable by a general or special purpose programmable computer, for configuring and controlling operation of a computer when the storage media or device is read by the computer to perform the procedures described herein. The inventive system may also be considered to be embodied in a computer-readable storage medium, configured with a computer program, where the storage medium so configured causes a computer to operate in a specific and predefined manner to perform the functions described herein.
- A data processing system suitable for storing and/or executing program code may include at least one processor coupled directly or indirectly to memory elements through a system bus. The memory elements can include local memory employed during actual execution of the program code, bulk storage, and cache memories which provide temporary storage of at least some program code to reduce the number of times code is retrieved from bulk storage during execution. Input/output or I/O devices (including but not limited to keyboards, displays, pointing devices, etc.) may be coupled to the system either directly or through intervening I/O controllers.
- Network adapters may also be coupled to the system to enable the data processing system to become coupled to other data processing systems or remote printers or storage devices through intervening private or public networks. Modems, cable modem and Ethernet cards are just a few of the currently available types of network adapters.
- Referring now to
FIG. 3, a CNN system 300 is shown. The system 300 includes a hardware processor 302 and memory 304. A CNN 306 is implemented either in hardware or in software. The CNN 306 takes input data and generates an output based on the filters and weights that make up the CNN's configuration. The system 300 furthermore includes one or more functional modules that may, in some embodiments, be implemented as software that is stored in the memory 304 and executed by the hardware processor 302. In alternative embodiments, the functional modules may be implemented as one or more discrete hardware components in the form of, e.g., application-specific integrated chips or field-programmable gate arrays. - In particular, a
training module 308 trains theCNN 306 based on training data. The training data includes one set of data used to train theCNN 306 and another set of data used to test theCNN 306, with differences between the outcome of the 306 and expected outcome from the testing data being used to adjust theCNN 306. Apruning module 310 actively moves the weights of the CNN toward zero in each round of training and prunes filters from theCNN 306 to reduce the computational complexity. Thetraining module 308 and thepruning module 310 work together as described above to ensure that the output of theCNN 306 is not significantly degraded by pruning. - Referring now to
FIG. 4 , asecurity system 400 is shown as one possible implementation of the present embodiments. Thesecurity system 400 includes ahardware processor 402 and amemory 404. One ormore sensors 406 provide data about a monitored area to thesecurity system 400. Thesensors 406 may include, for example, a camera, a night vision camera (e.g., operating in infrared), door and window sensors, acoustic sensors, temperature sensors, and any other sensors that collect raw data regarding the monitored area. - The
CNN system 300 is included in thesecurity system 400. TheCNN system 300 accepts information that is gathered by thesensors 406 and stored inmemory 404, outputting security status information. TheCNN system 300 may include its ownseparate processor 302 andmemory 304 or may, alternatively, omit those feature in favor of using theprocessor 402 andmemory 404 of thesecurity system 400. - An
alert module 408 accepts the output of theCNN system 300. Thealert module 408 determines if the state of the area being monitored has changed and, if so, whether an alert should be issued. For example, theCNN system 300 may detect movement or the presence of a person or object in a place where it does not belong. Alternatively, theCNN system 300 may detect an intrusion event. In such a situation, thealert module 408 provides an appropriate alert to one or more of the user and a response organization (e.g., medical, police, or fire). Thealert module 408 provide the alert by any appropriate communications mechanism, including by wired or wireless network connections or by a user interface. - A
control module 410 works with thealert module 408 to perform appropriate security management actions. For example, if an unauthorized person is detected by theCNN system 300, thecontrol module 410 may automatically increase a security level and perform such actions as locking doors, increasing sensor sensitivity, and changing the sensitivity of thealert module 408. - Because the
CNN system 300 has been pruned, theCNN system 300 can provide accurate results with relatively low computational complexity, making it possible to implement thesecurity system 400 on lower-power hardware. In particular, theprocessor 402 need not be a high-powered device and may in particular be implemented in an embedded environment. - The foregoing is to be understood as being in every respect illustrative and exemplary, but not restrictive, and the scope of the invention disclosed herein is not to be determined from the Detailed Description, but rather from the claims as interpreted according to the full breadth permitted by the patent laws. It is to be understood that the embodiments shown and described herein are only illustrative of the principles of the present invention and that those skilled in the art may implement various modifications without departing from the scope and spirit of the invention. Those skilled in the art could implement various other feature combinations without departing from the scope and spirit of the invention. Having thus described aspects of the invention, with the details and particularity required by the patent laws, what is claimed and desired protected by Letters Patent is set forth in the appended claims.
Claims (19)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US15/595,049 US10740676B2 (en) | 2016-05-19 | 2017-05-15 | Passive pruning of filters in a convolutional neural network |
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201662338573P | 2016-05-19 | 2016-05-19 | |
US201662338797P | 2016-05-19 | 2016-05-19 | |
US15/595,049 US10740676B2 (en) | 2016-05-19 | 2017-05-15 | Passive pruning of filters in a convolutional neural network |
Publications (2)
Publication Number | Publication Date |
---|---|
US20170337472A1 true US20170337472A1 (en) | 2017-11-23 |
US10740676B2 US10740676B2 (en) | 2020-08-11 |
Family
ID=60330282
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/595,049 Active 2039-05-26 US10740676B2 (en) | 2016-05-19 | 2017-05-15 | Passive pruning of filters in a convolutional neural network |
Country Status (1)
Country | Link |
---|---|
US (1) | US10740676B2 (en) |
Cited By (33)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20200334537A1 (en) * | 2016-06-30 | 2020-10-22 | Intel Corporation | Importance-aware model pruning and re-training for efficient convolutional neural networks |
US11907843B2 (en) * | 2016-06-30 | 2024-02-20 | Intel Corporation | Importance-aware model pruning and re-training for efficient convolutional neural networks |
US11315018B2 (en) * | 2016-10-21 | 2022-04-26 | Nvidia Corporation | Systems and methods for pruning neural networks for resource efficient inference |
US11157815B2 (en) * | 2016-11-15 | 2021-10-26 | Google Llc | Efficient convolutional neural networks and techniques to reduce associated computational costs |
US11157814B2 (en) * | 2016-11-15 | 2021-10-26 | Google Llc | Efficient convolutional neural networks and techniques to reduce associated computational costs |
US11586909B2 (en) * | 2017-01-13 | 2023-02-21 | Kddi Corporation | Information processing method, information processing apparatus, and computer readable storage medium |
US10997502B1 (en) * | 2017-04-13 | 2021-05-04 | Cadence Design Systems, Inc. | Complexity optimization of trainable networks |
US20220092618A1 (en) * | 2017-08-31 | 2022-03-24 | Paypal, Inc. | Unified artificial intelligence model for multiple customer value variable prediction |
US20220207375A1 (en) * | 2017-09-18 | 2022-06-30 | Intel Corporation | Convolutional neural network tuning systems and methods |
US20230053289A1 (en) * | 2017-12-30 | 2023-02-16 | Intel Corporation | Machine learning accelerator mechanism |
US11151428B2 (en) | 2018-01-25 | 2021-10-19 | Samsung Electronics Co., Ltd. | Accelerating long short-term memory networks via selective pruning |
US10657426B2 (en) | 2018-01-25 | 2020-05-19 | Samsung Electronics Co., Ltd. | Accelerating long short-term memory networks via selective pruning |
CN108388537A (en) * | 2018-03-06 | 2018-08-10 | 上海熠知电子科技有限公司 | A kind of convolutional neural networks accelerator and method |
US10936913B2 (en) * | 2018-03-20 | 2021-03-02 | The Regents Of The University Of Michigan | Automatic filter pruning technique for convolutional neural networks |
CN110533156A (en) * | 2018-05-23 | 2019-12-03 | 富士通株式会社 | The method and apparatus for improving the processing speed of convolutional neural networks |
US11449728B2 (en) | 2018-07-01 | 2022-09-20 | Al Falcon Ltd. | Method of optimization of operating a convolutional neural network and system thereof |
CN108846476A (en) * | 2018-07-13 | 2018-11-20 | 电子科技大学 | A kind of intelligent terminal security level classification method based on convolutional neural networks |
WO2020091139A1 (en) * | 2018-10-31 | 2020-05-07 | 주식회사 노타 | Effective network compression using simulation-guided iterative pruning |
CN109598340A (en) * | 2018-11-15 | 2019-04-09 | 北京知道创宇信息技术有限公司 | Method of cutting out, device and the storage medium of convolutional neural networks |
US11615309B2 (en) * | 2019-02-27 | 2023-03-28 | Oracle International Corporation | Forming an artificial neural network by generating and forming of tunnels |
US20200272904A1 (en) * | 2019-02-27 | 2020-08-27 | Oracle International Corporation | Forming an artificial neural network by generating and forming of tunnels |
CN110097177A (en) * | 2019-05-15 | 2019-08-06 | 电科瑞达(成都)科技有限公司 | A kind of network pruning method based on pseudo- twin network |
CN110458289A (en) * | 2019-06-10 | 2019-11-15 | 北京达佳互联信息技术有限公司 | The construction method of multimedia class model, multimedia class method and device |
CN110458289B (en) * | 2019-06-10 | 2022-06-10 | 北京达佳互联信息技术有限公司 | Multimedia classification model construction method, multimedia classification method and device |
CN110287857A (en) * | 2019-06-20 | 2019-09-27 | 厦门美图之家科技有限公司 | A kind of training method of characteristic point detection model |
CN110276452A (en) * | 2019-06-28 | 2019-09-24 | 北京中星微电子有限公司 | Pruning method, device, equipment and the artificial intelligence chip of neural network model |
CN110909667A (en) * | 2019-11-20 | 2020-03-24 | 北京化工大学 | Lightweight design method for multi-angle SAR target recognition network |
US11681922B2 (en) * | 2019-11-26 | 2023-06-20 | Numenta, Inc. | Performing inference and training using sparse neural network |
US20230274150A1 (en) * | 2019-11-26 | 2023-08-31 | Numenta, Inc. | Performing Inference And Training Using Sparse Neural Network |
US20210158168A1 (en) * | 2019-11-26 | 2021-05-27 | Numenta, Inc. | Performing Inference and Training Using Sparse Neural Network |
JP2021124949A (en) * | 2020-02-05 | 2021-08-30 | 株式会社東芝 | Machine learning model compression system, pruning method, and program |
JP7242590B2 (en) | 2020-02-05 | 2023-03-20 | 株式会社東芝 | Machine learning model compression system, pruning method and program |
CN113033804A (en) * | 2021-03-29 | 2021-06-25 | 北京理工大学重庆创新中心 | Convolution neural network compression method for remote sensing image |
Also Published As
Publication number | Publication date |
---|---|
US10740676B2 (en) | 2020-08-11 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US10740676B2 (en) | Passive pruning of filters in a convolutional neural network | |
US10885437B2 (en) | Security system using a convolutional neural network with pruned filters | |
US10402653B2 (en) | Large margin high-order deep learning with auxiliary tasks for video-based anomaly detection | |
CN106982359B (en) | A kind of binocular video monitoring method, system and computer readable storage medium | |
KR102563752B1 (en) | Training method for neural network, recognition method using neural network, and devices thereof | |
KR102492318B1 (en) | Model training method and apparatus, and data recognizing method | |
CN110647918B (en) | Mimicry defense method for resisting attack by deep learning model | |
US9047568B1 (en) | Apparatus and methods for encoding of sensory data using artificial spiking neurons | |
CN106796580B (en) | Method, apparatus, and medium for processing multiple asynchronous event driven samples | |
WO2020046806A1 (en) | Unsupervised anomaly detection, diagnosis, and correction in multivariate time series data | |
CN113449864B (en) | Feedback type impulse neural network model training method for image data classification | |
US11815893B1 (en) | Apparatus and method for monitoring and controlling of a neural network using another neural network implemented on one or more solid-state chips | |
US11606393B2 (en) | Node classification in dynamic networks using graph factorization | |
CN110458109A (en) | A kind of tealeaves disease recognition system and working method based on image recognition technology | |
US20210326661A1 (en) | Determining an explanation of a classification | |
JP7217761B2 (en) | Abnormal device detection from communication data | |
KR20160132032A (en) | Blink and averted gaze avoidance in photographic images | |
US11468276B2 (en) | System and method of a monotone operator neural network | |
WO2016053748A1 (en) | Vibration signatures for prognostics and health monitoring of machinery | |
CN107038450A (en) | Unmanned plane policing system based on deep learning | |
Poon et al. | Driver distracted behavior detection technology with YOLO-based deep learning networks | |
JP2023118101A (en) | Device and method for determining adversarial patch for machine learning system | |
US11289175B1 (en) | Method of modeling functions of orientation and adaptation on visual cortex | |
KR102488281B1 (en) | Method for adaptively controling precision of activation and weight of artificial neural network and apparatus thereof | |
Popov et al. | Recognition of Dynamic Targets using a Deep Convolutional Neural Network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment |
Owner name: NEC LABORATORIES AMERICA, INC., NEW JERSEY Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:DURDANOVIC, IGOR;GRAF, HANS PETER;REEL/FRAME:042378/0906 Effective date: 20170511 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: DOCKETED NEW CASE - READY FOR EXAMINATION |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: NOTICE OF ALLOWANCE MAILED -- APPLICATION RECEIVED IN OFFICE OF PUBLICATIONS |
|
AS | Assignment |
Owner name: NEC CORPORATION, JAPAN Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:NEC LABORATORIES AMERICA, INC.;REEL/FRAME:052996/0574 Effective date: 20200616 |
|
STPP | Information on status: patent application and granting procedure in general |
Free format text: PUBLICATIONS -- ISSUE FEE PAYMENT VERIFIED |
|
STCF | Information on status: patent grant |
Free format text: PATENTED CASE |
|
MAFP | Maintenance fee payment |
Free format text: PAYMENT OF MAINTENANCE FEE, 4TH YEAR, LARGE ENTITY (ORIGINAL EVENT CODE: M1551); ENTITY STATUS OF PATENT OWNER: LARGE ENTITY Year of fee payment: 4 |