WO2018052496A1 - Method for object detection in digital image and video using spiking neural networks - Google Patents

Method for object detection in digital image and video using spiking neural networks

Info

Publication number
WO2018052496A1
Authority
WO
WIPO (PCT)
Prior art keywords
generating
intensity
gaussian
input image
color
Prior art date
Application number
PCT/US2017/034093
Other languages
English (en)
Inventor
Yongqiang CAO
Qin Jiang
Yang Chen
Deepak Khosla
Original Assignee
Hrl Laboratories, Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from US15/269,777 external-priority patent/US10198689B2/en
Application filed by Hrl Laboratories, Llc filed Critical Hrl Laboratories, Llc
Priority to EP17851211.7A priority Critical patent/EP3516592A4/fr
Priority to CN201780050666.XA priority patent/CN109643390B/zh
Publication of WO2018052496A1 publication Critical patent/WO2018052496A1/fr

Links

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/06Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons
    • G06N3/063Physical realisation, i.e. hardware implementation of neural networks, neurons or parts of neurons using electronic means
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/047Probabilistic or stochastic networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/049Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/44Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G06V10/443Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components by matching or filtering
    • G06V10/449Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters
    • G06V10/451Biologically inspired filters, e.g. difference of Gaussians [DoG] or Gabor filters with interaction between the filter responses, e.g. cortical complex cells
    • G06V10/454Integrating the filters into a hierarchical structure, e.g. convolutional neural networks [CNN]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/70Arrangements for image or video recognition or understanding using pattern recognition or machine learning
    • G06V10/82Arrangements for image or video recognition or understanding using pattern recognition or machine learning using neural networks

Definitions

  • The present invention relates to a system for object detection and, more particularly, to a system for object detection using spiking neural networks.
  • Moving object detection models, or motion models, are good at detecting moving objects in videos taken from a stationary camera (i.e., where the background is not moving).
  • Motion models are not good at detecting still objects, or moving objects in videos taken from a moving camera, because the background is moving too.
  • Saliency models can detect salient objects, whether the objects are moving or not; however, they can miss objects of interest that are not salient.
  • the present invention relates to a system for object detection and, more particularly, to a system for object detection using spiking neural networks.
  • the system comprises one or more processors and a memory having instructions such that when the instructions are executed, the one or more processors perform multiple operations.
  • the system generates an intensity saliency map from an intensity of an input image having color components using a spiking neural network.
  • a color saliency map is generated from each color component in the input image using a spiking neural network.
  • An object detection model is generated by combining the intensity saliency map and at least one color saliency map. The object detection model is used to detect multiple objects of interest in the input image.
  • a plurality of spikes are generated from the intensity of the input image.
  • the plurality of spikes are convolved with Gaussian kernels to generate a plurality of Gaussian maps, each Gaussian map having a different scale.
  • a set of feature maps are generated from the plurality of Gaussian maps.
  • a set of final feature maps are generated by adding the set of feature maps, and the intensity saliency map is generated by adding the set of final feature maps.
  • a plurality of spikes are generated for each color component in the input image. For each color component, the plurality of spikes are convolved with Gaussian kernels to generate a plurality of Gaussian maps, each Gaussian map having a different scale.
  • For each color component, a set of feature maps is generated from the plurality of Gaussian maps. For each color component, a set of final feature maps is generated by adding the set of feature maps, and a color saliency map is generated by adding the set of final feature maps.
  • spikes from each intensity saliency map and color saliency map are accumulated, and a threshold is applied to the accumulated spikes.
  • a final saliency spike activity is obtained, and object detection boxes are obtained from the final saliency spike activity.
  • the color components are normalized according to an overall intensity of the input image.
  • normalizing includes increasing spike activity for a bright image.
  • normalizing includes reducing spike activity for a dark image.
  • both salient and less salient objects of interest are detected in the input image.
  • the object detection model is implemented in low power spiking neuromorphic hardware.
  • the present invention also comprises a method for causing a processor to perform the operations described herein.
  • Finally, the present invention also comprises a computer program product comprising computer-readable instructions stored on a non-transitory computer-readable medium that are executable by a computer having one or more processors for causing the processors to perform the operations described herein.
  • FIG. 1 is a block diagram depicting the components of a system for object detection using spiking neural networks according to various embodiments of the present disclosure
  • FIG. 2 is an illustration of a computer program product according to various embodiments of the present disclosure
  • FIG. 3 is an illustration of a network structure for an intensity saliency map according to various embodiments of the present disclosure
  • FIG. 4 is an illustration of a network structure for a color saliency map using blue/yellow opponent color channels as an example according to various embodiments of the present disclosure
  • FIG. 5 is an illustration of the combination of multiple channels to detect objects in various sizes and colors according to various embodiments of the present disclosure
  • FIG. 6A is an input image according to various embodiments of the present disclosure
  • FIG. 6B is an illustration of a small-scale intensity channel detecting a still person in FIG. 6A according to various embodiments of the present disclosure
  • FIG. 7A is an input image according to various embodiments of the present disclosure.
  • FIG. 7B is an illustration of a medium-scale blue color channel detecting a blue car in FIG. 7A according to various embodiments of the present disclosure
  • FIG. 8A is an input image according to various embodiments of the present disclosure
  • FIG. 8B is an illustration of a small-scale blue color channel detecting a person standing behind a blue car and two cyclists in FIG. 8A according to various embodiments of the present disclosure
  • FIG. 9A is a bright input image according to various embodiments of the present disclosure
  • FIG. 9B is an illustration of a medium-scale blue color channel result for FIG. 9A without brightness normalization according to various embodiments of the present disclosure
  • FIG. 9C is a dark input image according to various embodiments of the present disclosure.
  • FIG. 9D is an illustration of a medium-scale blue color channel result for FIG. 9C without brightness normalization according to various embodiments of the present disclosure
  • FIG. 10A is an illustration of a medium-scale blue color channel result for FIG. 9A with brightness normalization according to various embodiments of the present disclosure
  • FIG. 10B is an illustration of a medium-scale blue color channel result for FIG. 9C with brightness normalization according to various embodiments of the present disclosure.
  • FIG. 11 is an image result with object detection boxes from combining small-scale intensity, small-scale blue color, and medium-scale blue color channels according to various embodiments of the present disclosure.
  • the present invention relates to a system for object detection and, more particularly, to a system for object detection using spiking neural networks.
  • The following description is presented to enable one of ordinary skill in the art to make and use the invention and to incorporate it in the context of particular applications. Various modifications, as well as a variety of uses in different applications, will be readily apparent to those skilled in the art, and the general principles defined herein may be applied to a wide range of aspects. Thus, the present invention is not intended to be limited to the aspects presented, but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.
  • In the following detailed description, numerous specific details are set forth in order to provide a more thorough understanding of the present invention. However, it will be apparent to one skilled in the art that the present invention may be practiced without necessarily being limited to these specific details. In other instances, well-known structures and devices are shown in block diagram form, rather than in detail, in order to avoid obscuring the present invention.
  • Any element in a claim that does not explicitly state “means for” performing a specified function, or “step for” performing a specific function, is not to be interpreted as a “means” or “step” clause as specified in 35 U.S.C. Section 112, Paragraph 6.
  • The use of “step of” or “act of” in the claims herein is not intended to invoke the provisions of 35 U.S.C. 112, Paragraph 6.
  • the labels left, right, front, back, top, bottom, forward, reverse, clockwise and counter-clockwise have been used for convenience purposes only and are not intended to imply any particular fixed direction. Instead, they are used to reflect relative locations and/or directions between various portions of an object. As such, as the present invention is changed, the above labels may change their orientation.
  • the present invention has three "principal" aspects.
  • the first is a system for object detection using spiking neural networks.
  • the system is typically in the form of a computer system operating software or in the form of a "hard-coded" instruction set. This system may be incorporated into a wide variety of devices that provide different functionalities.
  • the second principal aspect is a method, typically in the form of software, operated using a data processing system (computer).
  • the third principal aspect is a computer program product.
  • the computer program product generally represents computer-readable instructions stored on a non-transitory computer-readable medium such as an optical storage device, e.g., a compact disc (CD) or digital versatile disc (DVD), or a magnetic storage device such as a floppy disk or magnetic tape.
  • Other, non-limiting examples of computer-readable media include hard disks, read-only memory (ROM), and flash-type memories.
  • the computer system 100 is configured to perform calculations, processes, operations, and/or functions associated with a program or algorithm.
  • certain processes and steps discussed herein are realized as a series of instructions (e.g., software program) that reside within computer readable memory units and are executed by one or more processors of the computer system 100. When executed, the instructions cause the computer system 100 to perform specific actions and exhibit specific behavior, such as described herein.
  • The computer system 100 may include an address/data bus 102 that is configured to communicate information. Additionally, one or more data processing units, such as a processor 104 (or processors), are coupled with the address/data bus 102.
  • the processor 104 is configured to process information and instructions.
  • the processor 104 is a microprocessor.
  • the processor 104 may be a different type of processor such as a parallel processor, or a field programmable gate array.
  • the computer system 100 is configured to utilize one or more data storage units.
  • the computer system 100 may include a volatile memory unit 106 (e.g., random access memory (“RAM”), static RAM, dynamic RAM, etc.) coupled with the address/data bus 102, wherein a volatile memory unit 106 is configured to store information and instructions for the processor 104.
  • The computer system 100 further may include a non-volatile memory unit 108 (e.g., read-only memory (“ROM”), programmable ROM (“PROM”), erasable programmable ROM (“EPROM”), electrically erasable programmable ROM (“EEPROM”), flash memory, etc.) coupled with the address/data bus 102, wherein the non-volatile memory unit 108 is configured to store static information and instructions for the processor 104.
  • The computer system 100 may execute instructions retrieved from an online data storage unit, such as in “cloud” computing.
  • the computer system 100 also may include one or more interfaces, such as an interface 110, coupled with the address/data bus 102.
  • the one or more interfaces are configured to enable the computer system 100 to interface with other electronic devices and computer systems.
  • the communication interfaces implemented by the one or more interfaces may include wireline (e.g., serial cables, modems, network adaptors, etc.) and/or wireless (e.g., wireless modems, wireless network adaptors, etc.) communication technology.
  • The computer system 100 may include an input device 112 coupled with the address/data bus 102, wherein the input device 112 is configured to communicate information and command selections to the processor 104.
  • the input device 112 is an alphanumeric input device, such as a keyboard, that may include alphanumeric and/or function keys.
  • The input device 112 may be an input device other than an alphanumeric input device.
  • the input device 112 may include one or more sensors, such as a camera for video or still images, a microphone, or a neural sensor.
  • Other example input devices 112 may include an accelerometer, a GPS sensor, or a gyroscope.
  • The computer system 100 may include a cursor control device 114 coupled with the address/data bus 102, wherein the cursor control device 114 is configured to communicate user input information and/or command selections to the processor 104.
  • the cursor control device 114 is implemented using a device such as a mouse, a track-ball, a track-pad, an optical tracking device, or a touch screen.
  • The cursor control device 114 is directed and/or activated via input from the input device 112, such as in response to the use of special keys and key sequence commands associated with the input device 112.
  • The cursor control device 114 is configured to be directed or guided by voice commands.
  • The computer system 100 further may include one or more optional computer-usable data storage devices, such as a storage device 116, coupled with the address/data bus 102.
  • The storage device 116 is configured to store information and/or computer executable instructions.
  • the storage device 116 is a storage device such as a magnetic or optical disk drive (e.g., hard disk drive (“HDD”), floppy diskette, compact disk read only memory (“CD-ROM”), digital versatile disk (“DVD”)).
  • A display device 118 is coupled with the address/data bus 102, wherein the display device 118 is configured to display video and/or graphics.
  • The display device 118 may include a cathode ray tube (“CRT”), liquid crystal display (“LCD”), field emission display (“FED”), plasma display, or any other display device suitable for displaying video and/or graphic images and alphanumeric characters recognizable to a user.
  • The computer system 100 presented herein is an example computing environment in accordance with an aspect.
  • the non-limiting example of the computer system 100 is not strictly limited to being a computer system.
  • the computer system 100 represents a type of data processing analysis that may be used in accordance with various aspects described herein.
  • Other computing systems may also be implemented.
  • one or more operations of various aspects of the present technology are controlled or implemented using computer-executable instructions, such as program modules, being executed by a computer.
  • program modules include routines, programs, objects, components and/or data structures that are configured to perform particular tasks or implement particular abstract data types.
  • an aspect provides that one or more aspects of the present technology are implemented by utilizing one or more distributed computing environments, such as where tasks are performed by remote processing devices that are linked through a communications network, or such as where various program modules are located in both local and remote computer-storage media including memory-storage devices.
  • An illustrative diagram of a computer program product (i.e., storage device) embodying the present invention is depicted in FIG. 2.
  • The computer program product is depicted as a floppy disk 200 or an optical disk 202 such as a CD or DVD.
  • the computer program product generally represents computer-readable instructions stored on any compatible non-transitory computer-readable medium.
  • The term “instructions”, as used with respect to this invention, generally indicates a set of operations to be performed on a computer, and may represent pieces of a whole program or individual, separable software modules.
  • Non-limiting examples of “instructions” include computer program code (source or object code) and “hard-coded” electronics (i.e., computer operations coded into a computer chip).
  • The “instructions” are stored on any non-transitory computer-readable medium, such as in the memory of a computer or on a floppy disk, a CD-ROM, or a flash drive. In either event, the instructions are encoded on a non-transitory computer-readable medium.
  • The first step is to determine the possible locations in the image where objects may be found.
  • For moving objects in a fixed-camera video, most motion models can do a reasonable job. However, motion models do not work for still objects, and they cannot easily be extended to work for moving-camera videos, because everything, including the background, is moving.
  • the prevailing method to locate objects in an image is based on exhaustive search with a trained classifier for objects of interest. As the total number of windows to evaluate in an exhaustive search is huge, the computational cost is impractical for most applications. On the other hand, when a human looks at a scene, attention plays a key role in locating objects.
  • saliency models (as described in Itti 1998 and Itti 2000) attempt to detect salient spots (regions) in an image by building up a saliency map.
  • saliency models can miss non-salient objects.
  • the system according to embodiments described herein detects still objects in a fixed camera video or objects in a moving camera video (e.g., on a moving platform, such as an unmanned aerial vehicle (UAV)) even when the objects are not salient in other models.
  • Described is a spiking neural network (SNN) model for object detection in images or videos.
  • The SNN implementation maps directly to emerging ultra-low power spiking neuromorphic hardware, such as that described by Cruz-Albrecht et al. in “Energy efficient neuron, synapse and STDP integrated circuits,” IEEE Transactions on Biomedical Circuits and Systems, 6(3), 246-256, 2012, and Merolla et al. in “A million spiking-neuron integrated circuit with a scalable communication network and interface,” Science, Vol. 345, Issue 6197, 668-673, 2014, both of which are incorporated by reference as though fully set forth herein.
  • The system can detect still objects in a fixed-camera video or objects in a moving-camera video (e.g., from an unmanned aerial vehicle (UAV)), whether or not the objects are salient under typical conditions. Further, described is a method for color channel normalization according to overall image brightness, which makes the model according to embodiments of the present disclosure work well in various lighting conditions.
  • the system described herein can detect still objects in a fixed camera video or all interesting objects in a moving camera video. Compared to existing saliency models, it can detect objects of interest that cannot be detected by these models. Compared to traditional methods of object detection using exhaustive search, the present invention provides a very efficient computation model for object detection.
  • The spiking neurons for the neuromorphic implementation according to some embodiments of the present disclosure are all leaky integrate-and-fire type neurons whose membrane potentials (V) are defined by V(t + 1) = λ·V(t) + I(t), where λ is a leakage parameter and I(t) is the weighted sum of all inputs.
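  • As a concrete illustration, the following is a minimal sketch of one common discrete leaky integrate-and-fire update in Python/NumPy; the multiplicative leak, the threshold-and-reset rule, and the parameter values (lam for λ, threshold) are assumptions for illustration, not values taken from the patent.

```python
import numpy as np

def lif_step(V, I, lam=0.9, threshold=1.0):
    """One leaky integrate-and-fire update for a map of neurons:
    V(t+1) = lam * V(t) + I(t), where I is the weighted input sum.
    lam and threshold are hypothetical example values."""
    V = lam * V + I                 # leaky integration of the weighted inputs
    spikes = V >= threshold         # neurons whose potential crossed threshold fire
    V = np.where(spikes, 0.0, V)    # reset fired neurons
    return V, spikes
```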
  • FIG. 3 illustrates the network structure for a neuromorphic implementation of building up a saliency map from image intensity. Dashed arrow lines denote optional connections.
  • spikes 300 are generated from the intensity of an input image 302.
  • The spikes 300 are convolved with 6 Gaussian kernels 304 to generate Gaussian maps of 6 scales (e.g., scale 1 (element 306), scale 2 (element 308), scale 4 (element 310), scale 6 (element 312)).
  • Here, a_{i+m,j+n}(t) denotes the input spikes 300 generated from the input image intensity, so that the weighted input to the neuron at pixel (i,j) is the convolution x_ij(t) = Σ_m Σ_n w_mn · a_{i+m,j+n}(t), where w is the Gaussian kernel.
  • A neuron at pixel (i,j) produces an input spike if and only if rand() < Q·I_ij, where rand() is a random number generator with uniform distribution on (0,1), Q is a constant to scale the frequency of spikes generated, and I_ij is the image intensity at pixel (i,j).
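  • A minimal sketch of this spike-generation rule, using the reconstructed condition rand() < Q·I_ij above (the value of Q is an assumed example, not from the patent):

```python
import numpy as np

def generate_spikes(intensity, Q=1.0 / 255.0, rng=None):
    """Rate-coded input spikes: pixel (i, j) emits a spike on this
    time step iff rand() < Q * I_ij, so brighter pixels spike more
    often. Q scales the overall spike frequency (assumed value)."""
    rng = rng if rng is not None else np.random.default_rng()
    return rng.random(intensity.shape) < Q * intensity
```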
  • The Gaussian maps are the same size in pixels as the input image (element 302). This differs from the model described by Itti et al. (Itti 1998, Itti 2000), in which Gaussian pyramids with images of different sizes, generated by sub-sampling the input image, are used.
  • The next step is to generate ON and OFF feature maps, as depicted in FIG. 3.
  • ON feature maps 314 are generated by subtracting large scale Gaussian maps from small scale Gaussian maps.
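  • A sketch of the full-resolution Gaussian maps and the ON feature maps, assuming SciPy's gaussian_filter as the convolution; the sigma values mirror the scales named above (1, 2, 4, 6), and which scale pairs are differenced is an assumption:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def intensity_feature_maps(spikes, sigmas=(1, 2, 4, 6)):
    """Convolve a spike map with Gaussian kernels of several scales
    (all maps stay the same size as the input image), then form ON
    feature maps as small-scale minus large-scale differences,
    rectified to keep only positive (ON) responses."""
    maps = [gaussian_filter(spikes.astype(float), sigma) for sigma in sigmas]
    on_maps = [np.maximum(maps[i] - maps[j], 0.0)
               for i in range(len(maps))
               for j in range(i + 1, len(maps))]
    return maps, on_maps
```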
  • FIG. 4 illustrates the network structure for a color saliency map, using blue/yellow opponent color channels as an example.
  • The color input image 400 (in rgb (red, green, blue)) is first normalized by the local image intensity (I) to decouple hue from intensity, as follows: c′ = c / I, where c is the color component r, g, or b, and I is the image intensity defined by I = (r + g + b) / 3.
  • This preprocessing step is the same as in the model described by Itti et al. and helps to generate pure color components (i.e., color component generation 402). However, it has a drawback: it generates stronger color signals in a dark image than in a bright image. Therefore, a lightness/brightness normalization 404 process is added to the model according to some embodiments of the present disclosure, as described in further detail below.
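  • A sketch of this color-component generation follows; the opponent-component formulas are assumptions in the style of Itti 1998, not quoted from this patent:

```python
import numpy as np

def color_components(rgb):
    """Decouple hue from intensity (I = (r+g+b)/3, each channel
    divided by I), then build blue and yellow opponent components
    in the style of Itti 1998."""
    r, g, b = (rgb[..., k].astype(float) for k in range(3))
    I = (r + g + b) / 3.0
    eps = 1e-6                                    # guard against division by zero
    rn, gn, bn = r / (I + eps), g / (I + eps), b / (I + eps)
    blue = np.maximum(bn - (rn + gn) / 2.0, 0.0)
    yellow = np.maximum((rn + gn) / 2.0 - np.abs(rn - gn) / 2.0 - bn, 0.0)
    return blue, yellow
```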
  • Spikes 414 and 416 are generated from the blue input 406 and the yellow input 408, respectively. Then, the spikes 414 and 416 are convolved with 6 Gaussian kernels 304 to generate Gaussian maps 418 of 6 scales for each color input (i.e., blue input 406 and yellow input 408). However, instead of one intensity input, for each double-opponent color pair (e.g., blue/yellow), there are two color inputs, a blue input 406 and a yellow input 408. As a result, each feature map (e.g., elements 410 and 412) for color difference has four inputs.
  • the feature maps 410 and 412 are used to generate recurrent DoG kernel feature maps (e.g., elements 420 and 422), which are added to generate a color saliency map 424.
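  • A sketch of one such four-input color-difference feature map, assuming a blue-ON/yellow-OFF center-surround arrangement; the exact sign pattern and scale values are assumptions:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def blue_yellow_feature_map(blue_spikes, yellow_spikes,
                            sigma_center=1.0, sigma_surround=4.0):
    """Four inputs per feature map: blue and yellow spike maps, each
    at a center (small) and surround (large) Gaussian scale; the map
    responds where blue dominates the center and yellow the surround."""
    b_c = gaussian_filter(blue_spikes.astype(float), sigma_center)
    b_s = gaussian_filter(blue_spikes.astype(float), sigma_surround)
    y_c = gaussian_filter(yellow_spikes.astype(float), sigma_center)
    y_s = gaussian_filter(yellow_spikes.astype(float), sigma_surround)
    return np.maximum((b_c - y_c) - (b_s - y_s), 0.0)
```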
  • FIG. 4 only shows an example for a blue/yellow double-opponent color pair.
  • the S (surround) color is optional. According to experimental studies, using only C (center) color is better and gives less noisy results in some cases (as described in detail below).
  • the preprocessing that generates color components produces stronger color signals in a dark image than in a bright image. This gives a stronger spike activity in the final color saliency map 424 for a dark image.
  • If the proper lightness normalization 404 is performed, one can increase the color channel spike activity for a bright image while reducing the spike activity for a dark image, making the process invariant to image brightness.
  • The process is to normalize the color components according to the overall intensity of an input image 400. The method is as follows: let I be the image intensity, with values from 0 to 255.
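  • Since the exact scaling rule is not reproduced above, the following is only a hypothetical sketch of the idea: scale the color components by the overall (mean) image intensity relative to mid-gray, boosting spike activity for bright images and reducing it for dark ones:

```python
import numpy as np

def normalize_by_brightness(color_component, intensity, mid_gray=128.0):
    """Hypothetical lightness normalization: with intensity values in
    [0, 255], images brighter than mid-gray get their color components
    amplified, and darker images get them attenuated."""
    factor = np.mean(intensity) / mid_gray
    return color_component * factor
```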
  • spikes in multiple consecutive frames may need to be first accumulated (i.e., spike accumulation in multiple frames 504, 506, and 508) and then thresholded 510 before adding multiple channels and scales together.
  • Object detection boxes 500 can be obtained from the final saliency spike activity 502.
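  • As an illustration, here is a sketch of one way to extract such boxes from thresholded saliency activity using connected-component labeling; the patent does not specify this particular method, so it is an assumption:

```python
import numpy as np
from scipy.ndimage import label, find_objects

def detection_boxes(saliency, threshold=0.0):
    """Label connected regions of above-threshold saliency activity
    and return their bounding boxes as (top, left, bottom, right)."""
    labeled, num = label(saliency > threshold)
    return [(sl[0].start, sl[1].start, sl[0].stop, sl[1].stop)
            for sl in find_objects(labeled)]
```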
  • Typical methods to add the object detection boxes to a saliency map can be used. In the simulation described below, 20 image frames were first accumulated; the final saliency spike activity is then obtained from Equation (18), where S is the final saliency spike activity 502, S_I is the accumulated spikes 504 from a small-scale intensity channel 512 (or saliency map), and S_C is the accumulated spikes from a color channel.
  • The numbers 8 and 6 in Equation (18) are thresholds 510.
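  • A sketch of this accumulate-threshold-combine step, using the 20-frame accumulation and the thresholds 8 and 6 just mentioned; the channel weights and which threshold applies to which channel are assumptions, since Equation (18) itself is not reproduced here:

```python
import numpy as np

def final_saliency(intensity_frames, color_small_frames, color_medium_frames,
                   n_frames=20, th_intensity=8, th_color=6):
    """Accumulate per-channel spikes over n_frames, suppress
    accumulations at or below each channel's threshold as noise,
    then add the surviving channel activity into the final map S."""
    S_I = np.sum(intensity_frames[:n_frames], axis=0)
    S_Cs = np.sum(color_small_frames[:n_frames], axis=0)
    S_Cm = np.sum(color_medium_frames[:n_frames], axis=0)
    S = (S_I * (S_I > th_intensity)
         + S_Cs * (S_Cs > th_color)
         + S_Cm * (S_Cm > th_color))
    return S
```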
  • The weight and threshold number for each channel in Equation (18) are non-limiting examples that were determined by experimental studies on Stanford videos; they may be different for other videos. An accumulation must exceed its threshold to be considered, in order to suppress noise; anything below the threshold is treated as noise.
  • (3.6) Experimental Studies
  • An optional opponent color (e.g., yellow) may also be used.
  • FIGs. 8A and 8B show an input image (FIG. 8A) and the result from a small-scale blue color saliency map (FIG. 8B).
  • FIGs. 9A-9D show the results from medium-scale blue color saliency maps (FIGs. 9B and 9D) for bright and dark images (FIGs. 9A and 9C, respectively), without brightness/lightness normalization. As shown in FIG. 9D, the spike activity for the dark image (FIG. 9C) is much stronger than for the bright image (FIG. 9A): the largest spike activity for the bright image is 16 spikes, while it is 45 for the dark image.
  • the dark image result (FIG. 9D) is also much noisier.
  • FIGs. 10A and 10B show the results after brightness normalization of FIGs. 9A and 9C, respectively.
  • FIG. 11 shows the result for object detection boxes from combining small-scale intensity, small-scale blue color, and medium-scale blue color channels.
  • the result shown here is for image frame 38 in Stanford video sequence 037.
  • the two still persons standing behind the blue car are detected. It also detects the blue car and the pool.
  • The special intensity and color channel combination (i.e., small-scale intensity, small-scale blue color, and medium-scale blue color channels) was used for this result. By adding red and yellow color channels to the combination, the walking persons in red and yellow can be detected too.
  • The invention described herein has applications in any commercial products that could benefit from object detection and recognition.
  • The miniature unmanned aerial vehicle (UAV) market is a non-limiting example of a commercial market that could benefit from the system according to embodiments of the present disclosure.
  • A UAV can be built with object detection and recognition capabilities for surveillance with lower power requirements (from batteries) than a conventional CPU/GPU implementation, resulting in UAVs that are lighter and/or have longer endurance times.
  • any application that requires low power video processing can benefit from the present invention.
  • In self-driving vehicles (e.g., cars), spike-based processors can perform real-time video processing using the system described herein for real-time object detection and recognition (e.g., pedestrians, cars, street signs) at much lower power than is currently done, thereby enabling lighter and less expensive autonomous vehicles.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Medical Informatics (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Neurology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Image Analysis (AREA)

Abstract

Described is a system for object detection in images or videos using spiking neural networks. An intensity saliency map is generated from the intensity of an input image having color components using a spiking neural network. In addition, a color saliency map is generated from a plurality of colors in the input image using a spiking neural network. An object detection model is generated by combining the intensity saliency map and multiple color saliency maps. The object detection model is used to detect multiple objects of interest in the input image.
PCT/US2017/034093 2016-09-19 2017-05-23 Method for object detection in digital image and video using spiking neural networks WO2018052496A1 (fr)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP17851211.7A 2016-09-19 2017-05-23 Method for object detection in digital image and video using spiking neural networks
CN201780050666.XA 2016-09-19 2017-05-23 Method, system, and program product for object detection using spiking neural networks

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US15/269,777 2016-09-19
US15/269,777 US10198689B2 (en) 2014-01-30 2016-09-19 Method for object detection in digital image and video using spiking neural networks

Publications (1)

Publication Number Publication Date
WO2018052496A1 true WO2018052496A1 (fr) 2018-03-22

Family

ID=61618861

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2017/034093 WO2018052496A1 (fr) 2016-09-19 2017-05-23 Method for object detection in digital image and video using spiking neural networks

Country Status (3)

Country Link
EP (1) EP3516592A4 (fr)
CN (1) CN109643390B (fr)
WO (1) WO2018052496A1 (fr)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021012752A1 (fr) * 2019-07-23 2021-01-28 中建三局智能技术有限公司 Procédé et système de suivi de courte portée basés sur un réseau neuronal impulsionnel
CN112465746A (zh) * 2020-11-02 2021-03-09 新疆天维无损检测有限公司 一种射线底片中小缺陷检测方法

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8369652B1 (en) * 2008-06-16 2013-02-05 Hrl Laboratories, Llc Visual attention system for salient regions in imagery
US20130297542A1 (en) * 2012-05-07 2013-11-07 Filip Piekniewski Sensory input processing apparatus in a spiking neural network
US20150310303A1 (en) * 2014-04-29 2015-10-29 International Business Machines Corporation Extracting salient features from video using a neurosynaptic system
US20160086052A1 (en) * 2014-09-19 2016-03-24 Brain Corporation Apparatus and methods for saliency detection based on color occurrence analysis

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20050136509A1 (en) * 2003-09-10 2005-06-23 Bioimagene, Inc. Method and system for quantitatively analyzing biological samples
AU2010201740B2 (en) * 2010-04-30 2013-03-07 Canon Kabushiki Kaisha Method, apparatus and system for performing a zoom operation
US9460387B2 (en) * 2011-09-21 2016-10-04 Qualcomm Technologies Inc. Apparatus and methods for implementing event-based updates in neuron networks
US9317776B1 (en) * 2013-03-13 2016-04-19 Hrl Laboratories, Llc Robust static and moving object detection system via attentional mechanisms
US8977582B2 (en) * 2012-07-12 2015-03-10 Brain Corporation Spiking neuron network sensory processing apparatus and methods
US9123127B2 (en) * 2012-12-10 2015-09-01 Brain Corporation Contrast enhancement spiking neuron network sensory processing apparatus and methods
US9373058B2 (en) * 2014-05-29 2016-06-21 International Business Machines Corporation Scene understanding using a neurosynaptic system

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8369652B1 (en) * 2008-06-16 2013-02-05 Hrl Laboratories, Llc Visual attention system for salient regions in imagery
US20130297542A1 (en) * 2012-05-07 2013-11-07 Filip Piekniewski Sensory input processing apparatus in a spiking neural network
US20150310303A1 (en) * 2014-04-29 2015-10-29 International Business Machines Corporation Extracting salient features from video using a neurosynaptic system
US20160086052A1 (en) * 2014-09-19 2016-03-24 Brain Corporation Apparatus and methods for saliency detection based on color occurrence analysis

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
LAURENT ITTI ET AL.: "A saliency-based search mechanism for overt and covert shifts of visual attention", VISION RESEARCH, vol. 40, 2000, pages 1489 - 1506, XP008060077 *
See also references of EP3516592A4 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021012752A1 (fr) * 2019-07-23 2021-01-28 中建三局智能技术有限公司 Procédé et système de suivi de courte portée basés sur un réseau neuronal impulsionnel
CN112465746A (zh) * 2020-11-02 2021-03-09 新疆天维无损检测有限公司 一种射线底片中小缺陷检测方法
CN112465746B (zh) * 2020-11-02 2024-03-05 新疆天维无损检测有限公司 一种射线底片中小缺陷检测方法

Also Published As

Publication number Publication date
EP3516592A4 (fr) 2020-05-20
EP3516592A1 (fr) 2019-07-31
CN109643390A (zh) 2019-04-16
CN109643390B (zh) 2023-04-18

Similar Documents

Publication Publication Date Title
US10198689B2 (en) Method for object detection in digital image and video using spiking neural networks
Garg et al. A deep learning approach for face detection using YOLO
Kim et al. Fast learning method for convolutional neural networks using extreme learning machine and its application to lane detection
Wen et al. A rapid learning algorithm for vehicle classification
US20180114071A1 (en) Method for analysing media content
EP3289529B1 (fr) Réduction de la résolution d'image dans des réseaux à convolution profonde
US8345984B2 (en) 3D convolutional neural networks for automatic human action recognition
US8649594B1 (en) Active and adaptive intelligent video surveillance system
US9576214B1 (en) Robust object recognition from moving platforms by combining form and motion detection with bio-inspired classification
CA2953394A1 (fr) Systeme et procede de description d'evenement visuel et d'analyse d'evenement
Wu et al. Real-time background subtraction-based video surveillance of people by integrating local texture patterns
US10691972B2 (en) Machine-vision system for discriminant localization of objects
Cao et al. Robust vehicle detection by combining deep features with exemplar classification
Lee et al. Accurate traffic light detection using deep neural network with focal regression loss
Haider et al. Human detection in aerial thermal imaging using a fully convolutional regression network
Ye et al. A two-stage real-time YOLOv2-based road marking detector with lightweight spatial transformation-invariant classification
US10002430B1 (en) Training system for infield training of a vision-based object detector
CN111047626A (zh) 目标跟踪方法、装置、电子设备及存储介质
Yang et al. Non-temporal lightweight fire detection network for intelligent surveillance systems
Parmar et al. Deeprange: deep‐learning‐based object detection and ranging in autonomous driving
Nguyen et al. Hybrid deep learning-Gaussian process network for pedestrian lane detection in unstructured scenes
Li et al. Aligning discriminative and representative features: An unsupervised domain adaptation method for building damage assessment
Liu et al. Sector-ring HOG for rotation-invariant human detection
Panda et al. Encoder and decoder network with ResNet-50 and global average feature pooling for local change detection
Chen et al. Global2Salient: Self-adaptive feature aggregation for remote sensing smoke detection

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17851211

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2017851211

Country of ref document: EP

Effective date: 20190423