CN106845507A

CN106845507A - An attention-based block-wise object detection method

Info

Publication number: CN106845507A
Application number: CN201510887751.4A
Authority: CN
Inventors: 钟南; 成健; 张建伟; 张丹普; 张晓林; 王亚静
Original assignee: China Changfeng Science Technology Industry Group Corp
Current assignee: China Changfeng Science Technology Industry Group Corp
Priority date: 2015-12-07
Filing date: 2015-12-07
Publication date: 2017-06-13

Abstract

An attention-based block-wise object detection method. For a given picture, a focus point is first determined (it is assigned randomly the first time) to locate a small region of interest. With the focus as the center, three picture blocks of proportional sizes are created and then scaled to the same size. The three resulting pictures are input to a recurrent neural network, which produces two outputs. One output enters a localization network and produces location information used to determine the next focus of interest in the picture. The other is input to a fully connected network that determines whether the generated picture block is an object: if so, 1 is fed back, otherwise 0, and this feedback serves as the reinforcement learning signal.

Description

An attention-based block-wise object detection method

Technical field

The invention belongs to the technical field of image information data processing and relates to the application of image information data processing in deep learning, video analysis, and object detection.

Background art

Deep convolutional neural networks currently achieve the best results in object detection and recognition. They are designed to process data in the form of multidimensional arrays and exploit the properties of natural signals through four key ideas: local connections, shared weights, pooling, and the use of many layers.

In object detection, candidate regions that may contain a target are mainly extracted with methods such as selective search and then input to a convolutional neural network. Because every picture produces thousands of candidates, and each candidate is input to the convolutional neural network separately, processing becomes significantly slower.

Google's DeepMind team proposed a learning process that imitates human vision: the picture is processed piece by piece according to attention, and learning reinforces the feedback for correct behavior through continual trial and error so as to obtain the maximum reward. At present it has only been applied to simple tasks such as letter and digit recognition.

Summary of the invention

In the current object recognition field, the fast-rcnn model proposed by Microsoft achieves the best results on the ImageNet dataset, but it requires input pictures of fixed size. The object of the invention is to provide an attention-based block-wise object detection method that uses reinforcement learning to extend faster-rcnn so that input pictures can be of arbitrary size.

The technical solution is as follows:

An attention-based block-wise object detection method, characterized in that:

(1) Build the receptor network. The center point of the input picture is defined as (0, 0) and the upper-left corner as (-1, -1). A target of the picture is sampled to form three multi-resolution picture blocks, which are input to a fully connected network; at the same time, the location of the target is input directly to another fully connected network. The outputs of these two networks are input together to a further fully connected network, whose output serves as the input of the recurrent neural network in the next step;

(2) Build the recurrent neural network. The output of the receptor network and the previous output of the recurrent network are taken as input and form the internal state. A fully connected network attached to the internal state produces the location output, which flows back into the receptor network together with the original input picture. The internal state also flows into a fully connected network that judges whether an object is present, and this judgment serves as the reward. The process is repeated until all objects have been found or a predetermined number of iterations has been reached;

(3) The multiple target candidates obtained by the receptor network over repeated cycles are input to the fast-rcnn model, which determines their class and position; the fast-rcnn model is pre-trained on ImageNet.

The invention combines current mainstream models and methods such as attention, recurrent neural networks, reinforcement learning, and convolutional neural networks to overcome a defect of faster-rcnn, namely that input pictures must be of fixed size. Speed is greatly improved while accuracy is maintained.

Detailed description of the embodiments

Processing large pictures with a convolutional neural network requires a very large amount of computation. The human eye, by contrast, processes a picture based on attention: it recognizes selectively, concentrating on salient parts rather than treating every part of the picture it sees with equal emphasis. Likewise, human learning is a process of continually reinforcing correct understanding and gradually reducing mistakes, and reinforcement learning imitates this principle.

For a given picture, a focus point is first determined; it is assigned randomly the first time. A small region of interest is found, and with it as the center three picture blocks of proportional sizes are created and then scaled to the same size, which corresponds to a quick glance of the human eye. The three resulting pictures are input to the recurrent neural network, which produces two outputs. One output enters the localization network and produces the location information Lt, used to determine the next focus of interest in the picture. The other is input to a fully connected network that determines whether the generated picture block is an object: if so, 1 is fed back, otherwise 0, and this feedback serves as the reinforcement learning signal. The reinforcement learning process of the invention selects the next focus according to the previous focus points in the picture and thereby generates a reward indicating whether an object is present; the goal of learning is to maximize the reward. After several cycles, the many rewarded picture blocks that contain objects are used as the input of the fast-rcnn model, which outputs the target class and position coordinates. The fast-rcnn model has been pre-trained on ImageNet beforehand.
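The creation of three proportional picture blocks around a focus can be illustrated with a minimal sketch. The text does not specify the block sizes, the scale ratio, the resampling method, or how blocks that extend past the picture border are handled; the values `base_size=4` and `scale=2`, nearest-neighbour resizing, and zero padding are assumptions made only for illustration, and all function names are hypothetical.

```python
def crop_with_padding(image, cx, cy, size):
    """Return a size x size crop centred on pixel (cx, cy).

    Pixels that fall outside the image are filled with 0 (an assumption).
    """
    height, width = len(image), len(image[0])
    half = size // 2
    patch = []
    for y in range(cy - half, cy - half + size):
        row = []
        for x in range(cx - half, cx - half + size):
            inside = 0 <= y < height and 0 <= x < width
            row.append(image[y][x] if inside else 0)
        patch.append(row)
    return patch


def resize_nearest(patch, out_size):
    """Nearest-neighbour resize of a square patch to out_size x out_size."""
    in_size = len(patch)
    return [
        [patch[r * in_size // out_size][c * in_size // out_size]
         for c in range(out_size)]
        for r in range(out_size)
    ]


def extract_glimpse(image, cx, cy, base_size=4, scale=2, num_patches=3):
    """Three concentric blocks of proportional size, all rescaled to base_size."""
    patches = []
    size = base_size
    for _ in range(num_patches):
        crop = crop_with_padding(image, cx, cy, size)
        patches.append(resize_nearest(crop, base_size))
        size *= scale
    return patches


# Toy 16 x 16 grey-level picture whose pixel value encodes its position.
image = [[row * 16 + col for col in range(16)] for row in range(16)]
glimpse = extract_glimpse(image, cx=8, cy=8)
print([(len(p), len(p[0])) for p in glimpse])
```

Every block has the same number of pixels after rescaling, so the cost of processing one focus does not depend on the size of the input picture.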

The internal state of the recurrent neural network is an encoded representation of the picture blocks and the location information Lt. The two outputs constitute the stochastic policy that reinforcement learning has to learn: the move to the next location, and the reward obtained.

(1) Build the receptor network. The center point of the input picture is defined as (0, 0) and the upper-left corner as (-1, -1). A target of the picture is sampled to form three multi-resolution picture blocks, which are input to a fully connected network; at the same time, the location of the target is input directly to another fully connected network. The outputs of these two networks are input together to a further fully connected network, whose output serves as the input of the recurrent neural network in the next step, i.e. g = Rect(Linear(hg) + Linear(hl)).
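The formula g = Rect(Linear(hg) + Linear(hl)) can be sketched as follows, with hg taken as the features of the three picture blocks and hl as the features of the location. The layer sizes, the random initialisation, and the use of Rect on the two first-stage layers are assumptions not stated in the text; `random_layer`, `glimpse_network`, and the dictionary keys are hypothetical names.

```python
import random


def linear(weights, bias, vector):
    """Fully connected layer: weights (list of rows) times vector, plus bias."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def rect(vector):
    """Rectifier, the Rect of the formula: negative values become 0."""
    return [max(0.0, v) for v in vector]


def random_layer(rng, n_out, n_in):
    """Hypothetical initialiser: small random weights and zero bias."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def glimpse_network(params, patches_flat, location):
    """Compute g = Rect(Linear(hg) + Linear(hl))."""
    h_g = rect(linear(*params["patch"], patches_flat))  # picture-block features
    h_l = rect(linear(*params["loc"], location))        # location features
    from_g = linear(*params["hg_out"], h_g)
    from_l = linear(*params["hl_out"], h_l)
    return rect([a + b for a, b in zip(from_g, from_l)])


rng = random.Random(0)
params = {
    "patch": random_layer(rng, 16, 48),   # 3 blocks of 4 x 4 pixels, flattened
    "loc": random_layer(rng, 16, 2),      # location (x, y)
    "hg_out": random_layer(rng, 8, 16),
    "hl_out": random_layer(rng, 8, 16),
}
patches_flat = [rng.random() for _ in range(48)]
location = [-1.0, -1.0]                   # upper-left corner of the picture
g = glimpse_network(params, patches_flat, location)
print(len(g))
```

The vector `g` is what step (2) receives as the receptor output.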

(2) Build the recurrent neural network. The output of the receptor network and the previous output of the recurrent network are taken as input and form the internal state. A fully connected network attached to the internal state produces the location output, which flows back into the receptor network together with the original input picture. The internal state also flows into a fully connected network that judges whether an object is present, and this judgment serves as the reward, R = r1 + r2 + r3 + .... The process is repeated until all objects have been found or a predetermined number of iterations has been reached.
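The loop of step (2) and the accumulated reward R can be sketched as below. Several points are assumptions rather than statements of the text: the state update uses a rectifier, the location output is squashed with tanh so that it stays inside the normalised picture coordinates, the object judgment is replaced by a caller-supplied function `contains_object`, and every positive judgment is counted as a newly found object. All names are hypothetical.

```python
import math
import random


def linear(weights, bias, vector):
    """Fully connected layer: weights (list of rows) times vector, plus bias."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def random_layer(rng, n_out, n_in):
    """Hypothetical initialiser: random weights and zero bias."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out


def recurrent_step(params, g, h_prev):
    """New internal state from the receptor output g and the previous state."""
    from_g = linear(*params["from_g"], g)
    from_h = linear(*params["from_h"], h_prev)
    return [max(0.0, a + b) for a, b in zip(from_g, from_h)]


def run_episode(params, receptor, contains_object, start, max_steps, num_objects):
    """Repeat until num_objects hits or max_steps; return (R, candidates)."""
    hidden = [0.0] * len(params["from_h"][0])
    location = start
    total_reward, candidates = 0, []
    for _ in range(max_steps):
        g = receptor(location)
        hidden = recurrent_step(params, g, hidden)
        reward = 1 if contains_object(location) else 0   # r_t is 1 or 0
        total_reward += reward                           # R = r1 + r2 + ...
        if reward:
            candidates.append(location)
        if len(candidates) >= num_objects:               # simplified stop rule
            break
        # Location output: tanh keeps the next focus inside [-1, 1] x [-1, 1].
        location = tuple(math.tanh(v) for v in linear(*params["loc"], hidden))
    return total_reward, candidates


def toy_receptor(location):
    """Toy stand-in for the receptor network of step (1)."""
    return [location[0], location[1], 1.0]


def toy_contains_object(location):
    """Toy rule: an object occupies the upper-left quadrant."""
    return location[0] < 0 and location[1] < 0


rng = random.Random(0)
params = {
    "from_g": random_layer(rng, 4, 3),
    "from_h": random_layer(rng, 4, 4),
    "loc": random_layer(rng, 2, 4),
}
# The text assigns the first focus randomly; a fixed start keeps the demo simple.
R, found = run_episode(params, toy_receptor, toy_contains_object,
                       start=(-0.5, -0.5), max_steps=5, num_objects=2)
print(R, len(found))
```

Training the weights so that R is maximised is outside the scope of this sketch; it only shows the data flow of one episode.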

(3) The multiple target candidates obtained by the receptor network over repeated cycles are input to the fast-rcnn model, which determines their class and position; the fast-rcnn model is pre-trained on ImageNet.
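Before candidates can be handed to a detection model, a focus expressed in the normalised coordinates of step (1) has to be converted to a pixel region. The sketch below is one possible conversion: it assumes that the coordinate system extends linearly so that the lower-right corner is (1, 1), and that a candidate is a square box of a given side length clipped to the picture. Neither assumption is stated in the text, and the function names are hypothetical.

```python
def to_pixel(location, width, height):
    """Map normalised (x, y) to pixel coordinates.

    (0, 0) is the picture centre and (-1, -1) the upper-left corner; the
    lower-right corner is assumed to be (1, 1).
    """
    x, y = location
    px = round((x + 1.0) / 2.0 * (width - 1))
    py = round((y + 1.0) / 2.0 * (height - 1))
    return px, py


def candidate_box(location, patch_size, width, height):
    """Box (x0, y0, x1, y1) around a focus, clipped to the picture."""
    px, py = to_pixel(location, width, height)
    half = patch_size // 2
    x0, y0 = max(0, px - half), max(0, py - half)
    x1, y1 = min(width - 1, px + half), min(height - 1, py + half)
    return x0, y0, x1, y1


print(candidate_box((0.0, 0.0), 20, 101, 101))
print(candidate_box((-1.0, -1.0), 20, 101, 101))
```

Because the focus is stored in normalised coordinates, the same conversion applies to pictures of any width and height.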

Claims

1. An attention-based block-wise object detection method, characterized in that:

(3) The multiple target candidates obtained by the receptor network over repeated cycles are input to the fast-rcnn model, which determines their class and position.