CN108121986A - Object detection method and device, computer device, and computer-readable storage medium - Google Patents
- Publication number
- CN108121986A (application CN201711484723.3A)
- Authority
- CN
- China
- Prior art keywords
- region
- target
- neural networks
- network
- convolutional neural
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/255—Detecting or recognising potential candidate objects based on visual cues, e.g. shapes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/40—Extraction of image or video features
- G06V10/46—Descriptors for shape, contour or point-related descriptors, e.g. scale invariant feature transform [SIFT] or bags of words [BoW]; Salient regional features
- G06V10/462—Salient features, e.g. scale invariant feature transforms [SIFT]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V2201/00—Indexing scheme relating to image or video recognition or understanding
- G06V2201/07—Target detection
Abstract
An object detection method is provided, the method including: obtaining a training sample set, the training sample set including multiple target images labeled with target positions and target angle types; training a faster region-based convolutional neural network (Faster R-CNN) model using the training sample set to obtain a trained Faster R-CNN model; obtaining an image to be detected; and performing target detection on the image to be detected using the trained Faster R-CNN model to obtain a target region of the image to be detected and a target angle type of the target region. The present invention further provides an object detection device, a computer device, and a computer-readable storage medium. The present invention can achieve fast target detection with a high detection rate.
Description
Technical field
The present invention relates to the technical field of image processing, and in particular to an object detection method and device, a computer device, and a computer-readable storage medium.
Background technology
Existing target detection techniques include target detection based on simple pixel features or on complex hand-crafted features. Simple pixel features, such as the representative HAAR features and pixel-value differences, are computationally efficient and offer good real-time performance, but they are not robust to factors such as complex and varied backgrounds, and their detection precision falls short. Complex hand-crafted features, such as the HOG features used in DPM, provide better feature representation and stronger robustness, but because they cannot be accelerated on a GPU, their computation on a CPU is expensive, making real-time requirements difficult to meet.
Existing target detection techniques also include target detection based on convolutional neural networks. Although object detection methods based on convolutional neural networks improve detection precision, they bring a sharp increase in computation. GPU computing has solved the problem of extracting convolutional features, but candidate region extraction still takes considerable time. Furthermore, because the overall scheme follows a pipeline of first extracting candidate regions and then classifying them, end-to-end detection cannot be achieved, and deployment is relatively cumbersome.
In addition, because shooting angles differ, the appearance of a target object can vary greatly in an image. Existing target detection techniques do not take the shooting angle into account, which results in a relatively low detection rate for targets.
Summary of the invention
In view of the foregoing, it is necessary to provide an object detection method and device, a computer device, and a computer-readable storage medium that can achieve fast target detection with a high detection rate.
A first aspect of the present application provides an object detection method, the method including:
obtaining a training sample set, the training sample set including multiple target images labeled with target positions and target angle types;
training a faster region-based convolutional neural network (Faster R-CNN) model using the training sample set to obtain a trained Faster R-CNN model, where the Faster R-CNN model includes a region proposal network and a fast region-based convolutional neural network, the region proposal network and the fast region-based convolutional neural network share convolutional layers, the convolutional layers extract a feature map of each target image in the training sample set, the region proposal network obtains candidate regions in each target image and target angle types of the candidate regions according to the feature map, and the fast region-based convolutional neural network screens and adjusts the candidate regions according to the feature map to obtain a target region of each target image and a target angle type of the target region;
obtaining an image to be detected; and
performing target detection on the image to be detected using the trained Faster R-CNN model to obtain a target region of the image to be detected and a target angle type of the target region.
In another possible implementation, training the Faster R-CNN model using the training sample set includes:
(1) initializing the region proposal network with an ImageNet pre-trained model, and training the region proposal network using the training sample set;
(2) using the region proposal network trained in (1) to generate candidate regions of each target image, and training the fast region-based convolutional neural network using the candidate regions;
(3) initializing the region proposal network with the fast region-based convolutional neural network trained in (2), and training the region proposal network using the training sample set;
(4) initializing the fast region-based convolutional neural network with the region proposal network trained in (3), keeping the shared convolutional layers fixed, and training the fast region-based convolutional neural network using the training sample set.
In another possible implementation, training the Faster R-CNN model using the training sample set includes:
training the region proposal network and the fast region-based convolutional neural network using a back-propagation algorithm, and adjusting the network parameters of the region proposal network and the fast region-based convolutional neural network during training so as to minimize a loss function, where the loss function includes a target classification loss, an angle classification loss, and a regression loss.
In another possible implementation, the Faster R-CNN model uses the ZF framework, and the region proposal network and the fast region-based convolutional neural network share 5 convolutional layers.
In another possible implementation, a hard negative mining method is added to the training of the fast region-based convolutional neural network.
A second aspect of the present application provides an object detection device, the device including:
a first acquisition unit, configured to obtain a training sample set, the training sample set including multiple target images labeled with target positions and target angle types;
a training unit, configured to train a faster region-based convolutional neural network (Faster R-CNN) model using the training sample set to obtain a trained Faster R-CNN model, where the Faster R-CNN model includes a region proposal network and a fast region-based convolutional neural network, the region proposal network and the fast region-based convolutional neural network share convolutional layers, the convolutional layers extract a feature map of each target image in the training sample set, the region proposal network obtains candidate regions in each target image and target angle types of the candidate regions according to the feature map, and the fast region-based convolutional neural network screens and adjusts the candidate regions according to the feature map to obtain a target region of each target image and a target angle type of the target region;
a second acquisition unit, configured to obtain an image to be detected; and
a detection unit, configured to perform target detection on the image to be detected using the trained Faster R-CNN model to obtain a target region of the image to be detected and a target angle type of the target region.
In another possible implementation, the training unit is specifically configured to:
(1) initialize the region proposal network with an ImageNet pre-trained model, and train the region proposal network using the training sample set;
(2) use the region proposal network trained in (1) to generate candidate regions of each target image, and train the fast region-based convolutional neural network using the candidate regions;
(3) initialize the region proposal network with the fast region-based convolutional neural network trained in (2), and train the region proposal network using the training sample set;
(4) initialize the fast region-based convolutional neural network with the region proposal network trained in (3), keep the shared convolutional layers fixed, and train the fast region-based convolutional neural network using the training sample set.
In another possible implementation, the training unit is specifically configured to:
train the region proposal network and the fast region-based convolutional neural network using a back-propagation algorithm, and adjust the network parameters of the region proposal network and the fast region-based convolutional neural network during training so as to minimize a loss function, where the loss function includes a target classification loss, an angle classification loss, and a regression loss.
A third aspect of the present application provides a computer device, the computer device including a processor, where the processor is configured to implement the object detection method when executing a computer program stored in a memory.
A fourth aspect of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program implements the object detection method when executed by a processor.
The present invention obtains a training sample set, the training sample set including multiple target images labeled with target positions and target angle types; trains a faster region-based convolutional neural network (Faster R-CNN) model using the training sample set to obtain a trained Faster R-CNN model, where the Faster R-CNN model includes a region proposal network and a fast region-based convolutional neural network, the region proposal network and the fast region-based convolutional neural network share convolutional layers, the convolutional layers extract a feature map of each target image in the training sample set, the region proposal network obtains candidate regions in each target image and target angle types of the candidate regions according to the feature map, and the fast region-based convolutional neural network screens and adjusts the candidate regions according to the feature map to obtain a target region of each target image and a target angle type of the target region; obtains an image to be detected; and performs target detection on the image to be detected using the trained Faster R-CNN model to obtain a target region of the image to be detected and a target angle type of the target region.
Existing convolutional-neural-network-based target detection uses a selective search algorithm to generate candidate regions, which is time-consuming, and it separates region extraction from target detection. The present invention introduces a region proposal network into the Faster R-CNN model and extracts candidate regions using a deep convolutional neural network. After the networks are trained, by sharing the parameters of the convolutional layers, the feature map obtained from an image by the convolutional layers can be applied to both region extraction and target detection; that is, the computation of the shared convolutional layers is reused, which greatly increases the speed of region extraction, accelerates the entire detection pipeline, and realizes an end-to-end detection scheme. Moreover, the present invention takes into account the reduced detection rate caused by differing shooting angles: the Faster R-CNN model is trained with target images labeled with target angle types, which improves the detection rate of targets. Therefore, the present invention can achieve fast target detection with a high detection rate.
Description of the drawings
Fig. 1 is a flowchart of the object detection method provided in Embodiment one of the present invention.
Fig. 2 is a schematic diagram of the region proposal network.
Fig. 3 is a structural diagram of the object detection device provided in Embodiment two of the present invention.
Fig. 4 is a schematic diagram of the computer device provided in Embodiment three of the present invention.
Specific embodiment
To better understand the objects, features, and advantages of the present invention, the present invention is described in detail below with reference to the accompanying drawings and specific embodiments. It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with one another.
Many specific details are set forth in the following description to facilitate a thorough understanding of the present invention. The described embodiments are only some, rather than all, of the embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within the scope of protection of the present invention.
Unless otherwise defined, all technical and scientific terms used herein have the same meanings as commonly understood by those skilled in the technical field of the present invention. The terms used in the description of the present invention are intended only to describe specific embodiments and are not intended to limit the present invention.
Preferably, the object detection method of the present invention is applied to one or more computer devices. A computer device is a device capable of automatically performing numerical computation and/or information processing according to preset or stored instructions, and its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The computer device may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. The computer device can interact with a user through a keyboard, a mouse, a remote control, a touch pad, a voice-control device, or the like.
Embodiment one
Fig. 1 is a flowchart of the object detection method provided in Embodiment one of the present invention. The object detection method is applied to a computer device. The object detection method can detect the position of a preset target (such as a vehicle or a ship) in an image, and can also detect the angle type (such as front, side, or back) of the preset target in the image.
As shown in Fig. 1, the object detection method specifically includes the following steps:
101: Obtain a training sample set.
The training sample set includes multiple target images labeled with target positions and target angle types. A target image is an image containing a preset target (such as a ship or a vehicle). A target image may contain one or more preset targets. The target position represents the position of the preset target in the target image. The target angle type represents the shooting angle of the preset target (such as front, back, or side).
In one embodiment, the training sample set includes about 10000 target images. The target position may be labeled as [x, y, w, h], where x and y denote the top-left coordinates of the target region, w denotes the width of the target region, and h denotes the height of the target region. The target angle types include a front angle type, a side angle type, and a back angle type. For example, when the object detection method is used to detect ships, if a target image is a front view of a ship, the labeled target angle type is the front angle type; if a target image is a side view of a ship, the labeled target angle type is the side angle type; if a target image is a back view of a ship, the labeled target angle type is the back angle type.
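As an illustration, a single labeled training sample under this scheme could be represented as follows (a minimal sketch; the field names and the example file name are assumptions for illustration, not part of the original disclosure):

```python
# Hypothetical annotation for one target image: each target position is
# stored as [x, y, w, h] (top-left corner, width, height) and the shooting
# angle as one of the three labeled angle types.
annotation = {
    "image": "ship_000123.jpg",          # assumed file name
    "targets": [
        {"bbox": [412, 230, 180, 96],    # [x, y, w, h] in pixels
         "angle_type": "side"},          # "front", "side", or "back"
        {"bbox": [35, 310, 96, 64],
         "angle_type": "front"},
    ],
}
```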
102: Train a faster region-based convolutional neural network (Faster Region-based Convolution Neural Network, Faster R-CNN) model using the training sample set to obtain a trained Faster R-CNN model.
The Faster R-CNN model includes a region proposal network (Region Proposal Network, RPN) and a fast region-based convolutional neural network (Fast Region-based Convolution Neural Network, Fast R-CNN). The region proposal network and the fast region-based convolutional neural network need to be trained alternately.
The region proposal network and the fast region-based convolutional neural network share convolutional layers, and the shared convolutional layers are used to extract the feature map of an image. The region proposal network generates candidate regions of the image and target angle types of the candidate regions according to the feature map, and inputs the generated candidate regions and their target angle types to the fast region-based convolutional neural network. The fast region-based convolutional neural network screens and adjusts the candidate regions according to the feature map to obtain the target region of the image and the target angle type of the target region.
Specifically, during training, the shared convolutional layers extract the feature map of each target image in the training sample set; the region proposal network obtains candidate regions in each target image and target angle types of the candidate regions according to the feature map; and the fast region-based convolutional neural network screens and adjusts the candidate regions according to the feature map to obtain the target region of each target image and the target angle type of the target region.
In a preferred embodiment, the Faster R-CNN model uses the ZF framework, and the region proposal network and the fast region-based convolutional neural network share 5 convolutional layers.
In one embodiment, the target images in the training sample set may be of arbitrary size, and each target image is scaled to a uniform size (such as 1000*600) before being input to the shared convolutional layers. In one embodiment, the length and width of the feature map extracted by the convolutional layers are reduced by a factor of 16 relative to the input image, and the depth of the feature map is 256.
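A minimal sketch of this preprocessing step under the stated sizes follows (the use of OpenCV's cv2.resize and the helper name are assumptions for illustration):

```python
import cv2  # assumed dependency for image resizing

def preprocess(image, target_size=(1000, 600), stride=16):
    """Scale an arbitrary-size image to the uniform input size and report
    the spatial size of the resulting feature map (16x smaller, depth 256)."""
    resized = cv2.resize(image, target_size)   # (600, 1000, 3) array
    feat_w = target_size[0] // stride          # 1000 // 16 = 62
    feat_h = target_size[1] // stride          # 600 // 16 = 37
    return resized, (feat_h, feat_w, 256)      # feature map shape
```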
In one embodiment, training the Faster R-CNN model using the training sample set may include the following steps, as sketched in code after this list:
(1) Initialize the region proposal network with an ImageNet pre-trained model, and train the region proposal network using the training sample set.
(2) Use the region proposal network trained in (1) to generate candidate regions of each target image in the training sample set, and train the fast region-based convolutional neural network using the candidate regions. At this point, the region proposal network and the fast region-based convolutional neural network do not yet share convolutional layers.
(3) Initialize the region proposal network with the fast region-based convolutional neural network trained in (2), and train the region proposal network using the training sample set.
(4) Initialize the fast region-based convolutional neural network with the region proposal network trained in (3), keep the shared convolutional layers fixed, and train the fast region-based convolutional neural network using the training sample set. At this point, the region proposal network and the fast region-based convolutional neural network share the same convolutional layers and constitute a unified network.
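The four-stage alternating optimization above can be summarized roughly as follows (a structural sketch only: train_rpn, generate_proposals, and train_fast_rcnn are hypothetical helpers passed in by the caller, each training one sub-network and returning it with its weights; they are not an actual API):

```python
def alternating_training(train_set, imagenet_weights,
                         train_rpn, generate_proposals, train_fast_rcnn):
    # (1) Initialize the RPN from an ImageNet model and train it.
    rpn = train_rpn(init=imagenet_weights, data=train_set)
    # (2) Generate candidate regions with the stage-1 RPN and train the
    #     Fast R-CNN on them (conv layers are not shared yet).
    proposals = generate_proposals(rpn, train_set)
    fast_rcnn = train_fast_rcnn(init=imagenet_weights, data=train_set,
                                proposals=proposals)
    # (3) Re-initialize the RPN from the trained Fast R-CNN and train it
    #     again on the training sample set.
    rpn = train_rpn(init=fast_rcnn, data=train_set)
    # (4) Initialize the Fast R-CNN from the stage-3 RPN, keep the shared
    #     conv layers fixed, and train the Fast R-CNN head; the two
    #     networks now share the same conv layers as one unified network.
    proposals = generate_proposals(rpn, train_set)
    fast_rcnn = train_fast_rcnn(init=rpn, data=train_set,
                                proposals=proposals, freeze_conv=True)
    return rpn, fast_rcnn
```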
Fig. 2 is a schematic diagram of the region proposal network.
After an image passes through the shared convolutional layers, the feature map of the image is obtained. A sliding window of a preset size (such as 3*3) slides over the feature map with a preset stride (such as a stride of 1), and each position of the sliding window corresponds to a center point. When the sliding window slides to a position, anchor boxes of preset scales (such as 3 scales: 128, 256, 512) and preset aspect ratios (such as 3 aspect ratios: 1:1, 1:2, 2:1) are applied at the center point of that position, yielding a preset number (such as 9) of candidate regions. Each sliding window is mapped by a convolutional layer (cascaded after the shared convolutional layers) to a low-dimensional feature vector (such as a 256-d or 512-d feature vector). This feature vector is fed to three sibling fully connected layers: a target classification layer, an angle classification layer, and a bounding-box regression layer. The target classification layer outputs a target classification score for each candidate region, indicating whether the candidate region is a target (i.e., foreground) or background. Whether a candidate region belongs to the foreground or the background depends on its overlap with the labeled target region (the region determined by the labeled target position): if the overlap is greater than a certain threshold, the region is designated foreground; if the overlap is less than the threshold, it is designated background. The angle classification layer outputs an angle classification score for each candidate region, indicating the target angle type of the candidate region. The bounding-box regression layer outputs the position of the candidate region after fine-tuning and is used to fine-tune the boundaries of the candidate region.
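By way of illustration, the anchor construction described above (3 scales x 3 aspect ratios, giving 9 anchor boxes per sliding-window position) can be sketched as follows; the [x, y, w, h] output form mirrors the labeling format of step 101, while the function name is illustrative:

```python
import math

def anchors_at(cx, cy, scales=(128, 256, 512), ratios=(1.0, 0.5, 2.0)):
    """Return the 9 anchor boxes in [x, y, w, h] form (top-left corner,
    width, height) for the sliding-window center (cx, cy): one box per
    (scale, ratio) pair, with w/h == ratio and area == scale**2."""
    boxes = []
    for s in scales:
        for r in ratios:
            w = s * math.sqrt(r)   # e.g. ratio 0.5 -> w:h = 1:2
            h = s / math.sqrt(r)
            boxes.append([cx - w / 2, cy - h / 2, w, h])
    return boxes  # 9 candidate regions for this position
```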
The region proposal network selects a relatively large number of candidate regions; a number of the highest-scoring candidate regions may be screened according to their target classification scores and input to the fast region-based convolutional neural network, so as to speed up training and detection.
To train the region proposal network, each candidate region is assigned a label, the labels including positive labels and negative labels. A positive label may be assigned to two classes of candidate regions: (1) the candidate region having the highest IoU (Intersection over Union, the ratio of the intersection to the union) overlap with some ground-truth (Ground Truth, GT) bounding box; (2) candidate regions having an IoU overlap greater than 0.7 with any GT bounding box. For a single GT bounding box, a positive label may be assigned to multiple candidate regions. A negative label is assigned to candidate regions whose IoU ratio with all GT bounding boxes is below 0.3. Candidate regions that are neither positive nor negative have no effect on the training objective.
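Under these rules, label assignment can be sketched as below (a minimal illustration with boxes in [x, y, w, h] form; the 0.7 and 0.3 thresholds come from the passage above, and the function names are illustrative):

```python
def iou(a, b):
    """Intersection over Union of two boxes given as [x, y, w, h]."""
    ax2, ay2 = a[0] + a[2], a[1] + a[3]
    bx2, by2 = b[0] + b[2], b[1] + b[3]
    iw = max(0.0, min(ax2, bx2) - max(a[0], b[0]))
    ih = max(0.0, min(ay2, by2) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0

def assign_labels(candidates, gt_boxes, hi=0.7, lo=0.3):
    """1 = positive, 0 = negative, None = ignored by the training objective."""
    labels = []
    for c in candidates:
        best = max(iou(c, g) for g in gt_boxes)
        labels.append(1 if best > hi else 0 if best < lo else None)
    # Rule (1): the candidate with the highest IoU for each GT bounding box
    # is also positive, even if it does not exceed the 0.7 threshold.
    for g in gt_boxes:
        i = max(range(len(candidates)), key=lambda k: iou(candidates[k], g))
        labels[i] = 1
    return labels
```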
The region proposal network is trained using a back-propagation algorithm, and the network parameters of the region proposal network are adjusted during training so as to minimize a loss function. The loss function indicates the difference between the predicted confidence of the candidate regions predicted by the region proposal network and the true confidence. In this embodiment, the loss function consists of three parts: a target classification loss, an angle classification loss, and a regression loss.
The loss function for an image can be defined as:

$$L(\{p_i\},\{a_i\},\{t_i\}) = \frac{1}{N_{cls}}\sum_i L_{cls}(p_i,p_i^*) + \frac{1}{N_{cls}}\sum_i L_{ang}(a_i,a_i^*) + \lambda\frac{1}{N_{reg}}\sum_i p_i^* L_{reg}(t_i,t_i^*)$$

where $i$ is the index of a candidate region in a training batch (mini-batch).
$L_{cls}(p_i,p_i^*)$ is the target classification loss of the candidate region. $N_{cls}$ is the size of the training batch, such as 256. $p_i$ is the predicted probability that the $i$-th candidate region is a target. $p_i^*$ is the GT label: if the candidate region is positive (the assigned label is a positive label, and the region is called a positive candidate region), $p_i^*$ is 1; if the candidate region is negative (the assigned label is a negative label, and the region is called a negative candidate region), $p_i^*$ is 0. $L_{cls}$ may be computed as the log loss

$$L_{cls}(p_i,p_i^*) = -\left[p_i^*\log p_i + (1-p_i^*)\log(1-p_i)\right].$$

$L_{ang}(a_i,a_i^*)$ is the angle classification loss of the candidate region; its meaning is analogous to that of $L_{cls}$, with the predicted angle-type probabilities and the labeled angle type in place of the target probability and label.
$L_{reg}(t_i,t_i^*)$ is the regression loss of the candidate region. $\lambda$ is a balancing weight, which may be taken as 10. $N_{reg}$ is the number of candidate regions. $L_{reg}$ may be computed as $L_{reg}(t_i,t_i^*) = R(t_i - t_i^*)$. $t_i$ is a coordinate vector, i.e. $t_i = (t_x, t_y, t_w, t_h)$, representing the 4 parameterized coordinates of the candidate region (such as the coordinates of the top-left corner of the candidate region together with its width and height). $t_i^*$ is the coordinate vector of the GT bounding box corresponding to a positive candidate region, i.e. $t_i^* = (t_x^*, t_y^*, t_w^*, t_h^*)$ (such as the coordinates of the top-left corner of the real target region together with its width and height). $R$ is the robust loss function ($\text{smooth}_{L_1}$), defined as

$$\text{smooth}_{L_1}(x) = \begin{cases} 0.5x^2 & \text{if } |x| < 1,\\ |x| - 0.5 & \text{otherwise.} \end{cases}$$
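For concreteness, the per-region terms of this loss can be written in code as follows (a minimal sketch; the log-loss and smooth-L1 forms follow the definitions above, the function names are illustrative, and the angle term, which has the same form as the target classification loss, is omitted for brevity):

```python
import math

def smooth_l1(x):
    """Robust loss R: 0.5*x^2 for |x| < 1, |x| - 0.5 otherwise."""
    return 0.5 * x * x if abs(x) < 1 else abs(x) - 0.5

def region_loss(p, p_star, t, t_star, lam=10.0):
    """Unnormalized loss terms for one candidate region; the batch-level
    normalization by N_cls and N_reg is applied over the whole mini-batch."""
    cls = -(p_star * math.log(p) + (1 - p_star) * math.log(1 - p))
    # Regression only contributes for positive candidate regions (p_star = 1).
    reg = p_star * sum(smooth_l1(a - b) for a, b in zip(t, t_star))
    return cls + lam * reg
```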
The above embodiment takes into account the reduced detection rate caused by differing shooting angles: a loss function including an angle classification loss is used in the training of the Faster R-CNN model, and the angle classification loss of each candidate region is computed according to the predicted target angle type, which improves the detection rate of targets.
The above is the training method for the region proposal network. The training method of the fast region-based convolutional neural network may refer to the training method of the region proposal network; details are not described here again.
In this embodiment, a hard negative mining (Hard Negative Mining, HNM) method is added to the training of the fast region-based convolutional neural network. Negative samples that are wrongly classified as positive by the fast region-based convolutional neural network (i.e., hard examples) are recorded, and during the next training iteration these negative samples are fed into the training sample set again with the weight of their loss increased, strengthening their influence on the classifier. This ensures that the classifier keeps being trained on the harder negative samples, so that the features learned by the classifier improve progressively and the covered sample distribution becomes more diverse.
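A minimal sketch of this mining loop follows (the classifier interface, the weight bookkeeping, and the boost factor are assumptions for illustration):

```python
def hard_negative_mining(classifier, negatives, weights, boost=2.0):
    """After a training pass, find negatives the classifier wrongly calls
    positive ("hard examples") and boost their loss weight for the next
    iteration, so training keeps focusing on the harder negative samples."""
    hard = [i for i, sample in enumerate(negatives)
            if classifier.predict(sample) == 1]  # misclassified as positive
    for i in hard:
        weights[i] *= boost  # increase the weight of this sample's loss
    return hard, weights
```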
103: Obtain an image to be detected.
The image to be detected is an image containing the preset target (such as a ship). The preset target is the object to be detected in the image to be detected. For example, when ship detection is performed on the image to be detected, the preset target is a ship in the image to be detected.
The image to be detected may be an image received from an external device, such as a ship image captured by a camera near a harbor and received from that camera.
Alternatively, the image to be detected may be an image captured by the computer device, such as a ship image captured by the computer device.
Alternatively, the image to be detected may be an image read from the memory of the computer device, such as a ship image read from the memory of the computer device.
104: Detect the image to be detected using the trained Faster R-CNN model to obtain a target region of the image to be detected and a target angle type of the target region.
Specifically, the convolutional layers shared by the region proposal network and the fast region-based convolutional neural network extract the feature map of the image to be detected. The region proposal network obtains candidate regions in the image to be detected and target angle types of the candidate regions according to the feature map. The fast region-based convolutional neural network screens and adjusts the candidate regions according to the feature map to obtain the target region of the image to be detected and the target angle type of the target region.
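End to end, this detection step can be expressed roughly as below (a sketch; the model interface names are assumptions for illustration, not the patent's API):

```python
import cv2  # assumed dependency for image resizing

def detect(model, image):
    """Run the trained Faster R-CNN model on an image to be detected and
    return the target regions and their target angle types."""
    resized = cv2.resize(image, (1000, 600))   # uniform input size
    feature_map = model.shared_conv(resized)   # shared conv layers
    candidates = model.rpn(feature_map)        # proposals + angle scores
    # The Fast R-CNN head screens and fine-tunes the candidate regions.
    regions, angle_types = model.fast_rcnn(feature_map, candidates)
    return regions, angle_types                # boxes + "front"/"side"/"back"
```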
The object detection method of Embodiment one obtains a training sample set, the training sample set including multiple target images labeled with target positions and target angle types; trains a Faster R-CNN model using the training sample set to obtain a trained Faster R-CNN model, where the Faster R-CNN model includes a region proposal network and a fast region-based convolutional neural network, the region proposal network and the fast region-based convolutional neural network share convolutional layers, the convolutional layers extract a feature map of each target image in the training sample set, the region proposal network obtains candidate regions in each target image and target angle types of the candidate regions according to the feature map, and the fast region-based convolutional neural network screens and adjusts the candidate regions according to the feature map to obtain a target region of each target image and a target angle type of the target region; obtains an image to be detected; and performs target detection on the image to be detected using the trained Faster R-CNN model to obtain a target region of the image to be detected and a target angle type of the target region.
Existing convolutional-neural-network-based target detection uses a selective search algorithm to generate candidate regions, which is time-consuming, and it separates region extraction from target detection. The object detection method of Embodiment one introduces a region proposal network into the Faster R-CNN model and extracts candidate regions using a deep convolutional neural network. After the networks are trained, by sharing the parameters of the convolutional layers, the feature map obtained from an image by the convolutional layers can be applied to both region extraction and target detection; that is, the computation of the shared convolutional layers is reused, which greatly increases the speed of region extraction, accelerates the entire detection pipeline, and realizes an end-to-end detection scheme. Moreover, the object detection method of Embodiment one takes into account the reduced detection rate caused by differing shooting angles: the Faster R-CNN model is trained with target images labeled with target angle types, which improves the detection rate of targets. Therefore, the object detection method of Embodiment one can achieve fast target detection with a high detection rate.
Embodiment two
Fig. 3 is a structural diagram of the object detection device provided in Embodiment two of the present invention. As shown in Fig. 3, the object detection device 10 may include: a first acquisition unit 301, a training unit 302, a second acquisition unit 303, and a detection unit 304.
The first acquisition unit 301 is configured to obtain a training sample set.
The training sample set includes multiple target images labeled with target positions and target angle types. For details of the target images, the target positions, and the target angle types, refer to the description of step 101 in Embodiment one; details are not described here again.
The training unit 302 is configured to train a faster region-based convolutional neural network (Faster Region-based Convolution Neural Network, Faster R-CNN) model using the training sample set to obtain a trained Faster R-CNN model.
The Faster R-CNN model includes a region proposal network (Region Proposal Network, RPN) and a fast region-based convolutional neural network (Fast Region-based Convolution Neural Network, Fast R-CNN), which share convolutional layers and need to be trained alternately. The structure of the model, the alternating training procedure, the region proposal network shown in Fig. 2, the label assignment for candidate regions, the loss function including the target classification loss, the angle classification loss, and the regression loss, and the hard negative mining method are the same as those described in step 102 of Embodiment one; details are not described here again.
The second acquisition unit 303 is configured to obtain an image to be detected.
The image to be detected is an image containing the preset target (such as a ship), and may be an image received from an external device, an image captured by the computer device, or an image read from the memory of the computer device; for details, refer to the description of step 103 in Embodiment one.
The detection unit 304 is configured to detect the image to be detected using the trained Faster R-CNN model to obtain a target region of the image to be detected and a target angle type of the target region. The detection process is the same as that described in step 104 of Embodiment one; details are not described here again.
Embodiment two obtains a training sample set, the training sample set including multiple target images labeled with target positions and target angle types; trains a Faster R-CNN model using the training sample set to obtain a trained Faster R-CNN model, where the region proposal network and the fast region-based convolutional neural network of the model share convolutional layers, the convolutional layers extract a feature map of each target image, the region proposal network obtains candidate regions and their target angle types according to the feature map, and the fast region-based convolutional neural network screens and adjusts the candidate regions to obtain the target region and the target angle type of each target image; obtains an image to be detected; and performs target detection on the image to be detected using the trained Faster R-CNN model to obtain a target region of the image to be detected and a target angle type of the target region.
Existing convolutional-neural-network-based target detection uses a selective search algorithm to generate candidate regions, which is time-consuming, and it separates region extraction from target detection. Embodiment two introduces a region proposal network into the Faster R-CNN model and extracts candidate regions using a deep convolutional neural network. After the networks are trained, by sharing the parameters of the convolutional layers, the feature map obtained from an image by the convolutional layers can be applied to both region extraction and target detection; that is, the computation of the shared convolutional layers is reused, which greatly increases the speed of region extraction, accelerates the entire detection pipeline, and realizes an end-to-end detection scheme. Moreover, Embodiment two takes into account the reduced detection rate caused by differing shooting angles: the Faster R-CNN model is trained with target images labeled with target angle types, which improves the detection rate of targets. Therefore, Embodiment two can achieve fast target detection with a high detection rate.
Embodiment three
Fig. 4 is a schematic diagram of the computer device provided in Embodiment three of the present invention. The computer device 1 includes a memory 20, a processor 30, and a computer program 40, such as an object detection program, stored in the memory 20 and executable on the processor 30. When executing the computer program 40, the processor 30 implements the steps in the above object detection method embodiment, such as steps 101-104 shown in Fig. 1. Alternatively, when executing the computer program 40, the processor 30 implements the functions of the modules/units in the above device embodiment, such as the units 301-304 in Fig. 3.
Illustratively, the computer program 40 may be divided into one or more modules/units, which are stored in the memory 20 and executed by the processor 30 to complete the present invention. The one or more modules/units may be a series of computer program instruction segments capable of completing specific functions, and the instruction segments are used to describe the execution process of the computer program 40 in the computer device 1. For example, the computer program 40 may be divided into the first acquisition unit 301, the training unit 302, the second acquisition unit 303, and the detection unit 304 in Fig. 3; for the specific functions of each unit, refer to Embodiment two.
The computer device 1 may be a computing device such as a desktop computer, a notebook, a palmtop computer, or a cloud server. Those skilled in the art will understand that the schematic diagram in Fig. 4 is only an example of the computer device 1 and does not constitute a limitation on the computer device 1; the computer device 1 may include more or fewer components than shown, may combine certain components, or may include different components. For example, the computer device 1 may also include input/output devices, network access devices, buses, and the like.
The processor 30 may be a central processing unit (Central Processing Unit, CPU), or may be another general-purpose processor, a digital signal processor (Digital Signal Processor, DSP), an application-specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field-programmable gate array (Field-Programmable Gate Array, FPGA) or another programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. The general-purpose processor may be a microprocessor, or the processor 30 may be any conventional processor or the like. The processor 30 is the control center of the computer device 1 and connects the various parts of the entire computer device 1 using various interfaces and lines.
The memory 20 may be used to store the computer program 40 and/or the modules/units. The processor 30 implements the various functions of the computer device 1 by running or executing the computer programs and/or modules/units stored in the memory 20 and by invoking the data stored in the memory 20. The memory 20 may mainly include a program storage area and a data storage area, where the program storage area may store an operating system and application programs required for at least one function (such as a sound playing function and an image playing function), and the data storage area may store data created according to the use of the computer device 1 (such as audio data and a phone book). In addition, the memory 20 may include a high-speed random access memory, and may also include a non-volatile memory, such as a hard disk, a memory, a plug-in hard disk, a smart media card (Smart Media Card, SMC), a secure digital (Secure Digital, SD) card, a flash card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another non-volatile solid-state storage device.
If the integrated modules/units of the computer device 1 are implemented in the form of software functional units and sold or used as independent products, they may be stored in a computer-readable storage medium. Based on this understanding, the present invention implements all or part of the processes in the above embodiment methods, which may also be completed by instructing the relevant hardware through a computer program. The computer program may be stored in a computer-readable storage medium, and when executed by a processor, the computer program can implement the steps of each of the above method embodiments. The computer program includes computer program code, which may be in the form of source code, object code, an executable file, some intermediate form, or the like. The computer-readable medium may include: any entity or device capable of carrying the computer program code, a recording medium, a USB flash drive, a removable hard disk, a magnetic disk, an optical disc, a computer memory, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), an electrical carrier signal, a telecommunication signal, a software distribution medium, and the like. It should be noted that the content contained in the computer-readable medium may be appropriately increased or decreased according to the requirements of legislation and patent practice in a jurisdiction; for example, in some jurisdictions, according to legislation and patent practice, computer-readable media do not include electrical carrier signals and telecommunication signals.
In the several embodiments provided by the present invention, it should be understood that the disclosed computer device and method may be implemented in other ways. For example, the computer device embodiment described above is only schematic; for example, the division of the units is only a division by logical function, and there may be other division manners in actual implementation.
In addition, the functional units in each embodiment of the present invention may be integrated in the same processing unit, or each unit may exist physically alone, or two or more units may be integrated in the same unit. The above integrated units may be implemented in the form of hardware, or in the form of hardware plus software functional modules.
It is obvious to those skilled in the art that the present invention is not limited to the details of the above exemplary embodiments, and that the present invention can be realized in other specific forms without departing from the spirit or essential attributes of the present invention. Therefore, from whichever point of view, the embodiments should be regarded as exemplary and non-restrictive. The scope of the present invention is defined by the appended claims rather than by the above description, and it is therefore intended that all changes falling within the meaning and scope of equivalents of the claims be included in the present invention. No reference numeral in the claims should be regarded as limiting the claim concerned. Furthermore, it is clear that the word "comprising" does not exclude other units or steps, and the singular does not exclude the plural. Multiple units or computer devices stated in a computer device claim may also be implemented by the same unit or computer device through software or hardware. Words such as "first" and "second" are used to denote names and do not denote any particular order.
Finally, it should be noted that the above embodiments are merely intended to illustrate, and not to limit, the technical solutions of the present invention. Although the present invention has been described in detail with reference to the preferred embodiments, those of ordinary skill in the art will understand that modifications or equivalent substitutions may be made to the technical solutions of the present invention without departing from the spirit and scope of the technical solutions of the present invention.
Claims (10)
1. An object detection method, characterized in that the method comprises:
obtaining a training sample set, the training sample set comprising a plurality of target images annotated with target locations and target angle types;
training an acceleration region convolutional neural network model using the training sample set to obtain a trained acceleration region convolutional neural network model, wherein the acceleration region convolutional neural network model comprises a region proposal network and a fast region convolutional neural network, the region proposal network and the fast region convolutional neural network share convolutional layers, the convolutional layers extract a feature map of each target image in the training sample set, the region proposal network obtains, according to the feature map, candidate regions in each target image and the target angle types of the candidate regions, and the fast region convolutional neural network screens and adjusts the candidate regions according to the feature map to obtain a target region of each target image and the target angle type of the target region;
obtaining an image to be detected;
performing target detection on the image to be detected using the trained acceleration region convolutional neural network model to obtain a target region of the image to be detected and the target angle type of the target region.
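For illustration only (the patent provides no code), the following is a minimal PyTorch-style sketch of the structure recited in claim 1: shared convolutional layers compute a feature map once, and a region proposal head predicts, for each anchor, an objectness score, box offsets, and an angle type. All class names, layer sizes, and anchor counts here are hypothetical assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn

class SharedConvLayers(nn.Module):
    """Shared convolutional layers that extract the feature map."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(3, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, 128, 3, stride=2, padding=1), nn.ReLU(),
        )

    def forward(self, image):
        return self.features(image)

class RegionProposalHead(nn.Module):
    """Region proposal head: per-anchor objectness, box offsets, and angle type."""
    def __init__(self, in_ch=128, num_anchors=9, num_angle_types=4):
        super().__init__()
        self.conv = nn.Conv2d(in_ch, 256, 3, padding=1)
        self.objectness = nn.Conv2d(256, num_anchors * 2, 1)    # object vs. background
        self.box_offsets = nn.Conv2d(256, num_anchors * 4, 1)   # candidate-region regression
        self.angle_type = nn.Conv2d(256, num_anchors * num_angle_types, 1)

    def forward(self, feature_map):
        h = torch.relu(self.conv(feature_map))
        return self.objectness(h), self.box_offsets(h), self.angle_type(h)

# The shared feature map is computed once and reused by both sub-networks.
backbone = SharedConvLayers()
rpn_head = RegionProposalHead()
feature_map = backbone(torch.randn(1, 3, 224, 224))
objectness, box_offsets, angle_types = rpn_head(feature_map)
```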
2. The method according to claim 1, characterized in that training the acceleration region convolutional neural network model using the training sample set comprises:
(1) initializing the region proposal network with an Imagenet model, and training the region proposal network using the training sample set;
(2) generating the candidate regions of each target image using the region proposal network trained in (1), and training the fast region convolutional neural network using the candidate regions;
(3) initializing the region proposal network using the fast region convolutional neural network trained in (2), and training the region proposal network using the training sample set;
(4) initializing the fast region convolutional neural network using the region proposal network trained in (3), keeping the shared convolutional layers fixed, and training the fast region convolutional neural network using the training sample set.
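A schematic outline of the four alternating training steps of claim 2 is sketched below; `train_network`, `generate_proposals`, and the weight-handling methods are hypothetical placeholders standing in for ordinary supervised training and RPN inference, not APIs defined by the patent.

```python
def train_network(net, data):
    """Hypothetical placeholder: one supervised training run over `data`."""
    ...

def generate_proposals(rpn, images):
    """Hypothetical placeholder: run the trained RPN to produce candidate regions."""
    ...

def alternating_training(rpn, fast_rcnn, train_set, imagenet_weights):
    """Outline of the four-step alternating training; all method names are hypothetical."""
    # (1) Initialize the RPN from an ImageNet-pretrained model, then train it.
    rpn.load_weights(imagenet_weights)
    train_network(rpn, train_set)

    # (2) Generate candidate regions with the trained RPN, then train the
    #     fast region convolutional neural network on those proposals.
    proposals = generate_proposals(rpn, train_set)
    train_network(fast_rcnn, proposals)

    # (3) Re-initialize the RPN from the trained fast region CNN, so that the
    #     two networks now share convolutional layers, and train it again.
    rpn.load_weights(fast_rcnn.shared_conv_weights())
    train_network(rpn, train_set)

    # (4) Initialize the fast region CNN from the trained RPN, keep the shared
    #     convolutional layers fixed, and train its remaining layers.
    fast_rcnn.load_weights(rpn.shared_conv_weights())
    fast_rcnn.freeze_shared_layers()
    train_network(fast_rcnn, train_set)
```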
3. The method according to claim 1, characterized in that training the acceleration region convolutional neural network model using the training sample set comprises:
training the region proposal network and the fast region convolutional neural network using a back-propagation algorithm, and adjusting the network parameters of the region proposal network and the fast region convolutional neural network during training so as to minimize a loss function, wherein the loss function comprises a target classification loss, an angle classification loss, and a regression loss.
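One plausible concrete form of the claim-3 loss, sketched in PyTorch, sums a target classification loss, an angle classification loss, and a box regression loss; the weighting factors are assumptions, since the patent does not state how the three terms are combined.

```python
import torch.nn.functional as F

def multitask_loss(cls_logits, cls_labels,
                   angle_logits, angle_labels,
                   box_preds, box_targets,
                   lam_angle=1.0, lam_reg=1.0):
    """Total loss = target classification + angle classification + regression."""
    cls_loss = F.cross_entropy(cls_logits, cls_labels)        # target classification loss
    angle_loss = F.cross_entropy(angle_logits, angle_labels)  # angle classification loss
    reg_loss = F.smooth_l1_loss(box_preds, box_targets)       # regression loss
    return cls_loss + lam_angle * angle_loss + lam_reg * reg_loss
```

Minimizing this combined objective by back-propagation adjusts the parameters of both sub-networks at once, which is the behavior claim 3 recites.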
4. The method according to any one of claims 1 to 3, characterized in that the acceleration region convolutional neural network model uses the ZF framework, and the region proposal network and the fast region convolutional neural network share 5 convolutional layers.
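The patent only states that the ZF framework is used and that 5 convolutional layers are shared; the layer sizes below follow the published ZF-net architecture (Zeiler & Fergus) and are therefore an assumption about the intended configuration rather than a quotation from the patent.

```python
import torch.nn as nn

# A ZF-style stack of 5 shared convolutional layers.
zf_shared_convs = nn.Sequential(
    nn.Conv2d(3, 96, kernel_size=7, stride=2, padding=3), nn.ReLU(),
    nn.MaxPool2d(3, stride=2, padding=1),
    nn.Conv2d(96, 256, kernel_size=5, stride=2, padding=2), nn.ReLU(),
    nn.MaxPool2d(3, stride=2, padding=1),
    nn.Conv2d(256, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 384, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(384, 256, kernel_size=3, padding=1), nn.ReLU(),
)
```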
5. The method according to any one of claims 1 to 3, characterized in that a hard negative example mining method is added during the training of the fast region convolutional neural network.
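Claim 5 names hard negative example mining without detailing the procedure; a common realization, sketched below as an assumption, keeps all positive samples and only the highest-loss negatives, up to a fixed negative-to-positive ratio.

```python
import torch

def mine_hard_negatives(per_sample_losses, labels, neg_pos_ratio=3):
    """Return a boolean mask selecting all positives plus the hardest negatives."""
    pos_mask = labels > 0
    num_neg = max(int(pos_mask.sum().item()) * neg_pos_ratio, 1)
    num_neg = min(num_neg, per_sample_losses.numel())
    neg_losses = per_sample_losses.clone()
    neg_losses[pos_mask] = float("-inf")   # positives never compete as negatives
    hard_neg_idx = torch.topk(neg_losses, k=num_neg).indices
    keep = pos_mask.clone()
    keep[hard_neg_idx] = True
    return keep
```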
6. An object detecting device, characterized in that the device comprises:
a first acquisition unit, configured to obtain a training sample set, the training sample set comprising a plurality of target images annotated with target locations and target angle types;
a training unit, configured to train an acceleration region convolutional neural network model using the training sample set to obtain a trained acceleration region convolutional neural network model, wherein the acceleration region convolutional neural network model comprises a region proposal network and a fast region convolutional neural network, the region proposal network and the fast region convolutional neural network share convolutional layers, the convolutional layers extract a feature map of each target image in the training sample set, the region proposal network obtains, according to the feature map, candidate regions in each target image and the target angle types of the candidate regions, and the fast region convolutional neural network screens and adjusts the candidate regions according to the feature map to obtain a target region of each target image and the target angle type of the target region;
a second acquisition unit, configured to obtain an image to be detected;
a detection unit, configured to perform target detection on the image to be detected using the trained acceleration region convolutional neural network model to obtain a target region of the image to be detected and the target angle type of the target region.
7. The device according to claim 6, characterized in that the training unit is specifically configured to:
(1) initialize the region proposal network with an Imagenet model, and train the region proposal network using the training sample set;
(2) generate the candidate regions of each target image using the region proposal network trained in (1), and train the fast region convolutional neural network using the candidate regions;
(3) initialize the region proposal network using the fast region convolutional neural network trained in (2), and train the region proposal network using the training sample set;
(4) initialize the fast region convolutional neural network using the region proposal network trained in (3), keep the shared convolutional layers fixed, and train the fast region convolutional neural network using the training sample set.
8. The device according to claim 6, characterized in that the training unit is specifically configured to:
train the region proposal network and the fast region convolutional neural network using a back-propagation algorithm, and adjust the network parameters of the region proposal network and the fast region convolutional neural network during training so as to minimize a loss function, wherein the loss function comprises a target classification loss, an angle classification loss, and a regression loss.
9. A computer installation, characterized in that the computer installation comprises a processor, and the processor implements the object detection method according to any one of claims 1 to 5 when executing a computer program stored in a memory.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the object detection method according to any one of claims 1 to 5.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711484723.3A CN108121986B (en) | 2017-12-29 | 2017-12-29 | Object detection method and device, computer device and computer readable storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201711484723.3A CN108121986B (en) | 2017-12-29 | 2017-12-29 | Object detection method and device, computer device and computer readable storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108121986A true CN108121986A (en) | 2018-06-05 |
CN108121986B CN108121986B (en) | 2019-12-17 |
Family
ID=62230688
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201711484723.3A Active CN108121986B (en) | 2017-12-29 | 2017-12-29 | Object detection method and device, computer device and computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108121986B (en) |
2017-12-29: CN application CN201711484723.3A, patent CN108121986B (en), status: Active
Patent Citations (9)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101271515A (en) * | 2007-03-21 | 2008-09-24 | 株式会社理光 | Image detection device capable of recognizing multi-angle objective |
US20150117760A1 (en) * | 2013-10-30 | 2015-04-30 | Nec Laboratories America, Inc. | Regionlets with Shift Invariant Neural Patterns for Object Detection |
WO2015078185A1 (en) * | 2013-11-29 | 2015-06-04 | 华为技术有限公司 | Convolutional neural network and target object detection method based on same |
CN104299012B (en) * | 2014-10-28 | 2017-06-30 | 银河水滴科技(北京)有限公司 | A kind of gait recognition method based on deep learning |
CN106250812A (en) * | 2016-07-15 | 2016-12-21 | 汤平 | A kind of model recognizing method based on quick R CNN deep neural network |
CN106647758A (en) * | 2016-12-27 | 2017-05-10 | 深圳市盛世智能装备有限公司 | Target object detection method and device and automatic guiding vehicle following method |
CN106919978A (en) * | 2017-01-18 | 2017-07-04 | 西南交通大学 | A kind of high ferro contact net support meanss parts recognition detection method |
CN106845430A (en) * | 2017-02-06 | 2017-06-13 | 东华大学 | Pedestrian detection and tracking based on acceleration region convolutional neural networks |
CN106874894A (en) * | 2017-03-28 | 2017-06-20 | 电子科技大学 | A kind of human body target detection method based on the full convolutional neural networks in region |
Non-Patent Citations (4)
Title |
---|
CHAOYUE CHEN ET AL: "Learning oriented region-based convolutional neural networks for building detection in satellite remote sensing images", Remote Sensing and Spatial Information Sciences *
CHENCHEN ZHU ET AL: "CMS-RCNN: contextual multi-scale region-based CNN for unconstrained face detection", Deep Learning for Biometrics *
YANG Ming et al.: "Ship video detection method using very-fast region convolutional neural networks", Journal of Beijing University of Posts and Telecommunications *
HU Peng et al.: "Multi-target tracking algorithm based on region convolutional neural networks", Journal of Southwest University of Science and Technology *
Cited By (46)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109101966A (en) * | 2018-06-08 | 2018-12-28 | 中国科学院宁波材料技术与工程研究所 | Workpiece identification positioning and posture estimation system and method based on deep learning |
CN109101966B (en) * | 2018-06-08 | 2022-03-08 | 中国科学院宁波材料技术与工程研究所 | Workpiece recognition positioning and attitude estimation system and method based on deep learning |
CN110619255A (en) * | 2018-06-19 | 2019-12-27 | 杭州海康威视数字技术股份有限公司 | Target detection method and device |
CN110619255B (en) * | 2018-06-19 | 2022-08-26 | 杭州海康威视数字技术股份有限公司 | Target detection method and device |
CN108921099A (en) * | 2018-07-03 | 2018-11-30 | 常州大学 | Moving ship object detection method in a kind of navigation channel based on deep learning |
CN109063740A (en) * | 2018-07-05 | 2018-12-21 | 高镜尧 | The detection model of ultrasonic image common-denominator target constructs and detection method, device |
CN110796154B (en) * | 2018-08-03 | 2023-03-24 | 华为云计算技术有限公司 | Method, device and equipment for training object detection model |
CN110796154A (en) * | 2018-08-03 | 2020-02-14 | 华为技术有限公司 | Method, device and equipment for training object detection model |
US11605211B2 (en) | 2018-08-03 | 2023-03-14 | Huawei Cloud Computing Technologies Co., Ltd. | Object detection model training method and apparatus, and device |
CN112513935A (en) * | 2018-08-10 | 2021-03-16 | 奥林巴斯株式会社 | Image processing method and image processing apparatus |
CN110837760B (en) * | 2018-08-17 | 2022-10-14 | 北京四维图新科技股份有限公司 | Target detection method, training method and device for target detection |
CN110837760A (en) * | 2018-08-17 | 2020-02-25 | 北京四维图新科技股份有限公司 | Target detection method, training method and device for target detection |
CN109376619A (en) * | 2018-09-30 | 2019-02-22 | 中国人民解放军陆军军医大学 | A kind of cell detection method |
CN109376619B (en) * | 2018-09-30 | 2021-10-15 | 中国人民解放军陆军军医大学 | Cell detection method |
CN109359683A (en) * | 2018-10-15 | 2019-02-19 | 百度在线网络技术(北京)有限公司 | Object detection method, device, terminal and computer readable storage medium |
CN111144398A (en) * | 2018-11-02 | 2020-05-12 | 银河水滴科技(北京)有限公司 | Target detection method, target detection device, computer equipment and storage medium |
CN109583445B (en) * | 2018-11-26 | 2024-08-02 | 平安科技(深圳)有限公司 | Text image correction processing method, device, equipment and storage medium |
CN109583445A (en) * | 2018-11-26 | 2019-04-05 | 平安科技(深圳)有限公司 | Character image correction processing method, device, equipment and storage medium |
CN109583396A (en) * | 2018-12-05 | 2019-04-05 | 广东亿迅科技有限公司 | A kind of region prevention method, system and terminal based on CNN two stages human testing |
CN111310775B (en) * | 2018-12-11 | 2023-08-25 | Tcl科技集团股份有限公司 | Data training method, device, terminal equipment and computer readable storage medium |
CN111310775A (en) * | 2018-12-11 | 2020-06-19 | Tcl集团股份有限公司 | Data training method and device, terminal equipment and computer readable storage medium |
CN109784385A (en) * | 2018-12-29 | 2019-05-21 | 广州海昇计算机科技有限公司 | A kind of commodity automatic identifying method, system, device and storage medium |
CN109934088A (en) * | 2019-01-10 | 2019-06-25 | 海南大学 | Sea ship discrimination method based on deep learning |
CN109886998A (en) * | 2019-01-23 | 2019-06-14 | 平安科技(深圳)有限公司 | Multi-object tracking method, device, computer installation and computer storage medium |
WO2020151166A1 (en) * | 2019-01-23 | 2020-07-30 | 平安科技(深圳)有限公司 | Multi-target tracking method and device, computer device and readable storage medium |
WO2020151329A1 (en) * | 2019-01-23 | 2020-07-30 | 平安科技(深圳)有限公司 | Target detection based identification box determining method and device and terminal equipment |
CN110288082A (en) * | 2019-06-05 | 2019-09-27 | 北京字节跳动网络技术有限公司 | Convolutional neural networks model training method, device and computer readable storage medium |
CN110428357A (en) * | 2019-08-09 | 2019-11-08 | 厦门美图之家科技有限公司 | The detection method of watermark, device, electronic equipment and storage medium in image |
CN110443244A (en) * | 2019-08-12 | 2019-11-12 | 深圳市捷顺科技实业股份有限公司 | A kind of method and relevant apparatus of graphics process |
CN110443244B (en) * | 2019-08-12 | 2023-12-05 | 深圳市捷顺科技实业股份有限公司 | Graphics processing method and related device |
CN110599456A (en) * | 2019-08-13 | 2019-12-20 | 杭州智团信息技术有限公司 | Method for extracting specific region of medical image |
CN110717905B (en) * | 2019-09-30 | 2022-07-05 | 上海联影智能医疗科技有限公司 | Brain image detection method, computer device, and storage medium |
CN110717905A (en) * | 2019-09-30 | 2020-01-21 | 上海联影智能医疗科技有限公司 | Brain image detection method, computer device, and storage medium |
CN110807431A (en) * | 2019-11-06 | 2020-02-18 | 上海眼控科技股份有限公司 | Object positioning method and device, electronic equipment and storage medium |
CN110991531A (en) * | 2019-12-02 | 2020-04-10 | 中电科特种飞机系统工程有限公司 | Training sample library construction method, device and medium based on air-to-ground small and slow target |
CN113496223A (en) * | 2020-03-19 | 2021-10-12 | 顺丰科技有限公司 | Method and device for establishing text region detection model |
CN111462094A (en) * | 2020-04-03 | 2020-07-28 | 联觉(深圳)科技有限公司 | PCBA component detection method and device and computer readable storage medium |
WO2022001501A1 (en) * | 2020-06-29 | 2022-01-06 | 华为技术有限公司 | Data annotation method and apparatus, and computer device and storage medium |
CN112256906A (en) * | 2020-10-23 | 2021-01-22 | 安徽启新明智科技有限公司 | Method, device and storage medium for marking annotation on display screen |
CN112001375A (en) * | 2020-10-29 | 2020-11-27 | 成都睿沿科技有限公司 | Flame detection method and device, electronic equipment and storage medium |
CN112464785A (en) * | 2020-11-25 | 2021-03-09 | 浙江大华技术股份有限公司 | Target detection method and device, computer equipment and storage medium |
CN112464785B (en) * | 2020-11-25 | 2024-08-09 | 浙江大华技术股份有限公司 | Target detection method, device, computer equipment and storage medium |
CN113239975B (en) * | 2021-04-21 | 2022-12-20 | 国网甘肃省电力公司白银供电公司 | Target detection method and device based on neural network |
CN113239975A (en) * | 2021-04-21 | 2021-08-10 | 洛阳青鸟网络科技有限公司 | Target detection method and device based on neural network |
CN112949614A (en) * | 2021-04-29 | 2021-06-11 | 成都市威虎科技有限公司 | Face detection method and device for automatically allocating candidate areas and electronic equipment |
CN113333321A (en) * | 2021-05-11 | 2021-09-03 | 北京若贝特智能机器人科技有限公司 | Automatic identification and classification conveying method, system and device and storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN108121986B (en) | 2019-12-17 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108121986A (en) | Object detection method and device, computer installation and computer readable storage medium | |
CN109886998A (en) | Multi-object tracking method, device, computer installation and computer storage medium | |
CN104424634B (en) | Object tracking method and device | |
CN108470354A (en) | Video target tracking method, device and realization device | |
Shen et al. | Exemplar-based human action pose correction and tagging | |
US11574430B2 (en) | Method and system for creating animal type avatar using human face | |
CN110472534A (en) | 3D object detection method, device, equipment and storage medium based on RGB-D data | |
CN111185008B (en) | Method and apparatus for controlling virtual character in game | |
CN112699832B (en) | Target detection method, device, equipment and storage medium | |
WO2022062238A1 (en) | Football detection method and apparatus, and computer-readable storage medium and robot | |
CN107341548A (en) | A kind of data processing method, device and electronic equipment | |
CN106874906A (en) | A kind of binarization method of picture, device and terminal | |
CN109902541A (en) | A kind of method and system of image recognition | |
CN109614990A (en) | A kind of object detecting device | |
CN107871143A (en) | Image-recognizing method and device, computer installation and computer-readable recording medium | |
CN111915657A (en) | Point cloud registration method and device, electronic equipment and storage medium | |
CN114972947B (en) | Depth scene text detection method and device based on fuzzy semantic modeling | |
CN110287857A (en) | A kind of training method of characteristic point detection model | |
CN108229658A (en) | The implementation method and device of object detector based on finite sample | |
CN107895021B (en) | image recognition method and device, computer device and computer readable storage medium | |
CN108256400A (en) | The method for detecting human face of SSD based on deep learning | |
CN112837322A (en) | Image segmentation method and device, equipment and storage medium | |
CN103295026B (en) | Based on the image classification method of space partial polymerization description vectors | |
CN112749576B (en) | Image recognition method and device, computing equipment and computer storage medium | |
CN109712146A (en) | A kind of EM multi-threshold image segmentation method and device based on histogram |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication ||
SE01 | Entry into force of request for substantive examination ||
GR01 | Patent grant ||