CN109740598A

CN109740598A - Object localization method and device for structured scenes

Info

Publication number: CN109740598A
Application number: CN201811640359.XA
Authority: CN
Inventors: 戴鹏; 王胜春; 杜馨瑜; 顾子晨; 方玥
Original assignee: China Academy of Railway Sciences Corp Ltd CARS; Infrastructure Inspection Institute of CARS; Beijing IMAP Technology Co Ltd
Current assignee: China Academy of Railway Sciences Corp Ltd CARS; Infrastructure Inspection Institute of CARS; Beijing IMAP Technology Co Ltd
Priority date: 2018-12-29
Filing date: 2018-12-29
Publication date: 2019-05-10

Abstract

The invention discloses an object localization method and device for structured scenes. The method comprises: obtaining an image to be detected; generating a feature map corresponding to the image to be detected according to feature convolution kernel parameters in a pre-trained recognition network; determining a structured detection region to be verified from the feature map according to RPN convolution kernel parameters in the recognition network; generating position-sensitive score maps of the positioning object and the reference object in the structured detection region according to score-map-generating convolution kernel parameters in the recognition network; verifying the structured detection region to be verified according to the position-sensitive score maps; and, if the verification passes, determining position information of the positioning object within the structured detection region. The invention solves the technical problem that prior-art deep learning models struggle to localize targets at high speed.

Description

Object localization method and device for structured scenes

Technical field

The present invention relates to the field of deep learning, and in particular to an object localization method and device for structured scenes.

Background art

In recent years, with the rapid development of the railway sector, the total length of China's railways has reached 124,000 kilometers. Rail fasteners are the track infrastructure components that connect the rail to the sleepers; their function is to fix the rail to the sleeper, maintain the track gauge, and prevent lateral movement of the rail. When a fastener is abnormal it no longer holds the rail in place, which seriously affects the operational safety of trains. The service state of rail fasteners is therefore critical to safe railway operation, and fasteners must be inspected regularly so that abnormal states are found in time.

In recent years, object detection based on deep learning has achieved major breakthroughs. Deep-learning detectors generally fall into two groups: the R-CNN family, based on candidate region generation, and the YOLO and SSD families, based on regression (without region proposals). These detection algorithms have greatly improved the accuracy of fastener detection. However, most existing deep learning models are designed for multi-class object detection in natural scenes and struggle to meet the ultra-high-speed detection requirements of railway fasteners. Meeting the real-time detection demand of a high-speed comprehensive inspection train running at 350 km/h places high requirements on fastener localization: to localize fasteners at 350 km/h, the localization speed must reach 49 frames per second, that is, about 20 ms per frame at most. Existing deep learning models have difficulty reaching such a high localization speed and cannot guarantee localization accuracy under high-speed detection.
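The 49 frames/second figure follows from the train speed and the along-track distance covered by each frame. The sketch below reproduces that arithmetic, assuming the 2 m per-frame spatial sampling distance stated later in this description and assuming consecutive frames do not overlap; the function name is illustrative only.

```python
import math

def required_frame_rate(speed_kmh: float, metres_per_frame: float) -> int:
    """Smallest whole number of frames per second that covers the track without gaps."""
    speed_ms = speed_kmh * 1000.0 / 3600.0      # km/h -> m/s
    return math.ceil(speed_ms / metres_per_frame)

fps = required_frame_rate(350.0, 2.0)   # assumption: each frame spans 2 m of track
budget_ms = 1000.0 / fps                # time available to process one frame
print(fps, round(budget_ms, 1))
```

At 350 km/h the train covers roughly 97.2 m per second, so about 48.6 frames of 2 m each are needed per second, which rounds up to 49 and leaves a per-frame budget of about 20 ms.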

Summary of the invention

The main purpose of the present invention is to provide an object localization method and device for structured scenes, so as to solve the technical problem that prior-art deep learning models struggle to localize targets at high speed.

To achieve the above purpose, according to one aspect of the present invention, an object localization method for structured scenes is provided. The method comprises:

obtaining an image to be detected;

generating a feature map corresponding to the image to be detected according to feature convolution kernel parameters in a pre-trained recognition network;

determining a structured detection region to be verified from the feature map according to RPN convolution kernel parameters in the recognition network, wherein the structured detection region comprises at least one positioning object and a reference object;

generating position-sensitive score maps of the positioning object and the reference object in the structured detection region according to score-map-generating convolution kernel parameters in the recognition network;

verifying the structured detection region to be verified according to the position-sensitive score maps; and

if the verification passes, determining position information of the positioning object within the structured detection region.

Further, the method also comprises:

obtaining labeled image samples for training, wherein each labeled image sample comprises a labeled structured detection region, and the labeled structured detection region comprises at least one labeled positioning object and a labeled reference object;

training the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map-generating convolution kernel parameters on the labeled image samples; and

generating the trained model parameters of the recognition network from the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map-generating convolution kernel parameters.

To achieve the above purpose, according to another aspect of the present invention, an object localization device for structured scenes is provided. The device comprises:

an image acquisition unit, configured to obtain an image to be detected;

a feature map generation unit, configured to generate a feature map corresponding to the image to be detected according to feature convolution kernel parameters in a pre-trained recognition network;

a structured detection region determination unit, configured to determine a structured detection region to be verified from the feature map according to RPN convolution kernel parameters in the recognition network, wherein the structured detection region comprises at least one positioning object and a reference object;

a position-sensitive score map generation unit, configured to generate position-sensitive score maps of the positioning object and the reference object in the structured detection region according to score-map-generating convolution kernel parameters in the recognition network;

a structured detection region verification unit, configured to verify the structured detection region to be verified according to the position-sensitive score maps; and

an object position information determination unit, configured to determine position information of the positioning object within the structured detection region.

Further, the device also comprises:

a labeled image sample acquisition unit, configured to obtain labeled image samples for training, wherein each labeled image sample comprises a labeled structured detection region, and the labeled structured detection region comprises at least one labeled positioning object and a labeled reference object;

a convolution kernel parameter training unit, configured to train the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map-generating convolution kernel parameters on the labeled image samples; and

a trained model parameter generation unit, configured to generate the trained model parameters of the recognition network from the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map-generating convolution kernel parameters.

To achieve the above purpose, according to another aspect of the present invention, a computer device is also provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, implements the steps of the above object localization method for structured scenes.

To achieve the above purpose, according to another aspect of the present invention, a computer-readable storage medium is also provided. The computer-readable storage medium stores a computer program which, when executed by a computer processor, implements the steps of the above object localization method for structured scenes.

The beneficial effects of the invention are as follows: the invention proposes an optimized structured-region fully convolutional deep neural network that makes full use of the spatial structure information of the track and converts the localization of small fastener targets into the localization of a structured region, which greatly improves the speed of fastener localization.

Brief description of the drawings

To explain the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. The drawings described below are obviously only some embodiments of the present invention; those of ordinary skill in the art can derive other drawings from them without creative effort. In the drawings:

Fig. 1 is a flow chart of the object localization method for structured scenes according to an embodiment of the present invention;

Fig. 2 is a flow chart of the method for training the recognition network according to an embodiment of the present invention;

Fig. 3 is a flow chart of the method for verifying the structured detection region according to an embodiment of the present invention;

Fig. 4 is a first structural block diagram of the object localization device for structured scenes according to an embodiment of the present invention;

Fig. 5 is a second structural block diagram of the object localization device for structured scenes according to an embodiment of the present invention;

Fig. 6 is a structural diagram of the structured detection region verification unit according to an embodiment of the present invention;

Fig. 7 is a basic flow chart of object localization based on deep learning;

Fig. 8 is a diagram of the structured prior information of the track scene according to an embodiment of the present invention;

Fig. 9 is a schematic diagram of small-target fastener labeling;

Fig. 10 is a schematic diagram of large-target structured region labeling according to an embodiment of the present invention;

Fig. 11 is a schematic diagram of a positive sample of ballastless track according to an embodiment of the present invention;

Fig. 12 is a schematic diagram of a negative sample of ballastless track according to an embodiment of the present invention;

Fig. 13 is a schematic diagram of the SR-FCN network structure according to an embodiment of the present invention;

Fig. 14 is a schematic diagram of the original anchor generation rule;

Fig. 15 is a schematic diagram of the anchor generation rule according to an embodiment of the present invention.

Detailed description of the embodiments

To enable those skilled in the art to better understand the solution of the present invention, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, not all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art on the basis of these embodiments without creative effort shall fall within the scope of protection of the present invention.

Those skilled in the art should understand that embodiments of the present invention may be provided as a method, a system, or a computer program product. The present invention may therefore take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) containing computer-usable program code.

It should be noted that the terms "first", "second", and the like in the description, the claims, and the above drawings are used to distinguish similar objects and do not describe a particular order or sequence. Data so used are interchangeable where appropriate, so that the embodiments described herein can be implemented. In addition, the terms "comprise" and "have", and any variants of them, are intended to cover non-exclusive inclusion: a process, method, system, product, or device that contains a series of steps or units is not necessarily limited to the steps or units expressly listed, and may include other steps or units that are not expressly listed or that are inherent to the process, method, product, or device.

It should be noted that, where no conflict arises, the embodiments of the present invention and the features in them may be combined with one another. The present invention is described in detail below with reference to the drawings and in conjunction with the embodiments.

One optional application scenario of the present invention is the detection and localization of fasteners on a rail. It should be noted that the application of the present invention is not limited to this scenario.

The invention proposes an optimized structured-region fully convolutional deep neural network (Structured Regions-Fully Convolutional Networks, SR-FCN). It makes full use of the spatial structure information of the track and converts the detection of small fastener targets into the localization of a structured region. By reducing the number of anchors traversed by the Region Proposal Network (RPN), it greatly improves the speed of fastener localization, avoids localization errors caused by locally missing fasteners or background interference, and improves the robustness of detection.

As shown in Fig. 9, the fasteners on a track are discrete small targets in the track image. When prior-art deep models localize fasteners in an image, they usually rely on sliding-window search to generate a candidate detection region for each fastener. Such exhaustive sliding-window search is very inefficient.

The present invention observes that the positions of facilities such as the rail, fasteners, and track slab are relatively fixed in a track image, and that their position distribution constitutes structured features specific to the track scene. A railway track image contains at least the following 7 pieces of fixed structural prior knowledge:

(1) each frame of track image contains only one rail;

(2) the rail is always perpendicular to the x-axis of the image, and the two boundaries of the rail are parallel;

(3) the spatial sampling distance of each image in the height direction is 2 m, with an error of less than 2 mm;

(4) the width of the rail is a fixed pixel value;

(5) the fastener regions are always symmetric about the rail on either side of the rail boundaries, and the size of a fastener region is fixed;

(6) the spacing between adjacent fasteners (or sleepers) along the vertical direction is relatively fixed;

(7) each image contains 6 complete fasteners, arranged in a grid pattern (resembling the character 田).

Fig. 8 is a diagram of the structured prior information of the track scene according to an embodiment of the present invention. As shown in Fig. 8, in each frame of track image the rail is about 60 pixels wide, a fastener region is about 80 pixels wide, the lateral spacing of the fasteners is about 55-65 pixels, and their longitudinal spacing is about 275-315 pixels.

The present invention uses the prior information known for railway track images to convert the detection of multiple small fastener targets in one image into the detection of a single large target region with a fixed structure. Converting the detection of small fastener targets into the localization of a structured region speeds up network convergence and reduces the number of candidate regions generated, which greatly improves detection speed. The present invention also makes full use of the known prior information of the scene by incorporating it into every stage of the deep network, including sample construction, candidate region generation, network construction, and loss function constraints. This greatly narrows the range of candidate fastener regions, improving detection efficiency while ensuring detection accuracy.

Fig. 7 is a basic flow chart of object localization based on deep learning. As shown in Fig. 7, the localization process is divided into two stages, "offline training" and "online detection". In the offline training stage, samples are first labeled automatically from a large number of track images by template matching to build a large sample set for learning; the sample set is fed into the deep network for offline training and tuning, which yields the network model parameters. In the online detection stage, the deep network is initialized with the trained model parameters, which gives the network its localization capability; a single track image to be detected is then fed into the initialized deep network to localize targets online in real time. The embodiments of the present invention are described from both the offline training side and the online detection side.

Fig. 1 is a flow chart of the object localization method for structured scenes according to an embodiment of the present invention, and describes the invention from the online detection side. As shown in Fig. 1, the object localization method for structured scenes according to the embodiment comprises steps S101 to S106.

Step S101: obtain an image to be detected. In embodiments of the present invention, localizing an object in an image with the pre-trained recognition network first requires obtaining the image to be detected. After the image is obtained, preprocessing operations such as scale adjustment and gray-level normalization are applied, and the preprocessed image is then fed into the recognition network.
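The preprocessing in step S101 can be sketched as follows. The description names only "scale adjustment" and "gray-level normalization"; the nearest-neighbour resampling and min-max scaling used here are assumptions chosen for illustration, and all function names are hypothetical.

```python
from typing import List

Image = List[List[float]]

def resize_nearest(img: Image, out_h: int, out_w: int) -> Image:
    """Scale adjustment by nearest-neighbour sampling (assumed interpolation)."""
    in_h, in_w = len(img), len(img[0])
    return [[img[r * in_h // out_h][c * in_w // out_w] for c in range(out_w)]
            for r in range(out_h)]

def normalize_gray(img: Image) -> Image:
    """Gray-level normalization to [0, 1] by min-max scaling (assumed scheme)."""
    lo = min(min(row) for row in img)
    hi = max(max(row) for row in img)
    span = (hi - lo) or 1.0            # avoid division by zero on flat images
    return [[(v - lo) / span for v in row] for row in img]

def preprocess(img: Image, out_h: int, out_w: int) -> Image:
    """Resize first, then normalize, before the image enters the network."""
    return normalize_gray(resize_nearest(img, out_h, out_w))

raw = [[0, 50, 100, 150],
       [50, 100, 150, 200],
       [100, 150, 200, 250],
       [150, 200, 250, 255]]
small = preprocess(raw, 2, 2)
print(small)
```

A production pipeline would more likely use an image library with proper interpolation; the point here is only the order of operations.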

Step S102: generate the feature map corresponding to the image to be detected according to the feature convolution kernel parameters in the pre-trained recognition network. In embodiments of the present invention, the pre-trained recognition network is obtained by initializing a deep learning network with the trained model parameters. As shown in Fig. 13, the structural diagram of the optimized structured-region fully convolutional deep neural network (SR-FCN) proposed by the invention, once the image has been fed into the recognition network, the network generates the feature map of the image to be detected according to its trained feature convolution kernel parameters. In embodiments of the present invention, the trained feature convolution kernel parameters may correspond to multiple groups of convolutional layers.

In embodiments of the present invention, because small-target detection is converted into the detection of a large target with a fixed structure, the detection target carries fixed structural information and there is only a single detection class. To prevent the network from overfitting and to improve detection speed, the network should not be too deep. In embodiments of the present invention, the convolutional network corresponding to the feature convolution kernel parameters may therefore use the VGG16 or ResNet-18 network structure.

Step S103: determine a structured detection region to be verified from the feature map according to the RPN convolution kernel parameters in the recognition network, wherein the structured detection region comprises at least one positioning object and a reference object. As shown in Fig. 13, the structural diagram of the SR-FCN proposed by the invention, once the feature map has been obtained, the trained RPN convolution kernel parameters of the recognition network are applied to the feature map to generate the structured detection region to be verified. The RPN, or region proposal network, generates candidate regions and thereby obtains the corresponding regions of interest (ROI). In embodiments of the present invention, the RPN convolution kernel parameters may correspond to multiple groups of convolutional layers.

In embodiments of the present invention, because small-target detection is converted into the detection of a structured large target, the RPN convolution kernel parameters are used to generate candidate regions for the structured large target, that is, to generate structured detection regions. As shown in Figs. 9 and 10, existing detection of small fastener targets has to generate a candidate region for every fastener target in the image. By converting small-target detection into structured large-target detection, the RPN of the present invention only has to generate candidate regions for the structured large target. This reduces the number of candidate regions generated and greatly improves detection speed.

In an embodiment of the present invention, for the scenario of detecting and localizing fasteners on a rail, the small fastener targets can be converted into the structured large target shown in Fig. 10. The structured large target shown in Fig. 10 comprises 5 × 3 sub-regions, namely the cells R₁₁, R₁₂, R₁₃, R₂₁, R₂₂, R₂₃, R₃₁, R₃₂, R₃₃, R₄₁, R₄₂, R₄₃, R₅₁, R₅₂, R₅₃, where R₁₁, R₁₃, R₃₁, R₃₃, R₅₁, R₅₃ are the 6 target fastener regions to be localized, R₂₁, R₂₃, R₄₁, R₄₃ are the 4 reference track-bed regions, and R₁₂, R₂₂, R₃₂, R₄₂, R₅₂ are the 5 reference rail regions. The fastener regions, rail regions, and track-bed regions are arranged in the form of the following matrix:

$$\begin{bmatrix} R_{11} & R_{12} & R_{13} \\ R_{21} & R_{22} & R_{23} \\ R_{31} & R_{32} & R_{33} \\ R_{41} & R_{42} & R_{43} \\ R_{51} & R_{52} & R_{53} \end{bmatrix}$$

The structured large target shown in Fig. 10 contains 3 pairs of fasteners, so the 3 pairs of fasteners together with the corresponding reference rail and track-bed regions, 15 sub-regions in total, form one structured region. In embodiments of the present invention, the layout of the structured region depends on the size of the image to be detected. In track images collected by existing inspection vehicles, each frame typically corresponds to a spatial sample of 2 m; as shown in Fig. 8, a 2 m spatial sample usually contains only 3 complete pairs of fasteners, which is why a 5 × 3 structured region is built around 3 pairs of fasteners.

In alternative embodiments of the present invention, different structured regions can be constructed for different image sizes. When each frame of the track images collected by the inspection vehicle contains only 1 complete pair of fasteners, a 1 × 3 structured region can be built for that pair, following the composition of the structured region in Fig. 10. Similarly, when each frame contains only 2 complete pairs of fasteners, a 3 × 3 structured region can be built in the same way. As imaging technology develops, each frame collected by the inspection vehicle may contain more fasteners, but the structured region can still be composed entirely along the lines of Fig. 10. The present invention therefore places no limit on the specific structure of the structured region.
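The layouts described above (1 × 3, 3 × 3, and 5 × 3 for one, two, and three pairs of fasteners) follow one pattern: fastener rows alternate with track-bed rows, and the rail occupies the middle column. A minimal sketch of that pattern, with hypothetical names and string labels chosen for illustration:

```python
from typing import List, Tuple

FASTENER, RAIL, BED = "fastener", "rail", "track_bed"

def build_layout(n_pairs: int) -> List[List[str]]:
    """Return the (2 * n_pairs - 1) x 3 grid of sub-region types."""
    if n_pairs < 1:
        raise ValueError("need at least one pair of fasteners")
    rows = []
    for r in range(2 * n_pairs - 1):
        side = FASTENER if r % 2 == 0 else BED   # fastener rows alternate with track-bed rows
        rows.append([side, RAIL, side])          # the rail is always the middle column
    return rows

def cells_of(layout: List[List[str]], kind: str) -> List[Tuple[int, int]]:
    """1-based (row, column) indices of every cell of the given type."""
    return [(r + 1, c + 1) for r, row in enumerate(layout)
            for c, k in enumerate(row) if k == kind]

layout = build_layout(3)
fastener_cells = cells_of(layout, FASTENER)
print(fastener_cells)
```

For three pairs, the fastener cells come out as the indices of R₁₁, R₁₃, R₃₁, R₃₃, R₅₁, R₅₃, matching the 5 × 3 region of Fig. 10.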

Step S104: generate the position-sensitive score maps of the positioning object and the reference object in the structured detection region according to the score-map-generating convolution kernel parameters in the recognition network. As shown in Fig. 13, the structural diagram of the SR-FCN proposed by the invention, after the trained RPN convolution kernel parameters have produced the structured detection region to be verified, the trained score-map-generating convolution kernel parameters are used to generate the position-sensitive score maps corresponding to that region. The position-sensitive score maps contain the position-sensitive score of each sub-region in the structured detection region. In embodiments of the present invention, the score-map-generating convolution kernel parameters may correspond to multiple groups of convolutional layers.

As in the existing R-FCN network, the position-sensitive score maps that the proposed SR-FCN network generates with the score-map-generating convolution kernel parameters comprise a first position-sensitive score map used for classification and a second position-sensitive score map used to adjust sub-region positions by regression. Unlike R-FCN, which divides the target into equal-sized cells such as 3 × 3, the present invention divides the structured detection region into sub-regions of different structure and size. In the embodiment of Fig. 10, where 3 pairs of fasteners and the corresponding reference rail and track-bed regions (15 sub-regions in total) form one structured region, the first position-sensitive score map, used for classification, has 5 × 3 × (C+1) dimensions, where C is the number of detection target classes, that is, the number of classes into which the large-target structured region is labeled during training. C may equal 1 in an embodiment of the present invention, meaning there is a single detection class. The second position-sensitive score map, used to adjust sub-region positions by regression, is a position-sensitive score mapping of 5 × 3 × 4 dimensions: the position of each sub-region can be written as a four-tuple (x, y, w, h), so the 15 sub-regions give 5 × 3 × 4 position score maps in total.
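The channel counts quoted above can be checked with a few lines of arithmetic. This is only a bookkeeping sketch with a hypothetical function name; reading the extra "+1" channel as a background class follows the usual R-FCN convention and is an assumption, since the description only states C+1.

```python
from typing import Dict

def score_map_channels(rows: int, cols: int, num_classes: int) -> Dict[str, int]:
    """Channel counts of the two position-sensitive score maps."""
    cells = rows * cols
    return {
        "classification": cells * (num_classes + 1),  # C + 1 scores per sub-region (assumed: +1 is background)
        "regression": cells * 4,                      # (x, y, w, h) per sub-region
    }

channels = score_map_channels(rows=5, cols=3, num_classes=1)
print(channels)
```

With the 5 × 3 layout and C = 1 this gives 30 classification channels and 60 regression channels.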

As in R-FCN, the proposed SR-FCN network has 3 branches on the feature map produced by the last convolutional layer of the pre-trained recognition network. The 1st branch applies the region proposal network (RPN) to the feature map to generate candidate regions and obtain the corresponding regions of interest (ROI), that is, the candidate regions of the structured large target generated by the RPN. The 2nd branch derives from the feature map the first position-sensitive score map, used for classification. The 3rd branch derives from the feature map the second position-sensitive score map, used to adjust sub-region positions by regression.

Step S105: verify the structured detection region to be verified according to the position-sensitive score maps. As shown in Fig. 13, the structural diagram of the SR-FCN proposed by the invention, once the position-sensitive score maps have been obtained, position-sensitive ROI pooling (Position-Sensitive RoI Pooling) can be performed separately on the first and second position-sensitive score maps, following prior-art ROI pooling and voting classification rules. The structured candidate region generated by the RPN is then verified through region voting and local regression, that is, it is checked whether the structured candidate region generated by the RPN is a correct structured detection region.
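The pooling-and-voting idea in step S105 can be illustrated on the classification score map alone. The sketch below is a simplification under several assumptions: bins are equal-sized (the description states that SR-FCN sub-regions differ in size), pooling is a plain average, score maps are ordered bin-major, and the acceptance rule in `verify_region` is a hypothetical threshold rather than the rule used in the invention.

```python
from typing import List, Tuple

Map2D = List[List[float]]

def ps_roi_pool(score_maps: List[Map2D], roi: Tuple[int, int, int, int],
                rows: int, cols: int, num_classes: int) -> List[float]:
    """Average-pool each bin from its own score map, then vote per class.

    score_maps holds rows * cols * (num_classes + 1) maps, ordered bin-major.
    roi is (x0, y0, x1, y1) in score-map pixels, end-exclusive.
    """
    x0, y0, x1, y1 = roi
    n_out = num_classes + 1
    votes = [0.0] * n_out
    for i in range(rows):
        ya = y0 + (y1 - y0) * i // rows
        yb = max(ya + 1, y0 + (y1 - y0) * (i + 1) // rows)
        for j in range(cols):
            xa = x0 + (x1 - x0) * j // cols
            xb = max(xa + 1, x0 + (x1 - x0) * (j + 1) // cols)
            for c in range(n_out):
                m = score_maps[(i * cols + j) * n_out + c]   # map dedicated to bin (i, j), class c
                vals = [m[y][x] for y in range(ya, yb) for x in range(xa, xb)]
                votes[c] += sum(vals) / len(vals)
    return [v / (rows * cols) for v in votes]                # region voting: mean over all bins

def verify_region(votes: List[float], threshold: float = 0.5) -> bool:
    """Hypothetical rule: accept when the target class out-votes class 0 by a margin."""
    return votes[1] - votes[0] > threshold

# Toy score maps: every class-0 map is all zeros, every class-1 map is all ones.
maps = [[[float(k % 2)] * 6 for _ in range(10)] for k in range(5 * 3 * 2)]
votes = ps_roi_pool(maps, roi=(0, 0, 6, 10), rows=5, cols=3, num_classes=1)
print(votes, verify_region(votes))
```

Because each bin reads only from its own score map, a candidate region scores highly only when every sub-region responds at the expected relative position, which is what makes the pooled vote usable as a check on the structure.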

Step S106: if the verification passes, determine the position information of the positioning object within the structured detection region. In an embodiment of the present invention, once the structured candidate region generated by the RPN has been verified as a correct structured detection region, the position of each sub-region within the structured detection region is fixed, which means the position of the positioning object within the structured detection region is fixed as well. The position information of the positioning object can therefore be determined directly, which completes the localization of the positioning object.
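Because the sub-regions sit at fixed positions inside a verified structured region, the fastener boxes can be read off by splitting the region box along its rows and columns. In the sketch below the column widths follow the approximate pixel sizes quoted for Fig. 8 (fastener about 80, rail about 60), while the row heights and the region origin are hypothetical placeholders, as are all names.

```python
from typing import Dict, List, Tuple

Box = Tuple[int, int, int, int]  # (x, y, w, h) in image pixels

def split_region(region: Box, col_widths: List[int],
                 row_heights: List[int]) -> Dict[Tuple[int, int], Box]:
    """Map each 1-based cell index (row, col) to its box inside the region."""
    x0, y0, _, _ = region
    cells = {}
    y = y0
    for r, h in enumerate(row_heights, start=1):
        x = x0
        for c, w in enumerate(col_widths, start=1):
            cells[(r, c)] = (x, y, w, h)
            x += w
        y += h
    return cells

COLS = [80, 60, 80]                 # fastener, rail, fastener (approximate widths from Fig. 8)
ROWS = [100, 195, 100, 195, 100]    # hypothetical heights: fastener and track-bed rows alternate
region = (20, 10, sum(COLS), sum(ROWS))   # hypothetical verified region box
cells = split_region(region, COLS, ROWS)
fasteners = [cells[(r, c)] for r in (1, 3, 5) for c in (1, 3)]
print(fasteners)
```

The same lookup with cells in the middle column or in the even rows yields the rail and track-bed boxes.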

In an alternative embodiment of the present invention, once the structured candidate region generated by the RPN has been verified as a correct structured detection region, the position information of every sub-region in the structured detection region can be determined directly, which completes the localization of every sub-region. In the embodiment shown in Fig. 10, the track-bed regions, fastener regions, and rail regions in the structured detection region can all be localized directly. Although the embodiments of the present invention focus on localizing fastener regions, the method of the invention can equally localize the rail region, and localizing the rail in this way can in turn support the detection of rail cracks.

As can be seen from the above description, the embodiments of the present invention convert the detection of multiple small fastener targets in one image into the detection of a single large target region with a fixed structure, which speeds up network convergence and reduces the number of candidate regions generated, thereby improving detection speed. The embodiments target large detection regions of similar structure: the detection targets are similar in shape and vary only to some degree in spatial position. To prevent overfitting during training, the present invention uses the ResNet-18 network structure for the convolutional layers. The present invention constructs position-sensitive score maps corresponding to each facility component of the track scene and combines R-FCN, currently the fastest deep network, with the structural information of the track scene to propose the structured-region fully convolutional network (SR-FCN). This effectively removes the speed bottleneck in high-speed real-time detection tasks and improves both the accuracy and the interference resistance of object detection.

Fig. 2 is the method flow diagram of training identification network of the embodiment of the present invention, to discuss the present invention from " off-line training " side. As shown in Fig. 2, the method for the training identification network of the embodiment of the present invention includes step S201 to step S203.

Step S201: obtain annotated image samples for training. Each annotated image sample contains an annotated structured detection region, which includes at least one annotated localization object and reference object.

The basic functional principle of deep learning can be summarized as: g=Af

Here, f denotes the input image, g denotes the detection result, and A is a transformation matrix that characterizes the correspondence between input and output.

Network training, in essence, uses a large amount of input data f and the pre-annotated output results (the samples g) to estimate the transformation matrix A between them by iterative approximation. When the iteration-count condition is met or the network error falls below the preset threshold, a transformation matrix Â that approximates A is considered to have been obtained; Â is also called the trained model parameters.

After the network model parameters Â are obtained, the target detection process can be written as: g₀ = Âf₀

That is, for an image f₀ to be detected, applying the trained model parameters Â as a transformation yields the target detection result g₀.
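The iterative approximation described above can be illustrated with a toy scalar case. This is a minimal sketch, not the training procedure of the patent: the function name `train_linear`, the learning rate, the sample values, and the use of plain gradient descent on a mean squared error are all assumptions chosen for illustration.

```python
# Toy illustration: estimate a scalar A in g = A * f by iterative approximation.
# All names and values here are hypothetical.

def train_linear(samples, lr=0.01, max_iters=1000, err_threshold=1e-6):
    """Fit a_hat so that g is close to a_hat * f over (f, g) pairs."""
    a_hat = 0.0
    it = 0
    for it in range(1, max_iters + 1):
        # Mean squared error between predictions and annotated outputs.
        err = sum((a_hat * f - g) ** 2 for f, g in samples) / len(samples)
        if err < err_threshold:
            break  # stop: error fell below the preset threshold
        # Gradient of the mean squared error with respect to a_hat.
        grad = sum(2 * (a_hat * f - g) * f for f, g in samples) / len(samples)
        a_hat -= lr * grad
    return a_hat, it  # the loop also stops when max_iters is reached


samples = [(1.0, 3.0), (2.0, 6.0), (3.0, 9.0)]  # generated with A = 3
a_hat, iters = train_linear(samples)
g0 = a_hat * 4.0  # "detection" on a new input f0 = 4
```

The two stopping conditions mirror the text: a cap on the number of iterations and an error threshold.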

For network training, the construction of training samples is crucial. In the embodiments of the present invention, detecting rail fastener positions with a deep learning network first requires extensive manual annotation of the detection targets on track images to generate annotated image samples for training. Fig. 9 is a schematic diagram of small-target fastener annotation in the prior art, in which the six fastener detection targets in a track image are annotated directly as training samples and a deep learning network is used for model training. In real inspection environments, the appearance of a fastener is affected by many uncertain factors such as ballast coverage, lighting changes, and abnormal fastener states, which places high demands on the diversity of fastener samples. In addition, annotated fasteners are typical small-target samples: more candidate regions are generated, detection takes longer, and the repeated pooling operations in the convolutional layers make the network insensitive to small targets.

Figure 10 is a schematic diagram of large-target structured region annotation according to an embodiment of the present invention. As shown in Figure 10, the embodiment combines multiple sub-regions such as fasteners, track bed, and rail into one large structured detection region. This improvement first converts multiple small-target detection tasks into a single large-target detection, which increases detection speed. Second, detecting the large target can make full use of the relatively fixed position and shape constraints among the sub-regions, which effectively improves the interference resistance of detection. As described for step S103 above, the present invention can divide the large structured region into 5 × 3 sub-regions and initialize the size and spacing of each sub-region according to the structural priors of the track scene.

For network training, the construction of positive and negative samples is equally crucial. Figure 11 is a schematic diagram of ballastless-track positive samples according to an optional embodiment of the present invention, and Figure 12 is a schematic diagram of ballastless-track negative samples according to an optional embodiment of the present invention. As shown in Figures 11 and 12, positive and negative samples differ considerably: a positive sample shows every sub-region of the large structured region in full, whereas a negative sample may show only some of the sub-regions. In an optional embodiment of the present invention, a standard positive sample can be defined first, and any sample whose region overlap with the standard positive sample is below 50% is defined as a negative sample.

In an embodiment of the present invention, to improve annotation efficiency, an automatic sample annotation method based on template matching is used. The procedure is as follows: (1) First, positive and negative sample templates are generated by manual annotation and added to the corresponding positive and negative template libraries. (2) Then, for each existing frame of track image, sub-windows are extracted with a sliding window, and the similarity between the HoG feature of each sub-window and each template in the positive and negative template libraries is computed. The K most similar templates are selected in descending order of similarity, and a K-NN (K-Nearest Neighbor) classifier votes on the class of the sub-window: a sub-window that scores higher in the positive template library is automatically annotated as a positive sample, and one that scores higher in the negative template library is automatically annotated as a negative sample. (3) The automatic annotation results are reviewed manually and erroneous samples are removed, which completes sample cleaning.
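Step (2) of the procedure above can be sketched as follows. This is only an illustration under stated assumptions: feature vectors are taken as already extracted (HoG computation is not shown), cosine similarity stands in for the unspecified similarity measure, and the names `cosine` and `knn_label` as well as the majority-vote rule are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_label(window_feat, templates, k=3):
    """Label a sub-window by voting among its k most similar templates.

    templates: list of (feature_vector, label), label being 'pos' or 'neg'.
    """
    ranked = sorted(templates,
                    key=lambda t: cosine(window_feat, t[0]),
                    reverse=True)
    top = ranked[:k]
    pos_votes = sum(1 for _, label in top if label == "pos")
    # Majority vote (an assumed rule): positive only if more than half agree.
    return "pos" if pos_votes * 2 > len(top) else "neg"


# Hypothetical 3-dimensional "features" standing in for HoG descriptors.
templates = [
    ([1.0, 0.0, 0.0], "pos"),
    ([0.9, 0.1, 0.0], "pos"),
    ([0.8, 0.2, 0.1], "pos"),
    ([0.0, 1.0, 0.0], "neg"),
    ([0.0, 0.9, 0.1], "neg"),
]
label_a = knn_label([1.0, 0.05, 0.0], templates)
label_b = knn_label([0.1, 1.0, 0.0], templates)
```

The labels produced this way would still go through the manual review of step (3).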

In an embodiment of the present invention, to increase sample diversity, a standard target template can be defined. A region whose overlap with the standard template region (that is, the pixel intersection over union, IoU) is greater than 80% can be used as a positive sample, and a region whose overlap is below 50% is used as a negative sample.
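The overlap rule above can be written down directly. The sketch below assumes axis-aligned boxes given as (x1, y1, x2, y2); the function names and the "ignored" outcome for overlaps between 50% and 80% are assumptions, since the text does not say how such regions are treated.

```python
def iou(a, b):
    """Intersection over union of two boxes (x1, y1, x2, y2), x2 > x1, y2 > y1."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

def label_region(box, standard, pos_thr=0.8, neg_thr=0.5):
    """Apply the 80% / 50% overlap thresholds from the text."""
    overlap = iou(box, standard)
    if overlap > pos_thr:
        return "positive"
    if overlap < neg_thr:
        return "negative"
    return "ignored"  # assumption: neither threshold is met


standard = (0, 0, 100, 200)  # hypothetical standard template box
labels = [label_region(b, standard) for b in [
    (5, 0, 105, 200),   # shifted slightly
    (20, 0, 120, 200),  # shifted moderately
    (50, 0, 150, 200),  # shifted by half the width
]]
```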

Step S202: train the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters with the annotated image samples. In the embodiments of the present invention, the recognition network is trained on the annotated image samples in the positive and negative sample libraries to obtain its trained model parameters. The training and parameter adjustment processes are similar to those of R-FCN, and the model parameters to be trained consist of three parts: the feature convolution kernel parameters for generating the feature map, the RPN convolution kernel parameters for generating candidate regions, and the score-map generation convolution kernel parameters for generating the position-sensitive score maps. These three parts together constitute the trained model parameters. Training, in essence, solves for the model parameters of each part of the network from the labeled sample data until the trained model parameters approximate A.

Step S203: generate the trained model parameters of the recognition network from the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters. In the embodiments of the present invention, a model optimization strategy such as stochastic gradient descent (SGD) can be chosen during training to iteratively refine these three sets of parameters until the number of iterations reaches a preset number or the training error of the network falls below a preset threshold.

In an embodiment of the present invention, the threshold on the network training error can be determined by a structure-regularized energy loss function. The structure-regularized energy loss function is:

Here, the function comprises a target classification loss term, a position regression loss term L_reg(t, t^*), and a structure-preserving loss term L_sr(h, h^*); λ₁ and λ₂ are weighting coefficients; t^* and h^* are, together with the class label c^*, the sample labels used for training; c^* > 0 indicates that the detected target is not background; and s and t_{x,y,w,h} are among the training input data. The value of this function is the energy loss, and the threshold on the network training error can be determined from it. Based on the positional distribution of the detection targets (for example, fasteners in a track image are laid out in a "田"-shaped grid), the present invention applies spatial-distribution regularization to the loss function of the deep network, which further secures the accuracy and fault tolerance of fastener detection.
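For orientation only, the terms listed above are consistent with the R-FCN loss extended by one structure term. The following is a sketch under that assumption: the name L_cls, the indicator bracket [c^* > 0], and the exact argument lists are assumptions and are not taken from the text.

```latex
L(s, t, h) = L_{cls}(s_{c^*})
           + \lambda_1 \, [c^* > 0] \, L_{reg}(t, t^*)
           + \lambda_2 \, [c^* > 0] \, L_{sr}(h, h^*)
```

In this reading, [c^* > 0] equals 1 when the labeled class is not background and 0 otherwise, so the regression and structure terms would contribute only for non-background samples.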

Fig. 3 is a flow chart of the method for verifying the structured detection region according to an embodiment of the present invention. As shown in Fig. 3, the method comprises steps S301 and S302.

Step S301: perform a pooling operation on the position-sensitive score maps;

Step S302: perform region voting on the pooling results to verify the structured detection region to be verified.

In the embodiments of the present invention, after the position-sensitive score maps are obtained, position-sensitive RoI pooling (Position-Sensitive RoI Pooling) can be performed on the first and second position-sensitive score maps respectively, following the prior-art RoI pooling method and voting classification rules. The structured candidate region generated by the RPN is then verified by region voting and local regression, that is, it is checked whether the candidate region is a correct structured detection region.
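The pooling and voting steps can be sketched for a single class on a 5 × 3 grid. This is a simplified illustration, not the patented implementation: it assumes equal-sized bins (the patent uses sub-regions of different sizes), average pooling, a plain mean as the vote, and a hypothetical acceptance threshold; all function names are made up for the example.

```python
def ceil_div(a, b):
    """Integer ceiling division for non-negative a and positive b."""
    return -(-a // b)

def ps_roi_pool(score_maps, roi, rows=5, cols=3):
    """Position-sensitive average pooling for one class.

    score_maps[k][y][x]: one score map per grid cell, k = i * cols + j.
    roi: (x1, y1, x2, y2) in feature-map pixels, upper bounds exclusive.
    """
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    pooled = []
    for i in range(rows):
        ys, ye = y1 + (i * h) // rows, y1 + ceil_div((i + 1) * h, rows)
        out_row = []
        for j in range(cols):
            xs, xe = x1 + (j * w) // cols, x1 + ceil_div((j + 1) * w, cols)
            fmap = score_maps[i * cols + j]  # cell (i, j) reads only its own map
            vals = [fmap[y][x] for y in range(ys, ye) for x in range(xs, xe)]
            out_row.append(sum(vals) / len(vals))
        pooled.append(out_row)
    return pooled

def region_vote(pooled):
    """Average the per-cell scores into one score for the whole region."""
    cells = [v for row in pooled for v in row]
    return sum(cells) / len(cells)


H, W = 10, 6  # hypothetical feature-map size
maps = [[[k / 14.0] * W for _ in range(H)] for k in range(15)]
pooled = ps_roi_pool(maps, (0, 0, W, H))
score = region_vote(pooled)
verified = score > 0.4  # hypothetical acceptance threshold
```

Because each grid cell reads only from its own score map, a candidate region scores highly only when every sub-region appears in its expected position.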

In an embodiment of the present invention, for step S103 above, in which the structured detection region to be verified is determined from the feature map according to the RPN convolution kernel parameters of the recognition network, the present invention improves the anchor generation rule of the existing RPN. Figure 14 is a schematic diagram of the original RPN anchor generation rule: as shown in Figure 14, the original rule generates 9 candidate regions of different scales and aspect ratios at each pixel of the feature map. Figure 15 is a schematic diagram of the anchor generation rule of the embodiment of the present invention. Because the scale and aspect ratio of the structured detection region are relatively fixed, the improved rule generates at each pixel only three candidate regions with a fixed aspect ratio and slightly varied scale, which greatly reduces the number of candidate regions and further increases detection speed. In other words, for the fixed structure of the track scene, the present invention improves the region proposal network (RPN) by constraining the traversal range of the anchors and limiting the scale variation of the candidate windows.
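The difference between the two anchor rules can be sketched as follows. The scales and ratios in `rpn_anchors` follow the common 3 × 3 RPN setup; the base size and zoom factors in `structured_anchors` are hypothetical values, since the text gives no concrete numbers.

```python
def rpn_anchors(base=16, scales=(8, 16, 32), ratios=(0.5, 1.0, 2.0)):
    """Classic RPN-style anchors: every scale paired with every aspect ratio."""
    anchors = []
    for s in scales:
        area = (base * s) ** 2
        for r in ratios:  # r = height / width
            w = (area / r) ** 0.5
            anchors.append((w, w * r))
    return anchors

def structured_anchors(base_w, base_h, zooms=(0.9, 1.0, 1.1)):
    """Fixed height/width ratio; only the overall size is slightly scaled."""
    return [(base_w * z, base_h * z) for z in zooms]


classic = rpn_anchors()
structured = structured_anchors(300.0, 500.0)  # hypothetical region size in pixels
per_pixel_reduction = len(classic) / len(structured)
```

With these settings each feature-map pixel yields 3 candidates instead of 9, so the number of candidate regions to score drops by a factor of three before any further constraint on the anchor traversal range.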

The present invention also verifies, by experiment, the effectiveness of the proposed structured-region fully convolutional network (SR-FCN).

Experimental situation and system configuration are as follows:

(1) Hardware configuration: Intel Xeon @ 2.40 GHz × 28 + NVIDIA GeForce Titan X × 4 + 256 GB memory

(2) Operating system: Ubuntu 16.04 LTS

(3) deep learning frame: CUDA 8.0+Anaconda Python 2.7+Caffe

The experimental data come from track inspection images acquired by inspection vehicles on ballasted lines and high-speed lines, and are used for training and testing the deep learning network and for verifying the detection performance of SR-FCN. The composition of the sample images is shown in the table below.

When training the recognition network, because ReLU is the main activation function in the network, the network is initialized with the Kaiming initialization method. Stochastic gradient descent (SGD) is used for optimization, with the initial learning rate set to 0.01, the momentum set to 0.9, and the weight decay set to 0.0005. Whenever the objective value on the image validation set does not decrease relative to the previous iteration, the learning rate is divided by 10 (lr /= 10). The batch size is set to 256 and the number of training epochs to 100.
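The settings above can be collected into a small configuration, with the learning-rate rule expressed as a function. This is a framework-independent sketch: the dictionary keys and function names are hypothetical, and `kaiming_std` shows only the standard deviation used by the normal variant of Kaiming initialization for ReLU layers.

```python
import math

# Hyperparameters as stated in the text; the key names are hypothetical.
CONFIG = {
    "base_lr": 0.01,
    "momentum": 0.9,
    "weight_decay": 0.0005,
    "batch_size": 256,
    "epochs": 100,
}

def kaiming_std(fan_in):
    """Std of the zero-mean normal used by Kaiming initialization with ReLU."""
    return math.sqrt(2.0 / fan_in)

def lr_schedule(val_losses, base_lr=CONFIG["base_lr"]):
    """Learning rate after each validation result.

    The rate is divided by 10 whenever the validation objective does not
    decrease relative to the previous evaluation.
    """
    lr, lrs, prev = base_lr, [], None
    for loss in val_losses:
        if prev is not None and loss >= prev:
            lr /= 10.0
        lrs.append(lr)
        prev = loss
    return lrs


# Hypothetical validation losses: the third and fifth do not improve.
rates = lr_schedule([1.0, 0.8, 0.85, 0.7, 0.7])
std_3x3x64 = kaiming_std(3 * 3 * 64)  # a 3x3 kernel over 64 input channels
```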

The experiments show that fastener localization based on the proposed SR-FCN performs well: for different fastener types in different track scenes, the confidence scores of the fastener regions are all close to 1, so detection accuracy is high and the method adapts well to different scenes.

In the experiments, the 6000 images in the table above were also tested with different deep learning networks, and the detection success rate and detection speed of each method were compared statistically, as shown in the table below.

As the experimental data in the table above show, the proposed SR-FCN network not only adapts to target detection and localization in a variety of scenes but also maintains a very high detection speed, which is sufficient for real-time detection at 350 km/h.

It should be noted that the steps shown in the flow charts of the drawings can be executed in a computer system, for example as a set of computer-executable instructions, and that although a logical order is shown in the flow charts, in some cases the steps shown or described can be executed in a different order.

Based on the same inventive concept, an embodiment of the present invention also provides a target localization device for a structured scene, which can be used to implement the object localization method in a structured scene described in the above embodiments, as described in the following embodiments. Because the device solves the problem on a principle similar to that of the method, the embodiments of the device may refer to the embodiments of the method, and repeated details are omitted. As used below, the term "unit" or "module" may be a combination of software and/or hardware that implements a predetermined function. Although the device described in the following embodiments is preferably implemented in software, implementation in hardware, or in a combination of software and hardware, is also possible and contemplated.

Fig. 4 is a first structural block diagram of the target localization device for a structured scene according to an embodiment of the present invention. As shown in Fig. 4, the device comprises: an image acquisition unit 1, a feature map generation unit 2, a structured detection region determination unit 3, a position-sensitive score map generation unit 4, a structured detection region verification unit 5, and an object location determination unit 6.

The image acquisition unit 1 is configured to obtain an image to be detected;

The feature map generation unit 2 is configured to generate the feature map corresponding to the image to be detected according to the feature convolution kernel parameters in the pre-trained recognition network;

The structured detection region determination unit 3 is configured to determine a structured detection region to be verified from the feature map according to the RPN convolution kernel parameters in the recognition network, wherein the structured detection region includes at least one localization object and reference object;

The position-sensitive score map generation unit 4 is configured to generate position-sensitive score maps of the localization object and the reference object in the structured detection region according to the score-map generation convolution kernel parameters in the recognition network;

The structured detection region verification unit 5 is configured to verify the structured detection region to be verified according to the position-sensitive score maps;

The object location determination unit 6 is configured to determine the location information of the localization object in the structured detection region.
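How the six units chain together can be sketched as a small pipeline object. Everything below is illustrative: the class name, field names, and stub functions are hypothetical, and returning `None` when verification fails is an assumption based on the verify-then-locate order of the method.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional

@dataclass
class LocalizationDevice:
    """Six units wired in the order given in the description."""
    acquire_image: Callable[[], Any]             # unit 1
    make_feature_map: Callable[[Any], Any]       # unit 2
    propose_region: Callable[[Any], Any]         # unit 3
    make_score_maps: Callable[[Any, Any], Any]   # unit 4
    verify_region: Callable[[Any, Any], bool]    # unit 5
    locate_object: Callable[[Any], Any]          # unit 6

    def run(self) -> Optional[Any]:
        image = self.acquire_image()
        fmap = self.make_feature_map(image)
        region = self.propose_region(fmap)
        scores = self.make_score_maps(fmap, region)
        if not self.verify_region(region, scores):
            return None  # assumption: no location is reported if verification fails
        return self.locate_object(region)


# Stub units, used only to show the data flow.
device = LocalizationDevice(
    acquire_image=lambda: "image",
    make_feature_map=lambda image: "feature_map",
    propose_region=lambda fmap: (0, 0, 6, 10),
    make_score_maps=lambda fmap, region: "score_maps",
    verify_region=lambda region, scores: True,
    locate_object=lambda region: {"fastener_regions_in": region},
)
result = device.run()
```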

In an embodiment of the present invention, the localization object includes fastener regions. The reference object includes at least one of rail regions and track bed regions.

In an embodiment of the present invention, in a scene where fasteners on a rail are detected and localized, the small fastener targets can be converted into the large structured target shown in Figure 10. This large structured target contains 5 × 3 sub-regions, namely the blocks R₁₁, R₁₂, R₁₃, R₂₁, R₂₂, R₂₃, R₃₁, R₃₂, R₃₃, R₄₁, R₄₂, R₄₃, R₅₁, R₅₂, R₅₃, where R₁₁, R₁₃, R₃₁, R₃₃, R₅₁, R₅₃ are the fastener regions to be localized, R₂₁, R₂₃, R₄₁, R₄₃ are the reference track bed regions, and R₁₂, R₂₂, R₃₂, R₄₂, R₅₂ are the reference rail regions. The fastener regions, rail regions, and track bed regions are arranged in the form of the following matrix:

R₁₁ R₁₂ R₁₃
R₂₁ R₂₂ R₂₃
R₃₁ R₃₂ R₃₃
R₄₁ R₄₂ R₄₃
R₅₁ R₅₂ R₅₃
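The assignment of sub-region types follows a simple pattern in the row and column indices, sketched below. The function name `cell_type` and the string labels are hypothetical; the pattern itself is read off the index lists in the text.

```python
ROWS, COLS = 5, 3

def cell_type(i, j):
    """Type of sub-region R_ij, with 1-based row i and column j."""
    if j == 2:
        return "rail"  # the middle column is the rail
    # Side columns alternate: fasteners on odd rows, track bed on even rows.
    return "fastener" if i % 2 == 1 else "track_bed"

layout = [[cell_type(i, j) for j in range(1, COLS + 1)]
          for i in range(1, ROWS + 1)]

counts = {}
for row in layout:
    for kind in row:
        counts[kind] = counts.get(kind, 0) + 1
```

The resulting counts are 6 fastener regions, 5 rail regions, and 4 track bed regions, 15 sub-regions in total.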

In the embodiments of the present invention, the position-sensitive score maps comprise a first position-sensitive score map for classification and a second position-sensitive score map for regressing and adjusting the sub-window positions. Unlike R-FCN, which divides the target into equal-sized sub-regions such as 3 × 3, the structured detection region of the present invention is divided into multiple structured sub-regions of different sizes. In the embodiment shown in Figure 10, where three pairs of fasteners and the corresponding reference rail and track bed, 15 sub-regions in total, form one structured region, the first position-sensitive score map for classification has 5 × 3 × (C+1) dimensions, where C is the number of detection target classes, that is, the number of classes into which the large structured region is labeled during training. In an embodiment of the present invention C can equal 1, that is, there is one detection target class. The second position-sensitive score map, for regressing and adjusting the sub-window positions, is a 5 × 3 × 4 dimensional position-sensitive score map: the position of each sub-region can be written as a four-tuple (x, y, w, h), so the 15 sub-regions have 5 × 3 × 4 position score maps in total.
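The channel counts implied by these dimensions can be checked with a few lines. The function name is hypothetical, and reading the "+1" as a background class follows the R-FCN convention, which the text does not spell out.

```python
def score_map_channels(rows=5, cols=3, num_classes=1):
    """Channel counts of the classification and regression score maps."""
    cells = rows * cols
    cls_channels = cells * (num_classes + 1)  # +1: background class (R-FCN convention)
    reg_channels = cells * 4                  # (x, y, w, h) per sub-region
    return cls_channels, reg_channels


cls_channels, reg_channels = score_map_channels()  # C = 1 as in the embodiment
```

For C = 1 this gives 30 classification channels and 60 regression channels.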

In an embodiment of the present invention, where the structured detection region determination unit 3 determines the structured detection region to be verified from the feature map according to the RPN convolution kernel parameters of the recognition network, the present invention improves the anchor generation rule of the existing RPN. Figure 14 is a schematic diagram of the original RPN anchor generation rule: as shown in Figure 14, the original rule generates 9 candidate regions of different scales and aspect ratios at each pixel of the feature map. Figure 15 is a schematic diagram of the anchor generation rule of the embodiment of the present invention. Because the scale and aspect ratio of the structured detection region are relatively fixed, the improved rule generates at each pixel only three candidate regions with a fixed aspect ratio and slightly varied scale, which greatly reduces the number of candidate regions and further increases detection speed. In other words, for the fixed structure of the track scene, the present invention improves the region proposal network (RPN) by constraining the traversal range of the anchors and limiting the scale variation of the candidate windows.

Fig. 5 is a second structural block diagram of the target localization device for a structured scene according to an embodiment of the present invention. As shown in Fig. 5, the device further comprises: an annotated image sample acquisition unit 7, a convolution kernel parameter training unit 8, and a trained model parameter generation unit 9.

The annotated image sample acquisition unit 7 is configured to obtain annotated image samples for training, each of which contains an annotated structured detection region that includes at least one annotated localization object and reference object;

The convolution kernel parameter training unit 8 is configured to train the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters with the annotated image samples;

The trained model parameter generation unit 9 is configured to generate the trained model parameters of the recognition network from the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters.

In the embodiments of the present invention, the trained model parameter generation unit 9 can also be configured to iteratively refine the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters according to a chosen model optimization strategy such as stochastic gradient descent (SGD), until the number of iterations reaches a preset number or the training error of the network falls below a preset threshold.

Fig. 6 is a structural diagram of the structured detection region verification unit according to an embodiment of the present invention. As shown in Fig. 6, the structured detection region verification unit 5 comprises a pooling module 501 and a region voting module 502.

The pooling module 501 is configured to perform a pooling operation on the position-sensitive score maps;

The region voting module 502 is configured to perform region voting on the pooling results to verify the structured detection region to be verified.

To achieve the above object, according to another aspect of the present application, a computer device is also provided, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor; when executing the computer program, the processor implements the steps of the above object localization method in a structured scene.

The processor can be a central processing unit (CPU). The processor can also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or a combination of the above types of chip.

As a non-transitory computer-readable storage medium, the memory can store non-transitory software programs, non-transitory computer-executable programs, and units, such as the program units corresponding to the above method embodiments of the present invention. By running the non-transitory software programs, instructions, and modules stored in the memory, the processor executes its various functional applications and data processing, that is, implements the methods of the above method embodiments.

The memory may include a program storage area and a data storage area, where the program storage area can store the operating system and the application programs required by at least one function, and the data storage area can store data created by the processor. In addition, the memory may include high-speed random access memory and may also include non-transitory memory, for example at least one magnetic disk storage device, flash memory device, or other non-transitory solid-state storage device. In some embodiments, the memory optionally includes memory located remotely from the processor, which can be connected to the processor over a network. Examples of such networks include, but are not limited to, the Internet, intranets, local area networks, mobile communication networks, and combinations thereof.

The one or more units are stored in the memory and, when executed by the processor, perform the methods of the above embodiments.

The specific details of the above computer device can be understood by referring to the corresponding descriptions and effects in the above embodiments, and are not repeated here.

To achieve the above object, according to another aspect of the present application, a computer-readable storage medium is also provided. The computer-readable storage medium stores a computer program which, when executed by a computer processor, implements the steps of the above object localization method in a structured scene. Those skilled in the art will understand that all or part of the processes in the above method embodiments can be completed by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the above method embodiments. The storage medium can be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), a flash memory, a hard disk drive (HDD), a solid-state drive (SSD), or the like; the storage medium can also include a combination of the above types of memory. Obviously, those skilled in the art should understand that the modules or steps of the present invention described above can be implemented with a general-purpose computing device. They can be concentrated on a single computing device or distributed over a network of multiple computing devices. Optionally, they can be implemented as program code executable by a computing device, so that they can be stored in a storage device and executed by the computing device; alternatively, they can each be made into a separate integrated circuit module, or several of them can be made into a single integrated circuit module. The present invention is therefore not limited to any specific combination of hardware and software.

The foregoing is only a preferred embodiment of the present invention and is not intended to limit it; for those skilled in the art, the present invention may be modified and varied in many ways. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within its scope of protection.

Claims

1. An object localization method in a structured scene, characterized by comprising:

obtaining an image to be detected;

generating a feature map corresponding to the image to be detected according to feature convolution kernel parameters in a pre-trained recognition network;

determining a structured detection region to be verified from the feature map according to RPN convolution kernel parameters in the recognition network, wherein the structured detection region includes at least one localization object and reference object;

generating position-sensitive score maps of the localization object and the reference object in the structured detection region according to score-map generation convolution kernel parameters in the recognition network;

2. The object localization method in a structured scene according to claim 1, characterized by further comprising:

obtaining annotated image samples for training, each of which contains an annotated structured detection region that includes at least one annotated localization object and reference object;

training the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters with the annotated image samples;

generating the trained model parameters of the recognition network from the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters.

3. The object localization method in a structured scene according to claim 1, characterized in that the localization object includes fastener regions;

the reference object includes at least one of rail regions and track bed regions.

4. The object localization method in a structured scene according to claim 3, characterized in that there are 6 fastener regions, 5 rail regions, and 4 track bed regions, and the fastener regions, rail regions, and track bed regions are arranged in the form of the following matrix:

R₁₁ R₁₂ R₁₃
R₂₁ R₂₂ R₂₃
R₃₁ R₃₂ R₃₃
R₄₁ R₄₂ R₄₃
R₅₁ R₅₂ R₅₃

wherein R₁₁, R₁₃, R₃₁, R₃₃, R₅₁, R₅₃ are fastener regions, R₂₁, R₂₃, R₄₁, R₄₃ are track bed regions, and R₁₂, R₂₂, R₃₂, R₄₂, R₅₂ are rail regions.

5. The object localization method in a structured scene according to claim 4, characterized in that the position-sensitive score maps comprise: a first position-sensitive score map for classification and a second position-sensitive score map for regressing and adjusting sub-window positions, wherein the first position-sensitive score map has 5 × 3 × (C+1) dimensions, C being the number of detection target classes, and the second position-sensitive score map has 5 × 3 × 4 dimensions.

6. The object localization method in a structured scene according to claim 1, characterized in that verifying the structured detection region to be verified according to the position-sensitive score maps comprises:

performing a pooling operation on the position-sensitive score maps;

performing region voting on the pooling results to verify the structured detection region to be verified.

7. The object localization method in a structured scene according to claim 2, characterized in that generating the trained model parameters of the recognition network from the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters comprises:

iteratively refining the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters according to a preset model adjustment strategy, until the network training error of the generated recognition network is less than a preset threshold.

8. The object localization method in a structured scene according to claim 7, characterized in that the threshold on the network training error is determined by a structure-regularized energy loss function;

The structure-regularized energy loss function is:

wherein the function comprises a target classification loss term, a position regression loss term L_reg(t, t^*), and a structure-preserving loss term L_sr(h, h^*); λ₁ and λ₂ are weighting coefficients; t^* and h^* are, together with the class label c^*, the sample labels used for training; c^* > 0 indicates that the detected target is not background; and s and t_{x,y,w,h} are among the training input data.

9. The object localization method under a structured scene according to claim 1, characterized in that the convolutional network corresponding to the feature convolution kernel parameters uses a VGG16 or ResNet-18 network structure.

10. The object localization method under a structured scene according to claim 1, characterized in that the anchor generation rule of the RPN convolutional network corresponding to the RPN convolution kernel parameters is that each pixel generates only a plurality of candidate regions having a fixed height-to-width ratio.
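The anchor rule of claim 10 can be illustrated as below, reading the claim as: every anchor shares one fixed height-to-width ratio and only its scale varies. The function name `generate_anchors`, the stride of 16, the three scales, and the 5:3 ratio are all hypothetical values chosen for the sketch; the claim does not state them.

```python
import math


def generate_anchors(feat_h, feat_w, stride=16, scales=(128, 256, 512),
                     aspect_ratio=5 / 3):
    """Generate anchors (x0, y0, x1, y1) in image pixels.

    aspect_ratio is height / width and is the same for every anchor;
    each feature-map pixel yields one anchor per scale.
    """
    anchors = []
    root = math.sqrt(aspect_ratio)
    for y in range(feat_h):
        for x in range(feat_w):
            cx = (x + 0.5) * stride  # anchor centre in image coordinates
            cy = (y + 0.5) * stride
            for s in scales:
                w = s / root  # w * h == s * s, h / w == aspect_ratio
                h = s * root
                anchors.append((cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2))
    return anchors


anchors = generate_anchors(feat_h=2, feat_w=3)
```

Compared with an anchor set that also varies the aspect ratio, fixing the ratio reduces the number of candidates per pixel, which suits a scene whose target layout has a known shape.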

11. An object localization device under a structured scene, characterized by comprising:

An image acquisition unit, configured to obtain an image to be detected;

A feature map generation unit, configured to generate a feature map corresponding to the image to be detected according to feature convolution kernel parameters in a pre-trained recognition network;

A structured detection region determination unit, configured to determine a structured detection region to be verified from the feature map according to RPN convolution kernel parameters in the recognition network, wherein the structured detection region comprises: at least one positioning target and a reference object;

A position-sensitive score map generation unit, configured to generate, according to score-map generation convolution kernel parameters in the recognition network, a position-sensitive score map of the positioning target and the reference object in the structured detection region;

A structured detection region verification unit, configured to verify the structured detection region to be verified according to the position-sensitive score map;

A target location information determination unit, configured to determine location information of the positioning target in the structured detection region.

12. The object localization device under a structured scene according to claim 11, characterized by further comprising:

An annotated image sample acquisition unit, configured to obtain annotated image samples for training, wherein each annotated image sample comprises an annotated structured detection region, and the annotated structured detection region comprises: at least one annotated positioning target and a reference object;

A convolution kernel parameter training unit, configured to train the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters using the annotated image samples;

A trained model parameter generation unit, configured to generate the trained model parameters of the recognition network from the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters.

13. The object localization device under a structured scene according to claim 11, characterized in that the positioning target comprises: fastener regions;

14. The object localization device under a structured scene according to claim 13, characterized in that the number of fastener regions is 6, the number of rail regions is 5, and the number of track bed regions is 4, and the fastener regions, rail regions, and track bed regions are arranged in the form of the following matrix:

[matrix not reproduced]

15. The object localization device under a structured scene according to claim 14, characterized in that the position-sensitive score map comprises: a first position-sensitive score map for classification and a second position-sensitive score map for regressing adjustments to sub-window positions, wherein the first position-sensitive score map has 5 × 3 × (C+1) dimensions, C being the number of classes of detection targets, and the second position-sensitive score map has 5 × 3 × 4 dimensions.

16. The object localization device under a structured scene according to claim 11, characterized in that the structured detection region verification unit comprises:

A pooling module, configured to perform a pooling operation on the position-sensitive score map;

A region voting module, configured to verify the structured detection region to be verified by performing region voting on the result of the pooling operation.

17. The object localization device under a structured scene according to claim 12, characterized in that the trained model parameter generation unit is further configured to iteratively refine the feature convolution kernel parameters, the RPN convolution kernel parameters, and the score-map generation convolution kernel parameters according to a preset model adjustment strategy until the network training error of the resulting recognition network is less than a preset threshold.

18. The object localization device under a structured scene according to claim 17, characterized in that the threshold of the network training error is determined by a structure-regularized energy loss function;

The structure-regularized energy loss function is:

[formula not reproduced]

19. The object localization device under a structured scene according to claim 11, characterized in that the convolutional network corresponding to the feature convolution kernel parameters uses a VGG16 or ResNet-18 network structure.

20. The object localization device under a structured scene according to claim 11, characterized in that the anchor generation rule of the RPN convolutional network corresponding to the RPN convolution kernel parameters is that each pixel generates only a plurality of candidate regions having a fixed height-to-width ratio.

21. A computer device comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 10.

22. A computer-readable storage medium storing a computer program, characterized in that the computer program, when executed by a computer processor, implements the steps of the method of any one of claims 1 to 10.