CN110188689A - Virtual driving target collision detection method based on real scene modeling - Google Patents

Virtual driving target collision detection method based on real scene modeling

Info

Publication number
CN110188689A
CN110188689A
Authority
CN
China
Prior art keywords
target
anchor
frame
collision
virtual driving
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN201910463552.9A
Other languages
Chinese (zh)
Other versions
CN110188689B (en)
Inventor
宋永端
沈志熙
李聃
曾海林
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Chongqing University
Original Assignee
Chongqing University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Chongqing University filed Critical Chongqing University
Priority to CN201910463552.9A priority Critical patent/CN110188689B/en
Publication of CN110188689A publication Critical patent/CN110188689A/en
Application granted granted Critical
Publication of CN110188689B publication Critical patent/CN110188689B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2413Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on distances to training or reference patterns
    • G06F18/24133Distances to prototypes
    • G06F18/24137Distances to cluster centroïds
    • G06F18/2414Smoothing the distance, e.g. radial basis function networks [RBFN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • G06T3/06
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/50Context or environment of the image
    • G06V20/56Context or environment of the image exterior to a vehicle by using sensors mounted on the vehicle
    • G06V20/58Recognition of moving objects or obstacles, e.g. vehicles or pedestrians; Recognition of traffic objects, e.g. traffic signs, traffic lights or roads
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30248Vehicle exterior or interior
    • G06T2207/30252Vehicle exterior; Vicinity of vehicle
    • G06T2207/30261Obstacle
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2210/00Indexing scheme for image generation or computer graphics
    • G06T2210/21Collision detection, intersection

Abstract

The invention discloses a virtual driving target collision detection method based on real scene modeling, comprising the steps of: 1) shooting the virtual driving scene in front of the vehicle with a virtual camera, and obtaining the coordinates of the collision target on the imaging plane by target detection; 2) obtaining the image-plane vertical height of the collision target from its plane coordinates; 3) calculating the longitudinal distance between the operated vehicle and the collision target according to the regression equation from image-plane vertical height to actual longitudinal distance; 4) obtaining the image-plane lateral safe width according to the regression equation between image-plane lateral safe width and actual distance; 5) judging, according to the detected target position, whether the target will collide with the virtual driving vehicle. The invention solves the technical problem of performing collision detection in a virtual driving scene modeled from a real scene.

Description

Virtual driving target collision detection method based on real scene modeling
Technical field
The present invention relates to the technical field of automobile virtual driving, and in particular to a virtual driving target collision detection method based on real scene modeling.
Background art
To improve the effectiveness and results of driver training, pedestrians and vehicles need to be added to the driving simulation system, and the virtual driving vehicle must avoid them during simulated driving; the precondition for such avoidance is that the driving simulation system can detect the collision targets and locate them accurately.
A traditional virtual driving scene is generated by 3D modeling software, so the coordinate positions of collision targets such as people and vehicles in the scene are known and target collision detection is straightforward. In a virtual driving system modeled from a real scene, however, the position coordinates of people and vehicles are unknown quantities, so performing target collision detection in a virtual driving system modeled from a real scene is a technical problem.
Summary of the invention
In view of this, the object of the present invention is to provide a virtual driving target collision detection method based on real scene modeling, so as to solve the technical problem of performing target collision detection in a virtual driving scene modeled from a real scene.
The virtual driving target collision detection method based on real scene modeling of the present invention comprises the following steps:
1) The virtual driving scene in front of the vehicle is shot by a virtual camera arranged at the driving position of the virtual driving vehicle, and the coordinates [x1, y1, x2, y2] of the collision target on the virtual camera imaging plane are obtained by target detection, where (x1, y1) is the top-left corner of the collision target rectangle and (x2, y2) its bottom-right corner; the virtual driving scene consists of a sphere model and a real scene video attached to the inner wall of the sphere model;
2) The image-plane vertical height h of the collision target is obtained from its plane coordinates [x1, y1, x2, y2]: h = y2;
3) The longitudinal distance D between the operated vehicle and the collision target is calculated according to the regression equation from image-plane vertical height h to actual longitudinal distance D:
D = 2.908×10⁻⁷·h³ - 3.069×10⁻⁴·h² + 0.08266·h + 0.252   (1)
4) The image-plane lateral safe width w of the virtual driving vehicle is obtained according to the regression equation between image-plane lateral safe width w and actual distance D:
w = -0.03093·D³ + 2.916·D² - 85.9·D + 832   (2);
5) Collision judgment:
First judge whether the virtual driving vehicle will collide longitudinally with the target: if the longitudinal distance D is less than the safe braking distance of the vehicle, a longitudinal collision is judged possible; if D is greater than the safe braking distance, it is judged that no longitudinal collision will occur;
Then, on the premise that the virtual driving vehicle may collide longitudinally with the target, judge from the abscissas x1 and x2 of the left and right sides of the target rectangle whether a side collision will occur: if the target rectangle lies entirely outside the central safe band, i.e. x2 < (wt - w)/2 or x1 > (wt + w)/2, it is judged that no side collision will occur; otherwise it is judged that a side collision may occur; here wt is the pixel overall width of the picture captured by the virtual camera.
Further, in step 1), the steps of detecting the collision target and obtaining its coordinates [x1, y1, x2, y2] on the virtual camera imaging plane are as follows:
a: extracting the image of the virtual driving scene in front of the vehicle shot by the virtual camera as the original image;
b: scaling the original image and passing it in as the input picture to the feature extraction network, which outputs a feature map;
c: passing the feature map to the RPN network; the RPN network takes the point on the input picture corresponding to each pixel of the feature map as an anchor point and, with the anchor point as the center, generates several anchor boxes on the input picture at each anchor point position;
d: inputting the feature map with anchor box information into a convolutional layer and feeding the convolutional layer's output into a Softmax classifier for two-class target/background classification, preliminarily judging whether each anchor box's content belongs to a target or the background and obtaining several targets; meanwhile inputting the feature map into another convolutional layer whose output is processed by the bounding-box regression model to obtain the translation and scale corrections of the anchor boxes;
e: correcting all anchor boxes according to the obtained translation and scale corrections;
f: sorting all anchor boxes from high to low by the target classification score from the Softmax computation in step d, so that the boxes of high-scoring targets come first and those of low-scoring targets come last, and extracting the top several anchor boxes;
g: removing the parts of the extracted anchor boxes that exceed the image boundary;
h: applying non-maximum suppression to the extracted anchor boxes to remove the duplicate boxes of each target;
i: sorting the remaining anchor boxes again from high to low by the target classification score from the Softmax computation and extracting the top several as candidate target outputs, candidate target = [x1, y1, x2, y2], where (x1, y1) is the top-left corner of the box and (x2, y2) its bottom-right corner;
j: dividing the feature-map region corresponding to each candidate target into a 7×7 grid and max-pooling each grid cell, outputting a 7×7 candidate target feature map;
k: computing with a fully connected layer and a Softmax classifier whether each candidate target feature map belongs specifically to pedestrian, vehicle, or background, and outputting the probability vector of each class; meanwhile using bounding-box regression again to obtain the position offset of each candidate target feature map and correcting the target box once more, obtaining the final coordinates [x1, y1, x2, y2] of the collision target.
Further, the feature extraction network in step b is an SRN network, whose expression is as follows:
yn = y0 + y1 + y2 + ... + yn-1 + fn(yn-1)   (3)
where fn(·) denotes a residual unit, i.e. BN layers, ReLU activation functions and convolutional layers arranged in a fixed order; y0 is the input of the network and yn its output;
The SRN structure comprises one initial convolutional layer C1 and three sparse residual unit sets C2, C3 and C4, each stacked from m sparse residual units; a sparse residual unit consists of several BN layers, ReLU activation functions and convolutional layers arranged in a fixed order, their number determined by the length factor l; the length factor l controls the influence of the initial feature y0 on the network output, and every sparse residual unit has the same length factor.
Further, in step c, with the anchor point as the center, 9 anchor boxes are defined at each anchor point, with box side lengths of 128, 256 and 512 and aspect ratios of 1:2, 1:1 and 2:1.
Further, the non-maximum suppression in step h comprises the following steps:
① selecting the multiple anchor boxes belonging to a certain target;
② according to the Softmax results from step d, first removing the anchor boxes whose probability of containing a target is below 0.5, then comparing the anchor box with the highest target probability against each remaining box for overlap: if the overlap exceeds a set threshold, the two boxes belong to the same target, so the higher-probability box is kept and the other removed; otherwise the two boxes belong to different targets and both are kept;
③ selecting the anchor boxes belonging to the next target and repeating steps ①–② until the anchor boxes of all targets have been screened.
Beneficial effects of the present invention:
1. In the virtual driving target collision detection method based on real scene modeling of the present invention, after the collision target is detected, the conversion between the target's pixel coordinates in the scene image and its coordinates in three-dimensional space yields the longitudinal distance D between the virtual driving vehicle and the collision target and the image-plane lateral safe width w corresponding to the actual vehicle width at distance D, thereby solving the technical problem of obtaining collision target coordinates in a virtual driving scene modeled from a real scene; D and w further enable the judgment, within the virtual driving scene, of whether a side collision or a longitudinal collision will occur between the virtual driving vehicle and the collision target, improving the effectiveness of virtual driving training based on real scene modeling.
2. The target detection of the method uses a detection approach based on a sparse residual neural network, which weakens the influence of the initial feature input on the overall network model while enriching the output features of the network, improving prediction accuracy.
Brief description of the drawings
Fig. 1 is the sphere model of the virtual driving scene;
Fig. 2 is the structure of a residual unit; in the figure, BN is the batch normalization layer, RELU the activation function, and conv a convolutional layer;
Fig. 3 is the locally connected SRN network; the length factor l = 2 in the figure indicates a sparse residual unit of length 2;
Fig. 4 is the target detection flowchart;
Fig. 5 is the overall structure of the SRN network;
Fig. 6 shows target detection results;
Fig. 7 is the spherical projection model;
Fig. 8 is the perspective projection model of the camera;
Fig. 9 is the schematic of the lateral safety range; when a collision target appears within w, a side collision can occur;
Fig. 10 is the mapping curve of distance versus pixel height;
Fig. 11 is the mapping curve of pixel width versus distance.
Specific embodiment
The invention will be further described below with reference to the accompanying drawings and an embodiment.
The virtual driving target collision detection method based on real scene modeling of this embodiment comprises the following steps:
1) The virtual driving scene in front of the vehicle is shot by a virtual camera arranged at the driving position of the virtual driving vehicle, and the coordinates [x1, y1, x2, y2] of the collision target on the virtual camera imaging plane are obtained by target detection, where (x1, y1) is the top-left corner of the collision target rectangle and (x2, y2) its bottom-right corner; the virtual driving scene consists of a sphere model and a real scene video attached to the inner wall of the sphere model;
2) The image-plane vertical height h of the collision target is obtained from its plane coordinates [x1, y1, x2, y2]: h = y2;
3) The longitudinal distance D between the operated vehicle and the collision target is calculated according to the regression equation from image-plane vertical height h to actual longitudinal distance D:
D = 2.908×10⁻⁷·h³ - 3.069×10⁻⁴·h² + 0.08266·h + 0.252   (1)
4) The image-plane lateral safe width w of the virtual driving vehicle is obtained according to the regression equation between image-plane lateral safe width w and actual distance D:
w = -0.03093·D³ + 2.916·D² - 85.9·D + 832   (2);
5) Collision judgment:
First judge whether the virtual driving vehicle will collide longitudinally with the target: if the longitudinal distance D is less than the safe braking distance of the vehicle, a longitudinal collision is judged possible; if D is greater than the safe braking distance, it is judged that no longitudinal collision will occur;
Then, on the premise that the virtual driving vehicle may collide longitudinally with the target, judge from the abscissas x1 and x2 of the left and right sides of the target rectangle whether a side collision will occur: if the target rectangle lies entirely outside the central safe band, i.e. x2 < (wt - w)/2 or x1 > (wt + w)/2, it is judged that no side collision will occur; otherwise it is judged that a side collision may occur; here wt is the pixel overall width of the picture captured by the virtual camera.
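Steps 2)–5) condensed into a minimal Python sketch; the safe braking distance is vehicle- and speed-dependent and its value is assumed here, as is the overlap reading of the side-collision test:

```python
import numpy as np

def collision_check(box, wt, safe_braking_dist=30.0):
    """box = [x1, y1, x2, y2] from target detection; wt = pixel overall
    width of the picture; safe_braking_dist is an assumed parameter."""
    x1, y1, x2, y2 = box
    h = y2                                                      # step 2): h = y2
    D = np.polyval([2.908e-7, -3.069e-4, 0.08266, 0.252], h)    # eq. (1)
    if D > safe_braking_dist:
        return "no longitudinal collision"                      # step 5), first test
    w = np.polyval([-0.03093, 2.916, -85.9, 832.0], D)          # eq. (2)
    left, right = (wt - w) / 2, (wt + w) / 2                    # central safe band
    if x2 < left or x1 > right:                                 # box outside band
        return "longitudinally close, but no side collision"
    return "collision predicted"
```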
In this embodiment, the steps in step 1) of detecting the collision target and obtaining its coordinates [x1, y1, x2, y2] on the virtual camera imaging plane are as follows:
a: Extract the image of the virtual driving scene in front of the vehicle shot by the virtual camera as the original image; the original image resolution extracted in this embodiment is 2706×1738.
b: Scale the original image and pass it in as the input picture to the feature extraction network, which outputs a feature map. The feature extraction network is an SRN network, whose expression is as follows:
yn = y0 + y1 + y2 + ... + yn-1 + fn(yn-1)   (3)
where fn(·) denotes a residual unit, i.e. BN layers, ReLU activation functions and convolutional layers arranged in a fixed order; y0 is the input of the network and yn its output.
The SRN structure in this embodiment comprises one initial convolutional layer C1 and three sparse residual unit sets C2, C3 and C4, each containing m sparse residual units. A sparse residual unit consists of several residual units, each being BN layer, ReLU activation function and convolutional layer arranged in order, their number determined by the length factor l; the length factor l controls the influence of the initial feature y0 on the network output, and every sparse residual unit has the same length factor. In this embodiment, the number m of sparse residual units in each of C2, C3 and C4 is 2, the length factor l is 2, the network width factor k is 1, every convolution kernel in the sparse residual units is 3×3, and the kernel weights are initialized with the MSRA method. The original image is scaled to a 3-channel RGB input picture of size 800×600. The input picture is first passed to C1 for convolution; C1 contains 96 kernels of scale 7×7 with the convolution stride set to 2 and, to capture edge feature information, the pixel padding pad set to 3. After C1, a feature map of size 400×300×96 is obtained and input to C2. In C2, each convolutional layer has 96 kernels with pad 1; the first convolutional layer of the set has stride 2 and the remaining layers stride 1. After the C2 set, a 200×150×96 feature map is output. Each convolutional layer in C3 contains 128 kernels, with pad and stride set as in C2; through C3 the feature map size becomes 100×75×128. Each convolutional layer in C4 contains 256 kernels, with pad and stride set as in C2 and C3; through C4 the SRN finally outputs a 50×38×256 convolution feature map. A minimal sketch of this sparse residual computation follows.
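As a concrete illustration of expression (3) and the set layout just described, here is a minimal PyTorch sketch of one sparse residual unit set; the downsampling strides, the width factor k and MSRA initialization described above are omitted, the point being the dense summation yn = y0 + y1 + ... + yn-1 + fn(yn-1):

```python
import torch
import torch.nn as nn

class SparseResidualUnit(nn.Module):
    """f_n(.) of eq. (3): l repetitions of BN -> ReLU -> 3x3 conv,
    where l is the length factor (l = 2 in this embodiment)."""
    def __init__(self, channels, length_factor=2):
        super().__init__()
        layers = []
        for _ in range(length_factor):
            layers += [nn.BatchNorm2d(channels),
                       nn.ReLU(inplace=True),
                       nn.Conv2d(channels, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return self.body(x)

class SparseResidualSet(nn.Module):
    """One set (C2/C3/C4) of m units with the dense summation of eq. (3):
    y_n = y_0 + y_1 + ... + y_{n-1} + f_n(y_{n-1})."""
    def __init__(self, channels, m=2, length_factor=2):
        super().__init__()
        self.units = nn.ModuleList(
            [SparseResidualUnit(channels, length_factor) for _ in range(m)])

    def forward(self, y0):
        ys = [y0]
        for f in self.units:
            ys.append(sum(ys) + f(ys[-1]))   # dense sum plus residual branch
        return ys[-1]

x = torch.randn(1, 96, 200, 150)             # arbitrary test input
print(SparseResidualSet(96)(x).shape)         # torch.Size([1, 96, 200, 150])
```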
c: Pass the feature map to the RPN network; the RPN network takes the point on the input picture corresponding to each pixel of the feature map as an anchor point and, with the anchor point as the center, generates several anchor boxes on the input picture at each anchor point position. In this embodiment, 9 anchor boxes are defined at each anchor point, with box side lengths of 128, 256 and 512 and aspect ratios of 1:2, 1:1 and 2:1; the total number of generated anchor boxes is 50×38×9 = 17100. A sketch of this anchor generation follows.
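A small numpy sketch of the generation just described; the effective stride of 16 (800/16 = 50, 600/16 ≈ 38, matching the 50×38 feature map) and the equal-area convention for applying the aspect ratios are assumptions, since the text fixes only the side lengths and ratios:

```python
import numpy as np

def make_anchors(feat_h=38, feat_w=50, stride=16,
                 sizes=(128, 256, 512), ratios=(0.5, 1.0, 2.0)):
    """9 anchor boxes (3 side lengths x 3 aspect ratios) centred on the
    input-picture point of every feature-map cell, as [x1, y1, x2, y2]."""
    anchors = []
    for i in range(feat_h):
        for j in range(feat_w):
            cx, cy = j * stride + stride / 2, i * stride + stride / 2
            for s in sizes:
                for r in ratios:
                    w, h = s * np.sqrt(r), s / np.sqrt(r)   # area kept ~ s*s
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.asarray(anchors)

print(make_anchors().shape)   # (17100, 4) = 38 * 50 * 9 anchor boxes
```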
d: Input the feature map with anchor box information into a convolutional layer and feed the convolutional layer's output into a Softmax classifier for two-class target/background classification, preliminarily judging whether each anchor box's content belongs to a target (pedestrian or vehicle) or the background and obtaining several targets. Specifically: the feature map is fed into a convolutional layer with 18 convolution kernels of size 1×1; each point of the feature map corresponds to 9 anchor boxes, and each anchor box's content may be either of two results (target or background), so the output size is 50×38×18. The output image is fed into the Softmax classifier, which performs the two-class judgment of whether the box content is target or background.
Meanwhile, the feature map is input into another convolutional layer whose output is processed by the bounding-box regression model to obtain the translation and scale corrections of the anchor boxes. Specifically: the feature map is fed into a convolutional layer with 36 convolution kernels of size 1×1; each point of the feature map corresponds to 9 anchor boxes, and each anchor box is represented by 4 values in total, the anchor point coordinates (Ax, Ay) and the anchor box height and width Ah, Aw, so the output size is 50×38×36. The output is processed by the bounding-box regression model to obtain the translation and scale corrections of the anchor boxes.
e: Correct all anchor boxes according to the obtained translation and scale corrections; a sketch under an assumed parameterization follows.
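The text specifies only that the regression output supplies a translation and a scale change per anchor box; a sketch under the standard Faster R-CNN parameterization, which is an assumption here, would be:

```python
import numpy as np

def apply_deltas(anchors, deltas):
    """Box correction: (dx, dy) translate the box centre in units of box
    size; (dw, dh) rescale width and height (assumed convention)."""
    w = anchors[:, 2] - anchors[:, 0]
    h = anchors[:, 3] - anchors[:, 1]
    cx = anchors[:, 0] + 0.5 * w
    cy = anchors[:, 1] + 0.5 * h
    dx, dy, dw, dh = deltas[:, 0], deltas[:, 1], deltas[:, 2], deltas[:, 3]
    cx, cy = cx + dx * w, cy + dy * h           # translation correction
    w, h = w * np.exp(dw), h * np.exp(dh)       # scale correction
    return np.stack([cx - w / 2, cy - h / 2,
                     cx + w / 2, cy + h / 2], axis=1)
```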
f: Sort all anchor boxes from high to low by the target classification score from the Softmax computation in step d, so that the boxes of high-scoring targets come first and those of low-scoring targets come last, and extract the top 6000 anchor boxes.
g: Remove the parts of the extracted anchor boxes that exceed the image boundary;
h: Apply non-maximum suppression to the extracted 6000 anchor boxes to remove duplicates, with the following specific steps (a code sketch follows the list):
① Select the multiple anchor boxes belonging to a certain target;
② According to the Softmax results from step d, first remove the anchor boxes whose probability of containing a target is below 0.5; then compare the anchor box with the highest target probability against each remaining box for overlap: if the overlap exceeds a set threshold, the two boxes belong to the same target, so keep the higher-probability box and remove the other; otherwise the two boxes belong to different targets and both are kept;
③ Select the anchor boxes belonging to the next target and repeat steps ①–② until the anchor boxes of all targets have been screened.
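A numpy sketch of steps ①–③; the 0.5 probability cut comes from the text above, while the IoU threshold used for the overlap judgment is not given there and is an assumed value:

```python
import numpy as np

def nms(boxes, probs, iou_thresh=0.7):
    """Drop boxes whose target probability is below 0.5, then greedily keep
    the highest-probability box and remove every remaining box overlapping
    it beyond the threshold (those duplicates cover the same target)."""
    mask = probs >= 0.5
    boxes, probs = boxes[mask], probs[mask]
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    order = np.argsort(probs)[::-1]              # highest probability first
    kept = []
    while order.size > 0:
        i, rest = order[0], order[1:]
        kept.append(i)
        xx1 = np.maximum(boxes[i, 0], boxes[rest, 0])
        yy1 = np.maximum(boxes[i, 1], boxes[rest, 1])
        xx2 = np.minimum(boxes[i, 2], boxes[rest, 2])
        yy2 = np.minimum(boxes[i, 3], boxes[rest, 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        iou = inter / (areas[i] + areas[rest] - inter)
        order = rest[iou < iou_thresh]           # same-target boxes removed
    return boxes[kept]
```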
i: Sort the remaining anchor boxes again from high to low by the target classification score from the Softmax computation and extract the top 300 as candidate target outputs, candidate target = [x1, y1, x2, y2], where (x1, y1) is the top-left corner of the box and (x2, y2) its bottom-right corner.
j: Divide the feature-map region corresponding to each candidate target into a 7×7 grid and max-pool each grid cell, outputting a 7×7 candidate target feature map; a sketch of this pooling follows.
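A sketch of this fixed-size pooling for one candidate target; `region` is the (channels, H, W) slice of the feature map covered by the candidate box:

```python
import numpy as np

def roi_max_pool(region, grid=7):
    """Divide the region into a grid x grid mesh and max-pool each cell,
    giving a (channels, 7, 7) output regardless of the region's size."""
    c, h, w = region.shape
    ys = np.linspace(0, h, grid + 1).astype(int)
    xs = np.linspace(0, w, grid + 1).astype(int)
    out = np.empty((c, grid, grid))
    for i in range(grid):
        for j in range(grid):
            y_hi = max(ys[i + 1], ys[i] + 1)   # guard against empty cells
            x_hi = max(xs[j + 1], xs[j] + 1)
            out[:, i, j] = region[:, ys[i]:y_hi, xs[j]:x_hi].max(axis=(1, 2))
    return out

print(roi_max_pool(np.random.rand(256, 21, 13)).shape)   # (256, 7, 7)
```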
k: Compute with a fully connected layer and a Softmax classifier whether each candidate target feature map belongs specifically to pedestrian, vehicle, or background, and output the probability vector of each class; meanwhile use bounding-box regression again to obtain the position offset of each candidate target feature map and correct the target box once more, obtaining the final coordinates [x1, y1, x2, y2] of the collision target.
The two regression equations D = 2.908×10⁻⁷·h³ - 3.069×10⁻⁴·h² + 0.08266·h + 0.252 and w = -0.03093·D³ + 2.916·D² - 85.9·D + 832 used in this embodiment were obtained as follows:
Because the virtual driving scene built in this embodiment attaches a panoramic video of the shot real environment to the sphere model, and ordinary video recording maps three-dimensional real-world information onto a two-dimensional image plane, the depth information of objects in the environment is lost, so the positions of objects in the scene cannot be obtained directly. To realize the collision judgment function of the virtual driving system, the mapping relation between the real environment coordinate system and the scene picture coordinate system must be found.
Since the panoramic video is attached to the inner surface of the sphere model, it is in essence a spherical projection of real-world scenery onto the sphere model surface, i.e. the orthogonal projection of 3D points in real space onto the sphere, and the points of the spherical image correspond one-to-one to 3D points in space. Unlike the common perspective projection model, the spherical projection model is shown in Fig. 7: the sphere S has center O and radius r, the spherical surface carries the attached panoramic image, the coordinate system OXYZ is the rectangular coordinate system defining target objects in the spherical panorama, and the point O also serves as the optical center of the spherical camera. The line connecting the space 3D point Pc = (Xc, Yc, Zc) with the optical center O intersects the sphere S at the point p = (x, y, z). For ease of calculation, points of the panoramic image on the sphere are expressed in spherical coordinates (r, θ, φ), related to the rectangular coordinate system by:
x = r·sinθ·cosφ, y = r·sinθ·sinφ, z = r·cosθ   (4)
Since the sphere center O, the point p on the panorama and the 3D point Pc in real space are collinear, the relation between p and Pc is:
p = (r/‖Pc‖)·Pc   (5)
Letting λ = r/‖Pc‖, we have:
p = λ·Pc   (6)
In actual shooting, the camera coordinates and the real-world reference coordinates are related by a rotation and a translation. With R the rotation matrix and T the translation vector:
Pc = R·Pw + T   (7)
The mapping relation between the spherical panorama picture coordinates and the real-world reference coordinates is therefore:
p = λ·[R | T]·Pw   (8)
Here the rotation matrix R and the translation vector T are determined by the attributes of the camera used for shooting.
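Equations (6)–(8) in code form: a small sketch, assuming R, T and the sphere radius r are known, that maps a real-world point onto the spherical panorama:

```python
import numpy as np

def world_to_sphere(P_w, R, T, r=1.0):
    """p = lambda * [R|T] * P_w: rotate/translate the world point into
    camera coordinates, eq. (7), then scale onto the sphere, eq. (6)."""
    P_c = R @ P_w + T                   # eq. (7)
    lam = r / np.linalg.norm(P_c)       # lambda = r / |P_c|
    return lam * P_c                    # point on the sphere surface

p = world_to_sphere(np.array([1.0, 2.0, 2.0]), np.eye(3), np.zeros(3))
print(np.linalg.norm(p))                # 1.0 -- the point lies on the sphere
```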
The image used for distance modeling in this embodiment is the flat image of the scene picture on the sphere surface, shot by a virtual camera placed at the optical center of the sphere model; hence a mapping also exists between the spherical panorama picture coordinates and the flat image coordinates. This mapping can be illustrated by the perspective projection model of the camera shown in Fig. 8. Let the camera coordinate system be (XP, YP, ZP), the photo coordinate system (w, h) and the pixel coordinate system (u, v). By similar triangles (f denoting the focal length of the virtual camera):
x = f·XP/ZP,  y = f·YP/ZP   (9)
Light forms an analog quantity on the imaging plane; sampling and quantizing this analog quantity yields coordinate values on the pixel plane. Scaling and translation transformations exist between the pixel plane and the imaging plane: if, after a point on the imaging plane is mapped to the pixel plane, the coordinates are magnified α times on the u axis and β times on the v axis and the origin is translated by (cw, ch), the following equations are obtained:
u = α·x + cw,  v = β·y + ch   (10)
Combining formulas (9) and (10), the mapping relation between the camera coordinate system and the pixel coordinate system is as shown below:
u = α·f·XP/ZP + cw,  v = β·f·YP/ZP + ch   (11)
From formula (11), the farther the target is from the camera, the larger ZP and hence the smaller u and v, i.e. the farther the image position is from the lower edge of the imaging plane. Therefore the distance of a target from the shooting point in space can be judged from the target's vertical pixel height in the flat picture.
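The monotonic relation that motivates the distance model, shown numerically with the reconstructed formulas (9)–(11); all camera parameter values below are arbitrary illustrative choices:

```python
def camera_to_pixel(X, Y, Z, f=1.0, alpha=800, beta=600, cw=400, ch=300):
    """Perspective projection, eq. (9), then pixel scaling and translation,
    eq. (10): the larger Z (the farther the target), the smaller u and v."""
    x, y = f * X / Z, f * Y / Z               # imaging-plane coordinates
    return alpha * x + cw, beta * y + ch      # pixel coordinates, eq. (11)

print(camera_to_pixel(1, 1, 5))    # near target -> farther from image centre
print(camera_to_pixel(1, 1, 50))   # far target  -> u, v shrink toward (cw, ch)
```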
R, T, α, β, cw and ch in the above model are determined by the extrinsic and intrinsic attributes of the camera; obtaining them is rather complicated, and factors such as optical distortion also make an exact coordinate mapping difficult to derive. This embodiment therefore takes the reverse approach: the distance values in true coordinates and the pixel values in the corresponding flat-image coordinates are first recorded in a calibration experiment, and the mapping relation between the two kinds of data is then found by data regression, establishing the scene distance model.
Judging a vehicle collision requires distance models in two directions: a longitudinal distance model along the lane direction and a lateral safety range model perpendicular to the lane direction. The longitudinal distance model D = f(h) captures the relation between the vertical pixel height of a target object in the scene picture and the object's distance from the shooting point in the real environment.
The lateral safety range represents the road width that the virtual driving vehicle needs to occupy in the road section ahead; the vehicle can travel normally only if this range is clear. The range is determined by the vehicle width plus a reserved safety margin. As shown in Fig. 9, LD is the real road width, LC the actual vehicle width and LS the safety margin, so the lateral actual safety range is LR = LC + 2·LS; D is the actual distance between the measured point and the camera, and w is the lateral pixel width in the scene picture corresponding to LR at distance D. As Fig. 9 shows, the pixel width w in the picture is determined by the target's actual width and its distance from the camera, so the lateral safety range model is set as w = G(LR, D). Since the vehicle width LC and the safety margin LS are fixed values, LR is a constant, and the lateral safety range model reduces to w = g(D).
First analyze D = f(h). Let the dependent variable be D and suppose the relational expression:
D = b0 + b1·h1 + b2·h2 + ... + bm·hm + ε   (12)
where ε is a zero-mean random variable, h1, h2, ..., hm are controllable variables and b0, b1, ..., bm are unknown parameters; formula (12) is then called a multiple linear regression model (m > 1). Suppose n random trials on h1, h2, ..., hm and D give the observations hi1, hi2, ..., him, Di, i = 1, 2, ..., n; then Di = b0 + b1·hi1 + b2·hi2 + ... + bm·him + εi, where ε1, ε2, ..., εn are usually assumed independent and identically distributed as N(0, σ²). The equation
D̂ = b̂0 + b̂1·h1 + b̂2·h2 + ... + b̂m·hm   (13)
is called the m-variable linear regression equation, and D̂ is the regressed value at the sample point (h1, h2, ..., hm), where b̂0, b̂1, b̂2, ..., b̂m are the estimates of b0, b1, b2, ..., bm respectively.
Finally, the regression equation so built must be tested: does (h1, h2, ..., hm) actually affect D, and linearly? This calls for the null hypothesis H0: b1 = b2 = ... = bm = 0, for which the statistic
F = (SSR/m) / (SSRe/(n - m - 1))   (14)
is taken, where SSR is the regression sum of squares and SSRe the residual sum of squares. Mathematical statistics shows that if H0 holds, F obeys the F distribution F(m, n - m - 1) with degrees of freedom (m, n - m - 1). Given a significance level η, Fη is obtained from p(F > Fη) = η by table lookup. When F > Fη, H0 is rejected, i.e. at significance level η the linear effect of h1, h2, ..., hm on D is significant and the regression equation is meaningful; otherwise the regression equation is meaningless.
The observation data needed for regression modeling come from actual calibration. All parameters of the calibration experiment in this embodiment take a driver of height 1.7 m and a ninth-generation Dongfeng Honda Civic as the standard. As measured, after the driver adjusts the driving position in the driver's seat, the vertical height of the eyes from the ground is 1.2 m and the vehicle width is 1.8 m. According to the actual requirements of collision judgment in the virtual driving system, calibration measurements cover the distance range from 0 m to 50 m ahead of travel. Measurement starts at 2 m and is taken every 1 m up to 50 m, 49 measurements in total. Meanwhile, considering the safety margin on the vehicle width, the reference automobile width is increased by 20 cm; at each measurement point a straight bar of fixed length 2 m is placed transversely to record the transverse width. The recorded observation data are shown in Table 1.
Table 1: Calibration experiment data record
Meanwhile for the ease of analysis, the mapping curve figure observed between data in observation data is depicted, such as Figure 10 and 11 It is shown.
It can be obtained by Figure 10 and 11, be in smooth non-linear relation between D and h, D and w, it can be using nonlinear regression to the mould Type is fitted.
From the analysis of the mapping relation D = f(h) above, the calibration data curves show that D and h are related by a cubic curve, so the following regression model is chosen for nonlinear regression modeling:
D = b0 + b1·h + b2·h² + b3·h³ + ε   (15)
where h is the image-plane vertical pixel height, D the actual distance, ε a zero-mean random variable, and (b0, b1, b2, b3) the unknown parameters to be found by regression modeling.
For ease of modeling analysis, let h1 = h, h2 = h², h3 = h³. Substituting h1, h2, h3 into equation (15) converts it into the three-variable linear regression shown in (16):
D = b0 + b1·h1 + b2·h2 + b3·h3 + ε   (16)
Its corresponding three-variable linear regression equation is:
D̂ = b̂0 + b̂1·h1 + b̂2·h2 + b̂3·h3   (17)
where b̂0, b̂1, b̂2, b̂3 are the estimates of the parameters (b0, b1, b2, b3) and D̂ is the regressed value at the point (h1, h2, h3).
Estimating the parameters (b0, b1, b2, b3) by the least squares method yields the regression equation from image-plane vertical height h to actual longitudinal distance D:
D = 2.908×10⁻⁷·h³ - 3.069×10⁻⁴·h² + 0.08266·h + 0.252   (18)
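The least-squares estimate itself is a single numpy call. Since the 49 (h, D) pairs of Table 1 are not reproduced here, the sketch below synthesizes stand-in data from (18) (the h range and the noise level are assumptions) purely to demonstrate the fit:

```python
import numpy as np

c18 = [2.908e-7, -3.069e-4, 0.08266, 0.252]    # coefficients of eq. (18)

# Stand-in for the 49 calibration pairs of Table 1 (assumed h range,
# with noise substituting for measurement error).
rng = np.random.default_rng(0)
h_obs = np.linspace(600, 950, 49)
D_obs = np.polyval(c18, h_obs) + rng.normal(0.0, 0.3, h_obs.size)

coeffs = np.polyfit(h_obs, D_obs, deg=3)       # least-squares cubic fit
print(coeffs)                                  # compare with c18
```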
Finally, the significance of (18) is examined through the null hypothesis H0: b1 = b2 = b3 = 0. It is easy to calculate:
the total sum of squares SST = Σ(Di - D̄)², the residual sum of squares SSRe = Σ(Di - D̂i)², and the regression sum of squares SSR = SST - SSRe = 6207.
Taking the statistic of formula (14) gives F ≈ 75.145. At significance level η = 0.01, from p(F > Fη) = 0.01 the F(m, n - m - 1) = F(3, 45) distribution table gives Fη(3, 45) = 4.31. Since F = 75.145 > 4.31, the obtained regression equation (18) is significant.
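The same table lookup can be reproduced with scipy's F distribution; the degrees of freedom follow from m = 3 regressors and n = 49 observations:

```python
from scipy.stats import f as f_dist

m, n = 3, 49              # regressors h1, h2, h3 and 49 observations
F = 75.145                # statistic of eq. (14) computed above

eta = 0.01                # significance level
F_crit = f_dist.ppf(1 - eta, m, n - m - 1)   # critical value F_eta(3, 45)
print(F_crit)             # close to the table value 4.31 used above
print(F > F_crit)         # True -> reject H0: regression (18) is significant
```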
Similarly, substituting the corresponding data into the relation between the image-plane lateral safe width w and the actual distance D and calculating yields:
w = -0.03093·D³ + 2.916·D² - 85.9·D + 832   (19)
Since F ≈ 177.6 > 4.31 by the same calculation, regression equation (19) is also significant.
The above analysis demonstrates that the longitudinal distance D and the image-plane lateral safe width w used in this embodiment can serve for target collision detection in a virtual driving scene modeled from a real scene.
Finally, it should be noted that the above embodiment is intended only to illustrate the technical solution of the present invention and not to limit it. Although the invention has been described in detail with reference to a preferred embodiment, those skilled in the art should understand that the technical solution of the invention may be modified or equivalently replaced without departing from its purpose and scope, and all such modifications shall be covered by the claims of the invention.

Claims (5)

1. A virtual driving target collision detection method based on real scene modeling, characterized by comprising the following steps:
1) Shooting the virtual driving scene in front of the vehicle by a virtual camera arranged at the driving position of the virtual driving vehicle, and obtaining the coordinates [x1, y1, x2, y2] of the collision target on the virtual camera imaging plane by target detection, where (x1, y1) is the top-left corner of the collision target rectangle and (x2, y2) its bottom-right corner; the virtual driving scene consists of a sphere model and a real scene video attached to the inner wall of the sphere model;
2) Obtaining the image-plane vertical height h of the collision target from its plane coordinates [x1, y1, x2, y2]: h = y2;
3) Calculating the longitudinal distance D between the operated vehicle and the collision target according to the regression equation from image-plane vertical height h to actual longitudinal distance D:
D = 2.908×10⁻⁷·h³ - 3.069×10⁻⁴·h² + 0.08266·h + 0.252   (1)
4) Obtaining the image-plane lateral safe width w of the virtual driving vehicle according to the regression equation between image-plane lateral safe width w and actual distance D:
w = -0.03093·D³ + 2.916·D² - 85.9·D + 832   (2);
5) Collision judgment:
First judging whether the virtual driving vehicle will collide longitudinally with the target: if the longitudinal distance D is less than the safe braking distance of the vehicle, a longitudinal collision is judged possible; if D is greater than the safe braking distance, it is judged that no longitudinal collision will occur;
Then, on the premise that the virtual driving vehicle may collide longitudinally with the target, judging from the abscissas x1 and x2 of the left and right sides of the target rectangle whether a side collision will occur: if the target rectangle lies entirely outside the central safe band, i.e. x2 < (wt - w)/2 or x1 > (wt + w)/2, it is judged that no side collision will occur; otherwise it is judged that a side collision may occur; here wt is the pixel overall width of the picture captured by the virtual camera.
2. The virtual driving target collision detection method based on real scene modeling according to claim 1, characterized in that in step 1) the steps of detecting the collision target and obtaining its coordinates [x1, y1, x2, y2] on the virtual camera imaging plane are as follows:
a: extracting the image of the virtual driving scene in front of the vehicle shot by the virtual camera as the original image;
b: scaling the original image and passing it in as the input picture to the feature extraction network, which outputs a feature map;
c: passing the feature map to the RPN network; the RPN network takes the point on the input picture corresponding to each pixel of the feature map as an anchor point and, with the anchor point as the center, generates several anchor boxes on the input picture at each anchor point position;
d: inputting the feature map with anchor box information into a convolutional layer and feeding the convolutional layer's output into a Softmax classifier for two-class target/background classification, preliminarily judging whether each anchor box's content belongs to a target or the background and obtaining several targets; meanwhile inputting the feature map into another convolutional layer whose output is processed by the bounding-box regression model to obtain the translation and scale corrections of the anchor boxes;
e: correcting all anchor boxes according to the obtained translation and scale corrections;
f: sorting all anchor boxes from high to low by the target classification score from the Softmax computation in step d, so that the boxes of high-scoring targets come first and those of low-scoring targets come last, and extracting the top several anchor boxes;
g: removing the parts of the extracted anchor boxes that exceed the image boundary;
h: applying non-maximum suppression to the extracted anchor boxes to remove the duplicate boxes of each target;
i: sorting the remaining anchor boxes again from high to low by the target classification score from the Softmax computation and extracting the top several as candidate target outputs, candidate target = [x1, y1, x2, y2], where (x1, y1) is the top-left corner of the box and (x2, y2) its bottom-right corner;
j: dividing the feature-map region corresponding to each candidate target into a 7×7 grid and max-pooling each grid cell, outputting a 7×7 candidate target feature map;
k: computing with a fully connected layer and a Softmax classifier whether each candidate target feature map belongs specifically to pedestrian, vehicle, or background, and outputting the probability vector of each class; meanwhile using bounding-box regression again to obtain the position offset of each candidate target feature map and correcting the target box once more, obtaining the final coordinates [x1, y1, x2, y2] of the collision target.
3. The virtual driving target collision detection method based on real scene modeling according to claim 2, characterized in that the feature extraction network in step b is an SRN network whose expression is as follows:
yn = y0 + y1 + y2 + ... + yn-1 + fn(yn-1)   (3)
where fn(·) denotes a residual unit, i.e. BN layers, ReLU activation functions and convolutional layers arranged in a fixed order; y0 is the input of the network and yn its output;
the SRN structure comprises one initial convolutional layer C1 and three sparse residual unit sets C2, C3 and C4, each stacked from m sparse residual units; a sparse residual unit consists of several BN layers, ReLU activation functions and convolutional layers arranged in a fixed order, their number determined by the length factor l; the length factor l controls the influence of the initial feature y0 on the network output, and every sparse residual unit has the same length factor.
4. The virtual driving target collision detection method based on real scene modeling according to claim 2, characterized in that in step c, with the anchor point as the center, 9 anchor boxes are defined at each anchor point, with box side lengths of 128, 256 and 512 and aspect ratios of 1:2, 1:1 and 2:1.
5. The virtual driving target collision detection method based on real scene modeling according to claim 2, characterized in that the non-maximum suppression in step h comprises the following steps:
① selecting the multiple anchor boxes belonging to a certain target;
② according to the Softmax results from step d, first removing the anchor boxes whose probability of containing a target is below 0.5, then comparing the anchor box with the highest target probability against each remaining box for overlap: if the overlap exceeds a set threshold, the two boxes belong to the same target, so the higher-probability box is kept and the other removed; otherwise the two boxes belong to different targets and both are kept;
③ selecting the anchor boxes belonging to the next target and repeating steps ①–② until the anchor boxes of all targets have been screened.
CN201910463552.9A 2019-05-30 2019-05-30 Virtual driving target collision detection method based on real scene modeling Active CN110188689B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201910463552.9A CN110188689B (en) 2019-05-30 2019-05-30 Virtual driving target collision detection method based on real scene modeling

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201910463552.9A CN110188689B (en) 2019-05-30 2019-05-30 Virtual driving target collision detection method based on real scene modeling

Publications (2)

Publication Number Publication Date
CN110188689A (en) 2019-08-30
CN110188689B CN110188689B (en) 2022-11-29

Family

ID=67718928

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201910463552.9A Active CN110188689B (en) 2019-05-30 2019-05-30 Virtual driving target collision detection method based on real scene modeling

Country Status (1)

Country Link
CN (1) CN110188689B (en)


Patent Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030191568A1 (en) * 2002-04-09 2003-10-09 Breed David S. Method and system for controlling a vehicle
JP2004164335A (en) * 2002-11-13 2004-06-10 Mazda Motor Corp Planning support program, method, system and storage medium
US20060040239A1 (en) * 2004-08-02 2006-02-23 J. J. Keller & Associates, Inc. Driving simulator having articial intelligence profiles, replay, hazards, and other features
US20150104757A1 (en) * 2013-10-15 2015-04-16 Mbfarr, Llc Driving assessment and training method and apparatus
JP2016085326A (en) * 2014-10-24 2016-05-19 三菱プレシジョン株式会社 Vehicle driving simulation device
JP2017009717A (en) * 2015-06-19 2017-01-12 国立大学法人秋田大学 Bicycle driving simulator
CN105015411A (en) * 2015-07-03 2015-11-04 河南工业技术研究院 Automobile microwave radar anti-collision early-warning method and system based on video fusion
CN105898337A (en) * 2015-11-18 2016-08-24 乐视网信息技术(北京)股份有限公司 Panoramic video display method and device
CN105966396A (en) * 2016-05-13 2016-09-28 江苏大学 Vehicle collision avoidance control method based on driver collision avoidance behavior
US20180246515A1 (en) * 2017-02-28 2018-08-30 Mitsubishi Electric Research Laboratories, Inc. Vehicle Automated Parking System and Method
US20180286268A1 (en) * 2017-03-28 2018-10-04 Wichita State University Virtual reality driver training and assessment system
CN107888894A (en) * 2017-10-12 2018-04-06 浙江零跑科技有限公司 A kind of solid is vehicle-mounted to look around method, system and vehicle-mounted control device

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
刘旭 (Liu Xu): "Research on a Semi-Physical Simulation Platform for Virtual Driving", China Master's Theses Full-text Database, Engineering Science and Technology II *
谢振清 (Xie Zhenqing): "Research and Implementation of Virtual Assembly Technology Based on Unity3D", China Master's Theses Full-text Database, Information Science and Technology *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111123948A (en) * 2019-12-31 2020-05-08 北京新能源汽车技术创新中心有限公司 Vehicle multidimensional perception fusion control method and system and automobile
CN111123948B (en) * 2019-12-31 2023-04-28 北京国家新能源汽车技术创新中心有限公司 Vehicle multidimensional sensing fusion control method and system and automobile
CN111833598A (en) * 2020-05-14 2020-10-27 山东科技大学 Automatic traffic incident monitoring method and system for unmanned aerial vehicle on highway
CN112183461A (en) * 2020-10-21 2021-01-05 广州市晶华精密光学股份有限公司 Vehicle interior monitoring method, device, equipment and storage medium
CN113552950A (en) * 2021-08-06 2021-10-26 上海炫伍科技股份有限公司 Virtual and real interaction method for virtual cockpit

Also Published As

Publication number Publication date
CN110188689B (en) 2022-11-29

Similar Documents

Publication Publication Date Title
CN110188689A (en) Virtual driving target collision detection method based on real scene modeling
CN108596101B (en) Remote sensing image multi-target detection method based on convolutional neural network
CN104183127B (en) Traffic surveillance video detection method and device
CN110032949A (en) A kind of target detection and localization method based on lightweight convolutional neural networks
CN111091105A (en) Remote sensing image target detection method based on new frame regression loss function
CN109284704A (en) Complex background SAR vehicle target detection method based on CNN
CN105354568A (en) Convolutional neural network based vehicle logo identification method
CN110533695A (en) A kind of trajectory predictions device and method based on DS evidence theory
CN106780546B (en) The personal identification method of motion blur encoded point based on convolutional neural networks
CN109299644A (en) A kind of vehicle target detection method based on the full convolutional network in region
CN106023257A (en) Target tracking method based on rotor UAV platform
CN103984936A (en) Multi-sensor multi-feature fusion recognition method for three-dimensional dynamic target recognition
CN105184857A (en) Scale factor determination method in monocular vision reconstruction based on dot structured optical ranging
CN106022266A (en) Target tracking method and target tracking apparatus
CN102289822A (en) Method for tracking moving target collaboratively by multiple cameras
CN103942786B (en) The self adaptation block objects detection method of unmanned plane visible ray and infrared image
Yan et al. A novel data augmentation method for detection of specific aircraft in remote sensing RGB images
CN108416798A (en) A kind of vehicle distances method of estimation based on light stream
Saputra et al. Sim-to-real learning for casualty detection from ground projected point cloud data
CN110517285A (en) The minimum target following of large scene based on estimation ME-CNN network
CN111353481A (en) Road obstacle identification method based on laser point cloud and video image
CN115761684A (en) AGV target recognition and attitude angle resolving method and system based on machine vision
Jiangzhou et al. Research on real-time object detection algorithm in traffic monitoring scene
CN109325963A (en) A kind of bus passenger three-dimensional track classification method based on SVM
CN113762195A (en) Point cloud semantic segmentation and understanding method based on road side RSU

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant