CN114757977A - Moving object track extraction method fusing improved optical flow and target detection network
- Publication number
- CN114757977A (application CN202210476328.5A)
- Authority
- CN
- China
- Prior art keywords
- optical flow
- moving object
- frame
- detection
- moving
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G06T7/269 - Image analysis; Analysis of motion using gradient-based methods
- G06N3/045 - Neural networks; Architecture; Combinations of networks
- G06N3/08 - Neural networks; Learning methods
- G06T2207/10016 - Image acquisition modality; Video; Image sequence
- G06T2207/20081 - Special algorithmic details; Training; Learning
- G06T2207/20084 - Special algorithmic details; Artificial neural networks [ANN]
- G06T2207/30241 - Subject of image; Trajectory
- G06T2207/30252 - Subject of image; Vehicle exterior; Vicinity of vehicle
Abstract
The invention discloses a moving object trajectory extraction method that fuses an improved optical flow with a target detection network. The method comprises the following steps. S1: acquire video of road sections or intersections through camera equipment. S2: detect moving objects in the video acquired in S1 using the deep learning detection network YOLOv3, and record the position and size of each moving object. S3: based on the information obtained in S2, calculate the optical flow information of each moving object with the improved optical flow method. S4: acquire the motion trajectory of each object from the optical flow information obtained in S3 together with the detection frames, and stop the algorithm when the video ends. The improved optical flow algorithm scales the image proportionally to obtain layers at different scales and computes the optical flow recursively on each layer, so that the method as a whole can cope with fast-moving objects, extract their trajectories, and guarantee the accuracy and stability of object matching.
Description
Technical Field
The invention relates to the technical field of automobile intelligent driving systems, in particular to a moving object track extraction method fusing an improved optical flow and a target detection network.
Background
Currently, in the field of intelligent driving, extracting and predicting the motion of the objects moving around a vehicle has become one of the key technologies as automobile intelligent driving systems evolve toward higher levels of autonomy. The aim is to acquire the driving direction, speed, track, and trend of the moving objects around the automobile at the current moment by means of sensors arranged around the vehicle body, and then to predict their motion state and trajectory at future moments with suitable algorithms, helping the vehicle make decisions and avoid dangerous scenes in advance. This improves the safety and comfort of driver and passengers during driving and therefore has great application potential and value.
With the development of visual sensors and the growing computing power of mobile computing units, acquiring the motion state of an object from visual images is becoming the mainstream approach; among such approaches, optical-flow-based object tracking and trajectory acquisition found wide application early in the development of the field. The optical flow method uses the temporal change of pixels in an image sequence and the correlation between adjacent frames to compute the motion of an object between the previous frame and the current frame, and it is generally robust to changes in illumination intensity. However, when an object moves quickly, the pixel group inside the optical flow calculation window in the previous frame may completely leave that window in the next frame, so the conventional optical flow method struggles to achieve the desired effect; for example, patent CN102999759A adopts the conventional optical flow method and is only suitable for scenes where the vehicle travels at low speed.
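As a rough illustration of this limitation (a rule of thumb, not a statement from the patent): a windowed optical flow solver with half-width w can only capture displacements of about w pixels per frame, and each level of a coarse-to-fine pyramid roughly doubles that reach, which is what motivates the pyramidal scheme described later.

```python
def max_trackable_displacement(window_half_width, pyramid_levels):
    """Rule-of-thumb reach of windowed optical flow: about w pixels at the
    base resolution, doubled by every extra pyramid level (illustrative
    approximation only)."""
    return window_half_width * (2 ** pyramid_levels)

# A 7x7 window (w = 3) alone covers roughly 3 px of motion per frame;
# with a 3-level pyramid the same window reaches roughly 24 px.
```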
To acquire the motion track of a moving vehicle, after the optical flow method computes the movement information of the moving object in the current frame i, it may rely on simple hand-crafted features, such as Harris corners, to match and locate the object's features in the next frame i+1. For example, patent CN103871079A learns Haar-like features of vehicles with the AdaBoost machine learning algorithm and combines them with optical flow to acquire the track of a moving vehicle. For vehicles and pedestrians in complex urban scenes, however, Haar-like features are simple manually designed features that cannot describe complex appearance, so such methods may suffer from failed trajectory extraction and lost tracking.
With the rise of deep learning in recent years, a large number of algorithms have emerged that learn the visual appearance of real objects through the strong feature expression capability of convolutional neural networks, post-process the learned features, and finally realize high-level visual tasks such as object detection and tracking. Although extracting object features with a convolutional neural network achieves high accuracy, it depends on mobile devices with high computing power; when that computing power is insufficient, deep learning schemes often sacrifice running speed in order to preserve accuracy.
Disclosure of Invention
In view of the above deficiencies in the prior art, the present invention provides a moving object trajectory extraction method that fuses an improved optical flow with a target detection network, so as to solve problems such as failed trajectory extraction and unstable object matching when an object moves quickly in a complex urban traffic scene (e.g., a traffic intersection).
In order to solve the technical problem, the invention adopts the following technical scheme:
the method for extracting the track of the moving object fusing the improved optical flow and the target detection network comprises the following steps:
s1: acquiring videos of road sections or intersections through camera equipment;
s2: detecting moving objects by using a deep learning detection network YOLOv3 according to the video acquired in S1, and recording the position and size information of each moving object;
s3: calculating optical flow information of the moving object by the improved optical flow method based on the information obtained at S2;
s4: and (4) acquiring the motion trail of the object by depending on the optical flow information acquired by the S3 and the detection frame, and stopping the algorithm after the video is finished.
Compared with the prior art, the invention has the following beneficial effects:
aiming at the problems of failure in track extraction, instability in object matching and the like when a moving object moves at a high speed in a complex urban traffic scene (such as a traffic intersection) by an intelligent driving system, the method improves the optical flow algorithm, obtains layers with different scales by scaling an image in equal proportion, recursively calculates optical flows of all the layers, and realizes target acquisition, object feature matching and frame skipping detection by using a YOLOv3 target detection convolutional neural network, so that the whole algorithm can cope with the moving object with a high moving speed, the moving track of the object with a high moving speed can be acquired, and the accuracy and the stability of object matching are realized.
Drawings
FIG. 1 is a flow chart of a moving object trajectory extraction method that integrates an improved optical flow and a target detection network in accordance with the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the embodiments of the present invention clearer, the invention is further described in detail below with reference to the accompanying drawings. The described embodiments should not be construed as limiting the invention, and all other embodiments obtained by a person of ordinary skill in the art without creative effort fall within its protection scope.
The invention provides a moving object trajectory extraction method fusing an improved optical flow and a target detection network, which comprises the following steps:
s1: acquiring videos of road sections or intersections through camera equipment;
s2: detecting moving objects with the deep learning detection network YOLOv3 in the video acquired in S1, and recording the position and size information of each moving object. Based on the current frame, the set of position areas of the objects of interest under the current frame, $P^f = \{p_1^f, p_2^f, \ldots, p_n^f\}$, is obtained by detection, wherein f denotes the current frame number, n denotes the number of a moving object under the current frame f, and each element of the set $P^f$ comprises the top-left corner coordinates and the length and width, $[X_f, Y_f, L_f, W_f]$, in the pixel coordinate system, of the circumscribed rectangle of the moving object numbered n under the current frame f.
S3: from the information obtained at S2, optical flow information of the moving object is obtained by modified optical flow calculation.
At S3, the optical flow information is acquired by the following algorithm.

Step 1: scale the image to [1/2, 1/4, 1/8] of the original size to obtain layers L1, L2, and L3; compute the optical flow of the object layer by layer starting from layer L3, taking the optical flow result of the upper (coarser) layer as the initial state of the optical flow calculation on the layer below, so that the accurate optical flow of the object is computed step by step from coarse to fine and the calculation accuracy for fast-moving objects is improved. The optical flow is calculated as follows.

Suppose the coordinates of pixel point u in the current frame are $u = [u_x, u_y]^T$ and that in the next frame the pixel lies at $v = [u_x + d_x, u_y + d_y]^T$; the two frame images are denoted $I(x, y)$ and $J(x, y)$, respectively. The optical flow method assumes that the pixels in a neighborhood $W$ obey the same motion law, and establishes the following error function for optimizing the optical flow $d = [d_x, d_y]^T$:

$$\epsilon(d) = \sum_{x = u_x - w_x}^{u_x + w_x} \; \sum_{y = u_y - w_y}^{u_y + w_y} \big( I(x, y) - J(x + d_x,\, y + d_y) \big)^2$$

where $w_x$ and $w_y$ denote the horizontal and vertical half-widths of the neighborhood of pixel u, generally taken as 2, 3, 4, 5, 6, or 7. Rewriting the optical flow loss function for the current pyramid layer L gives:

$$\epsilon^L(d^L) = \sum_x \sum_y \big( I^L(x, y) - J^L(x + g^L_x + d^L_x,\, y + g^L_y + d^L_y) \big)^2$$

where $g^L$ is the guess optical flow, i.e., the initial value of the optical flow for the L-th pyramid layer, and $d^L$ is the residual optical flow of the L-th layer. The residual optical flow $d^L$ is computed as follows.

First, differentiating $\epsilon^L$ with respect to $d^L$ yields:

$$\frac{\partial \epsilon^L}{\partial d^L} = -2 \sum_x \sum_y \big( I^L(x, y) - J^L(x + g^L_x + d^L_x,\, y + g^L_y + d^L_y) \big) \left[ \frac{\partial J^L}{\partial x} \;\; \frac{\partial J^L}{\partial y} \right]$$

where $\partial J^L / \partial x$ and $\partial J^L / \partial y$ denote the derivatives of image $J^L$ with respect to the x and y coordinates. Define the image difference and the image gradient:

$$\delta I(x, y) = I^L(x, y) - J^L(x + g^L_x,\, y + g^L_y), \qquad \nabla I = \begin{bmatrix} I_x \\ I_y \end{bmatrix}$$

where $\nabla I$ is found by image gradient calculation, i.e.:

$$I_x = \frac{I^L(x + 1, y) - I^L(x - 1, y)}{2}, \qquad I_y = \frac{I^L(x, y + 1) - I^L(x, y - 1)}{2}$$

Substituting the first-order Taylor expansion $J^L(x + g^L + d^L) \approx J^L(x + g^L) + \nabla I^T d^L$ into the derivative above and transposing both sides gives:

$$\frac{1}{2} \left[ \frac{\partial \epsilon^L}{\partial d^L} \right]^T \approx G\, d^L - b_k, \qquad G = \sum_x \sum_y \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}, \qquad b_k = \sum_x \sum_y \begin{bmatrix} \delta I\, I_x \\ \delta I\, I_y \end{bmatrix}$$

The loss function reaches its optimum where the derivative is 0, i.e., $d^L = G^{-1} b_k$. For pyramid optical flow this is computed iteratively: let $\eta_k = G^{-1} b_k$ and iterate $d^k = d^{k-1} + \eta_k$; the iteration ends once $\|\eta_k\|$ falls below a threshold.

Step 2: acquire the optical flow information of the moving objects. Based on the set of moving object position areas acquired by the YOLOv3 algorithm in S2, and according to the down-sampling multiple of the YOLOv3 detection head from which each object detection frame comes, take the detection frames output by the 32-fold down-sampling detection head, the 16-fold down-sampling detection head, and the 8-fold down-sampling detection head as the tracking start points of the top, second, and third pyramid layers of the optical flow, respectively. The optical flow computed from the bottom-layer pyramid tracking points is weighted with the optical flow result of the upper layer, the weight of the upper-layer result being $1/e^2$. This yields the set of detection-frame center points under the current frame, $C^f = \{c_1^f, c_2^f, \ldots, c_n^f\}$, wherein $c_i^f$ denotes the center point coordinates of the moving object numbered i under the current frame f.
S4: acquiring the object motion trajectory by means of the optical flow information acquired in S3 and the detection frames, and stopping the algorithm when the video ends.
In S4, the object motion trajectory is acquired by:
(1) drawing the motion trajectory of each detected object for the next frame f+1 based on the moving object position information and the optical flow information obtained in S2 and S3;
(2) continuously feeding image frames into the YOLOv3 detection network, and performing Kalman filtering on the coordinates of the newly detected object and the object position calculated by the optical flow; taking the pixel points inside the filtered object detection frame as the starting points of the next optical flow prediction, and continuing to calculate and draw the object position and the optical flow for the next frame.
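Sub-step (2) can be illustrated with a minimal Kalman filter over a detection-frame center, in which the optical-flow displacement drives the predict step and the newly detected YOLOv3 coordinates serve as the measurement. The constant-position state model and the noise settings q and r are illustrative assumptions, not values from the patent:

```python
import numpy as np

class CenterKalman:
    """Kalman filter over a detection-frame center (x, y): predict with the
    optical-flow displacement, update with the newly detected coordinates."""

    def __init__(self, center, q=1.0, r=4.0):
        self.x = np.asarray(center, dtype=float)  # state: center position
        self.P = np.eye(2) * 10.0                 # state covariance
        self.Q = np.eye(2) * q                    # process noise (flow step)
        self.R = np.eye(2) * r                    # measurement noise (detector)

    def predict(self, flow_displacement):
        """Move the state by the optical-flow displacement."""
        self.x = self.x + np.asarray(flow_displacement, dtype=float)
        self.P = self.P + self.Q
        return self.x.copy()

    def update(self, detected_center):
        """Fuse the newly detected center (measurement model H = I)."""
        z = np.asarray(detected_center, dtype=float)
        K = self.P @ np.linalg.inv(self.P + self.R)   # Kalman gain
        self.x = self.x + K @ (z - self.x)
        self.P = (np.eye(2) - K) @ self.P
        return self.x.copy()
```

The filtered center lies between the flow-predicted position and the detected position, weighted by the two noise settings; the pixels of the filtered box then seed the next optical-flow prediction.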
Aiming at problems such as failed trajectory extraction and unstable object matching when a moving object moves at high speed in a complex urban traffic scene (such as a traffic intersection), the method improves the optical flow algorithm: it scales the image proportionally to obtain layers at different scales and recursively computes the optical flow on each layer, while using the YOLOv3 target detection convolutional neural network for target acquisition, object feature matching, and frame-skipping detection, so that the whole algorithm can cope with fast-moving objects, acquire their motion tracks, and achieve accurate and stable object matching.
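The improved pyramidal optical flow described above (layers at 1/2, 1/4, 1/8, guess flow g, residual flow d, matrix G, vector b_k, step eta_k) can be sketched end to end in NumPy. This is an illustrative implementation under stated assumptions: 2x2 block averaging as the proportional scaling (the patent does not fix the resampling kernel), bilinear sampling of J, and a normalized weighted average for the 1/e^2 combination (the patent fixes only the upper-layer weight):

```python
import math
import numpy as np

def downsample2(img):
    """Halve resolution by averaging 2x2 blocks (assumed scaling kernel)."""
    h, w = (img.shape[0] // 2) * 2, (img.shape[1] // 2) * 2
    img = img[:h, :w]
    return (img[0::2, 0::2] + img[1::2, 0::2]
            + img[0::2, 1::2] + img[1::2, 1::2]) / 4.0

def build_pyramid(img):
    """Layers L1, L2, L3 at 1/2, 1/4 and 1/8 of the original size."""
    L1 = downsample2(img)
    L2 = downsample2(L1)
    return [L1, L2, downsample2(L2)]

def bilinear(img, x, y):
    """Sample img at non-integer coordinates by bilinear interpolation."""
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    fx, fy = x - x0, y - y0
    return ((1 - fx) * (1 - fy) * img[y0, x0] + fx * (1 - fy) * img[y0, x0 + 1]
            + (1 - fx) * fy * img[y0 + 1, x0] + fx * fy * img[y0 + 1, x0 + 1])

def lk_residual(I, J, u, g, w=3, max_iter=20, eps=1e-3):
    """Iterative solve of the residual flow d at integer point u given the
    guess flow g: d <- d + eta_k with eta_k = G^(-1) b_k, until the step
    eta_k falls below eps."""
    ux, uy = u
    X, Y = np.meshgrid(np.arange(ux - w, ux + w + 1),
                       np.arange(uy - w, uy + w + 1))
    Ix = (I[Y, X + 1] - I[Y, X - 1]) / 2.0   # central-difference gradients
    Iy = (I[Y + 1, X] - I[Y - 1, X]) / 2.0
    G = np.array([[np.sum(Ix * Ix), np.sum(Ix * Iy)],
                  [np.sum(Ix * Iy), np.sum(Iy * Iy)]])
    d = np.zeros(2)
    for _ in range(max_iter):
        dI = I[Y, X] - bilinear(J, X + g[0] + d[0], Y + g[1] + d[1])
        b = np.array([np.sum(dI * Ix), np.sum(dI * Iy)])
        eta = np.linalg.solve(G, b)
        d = d + eta
        if np.hypot(*eta) < eps:
            break
    return d

def pyramidal_flow(I, J, u):
    """Coarse-to-fine flow at point u (original-image coordinates): start
    at L3 and pass 2*(g + d) down as the guess of the next finer layer."""
    pI, pJ = build_pyramid(I), build_pyramid(J)
    g = np.zeros(2)
    for level in (2, 1, 0):                       # L3 -> L2 -> L1
        s = 1.0 / 2 ** (level + 1)                # layer scale: 1/8, 1/4, 1/2
        uL = (int(round(u[0] * s)), int(round(u[1] * s)))
        g = 2.0 * (g + lk_residual(pI[level], pJ[level], uL, g))
    return g                                      # flow at original resolution

def combine_flows(flow_bottom, flow_upper):
    """Step 2 weighting: the upper layer's result enters with weight 1/e^2;
    normalizing the weighted average is an assumption."""
    w = 1.0 / math.e ** 2
    return (np.asarray(flow_bottom) + w * np.asarray(flow_upper)) / (1.0 + w)
```

On a synthetic pair of frames related by a pure 2-pixel horizontal translation, `pyramidal_flow` recovers a flow close to (2, 0).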
As described above, the method of the present invention is not limited to the configurations described; other systems capable of implementing the embodiments of the present invention also fall within its protection scope.
Finally, it should be noted that the above embodiments are only used to illustrate the technical solutions of the present invention and not to limit them. Those skilled in the art should understand that modifications or equivalent substitutions may be made to these technical solutions without departing from their spirit and scope, and all such modifications are covered by the claims of the present invention.
Claims (5)
1. A method for extracting the track of a moving object fusing an improved optical flow and a target detection network, characterized by comprising the following steps:
s1: acquiring videos of road sections or intersections through camera equipment;
s2: detecting moving objects by using the deep learning detection network YOLOv3 in the video acquired in S1, and recording the position and size information of each moving object;
s3: calculating the optical flow information of the moving objects by the improved optical flow method based on the information obtained in S2;
s4: acquiring the motion trajectory of each object from the optical flow information obtained in S3 and the detection frames, and stopping the algorithm when the video ends.
2. The method for extracting a moving object trajectory fusing an improved optical flow and a target detection network as claimed in claim 1, wherein in S2, based on the current frame, the set of position areas of the objects of interest under the current frame, $P^f = \{p_1^f, p_2^f, \ldots, p_n^f\}$, is obtained by detection, wherein f denotes the current frame number, n denotes the number of a moving object under the current frame f, and each element of the set $P^f$ comprises the top-left corner coordinates and the length and width, $[X_f, Y_f, L_f, W_f]$, in the pixel coordinate system, of the circumscribed rectangle of the moving object numbered n under the current frame f.
3. The moving object trajectory extraction method fusing an improved optical flow and a target detection network according to claim 1, wherein in S3 the optical flow information is acquired by the following algorithm:

Step 1: scale the image to [1/2, 1/4, 1/8] of the original size to obtain the L1, L2, and L3 image layers, compute the optical flow of the object layer by layer from the L3 layer to the L1 layer, and take the optical flow result of the upper layer as the initial state of the optical flow calculation on the layer below; the optical flow is calculated as follows:

suppose the coordinates of pixel point u in the current frame are $u = [u_x, u_y]^T$ and that in the next frame the pixel lies at $v = [u_x + d_x, u_y + d_y]^T$, the two frame images being denoted $I(x, y)$ and $J(x, y)$, respectively; the optical flow method assumes that the pixels in a neighborhood $W$ obey the same motion law, and establishes the following error function for optimizing the optical flow $d = [d_x, d_y]^T$:

$$\epsilon(d) = \sum_{x = u_x - w_x}^{u_x + w_x} \; \sum_{y = u_y - w_y}^{u_y + w_y} \big( I(x, y) - J(x + d_x,\, y + d_y) \big)^2$$

where $w_x$ and $w_y$ denote the horizontal and vertical half-widths of the neighborhood of pixel u; rewriting the optical flow loss function for the current pyramid layer L gives:

$$\epsilon^L(d^L) = \sum_x \sum_y \big( I^L(x, y) - J^L(x + g^L_x + d^L_x,\, y + g^L_y + d^L_y) \big)^2$$

where $g^L$ is the guess optical flow, i.e., the initial value of the optical flow for the L-th pyramid layer, and $d^L$ is the residual optical flow of the L-th layer; the residual optical flow $d^L$ is computed as follows:

first, differentiating $\epsilon^L$ with respect to $d^L$ yields:

$$\frac{\partial \epsilon^L}{\partial d^L} = -2 \sum_x \sum_y \big( I^L(x, y) - J^L(x + g^L_x + d^L_x,\, y + g^L_y + d^L_y) \big) \left[ \frac{\partial J^L}{\partial x} \;\; \frac{\partial J^L}{\partial y} \right]$$

where $\partial J^L / \partial x$ and $\partial J^L / \partial y$ denote the derivatives of image $J^L$ with respect to the x and y coordinates; define:

$$\delta I(x, y) = I^L(x, y) - J^L(x + g^L_x,\, y + g^L_y), \qquad \nabla I = \begin{bmatrix} I_x \\ I_y \end{bmatrix}$$

where $\nabla I$ is found by image gradient calculation, i.e.:

$$I_x = \frac{I^L(x + 1, y) - I^L(x - 1, y)}{2}, \qquad I_y = \frac{I^L(x, y + 1) - I^L(x, y - 1)}{2}$$

substituting the first-order Taylor expansion $J^L(x + g^L + d^L) \approx J^L(x + g^L) + \nabla I^T d^L$ into the derivative and transposing both sides gives:

$$\frac{1}{2} \left[ \frac{\partial \epsilon^L}{\partial d^L} \right]^T \approx G\, d^L - b_k, \qquad G = \sum_x \sum_y \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}, \qquad b_k = \sum_x \sum_y \begin{bmatrix} \delta I\, I_x \\ \delta I\, I_y \end{bmatrix}$$

the loss function reaches its optimum where the derivative is 0, i.e., $d^L = G^{-1} b_k$; for pyramid optical flow this is computed iteratively: let $\eta_k = G^{-1} b_k$ and iterate $d^k = d^{k-1} + \eta_k$, the iteration ending once $\|\eta_k\|$ falls below a threshold;

Step 2: acquire the optical flow information of the moving objects: based on the set of moving object position areas acquired by the YOLOv3 algorithm in S2, and according to the down-sampling multiple of the YOLOv3 detection head from which each object detection frame comes, take the detection frames output by the 32-fold, 16-fold, and 8-fold down-sampling detection heads as the tracking start points of the top, second, and third pyramid layers of the optical flow, respectively; obtain the set of detection-frame center points under the current frame, $C^f = \{c_1^f, c_2^f, \ldots, c_n^f\}$, wherein $c_i^f$ denotes the center point coordinates of the moving object numbered i under the current frame f.
4. The method as claimed in claim 3, wherein the optical flow computed from the bottom-layer pyramid optical flow tracking points is weighted with the optical flow result of the upper layer, the weight of the upper-layer result being $1/e^2$.
5. The moving object trajectory extraction method fusing an improved optical flow and a target detection network according to claim 1, wherein in S4 the object motion trajectory is acquired by:
(1) drawing the motion trajectory of each detected object for the next frame f+1 based on the moving object position information and the optical flow information obtained in S2 and S3;
(2) continuously feeding image frames into the YOLOv3 detection network, and performing Kalman filtering on the coordinates of the newly detected object and the object position calculated by the optical flow; taking the pixel points inside the filtered object detection frame as the starting points of the next optical flow prediction, and continuing to calculate and draw the object position and the optical flow for the next frame.
Priority Applications (1)

| Application Number | Priority Date | Filing Date | Title |
| --- | --- | --- | --- |
| CN202210476328.5A | 2022-04-29 | 2022-04-29 | Moving object track extraction method fusing improved optical flow and target detection network |

Publications (1)

| Publication Number | Publication Date |
| --- | --- |
| CN114757977A | 2022-07-15 |

Family ID: 82332349

Events: 2022-04-29, application CN202210476328.5A filed in China (CN); status: pending.

Cited By (4)

| Publication number | Priority date | Publication date | Assignee | Title |
| --- | --- | --- | --- | --- |
| CN115240471A | 2022-08-09 | 2022-10-25 | 东揽(南京)智能科技有限公司 | Intelligent factory collision avoidance early warning method and system based on image acquisition |
| CN115240471B | 2022-08-09 | 2024-03-01 | 东揽(南京)智能科技有限公司 | Intelligent factory collision avoidance early warning method and system based on image acquisition |
| CN116523951A | 2023-07-03 | 2023-08-01 | 瀚博半导体(上海)有限公司 | Multi-layer parallel optical flow estimation method and device |
| CN116523951B | 2023-07-03 | 2023-09-05 | 瀚博半导体(上海)有限公司 | Multi-layer parallel optical flow estimation method and device |
Similar Documents

| Publication | Title |
| --- | --- |
| CN109740465B | Lane line detection algorithm based on example segmentation neural network framework |
| CN111460926B | Video pedestrian detection method fusing multi-target tracking clues |
| CN111368687B | Sidewalk vehicle illegal parking detection method based on target detection and semantic segmentation |
| CN110178167B | Intersection violation video identification method based on cooperative relay of cameras |
| Sudha et al. | An intelligent multiple vehicle detection and tracking using modified vibe algorithm and deep learning algorithm |
| CN109753913B | Multi-mode video semantic segmentation method with high calculation efficiency |
| CN108921875A | Real-time traffic flow detection and tracking method based on aerial photography data |
| CN110222667B | Open road traffic participant data acquisition method based on computer vision |
| CN114757977A | Moving object track extraction method fusing improved optical flow and target detection network (the present application) |
| US20190114753A1 | Video background removal method |
| Nassu et al. | A vision-based approach for rail extraction and its application in a camera pan-tilt control system |
| Li et al. | A real-time vehicle detection and tracking system in outdoor traffic scenes |
| CN112766056B | Method and device for detecting lane lines in low-light environment based on deep neural network |
| CN109919026A | Unmanned surface vehicle local path planning method |
| Brebion et al. | Real-time optical flow for vehicular perception with low- and high-resolution event cameras |
| CN111079675A | Driving behavior analysis method based on target detection and target tracking |
| CN107506753B | Multi-vehicle tracking method for dynamic video monitoring |
| CN110443142B | Deep learning vehicle counting method based on road surface extraction and segmentation |
| Cheng et al. | Semantic segmentation of road profiles for efficient sensing in autonomous driving |
| Soh et al. | Analysis of road image sequences for vehicle counting |
| Roy et al. | A comprehensive survey on computer vision based approaches for moving object detection |
| CN112801021B | Method and system for detecting lane line based on multi-level semantic information |
| CN111950551B | Target detection method based on convolutional neural network |
| CN115147450B | Moving target detection method and detection device based on motion frame difference image |
| Cheng et al. | Sequential semantic segmentation of road profiles for path and speed planning |
Legal Events

| Date | Code | Title |
| --- | --- | --- |
| | PB01 | Publication |
| | SE01 | Entry into force of request for substantive examination |