CN114663469A - Target object tracking method and device, electronic equipment and readable storage medium - Google Patents

Target object tracking method and device, electronic equipment and readable storage medium

Info

Publication number
CN114663469A
Authority
CN
China
Prior art keywords
target object
image
detected
time point
attribute feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011538842.4A
Other languages
Chinese (zh)
Inventor
章恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fengtu Technology Shenzhen Co Ltd
Original Assignee
Fengtu Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fengtu Technology Shenzhen Co Ltd filed Critical Fengtu Technology Shenzhen Co Ltd
Priority to CN202011538842.4A priority Critical patent/CN114663469A/en
Publication of CN114663469A publication Critical patent/CN114663469A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/22 Matching criteria, e.g. proximity measures
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks
    • G06N 3/08 Learning methods
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning
    • G06T 2207/20084 Artificial neural networks [ANN]
    • G06T 2207/30 Subject of image; Context of image processing
    • G06T 2207/30236 Traffic on road, railway or crossing
    • G06T 2207/30241 Trajectory

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Multimedia (AREA)
  • Traffic Control Systems (AREA)
  • Image Analysis (AREA)

Abstract

The application provides a target object tracking method, a target object tracking device, an electronic device and a computer-readable storage medium. According to the method and the device, an image to be detected containing a road and acquired by a floating vehicle is obtained; after a first target object is detected, optical flow matching processing is performed on the image to be detected to determine the relative movement speed of the first target object and the floating vehicle; and then a first passing position of the first target object at a first future time point can be predicted according to the relative movement speed and the image to be detected. Therefore, the target tracking method provided by the application can predict the first passing position of the first target object and, by combining the first passing position and the relative movement speed, can calculate the accurate time point at which the first target object reaches the first passing position on its motion track, namely the first future time point, thereby improving the accuracy of tracking the target object.

Description

Target object tracking method and device, electronic equipment and readable storage medium
Technical Field
The present application relates to the field of trajectory prediction technologies, and in particular, to a target tracking method and apparatus, an electronic device, and a computer-readable storage medium.
Background
In practice, violations such as intentionally damaging the enclosure, pedestrians and non-motor vehicles illegally breaking into the enclosed expressway, and passengers getting on or off vehicles near inspection stations and walking on the expressway occur from time to time. With conventional supervision means and limited patrol police forces, the traffic management department cannot find, handle and rescue these violations in real time. Therefore, a method capable of predicting the motion trajectory of an illegal object such as a pedestrian or a non-motor vehicle, so as to predict its real-time position, is urgently needed.
The existing prediction technology generally determines the probability of going to each candidate track point under the current motion track according to the historical motion track and the current motion track of the illegal targets such as pedestrians, non-motor vehicles and the like so as to predict the positions of the illegal targets such as pedestrians, non-motor vehicles and the like.
However, in the prior art, only the possible arrival positions of the illegal targets such as pedestrians and non-motor vehicles can be predicted, and the time points of arrival of the illegal targets such as pedestrians and non-motor vehicles at specific places cannot be predicted accurately, so that the accuracy in tracking the targets is not high.
Disclosure of Invention
The application provides a target object tracking method, a target object tracking device, electronic equipment and a computer readable storage medium, and aims to solve the problems that the time point when a target such as a pedestrian, a non-motor vehicle and the like reaches a specific place in a current motion track cannot be accurately predicted, and the target tracking precision is low.
In a first aspect, the present application provides a target object tracking method, the method comprising:
acquiring an image to be detected containing a road, wherein the image to be detected is acquired through a preset floating vehicle;
detecting a first target object in the image to be detected;
performing optical flow matching processing on the image to be detected, and determining the relative movement speed of the first target object and the floating vehicle;
and predicting a first passing position of the first target object at a first future time point according to the relative movement speed and the image to be detected.
In a second aspect, the present application provides a target object tracking apparatus, the apparatus comprising:
the acquisition unit is used for acquiring an image to be detected containing a road, wherein the image to be detected is acquired through a preset floating vehicle;
the detection unit is used for detecting a first target object in the image to be detected;
the determining unit is used for carrying out optical flow matching processing on the image to be detected and determining the relative movement speed of the first target object and the floating vehicle;
and the prediction unit is used for predicting a first passing position of the first target object at a first future time point according to the relative movement speed and the image to be detected.
In a third aspect, the present application further provides an electronic device, where the electronic device includes a processor and a memory, where the memory stores a computer program, and the processor executes the steps in any one of the target object tracking methods provided in the present application when calling the computer program in the memory.
In a fourth aspect, the present application further provides a computer-readable storage medium having a computer program stored thereon, where the computer program is loaded by a processor to execute the steps of the target object tracking method.
According to the method and the device, the images to be detected, which are acquired by the floating vehicle and comprise the road, are acquired, the optical flow matching processing is carried out on the images to be detected after the first target object is detected, the relative movement speed of the first target object and the floating vehicle is determined, and then the first passing position of the first target object at the first future time point can be predicted according to the relative movement speed and the images to be detected. Therefore, the target tracking method provided by the application can predict the first passing position of the first target object, and can determine the accurate time point when the first target object reaches the first passing position, namely the first future time point, by combining the first passing position and the relative speed, so that the accuracy of tracking the target object is improved.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings needed to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the following description are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings based on these drawings without creative efforts.
FIG. 1 is a schematic view of a target object tracking system according to an embodiment of the present disclosure;
FIG. 2 is a schematic flow chart of a target object tracking method according to an embodiment of the present disclosure;
FIG. 3 is a schematic illustration of a road condition at which a first target object is currently located as provided in an embodiment of the present application;
FIG. 4 is a schematic diagram of an image to be detected provided in an embodiment of the present application;
FIG. 5 is a schematic view of a scenario of a traversable location provided in an embodiment of the present application;
FIG. 6 is a schematic flowchart of an embodiment of step S40 provided in the embodiments of the present application;
FIG. 7 is a schematic structural diagram of an embodiment of a target object tracking apparatus provided in the embodiments of the present application;
fig. 8 is a schematic structural diagram of an embodiment of an electronic device provided in an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application will be clearly and completely described below with reference to the drawings in the embodiments of the present application, and it is obvious that the described embodiments are only a part of the embodiments of the present application, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present application.
In the description of the embodiments of the present application, it should be understood that the terms "first", "second", and the like are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implying any number of technical features indicated. Thus, features defined as "first", "second", may explicitly or implicitly include one or more of the described features. In the description of the embodiments of the present application, "a plurality" means two or more unless specifically defined otherwise.
The following description is presented to enable any person skilled in the art to make and use the application. In the following description, details are set forth for the purpose of explanation. It will be apparent to one of ordinary skill in the art that the present application may be practiced without these specific details. In other instances, well-known processes have not been described in detail so as not to obscure the description of the embodiments of the present application with unnecessary detail. Thus, the present application is not intended to be limited to the embodiments shown, but is to be accorded the widest scope consistent with the principles and features disclosed in the embodiments herein.
The embodiments of the present application provide a target object tracking method and apparatus, an electronic device, and a computer-readable storage medium. The target object tracking apparatus may be integrated in an electronic device, and the electronic device may be a server or a terminal.
First, before describing the embodiments of the present application, the related contents of the embodiments of the present application with respect to the application context will be described.
In the field of target tracking, an important application scenario is that illegal target objects on a road are tracked and reported to a real-time position, and trackers including police and traffic police can be deployed in advance according to the real-time position to take measures for the illegal target objects.
However, current tracking methods only predict the final position by combining the motion information of the target object acquired by the image acquisition system with the historical motion track of the target object, and cannot obtain the accurate time point at which the target object reaches that position. Therefore, when such a method is used to track an illegal target object, it cannot be determined how far in advance interception should be deployed; the deployment may take place more than ten hours before the target object actually arrives at the final position, which wastes manpower and material resources.
Based on the above defects in the related art, the embodiments of the present application provide a target object tracking method, which overcomes the defects in the related art at least to some extent.
An execution subject of the target object tracking method according to the embodiments of the present application may be the target object tracking apparatus provided in the embodiments of the present application, or different types of electronic devices such as a server device, a physical host, or User Equipment (UE) integrated with the target object tracking apparatus, where the target object tracking apparatus may be implemented in a hardware or software manner, and the UE may specifically be a terminal device such as a smart phone, a tablet computer, a notebook computer, a palm computer, a desktop computer, or a Personal Digital Assistant (PDA).
The electronic device can adopt a working mode of independent operation or a working mode of a device cluster, and by applying the target object tracking method provided by the embodiment of the application, the prediction error caused when the target object changes the movement speed can be reduced, the accurate time point when the first target object reaches the first passing position can be determined, and the accuracy of tracking the target object is improved.
Referring to fig. 1, fig. 1 is a schematic view of a scene of a target object tracking system according to an embodiment of the present disclosure. The target object tracking system may include an electronic device 100, and a target object tracking apparatus is integrated in the electronic device 100. For example, the electronic device may acquire an image to be detected including a road acquired by a floating vehicle, perform optical flow matching processing on the image to be detected after detecting a first target object, determine a relative movement speed of the first target object and the floating vehicle, and predict a first passing position of the first target object at a first future time point according to the relative movement speed and the image to be detected.
In addition, as shown in fig. 1, the target object tracking system may further include a memory 200 for storing data, such as image data and video data.
It should be noted that the scene schematic diagram of the target object tracking system shown in fig. 1 is only an example. The target object tracking system and the scene described in the embodiments of the present application are intended to illustrate the technical solutions of the embodiments more clearly and do not constitute a limitation on the technical solutions provided herein. As a person of ordinary skill in the art knows, with the evolution of the target object tracking system and the appearance of new service scenes, the technical solutions provided in the embodiments of the present application are also applicable to similar technical problems.
In the following, an electronic device is used as an executing subject, and the executing subject is omitted in the following method embodiments for simplicity and convenience of description.
Referring to fig. 2, fig. 2 is a schematic flowchart of a target object tracking method according to an embodiment of the present disclosure. It should be noted that, although a logical order is shown in the flow chart, in some cases, the steps shown or described may be performed in an order different than that shown or described herein. The target object tracking method includes steps S10 to S40, in which:
s10, acquiring an image to be detected containing a road, wherein the image to be detected is acquired through a preset floating vehicle.
In the embodiment of the application, the floating car method is adopted to obtain the image to be detected. The floating vehicle method generally refers to mounting a camera on a moving motor vehicle, and when the motor vehicle runs on a road, regularly acquiring images including the road through the camera, wherein the acquired images are to-be-detected images.
Wherein the road may be any type of road. In particular, roads may include, but are not limited to, highways, expressways, and urban roads.
The image to be detected may only include a specific lane (such as an emergency lane of a highway) of all lanes of the road, or may include all lanes of the road. For example, the image to be detected may only include an emergency lane in the highway, or may also include a normal driving lane and an emergency lane in the highway.
In some embodiments, only objects in a specific lane in the road need to be tracked, and in this case, the image to be detected may only contain the specific lane. For example, to track a vehicle in an emergency lane on a highway, the image to be detected may now contain only the emergency lane.
In some embodiments, the object to be tracked may appear in any lane, in which case the image to be detected contains all lanes. For example, a pedestrian temporarily getting off the vehicle on the expressway may move in an emergency lane or may enter a normal driving lane, and the image to be detected needs to include the emergency lane and the normal driving lane.
Specifically, in practical application, the electronic device to which the target object tracking method provided by the embodiment of the present application is applied may directly include, on hardware, a camera (the camera is mainly used for collecting an image to be detected) arranged on a floating vehicle, and locally store an image obtained by shooting with the camera, and may directly read the image inside the electronic device; or the electronic equipment can also establish network connection with the camera and acquire the image obtained by the camera on line from the camera according to the network connection; alternatively, the electronic device may also read the image captured by the camera from a related storage medium storing the image captured by the camera, and the specific acquisition mode is not limited herein.
The camera can shoot images according to a preset shooting mode, for example, the shooting height, the shooting direction or the shooting distance can be set, the specific shooting mode can be adjusted according to the camera, and the camera is not limited specifically. The multi-frame images shot by the camera can form a video through a time line.
And S20, detecting the first target object in the image to be detected.
The first target object refers to an object in the road, and may specifically refer to an object that has an illegal action or is desired to be tracked. The first target object may be of various types, and may be specifically set according to an actual application scenario, for example, the first target object may be a normal pedestrian, an operator, a motor vehicle, a non-motor vehicle, or a work vehicle.
In some embodiments, the first target object may be an illegal moving target on a highway, such as a pedestrian, a non-motor vehicle, or the like. For example, in the case of detecting pedestrians and non-motor vehicles illegally intruding into a highway, the first target object refers to the pedestrians and non-motor vehicles appearing in the image to be detected.
In some embodiments, the first target object may also be a motor vehicle to be tracked on a highway. For example, when a speed measuring device on a road measures that a motor vehicle with the license plate number of Yue A123456 runs at an overspeed, in order to track and position the real-time position of the vehicle for traffic police interception, the motor vehicle with the license plate number of Yue A123456 is set as a first target object.
In some embodiments, a trained object detection network may be used to detect a first target object in an image to be detected.
The object detection network (if not specifically indicated, the object detection network refers to a trained object detection network) may include a feature extraction layer and a detection layer.
And the characteristic extraction layer is used for outputting the image characteristics of the image to be detected according to the image to be detected. The feature extraction layer takes the image to be detected as input, and performs one or more operations including but not limited to convolution, pooling and the like on the image to be detected so as to extract the image to be detected and obtain the image features of the image to be detected.
And the detection layer is used for predicting the image type of the image to be detected according to the image characteristics of the image to be detected, and detecting the area of the first target object in the image to be detected if the image type predicted by the image to be detected is a preset type (wherein the preset type is used for indicating that the image to be detected contains the first target object). The detection layer takes the image characteristics of the image to be detected as input, and carries out detection processing according to the image characteristics of the image to be detected to obtain a first target object.
For example, taking the first target object as a pedestrian who illegally enters the highway as an example, the specific process of "detecting the first target object in the image to be detected" may include:
(1) inputting the image to be detected into the trained object detection network, enabling the image to be detected to pass through a feature extraction layer in the object detection network, performing one or more operations including but not limited to convolution, pooling and the like on the image to be detected, and acquiring the image features of the image to be detected.
(2) And enabling the image characteristics of the image to be detected to pass through a detection layer in a trained object detection network, performing prediction processing on the image to be detected according to the image characteristics of the image to be detected, predicting the image category of the image to be detected, and detecting the pedestrian in the image to be detected if the image category predicted by the image to be detected is a preset category, namely the pedestrian.
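As an illustrative sketch only, the following code shows how such a two-stage detection (feature extraction followed by detection of a preset category) might look in practice. The use of torchvision's Faster R-CNN, the COCO person class index and the score threshold are assumptions for demonstration and are not the specific object detection network of this application.

```python
import torch
import torchvision
from torchvision.transforms import functional as F
from PIL import Image

# Assumption: a pretrained torchvision detector stands in for the trained
# object detection network (feature extraction layer + detection layer).
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights="DEFAULT")
model.eval()

PERSON_CLASS_ID = 1    # COCO label index for "person" (assumed preset category)
SCORE_THRESHOLD = 0.6  # assumed confidence threshold

def detect_first_target(image_path):
    """Return bounding boxes of candidate first target objects (pedestrians)."""
    image = Image.open(image_path).convert("RGB")
    tensor = F.to_tensor(image)            # convert the image to a [0, 1] tensor
    with torch.no_grad():
        output = model([tensor])[0]        # dict with "boxes", "labels", "scores"
    keep = (output["labels"] == PERSON_CLASS_ID) & (output["scores"] > SCORE_THRESHOLD)
    return output["boxes"][keep]           # empty tensor if no pedestrian is found

# boxes = detect_first_target("image_to_be_detected.jpg")
```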
In some embodiments, the trained object detection network may be obtained by:
and A1, collecting a sample set in advance.
The sample set comprises a large number of sample images, and the sample images used for training comprise images containing sample objects and images not containing sample objects. For example, in order that the object detection network can detect a pedestrian, the pedestrian is taken as the sample object, and a large number of images with and without pedestrians are acquired as sample images.
In some embodiments, data enhancement may be performed on the sample set. Specifically, sample images collected in different devices, different time periods and different scenes can be labeled (whether the images contain sample objects and detection frames of the sample objects are labeled), and the data of the sample set is enhanced by using methods such as mosaic, rotation, cutting and splicing, random erasing, color dithering and the like.
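A minimal sketch of such data enhancement is given below, assuming torchvision transforms; mosaic and stitching are omitted, the parameter values are illustrative, and in a real detection task the labeled detection boxes would have to be transformed together with the images.

```python
import torchvision.transforms as T

# Assumed augmentation pipeline illustrating rotation, cropping, colour jitter
# and random erasing on sample images; mosaic and stitching would be handled
# separately at the dataset level.
augment = T.Compose([
    T.RandomRotation(degrees=10),
    T.RandomResizedCrop(size=(608, 608), scale=(0.8, 1.0)),
    T.ColorJitter(brightness=0.2, contrast=0.2, saturation=0.2, hue=0.05),
    T.ToTensor(),
    T.RandomErasing(p=0.5),
])

# augmented = augment(sample_image)  # sample_image: a PIL image from the sample set
```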
And A2, constructing a preliminary object detection network.
The preset object detection network and the trained object detection network have similar network structures and functions, and reference may be made to the above description specifically, and details are not described herein for simplifying the description. Namely, the preset object detection network may also include a feature extraction layer and a detection layer.
For example, an open source network (such as YOLOv4 network) with model parameters as default values (available for detection tasks) may be used as the preset object detection network. The open source detection task network generally comprises a feature extraction layer and a detection layer, and the feature extraction layer and the detection layer of the open source detection task network can be adopted and respectively used as the feature extraction layer and the detection layer of the preset object detection network.
And A3, training the preset object detection network by using the sample set until the preset object detection network is converged to obtain the trained object detection network.
The training process of the object detection network is similar to the training process of the existing detection network, and for parts which are not described in detail, the existing training mode of the network model can be referred.
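For illustration, a minimal training loop in the style described above might look as follows, assuming the torchvision detection API as a stand-in for the preset object detection network; the dataset, number of classes and hyperparameters are placeholders.

```python
import torch
import torchvision

# Sketch of step A3, assuming torchvision's detection API as the preset object
# detection network; two classes (background + sample object) and the optimizer
# settings are placeholders.
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(weights=None, num_classes=2)
optimizer = torch.optim.SGD(model.parameters(), lr=0.005, momentum=0.9)

def train_one_epoch(data_loader):
    model.train()
    for images, targets in data_loader:     # targets: list of {"boxes": ..., "labels": ...}
        loss_dict = model(images, targets)  # in train mode the model returns a loss dict
        loss = sum(loss_dict.values())
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

# Training would be repeated epoch by epoch until the losses converge.
```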
S30, carrying out optical flow matching processing on the image to be detected, and determining the relative movement speed of the first target object and the floating vehicle.
Wherein, the optical flow refers to the instantaneous speed of the pixel motion of a space motion object on an image. Optical flow matching is a method for acquiring the motion state of an object, and the principle is as follows:
and obtaining an optical flow vector, namely a motion vector, of the pixel point at the position by using the gray value change of the pixel point at the single position in the image, wherein the motion vector comprises the motion speed and the motion direction of the pixel point at the position. And then obtaining motion vectors of pixel points at all positions by the same method, obtaining all pixel points of an object in the image, and obtaining the motion state of the object by the motion vectors of all the pixel points. Therefore, the optical flow matching can obtain not only the moving speed of the object but also the moving direction of the object.
Specifically, a grayscale image of an image to be detected is first acquired.
After an image to be detected containing a first target object is obtained (wherein the image to be detected contains N continuous images in a time sequence), gray processing is respectively carried out on the N images to be detected, and N gray images corresponding to the N images to be detected are obtained.
Then, the gray value of each pixel point in the N gray images is acquired according to the N gray images. I(x_i, y_i, t_i) denotes the gray value, at time t_i, of the pixel in the y-th row and x-th column of the i-th gray image, where 1 ≤ i ≤ N.
Then, according to the i-th gray image and the (i+1)-th gray image, the optical flow constraint equation corresponding to the pixel point (x_i, y_i) is obtained.
Assuming that the brightness among the N images to be detected is constant, the pixel point (x_i, y_i) in the i-th image to be detected and the pixel point (x_{i+1}, y_{i+1}) in the (i+1)-th image to be detected have the same gray value, where 1 ≤ i ≤ N. Since the acquisition time point of the (i+1)-th image lags behind that of the i-th image, the following equation (1) can be derived:
I(x_i, y_i, t_i) = I(x_i, y_i, t_i) + I_xi·u_i·dt + I_yi·v_i·dt + I_ti·dt    equation (1)
where the right-hand side of the equation is the first-order Taylor-series approximation of the gray value of the pixel point (x_{i+1}, y_{i+1}) in the (i+1)-th gray image. I_xi is the partial derivative of I(x_i, y_i, t_i) with respect to x, I_yi is the partial derivative of I(x_i, y_i, t_i) with respect to y, and I_ti is the partial derivative of I(x_i, y_i, t_i) with respect to t_i. u_i = dx/dt is the instantaneous speed of the pixel point (x_i, y_i) along the direction in which the pixel points (x_i, y_i), (x_{i+1}, y_i), (x_{i+2}, y_i) are aligned, and v_i = dy/dt is the instantaneous speed of the pixel point (x_i, y_i) along the direction in which the pixel points (x_i, y_i), (x_i, y_{i+1}), (x_i, y_{i+2}) are aligned.
For example, suppose the first gray image contains pixel point A (1, 1) in the first row and first column, pixel point B (1, 2) in the first row and second column, and pixel point C (1, 3) in the first row and third column. In this case u_1 represents the instantaneous speed of pixel point A along the direction A-B-C.
By rearranging equation (1), the following optical flow constraint equation for the pixel point (x_i, y_i) is obtained:
I_xi·u_i + I_yi·v_i + I_ti = 0    equation (2)
Then, the relative movement speed of each pixel point is obtained from the optical flow constraint equation of that pixel point in the image to be detected.
From the optical flow constraint equation of the pixel point (x_i, y_i), the motion vector of the pixel point (x_i, y_i) can be determined as (u_i, v_i). Similarly, the motion vector of every pixel point on the whole image to be detected, that is, the relative movement speed and movement direction of each pixel point, can be obtained in the same way.
There are many kinds of optical flow matching algorithms. In some embodiments, a Lucas-Kanade optical flow matching method (hereinafter referred to as L-K method) may be employed to obtain the relative movement speed and movement direction of the first target object.
The L-K method can be combined with optical flow vector information of a plurality of adjacent pixel points, and ambiguity in an optical flow equation can be eliminated. Compared with a point-by-point calculation method, the L-K method is insensitive to image noise and is more stable in actual representation, so that in an actual application scene, the L-K method is usually adopted to perform optical flow matching processing on an image to be detected to obtain the relative movement speed and the movement direction of the first target object.
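A minimal sketch of step S30 using OpenCV's pyramidal L-K implementation is shown below; the frame interval and the pixel-to-metre scale are assumptions needed to convert pixel displacements into a relative movement speed, and in practice the tracked points would be restricted to the region of the first target object.

```python
import cv2
import numpy as np

# Sketch of step S30 with OpenCV's pyramidal Lucas-Kanade matcher; dt_seconds
# (interval between the two frames) and metres_per_pixel (ground-plane scale)
# are assumptions needed to turn pixel displacements into a speed.
def relative_motion(prev_bgr, next_bgr, dt_seconds, metres_per_pixel):
    prev_gray = cv2.cvtColor(prev_bgr, cv2.COLOR_BGR2GRAY)
    next_gray = cv2.cvtColor(next_bgr, cv2.COLOR_BGR2GRAY)
    p0 = cv2.goodFeaturesToTrack(prev_gray, maxCorners=200, qualityLevel=0.01, minDistance=7)
    p1, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, next_gray, p0, None)
    ok = status.flatten() == 1
    flow = (p1[ok] - p0[ok]).reshape(-1, 2)          # optical-flow vectors in pixels
    mean_flow = flow.mean(axis=0)                    # average motion vector (dx, dy)
    speed = np.linalg.norm(mean_flow) * metres_per_pixel / dt_seconds
    direction = mean_flow / (np.linalg.norm(mean_flow) + 1e-9)
    return speed, direction                          # relative movement speed and direction
```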
S40, predicting a first passing position of the first target object at a first future time point according to the relative movement speed and the image to be detected.
The first future time point is the time point at which the first target object is predicted to appear at the first passing position.
Specifically, in some embodiments, road recognition detection may be further performed on the image to be detected, so as to detect the road condition (e.g., how many roads there are and the direction of each road) at the current position of the first target object. The travel track of the first target object is then determined based on the road condition at the current position of the first target object and the movement direction of the first target object determined in step S30. For example, as shown in fig. 3, fig. 3 is a schematic diagram of the road condition at the current position of the first target object provided in the embodiment of the present application. The first target object is currently oriented toward road 1, so road 1 may be taken as the travel track of the first target object.
In some embodiments, the GPS position of the floating vehicle at the time of acquiring the image to be detected may also be acquired, and road conditions within a preset range (e.g., within 20 meters) of the GPS position, such as whether a bifurcation road exists or not, may be detected in a preset map. And determines the travel track of the first target object based on the road condition at the position where the first target object is currently located and the movement direction of the first target object determined in step S30.
In order to achieve more accurate tracking, after determining the travel track of the first target object, it can be further predicted at which specific position and at which time point the first target object will appear. Specifically, the method comprises the following steps:
in some embodiments, the first future time point may be used as a known condition to predict a first passing position where the first target object arrives at the first future time point, which will be described in detail in steps S41-S44 below and will not be described herein.
In some embodiments, cameras are provided in the roadway at fixed distances. The image of the first target object is thus captured by the camera arranged on the road. Since the travel locus of the first target object is determined, a position on the travel locus of the first target object at which the camera is previously set may be set as the first passing position. At this time, a first future point in time at which the first target object reaches the camera (i.e., the first passing position) may be predicted using the first passing position as a first known condition. Specifically, the prediction step may be as follows:
(1) and acquiring the movement speed of the floating car, and acquiring the movement speed of the first target object according to the relative movement speed and the movement speed of the floating car.
(2) And when acquiring the image to be detected, acquiring the position of the first target object and the acquisition time point of the image to be detected.
(3) And determining the movement distance of the first target object according to the position of the first target object and the first passing position.
(4) And determining the movement time length of the first target object according to the movement speed of the first target object and the movement distance of the first target object.
(5) And determining a first future time point according to the acquisition time point of the image to be detected and the motion time of the first target object. At this time, the first future time point is a time point obtained by adding the motion duration of the first target object to the acquisition time point of the image to be detected. For example, the acquisition time point of the image to be detected is 6:30am, the movement duration of the first target object is 2 hours, and the first future time point is 8:30 am.
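The five steps above reduce to simple arithmetic; a hedged sketch is given below, where the sign convention of the relative speed, the unit choices and all numeric values are assumptions.

```python
from datetime import datetime, timedelta

# Sketch of the five steps above; the sign convention of the relative speed
# and all numeric values are assumptions.
def predict_first_future_time(capture_time, relative_speed_kmh, floating_car_speed_kmh,
                              object_position_km, first_passing_position_km):
    # (1) movement speed of the first target object
    object_speed_kmh = floating_car_speed_kmh + relative_speed_kmh
    # (3) movement distance from the current position to the first passing position
    distance_km = abs(first_passing_position_km - object_position_km)
    # (4) movement duration of the first target object
    duration_h = distance_km / object_speed_kmh
    # (5) first future time point = acquisition time point + movement duration
    return capture_time + timedelta(hours=duration_h)

# Example matching the text: captured at 6:30 am, 2 h movement duration -> 8:30 am
arrival = predict_first_future_time(datetime(2021, 1, 1, 6, 30), relative_speed_kmh=-50,
                                    floating_car_speed_kmh=60, object_position_km=0,
                                    first_passing_position_km=20)
```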
Further, in order to predict and track the first target object more accurately, a camera may be provided at a road bifurcation.
According to the method and the device, the image to be detected including the road and acquired by the floating vehicle is obtained, after the first target object is detected, the optical flow matching processing is carried out on the image to be detected, the relative movement speed of the first target object and the floating vehicle is determined, and then the first passing position of the first target object at the first future time point can be predicted according to the relative movement speed and the image to be detected. Therefore, the target tracking method provided by the embodiment of the application can predict the first passing position of the first target object, and can calculate the accurate time point when the first passing position on the motion track is reached, namely the first future time point, by combining the first passing position and the relative motion speed, so that the precision in tracking the target object is improved.
However, in an actual scene, due to the performance influence of the image acquisition device installed on the floating car, or because the installation position is far away from the shooting position, the floating car is difficult to acquire a high-definition image to be detected, and the first target object is easy to be misjudged. In order to improve the detection accuracy of the first target object, in some embodiments, the step S20 may specifically include the following steps B1 to B4:
and B1, performing feature extraction on the image to be detected to obtain the image features of the image to be detected.
In some embodiments, steps B1-B3 may be implemented with a deep-learned attribute detection network.
The attribute detection network may include: the device comprises a feature extraction module, a segmentation module and a detection module, wherein:
and the characteristic extraction module is used for outputting the image characteristics of the image to be detected according to the image to be detected. The feature extraction layer takes the image to be detected as input, and performs one or more operations including but not limited to convolution, pooling and the like on the image to be detected so as to extract the image to be detected and obtain the image features of the image to be detected.
And the segmentation module is used for predicting the image type of the image to be detected according to the image characteristics of the image to be detected, and segmenting the area of the object in the image to be detected if the image type predicted by the image to be detected is a preset type (wherein the preset type is used for indicating that the image to be detected contains the object). The segmentation module takes the image characteristics of the image to be detected as input, and performs segmentation processing according to the image characteristics of the image to be detected to obtain an object in the image to be detected.
The detection module is used for further extracting features according to the area where the object is located to obtain the image features of the object; and predicting according to the image characteristics of the object, and determining whether the region of the object has the characteristic vector of the first attribute characteristic and the characteristic vector of the second attribute characteristic. Namely, a detection module of the attribute detection network can be called to perform feature discrimination on the attribute to be identified, so as to obtain the image feature of the attribute to be identified.
At this time, a feature extraction module of the attribute detection network can be called to perform feature extraction on the image to be detected, so as to obtain the image features of the image to be detected.
In some embodiments, the attribute detection network may be obtained by training a pre-constructed detection network through a large amount of sample data sets (sample data marks information about whether an object is included, a bounding box or a detection box of the included object, and first attribute features and second attribute features of the included object, and the like). The specific training process is similar to the training process of the existing segmentation and detection task network, and for the part of the training process which is not detailed, the existing network training mode can be referred.
The pre-constructed detection network is similar in function and structure to the attribute detection network, and may also include: a feature extraction module, a segmentation module and a detection module.
And B2, performing segmentation processing according to the image characteristics to obtain the object in the image to be detected.
The object may specifically refer to a pedestrian, a motor vehicle, a non-motor vehicle, a sign or a guard rail, etc. in the image to be detected.
Correspondingly, at this time, a segmentation module of the attribute detection network can be called to perform segmentation processing according to the image characteristics of the image to be detected, so as to obtain the object in the image to be detected.
And B3, detecting whether the object has the first attribute characteristic and the second attribute characteristic.
Wherein the first attribute feature and the second attribute feature are two different parts of the first target object. In particular, the first attribute feature and the second attribute feature differ for different first target objects. For example, when the first target object is a pedestrian, the first attribute feature and the second attribute feature may be different parts of the pedestrian, such as shoes and hair. As another example, when the first target object is a motor vehicle, the first attribute feature and the second attribute feature may be different portions of the motor vehicle, such as doors and tires.
Correspondingly, at this time, a detection module of the attribute detection network can be called to detect whether the pixel points of the first attribute feature and the pixel points of the second attribute feature exist in the area where the object is located, so as to determine whether the object has the first attribute feature and the second attribute feature. If the pixel point of the first attribute characteristic exists in the area where the object is located, determining that the first attribute characteristic exists in the object; and if the pixel point of the first attribute characteristic does not exist in the area of the object, determining that the first attribute characteristic does not exist in the object. If the pixel point of the second attribute characteristic exists in the area of the object, determining that the second attribute characteristic exists in the object; and if the pixel point of the second attribute characteristic does not exist in the area of the object, determining that the second attribute characteristic does not exist in the object.
And B4, when the object is detected to have the first attribute feature and the second attribute feature, the object is taken as the first target object.
When the object is detected to have both the first attribute feature and the second attribute feature, the object is taken as the first target object. When the object is detected to have only the first attribute feature, only the second attribute feature, or neither of the two, it is determined that the first target object is not present in the image to be detected; the object is discarded and the next image to be detected is detected.
Taking the detection of a pedestrian violating the expressway as an example (i.e., the first target object is a pedestrian), in this case, the first attribute feature and the second attribute feature to be detected may be hair and shoes, respectively, and if the object detected in the step B2 has the first attribute feature (i.e., hair) and the second attribute feature (i.e., shoes), it is determined that the object detected in the step B2 is the first target object (i.e., a pedestrian).
By detecting whether the object in the image to be detected has the inherent attribute characteristics of the first target object according to the inherent attribute characteristics of the first target object, and determining the object having the inherent attribute characteristics of the first target object as the first target object, the detection accuracy of the first target object can be improved to a certain extent.
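A minimal sketch of the presence check in steps B3 and B4 is given below, assuming the attribute detection network outputs a per-pixel label map for the region where the object is located; the label values are assumptions.

```python
import numpy as np

# Sketch of steps B3-B4, assuming a per-pixel label map for the region where
# the object is located (0 = background, 1 = first attribute, e.g. hair,
# 2 = second attribute, e.g. shoes).
FIRST_ATTRIBUTE, SECOND_ATTRIBUTE = 1, 2

def is_first_target_object(attribute_label_map: np.ndarray) -> bool:
    has_first = np.any(attribute_label_map == FIRST_ATTRIBUTE)
    has_second = np.any(attribute_label_map == SECOND_ATTRIBUTE)
    # only an object carrying both attribute features is kept as the first target object
    return bool(has_first and has_second)
```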
However, when the first target object is detected by the target detection method described in steps B1 to B4, the first attribute feature and the second attribute feature are likely to be detected incorrectly, for example, an object feature other than shoes is detected as shoes.
In order to further improve the detection accuracy of the first target object, in some scenarios, the position relationship between the first attribute feature and the second attribute feature may be further combined when the first target object is detected by using the above steps B1 to B4. In this case, the step B4 may specifically include the following steps B41-B42, in which:
b41, when the object is detected to have the first attribute feature and the second attribute feature, detecting the target position relation between the first attribute feature and the second attribute feature.
The target position relation refers to the spatial position relation of the first attribute feature and the second attribute feature. The spatial relationship may have a variety of manifestations, for example, the spatial relationship may include: first attribute feature above second attribute feature, first attribute feature below second attribute feature and first attribute feature to the left of second attribute feature, and so on.
In some embodiments, the target position relationship may be determined by comparing coordinate position relationships between all pixel points in the first attribute feature and all pixel points in the second attribute feature.
Referring to fig. 4, fig. 4 is a schematic diagram of an image to be detected provided in an embodiment of the present application, and fig. 4 illustrates how to determine a target position relationship according to a coordinate position relationship between pixel points.
In fig. 4, the image to be detected includes a pedestrian, and the leftmost lower corner of the image to be detected is taken as the origin of coordinates, i.e., (0, 0). The Y axis points upward from the origin of coordinates in the width direction of the image to be detected on the plane of the image to be detected. The X-axis points from the origin of coordinates to the right in the long direction of the image to be detected on the plane of the image to be detected. And establishing a coordinate system by using the coordinate origin, the X axis and the Y axis.
Suppose the first attribute feature is hair and the second attribute feature is the torso. Among the pixel points of the first attribute feature (hair), the pixel point A with the smallest Y coordinate is (5, 10); among the pixel points of the second attribute feature (torso), the pixel point B with the largest Y coordinate is (5, 8). This means that the Y coordinates of all pixel points of the first attribute feature (hair) are larger than the Y coordinate of pixel point B, that is, all pixel points of the first attribute feature (hair) are above all pixel points of the second attribute feature (torso). Therefore, the target position relationship between the first attribute feature (hair) and the second attribute feature (torso) is: the first attribute feature (hair) is above the second attribute feature (torso).
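Under the coordinate system described above, the "above" relationship of step B41 can be checked by comparing the extreme Y coordinates of the two pixel sets, as in the following sketch; the array layout is an assumption.

```python
import numpy as np

# Sketch of step B41: "first attribute above second attribute" holds when every
# first-attribute pixel has a larger Y coordinate than every second-attribute
# pixel; pixels are assumed to be given as (x, y) coordinate pairs.
def first_attribute_above_second(first_pixels: np.ndarray, second_pixels: np.ndarray) -> bool:
    min_y_first = first_pixels[:, 1].min()    # e.g. lowest hair pixel A = (5, 10)
    max_y_second = second_pixels[:, 1].max()  # e.g. highest torso pixel B = (5, 8)
    return bool(min_y_first > max_y_second)

# hair = np.array([[5, 10], [6, 11]]); torso = np.array([[5, 8], [5, 4]])
# first_attribute_above_second(hair, torso)  # -> True, matching the example above
```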
And B42, when the target position relation is detected to accord with a preset reference position relation, taking the object as the first target object.
The preset reference position relationship may also be different for different first attribute features and second attribute features. Specifically, the setting may be performed according to a position relationship of the inherent attribute feature of the first target object in an actual application scene, and is not limited specifically.
For example, when the first attribute feature is hair and the second attribute feature is torso, the preset reference position relationship is that the first attribute feature is above the second attribute feature. If the first attribute feature is a face and the second attribute feature is a trunk, the preset reference position relationship is that the first attribute feature is above the second attribute feature.
In contrast, when the first attribute feature is a shoe and the second attribute feature is a torso, the preset reference position relationship is that the first attribute feature is below the second attribute feature.
By detecting whether the target position relationship between the first attribute feature and the second attribute feature conforms to the preset reference position relationship, and finally determining the object as the first target object only when the two are consistent, misjudgment of the first attribute feature and the second attribute feature can be avoided, and the detection accuracy of the first target object is improved.
In a practical application scenario, the tracked first target object may be an illegal target (e.g., an illegal pedestrian). Therefore, after the position of the illegal target is predicted, the target object tracking method provided by the embodiment of the application can output alarm information so as to intercept the illegal target. Further, in order to improve the accuracy of the warning information, the target object tracking method provided by the application can firstly confirm whether the illegal target reaches the predicted position, and if the illegal target reaches the predicted position, warning is performed.
At this time, the step S40 may specifically include the following steps C1-C4, wherein:
C1, calling a preset image acquisition device to acquire a passing image of the first passing position, wherein the image acquisition device is arranged at the first passing position.
The image acquisition device acquires, at the first future time point, the passing image containing the first passing position.
Because the floating vehicle cannot move along with the target object, in order to improve tracking efficiency and facilitate further trajectory analysis of the target object, in some embodiments of the present application, cameras (i.e., image acquisition devices, hereinafter referred to as fixed-point cameras) may be arranged in advance at certain positions on the road, and the fixed-point cameras are used for acquiring images (i.e., the passing images).
C2, detecting a second target object in the passing image.
The second target object refers to an object in the road, and may specifically refer to an object that has an illegal action or is desired to be tracked. The second target object may be of various types, and may be specifically set according to an actual application scenario, for example, the second target object may be a normal pedestrian, an operator, a motor vehicle, a non-motor vehicle, or a work vehicle.
Wherein the second target object is an object of the same category as the first target object. For example, if the first target object is a pedestrian, then the second target object is also a pedestrian; if the first target object is a motor vehicle, then the second target object is also a motor vehicle. The first target object and the second target object are distinguished in that: the first target object is an object detected in the image to be detected acquired by the floating vehicle, while the second target object is an object detected in the passing image acquired by the fixed-point image acquisition device (namely the camera preset at the first passing position).
Here, the manner of detecting the second target object in step C2 is similar to the manner of detecting the first target object in step S20. In some embodiments, the object detection network trained in step S20 may be used to detect the second target object in the traversed image, and the detailed description is omitted here. Alternatively, the second target object may be detected in the manner described in the above steps B1 to B4 and B41 to B42.
C3, detecting whether the second target object is the same as the first target object.
Wherein the purpose of step C3 is to determine whether the first target object has reached the first traversed location at the first future point in time as predicted.
When the second target object is detected, it can only be concluded that the image to be detected and the passing image contain objects of the same category; the accuracy of the prediction can be judged only after further determining that the second target object and the first target object are the same object. If the detected second target object is the same as the first target object, it is determined that the first target object has reached the first passing position at the first future time point, and the prediction in step S40 is correct.
Specifically, the similarity between the second target object and the first target object may be determined by some similarity measure algorithm (e.g., cosine distance algorithm) to determine whether the second target object and the first target object are the same. When the similarity between the second target object and the first target object is larger (for example, larger than a preset similarity threshold), the second target object and the first target object are determined to be the same. When the similarity between the second target object and the first target object is small (for example, smaller than or equal to a preset similarity threshold), it is determined that the second target object is not the same as the first target object.
In some embodiments, whether the second target object is the same as the first target object may be detected by a cosine distance. Specifically, a second feature vector of the second target object and a first feature vector of the first target object are determined, respectively. And then, the cosine distance between the second characteristic vector and the first characteristic vector is obtained, and if the cosine distance is smaller than a preset threshold value, the second target object is the same as the first target object. And if the cosine distance is greater than or equal to the preset threshold value, the second target object is different from the first target object.
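A minimal sketch of this cosine-distance check is given below; the threshold value is an assumption.

```python
import numpy as np

# Sketch of step C3; the distance threshold is an assumption.
COSINE_DISTANCE_THRESHOLD = 0.3

def is_same_target(first_feature: np.ndarray, second_feature: np.ndarray) -> bool:
    cosine_similarity = float(np.dot(first_feature, second_feature) /
                              (np.linalg.norm(first_feature) * np.linalg.norm(second_feature) + 1e-9))
    cosine_distance = 1.0 - cosine_similarity
    # a smaller cosine distance means the two feature vectors are more similar
    return cosine_distance < COSINE_DISTANCE_THRESHOLD
```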
In order to improve the detection accuracy of whether the second target object is the same as the first target object, further, the similarity between the first attribute feature of the second target object and the first attribute feature of the first target object, and the similarity between the second attribute feature of the second target object and the second attribute feature of the first target object may be compared, respectively.
Taking the detection of a pedestrian who illegally enters the expressway as an example (i.e., the first target object is a pedestrian): when the pedestrian (the first target object) is detected in step S20, information such as gender, age, direction of movement, hair color, helmet, glasses, clothing color, clothing style and whether the pedestrian is riding is further detected. When the pedestrian (the second target object) is detected in step C2, the same kinds of information are also detected. Then, in step C3, the cosine distances between the corresponding items of information detected in step S20 and in step C2 are calculated respectively. When the cosine distances are small, the degree of similarity between the pedestrian detected in step S20 and the pedestrian detected in step C2 is high, and it can be determined that the second target object is the same as the first target object. When the cosine distances are large, the degree of similarity is low, and it can be determined that the second target object is not the same as the first target object.
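As an illustration only (the attribute names and the aggregation rule are assumptions, not the patent's exact computation), the per-attribute comparison described above could be sketched as follows: each attribute is encoded as a small feature vector, a cosine distance is computed per attribute, and the mean distance is compared against a threshold.

    import numpy as np

    ATTRIBUTES = ["gender", "age", "hair_color", "helmet", "glasses",
                  "clothing_color", "clothing_style", "riding"]

    def attribute_distance(first, second):
        # Mean cosine distance over the attribute feature vectors of two detections.
        dists = []
        for name in ATTRIBUTES:
            a = np.asarray(first[name], dtype=np.float64)
            b = np.asarray(second[name], dtype=np.float64)
            sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
            dists.append(1.0 - sim)
        return float(np.mean(dists))

    # The two detections are judged to be the same pedestrian when the mean
    # distance is small, e.g. attribute_distance(ped_s20, ped_c2) < 0.2
    # (the 0.2 threshold is illustrative).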
Because the shooting angle and shooting distance of the floating vehicle and of the checkpoint camera relative to the target object are different, directly detecting whether the images collected by the two contain the same target object is difficult. In some embodiments, therefore, the first target object and the second target object are detected by using attribute elements of the target object such as the face, the torso and the head, together with the positional relationships between these attribute elements, as described in the above steps B1 to B4 and B41 to B42, and whether the first target object and the second target object are the same is determined in step C3 based on the attribute elements of each object and the positional relationships between those attribute elements. In this way, the image-level comparison is converted into a comparison of the target object's attribute elements, which reduces the recognition difficulty to a certain extent and improves the recognition accuracy.
And C4, when the second target object is detected to be the same as the first target object, outputting alarm information.
The warning information is used for providing a warning, for example, providing warning information about an illegal target to a traffic supervision department. The warning information may include information such as the first passing position and the first future time point.
Taking the detection of a pedestrian who illegally enters the expressway as an example, after the first passing position at which the pedestrian a will arrive at the first future time point is predicted, if the passing image acquired at the first future time point by the image acquisition device arranged at the first passing position contains the pedestrian a, the first future time point and the first passing position are output to the traffic supervision department.
In some scenarios, to improve the tracking accuracy of the first target object, the movement speed of the first target object may further be updated according to the actual time at which the first target object reaches the first passing position, and the travel track of the first target object after the first future time point may be predicted using the updated movement speed. In this case, the following steps C5-C9 may further be performed after step C3, wherein:
And C5, when the second target object is detected to be the same as the first target object, acquiring a first time difference between the acquisition time point of the image to be detected and the acquisition time point of the passing image.
The acquisition time point of the image to be detected (referred to herein as the first acquisition time point) refers to the time point at which the floating car acquires the image to be detected; the acquisition time point of the passing image (referred to herein as the second acquisition time point) refers to the time point at which the preset image acquisition device acquires the passing image; and the first time difference refers to the difference between these two time points.
For example, if the floating car acquires an image to be detected at 6:30am and the preset image acquisition device acquires a passing image at 8:30am, the time difference is 2 hours.
And C6, acquiring the position of the first target object at the acquisition time point of the image to be detected.
The position is the position of the first target object at the first acquisition time point.
In some embodiments, the approximate location of the floating car at the time of acquiring the image to be detected may be taken as the location of the first target object.
In some embodiments, in order to obtain the location of the first target object more accurately, the cosine distance between the image to be detected and the real scene map of the expressway may be determined, and the location of the first target object may then be determined from the cosine distance. The specific steps are as follows (a minimal code sketch is given after the list):
(1) firstly, the real scene map of the expressway can be divided into a plurality of segments of real scene maps, the number of the segments can be adjusted according to the actual situation, and the position in each segment of real scene map is the known position in the real scene map of the expressway.
(2) And obtaining the cosine distance between the feature vector of the image to be detected and the feature vector of each segment live-action map.
(3) And determining the segment live-action map with the minimum cosine distance, and taking the position in the segment live-action map with the minimum cosine distance as the position of the first target object.
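The following is a minimal sketch of steps (1)-(3) above, under the assumption that each segment live-action map already has a precomputed feature vector and a known position; the feature extractor itself and the data layout are illustrative, not specified by the patent.

    import numpy as np

    def locate_on_map(image_feature, segment_features, segment_positions):
        # Return the known position of the segment live-action map whose feature
        # vector has the smallest cosine distance to the image to be detected.
        img = np.asarray(image_feature, dtype=np.float64)
        best_idx, best_dist = -1, float("inf")
        for i, seg in enumerate(segment_features):
            seg = np.asarray(seg, dtype=np.float64)
            sim = img @ seg / (np.linalg.norm(img) * np.linalg.norm(seg) + 1e-12)
            dist = 1.0 - sim
            if dist < best_dist:
                best_idx, best_dist = i, dist
        return segment_positions[best_idx]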
C7, determining the position offset of the first target object according to the position and the first passing position.
The position offset refers to a distance between the located position and the first passing position.
Referring to fig. 5, fig. 5 is a schematic view of a passing-position scene provided in an embodiment of the present application. The above concept is explained by taking the detection of a pedestrian who illegally enters the expressway in fig. 5 as an example, where A and B in fig. 5 are both position points on the expressway, and the distance AB is 10 km.
The floating car first collects an image to be detected containing the pedestrian at point A at 6:30am, where point A is a point on the expressway; point A is therefore the located position, and 6:30am is the acquisition time point of the image to be detected. An image acquisition device is preset at point B, and at 8:30am this device acquires the passing image containing the pedestrian, so point B is the first passing position and 8:30am is the first future time point. The distance between points A and B is thus the position offset; since the distance between A and B is 10 km, the position offset is 10 km.
And C8, predicting the traveling speed of the second target object according to the first time difference and the position offset.
The second target object is the same as the first target object, so that the traveling speed of the second target object refers to the traveling speed of the first target object when the first target object continues to move after reaching the first passing position.
Continuing the example of step C7: when the pedestrian reaches point B, the first time difference is 2 hours and the position offset is 10 km, so the traveling speed of the pedestrian, i.e. of the second target object, is 10 km / 2 h = 5 km/h.
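Steps C5-C8 therefore reduce to dividing the position offset by the first time difference. A trivial worked sketch using the example above (10 km between A and B, 2 hours apart):

    def travel_speed_kmh(offset_km, time_diff_hours):
        # Travel speed = position offset / first time difference.
        return offset_km / time_diff_hours

    speed = travel_speed_kmh(10.0, 2.0)   # 5.0 km/h, matching the pedestrian example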
C9, predicting a second passing position of the second target object at a second future point in time according to the travelling speed.
Wherein the second future point in time is a point in time at which the second target object appears at the second passing position.
The second passing position refers to a position where the first target object continues to move after reaching the first passing position and arrives at a second future time point, because the second target object is the same as the first target object.
Specifically, first, the travel track of the second target object may be determined based on the passing image of the second target object, the road condition at the current position of the second target object, and the movement direction of the second target object. Here, the movement direction of the second target object may be determined from the passing image by the optical flow matching algorithm mentioned in step S30 above. The manner of determining the travel track of the second target object is similar to the manner of determining the travel track of the first target object in step S40; reference may be made to the description of step S40, which is not repeated herein. The second target object may move in the same direction as the first target object, or in a different direction.
Then, a second passing position of the second target object at a second future point in time is predicted from the travel speed and the travel trajectory of the second target object.
The manner in which step C9 determines the second passing position of the second target object at the second future time point is similar to the manner in which step S40 determines the first passing position of the first target object at the first future time point; reference may be made to the description of step S40, which is not repeated herein.
Take the detection of an illegally intruding pedestrian on the expressway in fig. 5 as an example again, continuing with an example of step C9: in fig. 5, A, B and C are collinear, C is also a position point on the expressway, and the distance BC is 10 km.
If one wants to predict the pedestrian's position at 10:30am, the second future time point is 10:30am and the time difference between the first future time point and the second future time point is 2 hours. The travel speed has been determined to be 5 km/h, so the distance traveled is 10 km; that is, between 8:30am and 10:30am the pedestrian moves 10 km from point B in the same direction of motion as at point A. The point C reached at 10:30am is therefore the second passing position.
Continuing with the example of detecting an illegally intruding pedestrian on the expressway in fig. 5, the distance BD is 10 km. The pedestrian is located at point B at 8:30am, and the movement direction of the pedestrian is detected from the passing image to be towards point D.
If one wants to predict the pedestrian's position at 10:30am, the second future time point is 10:30am and the time difference between the first future time point and the second future time point is 2 hours. The travel speed has been determined to be 5 km/h, so the travel distance is 10 km; that is, between 8:30am and 10:30am the pedestrian moves 10 km from point B towards point D, reaching point D. The second passing position is therefore point D.
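A hedged sketch of the prediction in step C9 follows, with positions simplified to 2-D map coordinates in kilometres; this illustrates the dead-reckoning idea, not the patent's exact computation.

    import numpy as np

    def predict_second_position(first_passing_pos_km, direction, speed_kmh, hours):
        # Move from the first passing position along the detected direction of
        # motion at the predicted travel speed for the given number of hours.
        d = np.asarray(direction, dtype=np.float64)
        d = d / (np.linalg.norm(d) + 1e-12)          # unit vector of travel
        return np.asarray(first_passing_pos_km, dtype=np.float64) + speed_kmh * hours * d

    # Example: from point B, 5 km/h for 2 h towards point D gives a point 10 km away.
    point_d = predict_second_position([0.0, 0.0], [1.0, 0.0], 5.0, 2.0)  # -> [10., 0.]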
Further, in order to reflect the motion trajectory of the target object more intuitively and thereby improve the tracking efficiency, after the travel trajectory and the first passing position of the first target object are determined in the above step S40, they are displayed on a preset map. Likewise, after the travel trajectory and the second passing position of the second target object are determined in the above step C9, they are displayed on the preset map.
In some practical scenarios, it is also desirable to be able to predict the position at which the first target object arrives at any point in time. At this time, a first passing position where the first target object arrives at the first future time point may be predicted using the first future time point as a known condition. Referring to fig. 6, fig. 6 is a flowchart illustrating an embodiment of step S40 provided in the present embodiment. The step S40 may specifically include the following steps S41-S44, in which:
and S41, acquiring a first acquisition time point of the image to be detected.
Wherein, the first acquisition time point refers to the time point when the floating car acquires the image to be detected.
In some embodiments, the floating car may be configured to record a time point of acquisition as the first acquisition time point when acquiring the image to be detected.
For example, if the floating car acquires an image to be detected at 6:30am, the first acquisition time point is 6:30 am.
And S42, acquiring the position of the first target object at the first acquisition time point.
The position is the position of the first target object at the first acquisition time point.
The method for determining the position is described in detail in step C6, and the detailed process may refer to step C6, which is not described herein again.
S43, acquiring a second time difference between the first acquisition time point and the first future time point.
The second time difference refers to a time difference between a time point when the floating car collects the image to be detected and a time point when the first target object reaches the first passing position, namely the movement time length of the first target object.
For example, the floating car acquires an image to be detected at 6:30am, the preset prediction time point (the first future time point) is 8:30am, and the second time difference is therefore 2 hours.
And S44, predicting the first passing position of the first target object at the first future time point according to the relative movement speed, the second time difference and the located position.
In some embodiments, a method of predicting the first passing position includes the following steps (a code sketch is given after the list):
(1) and acquiring the motion speed of the floating car, and determining the motion speed of the first target object according to the relative motion speed of the first target object and the motion speed of the floating car.
(2) And determining the moving distance of the first target object in the second time difference according to the moving speed of the first target object and the second time difference.
(3) And determining the first passing position according to the located position, the distance moved by the first target object and the moving direction of the first target object. The method of acquiring the moving direction of the first target object has already been described in step S40 and is not repeated here.
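A minimal sketch of steps (1)-(3) above: the target's absolute speed is obtained from the relative speed (from optical flow matching) and the floating car's own speed, the distance moved is that speed times the second time difference, and the predicted position is the located position displaced along the direction of motion. The additive sign convention and the 2-D geometry are simplifying assumptions for illustration.

    import numpy as np

    def predict_first_passing_position(located_pos_km, direction,
                                       relative_speed_kmh, car_speed_kmh,
                                       second_time_diff_h):
        target_speed = relative_speed_kmh + car_speed_kmh   # step (1), sign convention assumed
        distance = target_speed * second_time_diff_h        # step (2)
        d = np.asarray(direction, dtype=np.float64)
        d = d / (np.linalg.norm(d) + 1e-12)
        return np.asarray(located_pos_km, dtype=np.float64) + distance * d   # step (3)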
In some practical scenes, the image to be detected comes from a road video acquired by the floating car. In order to increase the speed of acquiring the image to be detected, video key frames in the road video that may contain the first target object need to be extracted and used as images to be detected.
Referring to fig. 6, fig. 6 is a flowchart illustrating an embodiment of step S10 provided in the present embodiment. At this time, the step S10 may specifically include the following steps D1-D4, wherein:
and D1, acquiring the road video collected by the floating car.
The road video refers to a video including a road collected by the floating car, and the meaning of the road is described in step S10 and is not described herein again.
Further, in an actual scene a plurality of floating vehicles may collect road videos on the road at the same time, so the road videos collected by the plurality of floating vehicles may have different data formats. In order to increase the processing speed of the video, the data formats of the collected road videos need to be unified; the data formats include the MPEG format, the AVI format, the MOV format and the like.
Specifically, the road videos collected by the plurality of floating vehicles can be accessed uniformly by adopting a ministry-standard protocol so as to unify the data format of the road videos.
Further, because a plurality of floating vehicles can simultaneously acquire road videos on a road in an actual scene, the number of road videos acquired by the floating vehicles is large, and invalid road videos in the acquired road videos need to be eliminated to improve the accuracy of acquiring the images to be detected. The invalid road video comprises road video shot by a floating car when the lens is blocked, or low-definition road video shot by the floating car when the light is poor, and the like.
Specifically, algorithms such as histogram analysis and frequency domain analysis may be used to exclude invalid road videos.
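As an illustration only of such a frame-level check (the thresholds and the specific cues are assumptions and would need tuning on real footage): very dark frames suggest a blocked lens, and a low Laplacian variance suggests a blurry, low-definition frame.

    import cv2

    def frame_is_valid(frame_bgr, dark_threshold=20.0, blur_threshold=50.0):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        if float(gray.mean()) < dark_threshold:
            return False          # mostly black -> lens probably blocked
        if cv2.Laplacian(gray, cv2.CV_64F).var() < blur_threshold:
            return False          # little high-frequency detail -> blurry frame
        return True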
D2, segmenting the road video according to the preset intercepting duration to obtain a plurality of segmented videos.
The preset capturing duration may be set according to the actual situation and is not particularly limited here, but it should not be too large, so as to avoid a new element appearing in and then disappearing from the picture within one segment without being noticed, which would cause the new element to be missed in step D3. For example, the preset capturing duration may be 1 second.
For example, if the road video a is 1 minute long and the preset capture duration is 3 seconds, the road video a is segmented by cutting it, starting from the 0th second, at the 3rd, 6th, 9th, ..., and 60th seconds, so 20 segmented videos are obtained.
One of the purposes of segmenting the road video is to reduce the cache occupied by the video and to improve the operation efficiency of obtaining the image to be detected.
Specifically, after obtaining a plurality of segmented videos, it may be determined whether an image to be detected exists in each segmented video, and if the image to be detected does not exist in the determined segmented video, the determined segmented video is deleted from the cache. Therefore, the judged segmented video does not occupy the cache, the utilization rate of the cache can be effectively improved by segmenting the road video, and the operation efficiency of obtaining the image to be detected is improved.
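A minimal sketch of step D2 using OpenCV follows, under the assumption that the road video is read frame by frame and grouped into clips of a preset duration; only the first and last frame of each clip are needed for the comparison in step D3.

    import cv2

    def segment_video(path, segment_seconds=1.0):
        cap = cv2.VideoCapture(path)
        fps = cap.get(cv2.CAP_PROP_FPS) or 30.0          # fall back if FPS is unknown
        frames_per_segment = max(1, int(round(fps * segment_seconds)))
        segments, current = [], []
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            current.append(frame)
            if len(current) == frames_per_segment:
                segments.append(current)
                current = []
        if current:
            segments.append(current)
        cap.release()
        return segments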
And D3, acquiring the similarity between a first image and a second image of the ith segmented video among the plurality of segmented videos, wherein the first image and the second image are respectively the image with the earliest acquisition time point and the image with the latest acquisition time point in the ith segmented video.
Wherein, the similarity of the first image and the second image can be obtained by determining the cosine distance of the first image and the second image.
Specifically, first, a third feature vector of the first image and a fourth feature vector of the second image are respectively extracted, and cosine distances of the third feature vector and the fourth feature vector are determined.
And determining the similarity of the first image and the second image according to the cosine distance. The first image and the second image are explained below by way of example: assume that the frame rate of the captured road video is 60 frames/second, i.e. 1 second of video consists of 60 pictures. If a road video with a duration of 60 seconds is cut with a preset capturing duration of 1 second, 60 segmented videos { w1, w2, ..., w60 } are obtained whose shooting time points lag one another by 1 second, where the shooting time point of w1 is the earliest and that of w60 is the latest. Each wi, 1 ≤ i ≤ 60, consists of 60 images { p1, p2, ..., p60 } whose acquisition time points lag one another, p1 being the earliest and p60 the latest. Here the first image is p1 and the second image is p60.
And D4, when the similarity is smaller than a preset threshold value, taking the first image or the second image as the image to be detected.
Specifically, when the similarity is smaller than the preset threshold value, it indicates that a new object appears in the first image or the second image, and at this time the first image and the second image may be directly used as images to be detected, respectively. When the similarity is not smaller than the preset threshold value, it indicates that no new object appears in the first image or the second image, and the ith segmented video is discarded, reducing further data processing.
In some embodiments, when the similarity is smaller than the preset threshold, that is, when a new object appears in the first image or the second image, the first image and the second image may be further detected, and an image in which a new object appears in the first image or the second image is taken as an image to be detected.
For example, when the similarity is smaller than the preset threshold, it may be that a pedestrian exists in the first image and no pedestrian exists in the second image, or it may be that a pedestrian exists in the second image and no pedestrian exists in the first image, and it is necessary to take an image in which a pedestrian appears in the first image or the second image as the image to be detected.
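A minimal sketch of the comparison in steps D3-D4 follows; the flattened, downscaled grey image used as the feature here is a stand-in for the feature vectors mentioned in the text, and the distance threshold is illustrative.

    import cv2
    import numpy as np

    def frame_feature(frame_bgr, size=(64, 64)):
        gray = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2GRAY)
        return cv2.resize(gray, size).astype(np.float64).ravel()

    def clip_has_new_object(first_frame, last_frame, distance_threshold=0.05):
        # Large cosine distance -> low similarity -> a new element has appeared,
        # so the clip's boundary frames become images to be detected.
        a, b = frame_feature(first_frame), frame_feature(last_frame)
        sim = a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12)
        return (1.0 - sim) > distance_threshold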
By segmenting the road video into a plurality of segmented videos, only the first image and the last image of each segmented video need to be compared to determine whether a new element appears; when a new element appears, the first image or the second image is used as the image to be detected. This avoids having to treat every image in a segment of road video as an image to be detected, reduces the data processing amount, increases the speed of predicting the first passing position of the first target object at the first future time point, and thereby improves the target tracking efficiency.
For example, if the road video is 1 minute long and contains 60 images per second (a frame rate of 60 frames/second), performing the detection of step S20 on every image would mean processing 60 × 60 = 3600 images for that one minute, a huge data processing amount. Therefore, in the embodiment of the present application, the road video is segmented into a plurality of segmented videos, each segmented video is checked for whether a new element appears, and only when a frame contains a new element is that frame used as the image to be detected. This avoids detecting a large number of images in the road video acquired by the floating car, reduces the data processing amount, increases the speed of predicting the first passing position of the first target object at the first future time point, and further improves the target tracking efficiency.
In order to better implement the target object tracking method in the embodiment of the present application, based on the target object tracking method, an embodiment of the present application further provides a target object tracking device, as shown in fig. 7, which is a schematic structural diagram of an embodiment of the target object tracking device in the embodiment of the present application, and the target object tracking device 700 includes:
an acquisition unit 701, configured to acquire an image to be detected including a road, wherein the image to be detected is acquired by a preset floating vehicle;
a detection unit 702, configured to detect a first target object in the image to be detected;
a determining unit 703, configured to perform optical flow matching processing on the image to be detected, and determine a relative movement speed of the first target object and the floating vehicle;
a predicting unit 704, configured to predict a first passing position of the first target object at a first future time point according to the relative motion speed and the image to be detected.
In some embodiments of the present application, the detecting unit 702 is specifically configured to:
extracting the features of the image to be detected to obtain the image features of the image to be detected;
performing segmentation processing according to the image characteristics to obtain an object in the image to be detected;
detecting whether the object has a first attribute feature and a second attribute feature;
and when the object is detected to have the first attribute feature and the second attribute feature, taking the object as the first target object.
In some embodiments of the present application, the detecting unit 702 is specifically configured to:
when the object is detected to have a first attribute feature and a second attribute feature, detecting a target position relationship between the first attribute feature and the second attribute feature;
and when the target position relation is detected to accord with a preset reference position relation, taking the object as the first target object.
In some embodiments of the present application, the target object tracking device 700 comprises an alarm unit (not shown in the figures), which is specifically configured to:
calling a preset image acquisition device to acquire a passing image of the first passing position, wherein the image acquisition device is arranged at the first passing position;
detecting a second target object in the passing image;
detecting whether the second target object is the same as the first target object;
and when the second target object is detected to be the same as the first target object, outputting alarm information.
In some embodiments of the application, after the step of detecting whether the second target object is the same as the first target object, the detecting unit 702 is specifically configured to:
when the second target object is detected to be the same as the first target object, acquiring a first time difference between the acquisition time point of the image to be detected and the acquisition time point of the passing image;
acquiring the position of the first target object at the acquisition time point of the image to be detected;
determining a position offset of the first target object according to the position and the first passing position;
predicting the traveling speed of the second target object according to the first time difference and the position offset;
predicting a second traversed position of the second target object at a second future point in time from the travel speed.
In some embodiments of the present application, the prediction unit 704 is specifically configured to:
acquiring a first acquisition time point of the image to be detected;
acquiring the position of the first target object at the first acquisition time point;
obtaining a second time difference between the first acquisition time point and the first future time point;
and predicting the first passing position of the first target object at the first future time point according to the relative movement speed, the second time difference and the position.
In some embodiments of the present application, the obtaining unit 701 is specifically configured to:
acquiring a road video acquired by the floating car;
segmenting the road video according to a preset intercepting time length to obtain a plurality of segmented videos;
acquiring the similarity between a first image and a second image of the ith segmented video among the plurality of segmented videos, wherein the first image and the second image are respectively the image with the earliest acquisition time point and the image with the latest acquisition time point in the ith segmented video;
and when the similarity is smaller than a preset threshold value, taking the first image or the second image as the image to be detected.
In a specific implementation, the above units may be implemented as independent entities, or may be combined arbitrarily to be implemented as the same or several entities, and the specific implementation of the above units may refer to the foregoing method embodiments, which are not described herein again.
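Purely as a structural illustration (not part of the patent), the four units of the target object tracking device 700 could be organised in code roughly as follows; the detector, optical-flow matcher and predictor are placeholders for the components described in the method embodiments above.

    class TargetObjectTrackingDevice:
        def __init__(self, detector, flow_matcher, predictor):
            self.detector = detector          # detection unit 702
            self.flow_matcher = flow_matcher  # determining unit 703
            self.predictor = predictor        # prediction unit 704

        def track(self, image_to_detect, first_future_time_point):
            # The acquisition unit 701 is assumed to have produced image_to_detect.
            first_target = self.detector.detect(image_to_detect)
            relative_speed = self.flow_matcher.relative_speed(image_to_detect, first_target)
            return self.predictor.predict_first_passing_position(
                first_target, relative_speed, first_future_time_point)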
Since the target object tracking apparatus can perform the steps of the target object tracking method in any embodiment of the present application corresponding to fig. 1 to 6, it can achieve the beneficial effects that can be achieved by that method; for details, refer to the foregoing description, which is not repeated here for brevity.
In addition, in order to better implement the target object tracking method in the embodiment of the present application, based on the target object tracking method, the embodiment of the present application further provides an electronic device, referring to fig. 8, where fig. 8 shows a schematic structural diagram of the electronic device in the embodiment of the present application, specifically, the electronic device provided in the embodiment of the present application includes a processor 801, and when the processor 801 is used to execute a computer program stored in a memory 802, the steps of the target object tracking method in any embodiment corresponding to fig. 1 to 6 are implemented; alternatively, the processor 801 is configured to implement the functions of the units in the corresponding embodiment of fig. 7 when executing the computer program stored in the memory 802.
Illustratively, a computer program may be partitioned into one or more modules/units, which are stored in the memory 802 and executed by the processor 801 to implement the embodiments of the present application. One or more modules/units may be a series of computer program instruction segments capable of performing certain functions, the instruction segments being used to describe the execution of a computer program in a computer device.
The electronic device may include, but is not limited to, the processor 801 and the memory 802. Those skilled in the art will appreciate that the illustration is merely an example of an electronic device and does not constitute a limitation; the electronic device may include more or fewer components than those illustrated, combine certain components, or use different components. For example, the electronic device may further include input/output devices, network access devices, a bus, etc., and the processor 801, the memory 802, the input/output devices and the network access devices are connected via the bus.
The Processor 801 may be a Central Processing Unit (CPU), another general purpose processor, a Digital Signal Processor (DSP), an Application Specific Integrated Circuit (ASIC), a Field-Programmable Gate Array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc. The general purpose processor may be a microprocessor, or the processor may be any conventional processor; the processor is the control center of the electronic device and connects the various parts of the overall electronic device through various interfaces and lines.
The memory 802 may be used to store computer programs and/or modules, and the processor 801 implements various functions of the computer device by running or executing the computer programs and/or modules stored in the memory 802 and invoking data stored in the memory 802. The memory 802 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system, an application program required by at least one function (such as a sound playing function, an image playing function, etc.), and the like; the data storage area may store data (such as audio data, video data, etc.) created according to the use of the electronic device, etc. In addition, the memory may include high speed random access memory, and may also include non-volatile memory, such as a hard disk, a memory card, a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) Card, a Flash memory Card (Flash Card), at least one magnetic disk storage device, a flash memory device, or another non-volatile solid state storage device.
It can be clearly understood by those skilled in the art that, for convenience and brevity of description, the specific working processes of the target object tracking apparatus, the electronic device and the corresponding units thereof described above may refer to the descriptions of the target object tracking method in any embodiment corresponding to fig. 1 to 6, and are not described herein again in detail.
It will be understood by those skilled in the art that all or part of the steps of the methods of the above embodiments may be performed by instructions or by associated hardware controlled by the instructions, which may be stored in a computer readable storage medium and loaded and executed by a processor.
To this end, an embodiment of the present application provides a computer-readable storage medium, in which a plurality of instructions are stored, and the instructions can be loaded by a processor to execute steps in the target object tracking method in any embodiment corresponding to fig. 1 to 6 in the present application, and specific operations may refer to descriptions of the target object tracking method in any embodiment corresponding to fig. 1 to 6, which are not described herein again.
Wherein the computer-readable storage medium may include: read Only Memory (ROM), Random Access Memory (RAM), magnetic or optical disks, and the like.
Since the instructions stored in the computer-readable storage medium can execute the steps in the target object tracking method in any embodiment corresponding to fig. 1 to 6 in the present application, the beneficial effects that can be achieved by the target object tracking method in any embodiment corresponding to fig. 1 to 6 in the present application can be achieved, for details, see the foregoing description, and are not repeated herein.
The foregoing detailed description is directed to a target object tracking method, an apparatus, an electronic device, and a computer-readable storage medium, which are provided by the embodiments of the present application, and specific examples are applied in the present application to explain the principles and embodiments of the present application, and the descriptions of the foregoing embodiments are only used to help understand the method and the core ideas of the present application; meanwhile, for those skilled in the art, according to the idea of the present application, there may be variations in the specific embodiments and the application scope, and in summary, the content of the present specification should not be construed as a limitation to the present application.

Claims (10)

1. A target object tracking method, comprising:
acquiring an image to be detected comprising a road, wherein the image to be detected is acquired through a preset floating vehicle;
detecting a first target object in the image to be detected;
performing optical flow matching processing on the image to be detected, and determining the relative movement speed of the first target object and the floating vehicle;
and predicting a first passing position of the first target object at a first future time point according to the relative movement speed and the image to be detected.
2. The target object tracking method according to claim 1, wherein said detecting the first target object in the image to be detected includes:
extracting the characteristics of the image to be detected to obtain the image characteristics of the image to be detected;
performing segmentation processing according to the image characteristics to obtain an object in the image to be detected;
detecting whether the object has a first attribute feature and a second attribute feature;
and when the object is detected to have the first attribute feature and the second attribute feature, taking the object as the first target object.
3. The method for tracking the target object according to claim 2, wherein the regarding the object as the first target object when detecting that the object has the first attribute feature and the second attribute feature comprises:
when detecting that the object has a first attribute feature and a second attribute feature, detecting a target position relationship between the first attribute feature and the second attribute feature;
and when the target position relation is detected to accord with a preset reference position relation, taking the object as the first target object.
4. The target object tracking method according to claim 1, wherein after the predicting of the first passing position of the first target object at the first future time point according to the relative movement speed and the image to be detected, the method further comprises:
calling a preset image acquisition device to acquire a passing image of the first passing position, wherein the image acquisition device is arranged at the first passing position;
detecting a second target object in the passing image;
detecting whether the second target object is the same as the first target object;
and when the second target object is detected to be the same as the first target object, outputting alarm information.
5. The target object tracking method according to claim 4, wherein after the detecting whether the second target object is the same as the first target object, the method further comprises:
when the second target object is detected to be the same as the first target object, acquiring a first time difference between the acquisition time point of the image to be detected and the acquisition time point of the passing image;
acquiring the position of the first target object at the acquisition time point of the image to be detected;
determining a position offset of the first target object according to the position and the first passing position;
predicting the traveling speed of the second target object according to the first time difference and the position offset;
predicting a second traversed position of the second target object at a second future point in time from the travel speed.
6. The target object tracking method according to claim 1, wherein predicting the first passing position of the first target object at the first future time point based on the relative movement speed and the image to be detected comprises:
acquiring a first acquisition time point of the image to be detected;
acquiring the position of the first target object at the first acquisition time point;
obtaining a second time difference between the first acquisition time point and the first future time point;
and predicting the first passing position of the first target object at the first future time point according to the relative movement speed, the second time difference and the position.
7. The target object tracking method according to claim 1, wherein the acquiring an image to be detected including a road includes:
acquiring a road video acquired by the floating car;
segmenting the road video according to a preset intercepting time length to obtain a plurality of segmented videos;
acquiring the similarity between a first image and a second image of the ith segmented video among the plurality of segmented videos, wherein the first image and the second image are respectively the image with the earliest acquisition time point and the image with the latest acquisition time point in the ith segmented video;
and when the similarity is smaller than a preset threshold value, taking the first image or the second image as the image to be detected.
8. A target object tracking device, comprising:
an acquisition unit, configured to acquire an image to be detected including a road, wherein the image to be detected is acquired by a preset floating vehicle;
the detection unit is used for detecting a first target object in the image to be detected;
the determining unit is used for carrying out optical flow matching processing on the image to be detected and determining the relative movement speed of the first target object and the floating vehicle;
and the prediction unit is used for predicting a first passing position of the first target object at a first future time point according to the relative movement speed and the image to be detected.
9. An electronic device comprising a processor and a memory, the memory having a computer program stored therein, the processor executing the target object tracking method according to any one of claims 1 to 7 when calling the computer program in the memory.
10. A computer-readable storage medium, having stored thereon a computer program which is loaded by a processor to perform the steps of the target object tracking method of any one of claims 1 to 7.