Disclosure of Invention
In order to solve the above technical problems or at least partially solve the above technical problems, embodiments of the present invention provide a method and an apparatus for detecting missing targets in target tracking.
In view of this, in a first aspect, an embodiment of the present invention provides a method for detecting missing targets in target tracking, including:
when a tracking target is lost, randomly selecting a plurality of pixel points on a current frame image as search position points of the target;
selecting one search frame from the search frames corresponding to the plurality of search position points as a similar search frame according to the position of the search frame corresponding to the search position point;
enlarging the calculation range of the similar search box to obtain a search area;
and determining the position of the target according to the position of the search frame corresponding to each pixel point in the search area.
Optionally, selecting one search frame from the search frames corresponding to the plurality of search location points as a similar search frame according to the location of the search frame corresponding to the search location point, where the selecting includes:
respectively calculating the similarity between the search frame corresponding to each of the plurality of search position points and the target frame in the previous frame of image;
comparing the calculated similarity;
and selecting the search box corresponding to the maximum similarity in the calculated similarities as a similar search box.
Optionally, enlarging the calculation range of the similar search box to obtain a search area, including:
interpolating the upper part, the lower part, the left part and the right part of the similar search box by adopting a padding technology, and expanding the calculation range of the similar search box to a threshold value;
and taking the calculation range of the similar search box after padding as a search area.
Optionally, determining the position of the target according to the position of the search frame corresponding to each pixel point in the search area includes:
calculating the similarity between a search frame corresponding to each pixel point in the search area and a target frame in the previous frame of image;
comparing the calculated similarity;
and selecting the position of the search box corresponding to the maximum similarity in the calculated similarities as the position of the target.
Optionally, the similarity is obtained by calculating a Euclidean distance, a Manhattan distance, or a Minkowski distance, or by using a twin (Siamese) network.
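As an illustration only, the distance-based similarity measures named above can be sketched as follows. The function names are hypothetical, and the mapping from distance to similarity (the negative of the distance, so that a smaller distance gives a larger similarity) is an assumption that the embodiment does not specify. The twin network option is not sketched here.

```python
import numpy as np


def minkowski_distance(a, b, p=2.0):
    """Minkowski distance of order p between two equally shaped frames."""
    diff = np.abs(np.asarray(a, dtype=np.float64) - np.asarray(b, dtype=np.float64))
    return float(np.sum(diff ** p) ** (1.0 / p))


def euclidean_distance(a, b):
    """Euclidean distance is the Minkowski distance of order 2."""
    return minkowski_distance(a, b, p=2.0)


def manhattan_distance(a, b):
    """Manhattan distance is the Minkowski distance of order 1."""
    return minkowski_distance(a, b, p=1.0)


def similarity(a, b, p=2.0):
    """Assumed mapping: similarity is the negative distance."""
    return -minkowski_distance(a, b, p)
```

Under this assumed mapping, selecting the maximum similarity is the same as selecting the minimum distance between a search frame and the target frame.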
In a second aspect, an embodiment of the present invention provides an apparatus for detecting missing targets in target tracking, including:
the random sampling module is used for randomly selecting a plurality of pixel points on the current frame image as target searching position points;
the rough searching module is used for selecting one searching frame from the searching frames corresponding to the plurality of searching position points as a similar searching frame according to the position of the searching frame corresponding to the searching position point;
the search area determining module is used for enlarging the calculation range of the similar search box to obtain a search area;
and the target position determining module is used for determining the position of the target according to the position of the search frame corresponding to each pixel point in the search area.
Optionally, the rough search module includes:
the first similarity calculation module is used for calculating the similarity between the search frame corresponding to each of the plurality of search position points and the target frame in the previous frame image;
the first comparison module is used for comparing the calculated similarity;
and the first selecting module is used for selecting the search box corresponding to the maximum similarity in the calculated similarities as a similar search box.
Optionally, the search area determining module interpolates the upper, lower, left, and right sides of the similar search box by using a padding technology, expands the calculation range of the similar search box to a threshold, and uses the calculation range of the similar search box after padding as a search area.
Optionally, the target position determining module includes:
the second calculation module is used for calculating the similarity between a search frame corresponding to each pixel point in the search area and a target frame in a previous frame of image;
the second comparison module is used for comparing the calculated similarity;
and the second selecting module is used for selecting the position of the search box corresponding to the maximum similarity in the calculated similarities as the position of the target.
In a third aspect, an embodiment of the present invention provides a mobile terminal, including:
a processor, a memory, a communication interface, and a bus;
the processor, the memory and the communication interface complete mutual communication through the bus;
the communication interface is used for information transmission with external devices;
the processor is configured to invoke program instructions in the memory to perform the steps of the method of the first aspect.
In a fourth aspect, an embodiment of the present invention also provides a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the steps of the method according to the first aspect.
Compared with the prior art, the method for re-detecting the target loss in the target tracking provided by the embodiment of the invention determines the similar search frame by randomly sampling the pixel points of the current frame image, reduces the search range, enlarges the calculation range of the similar search frame to obtain the target search area, and accurately determines the position of the target according to the position of the search frame of each pixel point in the target search area. According to the technical scheme, the time complexity of target re-detection is reduced by narrowing the search range, and the problem of target loss in the target tracking process is efficiently solved, so that the robustness of the tracker to moving objects is better.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
Fig. 1 is a flowchart of a method for detecting missing targets in target tracking according to an embodiment of the present invention, where the method includes: coarse search and fine search;
the rough search includes steps S1-S2 as follows:
S1, after a tracking target is lost, randomly selecting a plurality of pixel points on a current frame image as search position points of the target;
specifically, in this embodiment of the present application, the current frame image is a first frame image collected by a target tracker after the target is lost, and the target in a previous frame image is not lost.
S2, selecting a search frame from the search frames corresponding to the plurality of search position points as a similar search frame according to the position of the search frame corresponding to the search position point;
specifically, in this embodiment of the present application, selecting one search frame from search frames corresponding to a plurality of search position points as a similar search frame according to a position of the search frame corresponding to the search position point includes:
each search position point corresponds to a search box taking the position of the search position point as a center position, and the size of the search box is determined by a target tracking algorithm adopted by target tracking.
calculating the similarity between the search frame corresponding to each of the plurality of search position points and the target frame in the previous frame image, where the target frame is the search frame centered on the position of the target; specifically, in this embodiment of the present application, the similarity may be obtained by calculating a Euclidean distance, a Manhattan distance, or a Minkowski distance, or by using a twin (Siamese) network, or the like;
comparing the calculated similarity;
and selecting the search frame corresponding to the maximum similarity among the calculated similarities as the similar search frame. Because the search position points are randomly selected, as long as one search position point falls near the position of the target, the similarity between its search frame and the target frame in the previous frame image is greater than the similarity between the search frames of the other search position points and the target frame, so that a position near the target, namely the position of the similar search frame, can be located.
The time complexity of the rough search is only related to the number of the selected search location points.
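A minimal sketch of steps S1-S2 is given below for illustration. The names sample_points, crop, and coarse_search are hypothetical, the similarity is assumed to be the negative Euclidean distance, and the search position points are assumed to be sampled so that every search frame lies fully inside the image; none of these choices is prescribed by the embodiment.

```python
import numpy as np


def sample_points(image_shape, box_size, num_points, rng):
    """S1: randomly pick search position points (row, col) on the frame."""
    rows, cols = image_shape[:2]
    h, w = box_size
    # Assumption: keep every search frame fully inside the image.
    rs = rng.integers(h // 2, rows - (h - h // 2) + 1, size=num_points)
    cs = rng.integers(w // 2, cols - (w - w // 2) + 1, size=num_points)
    return [(int(r), int(c)) for r, c in zip(rs, cs)]


def crop(image, center, box_size):
    """Return the search frame of size (h, w) centered on (row, col)."""
    (r, c), (h, w) = center, box_size
    top, left = r - h // 2, c - w // 2
    return image[top:top + h, left:left + w]


def coarse_search(image, target_frame, points):
    """S2: return the center and similarity of the most similar search frame."""
    box_size = target_frame.shape[:2]
    best_center, best_sim = None, -np.inf
    for center in points:
        frame = crop(image, center, box_size).astype(np.float64)
        sim = -np.linalg.norm(frame - target_frame)  # negative Euclidean distance
        if sim > best_sim:
            best_center, best_sim = center, sim
    return best_center, best_sim
```

The loop evaluates exactly one similarity per search position point, which is why the cost of the rough search depends only on the number of points selected.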
The fine search includes steps S3-S4 as follows:
S3, enlarging the calculation range of the similar search box to obtain a search area;
specifically, in the embodiment of the present application, a search area is obtained by padding a similar search box, where padding is a commonly used method in the field of target tracking, and is to interpolate the top, bottom, left, and right of the similar search box on the basis of the similar search box currently being processed, so as to expand the calculation range.
S4, determining the position of the target according to the position of a search frame corresponding to each pixel point in the search area;
specifically, in this embodiment of the present application, determining the position of the target according to the position of the search frame corresponding to each pixel point in the search area includes:
calculating the similarity between the search frame corresponding to each pixel point in the search area and the target frame in the previous frame image; specifically, in this embodiment of the present application, the similarity may be obtained by calculating a Euclidean distance, a Manhattan distance, or a Minkowski distance, or by using a twin (Siamese) network, or the like;
comparing the calculated similarity;
and selecting the position of the search box corresponding to the maximum similarity in the calculated similarities as the position of the target, wherein the position of the search box is the position of the center point of the search box.
The complexity of the fine search is proportional to the search area size after padding.
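Steps S3-S4 can be sketched as follows, again for illustration only. The names pad_box and fine_search and the margin parameter are hypothetical. Padding is treated here simply as enlarging the box by a fixed number of pixels on each side and clipping it to the image, and search frames that would leave the image are skipped; both are assumptions, since the embodiment only states that the range is expanded to a threshold.

```python
import numpy as np


def pad_box(center, box_size, margin, image_shape):
    """S3: enlarge the similar search box by `margin` pixels on each side.

    Returns (top, bottom, left, right), with bottom and right exclusive,
    clipped to the image bounds.
    """
    (r, c), (h, w) = center, box_size
    top, left = r - h // 2, c - w // 2
    return (max(top - margin, 0), min(top + h + margin, image_shape[0]),
            max(left - margin, 0), min(left + w + margin, image_shape[1]))


def fine_search(image, target_frame, area):
    """S4: evaluate the search frame centered on every pixel in the area."""
    h, w = target_frame.shape[:2]
    top, bottom, left, right = area
    best_center, best_sim = None, -np.inf
    for r in range(top, bottom):
        for c in range(left, right):
            t, l = r - h // 2, c - w // 2
            if t < 0 or l < 0 or t + h > image.shape[0] or l + w > image.shape[1]:
                continue  # assumption: skip frames that leave the image
            frame = image[t:t + h, l:l + w].astype(np.float64)
            sim = -np.linalg.norm(frame - target_frame)  # negative Euclidean distance
            if sim > best_sim:
                best_center, best_sim = (r, c), sim
    return best_center, best_sim
```

The two nested loops visit each pixel of the padded area once, so the number of similarity evaluations grows with the size of the search area after padding.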
Compared with the prior art, the invention has the following advantages: 1. Target loss caused by, for example, overly fast movement during target tracking is handled efficiently, which makes the tracker more robust to moving objects. 2. The time complexity is low: the time required by the rough search is proportional only to the number of randomly selected points, so the quadratic time complexity of an exhaustive search is reduced to a linear level. 3. The fine search ensures the accuracy of the search, thereby improving the tracking effect.
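To make the second advantage concrete, the following sketch counts similarity evaluations under the assumption that each candidate search frame costs one evaluation and that an exhaustive search evaluates a frame centered on every pixel of the image. The function name and all numeric values are hypothetical and chosen only for the arithmetic.

```python
def evaluation_counts(frame_h, frame_w, box_h, box_w, num_points, margin):
    """Return (exhaustive, coarse_plus_fine) similarity evaluation counts."""
    exhaustive = frame_h * frame_w  # one search frame per pixel
    coarse = num_points             # one search frame per random point
    fine = (box_h + 2 * margin) * (box_w + 2 * margin)  # padded area
    return exhaustive, coarse + fine


# Hypothetical 480 x 640 frame, 64 x 64 box, 200 random points, 32-pixel margin.
exhaustive, two_stage = evaluation_counts(480, 640, 64, 64, 200, 32)
print(exhaustive, two_stage)  # 307200 16584
```

With these assumed values the two-stage search needs 200 + 128 x 128 = 16584 evaluations instead of 307200, and the saving grows with the frame size because neither stage depends on it.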
Based on the same inventive concept as the method for detecting missing targets in target tracking, an embodiment of the present invention further provides an apparatus for detecting missing targets in target tracking, as shown in fig. 2, where the apparatus includes:
the random sampling module is used for randomly selecting a plurality of pixel points on the current frame image as target searching position points;
the rough searching module is used for selecting one searching frame from the searching frames corresponding to the plurality of searching position points as a similar searching frame according to the position of the searching frame corresponding to the searching position point;
the search area determining module is used for enlarging the calculation range of the similar search box to obtain a search area;
and the target position determining module is used for determining the position of the target according to the position of the search frame corresponding to each pixel point in the search area.
The coarse search module may include:
the first similarity calculation module is used for calculating the similarity between the search frame corresponding to each of the plurality of search position points and the target frame in the previous frame image;
the first comparison module is used for comparing the calculated similarity;
and the first selecting module is used for selecting the search box corresponding to the maximum similarity in the calculated similarities as a similar search box.
The search area determining module may interpolate the upper, lower, left, and right sides of the similar search box by using a padding technique, expand the calculation range of the similar search box to a threshold, and use the calculation range of the similar search box after padding as a search area.
The target location determination module may include:
the second calculation module is used for calculating the similarity between a search frame corresponding to each pixel point in the search area and a target frame in a previous frame of image;
the second comparison module is used for comparing the calculated similarity;
and the second selecting module is used for selecting the position of the search box corresponding to the maximum similarity in the calculated similarities as the position of the target.
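One possible way to arrange the four modules in software is sketched below. The class name, method names, constructor parameters, and the use of the negative Manhattan distance as the similarity are all hypothetical; the sketch only shows how the output of each module could feed the next and is not the apparatus of fig. 2 itself.

```python
import numpy as np


class RedetectionApparatus:
    """Hypothetical grouping of the four modules into one object."""

    def __init__(self, target_frame, num_points=200, margin=8, seed=0):
        self.target = target_frame.astype(np.float64)
        self.h, self.w = target_frame.shape[:2]
        self.num_points, self.margin = num_points, margin
        self.rng = np.random.default_rng(seed)

    def _similarity(self, image, r, c):
        t, l = r - self.h // 2, c - self.w // 2
        frame = image[t:t + self.h, l:l + self.w].astype(np.float64)
        return -np.abs(frame - self.target).sum()  # negative Manhattan distance

    def _valid_centers(self, image):
        # Inclusive range of centers whose search frame fits in the image.
        rows, cols = image.shape[:2]
        return (self.h // 2, rows - (self.h - self.h // 2),
                self.w // 2, cols - (self.w - self.w // 2))

    def random_sampling(self, image):
        r0, r1, c0, c1 = self._valid_centers(image)
        rs = self.rng.integers(r0, r1 + 1, size=self.num_points)
        cs = self.rng.integers(c0, c1 + 1, size=self.num_points)
        return [(int(r), int(c)) for r, c in zip(rs, cs)]

    def rough_search(self, image, points):
        return max(points, key=lambda p: self._similarity(image, *p))

    def search_area(self, image, center):
        # Padded box as an inclusive range of centers, clipped to valid ones.
        r0, r1, c0, c1 = self._valid_centers(image)
        top, left = center[0] - self.h // 2, center[1] - self.w // 2
        return (max(top - self.margin, r0),
                min(top + self.h + self.margin - 1, r1),
                max(left - self.margin, c0),
                min(left + self.w + self.margin - 1, c1))

    def target_position(self, image, area):
        ra, rb, ca, cb = area
        candidates = [(r, c) for r in range(ra, rb + 1) for c in range(ca, cb + 1)]
        return max(candidates, key=lambda p: self._similarity(image, *p))

    def redetect(self, image):
        points = self.random_sampling(image)
        coarse = self.rough_search(image, points)
        return self.target_position(image, self.search_area(image, coarse))
```

Calling redetect on the first frame after the target is lost runs the four modules in order and returns the estimated center (row, col) of the target.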
One specific example is:
When a signal indicating that the tracked target has been lost is obtained, the target is not in the current search area of the tracker, so the target tracker can change the search area and redefine the target search area of the tracking algorithm. In order to obtain the target search area efficiently, the re-detection process can be performed in two steps, namely a coarse search and a fine search. First, the approximate position of the target is obtained through the coarse search; then, this position is searched precisely to obtain the accurate position of the target, thereby solving the problem of target loss caused by overly fast movement and the like.
1. Coarse search by random sampling
First, a plurality of search position points are randomly generated on the current frame image as possible positions of the target. As shown in fig. 3, the white dots in the figure are the randomly generated search position points, and the black dot is the target point.
Then, the similarity is calculated between the target frame generated from the previous frame image (as shown in fig. 4, the frame centered on the black dot is the target frame) and the search frame in which each search position point of the current frame is located (as shown in fig. 4, a frame centered on a white dot is a search frame). Because the search position points are randomly selected, as long as one search position point is located near the position of the target, the similarity between the search frame corresponding to that search position point and the target frame in the previous frame image is greater than the similarity between the search frames corresponding to the other search position points and the target frame. A position near the target, that is, the position of the search frame shown in fig. 4, can therefore be located, and this position is the result of the rough search. The time complexity of the rough search is related only to the number of random points selected.
2. Fine search using padding method
The position obtained by the rough search is only a position near the target and not necessarily the position of the target itself, so the target needs to be located more precisely. To this end, the search frame obtained by the rough search is padded, and the padded region, which is the region inside the dashed frame shown in fig. 5, is taken as the target search region. The similarity is calculated between the search frame in which each pixel point of the target search region is located and the target frame in the previous frame, and the position of the pixel point with the largest similarity is taken as the position of the target, thereby realizing the accurate search. The complexity of the fine search is proportional to the size of the region after padding.
It should be noted that, in this document, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other like elements in a process, method, article, or apparatus that comprises the element.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions.
Through the above description of the embodiments, those skilled in the art will clearly understand that the methods described in the embodiments of the present invention can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention or the method according to some parts of the embodiments.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.