CN112489085A - Target tracking method, target tracking device, electronic device, and storage medium - Google Patents


Info

Publication number
CN112489085A
Authority
CN
China
Prior art keywords
target
tracking
local
tracked
local area
Prior art date
Legal status
Pending
Application number
CN202011460226.1A
Other languages
Chinese (zh)
Inventor
Inventor not disclosed
Current Assignee
Beijing Pengsi Technology Co ltd
Original Assignee
Beijing Pengsi Technology Co ltd
Priority date
Filing date
Publication date
Application filed by Beijing Pengsi Technology Co ltd filed Critical Beijing Pengsi Technology Co ltd
Priority to CN202011460226.1A priority Critical patent/CN112489085A/en
Publication of CN112489085A publication Critical patent/CN112489085A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/20 Analysis of motion
    • G06T 7/246 Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 20/00 Machine learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00 Image analysis
    • G06T 7/10 Segmentation; Edge detection
    • G06T 7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20076 Probabilistic image processing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 2207/00 Indexing scheme for image analysis or image enhancement
    • G06T 2207/20 Special algorithmic details
    • G06T 2207/20081 Training; Learning

Abstract

The present disclosure relates to a target tracking method, a target tracking apparatus, an electronic device, and a storage medium. The target tracking method includes: determining at least one first local area to be used when tracking a target to be tracked; determining an association relation between the first local area and the target to be tracked as a whole; performing local tracking across a plurality of image frames based on the first local area to obtain local tracking information; and obtaining global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relation. By performing local tracking through the first local area, the target tracking method tracks the target more stably and improves the target tracking success rate.

Description

Target tracking method, target tracking device, electronic device, and storage medium
Technical Field
The present invention relates to the field of computer vision technologies, and in particular, to a target tracking method, a target tracking apparatus, an electronic device, and a storage medium.
Background
Target tracking is an important research direction in computer vision. By the number of targets to be tracked, the problem can be divided into single-target tracking and multi-target tracking; by the complexity of the tracking process, it can be divided into short-term tracking, long-term tracking, and so on. Research on target tracking has made many advances and breakthroughs, but many problems remain to be solved. For example, during long-term tracking the target is easily lost.
Disclosure of Invention
In view of the above, it is desirable to provide a target tracking method, a target tracking apparatus, an electronic device, and a storage medium.
The technical scheme of the disclosure is realized as follows:
in one aspect, the present disclosure provides a target tracking method.
The target tracking method provided by the embodiment of the disclosure comprises the following steps:
determining at least one first local area to be used when tracking a target to be tracked;
determining an association relation between the first local area and the target to be tracked as a whole;
performing local tracking across a plurality of image frames based on the first local area to obtain local tracking information;
and obtaining global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relation.
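The four steps above can be sketched as a minimal Python illustration. This is a hedged sketch, not the patent's implementation: it assumes the association relation is a fixed centre offset measured once, and every name (`LocalTrack`, `to_global`, `offsets`) is hypothetical.

```python
from dataclasses import dataclass

@dataclass
class LocalTrack:
    region_id: int
    x: float  # centre of the first local area in this frame
    y: float

def to_global(track, offsets):
    """Step 4: convert a local-region centre to a whole-target centre
    using the stored association (modelled here as a fixed offset)."""
    dx, dy = offsets[track.region_id]
    return (track.x - dx, track.y - dy)

# Steps 1 and 2: one first local area (id 0) whose centre sits 10 px
# above the whole-target centre, so the association offset is (0, -10).
offsets = {0: (0.0, -10.0)}

# Step 3: local tracking yields the region position in each frame.
local_info = [LocalTrack(0, 50.0, 40.0), LocalTrack(0, 55.0, 42.0)]

# Step 4: global tracking information per frame.
global_info = [to_global(t, offsets) for t in local_info]
print(global_info)  # [(50.0, 50.0), (55.0, 52.0)]
```

The only state that must survive across frames is the association (`offsets`); the local tracker can be swapped freely underneath it.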
In some embodiments, the determining at least one first local area for use in tracking the target to be tracked includes:
determining at least one candidate local region for target tracking;
and selecting, as the first local area, at least one candidate local area whose statistical probability of being occluded satisfies a first condition, according to the statistical probability of each candidate local area being occluded.
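The selection step admits a simple sketch in which the "first condition" is that the statistical occlusion probability is below a threshold. All names and numbers here are illustrative assumptions, not from the patent.

```python
def select_first_regions(occlusion_prob, threshold=0.3):
    """occlusion_prob: dict region_name -> statistical probability of
    being occluded, e.g. gathered from statistics over past tracks.
    Returns the candidate regions satisfying the first condition,
    least-occluded first."""
    return [name
            for name, p in sorted(occlusion_prob.items(), key=lambda kv: kv[1])
            if p < threshold]

# Illustrative statistics for a human target: legs are occluded far
# more often than the head, so they fail the first condition.
stats = {"head": 0.05, "torso": 0.20, "legs": 0.60}
print(select_first_regions(stats))  # ['head', 'torso']
```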
In some embodiments, the determining at least one first local area used when tracking the target to be tracked further comprises:
determining the tracking quality of the local tracking information of the target to be tracked based on the first local area whose statistical probability of being occluded satisfies the first condition;
selecting a local area with a tracking quality not lower than a preset quality threshold as the updated first local area.
In some embodiments, the determining at least one candidate local region for target tracking comprises:
determining at least one candidate local area for target tracking according to feature attributes of different local areas of the target to be tracked, wherein the feature attributes include: the number of features and/or the saliency parameter of an individual feature.
In some embodiments, the selecting, as the first local region, at least one candidate local region whose statistical probability of being occluded satisfies a first condition according to the statistical probability of each of the candidate local regions being occluded includes:
selecting, as the first local region, a candidate local region whose statistical probability of being occluded satisfies the first condition and which is not occluded in an initial image frame of the plurality of image frames.
In some embodiments, the number of the first local regions is N, where N is a positive integer equal to or greater than 2 and less than P; p is the number of local areas contained in the target to be tracked;
the performing local tracking between a plurality of image frames based on the first local area to obtain local tracking information includes:
in a plurality of image frames, performing local tracking in the m-th image frame based on the first local areas that are not occluded in the (m-1)-th image frame, to obtain local tracking information of the target to be tracked in the m-th image frame, where the m-th image frame is the next image frame after the (m-1)-th image frame, and m is a positive integer greater than 1.
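The frame-to-frame rule above can be sketched as follows: in frame m, only the first local areas that were unoccluded in frame m-1 are tracked. The tracker and occlusion detector are stubbed out, and all names are assumptions made for the sketch.

```python
def track_sequence(frames, regions, is_occluded, track_one):
    """frames: list of images; regions: first-local-area ids;
    is_occluded(frame, r) -> bool; track_one(frame, r) -> position.
    Returns local tracking info for frames 1..len(frames)-1."""
    results = []
    visible_prev = set(regions)  # assume all first regions visible in frame 0
    for m in range(1, len(frames)):
        # Track in frame m only what was unoccluded in frame m-1.
        info = {r: track_one(frames[m], r) for r in visible_prev}
        results.append(info)
        visible_prev = {r for r in regions if not is_occluded(frames[m], r)}
    return results

occluded_pairs = {(1, "legs")}  # the legs region is occluded in frame 1
out = track_sequence(
    frames=[0, 1, 2],
    regions=["head", "legs"],
    is_occluded=lambda f, r: (f, r) in occluded_pairs,
    track_one=lambda f, r: (f, r),  # stub: "position" is just (frame, region)
)
print(out)
```

Because the legs are occluded in frame 1, only the head is tracked in frame 2, which is exactly the update described in the text.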
In some embodiments, the method further comprises:
when it is detected that some of the first local areas are occluded in one or more image frames, selecting unoccluded local areas as the updated N first local areas according to the occlusion situation.
In some embodiments, the selecting the non-occluded local area as the updated N first local areas includes:
determining a perspective relationship between different targets in each of the image frames; wherein the targets include at least the target to be tracked;
determining a foreground target of the target to be tracked according to the perspective relation;
determining whether the foreground target contains an interfering object that occludes a candidate local area of the target to be tracked;
and selecting the candidate local areas that are not occluded as the updated N first local areas.
In some embodiments, the determining whether there is an interfering object in the foreground target that obstructs the candidate local region in the target to be tracked includes:
determining whether the position of the foreground target overlaps the position of the candidate local area of the target to be tracked;
and if so, determining that the candidate local area is occluded by the interfering object in that image frame.
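The overlap test above can be sketched with axis-aligned boxes. The box format and names are assumptions made for the illustration; the patent does not specify how positions are represented.

```python
def boxes_overlap(a, b):
    """a, b: (x1, y1, x2, y2) axis-aligned boxes. Returns True when the
    two boxes share any area, i.e. the candidate local area would be
    considered occluded by the foreground target."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]

candidate = (10, 10, 30, 30)    # candidate local area of the tracked target
foreground = (25, 25, 50, 50)   # foreground target overlapping its corner
print(boxes_overlap(candidate, foreground))  # True
```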
In some embodiments, the determining a perspective relationship between different targets in each of the image frames, and determining a foreground target of the target to be tracked according to the perspective relationship, includes:
determining the bounding boxes of the different targets;
determining a first distance between the lower edge of the bounding box of the target to be tracked and the lower edge of the image frame, and a second distance between the lower edge of the bounding box of a target other than the target to be tracked and the lower edge of the image frame;
and if the first distance is greater than the second distance, determining that the target corresponding to the second distance is a foreground target of the target to be tracked.
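The depth heuristic above assumes a camera looking along the ground plane: a target whose box's lower edge sits closer to the bottom of the image is closer to the camera and therefore in front. A minimal sketch, with illustrative names and the usual image convention that y grows downward:

```python
def is_foreground(tracked_box, other_box, frame_height):
    """Boxes are (x1, y1, x2, y2) with y growing downward.
    Returns True if other_box belongs to a foreground (potentially
    occluding) target relative to the tracked target."""
    d1 = frame_height - tracked_box[3]  # first distance (tracked target)
    d2 = frame_height - other_box[3]    # second distance (other target)
    return d1 > d2

# The other target's lower edge (y=160) is nearer the image bottom
# (height 200) than the tracked target's (y=100), so it is in front.
print(is_foreground((0, 0, 40, 100), (50, 20, 90, 160), frame_height=200))  # True
```

The heuristic is cheap because it needs only the boxes already produced by detection, with no depth estimation.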
In some embodiments, the obtaining global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relationship includes:
determining the relative position relation of the first local area within the target to be tracked according to the position information of the first local area and the position information of the bounding box of the target to be tracked;
and converting the local tracking information of the first local area in each image frame into global tracking information of the target to be tracked in that image frame according to the relative position relation.
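The conversion step can be sketched as follows, under the assumption that the relative position relation is a set of fixed corner offsets measured once from a frame where both boxes are known. All numbers and names are illustrative.

```python
def local_to_global_box(local_box, rel):
    """local_box: (x1, y1, x2, y2) of the first local area in this frame.
    rel: offsets (dx1, dy1, dx2, dy2) from the region's corners to the
    whole-target bounding box's corners. Returns the whole-target box."""
    return (local_box[0] + rel[0], local_box[1] + rel[1],
            local_box[2] + rel[2], local_box[3] + rel[3])

# Example relation: a head box sits at the top of a person box that
# extends 20 px wider on each side and 140 px further down.
rel = (-20, 0, 20, 140)
print(local_to_global_box((100, 50, 140, 90), rel))  # (80, 50, 160, 230)
```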
In another aspect, the present disclosure provides a target tracking apparatus. The target tracking device provided by the embodiment of the disclosure comprises:
a first processing unit, configured to determine at least one first local area to be used when tracking a target to be tracked;
a second processing unit, configured to determine an association relation between the first local area and the target to be tracked as a whole;
a third processing unit, configured to perform local tracking between a plurality of image frames based on the first local area, to obtain local tracking information;
and a fourth processing unit, configured to obtain global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relation.
In some embodiments, the first processing unit, configured to determine at least one first local area used when tracking the target to be tracked, includes:
the first processing unit is specifically configured to determine at least one candidate local region for target tracking;
and select, as the first local area, at least one candidate local area whose statistical probability of being occluded satisfies a first condition, according to the statistical probability of each candidate local area being occluded.
In some embodiments, the first processing unit is configured to determine at least one first local area used when tracking the target to be tracked, and further includes:
the first processing unit is specifically configured to determine, based on the first local area whose statistical probability of being occluded satisfies the first condition, the tracking quality of the local tracking information of the target to be tracked;
selecting a local area with a tracking quality not lower than a preset quality threshold as the updated first local area.
In some embodiments, the first processing unit, configured to determine at least one candidate local region for target tracking, includes:
the first processing unit is specifically configured to determine at least one candidate local region for target tracking according to feature attributes of different local regions of the target to be tracked, where the feature attributes include: the number of features and/or the saliency parameter of an individual feature.
In some embodiments, the first processing unit, configured to select, as the first local region, at least one candidate local region whose statistical probability of being occluded satisfies a first condition according to the statistical probability that each of the candidate local regions is occluded, includes:
the first processing unit is specifically configured to select, as the first local region, a candidate local region whose statistical probability of being occluded satisfies the first condition and which is not occluded in an initial image frame of the plurality of image frames.
In some embodiments, the number of the first local regions is N, where N is a positive integer equal to or greater than 2 and less than P; p is the number of local areas contained in the target to be tracked;
the third processing unit, configured to perform local tracking between multiple image frames based on the first local area, to obtain local tracking information, includes:
the third processing unit is specifically configured to perform, in a plurality of image frames, local tracking in the m-th image frame based on the first local areas that are not occluded in the (m-1)-th image frame, to obtain local tracking information of the target to be tracked in the m-th image frame, where the m-th image frame is the next image frame after the (m-1)-th image frame, and m is a positive integer greater than 1.
In some embodiments, the target tracking apparatus further comprises a fifth processing unit, configured to, when it is detected that some of the first local areas are occluded in one or more of the image frames, select unoccluded local areas as the updated N first local areas according to the occlusion situation.
In some embodiments, the fifth processing unit, configured to select an unobstructed partial area as the updated N first partial areas, includes:
the fifth processing unit is specifically configured to determine a perspective relationship between different targets in each of the image frames; wherein the targets include at least the target to be tracked;
determining a foreground target of the target to be tracked according to the perspective relation;
determining whether the foreground target contains an interfering object that occludes a candidate local area of the target to be tracked;
and selecting the candidate local areas that are not occluded as the updated N first local areas.
In some embodiments, the determining whether there is an interfering object in the foreground target that obstructs the candidate local region in the target to be tracked includes:
determining whether the position of the foreground target overlaps the position of the candidate local area of the target to be tracked;
and if so, determining that the candidate local area is occluded by the interfering object in that image frame.
In some embodiments, the first processing unit, configured to determine a perspective relationship between different targets in each of the image frames, and determine a foreground target of the target to be tracked according to the perspective relationship, includes:
the first processing unit is specifically configured to determine the bounding boxes of the different targets;
determine a first distance between the lower edge of the bounding box of the target to be tracked and the lower edge of the image frame, and a second distance between the lower edge of the bounding box of a target other than the target to be tracked and the lower edge of the image frame;
and if the first distance is greater than the second distance, determine that the target corresponding to the second distance is a foreground target of the target to be tracked.
In some embodiments, the fourth processing unit is configured to obtain global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relationship, and includes:
the fourth processing unit is specifically configured to determine the relative position relation of the first local area within the target to be tracked according to the position information of the first local area and the position information of the bounding box of the target to be tracked;
and convert the local tracking information of the first local area in each image frame into global tracking information of the target to be tracked in that image frame according to the relative position relation.
In yet another aspect, the present disclosure also provides an electronic device.
The electronic device provided by the embodiments of the present disclosure includes: a processor, and a memory for storing a computer program capable of running on the processor, wherein the processor is configured to execute, when running the computer program, the steps of the target tracking method provided in the foregoing aspect of the embodiments of the present disclosure.
In yet another aspect, the present disclosure also provides a computer-readable storage medium.
The computer readable storage medium provided by the embodiments of the present disclosure has a computer program stored thereon, and the computer program, when executed by a processor, implements the steps of the target tracking method provided by the embodiments of the present disclosure in one aspect.
According to the method and apparatus of the present disclosure, the target to be tracked is tracked locally across a plurality of image frames through at least one first local area, and global tracking information of the target in different image frames is obtained from the resulting local tracking information and the association relation between the first local area and the target to be tracked as a whole. Obtaining local tracking information through local tracking of one or more first local areas, and deriving global tracking information from it, reduces the number of features that must be recognized to keep tracking the target, compared with directly tracking the whole target globally. Moreover, global tracking of the whole target to be tracked requires recognizing and matching its entire area; once part of that area is occluded, the entire area can no longer be matched, the target cannot be recognized, and tracking is lost. Because the whole target covers a larger area than any local area within it, it is more likely to be partially occluded than a local area. By contrast, local tracking through the first local area leaves a smaller area to recognize and match, and for the same obstacle a local area is less likely to be occluded than the global area. Therefore, obtaining global tracking information through local tracking of the first local area tracks the target more stably than direct global tracking of the whole target, improving the target tracking success rate.
Drawings
FIG. 1 is a flowchart illustrating a target tracking method according to an exemplary embodiment;
FIG. 2 is a schematic diagram illustrating an occlusion relationship determination according to an exemplary embodiment;
FIG. 3 is a schematic diagram of a process in which a target to be tracked is lost;
FIG. 4 is a schematic diagram of a possible situation in which an object to be tracked is occluded;
FIG. 5 is a schematic diagram illustrating target tracking learning, according to an exemplary embodiment;
FIG. 6 is an overall flow diagram illustrating a target tracking method according to an exemplary embodiment;
FIG. 7 is a schematic diagram illustrating a target tracking device architecture in accordance with an exemplary embodiment;
fig. 8 is a schematic diagram of an electronic device shown in accordance with an example embodiment.
Detailed Description
The technical solution of the present invention is described in further detail below with reference to the drawings and the specific embodiments of the specification. Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of apparatuses and methods consistent with certain aspects of the invention, as detailed in the appended claims.
When current neural-network-based target tracking algorithms track the target to be tracked as a whole, they must process all pixels and image features of the target in a large number of images; the computation load is excessive and places high demands on processor performance.
The present disclosure provides a target tracking method. FIG. 1 is a flow diagram illustrating a target tracking method according to an exemplary embodiment. As shown in fig. 1, the target tracking method includes:
step 10, determining at least one first local area used when a target to be tracked is tracked;
step 11, determining the incidence relation between the first local area and the whole target to be tracked;
step 12, performing local tracking among a plurality of image frames based on the first local area to obtain local tracking information;
and step 13, obtaining global tracking information of the target to be tracked in different image frames according to the local tracking information and the incidence relation.
In the present exemplary embodiment, the target to be tracked may be a human body, an animal, a vehicle, or the like that can move. The specific target to be tracked is not limited to any one kind.
In the present exemplary embodiment, the first local region may be a local region of the whole target to be tracked in the image frame, and contains recognition features used by the tracking device for tracking and recognition (for example, when the target to be tracked is a human body, the first local region may be a body part in the image, such as the head, a limb, or the torso).
In the present exemplary embodiment, the association relation between the first local region and the whole target to be tracked includes their positional relationship in the image, their relative region relationship, and the like. The positional relationship may include the coordinate relation between the region centre of the first local region and the region centre of the whole target to be tracked; the relative region relationship may include the relation between the outline and area of the first local region and the outline and area of the whole target to be tracked, for example, which region at which position of the whole target the first local region is.
In the exemplary embodiment, the plurality of image frames may come from the same video, and the video may come from road monitoring equipment and/or security monitoring equipment, etc. Of course, these sources of video are merely examples, and specific implementations are not limited to the road monitoring devices and/or security monitoring devices listed here. The security monitoring equipment includes, but is not limited to: surveillance cameras in residential communities, surveillance cameras in office buildings, and monitoring devices in shopping malls, parks, and other non-road public places.
In the exemplary embodiment, global tracking means that the tracking device recognizes and locates the target according to its overall features (i.e., the target to be tracked is treated as a whole). For example, when the target to be tracked is a person, all of the person's features (head, limbs, body, and so on) are recognized and located as one whole. Local tracking means that the tracking device recognizes and locates the target according to some of its features. For example, when the target to be tracked is a person, only the person's head may be recognized and located in order to determine the position of the target to be tracked.
In the present exemplary embodiment, the local tracking information may be position information (including coordinate information) of a tracked local feature in an image obtained based on target local feature tracking and/or region information (including a morphological contour, an area, and the like composed of the local feature) of the local feature.
The global tracking information may be position information (including coordinate information) of the whole target to be tracked in the image and/or region information (including morphological contour, area, etc. composed of the whole features) of the whole target, which is obtained based on the target whole feature tracking.
In the present exemplary embodiment, when the target to be tracked has a conspicuous feature that is easy to track and recognize, at least one first local area (the one carrying that conspicuous feature) may be tracked locally across a plurality of image frames, the local tracking information of this first local area acquired, and the global tracking information of the target to be tracked then obtained from this local tracking information.
In the exemplary embodiment, when the target to be tracked has no conspicuous feature that is easy to track and recognize, a plurality of different first local areas may be tracked synchronously, their local tracking information obtained, and the global tracking information of the target to be tracked then obtained from the local tracking information of the plurality of first local areas, so as to improve the tracking success rate.
According to the embodiments of the present disclosure, the target to be tracked is tracked locally across a plurality of image frames through at least one first local area, and global tracking information of the target to be tracked in different image frames is obtained from the resulting local tracking information and the association relation between the first local area and the target to be tracked as a whole. Obtaining local tracking information through local tracking of one or more first local areas, and deriving global tracking information from it, reduces the number of features that must be recognized to keep tracking the target, compared with directly tracking the whole target globally. Moreover, global tracking of the whole target to be tracked requires recognizing and matching its entire area; once part of that area is occluded, the entire area can no longer be matched, the target cannot be recognized, and tracking is lost. Since the whole target covers a larger area than any local area within it, it is more likely to be partially occluded than a local area. By contrast, local tracking through the first local area leaves a smaller area to recognize and match, and for the same obstacle a local area is less likely to be occluded than the global area. Therefore, obtaining global tracking information through local tracking of the first local area tracks the target more stably than direct global tracking of the whole target to be tracked, thereby improving the target tracking success rate.
In some embodiments, the determining at least one first local area for use in tracking the target to be tracked includes:
determining at least one candidate local region for target tracking;
and selecting, as the first local area, at least one candidate local area whose statistical probability of being occluded satisfies a first condition, according to the statistical probability of each candidate local area being occluded.
In the present exemplary embodiment, the candidate local regions are local regions within the target's global region in the image frame. One or more candidate local regions for local tracking may be determined in advance.
In one embodiment, a plurality of local regions with lower occlusion probability may be selected as candidate local regions based on big data statistics.
In the present exemplary embodiment, the first condition may be that the probability of the local region being occluded is less than a preset probability value. Because different parts of the target to be tracked are occluded with different probabilities as it moves, a candidate local area whose occlusion probability is below the preset value can be determined as a first local area for local tracking. For example, when the target to be tracked is a human body, the legs are generally occluded with a higher probability than the head, so as a candidate local region the legs are more likely to be occluded than the head. Clearly, the head satisfies the first condition more easily as a candidate local region.
In one embodiment, if the target to be tracked has already been tracked continuously for longer than a preset period, one or more local regions that have not been occluded, or have been occluded less often, can be selected as the candidate local regions according to the occlusion statistics of the target to be tracked. In this way, regions better suited as candidate local regions in the current environment can be determined, thereby improving the tracking success rate.
In one embodiment, the size of the target to be tracked is determined according to global tracking information of the target to be tracked in an initial image frame; wherein the initial image frame is a first frame image of the acquired plurality of image frames;
if the size of the target to be tracked is larger than a preset size threshold value, performing local tracking on the plurality of image frames based on a first local area;
if the size of the target to be tracked is not larger than a preset size threshold, performing global tracking on the whole target to be tracked in the plurality of image frames.
In the exemplary embodiment, comparing the size of the target to be tracked in the initial image frame with the preset size threshold gives a preliminary estimate of the target's overall size, and hence of the probability that the target will be lost under global tracking. When the size exceeds the threshold, the target is large, is more likely to be partially occluded during tracking, and is therefore more likely to be lost, so local tracking is the more appropriate choice and improves the tracking success rate. When the size does not exceed the threshold, the target is small, the risk of loss under global tracking is low, and global tracking can be used directly.
In some embodiments, the determining at least one first local area used when tracking the target to be tracked further comprises:
determining the tracking quality of the local tracking information of the target to be tracked based on the first local area of which the occluded statistical probability meets a first condition;
selecting a local area with a tracking quality not lower than a preset quality threshold as the updated first local area.
In the present exemplary embodiment, while local tracking is performed, the tracking quality of the first local region (the region whose statistical occlusion probability satisfies the first condition) may be analyzed at the same time. The tracking quality may be an evaluation value of the tracking effect, indicating the tracking success rate and/or confidence for the target. Both are positively correlated with tracking quality: the higher the success rate, the higher the quality; the higher the confidence, the higher the quality. The tracking success rate can be expressed as the ratio of the number of successfully tracked image frames to the total number of image frames. For example, if there are 100 image frames in total and the target is successfully tracked in 80 of them, the success rate of tracking based on the first local region is 80%.
If the tracking quality is poor, for example because the region's features are indistinct and easily misrecognized, so that the quality falls below the preset quality threshold, a local region whose tracking quality is not lower than the threshold can be reselected from the candidate local regions as the updated first local area, and tracking then continues on the updated region. The preset quality threshold is a parameter value configured in the system for evaluating target tracking quality, for example 50. When the tracking quality is good, for example when the selected region has distinct, easily recognized features, the tracking device captures the target easily and is unlikely to lose it; the quality might then be evaluated as 80, above the preset threshold of 50, and the first local area need not be reselected. If the quality is evaluated as 20, below the threshold of 50, a candidate local region must be reselected as the first local area for local tracking.
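The quality-based reselection described above can be sketched as follows. This is a minimal, hypothetical illustration: the quality metric (frame-level success rate scaled to 0-100) and the threshold value 50 are taken from the example in the text, while the function names and the region representation are assumptions.

```python
# Assumed threshold from the example in the text; a system parameter in practice.
PRESET_QUALITY_THRESHOLD = 50

def tracking_quality(frames_tracked: int, total_frames: int) -> float:
    """Evaluate tracking quality as a success-rate score in [0, 100]."""
    return 100.0 * frames_tracked / total_frames

def update_first_local_region(current_region, candidate_regions, quality_of):
    """Keep the current first local region if its quality meets the threshold;
    otherwise reselect the best-quality candidate local region."""
    if quality_of(current_region) >= PRESET_QUALITY_THRESHOLD:
        return current_region
    return max(candidate_regions, key=quality_of)
```

For the example in the text, a region tracked in 80 of 100 frames scores 80 and is kept, while a region scoring 20 triggers reselection.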
In some embodiments, the determining at least one candidate local region for target tracking comprises:
determining at least one candidate local area for target tracking according to the characteristic attributes of different local areas of the target to be tracked, wherein the characteristic attributes comprise: the number of features and/or the significance parameter of an individual feature.
In the present exemplary embodiment, the candidate local regions may be determined according to the characteristic attributes of different local regions of the target. The characteristic attributes include the number of features and/or the significance parameter of an individual feature. The number of features refers to how many features a local region contains; a region with more features is easier to identify. For example, when the target to be tracked is a human body, the candidate local regions may be the head and the waist. The head contains the facial features both as a whole and individually, the head shape, and the hair style, whereas the waist offers only its size and shape. The significance parameter of a single feature captures distinctive differences in individual features, such as an easily distinguishable scar or pigmented spot among a person's facial features. The more significant a feature, the more recognizable it is. Identifying regions by these characteristic attributes makes tracking easier and more accurate.
In some embodiments, the selecting, as the first local region, at least one candidate local region whose statistical probability of being occluded satisfies a first condition according to the statistical probability of each of the candidate local regions being occluded includes:
selecting, as the first local region, a candidate local region that satisfies the first condition in terms of the statistical probability of the candidate local region being occluded and is not occluded in an initial image frame among the plurality of image frames.
In the present exemplary embodiment, the first condition may be that the probability that a local region is occluded is less than a preset probability value. Because different local regions are occluded with different probabilities as the target moves, a candidate local region whose occlusion probability is below the preset value can be determined as the first local area for local tracking. For example, when the target to be tracked is a human body, the legs are generally occluded with a higher probability than the head, so the head, as a candidate local region, more easily satisfies the first condition. Taking as the first local area a candidate local region that both satisfies the first condition and is not occluded in the initial image frame of the plurality of image frames improves the accuracy of target tracking.
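The two-part selection rule above can be sketched in a few lines. This is an illustrative sketch: the region names, the probability values, and the preset probability value of 0.3 are assumptions, not values from the disclosure.

```python
def select_first_local_regions(candidates, occlusion_prob, occluded_in_initial,
                               preset_prob=0.3):
    """Select candidate local regions whose statistical occlusion probability
    satisfies the first condition (below preset_prob) AND that are not
    occluded in the initial image frame."""
    return [r for r in candidates
            if occlusion_prob[r] < preset_prob and not occluded_in_initial[r]]
```

With the human-body example, the head (low occlusion probability, visible in the initial frame) would be selected while the legs would not.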
In some embodiments, the number of the first local regions is N, where N is a positive integer equal to or greater than 2 and less than P; p is the number of local areas contained in the target to be tracked;
the performing local tracking between a plurality of image frames based on the first local area to obtain local tracking information includes:
in the plurality of image frames, performing local tracking in the m-th image frame based on the first local area that is not occluded in the (m-1)-th image frame, to obtain local tracking information of the target to be tracked in the m-th image frame, where the m-th image frame is the image frame following the (m-1)-th image frame and m is a positive integer greater than 1.
In the present exemplary embodiment, there are multiple first local areas, and tracking the multiple first local areas synchronously improves the accuracy of target tracking.
In some embodiments, the method further comprises:
when it is detected that some of the first local areas are occluded in one or more image frames, selecting unoccluded local areas as the updated N first local areas according to the occlusion condition.
In the present exemplary embodiment, if a first local area becomes occluded in one or more image frames during tracking, continuing to track with that area makes the target easy to lose. Accordingly, one or more local areas can be selected from the unoccluded local regions of the target to replace the first local area, improving the tracking success rate. When there are multiple first local areas, they can be updated synchronously in the same way.
In some embodiments, the selecting the non-occluded local area as the updated N first local areas includes:
determining a perspective relationship between different objects in each of the image frames; wherein the target at least comprises the target to be tracked;
determining a foreground target of the target to be tracked according to the perspective relation;
determining whether there is, among the foreground targets, an interfering object that occludes a candidate local area of the target to be tracked;
and selecting the candidate local areas which are not blocked as the updated N first local areas.
In the present exemplary embodiment, the images used for target tracking may be acquired by a camera, radar detection, or the like. Whether a candidate local region of the tracked target is occluded by an interfering object is determined by analyzing the perspective relationship among the targets in the image. An interfering object is any target in the image other than the target to be tracked, including: people other than the tracked person, moving vehicles, stationary structures (including fences, etc.), and the like.
In the present exemplary embodiment, the perspective relationship refers to the spatial positions, outlines, and projections of the targets in the field of view, including relative rules such as near objects appearing larger and far objects smaller, near objects taller and far objects shorter, near objects sparser and far objects denser, and near objects wider and far objects narrower. In the present exemplary embodiment, the spatial positions and outlines of each target within the field of view, and of each candidate local region within a target, can be captured from images and photographs taken by the tracking device (e.g., a camera device).
In some embodiments, the determining whether there is an interfering object in the foreground target that obstructs the candidate local region in the target to be tracked includes:
determining whether the position of the foreground target overlaps the position of a candidate local area in the target to be tracked;
if so, determining that the candidate local area is occluded by the interfering object in the image frame.
In the present exemplary embodiment, a foreground target is a target that is closer to the tracking apparatus than the target to be tracked, taking the tracking apparatus as the reference point. The foreground targets are one or more of the interfering objects; that is, any interfering object may become a foreground target.
In the exemplary embodiment, if the position of the foreground target overlaps the position of a candidate local area in the target to be tracked, the interfering object at the foreground target's position blocks the tracking device's view of the candidate local area, and the candidate local area is therefore occluded by the interfering object in that image frame.
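The overlap test described above can be sketched as a simple axis-aligned bounding-box check. This is an illustrative sketch under the assumption that circumscribed frames are rectangles given as (x, y, width, height) with (x, y) the upper-left corner; the function names are hypothetical.

```python
def boxes_overlap(a, b):
    """Axis-aligned overlap test for boxes given as (x, y, w, h),
    with (x, y) the upper-left corner in image coordinates."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    return ax < bx + bw and bx < ax + aw and ay < by + bh and by < ay + ah

def region_occluded(foreground_box, candidate_box):
    """A candidate local region counts as occluded in a frame when a
    foreground target's circumscribed frame overlaps its position."""
    return boxes_overlap(foreground_box, candidate_box)
```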
In some embodiments, the determining a perspective relationship between different targets in each of the image frames, and determining a foreground target of the target to be tracked according to the perspective relationship, includes:
determining the external frames of the different targets;
determining a first distance between the lower edge of the outer frame of the target to be tracked and the lower edge of the image frame, and determining a second distance between the lower edge of the outer frame of the target other than the target to be tracked and the lower edge of the image frame;
and if the first distance is greater than the second distance, determining that the target corresponding to the second distance is a foreground target of the target to be tracked.
In the present exemplary embodiment, fig. 2 is a schematic diagram illustrating occlusion-relation determination according to an exemplary embodiment. As shown in fig. 2, in the acquired image frame, when the first distance is greater than the second distance, the interfering object is closer to the tracking device than the target to be tracked (the occluded target in the figure), so the interfering object is a foreground target of the target. If, in addition, the position of the foreground target overlaps the position of the target to be tracked, the interfering object occludes the target. In the present exemplary embodiment, the circumscribed frame is the feature-capture frame delineated around the tracked feature (global or local) when the tracking apparatus captures the target; the circumscribed frame may be of any shape, such as a rectangle or a circle.
In some embodiments, the obtaining global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relationship includes:
determining the relative position relation of the first local area in the target to be tracked according to the position information of the first local area and the position information of the outer frame of the target to be tracked;
and converting the local tracking information of the first local area in each image frame into global tracking information of the target to be tracked in each image frame according to the relative position relation.
In the present exemplary embodiment, after the local tracking information is obtained based on the first local area tracking, the global tracking information of the target in different image frames is obtained according to the local tracking information and the association relationship between the first local area and the entire target.
For example, local tracking yields the local tracking information of the first local region in the m-th frame image, including the position coordinates of the features of the first local region in that frame, for example the upper-left corner of the circumscribed frame delineating the local feature: (X'_local, Y'_local), where X'_local and Y'_local are its coordinates in the X-axis and Y-axis directions in the m-th frame image;
determining the position coordinates of the upper-left corner of the circumscribed frame delineating the overall features of the target to be tracked in the (m-1)-th frame image: (X_global, Y_global), where X_global and Y_global are its coordinates in the X-axis and Y-axis directions in the (m-1)-th frame image;
determining the position coordinates of the upper-left corner of the circumscribed frame delineating the first local area in the (m-1)-th frame image: (X_local, Y_local), where X_local and Y_local are its coordinates in the X-axis and Y-axis directions in the (m-1)-th frame image;
determining the relative position relationship within the target of the first local area that is not occluded in the (m-1)-th frame image: X_diff = X_global - X_local, Y_diff = Y_global - Y_local, the coordinate differences in the X-axis and Y-axis directions between the upper-left corner of the global circumscribed frame and the upper-left corner of the local circumscribed frame in the (m-1)-th frame image;
determining the global tracking information of the target to be tracked in the m-th frame image: X'_global = X'_local + X_diff, Y'_global = Y'_local + Y_diff, where X'_global and Y'_global are the coordinates of the upper-left corner of the circumscribed frame delineating the overall features of the target in the X-axis and Y-axis directions in the m-th frame image.
The width W_global and height H_global of the circumscribed frame delineating the overall features of the target to be tracked in the m-th frame image are taken to be the same as in the (m-1)-th frame image.
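The local-to-global coordinate conversion above can be sketched as follows. Box positions are the upper-left corners of the circumscribed frames, written as (x, y) tuples; the numeric values in the usage note are illustrative, not from the disclosure.

```python
def local_to_global(local_m, global_m1, local_m1):
    """Recover the global box position in frame m from the local box
    position in frame m (local_m), using the offset between the global
    and local boxes observed in frame m-1 (global_m1, local_m1)."""
    x_diff = global_m1[0] - local_m1[0]  # X_diff = X_global - X_local
    y_diff = global_m1[1] - local_m1[1]  # Y_diff = Y_global - Y_local
    return (local_m[0] + x_diff, local_m[1] + y_diff)
```

For instance, if the global box was at (40, 10) and the local box at (50, 20) in frame m-1, the offset is (-10, -10); a local box tracked to (55, 22) in frame m then gives a global position of (45, 12).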
In the present exemplary embodiment, the occlusion relationship between targets, or between a target and the environment, may also be determined using radar, binocular vision, and the like, and the unoccluded target region may be determined more accurately by combining algorithms such as background modeling and optical flow. The tracking algorithm may likewise be replaced with other feature models, deep learning methods, and so on.
Meanwhile, in the field of computer vision, when a target is tracked as a whole, the tracked target is easily lost due to occlusion. Fig. 3 is a schematic diagram of a tracking-target loss process, showing an initial state, an occlusion state, and a tracking-failure state. In the initial state, the tracking device can track the target, which is unoccluded. In the occlusion state, the tracking device finds that the target is occluded; the target is then lost, the target frame (circumscribed frame) is left behind on the foreground target, and tracking fails. Fig. 4 is a schematic diagram of situations in which the target to be tracked may be occluded. As shown in fig. 4, the target may be occluded by a human body, by a fixed obstacle, by a vehicle, and so on.
According to the present disclosure, an unoccluded candidate local region can be selected as the first local area for locally tracking the target, and the global tracking information of the target in different image frames is obtained from the resulting local tracking information and the association relationship between the first local area and the entire target to be tracked. This reduces the risk of losing the target when it is partially occluded while being tracked as a whole: if whole-target tracking were continued under partial occlusion, the tracking device could not recognize the target as a whole, would judge that the target is absent from the next image frame, and the target would be lost.
FIG. 5 is a schematic diagram illustrating target tracking learning, according to an example embodiment. As shown in fig. 5, tracking-learning region weights for the target to be tracked are shown for both the whole body and the upper body. When the overall features of the target are tracked, the core learning region lies at the center of the target, i.e., the feature-learning weight is highest at the center and weakest at the edges. When the learning region is the whole body, the core region is easily occluded and the target is easily lost. When the learning region is the upper half of the body, the core region is less likely to be occluded, because the lower half of the body is occluded with higher probability, and the target is not easily lost. If the human body is tracked as a whole, the core learning region lies at the waist and abdomen; because of the alternating motion of the legs and the occlusion of the lower body by short objects, the leg regions are easily occluded and vary greatly, which is unfavorable for tracking. Tracking the upper body instead moves the core learning region to the chest, making stable tracking easier. This mitigates the unfavorable effect of the rapid motion of the legs and handles occlusion by common objects such as fences, non-motor vehicles, and low cars.
FIG. 6 is an overall flow diagram illustrating a target tracking method according to an exemplary embodiment. As shown in fig. 6, the target tracking method includes:
video frame analysis: the m-1 th frame image (detection frame image) and the m-th frame image (tracking frame image) are acquired.
Analyzing the detection frame image: in the (m-1)-th frame image, acquire the position of the circumscribed frame of the target to be tracked (for example, the coordinates of its upper-left corner) and judge the occlusion relationship; select a local area for local tracking (acquiring the coordinates of the upper-left corner of the local area's circumscribed frame); and have the target tracking model learn to track the local area.
Tracking frame image analysis: track the new position of the local area according to the local-area model, and obtain the global position from the new position of the local area together with the position and scaling relationship between the local area and the global area in the detection frame image.
The task ends when all frame images have been processed; otherwise, video frame analysis continues.
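The overall flow of fig. 6 can be sketched as a per-frame loop. This is a hypothetical skeleton: the three callbacks stand in for the detection step, the local-area selection step, and the local-area tracking model, and all names and signatures are assumptions.

```python
def track_video(frames, detect_global, select_local_region, track_local):
    """Flow sketch of Fig. 6: detect the global and local circumscribed
    frames in the detection frame, store their offset, then for each
    tracking frame track the local area and recover the global position.
    All three callbacks are hypothetical placeholders returning (x, y)
    upper-left-corner coordinates."""
    prev = frames[0]                      # detection frame (frame m-1)
    gx, gy = detect_global(prev)          # global box in detection frame
    lx, ly = select_local_region(prev)    # local box in detection frame
    dx, dy = gx - lx, gy - ly             # offset between global and local
    results = []
    for frame in frames[1:]:              # tracking frames (frame m, ...)
        nlx, nly = track_local(frame)     # new local position
        results.append((nlx + dx, nly + dy))  # recovered global position
    return results
```

In a full implementation the offset would be re-estimated whenever a new detection frame is processed, as the flow in fig. 6 loops back to video frame analysis.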
The models used in the method may be conventional models commonly used or existing in the field, for example, the target tracking model and the local area model may be various target tracking models and local area models existing in the field, respectively, and the updating of the related models is also an existing updating method, which is not described in detail herein.
In another aspect, the present disclosure provides a target tracking apparatus. FIG. 7 is a schematic diagram illustrating a target tracking device architecture according to an exemplary embodiment. As shown in fig. 7, the target tracking apparatus provided in the embodiment of the present disclosure includes:
a first processing unit 71 configured to determine at least one first local area used when tracking a target to be tracked;
a second processing unit 72, configured to determine an association relationship between the first local area and the entire target to be tracked;
a third processing unit 73, configured to perform local tracking between a plurality of image frames based on the first local area, and obtain local tracking information;
and a fourth processing unit 74, configured to obtain global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relationship.
In the present exemplary embodiment, the target to be tracked may be a human body, an animal, a vehicle, or anything else that can move; the target is not limited to any particular kind.
In the present exemplary embodiment, the first local region may be a local region of the whole target in the image frame that contains a recognition feature for the tracking device to track and recognize (for example, when the target is a human body, the first local region may be a certain body part of the human body in the image, such as the head, a limb, or the torso).
In the present exemplary embodiment, the association relationship between the first local region and the entire target to be tracked includes a positional relationship and a relative local region relationship in the image between the first local region and the entire target to be tracked, and the like. The position relation can include a coordinate relation between the area center coordinate of the first local area and the area center coordinate of the whole target to be tracked; the relative local region relationship may include a relative relationship between the outline and the area of the first local region and the outline and the area of the entire target to be tracked, for example, which region of which position of the entire target the first local region is.
In the exemplary embodiment, the plurality of image frames may come from the same video, and the video may come from road monitoring equipment and/or security monitoring equipment, etc. Of course, the source of the video is only illustrative here, and specific implementations are not limited to these examples. The security monitoring equipment includes but is not limited to: monitoring cameras in residential communities, monitoring cameras in office buildings, monitoring devices in shopping malls, and/or monitoring devices in non-road public places such as parks.
In the exemplary embodiment, global tracking is that the tracking device identifies and locates according to the overall characteristics of the target (i.e., the target to be tracked as a whole). For example, when the target to be tracked is a person, all the characteristics (including the head, the limbs, the body and the like) of the person are seen as a whole to be identified and positioned. The local tracking is that the tracking device identifies and positions the target according to partial characteristics of the target. For example, when the target to be tracked is a person, only the head of the person can be identified and positioned, so as to determine the position of the target.
In the present exemplary embodiment, the local tracking information may be position information (including coordinate information) of a tracked local feature in an image obtained based on target local feature tracking and/or region information (including a morphological contour, an area, and the like composed of the local feature) of the local feature.
The global tracking information may be position information (including coordinate information) of the entire target in the image and/or region information (including a morphological contour, an area, and the like composed of the global features) of the entire target, which is obtained based on the tracking of the global features of the target.
In the present exemplary embodiment, when the target has an obvious feature that is easy to track and recognize, local tracking may be performed across the plurality of image frames on at least one first local area (one with such an obvious, easily recognized feature); the local tracking information of this first local area is acquired, and the global tracking information of the target is then obtained from it.
In the exemplary embodiment, when the target has no single feature that is obvious and easy to track and identify, multiple different first local areas may be tracked synchronously; the local tracking information of the multiple first local areas is obtained, and the global tracking information of the target is then obtained from it, thereby improving the tracking success rate.
According to the present disclosure, the target to be tracked is locally tracked across the image frames through the at least one first local area, and the global tracking information of the target in different image frames is obtained from the resulting local tracking information and the association relationship between the first local area and the entire target. Compared with directly tracking the entire target globally, obtaining global tracking information from local tracking of one or more first local areas reduces the number of features that must be identified for the target to be trackable. Moreover, global tracking of the whole target requires the entire target region to be identified and matched; once part of the region is occluded, the whole cannot be matched, the target cannot be identified, and it is lost. Because the whole target covers a larger area than any of its local parts, its probability of being occluded is higher; for the same obstacle, a local region is less likely to be occluded than the global region, and the area to be identified and matched is smaller. Therefore, acquiring global tracking information through local tracking of the first local area makes target tracking more stable than direct global tracking of the whole target, improving the tracking success rate.
In some embodiments, the first processing unit, configured to determine at least one first local area used when tracking the target to be tracked, includes:
the first processing unit is specifically configured to determine at least one candidate local region for target tracking;
and selecting at least one candidate local area with the blocked statistical probability meeting a first condition as the first local area according to the blocked statistical probability of each candidate local area.
In the present exemplary embodiment, the candidate local regions are local regions within the global region of the target to be tracked in the image frame. One or more candidate local regions for local tracking may be determined in advance.
In one embodiment, a plurality of parts with lower occlusion probability can be selected as the candidate local regions based on big data statistics.
In the present exemplary embodiment, the first condition may be that the probability that the local region is occluded is less than a preset probability value. During the movement of the target, different parts are occluded with different probabilities, so a candidate local area whose occlusion probability is below the preset value can be selected as the first local area for local tracking. For example, when the target to be tracked is a human body, the legs are generally occluded with a higher probability than the head; as a candidate local region, the head therefore satisfies the first condition more easily than the legs.
In one embodiment, if the target to be tracked has been continuously tracked for longer than a preset duration, one or more parts that have not been occluded, or have been occluded less frequently, can be selected as the candidate local areas according to the occlusion statistics of the target to be tracked. In this way, regions better suited as candidate local areas in the current environment can be determined, thereby improving the tracking success rate.
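As an illustrative sketch (in Python, with hypothetical region names and probability values not given in the text), the first condition can be applied as a simple filter over the occlusion statistics:

```python
def select_first_local_regions(occlusion_prob, threshold):
    """Return candidate regions whose statistical occlusion probability
    is below the preset probability value (the first condition),
    least-occluded first. Region names are illustrative assumptions."""
    return [region
            for region, p in sorted(occlusion_prob.items(), key=lambda kv: kv[1])
            if p < threshold]

# Hypothetical statistics for a human-body target: the legs are
# occluded far more often than the head, as the text describes.
candidates = {"head": 0.10, "torso": 0.25, "legs": 0.60}
print(select_first_local_regions(candidates, 0.30))  # → ['head', 'torso']
```

The threshold of 0.30 here is an assumed preset probability value; in practice it would come from the system configuration described above.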
In some embodiments, the first processing unit is configured to determine at least one first local area used when tracking the target to be tracked, and further includes:
the first processing unit is specifically configured to determine, based on the first local area where the occluded statistical probability satisfies a first condition, a tracking quality of local tracking information of the target to be tracked;
selecting a local area with a tracking quality not lower than a preset quality threshold as the updated first local area.
In the present exemplary embodiment, while local tracking is performed, the tracking quality of the first local region whose statistical occlusion probability satisfies the first condition may be analyzed at the same time. The tracking quality may be an evaluation value of the tracking effect, indicating the tracking success rate and/or the confidence of the target. Both are positively correlated with tracking quality: the higher the tracking success rate, the higher the tracking quality, and the higher the confidence, the higher the tracking quality. The tracking success rate can be expressed as the ratio of the number of successfully tracked image frames to the total number of image frames. For example, if there are 100 image frames in total and the target is tracked in 80 of them, the success rate of tracking the target based on the first local region is 80%.
If the tracking quality is poor, for example when the features are not distinctive and are easily misidentified, so that the quality falls below the preset quality threshold, a local area whose tracking quality is not lower than the threshold can be reselected from the candidate local areas as the updated first local area, and tracking then continues on the updated first local area. The preset quality threshold is a parameter configured in the system for evaluating target tracking quality, for example a value of 50. When the tracking quality is good, for example when the selected local features are distinctive and easy to identify, the tracking device captures the target easily and is unlikely to lose it; the quality may then be evaluated as, say, 80, above the threshold of 50, and the first local area need not be reselected. If the quality is evaluated as 20, below the threshold of 50, a candidate local area must be reselected as the first local area for local tracking.
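A minimal sketch of the quality-based reselection described above, using the example threshold of 50 from the text; the candidate quality scores are hypothetical:

```python
def update_first_local_region(current_region, quality, candidate_quality,
                              quality_threshold=50):
    """Keep the current first local area while its tracking quality is
    at or above the preset threshold; otherwise reselect the
    best-quality candidate that meets the threshold."""
    if quality >= quality_threshold:
        return current_region
    eligible = {r: q for r, q in candidate_quality.items()
                if q >= quality_threshold}
    if not eligible:
        return current_region  # no better candidate; keep tracking as-is
    return max(eligible, key=eligible.get)
```

With a quality of 80 the region is kept; with a quality of 20 the best eligible candidate replaces it, mirroring the 80/20-versus-50 example in the text.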
In some embodiments, the first processing unit, configured to determine at least one candidate local region for target tracking, includes:
the first processing unit is specifically configured to determine at least one candidate local region for target tracking according to a feature attribute of different local regions of the target to be tracked, where the feature attribute includes: the number of features and/or the significance parameter of an individual feature.
In the present exemplary embodiment, the candidate local region may be determined according to the feature attributes of different local regions of the target. The feature attributes include: the number of features and/or the significance parameter of an individual feature. The number of features is the number of features contained in a part; the more features a part contains, the easier it is to identify. For example, when the target to be tracked is a human body, the local regions may be the head and the waist. As a local region, the head contains the facial features both as a whole and individually, as well as the head-shape feature and the hairstyle feature; by contrast, the waist contains only the waist size and the waist shape. The significance parameter of a single feature captures conspicuous differences in individual features, which can serve as significance parameters, such as an easily distinguishable scar or pigmented spot among a person's facial features. The more significant a feature, the more recognizable it is. Identifying such feature attributes makes tracking easier and more accurate.
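The feature-attribute idea can be sketched as a toy scoring function; the multiplicative weighting and the numeric inputs are assumptions for illustration only, not part of the method:

```python
def region_identifiability(num_features, significance):
    """Illustrative feature-attribute score: regions with more features
    and with more significant individual features are easier to
    identify. The combination rule is an assumed weighting."""
    return num_features * (1.0 + significance)

# Hypothetical counts: head (facial features, head shape, hairstyle...)
# versus waist (size and shape only), as in the text's example.
head_score = region_identifiability(num_features=6, significance=0.8)
waist_score = region_identifiability(num_features=2, significance=0.1)
```

Under these assumed inputs the head scores higher and would be preferred as a candidate local region, consistent with the example above.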
In some embodiments, the first processing unit, configured to select, as the first local region, at least one candidate local region whose statistical probability of being occluded satisfies a first condition according to the statistical probability that each of the candidate local regions is occluded, includes:
the first processing unit is specifically configured to select, as the first local region, a candidate local region, where the statistical probability of being occluded satisfies the first condition and an initial image frame in the plurality of image frames is not occluded, according to the statistical probability of the candidate local region being occluded.
In the present exemplary embodiment, the first condition may be that the probability that the local region is occluded is less than a preset probability value. Because each local area is occluded with a different probability as the target to be tracked moves, a candidate local area whose occlusion probability is below the preset value can be selected as the first local area for local tracking. For example, when the target to be tracked is a human body, the legs are generally occluded with a higher probability than the head, so as a candidate local region the head satisfies the first condition more easily than the legs. Taking as the first local area a candidate local area whose statistical probability satisfies the first condition and which is not occluded in the initial image frame of the plurality of image frames improves the accuracy of target tracking.
In some embodiments, the number of the first local regions is N, where N is a positive integer equal to or greater than 2 and less than P; p is the number of local areas contained in the target to be tracked;
the third processing unit, configured to perform local tracking between multiple image frames based on the first local area, to obtain local tracking information, includes:
the third processing unit is specifically configured to, in a plurality of image frames, perform local tracking in an m-th image frame based on the first local area that is not occluded in the (m-1)-th image frame, to obtain local tracking information of the target to be tracked in the m-th image frame, where the m-th image frame is the image frame following the (m-1)-th image frame, and m is a positive integer greater than 1.
In the present exemplary embodiment, there are a plurality of first local areas, and tracking the plurality of first local areas synchronously can improve the accuracy of target tracking.
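The per-frame rule above (track in frame m only the first local areas that were unoccluded in frame m-1) might be sketched as follows; `track_region_in_frame` is a hypothetical pluggable matcher standing in for the actual local tracker, not part of the original method:

```python
def track_locally(num_frames, first_regions, track_region_in_frame):
    """Frame-by-frame local tracking sketch. track_region_in_frame(m,
    region) -> (box, occluded) is an assumed interface returning the
    region's box in frame m and whether it is occluded there."""
    per_frame_info = {}
    unoccluded_prev = list(first_regions)
    for m in range(1, num_frames):
        info, unoccluded = {}, []
        for region in unoccluded_prev:
            box, occluded = track_region_in_frame(m, region)
            info[region] = box
            if not occluded:
                unoccluded.append(region)
        per_frame_info[m] = info
        # If every region became occluded, fall back to the full set
        # rather than stopping (a design assumption, not from the text).
        unoccluded_prev = unoccluded if unoccluded else list(first_regions)
    return per_frame_info
```

Regions occluded in frame m-1 are skipped in frame m, which is the occlusion-avoidance behavior the embodiment describes.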
In some embodiments, the target tracking apparatus further comprises a fifth processing unit, configured to, when it is detected that part of the first local areas are occluded in one or more of the image frames, select, according to the occlusion condition, unoccluded local areas as the updated N first local areas.
In the present exemplary embodiment, if the first local area becomes occluded in one or more image frames during tracking, continuing to track with that first local area makes the target easy to lose. Based on this, one or more local areas can be selected from the unoccluded parts of the target to update the first local area and improve the target tracking success rate. When there are multiple first local areas, the multiple first local areas can be updated correspondingly and synchronously.
In some embodiments, the fifth processing unit, configured to select an unobstructed partial area as the updated N first partial areas, includes:
the fifth processing unit is specifically configured to determine a perspective relationship between different targets in each of the image frames; wherein the target at least comprises the target to be tracked;
determining a foreground target of the target to be tracked according to the perspective relation;
determining whether an interfering object which obstructs the alternative local area in the target to be tracked exists in the foreground target;
and selecting the candidate local areas which are not blocked as the updated N first local areas.
In the present exemplary embodiment, the image frames for target tracking may be acquired by a camera, radar detection, or the like. Whether a candidate local area of the tracked target is occluded by an interfering object is determined by analyzing the perspective relation of the targets in the image. An interfering object is any object in the image other than the tracked target, including: persons other than the tracked person, moving vehicles, stationary installations (including fences and the like), and so on.
In the present exemplary embodiment, the perspective relation describes the spatial position, outline, and projection of each target in the field of view, and includes relative rules such as near objects appearing larger and far objects smaller, near objects higher and far objects lower, near objects sparser and far objects denser, and near objects wider and far objects narrower. In the present exemplary embodiment, information such as the spatial position and outline of each target within the field of view, and of each candidate local region within a target, can be obtained from images and photographs captured by the tracking device (e.g., a camera device).
In some embodiments, the determining whether there is an interfering object in the foreground target that obstructs the candidate local region in the target to be tracked includes:
determining whether the position of the foreground target is overlapped with the position of the alternative local area in the target to be tracked;
if yes, the fact that the candidate local area is shielded by the interference object in the image frame is determined.
In the present exemplary embodiment, the foreground target refers to a target that is closer to the tracking apparatus than the tracked target, taking the tracking apparatus as reference. A foreground target is one or more of the interfering objects; that is, any interfering object may become a foreground target.
In the present exemplary embodiment, when the position of the foreground target overlaps with the position of a candidate local area of the target, the interfering object at the foreground target's position blocks the tracking device's view of that candidate local area, so the candidate local area is occluded by the interfering object in the image frame.
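The position-overlap test can be sketched as a standard axis-aligned box intersection check; the `(x1, y1, x2, y2)` box format is an assumption for illustration:

```python
def boxes_overlap(a, b):
    """Axis-aligned overlap test on (x1, y1, x2, y2) boxes: the
    candidate local area is treated as occluded in the frame when the
    foreground target's box overlaps the candidate area's box."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    return ax1 < bx2 and bx1 < ax2 and ay1 < by2 and by1 < ay2
```

If `boxes_overlap(foreground_box, candidate_box)` is true, the candidate local area would be marked as occluded by the interfering object in that image frame.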
In some embodiments, the first processing unit, configured to determine a perspective relationship between different targets in each of the image frames, and determine a foreground target of the target to be tracked according to the perspective relationship, includes:
the first processing unit is specifically configured to determine the circumscribed frames of the different targets;
determining a first distance between the lower edge of the outer frame of the target to be tracked and the lower edge of the image frame, and determining a second distance between the lower edge of the outer frame of the target other than the target to be tracked and the lower edge of the image frame;
and if the first distance is greater than the second distance, determining that the target corresponding to the second distance is a foreground target of the target to be tracked.
In the present exemplary embodiment, fig. 2 is a schematic diagram illustrating occlusion-relation determination according to an exemplary embodiment. As shown in fig. 2, in the acquired image frame, when the first distance is greater than the second distance, the distance between the interfering object and the tracking device is smaller than the distance between the tracked target (the occluded target in the drawing) and the tracking device; in this case the interfering object is a foreground target of the tracked target. If, at the same time, the position of the foreground target overlaps with the position of the tracked target, the tracked target is occluded by the interfering object. In the present exemplary embodiment, the circumscribed frame is a feature-capture frame drawn around the tracked feature (global or local) when the tracking apparatus captures the target; the circumscribed frame may be of any shape, such as a rectangle or a circle.
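The lower-edge comparison can be sketched as follows, assuming image coordinates with y growing downward and circumscribed frames given as `(x1, y1, x2, y2)` boxes (both conventions are assumptions, not stated in the text):

```python
def is_foreground(frame_height, tracked_box, other_box):
    """Lower-edge rule sketch: the first distance is from the tracked
    target's box bottom edge to the frame's lower edge; the second is
    from the other target's box bottom edge. If the first distance is
    greater, the other target is nearer to the camera, i.e. a
    foreground target of the tracked target."""
    first_distance = frame_height - tracked_box[3]
    second_distance = frame_height - other_box[3]
    return first_distance > second_distance
```

For example, in a 480-pixel-high frame, a target whose box bottom sits at y = 420 is foreground relative to a tracked target whose box bottom sits at y = 300.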
In some embodiments, the fourth processing unit is configured to obtain global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relationship, and includes:
the fourth processing unit is specifically configured to determine a relative position relationship of the first local area in the target to be tracked according to the position information of the first local area and the position information of the circumscribed frame of the target to be tracked;
and converting the local tracking information of the first local area in each image frame into global tracking information of the target to be tracked in each image frame according to the relative position relation.
In the present exemplary embodiment, after the local tracking information is obtained by tracking the first local area, the global tracking information of the target to be tracked in different image frames is obtained from the local tracking information and the association relation between the first local area and the whole target to be tracked.
For example, suppose the tracked local tracking information of the first local area in the m-th frame image includes the position coordinates of the first local area's features in the m-th frame image, e.g. the top-left corner of the circumscribed frame enclosing the local features: (X'_local, Y'_local), where X'_local and Y'_local are the coordinates of that corner in the X-axis and Y-axis directions in the m-th frame image.
Determine the position coordinates of the top-left corner of the circumscribed frame enclosing the overall features of the target to be tracked in the (m-1)-th frame image: (X_global, Y_global), where X_global and Y_global are the coordinates of that corner in the X-axis and Y-axis directions in the (m-1)-th frame image.
Determine the position coordinates of the top-left corner of the circumscribed frame enclosing the first local area in the (m-1)-th frame image: (X_local, Y_local), defined analogously.
Determine the relative position of the unoccluded first local area within the target in the (m-1)-th frame image: X_diff = X_global - X_local, Y_diff = Y_global - Y_local; that is, X_diff and Y_diff are the coordinate differences, in the X-axis and Y-axis directions, between the top-left corner of the circumscribed frame enclosing the overall features of the target and the top-left corner of the circumscribed frame enclosing the first local area in the (m-1)-th frame image.
Determine the global tracking information of the target to be tracked in the m-th frame image: X'_global = X'_local + X_diff, Y'_global = Y'_local + Y_diff, where (X'_global, Y'_global) is the top-left corner of the circumscribed frame enclosing the overall features of the target to be tracked in the m-th frame image.
Determine the width W_global and height H_global of the circumscribed frame enclosing the overall features of the target to be tracked in the (m-1)-th frame image; the width and height of that circumscribed frame in the m-th frame image are taken to be the same as W_global and H_global in the (m-1)-th frame image.
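The local-to-global coordinate conversion above can be sketched directly in code (corners as `(x, y)` tuples; the function and argument names are illustrative):

```python
def local_to_global(local_tl_m, global_tl_prev, local_tl_prev, global_size_prev):
    """Convert the local top-left corner tracked in frame m into the
    global circumscribed frame for frame m, using the offset measured
    in frame m-1 and carrying the global box size (W, H) over
    unchanged, as the embodiment describes."""
    # Relative position of the first local area within the target,
    # measured in frame m-1.
    x_diff = global_tl_prev[0] - local_tl_prev[0]
    y_diff = global_tl_prev[1] - local_tl_prev[1]
    # Apply the same offset to the local corner found in frame m.
    x_global = local_tl_m[0] + x_diff
    y_global = local_tl_m[1] + y_diff
    w, h = global_size_prev
    return (x_global, y_global, w, h)
```

For instance, if in frame m-1 the global corner is (100, 50) and the local corner (120, 60), the offset is (-20, -10); a local corner of (130, 70) tracked in frame m then yields a global corner of (110, 60) with the previous width and height retained.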
In the present exemplary embodiment, the occlusion relation between targets, or between a target and the environment, may also be determined using radar, binocular vision, or the like, and the unoccluded target area can be determined more accurately by combining algorithms such as background modeling and optical flow. The tracking algorithm may likewise be replaced with other feature models, deep learning, and so on.
The present disclosure also provides an electronic device. Fig. 8 is a schematic diagram of an electronic device shown in accordance with an example embodiment. As shown in fig. 8, an electronic device provided in an embodiment of the present disclosure includes: a processor 830 and a memory 820 for storing a computer program capable of running on the processor, wherein the processor 830 is configured to execute the steps of the method provided by the embodiments described above when the computer program is run.
The present disclosure also provides a computer-readable storage medium. The computer readable storage medium provided by the embodiments of the present disclosure has a computer program stored thereon, and the computer program realizes the steps of the method provided by the above embodiments when being executed by a processor.
In the several embodiments provided in the present application, it should be understood that the disclosed apparatus and method may be implemented in other ways. The above-described device embodiments are merely illustrative, for example, the division of the unit is only a logical functional division, and there may be other division ways in actual implementation, such as: multiple units or components may be combined, or may be integrated into another system, or some features may be omitted, or not implemented. In addition, the coupling, direct coupling or communication connection between the components shown or discussed may be through some interfaces, and the indirect coupling or communication connection between the devices or units may be electrical, mechanical or other forms.
The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, that is, may be located in one place, or may be distributed on a plurality of network units; some or all of the units can be selected according to actual needs to achieve the purpose of the solution of the embodiment.
In addition, all the functional units in the embodiments of the present invention may be integrated into one processing module, or each unit may be separately used as one unit, or two or more units may be integrated into one unit; the integrated unit can be realized in a form of hardware, or in a form of hardware plus a software functional unit.
In some cases, any two of the above technical features may be combined into a new method solution without conflict. In some cases, any two of the above technical features may be combined into a new device solution without conflict.
Those of ordinary skill in the art will understand that: all or part of the steps for implementing the method embodiments may be implemented by hardware related to program instructions, and the program may be stored in a computer readable storage medium, and when executed, the program performs the steps including the method embodiments; and the aforementioned storage medium includes: various media capable of storing program codes, such as a removable Memory device, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, and an optical disk.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the appended claims.

Claims (10)

1. A method of target tracking, the method comprising:
determining at least one first local area used when a target to be tracked is tracked;
determining an incidence relation between the first local area and the whole target to be tracked;
performing local tracking between a plurality of image frames based on the first local area to obtain local tracking information;
and obtaining global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relation.
2. The target tracking method of claim 1, wherein the determining at least one first local area for use in tracking the target to be tracked comprises:
determining at least one candidate local region for target tracking;
and selecting at least one candidate local area with the blocked statistical probability meeting a first condition as the first local area according to the blocked statistical probability of each candidate local area.
3. The target tracking method of claim 2, wherein determining at least one first local region for use in tracking the target to be tracked further comprises:
determining the tracking quality of the local tracking information of the target to be tracked based on the first local area of which the occluded statistical probability meets a first condition;
selecting a local area with a tracking quality not lower than a preset quality threshold as the updated first local area.
4. The method of claim 2, wherein determining at least one candidate local region for target tracking comprises:
determining at least one candidate local area for target tracking according to the characteristic attributes of different local areas of the target to be tracked, wherein the characteristic attributes comprise: the number of features and/or the significance parameter of an individual feature.
5. The target tracking method according to claim 2, wherein selecting, as the first local region, at least one candidate local region whose statistical probability of being occluded satisfies a first condition according to the statistical probability of being occluded of each of the candidate local regions comprises:
selecting, as the first local region, a candidate local region that satisfies the first condition in terms of the statistical probability of the candidate local region being occluded and is not occluded in an initial image frame among the plurality of image frames.
6. The target tracking method according to any one of claims 2 to 5, wherein the number of the first local regions is N, where N is a positive integer equal to or greater than 2 and less than P; p is the number of local areas contained in the target to be tracked;
the performing local tracking between a plurality of image frames based on the first local area to obtain local tracking information includes:
in a plurality of image frames, local tracking is carried out in an m image frame based on the first local area which is not blocked in the m-1 image frame, so as to obtain local tracking information of the target to be tracked in the m image frame, wherein the m image frame is a next image frame of the m-1 image frame, and m is a positive integer greater than 1.
7. The target tracking method of claim 6, further comprising:
when it is detected that part of the first local area is occluded in one or more image frames, selecting an unoccluded local area as the updated N first local areas according to the occlusion condition.
8. An object tracking apparatus, characterized in that the apparatus comprises:
the tracking device comprises a first processing unit, a second processing unit and a control unit, wherein the first processing unit is used for determining at least one first local area used for tracking a target to be tracked;
the second processing unit is used for determining the incidence relation between the first local area and the whole target to be tracked;
a third processing unit, configured to perform local tracking between a plurality of image frames based on the first local area, to obtain local tracking information;
and the fourth processing unit is used for obtaining global tracking information of the target to be tracked in different image frames according to the local tracking information and the association relation.
9. An electronic device, comprising: a processor and a memory for storing a computer program operable on the processor, wherein the processor is operable to perform the steps of the method of any of claims 1 to 7 when the computer program is executed.
10. A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 7.
CN202011460226.1A 2020-12-11 2020-12-11 Target tracking method, target tracking device, electronic device, and storage medium Pending CN112489085A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011460226.1A CN112489085A (en) 2020-12-11 2020-12-11 Target tracking method, target tracking device, electronic device, and storage medium


Publications (1)

Publication Number Publication Date
CN112489085A true CN112489085A (en) 2021-03-12


Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463378A (en) * 2021-12-27 2022-05-10 浙江大华技术股份有限公司 Target tracking method, electronic device and storage medium

Citations (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101303726A (en) * 2008-06-06 2008-11-12 重庆大学 System for tracking infrared human body target based on corpuscle dynamic sampling model
CN104240265A (en) * 2014-09-01 2014-12-24 深圳市华尊科技有限公司 Multi-target tracking method and system based on global restrictions
CN105654505A (en) * 2015-12-18 2016-06-08 中山大学 Collaborative tracking algorithm based on super-pixel and system thereof
CN105702083A (en) * 2016-04-13 2016-06-22 重庆邮电大学 Distributed vision-based parking lot-vehicle cooperative intelligent parking system and method
CN105739595A (en) * 2016-05-06 2016-07-06 河海大学 Device and method for tracing maximum power point under partial shade of photovoltaic power generation system
CN107886048A (en) * 2017-10-13 2018-04-06 西安天和防务技术股份有限公司 Method for tracking target and system, storage medium and electric terminal
WO2018068718A1 (en) * 2016-10-13 2018-04-19 夏普株式会社 Target tracking method and target tracking device
CN108022254A (en) * 2017-11-09 2018-05-11 华南理工大学 A kind of space-time contextual target tracking based on sign point auxiliary
CN110111362A (en) * 2019-04-26 2019-08-09 辽宁工程技术大学 A kind of local feature block Similarity matching method for tracking target
CN110246155A (en) * 2019-05-17 2019-09-17 华中科技大学 One kind being based on the alternate anti-shelter target tracking of model and system
CN110570451A (en) * 2019-08-05 2019-12-13 武汉大学 multithreading visual target tracking method based on STC and block re-detection
CN110926337A (en) * 2019-12-24 2020-03-27 大连理工大学 Global measurement data registration method based on multi-vector constraint
WO2020080535A1 (en) * 2018-10-18 2020-04-23 国立研究開発法人科学技術振興機構 Target tracking method, target tracking system, and target tracking program
CN111145213A (en) * 2019-12-10 2020-05-12 中国银联股份有限公司 Target tracking method, device and system and computer readable storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Chu Jun et al., "Block-wise Target Tracking Algorithm Based on Occlusion Detection and Multi-block Position Information Fusion", Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), vol. 33, no. 1, 31 January 2020 (2020-01-31), pages 1 - 2 *

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114463378A (en) * 2021-12-27 2022-05-10 浙江大华技术股份有限公司 Target tracking method, electronic device and storage medium
CN114463378B (en) * 2021-12-27 2023-02-24 浙江大华技术股份有限公司 Target tracking method, electronic device and storage medium

Similar Documents

Publication Publication Date Title
US8320618B2 (en) Object tracker and object tracking method
US10417503B2 (en) Image processing apparatus and image processing method
Vlachos et al. Multi-scale retinal vessel segmentation using line tracking
US7940957B2 (en) Object tracker for visually tracking object motion
US6151403A (en) Method for automatic detection of human eyes in digital images
US7706571B2 (en) Flexible layer tracking with weak online appearance model
US8369574B2 (en) Person tracking method, person tracking apparatus, and person tracking program storage medium
CN106709436A (en) Cross-camera suspicious pedestrian target tracking system for rail transit panoramic monitoring
US6704433B2 (en) Human tracking device, human tracking method and recording medium recording program thereof
EP0844582A2 (en) System and method for detecting a human face
EP2339507B1 (en) Head detection and localisation method
Xiong et al. An approach to locate optic disc in retinal images with pathological changes
CN107330371A (en) Method, apparatus and storage device for acquiring facial expressions of a 3D face model
CN106372629A (en) Living body detection method and device
CN109766796B (en) Deep pedestrian detection method for dense crowd
CN103778406B (en) Object detection method and apparatus
CN112489086A (en) Target tracking method, target tracking device, electronic device, and storage medium
Sanchez et al. Statistical chromaticity models for lip tracking with B-splines
Charriere et al. Automated surgical step recognition in normalized cataract surgery videos
CN112489085A (en) Target tracking method, target tracking device, electronic device, and storage medium
JP2004516585A (en) Image difference
Michael et al. Fast change detection for camera-based surveillance systems
CN111563492B (en) Fall detection method, fall detection device and storage device
Lo et al. Vanishing point-based line sampling for real-time people localization
Phadke Robust multiple target tracking under occlusion using fragmented mean shift and Kalman filter

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination