CN106682619B - Object tracking method and device - Google Patents

Object tracking method and device

Info

Publication number
CN106682619B
CN106682619B (application CN201611232615.2A)
Authority
CN
China
Prior art keywords
image frame
target tracking
detection
detection object
current image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201611232615.2A
Other languages
Chinese (zh)
Other versions
CN106682619A (en
Inventor
蒋化冰
孙斌
吴礼银
康力方
李小山
张干
赵亮
邹武林
徐浩明
廖凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHANGHAI MUMU JUCONG ROBOT TECHNOLOGY Co.,Ltd.
Original Assignee
Shanghai Mumu Jucong Robot Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Mumu Jucong Robot Technology Co ltd filed Critical Shanghai Mumu Jucong Robot Technology Co ltd
Priority to CN201611232615.2A priority Critical patent/CN106682619B/en
Publication of CN106682619A publication Critical patent/CN106682619A/en
Application granted granted Critical
Publication of CN106682619B publication Critical patent/CN106682619B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00 Scenes; Scene-specific elements
    • G06V20/40 Scenes; Scene-specific elements in video content
    • G06V20/41 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • G06V20/42 Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items of sport video content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10016 Video; Image sequence

Landscapes

  • Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Software Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Theoretical Computer Science (AREA)
  • Image Analysis (AREA)

Abstract

An embodiment of the present application provides an object tracking method and apparatus. The method comprises the following steps: marking a bounding box of at least one detection object in a current image frame; respectively calculating the coincidence degree of the bounding box of the at least one detection object with the bounding box of the target tracking object in the previous image frame; and determining the target tracking object from the at least one detection object according to the calculated at least one coincidence degree. Tracking an object with this technical scheme improves the effectiveness and accuracy of object tracking.

Description

Object tracking method and device
Technical Field
The present application relates to the field of digital image processing, and in particular, to a method and an apparatus for tracking an object.
Background
Object tracking is nowadays increasingly used in the fields of video surveillance, autonomous vehicle navigation and robotics. For example, in the field of robots, pedestrian tracking is used as a basic function, and can provide support for human-computer interaction of the robot, so that the robot can better perform various tasks.
In the prior art, when tracking a pedestrian, the detection object with the largest bounding box (bounding-box) is selected from the image data acquired by the camera as the tracked object. This approach is relatively crude, and the resulting pedestrian tracking is inaccurate.
Disclosure of Invention
In view of this, the present application provides an object tracking method and apparatus, which are used to improve accuracy of pedestrian tracking.
The embodiment of the application provides an object tracking method, which comprises the following steps:
marking a bounding box of at least one detected object in a current image frame;
respectively calculating the coincidence degree of the bounding box of at least one detection object and the bounding box of the target tracking object in the previous image frame;
and determining the target tracking object from the at least one detection object according to the calculated at least one coincidence degree.
Further optionally, for any one of the at least one detection object, calculating a degree of coincidence between a bounding box of the detection object and a bounding box of a target tracking object in a previous image frame includes: respectively calculating the intersection area and the union area of the bounding box of the detection object and the bounding box of the target tracking object in the previous image frame; and determining the coincidence degree of the boundary box of the detected object and the boundary box of the target tracking object in the previous image frame according to the area of the intersection and the area of the union.
Further optionally, the determining the target tracking object from the at least one detection object according to the calculated at least one coincidence degree includes: if there is, among the calculated at least one coincidence degree, a coincidence degree meeting a preset coincidence degree condition, marking the detection object corresponding to the coincidence degree meeting the preset coincidence degree condition as the target tracking object.
Further optionally, the determining the target tracking object from the at least one detection object according to the calculated at least one coincidence degree includes: if none of the calculated at least one coincidence degree meets the preset coincidence degree condition, acquiring a color dense histogram of the at least one detection object; respectively calculating the similarity of the at least one detection object to the target tracking object according to the color dense histogram of the at least one detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame; and acquiring, from the at least one detection object, a detection object whose similarity meets the set requirement as the target tracking object.
Further optionally, for any one of the at least one detection object, calculating a similarity between the detection object and the target tracking object includes: respectively calculating the similarity of the color dense histogram of the detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame to obtain the similarity of at least one color dense histogram; and acquiring an average value of the similarity of the at least one color dense histogram as the similarity of the detection object and the target tracking object.
Further optionally, the method further comprises: and storing the corresponding relation among the frame number of the current image frame, the boundary box of the target tracking object in the current image frame and the detection object defined by the boundary box of the target tracking object in the current image frame.
An embodiment of the present application further provides an object tracking apparatus, including:
a marking unit for marking a bounding box of at least one detection object in a current image frame;
the calculating unit is used for respectively calculating the coincidence degree of the boundary frame of at least one detection object and the boundary frame of the target tracking object in the previous image frame;
a determining unit, configured to determine the target tracking object from the at least one detection object according to the calculated at least one coincidence degree.
Further optionally, the computing unit is configured to: for any detection object in the at least one detection object, calculating the area of the intersection and the area of the union of the bounding box of the detection object and the bounding box of the target tracking object in the previous image frame; and determining the coincidence degree of the detection object and a boundary box of the target tracking object in the previous image frame according to the area of the intersection and the area of the union.
Further optionally, the computing unit is configured to: if there is, among the calculated at least one coincidence degree, a coincidence degree meeting a preset coincidence degree condition, mark the detection object corresponding to the coincidence degree meeting the preset coincidence degree condition as the target tracking object.
Further optionally, the determining unit is configured to: if none of the calculated at least one coincidence degree meets the preset coincidence degree condition, acquire a color dense histogram of the at least one detection object; respectively calculate the similarity of the at least one detection object to the target tracking object according to the color dense histogram of the at least one detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame; and acquire, from the at least one detection object, a detection object whose similarity meets the set requirement as the target tracking object.
According to the object tracking method and apparatus provided by the embodiments of the present application, the bounding box of at least one detection object in the current image frame is marked, the coincidence degree of each such bounding box with the bounding box of the target tracking object in the previous image frame is calculated, and the target tracking object is determined from the at least one detection object based on these coincidence degrees. This overcomes the defect that object tracking is easily influenced by the environment where the object is located, and realizes more accurate pedestrian tracking.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and those skilled in the art can also obtain other drawings according to the drawings without creative efforts.
Fig. 1 is a schematic flowchart of an object tracking method according to an embodiment of the present application;
fig. 2 is another schematic flowchart of an object tracking method according to an embodiment of the present application;
fig. 3 is a schematic structural diagram of an object tracking apparatus according to an embodiment of the present application.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
When object tracking is performed, the target tracking object is captured through a camera; for example, a pedestrian is captured through a monocular camera. In the existing object tracking technology, when a camera is used to capture pedestrians, on the one hand the pedestrians vary in posture and contour appearance, and on the other hand the environments where the pedestrians are located and the shooting illumination conditions are quite complex. For these reasons, capturing pedestrians is prone to missed detections and false alarms, which directly reduces the accuracy of object tracking. In view of this defect, the core of the embodiments of the present application is to acquire the features of the detection objects in the current image frame and match them against the target tracking object determined in the previous image frame, thereby identifying the target tracking object in the current image frame and avoiding missed detections and false alarms.
Fig. 1 is a schematic flowchart of an object tracking method provided in an embodiment of the present application, and with reference to fig. 1, the method includes:
step 101, marking a bounding box of at least one detected object in the current image frame.
Step 102, respectively calculating the coincidence degree of the bounding box of at least one detection object and the bounding box of the target tracking object in the previous image frame.
Step 103, determining the target tracking object from the at least one detection object according to the calculated at least one coincidence degree.
In step 101, the current image frame is the image frame captured at the current time and contains at least one detection object. One of these detection objects may be the target tracking object, or none of them may be (meaning that tracking fails).
Optionally, in this embodiment, in the process of implementing object tracking, images are captured by a camera so as to capture the object; for example, the object may be captured by a monocular camera. The camera may be mounted on a tracking device. Based on this, shooting may be performed by the camera mounted on the tracking device to obtain the current image frame. The implementation of the tracking device may vary depending on the application scenario. For example, when the technical solution of the embodiments of the present application is applied to a robot object tracking scene, the robot may serve as the tracking device, and a monocular camera may be mounted on a certain part of the robot, for example the head of the robot, to photograph a pedestrian.
After the current image frame is captured, a bounding box of at least one detected object in the current image frame may be marked.
In an alternative embodiment, after the detection object is captured, the bounding box of at least one detection object in the current image frame is marked, which can be implemented by the Faster Region-based Convolutional Neural Network (Faster R-CNN) algorithm. In the existing object tracking technology, when a camera is used to capture pedestrians, on the one hand the pedestrians vary in posture and contour appearance, and on the other hand the environments where the pedestrians are located and the shooting illumination conditions are quite complex. For these reasons, capturing pedestrians is prone to missed detections and false alarms, which directly reduces the accuracy of object tracking. The miss rate and the false-alarm rate of target detection by the Faster R-CNN algorithm are low, which lays a foundation for accurate, high-precision object tracking. Since the Faster R-CNN algorithm is mature prior art, it is not described in detail in the embodiments of the present application.
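For illustration, the following sketch (in Python) shows one way the bounding boxes of the detection objects in the current image frame might be obtained with an off-the-shelf Faster R-CNN detector. The embodiment does not prescribe a particular implementation; the use of torchvision, the pedestrian class label (1 in the COCO label set) and the score threshold are assumptions.

import torch
import torchvision
from torchvision.transforms.functional import to_tensor

# Assumption: a pretrained torchvision Faster R-CNN stands in for the detector
# of this embodiment; any detector returning bounding boxes would do.
# (pretrained=True is the older torchvision argument; newer versions use weights=.)
model = torchvision.models.detection.fasterrcnn_resnet50_fpn(pretrained=True)
model.eval()

def mark_bounding_boxes(image_rgb, score_thresh=0.8, person_label=1):
    # Return [x1, y1, x2, y2] boxes of pedestrians detected in one frame.
    with torch.no_grad():
        output = model([to_tensor(image_rgb)])[0]
    boxes = []
    for box, label, score in zip(output["boxes"], output["labels"], output["scores"]):
        if label.item() == person_label and score.item() >= score_thresh:
            boxes.append(box.tolist())
    return boxes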
In an alternative embodiment, the at least one detection object in the current image frame comprises all objects present in the current image frame. Marking the bounding boxes of all the detection objects in the current image frame helps avoid possible missed detections.
In another alternative embodiment, before marking the bounding box of at least one detection object in the current image frame, the detection objects in the current image frame may be preliminarily screened to determine the at least one detection object whose bounding box needs to be marked; that is, detection objects meeting a marking condition are selected from all the detection objects in the current image frame for bounding-box marking. For example, detection objects with more regular bounding boxes may be selected for marking, or detection objects with larger bounding boxes (for example, a bounding-box area larger than a preset value) may be selected for marking. This marking mode increases the marking speed, reduces the amount of data that object tracking needs to process, improves the efficiency of object tracking, and saves resources.
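As a minimal sketch of this optional pre-screening, detection objects can be filtered by bounding-box area before marking; the [x1, y1, x2, y2] box format and the area threshold are assumptions.

def prescreen_boxes(boxes, min_area=1000.0):
    # Keep only detection objects whose bounding-box area exceeds a preset value.
    def area(b):
        return max(0.0, b[2] - b[0]) * max(0.0, b[3] - b[1])
    return [b for b in boxes if area(b) > min_area]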
For step 102, the previous image frame is the image frame whose frame number immediately precedes that of the current image frame. For example, if the frame number of the current image frame is N, the image frame with frame number N-1 is the previous image frame.
In the previous image frame, the target tracking object has been determined, i.e. the previous image frame contains the target tracking object and the bounding box of the target tracking object. Optionally, if the previous image frame is the first frame, the detected object with the largest bounding box may be directly selected as the target tracking object, as in the prior art. If the previous image frame is not the first frame, the method provided by the embodiment shown in fig. 1 may be used to determine the target tracking object from the detection objects included in the previous image frame.
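The first-frame initialization described above can be sketched as follows; the helper name and the [x1, y1, x2, y2] box format are assumptions:

def select_initial_target(boxes):
    # First frame: pick the detection with the largest bounding-box area
    # as the initial target tracking object.
    return max(boxes, key=lambda b: (b[2] - b[0]) * (b[3] - b[1]))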
After the at least one detection object in the current image frame is marked by the boundary frame, the coincidence degree of the boundary frame of the at least one detection object in the current image frame and the boundary frame of the target tracking object in the previous image frame is respectively calculated, and at least one coincidence degree is obtained at the moment. For example, assuming that the current image frame includes M detection objects, the overlapping degrees of the bounding boxes of the M detection objects and the bounding box of the target tracking object in the previous image frame are calculated respectively to obtain the overlapping degrees of the M bounding boxes. In general, when the shooting angle is not changed between two preceding and succeeding frames of images, the degree of overlap of the bounding boxes included in the two frames of images can be obtained by superimposing the two frames of images.
After obtaining at least one coincidence degree, the target tracking object is determined from the at least one detection object based on the at least one coincidence degree, in step 103. It is worth mentioning that, in theory, the target tracking object determined in the current image frame should be the same object as the target tracking object determined in the previous image frame.
In this embodiment, the bounding box of at least one detection object in the current image frame is marked, and the calculation of the coincidence degree between the bounding box of the at least one detection object and the bounding box of the target tracking object in the previous image frame is performed. And determining the target tracking object from the at least one detection object based on the coincidence degree of the boundary box of the at least one detection object in the current image frame and the boundary box of the target tracking object in the previous image frame. The defect that the object tracking is easily influenced by the environment where the object is located is overcome, and more accurate pedestrian tracking is realized.
Fig. 2 is another schematic flowchart of an object tracking method provided in an embodiment of the present application, and in conjunction with fig. 2, the method includes:
step 201, marking a bounding box of at least one detected object in the current image frame.
Step 202, for any one of the at least one detection object, respectively calculating the intersection area and the union area of the bounding box of the detection object and the bounding box of the target tracking object in the previous image frame.
Step 203, determining the coincidence degree of the boundary box of the detection object and the boundary box of the target tracking object in the previous image frame according to the area of the intersection and the area of the union.
Step 204, judging whether there is, among the calculated at least one coincidence degree, a coincidence degree meeting a preset coincidence degree condition.
If there is a coincidence degree meeting the preset coincidence degree condition among the at least one calculated coincidence degree, step 205 is executed; if there is no coincidence degree meeting the preset coincidence degree condition among the at least one calculated coincidence degree, steps 206 to 208 are executed.
Step 205, marking the detected object corresponding to the coincidence degree meeting the preset coincidence degree condition as the target tracking object.
Step 206, obtaining a color dense histogram of the at least one detection object.
Step 207, calculating the similarity between the at least one detection object and the target tracking object according to the color dense histogram of the at least one detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame.
Step 208, acquiring, from the at least one detection object, a detection object whose similarity meets the set requirement as the target tracking object.
The specific implementation of step 201 is described in the embodiment shown in fig. 1, and is not described herein again.
For step 202, the current image frame captured by the camera has the same shooting angle and shooting parameters as the previous image frame. After the current image frame is obtained, it is superimposed on the previous image frame, and the area of the intersection and the area of the union of the bounding box of each of the at least one detection object in the current image frame with the bounding box of the target tracking object in the previous image frame are calculated.
For example, the bounding box marking the first detection object in the current image frame is A1, the bounding box marking the second detection object is A2, the bounding box marking the third detection object is A3, and the bounding box of the target tracking object in the previous image frame is B. The following are calculated respectively: the area A1∩B of the intersection of bounding box A1 with bounding box B and the area A1∪B of the union of bounding box A1 with bounding box B; the area A2∩B of the intersection of bounding box A2 with bounding box B and the area A2∪B of the union of bounding box A2 with bounding box B; the area A3∩B of the intersection of bounding box A3 with bounding box B and the area A3∪B of the union of bounding box A3 with bounding box B.
For step 203, based on the area of the intersection and the area of the union obtained by the above calculation, the coincidence degree of the at least one detection object in the current image frame and the bounding box of the target tracking object in the previous image frame may be calculated.
In an alternative embodiment, coincidence degree = area of intersection / area of union, but the calculation is not limited thereto.
Following the above example, the first coincidence degree of the first detection object in the current image frame with the target tracking object = (A1∩B)/(A1∪B); the second coincidence degree of the second detection object in the current image frame with the target tracking object = (A2∩B)/(A2∪B); and the third coincidence degree of the third detection object in the current image frame with the target tracking object = (A3∩B)/(A3∪B).
Alternatively, the coincidence degree may be calculated as: coincidence degree = area of intersection / (area of intersection + area of union). Alternatively, the coincidence degree may be calculated on the basis of predetermined coefficients K1 and K2, i.e. coincidence degree = (K1 × area of intersection) / (K2 × area of union).
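A minimal sketch of the coincidence degree computation of steps 202-203, using the intersection-over-union form given above, together with the threshold test of steps 204-205; the [x1, y1, x2, y2] box format, the helper names and the default threshold of 0.6 (the value used later in this embodiment) are assumptions:

def coincidence_degree(box_a, box_b):
    # Coincidence degree = area of intersection / area of union
    # for two boxes given as [x1, y1, x2, y2].
    ix1, iy1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix2, iy2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

def match_by_coincidence(boxes, prev_target_box, threshold=0.6):
    # Return the detection whose coincidence degree with the previous frame's
    # target bounding box exceeds the threshold, or None if none qualifies.
    best = max(boxes, key=lambda b: coincidence_degree(b, prev_target_box), default=None)
    if best is not None and coincidence_degree(best, prev_target_box) > threshold:
        return best
    return None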
For step 204, after the coincidence degree between the bounding box of each of the at least one detection object in the current image frame and the bounding box of the target tracking object in the previous image frame has been calculated, at least one coincidence degree is obtained, and the target tracking object is determined from the at least one detection object based on the value of the at least one coincidence degree.
Optionally, the preset coincidence degree condition may be a coincidence degree threshold. The threshold is generally an empirical value, and its value is related to the shooting interval of the camera. If the time interval between two captured frames is small and the photographed detection object moves only within a small range, the positions of the target tracking object in the two successive image frames overlap to a high degree; in that case the coincidence degree threshold may be set relatively small. Conversely, if the time interval between two frames is large and the photographed detection object moves within a large range, the positions of the target tracking object in the two successive image frames overlap to a low degree; in that case the coincidence degree threshold may be set relatively large. In this embodiment, when the shooting frequency of the camera is set to 5 frames per second, the coincidence degree threshold may be set to 0.6. With this shooting frequency and coincidence degree threshold, determining the detection object whose coincidence degree is greater than 0.6 as the target tracking object yields a relatively accurate detection result.
In step 205, in one possible case, the coincidence degree condition is that the coincidence degree is greater than 0.6; a coincidence degree greater than 0.6 is selected from the at least one coincidence degree, and the detection object corresponding to that coincidence degree is taken as the target tracking object.
With respect to step 206, in one possible case, the detection object moves quickly within the shooting time interval of the camera, and there may be no object in the two successive frames whose bounding-box coincidence degree meets the coincidence degree condition; that is, it cannot be determined from the coincidence degree condition which detection object in the current image frame is the target tracking object. In this case, the color dense histogram of each of the at least one detection object marked in the current image frame is further acquired. For example, if M detection objects are marked in the current image frame, the color dense histograms of the M detection objects are calculated respectively, obtaining M color dense histograms. The color dense histogram represents the color features of a detection object and describes the proportions of different colors in the image of the detection object, so the detection object can be accurately identified according to its color dense histogram.
The color dense histogram may be computed from the image portion within the bounding box.
Optionally, the color dense histogram is calculated by the calcHist function of the Open Source Computer Vision Library (OpenCV). Since OpenCV is mature prior art, the specific histogram calculation process is not described in detail in the embodiments of the present application.
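The embodiment names OpenCV's calcHist but leaves the colour space and bin layout open; the sketch below assumes an HSV histogram with 16 × 4 bins (the bin count mentioned later in this description) computed over the image region inside the bounding box:

import cv2

def color_dense_histogram(frame_bgr, box):
    # Color dense histogram of the image region inside a bounding box.
    # Assumption: 16 hue x 4 saturation bins in HSV, L1-normalized.
    x1, y1, x2, y2 = [int(v) for v in box]
    roi = frame_bgr[y1:y2, x1:x2]
    hsv = cv2.cvtColor(roi, cv2.COLOR_BGR2HSV)
    hist = cv2.calcHist([hsv], [0, 1], None, [16, 4], [0, 180, 0, 256])
    cv2.normalize(hist, hist, alpha=1.0, norm_type=cv2.NORM_L1)
    return hist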
With respect to step 207, in an alternative embodiment, the similarity between the at least one detection object and the target tracking object is calculated separately, and the similarity between the color dense histogram of the detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame may be calculated separately. Based on the above calculation process, the similarity of at least one color dense histogram is obtained, and the average value of the similarities of the at least one color dense histogram is calculated as the similarity of the detection object and the target tracking object.
The color dense histogram of the target tracking object in at least one image frame before the current image frame comprises the color dense histogram of the target tracking object in the previous image frame and the color dense histograms of the target tracking object in several images before the previous image frame. That is, assuming that the frame number of the current image frame is N and that frame N-n is the n-th image frame before the current image frame, the color dense histograms of the target tracking object in at least one image frame before the current image frame include: the color dense histogram H-1 of the target tracking object in frame N-1, the color dense histogram H-2 of the target tracking object in frame N-2, ..., and the color dense histogram H-n of the target tracking object in frame N-n, where n is an empirical value and is an integer no less than 0 and no greater than N.
Assume that, in the current image frame, the color dense histogram of the first of the three marked detection objects is H1, the color dense histogram of the second detection object is H2, and the color dense histogram of the third detection object is H3. For the first detection object, the specific process of calculating its similarity to the target tracking object is as follows: respectively calculate the similarity of the color dense histogram of the first detection object to the color dense histogram of the target tracking object in each of the n image frames before the current image frame, obtaining n color dense histogram similarities; and calculate the average of the n color dense histogram similarities as the similarity of the first detection object to the target tracking object.
For example, the calculation process of the similarity between the first detection object and the target tracking object is as follows:
S1=[similarity(H1,H-1)+similarity(H1,H-2)+……+similarity(H1,H-n)]/n
the calculation process of the similarity between the second detection object and the target tracking object is as follows:
S2=[similarity(H2,H-1)+similarity(H2,H-2)+……+similarity(H2,H-n)]/n
the calculation process of the similarity between the third detection object and the target tracking object is as follows:
S3=[similarity(H3,H-1)+similarity(H3,H-2)+……+similarity(H3,H-n)]/n
wherein similarity() is a similarity calculation function. In this implementation, the detection objects in the current image frame are compared with at least one historical color dense histogram of the target tracking object determined in previous image frames, which reduces the influence on tracking of the pedestrian's posture, contour appearance, surrounding environment and shooting illumination conditions, and thereby improves the reliability of the similarity calculation result.
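A minimal sketch of this averaging step, assuming OpenCV's compareHist with the correlation method (the comparison method named later in this description) and a list holding the target tracking object's color dense histograms from the n previous frames:

import cv2

def similarity_to_target(det_hist, target_history):
    # Average similarity of one detection's color dense histogram against the
    # target tracking object's histograms H-1 ... H-n from previous frames.
    scores = [cv2.compareHist(det_hist, h, cv2.HISTCMP_CORREL) for h in target_history]
    return sum(scores) / len(scores) if scores else 0.0

# S_k for every detection in the current frame would then be, for example:
# similarities = [similarity_to_target(h, target_history) for h in detection_histograms]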
With reference to step 208, in an alternative embodiment, after calculating the similarity between the detection object and the target tracking object, it is determined whether the similarity meets the set requirement one by one. Optionally, the requirements set in this embodiment are: the similarity is greater than a similarity threshold and the similarity is the maximum of the at least one similarity.
The value of the similarity threshold is an empirical value, and is related to the environment where the detection object is located and the lighting condition of the shooting. In the embodiment of the application, during actual operation, the test is carried out in a conventional office area, and when the similarity threshold value is 0.7, a more accurate tracking result can be obtained.
In this embodiment, the bounding box of at least one detection object in the current image frame is obtained, and its coincidence degree with the bounding box of the target tracking object in the previous image frame is calculated. The target tracking object is determined from the at least one detection object based on the coincidence degree of the bounding box of the at least one detection object in the current image frame with the bounding box of the target tracking object in the previous image frame, which avoids missed detections and false alarms and improves the accuracy of object tracking. When the target tracking object in the current image frame cannot be determined from the coincidence degrees, the color dense histogram of the at least one detection object in the current image frame is further acquired, its similarity to the color dense histogram of the target tracking object in at least one image frame before the current image frame is calculated, and the target tracking object is obtained from the at least one detection object in the current image frame through the color dense histogram similarity. Based on this scheme, the influence on tracking of the pedestrian's posture, contour appearance, surrounding environment and shooting illumination conditions is reduced, and high-accuracy object tracking is realized.
It should be noted that, in the technical solution of the embodiments of the present application, for each current image frame, the correspondence among the frame number of the current image frame, the bounding box of the target tracking object in the current image frame, and the color dense histogram of the target tracking object in the current image frame is stored.
Optionally, in actual operation, when storing the color dense histogram of the target tracking object in the current image frame, the color dense histogram of the target tracking object in each image frame is stored, in order of frame number, in a list with a fixed length of n. When the number of color dense histograms stored in the list is less than n, the color dense histograms of the target tracking object in new image frames are simply appended in increasing order of frame number. If the number stored in the list would exceed n, the color dense histogram of the target tracking object in the image frame with the largest frame number (the current image frame, with frame number N) is added at one end of the list, and the color dense histogram of the target tracking object in the image frame with the smallest frame number (the image frame with frame number N-n) is deleted from the other end of the list. The list length n is an empirical value; a number of experimental trials show that n = 10 gives the most accurate similarity calculation results together with high computational efficiency. The stored content is used for comparison when the target tracking object is identified in the next image frame, and details are not repeated here.
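A minimal sketch of the fixed-length histogram list described above, using a double-ended queue with maxlen = 10 so that appending the newest frame's histogram automatically discards the oldest one; the container choice is an assumption:

from collections import deque

# Fixed-length history of the target tracking object's color dense histograms (n = 10).
target_history = deque(maxlen=10)

def update_history(target_hist):
    # Append the current frame's target histogram; when the deque is full,
    # the histogram of the oldest frame (frame N - n) is dropped automatically.
    target_history.append(target_hist)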
Below, the technical solution of the embodiments of the present application is described in detail with a specific example in combination with an application scenario.
A monocular camera is mounted on the head of an intelligent robot and is used to photograph the interactive object and to detect it while the robot executes a tracking task for the interactive object.
When the interactive object performs human-computer interaction with the robot, the interactive object is photographed through the monocular camera to obtain a first image frame. In the first image frame, the bounding box of each detected object is obtained by the Faster R-CNN algorithm.
The detection object corresponding to the bounding box with the largest area is selected as the interactive object and marked b-box0, and the color dense histogram of the image region selected by b-box0 is calculated and denoted H0. It should be understood that, because the interactive object performs human-computer interaction with the robot, the interactive object is closest to the robot and the robot's camera assembly is rotated to face the interactive object; therefore, in the shooting result of the monocular camera, the interactive object is located in the middle of the field of view and has the largest bounding-box area.
After the interactive object is determined from the first image frame, a second image frame is captured at the same shooting angle as the first image frame after a shooting interval of 0.2 s. In the second image frame, the bounding boxes of the detected objects are obtained by the Faster R-CNN algorithm. Assume that, in the second image frame, bounding boxes corresponding to four detection objects are obtained: b-box1, b-box2, b-box3 and b-box4.
Respectively calculating the coincidence degrees C1, C2 and C3 of the bounding box of each detection object in the second image frame and the bounding box of the determined interaction object in the first image frame:
C1=(b-box0∩b-box1)/(b-box0∪b-box1)
C2=(b-box0∩b-box2)/(b-box0∪b-box2)
C3=(b-box0∩b-box3)/(b-box0∪b-box3)
in this embodiment, the overlap ratio threshold is set to 0.6.
In one scenario, assuming that the values of C1, C2 and C3 are 0.2, 0.7 and 0.1, respectively, then C2 = 0.7 > 0.6, and the detection object corresponding to C2 is tracked as the interactive object.
In another scenario, assuming that the values of C1, C2 and C3 are 0.4, 0.5 and 0.1, respectively, no coincidence degree greater than 0.6 exists. The color dense histograms of the four detection objects acquired in the second image frame are then calculated, i.e. the color dense histograms of the image regions selected by b-box1, b-box2, b-box3 and b-box4, obtaining H1, H2, H3 and H4 respectively.
Calculating the similarity of the color dense histogram H0 of the determined interactive object in the first image frame and the color dense histograms H1, H2, H3 and H4 of the four detected objects in the second image frame respectively:
S1=similarity(H0,H1);S2=similarity(H0,H2);
S3=similarity(H0,H3);S4=similarity(H0,H4);
in this embodiment, the similarity threshold is set to 0.7.
After the four similarities are obtained, the similarity among S1, S2, S3 and S4 that is greater than the similarity threshold 0.7 is selected. Assuming that the values of S1, S2, S3 and S4 are 0.1, 0.75, 0.1 and 0.05, respectively, then S2 = 0.75 > 0.7, so the detection object corresponding to S2 is determined to be the interactive object and is tracked. The b-box2 corresponding to the interactive object in the second image frame is saved, along with the color dense histogram H2.
After the position of the interactive object is acquired in the second image frame, the robot tracks the interactive object: the robot body moves closer to the interactive object, and the camera is readjusted to aim at the interactive object. After 0.2 s, the camera captures a third image frame.
In the third image frame, the Faster R-CNN algorithm is used to acquire the bounding boxes of two detected objects, marked b-box21 and b-box22. The coincidence degrees of b-box21 and b-box22 with the b-box2 of the interactive object determined in the second image frame are calculated respectively, and it is judged whether either of the two coincidence degrees is greater than 0.6; if so, the detection object corresponding to the coincidence degree greater than 0.6 is tracked as the interactive object.
In another situation, if neither of the coincidence degrees calculated between b-box2 and b-box21 or b-box22 exceeds 0.6, the color dense histograms of the two detection objects acquired in the third image frame are calculated, i.e. the color dense histograms of the image regions selected by b-box21 and b-box22, obtaining H21 and H22 respectively.
The average similarity of each of the color dense histograms H21 and H22 of the two detection objects in the third image frame to the color dense histogram H0 of the interactive object determined in the first image frame and the color dense histogram H2 of the interactive object determined in the second image frame is calculated:
S21=[similarity(H0,H21)+similarity(H2,H21)]/2
S22=[similarity(H0,H22)+similarity(H2,H22)]/2;
where calculating similarity() can be realized by the compareHist function of OpenCV:
cvCompareHist(const CvHistogram* hist1, const CvHistogram* hist2, int method);
method: CV_COMP_CORREL
d(h1, h2) = Σ_i h'1(i)·h'2(i) / sqrt( Σ_i h'1(i)² · Σ_i h'2(i)² )
wherein,
h'k(i) = hk(i) − (1/Num)·Σ_j hk(j)
where k is the index of the color dense histogram, d(h1, h2) represents the similarity between the color dense histograms h1 and h2, i and j are bin indices used for summation, Num is the number of bins of the color dense histogram, which may be set to Num = 16 × 4 in this embodiment, hk(i) is the i-th bin of the k-th color dense histogram, and h'k(i) is the mean-normalized bin value.
In the above statement, cvCompareHist is the comparison function for color dense histograms, CvHistogram is the structure used to represent multidimensional histograms, and hist1 and hist2 are the two color dense histograms being compared.
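For reference, a small sketch of the correlation measure written out directly from the formula above (mean-centred bins followed by a normalized dot product); in practice cv2.compareHist with HISTCMP_CORREL computes the same quantity:

import numpy as np

def correlation_similarity(h1, h2):
    # d(h1, h2) = sum_i h1'(i) h2'(i) / sqrt(sum_i h1'(i)^2 * sum_i h2'(i)^2),
    # where hk'(i) = hk(i) - mean(hk).
    a = np.asarray(h1, dtype=np.float64).ravel()
    b = np.asarray(h2, dtype=np.float64).ravel()
    a -= a.mean()
    b -= b.mean()
    denom = np.sqrt((a * a).sum() * (b * b).sum())
    return float((a * b).sum() / denom) if denom > 0 else 0.0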
After the two similarities are obtained, the similarity among S21 and S22 that is greater than the similarity threshold 0.7 is selected. Assuming that the values of S21 and S22 are 0.2 and 0.8, respectively, then S22 = 0.8 > 0.7, so the detection object corresponding to S22 is determined to be the interactive object and is tracked. The b-box22 corresponding to the interactive object in the third image frame is saved, along with the color dense histogram H22.
After the position of the interactive object is acquired in the third image frame, the robot tracks the interactive object: the robot body moves closer to the interactive object, and the camera is readjusted to aim at the interactive object. After 0.2 s, the camera captures a fourth image frame. The subsequent tracking follows the same principle as described above and is not repeated here. It should be noted that, when similarity is calculated against the color dense histograms of the interactive object in a preset number of image frames before the current frame, the preset number should be kept within a reasonable range to avoid tracking delays caused by heavy computation.
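Putting the pieces of this scenario together, one per-frame decision step might look like the following sketch; the helper functions come from the earlier sketches in this description, the 0.6 and 0.7 thresholds are the values used in this embodiment, and everything else is an assumption rather than a prescribed implementation:

def track_step(frame_bgr, detections, prev_target_box, target_history,
               overlap_thresh=0.6, sim_thresh=0.7):
    # One tracking step: try the coincidence degree first, then fall back to
    # color dense histogram similarity when no box overlaps enough.
    target_box = match_by_coincidence(detections, prev_target_box, overlap_thresh)

    if target_box is None:
        # Fallback: compare each detection's histogram with the target history.
        hists = [color_dense_histogram(frame_bgr, b) for b in detections]
        sims = [similarity_to_target(h, target_history) for h in hists]
        best = max(range(len(detections)), key=lambda i: sims[i], default=None)
        if best is None or sims[best] <= sim_thresh:
            return None, None  # tracking failed in this frame
        target_box, target_hist = detections[best], hists[best]
    else:
        target_hist = color_dense_histogram(frame_bgr, target_box)

    target_history.append(target_hist)  # fixed-length deque from the earlier sketch
    return target_box, target_hist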
Fig. 3 is a schematic structural diagram of an object tracking apparatus provided in the present application, and in conjunction with fig. 3, the apparatus includes:
a marking unit 301, configured to mark a bounding box of at least one detected object in a current image frame;
a calculating unit 302, configured to calculate a degree of coincidence between a bounding box of at least one detected object and a bounding box of a target tracking object in a previous image frame, respectively;
a determining unit 303, configured to determine the target tracking object from the at least one detection object according to the calculated at least one coincidence degree.
Further optionally, the calculating unit 302 is configured to: for any detection object in the at least one detection object, calculating the area of the intersection and the area of the union of the bounding box of the detection object and the bounding box of the target tracking object in the previous image frame; and determining the coincidence degree of the detection object and a boundary frame of the target tracking object in the previous image frame according to the area of the intersection and the area of the union.
Further optionally, the calculating unit 302 is configured to: if there is, among the calculated at least one coincidence degree, a coincidence degree meeting a preset coincidence degree condition, mark the detection object corresponding to the coincidence degree meeting the preset coincidence degree condition as the target tracking object. Further optionally, the determining unit 303 is configured to: if no coincidence degree meeting the coincidence degree condition exists among the calculated at least one coincidence degree, obtain the color dense histogram of each detection object; calculate the similarity of the at least one detection object to the target tracking object according to the color dense histogram of the at least one detection object and at least one historical color dense histogram of the target tracking object; and acquire a detection object whose similarity meets the set requirement as the target tracking object.
Further optionally, the calculating unit 302 is configured to: if none of the calculated at least one coincidence degree meets the preset coincidence degree condition, acquire the color dense histogram of the at least one detection object; respectively calculate the similarity of the at least one detection object to the target tracking object according to the color dense histogram of the at least one detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame; and acquire, from the at least one detection object, a detection object whose similarity meets the set requirement as the target tracking object.
Further optionally, the calculating unit 302 is configured to: respectively calculating the similarity of the color dense histogram of the detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame to obtain the similarity of at least one color dense histogram;
and acquiring an average value of the similarity of the at least one color dense histogram as the similarity of the detection object and the target tracking object.
Further optionally, the object tracking apparatus provided in the embodiment of the present application further includes a storage unit. The storage unit is used for storing the corresponding relation among the frame number of the current image frame, the boundary box of the target tracking object in the current image frame and the detection object defined by the boundary box of the target tracking object in the current image frame.
In the object tracking apparatus provided in this embodiment of the present application, the marking unit 301 marks a bounding box of at least one detected object in the current image frame, the calculating unit 302 calculates a degree of coincidence between the bounding box of the at least one detected object in the current image frame and a bounding box of a target tracking object in a previous image frame, and the determining unit 303 determines the target tracking object from the at least one detected object according to the calculated at least one degree of coincidence. The defect that the object tracking is easily influenced by the environment where the object is located is overcome, and more accurate pedestrian tracking is realized.

Claims (10)

1. An object tracking method, comprising:
marking a bounding box of at least one detected object in a current image frame;
respectively calculating the coincidence degree of the bounding box of at least one detection object with the bounding box of the target tracking object in the previous image frame; wherein, when the previous image frame is a non-first frame, the target tracking object of the previous image frame is determined from the detection objects contained in the previous image frame based on the coincidence degree of a bounding box of a detection object contained in the previous image frame with a bounding box of the target tracking object in the image frame immediately preceding the previous image frame;
determining the target tracking object of the current image frame from the at least one detection object according to the calculated at least one degree of coincidence.
2. The method of claim 1, wherein for any detected object in the at least one detected object, calculating a degree of coincidence of a bounding box of the detected object with a bounding box of a target tracking object in a previous image frame comprises:
respectively calculating the intersection area and the union area of the bounding box of the detection object and the bounding box of the target tracking object in the previous image frame;
and determining the coincidence degree of the boundary box of the detected object and the boundary box of the target tracking object in the previous image frame according to the area of the intersection and the area of the union.
3. The method according to claim 1, wherein said determining the target tracking object for the current image frame from the at least one detected object based on the calculated at least one degree of overlap comprises:
if there is, among the calculated at least one coincidence degree, a coincidence degree meeting a preset coincidence degree condition, marking the detection object corresponding to the coincidence degree meeting the preset coincidence degree condition as the target tracking object of the current image frame.
4. The method according to claim 1, wherein said determining the target tracking object for the current image frame from the at least one detected object based on the calculated at least one degree of overlap comprises:
if none of the calculated at least one coincidence degree meets the preset coincidence degree condition, acquiring a color dense histogram of the at least one detection object;
respectively calculating the similarity of the at least one detection object and the target tracking object according to the color dense histogram of the at least one detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame;
and acquiring a detection object with the similarity meeting the setting requirement from the at least one detection object as the target tracking object of the current image frame.
5. The method according to claim 4, wherein calculating, for any one of the at least one detection object, a similarity between the detection object and the target tracking object comprises:
respectively calculating the similarity of the color dense histogram of the detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame to obtain the similarity of at least one color dense histogram;
and acquiring an average value of the similarity of the at least one color dense histogram as the similarity of the detection object and the target tracking object.
6. The method according to any one of claims 1-5, further comprising:
and storing the corresponding relation among the frame number of the current image frame, the boundary box of the target tracking object in the current image frame and the detection object defined by the boundary box of the target tracking object in the current image frame.
7. An object tracking apparatus, comprising:
a marking unit for marking a bounding box of at least one detection object in a current image frame;
the calculating unit is used for respectively calculating the coincidence degree of the bounding box of at least one detection object with the bounding box of the target tracking object in the previous image frame; wherein, when the previous image frame is a non-first frame, the target tracking object of the previous image frame is determined from the detection objects contained in the previous image frame based on the coincidence degree of a bounding box of a detection object contained in the previous image frame with a bounding box of the target tracking object in the image frame immediately preceding the previous image frame;
a determining unit, configured to determine the target tracking object of the current image frame from the at least one detection object according to the calculated at least one degree of overlap.
8. The apparatus of claim 7, wherein the computing unit is configured to:
for any detection object in the at least one detection object, calculating the area of the intersection and the area of the union of the bounding box of the detection object and the bounding box of the target tracking object in the previous image frame;
and determining the coincidence degree of the detection object and a boundary frame of the target tracking object in the previous image frame according to the area of the intersection and the area of the union.
9. The apparatus according to claim 7 or 8, wherein the computing unit is configured to:
if there is, among the calculated at least one coincidence degree, a coincidence degree meeting a preset coincidence degree condition, marking the detection object corresponding to the coincidence degree meeting the preset coincidence degree condition as the target tracking object of the current image frame.
10. The apparatus according to claim 7 or 8, wherein the determining unit is configured to:
if none of the calculated at least one coincidence degree meets the preset coincidence degree condition, acquiring a color dense histogram of the at least one detection object;
respectively calculating the similarity of the at least one detection object and the target tracking object according to the color dense histogram of the at least one detection object and the color dense histogram of the target tracking object in at least one image frame before the current image frame;
and acquiring a detection object with the similarity meeting the setting requirement from the at least one detection object as the target tracking object of the current image frame.
CN201611232615.2A 2016-12-28 2016-12-28 Object tracking method and device Active CN106682619B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611232615.2A CN106682619B (en) 2016-12-28 2016-12-28 Object tracking method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611232615.2A CN106682619B (en) 2016-12-28 2016-12-28 Object tracking method and device

Publications (2)

Publication Number Publication Date
CN106682619A CN106682619A (en) 2017-05-17
CN106682619B true CN106682619B (en) 2020-08-11

Family

ID=58871920

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611232615.2A Active CN106682619B (en) 2016-12-28 2016-12-28 Object tracking method and device

Country Status (1)

Country Link
CN (1) CN106682619B (en)

Families Citing this family (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109859240B (en) * 2017-11-30 2021-06-18 比亚迪股份有限公司 Video object tracking method and device and vehicle
CN108830204B (en) * 2018-06-01 2021-10-19 中国科学技术大学 Method for detecting abnormality in target-oriented surveillance video
CN109034013B (en) * 2018-07-10 2023-06-13 腾讯科技(深圳)有限公司 Face image recognition method, device and storage medium
CN109684920B (en) 2018-11-19 2020-12-11 腾讯科技(深圳)有限公司 Object key point positioning method, image processing method, device and storage medium
WO2020113452A1 (en) 2018-12-05 2020-06-11 珊口(深圳)智能科技有限公司 Monitoring method and device for moving target, monitoring system, and mobile robot
CN110163068A (en) * 2018-12-13 2019-08-23 腾讯科技(深圳)有限公司 Target object tracking, device, storage medium and computer equipment
CN109743497B (en) * 2018-12-21 2020-06-30 创新奇智(重庆)科技有限公司 Data set acquisition method and system and electronic device
CN109800667A (en) * 2018-12-28 2019-05-24 广州烽火众智数字技术有限公司 A kind of pedestrian tracting method and system
SG10201905273VA (en) 2019-06-10 2019-08-27 Alibaba Group Holding Ltd Method and system for evaluating an object detection model
CN112585944A (en) * 2020-01-21 2021-03-30 深圳市大疆创新科技有限公司 Following method, movable platform, apparatus and storage medium
GB2610457A (en) * 2021-03-31 2023-03-08 Nvidia Corp Generation of bounding boxes
US20230206466A1 (en) * 2021-12-27 2023-06-29 Everseen Limited System and method for tracking and identifying moving objects

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101577005A (en) * 2009-06-12 2009-11-11 北京中星微电子有限公司 Target tracking method and device
CN102982559A (en) * 2012-11-28 2013-03-20 大唐移动通信设备有限公司 Vehicle tracking method and system
CN103914685A (en) * 2014-03-07 2014-07-09 北京邮电大学 Multi-target tracking method based on generalized minimum clique graph and taboo search
US9390506B1 (en) * 2015-05-07 2016-07-12 Aricent Holdings Luxembourg S.A.R.L. Selective object filtering and tracking
CN106023155A (en) * 2016-05-10 2016-10-12 电子科技大学 Online object contour tracking method based on horizontal set

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Non-rigid object tracking algorithm based on color histogram matching (基于颜色直方图匹配的非刚性目标跟踪算法); Ma Li (马丽) et al.; Journal of Qingdao Technological University (青岛理工大学学报); Dec. 31, 2005; Vol. 26, No. 5; pp. 79-83 *

Also Published As

Publication number Publication date
CN106682619A (en) 2017-05-17

Similar Documents

Publication Publication Date Title
CN106682619B (en) Object tracking method and device
CN105447459B (en) A kind of unmanned plane detects target and tracking automatically
CN108875683B (en) Robot vision tracking method and system
JP2915894B2 (en) Target tracking method and device
CN110287907B (en) Object detection method and device
KR101788225B1 (en) Method and System for Recognition/Tracking Construction Equipment and Workers Using Construction-Site-Customized Image Processing
CN110610150A (en) Tracking method, device, computing equipment and medium of target moving object
Jiang et al. Multiple pedestrian tracking using colour and motion models
Saito et al. People detection and tracking from fish-eye image based on probabilistic appearance model
Liu et al. Spatial-temporal motion information integration for action detection and recognition in non-static background
Liang et al. Deep background subtraction with guided learning
EP3035242B1 (en) Method and electronic device for object tracking in a light-field capture
CN110458017B (en) Target tracking scale estimation method and related device
Baris et al. Classification and tracking of traffic scene objects with hybrid camera systems
Zhang et al. A novel efficient method for abnormal face detection in ATM
Jiang et al. Online pedestrian tracking with multi-stage re-identification
CN115797405A (en) Multi-lens self-adaptive tracking method based on vehicle wheel base
Vu et al. Real-time robust human tracking based on Lucas-Kanade optical flow and deep detection for embedded surveillance
Guo et al. Global-Local MAV Detection Under Challenging Conditions Based on Appearance and Motion
Hussien Detection and tracking system of moving objects based on MATLAB
CN114945071A (en) Photographing control method, device and system for built-in camera of recycling machine
Liu et al. A simplified swarm optimization for object tracking
Duanmu et al. A multi-view pedestrian tracking framework based on graph matching
KR20200079070A (en) Detecting system for approaching vehicle in video and method thereof
JP2004355601A (en) Target chasing device, target chasing method, computer-readable recording medium with program recorded and program

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: Room 402, Building 33 Guangshun Road, Changning District, Shanghai, 2003

Applicant after: Shanghai zhihuilin Medical Technology Co.,Ltd.

Address before: Room 402, Building 33 Guangshun Road, Changning District, Shanghai, 2003

Applicant before: Shanghai Zhihui Medical Technology Co.,Ltd.

Address after: Room 402, Building 33 Guangshun Road, Changning District, Shanghai, 2003

Applicant after: Shanghai Zhihui Medical Technology Co.,Ltd.

Address before: Room 402, Building 33 Guangshun Road, Changning District, Shanghai, 2003

Applicant before: SHANGHAI MROBOT TECHNOLOGY Co.,Ltd.

Address after: Room 402, Building 33 Guangshun Road, Changning District, Shanghai, 2003

Applicant after: SHANGHAI MROBOT TECHNOLOGY Co.,Ltd.

Address before: Room 402, Building 33 Guangshun Road, Changning District, Shanghai, 2003

Applicant before: SHANGHAI MUYE ROBOT TECHNOLOGY Co.,Ltd.

CB02 Change of applicant information
TA01 Transfer of patent application right

Effective date of registration: 20200617

Address after: 201400 Shanghai Fengxian District Xinyang Highway 1800 Lane 2 2340 Rooms

Applicant after: SHANGHAI MUMU JUCONG ROBOT TECHNOLOGY Co.,Ltd.

Address before: Room 402, Building 33 Guangshun Road, Changning District, Shanghai, 2003

Applicant before: Shanghai zhihuilin Medical Technology Co.,Ltd.

TA01 Transfer of patent application right
GR01 Patent grant
GR01 Patent grant