CN110458861A

CN110458861A - Object detection and tracking method and device

Info

Publication number: CN110458861A
Application number: CN201810417081.3A
Authority: CN
Inventors: 胡琦; 李献; 王世婷; 温东超
Original assignee: Kato Corp
Current assignee: Kato Corp
Priority date: 2018-05-04
Filing date: 2018-05-04
Publication date: 2019-11-15
Anticipated expiration: 2038-05-04
Also published as: CN110458861B

Abstract

This disclosure relates to an object detection and tracking method and device. An object tracking device for an image frame sequence is provided, wherein the image frame sequence includes a plurality of image frames and each image frame includes at least one object. The device includes a determination unit configured to determine discrete object detection image frames from the plurality of image frames based on information about the relative state of objects in the image frames, and a tracking unit configured to obtain object tracking information for the image frame sequence based on object detection of the at least one object in the determined discrete object detection image frames.

Description

Object detection and tracking method and device

Technical field

The present invention relates to object detection and tracking, and more particularly to object detection and tracking in an image frame sequence.

Background

Video tracking is commonly used to identify a target object across multiple image frames over a period of time. In video tracking, the object in each image frame is detected to obtain detection results for the target object, and the detection results are then combined to obtain the track of the target object. Typically, an object is assigned an ID during detection, and the detection results of objects with the same ID can be used to obtain the motion trajectory of that object.

In recent years, object tracking has become very important for various applications, such as surveillance analysis and action recognition. With the notable progress in person detection (for example, the introduction of HOG features), detection-based tracking has become very attractive. However, detection speed remains a bottleneck when the accuracy requirement must be met. Most existing detection-based tracking methods run detection on every frame. This prevents real-time tracking, especially in high-definition video.

In addition, a tracking method that detects frame by frame inevitably runs detection on "difficult to detect" frames, for example frames in which objects occlude one another. This can cause tracking errors such as identity switches, which in turn may distort the analysis of a person's behavior and cause the person's track to be lost.

Therefore, detection-based tracking still needs to be improved.

The approaches described in this section are approaches that could be pursued, but not necessarily approaches that have been previously conceived or pursued. Therefore, unless otherwise indicated, it should not be assumed that any of the approaches described in this section qualify as prior art merely by virtue of their inclusion in this section. Similarly, unless otherwise indicated, problems identified with respect to one or more approaches should not be assumed to have been recognized in any prior art on the basis of this section.

Summary of the invention

The present invention has been made in view of the technical problems in the prior art and provides an improved object tracking technique.

The present disclosure proposes improved detection-based object tracking in which, taking into account the content of the video (in particular, information about the relative state of the objects in the image frames), only a few discrete frames (hereinafter referred to as "sample detection frames") are sampled for object detection, and the interval between two sample detection frames can change dynamically according to the content of the video.

In one aspect, the present disclosure proposes an object tracking device for an image frame sequence, wherein the image frame sequence includes a plurality of image frames and each image frame includes at least one object. The device may include a determination unit configured to determine discrete object detection image frames from the plurality of image frames based on information about the relative state of objects in the image frames, and a tracking unit configured to obtain object tracking information for the image frame sequence based on object detection of the at least one object in the determined discrete object detection image frames.

In another aspect, the present disclosure proposes an object tracking method for an image frame sequence, wherein the image frame sequence includes a plurality of image frames and each image frame includes at least one object. The method may include a determination step of determining discrete object detection image frames from the plurality of image frames based on information about the relative state of objects in the image frames, and a tracking step of obtaining object tracking information for the image frame sequence based on object detection of the at least one object in the determined discrete object detection image frames.

In a further aspect, a device including at least one processor and at least one storage device is proposed. Instructions are stored on the at least one storage device and, when executed by the at least one processor, cause the at least one processor to execute the method described herein.

In yet another aspect, a non-transitory storage device storing instructions is proposed, wherein the instructions, when executed, cause at least one processor to execute the method described herein.

The detection-based object tracking of the present disclosure can achieve a higher tracking speed while still achieving high tracking accuracy, even in full HD video.

The tracking of the present disclosure is also applicable to tracking other objects, such as vehicles. The disclosure can be used for human activity analysis, surveillance, and the like.

Further features of the invention will become apparent from the following description of exemplary embodiments with reference to the accompanying drawings.

Brief description of the drawings

The accompanying drawings, which are incorporated in and constitute a part of the specification, illustrate embodiments of the invention and, together with the description, serve to explain the principles of the invention. In the drawings, like reference numerals denote like items.

Fig. 1 shows a prior-art object detection and tracking process.

Fig. 2 shows a prior-art example of an identity switch caused by occlusion.

Fig. 3 schematically shows an example of the tracking effect of the present disclosure.

Fig. 4 shows a schematic block diagram of an object tracking device according to the present disclosure.

Fig. 5 shows an exemplary head-shoulder region used as the object region.

Fig. 6 shows an exemplary overlap ratio calculation of the present disclosure.

Fig. 7 shows an exemplary motion change rate calculation of the present disclosure.

Fig. 8 shows an example in which the interval between the image frames used for detection is too sparse.

Fig. 9 shows an example in which the interval between the image frames used for detection is too dense.

Fig. 10 shows an example in which an occluded frame should be skipped.

Fig. 11 shows an object tracking method according to the present disclosure.

Fig. 12 is a flowchart showing processing according to a first embodiment of the present disclosure.

Fig. 13 is a flowchart showing the processing for determining sample detection frames according to the first embodiment of the present disclosure.

Fig. 14 schematically shows the setting of an initial detection frame.

Fig. 15 schematically shows the adjustment of an initial detection frame.

Fig. 16 schematically shows the finally determined sample detection frames and frame sets.

Fig. 17 schematically shows the association process according to the first embodiment of the present disclosure.

Fig. 18 schematically shows a comparison between the prediction of a previous track in an image frame and the detection result in that image frame.

Fig. 19 schematically shows the ROI region on an image frame used for detection.

Fig. 20 is a flowchart showing processing according to a second embodiment of the present disclosure.

Fig. 21 is a flowchart showing the processing for determining sample detection frames according to the second embodiment of the present disclosure.

Fig. 22 schematically shows candidate image frames based on an initial image frame.

Fig. 23 is a flowchart showing processing according to a third embodiment of the present disclosure.

Fig. 24 is a flowchart showing the processing for determining sample detection frames according to the third embodiment of the present disclosure.

Fig. 25 is a block diagram showing an exemplary hardware configuration of a computer system on which embodiments of the present invention may be implemented.

Detailed description of the embodiments

Exemplary possible embodiments relating to detection-based object tracking are described herein. In the following description, for purposes of explanation, numerous specific details are set forth in order to provide a thorough understanding of the present invention. It will be apparent, however, that the present invention may be practiced without these specific details. In other instances, well-known structures and devices are not described in detail, to avoid unnecessarily occluding, obscuring, or obfuscating the present invention.

Most prior-art detection-based tracking methods consist of a person detection step and a data association step, both of which are performed on every frame of the video's image frame sequence. This is described in the paper "Online Multi-Person Tracking-by-Detection from a Single, Uncalibrated Camera", Michael D. Breitenstein et al., TPAMI 2010, which is incorporated herein by reference in its entirety.

Fig. 1 is a flowchart showing prior-art detection-based tracking performed on every frame.

For an image frame t containing at least one person to be detected, a person detector (sliding-window based, feature based, or the like) runs on the whole image frame to obtain detection results, and a data association unit then determines which detection result should extend which track of the previous tracking result on frame t-1. To pursue high accuracy, the data association method can be very complex. For example, taking into account the detector confidence and the person's position, motion, and appearance, greedy schemes and score functions have received the most attention.

Prior-art frame-by-frame tracking is very slow because, whether the detector is sliding-window based or feature based, detecting on every image frame and over the whole image frame severely reduces processing speed. In addition, for a given frame, the larger the search region for detection, the slower the detection.

In addition, prior-art frame-by-frame object detection and tracking may suffer from identity switches, which occur in particular when detecting on "difficult to detect" frames, that is, frames in which multiple objects unavoidably occlude one another. Note that within a track the same person should always be assigned the same ID. If the same person is given different IDs, the behavior analysis of that person may be distorted and the person's track may be lost.

Specifically, if multiple objects occlude one another in a surveillance scene, an identity switch may occur because prior-art methods must run detection on the "difficult to detect" frames. As shown in Fig. 2, two people move in opposite directions and come closer and closer. They cross in frame t, where occlusion occurs. Because of the occlusion, frame t is a "difficult to detect" frame. When person detection is executed on frame t, both people are detected. Then, when an association strategy (for example, association based on position, motion, and appearance) is used to determine which detection result should extend which track, the two people are too close to be separated easily by a simple association strategy, so an identity switch occurs.

It has been found that motion is usually coherent between adjacent frames, so in the current frame an object typically appears in a region near its position in the previous frame. Therefore, running detection on every frame, and over the entire frame, is unnecessary.

In view of this, the present disclosure proposes improved object tracking using discrete-frame detection and tracking; in particular, only a few frames are determined as discrete frames on which detection is executed. Determining such discrete frames can be regarded as sampling the frames used for detection, and the discretely sampled frames used for detection can be referred to as sample object detection frames. Because detection is performed on only some of the frames of the image frame sequence rather than on all of them, the tracking speed of the present invention is improved.

In addition, the sampling of video frames can be performed adaptively according to the content of the video, where the content of the video includes, for example, the motion and posture of objects and occlusion information between objects, and the interval between two sample detection frames can change dynamically according to the content of the video.

Preferably, the content of each frame in the video can be represented by a reliability confidence value, which can be used to measure the reliability and complexity of a frame. A frame with a higher reliability confidence value has a higher probability of not being a "difficult to detect" frame and of yielding a correct detection. The reliability confidence value of a frame can be calculated from the motion and posture of the objects and the occlusion information of the objects in that frame. Executing detection on sample detection frames whose reliability confidence values are as high as possible yields more accurate detection results, which further reduces detection errors.

Therefore, by means of the reliability confidence value, the present disclosure can avoid detecting on "difficult to detect" frames with low reliability confidence values, which reduces tracking errors such as identity switches and achieves better accuracy.

As shown in Fig. 3, because occlusion occurs, frame t is a "difficult to detect" frame. The present disclosure executes detection on frame t-1 and frame t+1 rather than on frame t, and predicts the position of the object on frame t, thereby avoiding an ID switch.

The present disclosure relates to object detection and tracking and is not limited to a single object; it can also be applied to the detection and tracking of multiple objects. An object can be any movable object, such as a person, a moving animal, or a vehicle. The present disclosure is useful in application scenes containing many objects or people, and is especially suitable for shopping scenes, in which customers deform greatly as they pick up articles, and for similar scenes. As for the application environment, the present disclosure is suitable for offline video processing or for online processing with delayed output.

Embodiments of the present disclosure are described in detail below. Note that in the context of this description, "image" refers to an image in any appropriate form, such as an image frame in a video; therefore, to a certain extent, "image" can be used interchangeably with "frame" and "image frame".

Fig. 4 shows a schematic block diagram of a device according to an embodiment of the present disclosure. The device is an object tracking device for an image frame sequence, wherein the image frame sequence includes a plurality of image frames and each image frame includes at least one object. The device includes a determination unit 401 configured to determine discrete object detection image frames from the plurality of image frames based on information about the relative state of objects in the image frames, and a tracking unit 402 configured to obtain object tracking information for the image frame sequence based on object detection of the at least one object in the determined discrete object detection image frames.

In addition, the device 400 may also include a storage device for storing the input image frames and any previous tracks and detection results of objects on any number of image frames, wherein a track can be stored together with its corresponding images (in particular its start image and final image) and the corresponding object regions on those images. Each stored track can also be assigned a corresponding index to facilitate retrieval. Note that, alternatively, this storage device may be located outside the device 400.

The information about the relative state of objects in an image frame is usually related to the content of the image frames of the video, where the content of the video may include, for example, the motion and posture of objects and occlusion information between objects. The relative state of objects may include the relative state between objects within one frame, such as the occlusion state between objects in the same image frame, and/or the relative state between an object in an image frame and the object in a previous image frame, for example with respect to motion or posture. The relative state of objects is not limited to these examples.

According to some embodiments, the information about the relative state of objects in an image frame may include occlusion information of the objects in the image frame, for example occlusion between at least two people, or occlusion between a person and another object (such as a building or a vehicle). The occlusion information of objects (such as the overlap ratio between the object regions of the objects) can be measured in various ways.

For object detection in an image, a representative part of the object can be used as the object region to be detected, without detecting the whole image. Taking a person as an example, the head-shoulder part of the person can be used as the detection target, as shown in Fig. 5. Other object regions, such as the face region, can also be used. Such object regions can be detected by various techniques; for example, the object detector can be sliding-window based or feature based.

The overlap ratio can be calculated from the object regions in many ways, for example as the ratio of intersection to union of the object regions or from the position difference between the object regions. Fig. 6 shows the overlap between two rectangles representing object regions used for detection.

The overlap ratio is defined as the ratio of the intersection to the union of the two rectangles, as shown in Fig. 6, where the rectangles represent object regions on the same image, such as the head-shoulder rectangles of people:

$$\text{overlap ratio} = \frac{\text{intersection } C}{\text{union}}, \qquad \text{union} = \text{rectangle } A + \text{rectangle } B - \text{intersection } C$$

If the overlap ratio is greater than or equal to a predetermined threshold, such as 0.8, the two objects represented by rectangles A and B can be considered to overlap too much to be detected reliably.
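The overlap test above can be sketched as follows. This is a minimal illustration, not the patented implementation: the `Rect` type, the function names, and the (x, y, width, height) layout are assumptions introduced here; only the intersection-over-union definition and the example threshold of 0.8 come from the text.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned object region (hypothetical layout): top-left corner, width, height."""
    x: float
    y: float
    w: float
    h: float


def overlap_ratio(a: Rect, b: Rect) -> float:
    """Return intersection area divided by union area of two rectangles."""
    # Width and height of the intersection; zero when the rectangles are disjoint.
    ix = max(0.0, min(a.x + a.w, b.x + b.w) - max(a.x, b.x))
    iy = max(0.0, min(a.y + a.h, b.y + b.h) - max(a.y, b.y))
    intersection = ix * iy
    # union = rectangle A + rectangle B - intersection C
    union = a.w * a.h + b.w * b.h - intersection
    return intersection / union if union > 0 else 0.0


def overlaps_too_much(a: Rect, b: Rect, threshold: float = 0.8) -> bool:
    """True when the overlap ratio reaches the predetermined threshold."""
    return overlap_ratio(a, b) >= threshold
```

For two 2x2 rectangles shifted by one unit horizontally, the intersection is 2 and the union is 6, so the ratio is 1/3 and the pair would not be treated as over-occluded at a threshold of 0.8.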

Alternatively, the overlap can be assessed from the position difference between the centers of the two rectangles (for example, characterized by the Euclidean distance). If the differences between the center positions, divided by the width and the height respectively, are both less than another predetermined threshold (such as 0.5), the two objects represented by rectangles A and B can be considered to overlap too much to be detected reliably.
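The center-based test can be sketched in the same spirit. The text does not say which rectangle's width and height are used for normalization, so the sketch below assumes the smaller of the two widths and heights; that choice, the function name, and the tuple layout are illustrative assumptions.

```python
def centers_too_close(a, b, threshold=0.5):
    """Check whether two object regions are nearly co-located.

    a, b: (x, y, w, h) tuples with (x, y) the top-left corner (assumed layout).
    The center offsets are normalized by the smaller width and height, which
    is an assumption; the source only says "divided by width and height".
    """
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    # Absolute offset between the two rectangle centers along each axis.
    dx = abs((ax + aw / 2.0) - (bx + bw / 2.0))
    dy = abs((ay + ah / 2.0) - (by + bh / 2.0))
    # Both normalized offsets must fall below the threshold.
    return dx / min(aw, bw) < threshold and dy / min(ah, bh) < threshold
```

With 10x10 regions, an offset of 2 pixels on each axis gives normalized offsets of 0.2 and counts as heavy overlap, while a horizontal offset of 8 pixels does not.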

According to some embodiments, the relative state of objects in image frames may also include the relative state between objects in adjacent frames, such as a relative state related to the motion speed, posture, or any other behavior of the objects in those frames. Information about this relative state can be referred to as motion change information and can be measured by the motion change rate of the object between image frames.

The motion change rate of an object can be calculated in various ways. According to one embodiment, the motion change rate of an object can be calculated from the change of the object in the current image frame relative to the object in at least one previous image frame. For example, the motion change rate of an object can be expressed as the change rate of the object's speed, which can be calculated using the track of the object in previous image frames.

Fig. 7 shows an example calculation of the motion change rate of an object, expressed as the change rate of the object's speed. If the change rate of the object's speed is greater than a threshold, the motion of the object is not constant; otherwise the motion can be considered constant.

As shown in Fig. 7, the circles are the positions of the object in the previous track. Frame t-m can be regarded as the head frame used for the calculation, and frame t-n as the tail frame used for the calculation. The speed at a frame can be calculated from the positions on two frames:

$$(u, v)_{t-m} = \frac{(x, y)_{t-m} - (x, y)_{t-n}}{\Delta t_m}$$

where $(u, v)_{t-m}$ is the speed of the object in frame t-m, $(x, y)_{t-n}$ is the known position of the object in frame t-n, $(x, y)_{t-m}$ is the known position of the object in frame t-m, and $\Delta t_m$ is the time difference between frame t-n and frame t-m, which can be calculated as $T^{head} - T^{tail}$, where $T^{head}$ and $T^{tail}$ denote the time or index of the respective frame.

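The change rate of the speed of the object is calculated as follows:

$$(rate_u, rate_v) = \frac{\left| (u, v)_{t-m} - (u, v)_{t-n} \right|}{timeDiff}$$

where $(u, v)_{t-m}$ and $(u, v)_{t-n}$ are the speeds of the object in frames t-m and t-n, $(rate_u, rate_v)$ are the change rates of the object's speed in the x direction and the y direction respectively, and $timeDiff$ is the time difference between frame t-m and frame t-n, which can be calculated as before.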

If rate_u and rate_v are both less than a threshold, the speed of the object is constant; otherwise the speed is not constant.
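The constancy check can be sketched as follows. The exact speed and change-rate formulas are not legible in this text, so the sketch assumes the natural reading: speed is displacement divided by the head-minus-tail time difference, and the change rate is the absolute per-axis speed difference divided by the time difference. All function names are hypothetical.

```python
def velocity(pos_head, pos_tail, t_head, t_tail):
    """Per-axis speed between a tail frame and a head frame (assumed formula)."""
    dt = t_head - t_tail  # time or index difference, assumed non-zero
    return ((pos_head[0] - pos_tail[0]) / dt,
            (pos_head[1] - pos_tail[1]) / dt)


def speed_change_rate(v_head, v_tail, time_diff):
    """Per-axis change rate of speed (rate_u, rate_v); the absolute value is an assumption."""
    return (abs(v_head[0] - v_tail[0]) / time_diff,
            abs(v_head[1] - v_tail[1]) / time_diff)


def is_motion_constant(rate, threshold):
    """Motion is treated as constant only if both axis rates are below the threshold."""
    rate_u, rate_v = rate
    return rate_u < threshold and rate_v < threshold
```

For example, an object whose horizontal speed goes from 1.0 to 2.0 pixels per frame over five frames has a change rate of 0.2 on that axis, which would count as constant motion under a threshold of 0.5 and as non-constant under a threshold of 0.1.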

The measure of the motion change rate is not limited to this; it can also be measured by other relative changes of the object (such as a change in the object's posture).

According to the present disclosure, the discrete object detection frames can be determined in various ways according to the information about the relative state of the objects.

In one aspect, the interval between two adjacent discrete image frames used for object detection can be adjusted according to this relative state, so that appropriate image frames can be selected as the discrete image frames on which object detection is executed. The interval between two sample object detection frames can be expressed as the number of non-sampled images between the two frames, which can be measured by the index difference between the two image frames, because the images in the sequence are ordered consecutively.

If the interval between two sample detection frames is too large, tracking errors may result, especially when the object changes frequently (for example, when an object in the video moves fast or its posture changes significantly).

Fig. 8 shows an exemplary case in which the interval between adjacent detection frames is too sparse, which may cause tracking errors or even an identity switch. (a) in Fig. 8 shows the detection result of track ID 0 in frame t-n; (b) in Fig. 8 shows the non-detection frame t, which is a frame between the two discrete frames used for object detection; (c) in Fig. 8 shows the detection result (solid rectangle) in frame t+m. As shown in Fig. 8, the person changes posture substantially between the two sample detection frames (frame t-n and frame t+m). The two detection results cannot be associated with each other using an association strategy (for example, motion, position, appearance). The detection result in frame t+m therefore cannot be associated with the prediction (dashed rectangle) of track ID 0, so a new track ID 1 is created from the detection result in frame t+m, and an ID switch occurs. Therefore, in this scene, the interval between the two sample detection frames should be smaller.

Note that prediction generally refers to predicting the object region on a desired image frame by following a previous track, and prediction can be executed in various ways. For example, such a prediction can be made by calculating the average speed of the previous track ID 0 over its last S images, where typically S = 10. If the previous track ID 0 is shorter than S, all images of the track are used. The predicted rectangle/object region on image frame t+m, shown as the dashed rectangle, can be calculated by the following formula:

$$\mathrm{Position}_{t+m} = \mathrm{Position}_{t-n} + \mathrm{speed} \times \mathrm{timeDiff}$$

where $\mathrm{Position}_{t+m}$ denotes the representative position of the predicted object region on image frame t+m, such as the center of the predicted object region; $\mathrm{Position}_{t-n}$ is the representative position of the object region of the previous track ID 0 on its image frame t-n, such as the center of the object region; timeDiff is the index difference/time difference between image frame t-n and image frame t+m, which can be calculated as described above; and speed denotes the aforementioned average speed. Of course, the prediction can be implemented in other ways, such as those described below.
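The linear prediction above can be sketched as follows. The track representation (a list of frame-index and center pairs) and the way the average speed is taken (displacement over the last S entries divided by the elapsed frames) are assumptions made for illustration; the source only states that the average speed over the last S images is used, with S typically 10.

```python
def average_speed(track, s=10):
    """Average per-axis speed over the last `s` entries of a track.

    track: list of (frame_index, (x, y)) sorted by frame index (assumed layout).
    If the track has fewer than `s` entries, the whole track is used.
    """
    recent = track[-s:]
    if len(recent) < 2:
        return (0.0, 0.0)  # not enough history: assume the object is static
    (t0, (x0, y0)), (t1, (x1, y1)) = recent[0], recent[-1]
    dt = t1 - t0
    return ((x1 - x0) / dt, (y1 - y0) / dt)


def predict_position(track, target_frame, s=10):
    """Position_{t+m} = Position_{t-n} + speed * timeDiff."""
    last_frame, (last_x, last_y) = track[-1]
    vx, vy = average_speed(track, s)
    time_diff = target_frame - last_frame
    return (last_x + vx * time_diff, last_y + vy * time_diff)
```

A track that moved from (0, 0) to (4, 2) over frames 0 to 2 has an average speed of (2, 1) per frame, so its predicted center three frames later, at frame 5, is (10, 5).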

If the interval between two sample detection frames is too small, tracking speed is adversely affected, especially when the objects in the video move at constant speed or remain stationary.

Fig. 9 shows an example case in which the interval between adjacent detection frames is too dense; in this case, detection is executed on frames t-n and t. (a) and (b) in Fig. 9 show the detection frames t-n and t, and (c) in Fig. 9 shows frame t+m, on which detection has not yet been executed. As shown in Fig. 9, the person remains stationary for a long time. The interval between the two detection frames is too small to achieve a faster tracking speed. Therefore, in this scene, the interval between the two sample detection frames should be larger.

Therefore, considering the trade-off between tracking errors and tracking speed, the interval between two sample detection frames should be determined appropriately.

According to one embodiment, the interval between two adjacent discrete object detection frames can be adjusted according to the motion change rate of the object between the two image frames. Preferably, the interval between two adjacent discrete object detection frames is negatively correlated with the motion change rate of the object; that is, the larger the motion change rate of the object, the smaller the interval.

More specifically, for the at least one object in the image frames, as long as the motion change rate of any object between adjacent image frames is greater than or equal to a predetermined threshold, there may be a large change between the image frames and the motion of that object is not constant; the interval should therefore be reduced so that denser image frames are selected to capture the large change accurately. When the motion change rates of all objects are less than the predetermined threshold, the motion of the objects in the image frames may be essentially constant; because a small number of image frames is sufficient for detection in this case, the interval is enlarged. In this case, preferably, the smaller the object speed, the larger the interval.
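The adjustment rule can be sketched as follows. Only the direction of the adjustment (shrink when any object's change rate reaches the threshold, grow when all are below it) comes from the text; the halving and doubling step, the interval bounds, and the function name are illustrative assumptions.

```python
def adjust_interval(interval, change_rates, rate_threshold,
                    min_interval=1, max_interval=30, factor=2):
    """Return a new detection interval given per-object motion change rates.

    interval: current interval between detection frames, in frames.
    change_rates: one motion change rate per tracked object.
    factor, min_interval, max_interval: hypothetical tuning parameters.
    """
    if any(rate >= rate_threshold for rate in change_rates):
        # At least one object moves non-uniformly: sample more densely.
        return max(min_interval, interval // factor)
    # All objects move (almost) uniformly: fewer detections are enough.
    return min(max_interval, interval * factor)
```

With an interval of 8 frames and a threshold of 0.5, rates of (0.1, 0.9) shrink the interval to 4, while rates of (0.1, 0.2) enlarge it to 16. A refinement that also scales the enlarged interval by object speed, as the text prefers, is left out of this sketch.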

In this respect, according to one embodiment, the plurality of image frames can first be sampled at a predetermined interval to obtain initial sample image frames. This is equivalent to dividing the plurality of image frames into frame sets of a predetermined size. The size of a frame set refers to the number of frames included in the frame set and corresponds to the interval/distance between the start frame and the final frame of the frame set, and the initial sample image frames are the start frames and/or final frames of the frame sets.
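The initial uniform sampling can be sketched as follows. Frame indices starting at 0 and the representation of a frame set as a (start frame, final frame) pair are assumptions made for illustration.

```python
from typing import List, Tuple


def initial_sample_frames(num_frames: int, interval: int) -> List[int]:
    """Indices of the initial sample frames, taken every `interval` frames."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(range(0, num_frames, interval))


def frame_sets(sample_frames: List[int]) -> List[Tuple[int, int]]:
    """Pair consecutive sample frames as (start frame, final frame) of each frame set."""
    return list(zip(sample_frames, sample_frames[1:]))
```

For a 10-frame sequence sampled every 4 frames, the initial sample frames are 0, 4, and 8, which yields the frame sets (0, 4) and (4, 8); the final frame of one set is the start frame of the next.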

Then, the interval between every two sampled object detection image frames can be adjusted taking into account the motion change rate of the objects in those images. For example, starting from the first initial sample frame set, the position of the final frame can be adjusted taking into account the motion change rate of the objects in the final frame relative to the start frame; the adjusted final frame becomes an updated object detection image frame, then serves as the start frame of the subsequent frame set, and can be used as the basis for adjusting the corresponding final frame of the subsequent frame set. This continues until all initial sample image frames have been adjusted. The sample image frames are the start frames and/or final frames of all frame sets; for example, they can be the first start frame and all subsequent final frames, or all start frames and the last final frame, and so on.

The tracks over the image frames can be obtained in various ways known in the art. The tracks can be obtained from all finally determined image frames once they are available, or can be formed progressively as the sample object detection frames are determined/adjusted. For example, after an initial object detection frame is adjusted, the detection results of the adjusted object detection frame can be associated with the previously formed tracks to update the tracks. This continues until all object detection frames have been detected and the tracks are finally formed, wherein the results for the intermediate images between the discrete images can be predicted in various ways well known in the art.

According to another embodiment, the plurality of image frames can be divided into frame sets adaptively from the start, wherein a subsequent frame set is set dynamically with reference to the previous frame set. More specifically, the first two sample image frames (the first frame set) can initially be sampled at a predetermined interval, or their interval can be further adjusted adaptively taking into account the motion change rate of the objects as described above, and tracks can be formed from the detection results of these two frames. Then, starting from the later of the first two image frames, a subsequent frame set can be set whose size equals the aforementioned predetermined interval or the adjusted interval, wherein the final frame of the subsequent frame set can be regarded as an initial image frame for object detection. The size of the subsequent frame set, in particular the position of its final frame, can then be further adjusted adaptively taking into account the motion change rate of the objects, so that the final frame becomes an updated object detection image frame, and the object detection results of that image frame can be associated with the previous tracks. This continues until all image frames have been processed.

According to one embodiment, the determination unit includes a unit configured to set an initial image frame and a unit configured to update the initial image frame according to the motion change information of the objects in the initial image frame relative to the previous object detection image frame. When the motion change information is greater than or equal to a second threshold, the distance/interval between the initial image frame and the previous object detection image frame should be reduced, meaning that the initial image frame should be moved closer to the previous object detection image frame. When the motion change information is less than the second threshold, the distance/interval between the initial image frame and the previous object detection image frame should be increased, meaning that the initial image frame should be moved further from the previous object detection image frame.

On the other hand, the discrete object detection frames can be determined based on the reliability confidence values of the image frames, with image frames whose reliability confidence value is higher than a threshold serving as object detection image frames. The reliability confidence value reflects the likelihood that the objects in an image frame are not "hard to detect". A frame with a higher reliability confidence value has a higher probability that it is not a "hard-to-detect" frame and that detection on it is correct.

As shown in Figure 10, occlusion occurs on some frames: (a) frame t-n, no occlusion; (b) frame t, occlusion; and (c) frame t+m, no occlusion. If the objects in a frame occlude one another, detection on that frame may be inaccurate, so frame t is preferably skipped to avoid performing detection on it.

The reliability confidence value can be determined based on the object relative state information (in particular, mainly on the occlusion state of the objects in the image frame). Preferably, the reliability confidence value of an image frame is negatively correlated with the occlusion ratio of the objects in the image frame: the larger the occlusion ratio, the smaller the reliability confidence value. Specifically, if the objects in a frame occlude one another, detection on that frame may be inaccurate, so the frame is preferably skipped. Object detection on frames in which occlusion is likely can thus be avoided, which can improve detection precision compared with the prior art.

In this respect, according to one embodiment, the reliability confidence value of each image frame can be calculated for the plurality of image frames, and the image frames whose reliability confidence value is higher than a threshold are then used as object detection image frames. As described above, this can improve detection precision. In addition, to further reduce the number of sampled detection frames, the frames can additionally be sampled at a predetermined interval.

According to another embodiment, the plurality of image frames can first be divided into frame sets each having a predetermined number of frames. For each frame set, the reliability confidence value of each image frame can be calculated, and a predefined number of image frames in the frame set whose reliability confidence value is higher than a threshold can then be selected as object detection frames. For example, for each frame set, the image frame with the highest reliability confidence value, or the several image frames with the highest values, can be selected as object detection frames. Alternatively, when the reliability confidence values are close to one another, the frame closest to a predefined frame in the frame set (such as the final frame or the middle frame) can be selected for object detection. The predetermined threshold can be set adaptively for each case.
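The per-set selection described above can be illustrated with a minimal Python sketch. The function name, the parameters, and the tie-breaking rule (exact ties go to the later frame, that is, the one nearer the final frame of the set) are assumptions made for illustration only. The confidence values are taken as given inputs.

```python
from typing import List, Sequence


def select_detection_frames(confidences: Sequence[float], set_size: int,
                            threshold: float, per_set: int = 1) -> List[int]:
    """Pick up to `per_set` detection frames from each fixed-size frame set.

    Hypothetical helper: `confidences[i]` is the reliability confidence
    value of frame i, computed elsewhere.
    """
    selected: List[int] = []
    for start in range(0, len(confidences), set_size):
        indices = range(start, min(start + set_size, len(confidences)))
        # Keep only frames whose confidence is higher than the threshold.
        eligible = [i for i in indices if confidences[i] > threshold]
        # Highest confidence first; exact ties favor the later frame
        # (an assumed stand-in for "closest to the final frame").
        eligible.sort(key=lambda i: (confidences[i], i), reverse=True)
        selected.extend(sorted(eligible[:per_set]))
    return selected
```

A frame set in which no frame exceeds the threshold contributes no detection frame in this sketch; a real system would need a fallback for that case.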

According to yet another embodiment, the frame sets divided from the plurality of image frames can be adjusted, for example in view of the motion change information of the object as described above. The reliability confidence value of each image frame can then be calculated for each adjusted frame set, and a predefined number of image frames in the frame set can be selected as object detection frames as described above.

According to another embodiment, for a frame set, the reliability confidence value can be calculated for only some of the frames in the frame set rather than for all of them. Preferably, the reliability confidence value is calculated only for a predefined number of frames that include the final frame of the frame set. This predefined number of frames may include frames on both sides of the final frame, in which case the final frame may be centered, or may include only frames within the frame set. As an example, the predefined number can also be determined in view of the motion state change information of the final frame: the larger the motion state change information, the larger the predefined number.

According to another embodiment, frame sets can be adjusted dynamically with reference to the previously sampled object detection image frame. More specifically, a first frame set containing a predetermined number of frames can be set first, or the size of the first frame set can be further adjusted in view of the motion change information. The frame with the highest reliability confidence value in the first frame set can then be determined as the sampled object detection frame as described above. The first frame set can then be updated so that the determined object detection frame is its final frame. Based on that final frame, a subsequent new frame set can be set as described above, for example containing the predetermined number of frames, or containing the same number of frames as the previous frame set. The new frame set can also be further adjusted in view of the motion change information as described above. For each such subsequent frame set, the reliability confidence value can then be used as described above to select the sampled object detection frame. The process continues in the same way until all frame sets have been processed.

According to an embodiment of the present disclosure, the determination unit may further include: a computing unit 403 configured to calculate the reliability confidence value of an image frame based on the occlusion ratio of the objects in the image frame; and a selecting unit 404 configured to select image frames whose reliability confidence value is higher than a threshold as object detection image frames.

According to an embodiment of the present disclosure, the determination unit may also include, for an object detection image frame to be determined, a unit configured to set an initial image frame, and a unit configured to determine the image frame with the highest reliability confidence value among several candidate image frames, including the initial image frame, in the plurality of image frames. For example, the unit for determining the image frame with the highest reliability confidence value can be implemented by the selecting unit 404, or can be separate from the selecting unit 404.

Preferably, for each of the several candidate image frames, its reliability confidence value can further be calculated based on the distance/interval from the candidate image frame to the initial image frame.

Preferably, the number of candidate image frames can be set based on the object speed of the objects in the initial image frame, where the smaller the object speed, the smaller the number, so as to reduce the computational cost of sampling image frames.

Preferably, the distance from the initial image frame to the previous object detection image frame can be adjusted according to the motion state change information of the objects in the image frame. When the motion state change information is greater than or equal to the second threshold, the distance is shortened; in that case, preferably, the larger the motion change rate, the smaller the distance. When the motion state change information is less than the second threshold, the distance is increased; in that case, preferably, the smaller the object speed, the larger the distance.

After the discrete object detection frames have been determined, a track based on them can be formed, for example by connecting the detection results of the object detection frames with the corresponding previous track. In particular, the detection results for the image frames lying between two adjacent discrete image frames can be interpolated from the detection results of those two frames, or from the previous track on the earlier of the two frames together with the detection result on the later one. The track is then formed by connecting these detection results to one another.

Note that this interpolation can be performed in various ways. For example, a suitable interpolation algorithm is a simple bilinear interpolation algorithm.
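As one possible illustration of filling in the frames between two detection frames, the following minimal Python sketch interpolates each box coordinate linearly over the frame index. Plain per-coordinate linear interpolation is an assumption chosen for simplicity and is not necessarily the algorithm intended by the text. The box layout (cx, cy, width, height) and the function name are likewise hypothetical.

```python
from typing import Dict, Tuple

Box = Tuple[float, float, float, float]  # (cx, cy, width, height), assumed layout


def interpolate_boxes(t0: int, box0: Box, t1: int, box1: Box) -> Dict[int, Box]:
    """Linearly interpolate boxes for the frames strictly between t0 and t1."""
    if t1 <= t0:
        raise ValueError("t1 must be greater than t0")
    result: Dict[int, Box] = {}
    for t in range(t0 + 1, t1):
        a = (t - t0) / (t1 - t0)  # 0 at frame t0, 1 at frame t1
        result[t] = tuple((1 - a) * p + a * q for p, q in zip(box0, box1))
    return result
```

Two adjacent detection frames (t1 = t0 + 1) yield an empty result, since there is no frame between them to fill.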

According to one embodiment, the tracking unit 402 may also include an association unit 405 and a positioning unit 406. The association unit 405 is configured to align the detection results of the objects in the current object detection image frame with the detection results of the objects in the previous object detection image frame. The positioning unit 406 is configured to determine, based on the detection results of the objects in the current object detection image frame and the track determined from the previous object detection image frame, the tracking information of the objects between the previous object detection image frame and the subsequent object detection image frame.

The above units can be implemented in various ways, such as software modules, hardware components, or firmware, as long as they can realize the described functions.

According to an embodiment of the present disclosure, an object tracking method for an image frame sequence is provided, where the image frame sequence includes a plurality of image frames and each image frame includes at least one object. Figure 11 shows a flowchart of the method according to the present disclosure.

In step S1101 (hereinafter referred to as the determination step), discrete object detection image frames are determined from the plurality of image frames based on information about the relative state of the objects in the image frames.

In step S1201 (hereinafter referred to as the tracking step), information for object tracking in the image frame sequence is obtained based on object detection of the at least one object in the determined discrete object detection image frames.

This information can then be output as the tracking result, or can be further processed to obtain the tracking result.

Preferably, the information about the object relative state may include at least one of the occlusion ratio of the objects in the image frame and the motion change rate of the objects in the image frame.

Preferably, the determination step may further include a calculation step of calculating, for each image frame, the reliability confidence value of the image frame based on the occlusion ratio of the objects in the image frame, and a selection step of selecting image frames whose reliability confidence value is higher than a threshold as object detection image frames.

Preferably, the distance between two adjacent discrete object detection image frames can be adjusted according to the motion change rate of the objects in the two image frames.

Preferably, the determination step may further include, for an object detection image frame to be determined, a step of setting an initial image frame, and a step of determining, as the object detection image frame, the image frame with the highest reliability confidence value among several candidate image frames, including the initial image frame, in the plurality of image frames.

Preferably, for each of the several candidate image frames, its reliability confidence value can further be calculated based on the distance from the candidate image frame to the initial image frame.

Preferably, the number of candidate image frames can be set based at least on the object speed in the initial image frame; preferably, the smaller the object speed, the smaller the number.

Preferably, the distance from the initial image frame to the previous object detection image frame can be adjusted according to the motion change rate of the objects in the initial image frame relative to the previous object detection image frame. When the motion change rate is greater than or equal to the second threshold, the distance is reduced; when the motion change rates of all objects are less than the second threshold, the distance is increased.

Preferably, the reliability confidence value of an image frame is negatively correlated with the occlusion ratio of the objects in the image frame.

Preferably, the distance between two adjacent discrete image frames used for object detection is negatively correlated with the motion change rate of the objects between the two image frames.

Preferably, the determination step may include a step of setting an initial image frame, and a step of updating the initial image frame, according to the motion state change information of the objects in the initial image frame relative to the previous object detection image frame, to serve as the object detection image frame. When the motion state change information is greater than or equal to the second threshold, the distance between the initial image frame and the previous object detection image frame is reduced; when the motion state change information is less than the second threshold, that distance is increased.

Preferably, the tracking step may include: an association step of aligning the detection results of the objects in the current object detection image frame with the detection results of the objects in the previous object detection image frame; and a positioning step of determining, based on the detection results of the objects in the current object detection image frame and the track determined from the previous object detection image frame, the tracking information of the objects between the previous object detection image frame and the subsequent object detection image frame.

By determining discrete image frames and performing detection only on them, the number of image frames used for object detection can be reduced, and the detection and tracking speed can therefore be improved.

Preferably, the discrete image frames can further be determined adaptively based on the content of the video (in particular the object relative state information in the images, and more particularly the motion change rate). The image frames used for object detection and tracking can therefore be determined more suitably, and tracking speed can be further improved without adversely affecting precision.

More preferably, the discrete image frames can further be determined based on the reliability confidence values of the image frames, which are calculated in particular from the occlusion information of the objects in the image frames. Any "hard-to-detect" frame can thus be skipped effectively, and detection on "hard-to-detect" frames with low reliability confidence values can be avoided, which reduces tracking errors such as identity switches and achieves better accuracy. Moreover, the present invention performs detection on only a few frames, which yields faster tracking speed.

Hereinafter, some embodiments will be described in detail with reference to the accompanying drawings.

First embodiment

Hereinafter, the first embodiment of the present disclosure is described with reference to Figure 12, which is a flowchart of the object detection and tracking processing according to the first embodiment. The input is the image frame sequence of a video, and in this embodiment the input image frames are divided into frame sets for object detection. Preferably, the frame sets can be adjusted adaptively in view of the information about the object relative state in the image frames (in particular the motion change rate of the objects), and once the frame sets have been suitably determined, the tracks can be obtained.

In step S1201, the current frame set, and the sampled detection image that serves as the final frame of the frame set, can be set suitably in view of the information about the object relative state in the image frames (in particular the motion change rate of the objects).

For the current frame to be determined, the corresponding input can also include the object positions in previous frames. These positions can be used to predict the motion of the object, including its motion speed and direction and its position in a frame, and can be used for sampling the detection frame.

Figure 13 schematically shows the process flow in step S1201.

As shown in step S1301, an initial detection frame can be selected from the image sequence. The initial detection frame can be determined relative to the previous detection frame, and the interval between them can be a predetermined value.

Preferably, the initial detection frame can be the final frame of a frame set, and can be determined relative to the previous detection frame according to the size of the previous frame set. For example, if the size of the previous frame set is S_last and the index of the start frame of the current frame set is T_start, which is the frame following the final frame of the previous frame set, then the frame index of the initial sampled detection frame of the current frame set is T_start + S_last - 1, denoted T_init, as shown in Figure 14.

In step S1302, the initial detection frame can be further adjusted in view of the motion change rate of the objects in the image frame. The motion change rate of an object can be calculated from the object's track in the previous tracks. If the change rate of the object's speed is greater than or equal to a threshold, the motion of the object is considered not constant; otherwise it is considered constant. This calculation is described above with reference to Figure 7. For the at least one object in the initial image frame, as long as the motion of any one object is not constant, the motion of the image frame is regarded as non-constant and the initial image frame can be updated.

More specifically, if the speed of any one of the objects in the scene is not constant, the frame index of the initial detection frame can be adjusted to T_{init_opt} = T_init - β, as shown in Figure 15. The value of β is adjusted according to the change rate of the object's speed: if the change rate is large, β is set to a larger value, meaning that the sampling interval can be denser for images with large motion change; otherwise, β is set to a smaller value.

For example, to determine the value of β when there is at least one object in the image frame, the motion change rate can be the largest of the motion change rates of the individual objects, or a statistic of those motion change rates, such as their average.

Conversely, if the speeds of all objects in the scene are constant, the frame index of the initial detection frame is adjusted as:

T_{init_opt} = T_init + α

where T_init is the initial frame index of the sampled detection frame. The value of α is adjusted according to the speed of the objects: the smaller the speed, the larger the value assigned to α, meaning that the sampling interval can be sparser for images with constant motion.

Thus, by adjusting the initial detection frame, the sampled detection image frame can be finally optimized and determined.
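The index arithmetic of steps S1301 and S1302 can be illustrated with a minimal Python sketch. The function names are hypothetical, and α and β are supplied by the caller because the text gives no numeric rule for deriving them. The per-object speed change rates are likewise assumed to have been computed elsewhere.

```python
from typing import Sequence


def initial_frame_index(t_start: int, s_last: int) -> int:
    """T_init = T_start + S_last - 1: final frame of a set sized like the last one."""
    return t_start + s_last - 1


def adjust_frame_index(t_init: int, change_rates: Sequence[float],
                       threshold: float, alpha: int, beta: int) -> int:
    """Move the detection frame earlier or later depending on motion constancy.

    `change_rates` holds one speed change rate per object (assumed input).
    """
    # If any object's motion is not constant, sample more densely.
    if any(rate >= threshold for rate in change_rates):
        return t_init - beta
    # All objects move at constant speed: sample more sparsely.
    return t_init + alpha
```

For example, with T_start = 11 and S_last = 10, T_init is 20. One object at or above the threshold with β = 4 moves the detection frame to index 16, whereas constant motion with α = 3 moves it to index 23.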

In step S1303, the current frame set can be formed based on the sampled detection image and the previous detection image, as shown in Figure 16.

In step S1202, detection can be executed on the sampled detection frame of each frame set. One example is to use an AdaBoost-based object detector to execute detection on the whole image and generate detection results.

As another example, object detection (referred to as global detection) can be executed on the sampled detection frame with an offline-trained object detector using a sliding window. Global detection can provide initial object positions for new trackers and correct drifted object positions. In the frame-by-frame detection structure of the prior art, pursuing high precision always means sacrificing processing speed, and the computational load is high. In the present disclosure, by contrast, global detection is executed on only a few frames, so processing speed improves significantly.

In step S1203, the detected objects can be associated with the current tracks. Data association determines which track each global detection result belongs to, and links the detection results belonging to the same object into a long track. As in the prior art, motion, position, and appearance are used for data association, as shown in Figure 17. In one implementation, the association can be executed by comparing the prediction of the track in the frame with the detection results, as shown in Figure 18.

For example, this comparison is executed on the position and size differences between the prediction of the track in the frame and the detection result, which makes it possible to determine whether the detection result is the position of the object in the current frame. As another example, a person re-identification method is used to compare the color similarity between the track prediction and the detection result in the frame.

First, a track-detection pair is generated for each track and each detection result.

Second, an association score can be calculated for each track-detection pair, where the association score is determined from motion, position, and appearance.

AssociationScore = SimilarityScore - SizeDifference - PositionDistance

SimilarityScore is the color similarity between the detection result and the track prediction in the current frame, and can be determined in various ways.

SizeDifference is the size difference between the detection result and the track prediction in the current frame. For example, it can be calculated as sizeRatio as follows:

sizeRatio = MIN(prediction_width, detection_width) / MAX(prediction_width, detection_width)

where detection_width is the width of the detection result and prediction_width is the width of the predicted object in the current frame (for example, the width of the object region).

PositionDistance is the position difference between the detection result and the track prediction in the current frame. For example, it can be calculated as diffRatio as follows:

positionDiffX = |prediction_cx - detection_cx|

positionDiffY = |prediction_cy - detection_cy|

diffRatioX = positionDiffX / objectSize

diffRatioY = positionDiffY / objectSize

diffRatio = MAX(diffRatioX, diffRatioY)

where prediction_cx and prediction_cy are the x and y coordinates of the center of the object region predicted in the current frame, calculated by the formulas that follow, and detection_cx and detection_cy are the x and y coordinates of the center of the detection result. objectSize denotes the object size, usually the size (for example, the area) of the object region (for example, a rectangle).

prediction_cx = timeT_position_cx + timeDiff * speed_X

prediction_cy = timeT_position_cy + timeDiff * speed_Y

prediction_width = timeT_position_Width

prediction_height = timeT_position_Height

where (prediction_cx, prediction_cy, prediction_width, prediction_height) is the prediction data of the object in the current frame, namely the x and y coordinates of the center of the predicted object region and the width and height of the predicted object region. (timeT_position_cx, timeT_position_cy, timeT_position_Width, timeT_position_Height) is the position information of the object in the known frame T, namely the x and y coordinates of the center of the object region and the width and height of the object region. speed_X and speed_Y are the speeds of the object calculated by the processing described above with reference to Figure 7.

diffRatio and sizeRatio measure, respectively, the position similarity and the size similarity between the object prediction and the detection result. If diffRatio and sizeRatio are both less than their corresponding thresholds, the object is associated with the detection result.
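As an illustration only, the size and position measures above can be written as the following minimal Python sketch. The Box class and the function names are hypothetical. object_size is passed in by the caller because the text does not fix a single definition for it (it mentions, for example, the area of the object region).

```python
from dataclasses import dataclass


@dataclass
class Box:
    """Object region given by its center and size (assumed representation)."""
    cx: float
    cy: float
    width: float
    height: float


def size_ratio(prediction: Box, detection: Box) -> float:
    """sizeRatio = MIN(widths) / MAX(widths); equals 1.0 for identical widths."""
    return min(prediction.width, detection.width) / max(prediction.width, detection.width)


def diff_ratio(prediction: Box, detection: Box, object_size: float) -> float:
    """diffRatio = MAX of the x and y center offsets, each divided by objectSize."""
    diff_ratio_x = abs(prediction.cx - detection.cx) / object_size
    diff_ratio_y = abs(prediction.cy - detection.cy) / object_size
    return max(diff_ratio_x, diff_ratio_y)
```

Note that under this formula sizeRatio grows toward 1 as the widths become more similar, whereas diffRatio shrinks toward 0 as the centers coincide.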

Then, according to the calculated association scores, the track-detection pairs whose association score is less than a threshold are removed. If one track is associated with multiple detection results, or multiple tracks are associated with one detection result, the track-detection pair with the highest association score is kept as the association result.
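One simple way to realize this filtering is a greedy pass over the pairs in descending score order, sketched below in Python. The greedy strategy, the function name, and the dictionary layout are assumptions; the text only states that low-scoring pairs are removed and that the highest-scoring pair is kept when a conflict arises.

```python
from typing import Dict, List, Tuple


def resolve_associations(scores: Dict[Tuple[int, int], float],
                         threshold: float) -> List[Tuple[int, int]]:
    """Keep one-to-one (track_id, detection_id) pairs, highest scores first.

    `scores` maps (track_id, detection_id) to an association score that is
    assumed to have been computed already.
    """
    # Remove pairs whose score is below the threshold.
    candidates = [(score, pair) for pair, score in scores.items() if score >= threshold]
    candidates.sort(key=lambda item: item[0], reverse=True)

    used_tracks, used_detections = set(), set()
    kept: List[Tuple[int, int]] = []
    for _, (track_id, detection_id) in candidates:
        # Skip pairs that conflict with an already kept, higher-scoring pair.
        if track_id in used_tracks or detection_id in used_detections:
            continue
        kept.append((track_id, detection_id))
        used_tracks.add(track_id)
        used_detections.add(detection_id)
    return kept
```

Tracks and detection results left without a partner after this pass are the unassociated cases handled in the steps that follow.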

In step S1204, the tracks of the objects in the non-sampled frames of the frame set are determined.

Since some objects may have been associated successfully while others have not, different track determination methods are applied to these objects according to the association result, in order to achieve tracking speed as fast as possible and tracking precision as high as possible.

For tracks associated with a global detection result, a fast position estimation algorithm can be used, such as motion interpolation. One example is to estimate the position from the motion:

New_position_X=timeT_position_X+timeDiff*speed_X

New_position_Y=timeT_position_Y+timeDiff*speed_Y

New_position_Width=timeT_position_Width

New_position_Height=timeT_position_Height

where (New_position_X, New_position_Y, New_position_Width, New_position_Height) is the position of the object in the current frame, namely the x and y coordinates of the center of the object region and the width and height of the object region in the current frame. (timeT_position_X, timeT_position_Y, timeT_position_Width, timeT_position_Height) is the position of the object in the known frame T, and speed_X and speed_Y are the speeds of the object calculated by the processing described above with reference to Figure 7. Object detection can then be executed in the object region at the estimated position, for example by scanning the object region with a sliding-window object detector to detect the object.
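The motion-based estimate above can be applied to every non-sampled frame in turn, as in the following minimal Python sketch. The function name and the tuple layout (x, y, width, height) are assumptions. timeDiff is taken to be the difference between the frame index and the known frame T, and the speeds are assumed to be expressed per frame.

```python
from typing import Dict, Iterable, Tuple

Box = Tuple[float, float, float, float]  # (center x, center y, width, height)


def estimate_positions(known_frame: int, known_box: Box,
                       speed_x: float, speed_y: float,
                       frames: Iterable[int]) -> Dict[int, Box]:
    """Estimate the object region in each frame from its position in frame T."""
    x, y, width, height = known_box
    estimates: Dict[int, Box] = {}
    for t in frames:
        time_diff = t - known_frame
        # The center moves with constant speed; width and height are carried over.
        estimates[t] = (x + time_diff * speed_x, y + time_diff * speed_y, width, height)
    return estimates
```

For example, an object at (100, 50) in frame 10 with speeds (2, -1) is estimated at (104, 48) in frame 12.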

For tracks not associated with any global detection result, because of the continuity of object motion, an object in the next frame generally appears in a region adjacent to its position in the previous frame. Detection is therefore executed only in an ROI region (referred to as local detection), where the ROI region is estimated from the motion of the object and is much smaller than the whole image frame. Local detection detects objects on a partial region of the frame rather than on the entire frame, which reduces the computational cost. And because of motion continuity, the object tends to appear near its position in the previous frame, so the ROI region covers the object with high probability.

As shown in Figure 19, 19(a) shows the detection of an object in the current frame, indicated by the solid rectangle. In 19(b), the dashed rectangle is the estimated prediction, the larger solid rectangle is the ROI region, and the smaller solid rectangle is the detected object. Compared with global detection, the computational cost is much lower while similar precision is reached.

First, the ROI region in which detection is executed is determined according to the following formulas.

ROI_position_cx=prediction_cx

ROI_position_cy=prediction_cy

ROI_position_width=prediction_width*widthRatio

ROI_position_height=prediction_height*heightRatio

where ROI_position_cx, ROI_position_cy, ROI_position_width, and ROI_position_height denote the x and y coordinates of the center of the ROI region and the width and height of the ROI region, respectively. prediction_cx and prediction_cy are the coordinates of the center of the object prediction in the current frame. widthRatio and heightRatio are predefined values; the default value of each is 1.8.
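The ROI formulas can be sketched in Python as follows. The function names are hypothetical, and the second helper, which converts the ROI to corner coordinates clipped to the frame bounds, is an added assumption that the text does not describe.

```python
from typing import Tuple


def roi_from_prediction(cx: float, cy: float, width: float, height: float,
                        width_ratio: float = 1.8,
                        height_ratio: float = 1.8) -> Tuple[float, float, float, float]:
    """Return (ROI cx, ROI cy, ROI width, ROI height) around the predicted region."""
    return (cx, cy, width * width_ratio, height * height_ratio)


def roi_to_corners(roi: Tuple[float, float, float, float],
                   frame_width: float,
                   frame_height: float) -> Tuple[float, float, float, float]:
    """Convert a center-based ROI to (left, top, right, bottom), clipped to the frame."""
    cx, cy, width, height = roi
    left = max(0.0, cx - width / 2.0)
    top = max(0.0, cy - height / 2.0)
    right = min(float(frame_width), cx + width / 2.0)
    bottom = min(float(frame_height), cy + height / 2.0)
    return (left, top, right, bottom)
```

With the default ratios, a predicted region of 20 by 40 pixels gives an ROI of 36 by 72 pixels centered on the same point.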

Second, detection is executed in the ROI region to detect the object. For example, an object detector scans the ROI region with a sliding window to detect the object.

By dynamically adjusting the size of the frame sets at least according to the motion change rate of the objects in the frames, faster tracking speed can be achieved while precision is maintained.

Second embodiment

Hereinafter, the second embodiment of the present disclosure is described with reference to Figure 20, which is a flowchart of the object detection and tracking processing according to the second embodiment. The input is the image frame sequence of a video, and in this embodiment the input image frames are divided into frame sets for object detection. Preferably, the frame sets are adjusted adaptively in view of the information about the object relative state in the image frames, in particular the occlusion information of the objects, and once the frame sets have been suitably determined, the tracks can be obtained.

In step S2001, the current frame set, and the sampled detection image that serves as the final frame of the frame set, can be set suitably in view of the information about the object relative state in the image frames, in particular the occlusion ratio of the objects.

For the current frame to be determined, the corresponding input also includes the object positions in previous frames. These positions are used to predict the motion of the object, including its motion speed and direction and its position in a frame, and are used for sampling the detection frame.

Figure 21 schematically illustrates the process flow in step S2001.

As shown in step S2101, an initial detection frame can be selected from the image sequence. Preferably, as described for step S1301, the initial detection frame can be determined relative to the previous detection frame according to the size of the previous frame set.

In step S2102, candidate sampled detection frames can be determined around the initial detection frame; preferably, the initial detection frame is centered among them.

Figure 22 shows exemplary candidate detection frames used to determine the final detection frame according to the present disclosure. As shown in Figure 22, the first of the candidate sampled detection frames is T_init - δ and the last is T_init + δ, where T_init denotes the initial detection frame. The initial detection frame can initially be set at an interval from the previous frame; the interval can be a predefined value or can be adjusted adaptively as described above. δ can be a predefined value or can be adjusted according to the speed of the object. To reduce the computational cost of sampling the detection frame, the smaller the object speed, the smaller the value of δ.

Note that the above is exemplary; the initial frame need not be centered, and the candidate frames can take other forms.

In step S2103, the reliability confidence value (FRCV) of each candidate sampled detection frame is calculated, and the frame with the largest FRCV is selected as the sampled detection frame. The FRCV can be calculated in view of the occlusion ratio between pairs of objects in the frame, as described above.

In addition, the closer a frame is to the initial detection frame, the higher the probability that no detection error occurs on that frame, and the larger the reliability confidence value of the frame. The calculation of the FRCV can therefore also take into account the distance between the frame and the initial detection frame.

An exemplary formula for calculating the FRCV is as follows.

FRCV_i = 1 - w₁ * max(overlapRatio) - w₂ * |T_i - T_init| / δ

where T_i is the i-th frame, and w₁ and w₂ are predefined weight factors that can be determined by a machine learning method. max(overlapRatio) denotes the largest of the overlap ratios calculated between every two objects in the initial image frame. δ can be a predefined value or can be adjusted according to the speed of the object: the smaller the speed of the object, the smaller the value of δ.

Then, the frame with the largest reliability confidence value is selected as the sampled detection frame. That is,

i = argmax(FRCV_i), where i is the frame index.
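The exemplary formula and the argmax selection can be sketched as below. The weights `w1` and `w2` default to 0.5 purely for illustration (the disclosure only says they are predefined and may be learned), and the per-candidate overlap values are assumed to be supplied by the caller, for example from predicted object positions.

```python
def frcv(t_i, t_init, delta, max_overlap, w1=0.5, w2=0.5):
    """FRCV_i = 1 - w1 * max(overlapRatio) - w2 * |T_i - T_init| / delta.

    delta must be positive; w1 and w2 are illustrative defaults.
    """
    return 1.0 - w1 * max_overlap - w2 * abs(t_i - t_init) / delta


def select_detection_frame(candidates, t_init, delta, max_overlaps, w1=0.5, w2=0.5):
    """Return (best_frame, scores), where best_frame maximizes the FRCV.

    max_overlaps maps each candidate frame index to its maximum overlap
    ratio (an assumed input, not computed here).
    """
    scores = {t: frcv(t, t_init, delta, max_overlaps[t], w1, w2) for t in candidates}
    best = max(scores, key=scores.get)
    return best, scores


# Usage: the initial frame (10) is heavily occluded, so a neighbor is chosen.
overlaps = {8: 0.0, 9: 0.1, 10: 0.8, 11: 0.0, 12: 0.0}
best, scores = select_detection_frame([8, 9, 10, 11, 12], 10, 2, overlaps)
# best == 11: FRCV is 0.75 for frame 11 versus 0.6 for the occluded frame 10
```

With these weights, moving one frame away from the initial frame costs 0.25, which is less than the 0.4 penalty that the occlusion imposes on frame 10.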

In step S2104, the current frame set may also be formed based on the sampled detection image and the previous detection image, as shown in Figure 16.

The processing in steps S2002-S2004 of the second embodiment is almost the same as the processing in the first embodiment, that is, it is similar to the processing in steps S1202-S1204, and a detailed description of these similar steps is therefore omitted.

By using a confidence value calculated from the occlusion information of the objects in the image frames, any "hard-to-detect" frame can be effectively skipped, thereby avoiding tracking errors, especially identity switches.

Third embodiment

The third embodiment of the present disclosure is described below with reference to Figure 23, which shows a flowchart of object detection and tracking processing according to the third embodiment of the present disclosure. The input is the image frame sequence of a video, and in this embodiment the input image frames are divided into frame sets used for object detection. Preferably, the frame sets are adaptively adjusted in view of the information about the relative state of the objects in the image frames, in particular both the motion change rate and the occlusion information of the objects, and once a frame set has been appropriately determined, the track can be obtained.

In step S2301, the current frame set and the sampled detection image serving as the final frame of the frame set are appropriately set, taking into account the information about the relative state of the objects, in particular both the occlusion ratio and the motion change rate of the objects in the image frames.

For the current frame to be determined, the corresponding input also includes the object positions in the previous frame. The object positions in the previous frame are used to predict the motion of each object, including its motion speed and direction and its position in a later frame, and are used to select the sampled detection frame.
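One simple way to obtain speed, direction, and a predicted position from earlier object positions is a constant-velocity model, sketched below. The disclosure does not name a motion model in this passage, so the model, the function name, and the use of two earlier positions are all assumptions.

```python
import math


def predict_motion(prev_pos, curr_pos, frames_between=1, frames_ahead=1):
    """Constant-velocity prediction from two earlier (x, y) positions.

    Returns (speed, direction, predicted_pos): speed in pixels per frame,
    direction in radians, and the position frames_ahead frames after curr_pos.
    """
    vx = (curr_pos[0] - prev_pos[0]) / frames_between
    vy = (curr_pos[1] - prev_pos[1]) / frames_between
    speed = math.hypot(vx, vy)
    direction = math.atan2(vy, vx)
    predicted_pos = (curr_pos[0] + vx * frames_ahead, curr_pos[1] + vy * frames_ahead)
    return speed, direction, predicted_pos


# Usage: an object moved from (0, 0) to (3, 4) in one frame.
speed, direction, predicted = predict_motion((0, 0), (3, 4), frames_ahead=2)
# speed == 5.0, predicted == (9.0, 12.0)
```

Predicted positions of this kind could then feed an overlap estimate for candidate frames on which no detection has yet been run.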

Figure 24 schematically shows the process flow in step S2301.

As shown in step S2401, an initial detection frame may be selected from the image sequence. Preferably, as described for step S1301, this initial detection frame may be determined relative to the previous detection frame according to the size of the previous frame set.

In step S2402, the initial detection frame may be further optimized in view of the motion of the objects in the frame, as described above for step S1302. Then, based on the optimized initial detection frame, candidate sampled detection frames may be determined around the optimized initial detection frame, as described above for step S2102.

In step S2403, the reliability confidence value (FRCV) of each candidate sampled detection frame is calculated, and the frame with the largest FRCV is selected as the sampled detection frame, as described above for step S2103.

In step S2404, the current frame set may also be formed based on the sampled detection image frame and the previous detection image frame, as shown in Figure 16.
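Steps S2401 to S2404 can be combined as in the following sketch. The interval rule follows the behavior recited in the claims (shrink the interval when any object's motion change rate reaches the threshold, grow it when all rates are below it). The step size, the interval bounds, the weights, the default overlap of 0.0 for frames without an estimate, and the choice to start the frame set at the frame after the previous detection frame are assumptions made here for illustration.

```python
def adjust_interval(interval, motion_change_rates, threshold,
                    step=1, min_interval=1, max_interval=30):
    """Shrink the interval if any rate >= threshold, otherwise grow it."""
    if any(rate >= threshold for rate in motion_change_rates):
        return max(min_interval, interval - step)
    return min(max_interval, interval + step)


def choose_sampled_frame(prev_detection_frame, interval, motion_change_rates,
                         threshold, delta, max_overlaps, num_frames,
                         w1=0.5, w2=0.5):
    """Return (sampled_frame, frame_set, new_interval) for one frame set."""
    if prev_detection_frame >= num_frames - 1:
        raise ValueError("no frames remain after the previous detection frame")
    # S2401/S2402: place the initial detection frame using the adjusted interval.
    new_interval = adjust_interval(interval, motion_change_rates, threshold)
    t_init = min(num_frames - 1, prev_detection_frame + new_interval)
    # Candidates lie around t_init and strictly after the previous detection frame.
    first = max(prev_detection_frame + 1, t_init - delta)
    last = min(num_frames - 1, t_init + delta)

    # S2403: score each candidate with the exemplary FRCV formula.
    def score(t):
        return 1.0 - w1 * max_overlaps.get(t, 0.0) - w2 * abs(t - t_init) / delta

    sampled = max(range(first, last + 1), key=score)
    # S2404: the frame set ends at the sampled detection frame.
    frame_set = list(range(prev_detection_frame + 1, sampled + 1))
    return sampled, frame_set, new_interval


# Usage: slow motion lengthens the interval; occlusion at frame 16 shifts the pick.
sampled, frame_set, new_interval = choose_sampled_frame(
    prev_detection_frame=10, interval=5, motion_change_rates=[0.1, 0.2],
    threshold=0.5, delta=2, max_overlaps={15: 0.2, 16: 0.9}, num_frames=100)
# new_interval == 6, sampled == 17, frame_set == [11, ..., 17]
```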

The processing in steps S2302-S2304 of the third embodiment is almost the same as the processing in the first embodiment, that is, it is similar to the processing in steps S1202-S1204, and a detailed description of these similar steps is therefore omitted here.

By means of both the motion change rate and the occlusion information of the objects in the image frames, any "hard-to-detect" frame can be effectively skipped, while the interval between detection frames can be determined more appropriately so as to better follow the motion changes between image frames. Tracking errors, especially identity switches, are thereby avoided, and tracking accuracy can be further improved without adversely affecting tracking speed.

Figure 25 is a block diagram showing an exemplary hardware configuration of a computer system 1000 in which embodiments of the present invention can be implemented.

As shown in Figure 25, the computer system includes a computer 1110. The computer 1110 includes a processing unit 1120, a system memory 1130, a non-removable non-volatile memory interface 1140, a removable non-volatile memory interface 1150, a user input interface 1160, a network interface 1170, a video interface 1190, and an output peripheral interface 1195, which are connected by a system bus 1121.

The system memory 1130 includes a ROM (read-only memory) 1131 and a RAM (random access memory) 1132. A BIOS (basic input/output system) 1133 resides in the ROM 1131. An operating system 1134, application programs 1135, other program modules 1136, and some program data 1137 reside in the RAM 1132.

A non-removable non-volatile memory 1141 (such as a hard disk) is connected to the non-removable non-volatile memory interface 1140. The non-removable non-volatile memory 1141 can store, for example, an operating system 1144, application programs 1145, other program modules 1146, and some program data 1147.

Removable non-volatile memories (such as a floppy disk drive 1151 and a CD-ROM drive 1155) are connected to the removable non-volatile memory interface 1150. For example, a floppy disk 1152 can be inserted into the floppy disk drive 1151, and a CD (compact disc) 1156 can be inserted into the CD-ROM drive 1155.

Input devices such as a mouse 1161 and a keyboard 1162 are connected to the user input interface 1160.

The computer 1110 can be connected to a remote computer 1180 through the network interface 1170. For example, the network interface 1170 can be connected to the remote computer 1180 via a local area network 1171. Alternatively, the network interface 1170 can be connected to a modem (modulator-demodulator) 1172, and the modem 1172 is connected to the remote computer 1180 via a wide area network 1173.

The remote computer 1180 may include a memory 1181, such as a hard disk, which stores remote application programs 1185.

The video interface 1190 is connected to a monitor 1191.

The output peripheral interface 1195 is connected to a printer 1196 and a speaker 1197.

The computer system shown in Figure 25 is merely illustrative and is in no way intended to limit the present invention, its application, or its uses.

The computer system shown in Figure 25 may be implemented, for any of the embodiments, as a standalone computer or as a processing system in a device, in which one or more unnecessary components may be removed or one or more additional components may be added.

[Industrial applicability]

The present invention can be used in many applications. For example, it can be used to detect and track objects in still images or moving video captured by a camera, and it is especially advantageous for camera-equipped portable devices, (camera-based) mobile phones, and the like.

Note that the methods and devices described herein can be implemented as software, firmware, hardware, or any combination thereof. Some components may, for example, be implemented as software running on a digital signal processor or microprocessor. Other components may, for example, be implemented as hardware and/or application-specific integrated circuits.

In addition, the methods and systems of the present invention can be carried out in various ways, for example by software, hardware, firmware, or any combination thereof. The order of the method steps described above is merely illustrative, and unless specifically stated otherwise, the steps of the method of the present invention are not limited to the order specifically described above. Furthermore, in some embodiments, the present invention may also be embodied as a program recorded in a recording medium, including machine-readable instructions for implementing the method according to the present invention. Thus, the present invention also covers a recording medium storing a program for implementing the method according to the present invention.

Although the present invention has been described with reference to example embodiments, it should be understood that the present invention is not limited to the disclosed example embodiments. The above embodiments may be modified without departing from the spirit and scope of the present invention. The scope of the following claims is to be accorded the broadest interpretation so as to encompass all such modifications and equivalent structures and functions.

In the foregoing specification, embodiments of the present invention have been described with reference to numerous specific details that may vary from implementation to implementation. Thus, the sole and exclusive indicator of what is the invention, and of what the applicant intends to be the invention, is the set of claims of this application, including any subsequent correction. Any definitions expressly set forth herein for terms contained in these claims shall govern the meaning of those terms as used in the claims. Hence, no limitation, element, property, feature, advantage, or attribute that is not expressly recited in a claim should limit the scope of that claim in any way. The specification and drawings are, accordingly, to be regarded as illustrative rather than restrictive.

Claims

1. An object tracking device for an image frame sequence, wherein the image frame sequence includes a plurality of image frames, each image frame includes at least one object, and the device comprises:

a determination unit configured to determine discrete object detection image frames from the plurality of image frames based on information about the relative state of objects in the image frames; and

a tracking unit configured to obtain object tracking information for the image frame sequence based on object detection of the at least one object in the determined discrete object detection image frames.

2. The device according to claim 1, wherein the information about the relative state of objects includes at least one of an occlusion ratio of the objects in an image frame and a motion change rate of the objects in the image frame.

3. The device according to claim 1, wherein the determination unit further comprises:

a calculation unit configured to calculate, for each image frame, a reliability confidence value of the image frame based on the occlusion ratio of the objects in the image frame; and

a selection unit configured to select an image frame having a reliability confidence value higher than a threshold as an object detection image frame.

4. The device according to claim 2 or 3, wherein the distance between two adjacent discrete object detection image frames is adjusted according to the motion change rate of the objects in the two image frames.

5. The device according to claim 3, wherein the determination unit further comprises, for an object detection image frame to be determined:

a unit configured to set an initial image frame; and

a unit configured to determine, among several candidate image frames of the plurality of image frames that include the initial image frame, the image frame having the highest reliability confidence value as the object detection image frame.

6. The device according to claim 5, wherein, for each of the several candidate image frames, the reliability confidence value is calculated further based on the distance from that candidate image frame to the initial image frame.

7. The device according to claim 5, wherein the number of the several candidate image frames is set based at least on the object speed in the initial image frame, and

wherein the smaller the object speed, the smaller the number.

8. The device according to claim 5, wherein the distance from the initial image frame to the previous object detection image frame is adjusted according to the motion change rate of the objects in the initial image frame relative to the previous object detection image frame, and

wherein the distance is decreased when the motion change rate is greater than or equal to a second threshold, and the distance is increased when the motion change rates of all objects are less than the second threshold.

9. The device according to claim 3, wherein the reliability confidence value of an image frame is negatively correlated with the occlusion ratio of the objects in the image frame.

10. The device according to claim 4, wherein the distance between two adjacent discrete object detection image frames is negatively correlated with the motion change rate of the objects between the two image frames.

11. The device according to claim 10, wherein the determination unit comprises:

a unit configured to set an initial image frame; and

a unit configured to update the initial image frame, as the object detection image frame, according to motion state change information of the objects in the initial image frame relative to the previous object detection image frame,

wherein the distance between the initial image frame and the previous object detection image frame is decreased when the motion state change information is greater than or equal to a second threshold, and is increased when the motion state change information is less than the second threshold.

12. The device according to claim 11, wherein the tracking unit comprises:

an association unit configured to align the detection results of the objects in the current object detection image frame with the detection results of the objects in the previous object detection image frame; and

a positioning unit configured to determine tracking information of the objects between the previous object detection image frame and the subsequent object detection image frame, based on the detection results of the objects in the current object detection image frame and the tracks determined from the previous object detection image frame.

13. An object tracking method for an image frame sequence, wherein the image frame sequence includes a plurality of image frames, each image frame includes at least one object, and the method comprises:

a determination step of determining discrete object detection image frames from the plurality of image frames based on information about the relative state of objects in the image frames; and

a tracking step of obtaining object tracking information for the image frame sequence based on object detection of the at least one object in the determined discrete object detection image frames.

14. A device, comprising:

at least one processor; and

at least one storage device having instructions stored thereon which, when executed by the at least one processor, cause the at least one processor to perform the method according to claim 11.

15. A non-transitory storage device storing instructions which, when executed, cause at least one processor to perform the method according to claim 13.