CN109416728A - Object detection method, device and computer system

Object detection method, device and computer system

Info

Publication number
CN109416728A
CN109416728A
Authority
CN
China
Prior art keywords
target
detection
candidate
frame
detection target
Prior art date
Application number
CN201680087590.3A
Other languages
Chinese (zh)
Inventor
刘晓青
伍健荣
白向晖
Original Assignee
富士通株式会社
Priority date
Filing date
Publication date
Application filed by 富士通株式会社 (Fujitsu Limited)
Priority to PCT/CN2016/101237 (WO2018058595A1)
Publication of CN109416728A

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING; COUNTING
    • G06K: RECOGNITION OF DATA; PRESENTATION OF DATA; RECORD CARRIERS; HANDLING RECORD CARRIERS
    • G06K 9/00: Methods or arrangements for reading or recognising printed or written characters or for recognising patterns, e.g. fingerprints

Abstract

An object detection method, device and computer system. The method comprises: for the first key frame of a video, detecting targets on the first key frame based on a deep neural network, obtaining all detection targets on the first key frame, and assigning an identifier to each detection target (101); for each normal frame following a key frame, determining, according to the object detection result of the previous frame, a candidate region on the current frame corresponding to each detection target, relocating each detection target according to the candidate region, and obtaining the tracking targets on the current frame (102); for the other key frames of the video, detecting targets on the other key frames based on the deep neural network, obtaining all detection targets on the other key frames, and integrating the detection targets on the other key frames according to the relocation result of the previous frame (103). The method can improve the accuracy of object detection and reduce time consumption.

Description

Object detection method, device and computer system

Technical field

The present invention relates to the field of image processing, and in particular to an object detection method, device and computer system.

Background art

Current object detection methods for video focus on dynamic information and ignore static objects. For object detection in images, deep convolutional neural networks achieve high accuracy; applied to video, however, this approach is too time-consuming to be acceptable.

It should be noted that the above description of the technical background is intended only to facilitate a clear and complete explanation of the technical solutions of the present invention and to aid the understanding of those skilled in the art. It should not be assumed that these technical solutions are known to those skilled in the art merely because they are set forth in the background section of the present invention.

Summary of the invention

In order to solve the problems pointed out in the background, embodiments of the present invention provide an object detection method, device and computer system, which use a deep neural network (DNN, Deep Neural Network) as the detector, detect dynamic and static objects simultaneously, and reduce time consumption.

According to a first aspect of the embodiments, an object detection method is provided, wherein the method comprises:

for the first key frame of a video, detecting targets on the first key frame based on a deep neural network, obtaining all detection targets on the first key frame, and assigning an identifier to each detection target;

for each normal frame following a key frame, determining, according to the object detection result of the previous frame, a candidate region on the current frame corresponding to each detection target, relocating each detection target according to the candidate region, and obtaining the tracking targets on the current frame;

for the other key frames of the video, detecting targets on the other key frames based on the deep neural network, obtaining all detection targets on the other key frames, and integrating the detection targets on the other key frames according to the relocation result of the previous frame.

According to a second aspect of the embodiments, an object detection device is provided, wherein the device comprises:

a first detection unit configured to, for the first key frame of a video, detect targets on the first key frame based on a deep neural network, obtain all detection targets on the first key frame, and assign an identifier to each detection target;

a relocation unit configured to, for each normal frame following a key frame, determine, according to the object detection result of the previous frame, a candidate region on the current frame corresponding to each detection target, relocate each detection target according to the candidate region, and obtain the tracking targets on the current frame;

a second detection unit configured to, for the other key frames of the video, detect targets on the other key frames based on the deep neural network, obtain all detection targets on the other key frames, and integrate the detection targets on the other key frames according to the relocation result of the previous frame.

According to a third aspect of the embodiments, a computer system is provided, wherein the computer system comprises the device described in the second aspect.

The beneficial effect of the embodiments of the present invention is that the accuracy of object detection can be improved and time consumption can be reduced.

With reference to the following description and drawings, particular embodiments of the present invention are disclosed in detail, indicating ways in which the principles of the invention may be employed. It should be understood that the embodiments of the present invention are not thereby limited in scope. Within the scope of the appended claims, the embodiments of the present invention include many changes, modifications and equivalents.

Features described and/or illustrated for one embodiment may be used in the same or a similar way in one or more other embodiments, combined with features of other embodiments, or substituted for features of other embodiments.

It should be emphasized that the term "comprises/comprising", when used herein, refers to the presence of features, integers, steps or components, but does not preclude the presence or addition of one or more other features, integers, steps or components.

Brief description of the drawings

Elements and features described in one drawing or embodiment of the present invention may be combined with elements and features shown in one or more other drawings or embodiments. In addition, in the drawings, like reference numerals denote corresponding components across the drawings and may be used to denote corresponding components used in more than one embodiment.

The accompanying drawings, which are included to provide a further understanding of the embodiments of the present invention and constitute a part of the specification, illustrate embodiments of the present invention and, together with the written description, explain the principles of the invention. It is evident that the drawings described below are only some embodiments of the invention, and that other drawings can be obtained from them by those of ordinary skill in the art without inventive effort. In the drawings:

Fig. 1 is a schematic diagram of the object detection method of Embodiment 1;

Fig. 2 is an architecture diagram of the object detection method of Embodiment 1;

Fig. 3 is a schematic diagram of performing object detection on a key frame of a video;

Fig. 4 is a schematic diagram of relocating a detection target;

Fig. 5 is a flowchart of relocating a detection target;

Fig. 6 is an overall schematic diagram of an embodiment of relocating a detection target;

Fig. 7 is a schematic diagram of integrating the detection targets on other key frames;

Fig. 8 is an overall schematic diagram of an embodiment of integrating the detection targets on other key frames;

Fig. 9 is a schematic diagram of the object detection device of Embodiment 2;

Figure 10 is a schematic diagram of the relocation unit of the object detection device of Embodiment 2;

Figure 11 is a schematic diagram of the second detection unit of the object detection device of Embodiment 2;

Figure 12 is a schematic diagram of the computer system of Embodiment 3.

Specific embodiments

Referring to the drawings, the foregoing and other features of the present invention will become apparent from the following specification. The specification and drawings disclose particular embodiments of the invention, showing some of the ways in which the principles of the invention may be employed. It is to be understood that the invention is not limited to the described embodiments; on the contrary, the invention includes all modifications, variations and equivalents falling within the scope of the appended claims. Various embodiments of the invention are described below with reference to the drawings. These embodiments are merely exemplary and do not limit the present invention.

The embodiments of the present invention are described below with reference to the accompanying drawings.

Embodiment 1

This embodiment provides an object detection method. Fig. 1 is a schematic diagram of the method; as shown in Fig. 1, the method comprises:

Step 101: for the first key frame of a video, detecting targets on the first key frame based on a deep neural network, obtaining all detection targets on the first key frame, and assigning an identifier to each detection target;

Step 102: for each normal frame following a key frame, determining, according to the object detection result of the previous frame, a candidate region on the current frame corresponding to each detection target, relocating each detection target according to the candidate region, and obtaining the tracking targets on the current frame;

Step 103: for the other key frames of the video, detecting targets on the other key frames based on the deep neural network, obtaining all detection targets on the other key frames, and integrating the detection targets on the other key frames according to the relocation result of the previous frame.

Fig. 2 illustrates the overall architecture of the object detection method of this embodiment. As shown in Fig. 2, in this embodiment object detection based on the deep neural network is performed only on key frames, yielding all targets on each key frame, referred to as detection targets. For normal frames, no object detection is performed; instead, the targets are repositioned based on the detection result or target relocation result of the previous frame, and the repositioned targets are referred to as tracking targets. Furthermore, for key frames other than the first key frame, besides detecting all targets on the key frame, the targets on the key frame are integrated with the relocation result of the preceding frame (a normal frame), so that no target is lost, assigned a duplicate identifier, or misidentified.

With the method of this embodiment, object detection is performed only on key frames, so higher detection accuracy can be achieved with less time consumption.

In step 101, the object detection can be carried out by a DNN detector, i.e. detection based on a deep neural network. For the working principle of deep neural networks, reference may be made to the prior art; it is not described in detail here. By detecting the targets on the key frame in step 101, all targets on the key frame (referred to as detection targets) are obtained. As shown in Fig. 3, each detection target can be assigned an identifier (ID) indicating that detection target. This embodiment places no restriction on the type of the identifier; it may be a specified numeric label, or it may indicate attributes of the detection target, and so on.

In this embodiment, as shown in Fig. 2, each key frame is followed by n normal frames. This embodiment places no restriction on the value of n; it can be chosen by weighing detection accuracy against the amount of computation (which determines time consumption: more computation means more time, less computation means less time). To improve detection accuracy, n can be set to a smaller value; to reduce computation, and hence time consumption, n can be set to a larger value. Preferably, n is less than 10.
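As an illustration of the scheduling in Fig. 2, the following Python sketch shows how key frames and normal frames might be dispatched. It is a minimal sketch, not the patented implementation itself: the detect, relocate and integrate callables are caller-supplied stand-ins for steps 101, 102 and 103, and the key-frame interval n is the parameter discussed above.

```python
from typing import Any, Callable, List

def process_video(frames: List[Any],
                  detect: Callable[[Any], Any],
                  relocate: Callable[[Any, Any], Any],
                  integrate: Callable[[Any, Any], Any],
                  n: int = 5) -> List[Any]:
    """Dispatch frames: DNN detection on key frames, relocation on normal frames.

    detect/relocate/integrate are caller-supplied implementations of steps
    101, 102 and 103; n is the number of normal frames after each key frame.
    """
    results: List[Any] = []
    prev = None                              # result of the previous frame
    for idx, frame in enumerate(frames):
        if idx % (n + 1) == 0:               # key frame
            dets = detect(frame)             # step 101 / 103: detect all targets
            prev = dets if prev is None else integrate(dets, prev)
        else:                                # normal frame
            prev = relocate(frame, prev)     # step 102: relocate previous targets
        results.append(prev)
    return results
```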

In step 102, for the normal frame immediately following a key frame, this embodiment relocates the detection targets according to the object detection result of that key frame (the detection targets already obtained); for the other normal frames, this embodiment relocates the tracking targets according to the target relocation result of the preceding normal frame (the tracking targets already obtained). Since the tracking targets produced by a relocation result are themselves detection targets from a key frame, for convenience of description the target relocation result is also referred to as an object detection result, and a tracking target is also referred to as a detection target. That is, the object detection result mentioned in step 102 covers both the result of performing detection on a key frame and the result of relocating targets on a normal frame; likewise, the detection target mentioned in step 102 covers both a detection target obtained by detection on a key frame and a tracking target obtained by relocation on a normal frame.

In step 102, the candidate region of a detection target can be obtained by expanding the bounding box of that detection target.

Fig. 4 illustrates target relocation for one detection target.

As shown in Fig. 4, for a target on the previous frame, such as the target inside the ellipse in the left image of Fig. 4, this embodiment searches a candidate region of the current frame for that target, relocates the target within the candidate region, such as the target inside the thick-lined rectangle in the right image of Fig. 4, and assigns it the corresponding identifier.

In this embodiment, an expansion parameter (written here as γ) can be used to expand the bounding box of the original target (the target on the previous frame) to obtain the candidate region. If the size of the original target is B_w × B_h, the size of the expanded candidate region is γ·B_w × γ·B_h.

In this embodiment, the value of γ can be set as needed and is preferably greater than 1.5.
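A possible implementation of this expansion is sketched below, assuming axis-aligned boxes given as (x, y, width, height) with the centre kept fixed; the clipping to the image boundary is an added assumption not stated in the description.

```python
def expand_box(x, y, b_w, b_h, gamma=2.0, img_w=None, img_h=None):
    """Expand a bounding box by a factor gamma around its centre.

    (x, y) is the top-left corner and (b_w, b_h) the size of the original
    target; the returned candidate region has size roughly gamma*b_w by
    gamma*b_h, optionally clipped to an image of size img_w x img_h.
    """
    cx, cy = x + b_w / 2.0, y + b_h / 2.0       # centre of the original box
    new_w, new_h = gamma * b_w, gamma * b_h     # expanded size
    nx, ny = cx - new_w / 2.0, cy - new_h / 2.0
    if img_w is not None and img_h is not None:
        nx, ny = max(0.0, nx), max(0.0, ny)
        new_w = min(new_w, img_w - nx)
        new_h = min(new_h, img_h - ny)
    return nx, ny, new_w, new_h
```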

The above way of expanding the bounding box to obtain the candidate region is merely an example and does not constitute a limitation; other feasible expansion methods can also be applied in this embodiment.

In step 102, the candidate region of the detection target having been obtained, the detection target is relocated within that candidate region. Fig. 5 shows one relocation method; as shown in Fig. 5, the method comprises:

Step 501: traversing the candidate region with a predetermined step size to obtain multiple candidate targets corresponding to the detection target;

Step 502: calculating the similarity between each candidate target and the detection target;

Step 503: determining, according to the similarities, the candidate target that matches the detection target, and taking the matched candidate target as the tracking target of the detection target on the current frame.
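Steps 501 to 503 could be realised roughly as in the sketch below, where similarity is any image-feature comparison (for example the gradient-difference matching mentioned later) and d is the traversal step size; the window layout and the frame being a NumPy image array are assumptions for illustration. The threshold test and keep_track handling described next are left to the caller.

```python
def relocate_in_region(frame, region, target_patch, similarity, d=2):
    """Slide a window of the target's size over the candidate region (step 501),
    score each candidate against the target patch (step 502) and return the
    best-scoring candidate box and its score (step 503)."""
    rx, ry, rw, rh = [int(v) for v in region]
    th, tw = target_patch.shape[:2]              # size of the original target
    best_score, best_box = float("-inf"), None
    for yy in range(ry, ry + rh - th + 1, d):    # traverse with step size d
        for xx in range(rx, rx + rw - tw + 1, d):
            cand = frame[yy:yy + th, xx:xx + tw]
            score = similarity(cand, target_patch)
            if score > best_score:
                best_score, best_box = score, (xx, yy, tw, th)
    return best_box, best_score
```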

In this embodiment, if exactly one candidate target has a similarity with the detection target greater than a first threshold, that candidate target is taken as the tracking target, and the tracking count corresponding to the detection target is reset.

In this embodiment, if several candidate targets have a similarity with the detection target greater than the first threshold, the candidate target with the largest similarity is selected from among them as the tracking target, and the tracking count corresponding to the detection target is reset.

In this embodiment, if the similarity of every candidate target with the detection target is not greater than the first threshold, it is judged whether the tracking count corresponding to the detection target is 0. If the tracking count is not 0, the tracking count is decremented by 1 and the detection target is retained on the current frame; if the tracking count is 0, the detection target is not retained on the current frame.

In this embodiment, the candidate region is traversed with a step size d to obtain the multiple candidate targets corresponding to the detection target, i.e. the ranges indicated by the rectangles on the right side of Fig. 4. Setting the step size d reduces the number of candidate targets and hence the amount of computation. In one implementation, d can be set to less than 4 pixels, preferably such that each candidate target covers more than 50 pixels, but this embodiment is not limited thereto.

In this embodiment, whether a candidate target matches the detection target can be determined by computing the similarity between image features. In one implementation, the gradient difference can be used, but this embodiment is not limited thereto; other methods of computing similarity are also applicable. Since methods of computing similarity belong to the prior art, they are not described in detail here.
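The description names gradient difference as one option; the sketch below is one possible realisation of it (an assumption, not the only valid formulation), comparing Sobel gradient-magnitude maps of the two patches and mapping the mean absolute difference to a score, with colour (BGR) patches assumed as input.

```python
import cv2
import numpy as np

def gradient_similarity(patch_a, patch_b):
    """Similarity based on the difference of Sobel gradient magnitudes;
    higher scores mean more similar patches."""
    patch_b = cv2.resize(patch_b, (patch_a.shape[1], patch_a.shape[0]))

    def grad_mag(p):
        g = cv2.cvtColor(p, cv2.COLOR_BGR2GRAY).astype(np.float32)
        gx = cv2.Sobel(g, cv2.CV_32F, 1, 0, ksize=3)
        gy = cv2.Sobel(g, cv2.CV_32F, 0, 1, ksize=3)
        return cv2.magnitude(gx, gy)

    diff = float(np.mean(np.abs(grad_mag(patch_a) - grad_mag(patch_b))))
    return 1.0 / (1.0 + diff)    # map the gradient difference to a (0, 1] score
```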

In this embodiment, the candidate target matching the detection target is found by setting the first threshold. However, if the similarity of every candidate target with the detection target is not greater than the first threshold, i.e. the detection target is not found in the current frame, this embodiment does not immediately remove the detection target from the current frame; instead, it retains the detection target for several frames by means of a tracking-count parameter (keep_track), so as to avoid misjudgment.

In this embodiment, a parameter called keep_track is provided for each detection target on the key frame. If there is no candidate target on the current frame matching the original detection target, this embodiment does not delete the identifier of the detection target immediately; instead, the value of keep_track is decremented by 1, and the detection target is retained without updating its position, until the value of keep_track reaches 0. If the detection target matches some candidate target on the previous frame before keep_track reaches 0, the value of keep_track is reset. In this embodiment, the value of keep_track can be set in the range [3, 5], i.e. to 3, 4 or 5; however, this embodiment is not limited thereto, and other values may be set.
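The keep_track bookkeeping might be organised as in this sketch; the per-target dictionary layout and the reset value KEEP_TRACK_INIT are assumptions for illustration (any value in [3, 5] per the description).

```python
KEEP_TRACK_INIT = 4   # assumed reset value, chosen from the range [3, 5]

def update_track(target, best_box, best_score, first_threshold):
    """Apply the keep_track rule to one target after relocation.

    target is a dict with keys 'id', 'box' and 'keep_track'.  Returns the
    updated target, or None if the target should no longer be retained."""
    if best_box is not None and best_score > first_threshold:
        target["box"] = best_box                 # matched: update the position
        target["keep_track"] = KEEP_TRACK_INIT   # and reset the tracking count
        return target
    if target["keep_track"] > 0:                 # unmatched but still retained
        target["keep_track"] -= 1                # decrement, keep old position
        return target
    return None                                  # keep_track is 0: drop target
```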

In this embodiment, if the video is of high resolution, each frame of the video (key frames and normal frames) can also be down-sampled before the method of this embodiment is applied, for example by down-sampling the target region or candidate target region of each frame with a factor δ. This embodiment places no restriction on the specific implementation of the down-sampling; existing means can be used.
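Any standard resize will do for the down-sampling; the OpenCV call below is just one possible choice, with the factor δ written as delta.

```python
import cv2

def downsample(region_img, delta=0.5):
    """Down-sample a target or candidate region by a factor delta (0 < delta <= 1)."""
    return cv2.resize(region_img, None, fx=delta, fy=delta,
                      interpolation=cv2.INTER_AREA)
```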

Fig. 6 is an overall flowchart of one implementation of the relocation method of this embodiment. Referring to Fig. 6, the method comprises:

Step 601: down-sampling the target region of the previous frame and extracting features;

Step 602: down-sampling the candidate target region of the current frame and extracting features;

Step 603: matching the features to obtain a matching score;

Step 604: judging whether the matching score is greater than the first threshold; if so, executing step 605, otherwise executing step 606;

Step 605: selecting the best candidate target;

Step 606: decrementing keep_track by 1;

Step 607: judging whether keep_track is 0; if not, retaining the position of the detection target in the current frame; otherwise, not retaining the position of the detection target in the current frame.

In steps 601 and 602, feature extraction refers to extracting the features of the detection target on the previous frame and of the candidate targets on the current frame, so that feature matching, e.g. similarity computation, can be performed. Moreover, this embodiment places no restriction on the execution order of steps 601 and 602; they may be performed simultaneously or separately.

In this embodiment, after the last normal frame before the next key frame has been processed in step 102, the processing moves on to that next key frame (i.e. a key frame other than the first key frame) in step 103.

In step 103, for the next key frame, in addition to performing the same object detection as in step 101, the detection targets on that key frame are integrated according to the relocation result of the previous frame (the last normal frame mentioned above); that is, target identifiers are assigned to the detection result of the current key frame according to the tracking targets of the preceding normal frame. It can thereby be determined whether targets have left the scene and whether new targets have entered.

Fig. 7 shows one integration method; as shown in Fig. 7, the method comprises:

Step 701: matching each detection target on the other key frame with each candidate target in the preceding frame;

Step 702: if the overlap region of the detection target and the candidate target is greater than a second threshold, and the matching score of the detection target and the candidate target is greater than the first threshold, assigning the identifier of the matched candidate target to the detection target;

Step 703: if the overlap region of the detection target and the candidate target is not greater than the second threshold, or the matching score of the detection target and the candidate target is not greater than the first threshold, assigning a new identifier to the detection target.

In this embodiment, if all detection targets on the other key frame have been assigned identifiers and there remain candidate targets in the preceding frame that have not been matched, the unmatched candidate targets are not retained on the other key frame.
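The overlap test in step 702 is the standard intersection over union; a straightforward implementation, assuming boxes given as (x, y, width, height), is:

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    ix1, iy1 = max(ax, bx), max(ay, by)
    ix2, iy2 = min(ax + aw, bx + bw), min(ay + ah, by + bh)
    iw, ih = max(0.0, ix2 - ix1), max(0.0, iy2 - iy1)
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0
```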

Fig. 8 is an overall flowchart of one implementation of the integration method of this embodiment. Referring to Fig. 8, the method comprises:

Step 801: computing the overlap (IOU) matrix;

Step 802: judging whether IOU_ij is greater than the second threshold; if so, executing step 803, otherwise assigning a new identifier to target i;

Step 803: matching image features;

Step 804: judging whether the matching score is greater than the first threshold; if so, assigning the identifier of target j to target i, the IOU matrix then becoming an (N-1)×(M-1) matrix, and executing step 805; otherwise assigning a new identifier to target i, the IOU matrix then becoming an (N-1)×M matrix;

Step 805: judging whether the number of rows of the IOU matrix is greater than 0; if so, returning to step 801, otherwise executing step 806;

Step 806: judging whether the number of columns of the IOU matrix is greater than 0; if so, not retaining the unmatched candidate targets on the current key frame; otherwise ending.

In this embodiment, as shown in Fig. 8, the overlap between the detection targets from the new key frame and the tracking targets from the preceding normal frame is computed first, yielding an N×M IOU (Intersection over Union) matrix. For a detection target i of the key frame, if a tracking target j of the normal frame overlaps detection target i, the overlap region IOU_ij is greater than the second threshold th2, and the matching score of the two targets is greater than the first threshold, then detection target i is assigned the same identifier as tracking target j; otherwise, detection target i is assigned a new identifier.

In this embodiment, the image feature matching is performed as described above, e.g. by computing the similarity (matching score); it is not described again here.

In this embodiment, once tracking target j has been matched to detection target i, the j-th column is deleted from the IOU matrix, i.e. target j is no longer available for matching. Therefore, for robustness, the row maxima of the IOU matrix are first sorted in descending order, the rows of the IOU matrix are rearranged accordingly, and matching then proceeds from the first row of the matrix. If all detection targets on the key frame have been assigned identifiers and there remain tracking targets of the normal frame that have not been matched, those targets are removed. With this processing, it can be determined whether targets have left the visible range and whether new targets have entered, and the number of targets within the visible range can be obtained.
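The row-sorted greedy matching of Fig. 8 could be sketched as below; the list-of-dicts data layout, the iou and match_score callables and the identifier counter are placeholders consistent with the description (rows correspond to detections on the new key frame, columns to tracking targets of the preceding normal frame).

```python
import numpy as np

def integrate_key_frame(detections, tracks, iou, match_score, th1, th2, next_id):
    """Greedy integration of new key-frame detections with previous tracks.

    detections and tracks are lists of dicts with a 'box' key ('id' as well
    for tracks).  Detection rows are processed in order of decreasing maximum
    overlap; a matched track column is removed from further matching, and
    unmatched detections receive new identifiers."""
    if tracks:
        M = np.array([[iou(d["box"], t["box"]) for t in tracks] for d in detections])
        order = np.argsort(-M.max(axis=1)) if len(detections) else []
    else:
        M, order = None, range(len(detections))
    free_cols = list(range(len(tracks)))
    for i in order:
        det, best_j, best_iou = detections[i], None, th2
        for j in free_cols:
            if M[i, j] > best_iou:               # overlap must exceed th2
                best_iou, best_j = M[i, j], j
        if best_j is not None and match_score(det, tracks[best_j]) > th1:
            det["id"] = tracks[best_j]["id"]     # inherit the identifier of track j
            free_cols.remove(best_j)             # column j is no longer matched
        else:
            det["id"] = next_id                  # a new target has entered
            next_id += 1
    # tracks still in free_cols have no match: those targets have left the scene
    return detections, next_id
```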

With the method of this embodiment, object detection is performed only on the key frames of the video, and target relocation is performed on the normal frames, so that detection accuracy can be improved and time consumption reduced.

Embodiment 2

This embodiment provides an object detection device. Since the principle by which the device solves the problem is similar to the method of Embodiment 1, its specific implementation may refer to the implementation of the method of Embodiment 1, and content common to both is not repeated.

Fig. 9 is a schematic diagram of the object detection device of this embodiment. As shown in Fig. 9, the device 900 comprises a first detection unit 901, a relocation unit 902 and a second detection unit 903.

The first detection unit 901 is configured to, for the first key frame of a video, detect targets on the first key frame based on a deep neural network, obtain all detection targets on the first key frame, and assign an identifier to each detection target.

The relocation unit 902 is configured to, for each normal frame following a key frame, determine, according to the object detection result of the previous frame, a candidate region on the current frame corresponding to each detection target, relocate each detection target according to the candidate region, and obtain the tracking targets on the current frame.

The second detection unit 903 is configured to, for the other key frames of the video, detect targets on the other key frames based on the deep neural network, obtain all detection targets on the other key frames, and integrate the detection targets on the other key frames according to the relocation result of the previous frame.

Figure 10 is a schematic diagram of one implementation of the relocation unit 902 of this embodiment.

As shown in Figure 10, in this embodiment the relocation unit 902 may include an expansion unit 1001, which expands the bounding box of the detection target to obtain the candidate region of the detection target. The specific expansion method may refer to the description of Fig. 4.

As shown in Figure 10, in this embodiment the relocation unit 902 may further include a traversal unit 1002, a computation unit 1003 and a determination unit 1004. The traversal unit 1002 may traverse the candidate region with a predetermined step size to obtain multiple candidate targets corresponding to the detection target; the computation unit 1003 may calculate the similarity between each candidate target and the detection target; the determination unit 1004 may determine, according to the similarities, the candidate target matching the detection target, and take the matched candidate target as the tracking target of the detection target on the current frame.

In this embodiment, when one candidate target has a similarity with the detection target greater than the first threshold, the determination unit 1004 may take that candidate target as the tracking target and reset the tracking count corresponding to the detection target.

In this embodiment, when several candidate targets have a similarity with the detection target greater than the first threshold, the determination unit 1004 may select the candidate target with the largest similarity from among them as the tracking target and reset the tracking count corresponding to the detection target.

In this embodiment, when the similarity of every candidate target with the detection target is not greater than the first threshold, the determination unit 1004 may further judge whether the tracking count is 0; if the tracking count is not 0, decrement it by 1 and retain the detection target on the current frame; if the tracking count is 0, not retain the detection target on the current frame.

In this embodiment, as shown in Fig. 9, the device 900 may further include a down-sampling unit 904, which may down-sample the target region or candidate target region in the key frames and/or the normal frames of the video for image feature matching, as described above; the details are not repeated here.

Figure 11 is a schematic diagram of one implementation of the second detection unit 903 of this embodiment.

As shown in Figure 11, in this embodiment the second detection unit 903 may include a matching unit 1101 and a processing unit 1102. The matching unit 1101 is configured to match each detection target on the other key frame with each candidate target in the preceding frame. The processing unit 1102 is configured to assign the identifier of the matched candidate target to the detection target when the overlap region of the detection target and the candidate target is greater than the second threshold and the matching score of the detection target and the candidate target is greater than the first threshold, and to assign a new identifier to the detection target when the overlap region of the detection target and the candidate target is not greater than the second threshold or the matching score of the detection target and the candidate target is not greater than the first threshold.

In this embodiment, when all detection targets on the other key frame have been assigned identifiers and there remain candidate targets in the preceding frame that have not been matched, the processing unit 1102 may also not retain the unmatched candidate targets on the other key frame.

With the device of this embodiment, object detection is performed only on the key frames of the video, and target relocation is performed on the normal frames, so that detection accuracy can be improved and time consumption reduced.

Embodiment 3

This embodiment further provides a computer system configured with the object detection device 900 described above.

Figure 12 is a schematic block diagram of the configuration of the computer system 1200 of the embodiment of the present invention. As shown in Figure 12, the computer system 1200 may include a central processing unit 1201 and a memory 1202, the memory 1202 being coupled to the central processing unit 1201. It is worth noting that this figure is exemplary; other types of structures may also be used, in addition to or instead of this structure, to realise telecommunications functions or other functions.

In one implementation, the functions of the object detection device 900 can be integrated into the central processing unit 1201, which can be configured to carry out the object detection method described in Embodiment 1.

For example, the central processing unit 1201 can be configured to perform the following control: for the first key frame of a video, detecting targets on the first key frame based on a deep neural network, obtaining all detection targets on the first key frame, and assigning an identifier to each detection target; for each normal frame following a key frame, determining, according to the object detection result of the previous frame, a candidate region on the current frame corresponding to each detection target, relocating each detection target according to the candidate region, and obtaining the tracking targets on the current frame; for the other key frames of the video, detecting targets on the other key frames based on the deep neural network, obtaining all detection targets on the other key frames, and integrating the detection targets on the other key frames according to the relocation result of the previous frame.

In another implementation, the object detection device 900 can be configured separately from the central processing unit 1201; for example, the object detection device 900 can be configured as a chip connected to the central processing unit 1201, and its functions are realised under the control of the central processing unit 1201.

As shown in Figure 12, the computer system 1200 may further include an input unit 1203, an audio processing unit 1204, a display 1205 and a power supply 1206. It is worth noting that the computer system 1200 does not necessarily include all of the components shown in Figure 12; moreover, the computer system 1200 may also include components not shown in Figure 12, for which reference may be made to the prior art.

As shown in Figure 12, the central processing unit 1201, sometimes also referred to as a controller or operational control, may include a microprocessor or other processor device and/or logic device; it receives input and controls the operation of the components of the computer system 1200.

The memory 1202 may be, for example, one or more of a buffer, a flash memory, a hard drive, a removable medium, a volatile memory, a non-volatile memory, or another suitable device. It may store information such as the above video images and feature-matching data, and may additionally store programs for processing such information. The central processing unit 1201 may execute the programs stored in the memory 1202 to realise information storage, processing, and so on. The functions of the other components are similar to the prior art and are not described again here. The components of the computer system 1200 may be realised by dedicated hardware, firmware, software, or a combination thereof, without departing from the scope of the present invention.

In this embodiment, the computer system may be a video surveillance system, but is not limited thereto.

With the computer system of this embodiment, object detection is performed only on the key frames of the video, and target relocation is performed on the normal frames, so that detection accuracy can be improved and time consumption reduced.

An embodiment of the present invention further provides a computer-readable program, wherein, when the program is executed in an object detection device or a computer system, it causes the object detection device or computer system to carry out the object detection method described in Embodiment 1.

An embodiment of the present invention further provides a storage medium storing a computer-readable program, wherein the computer-readable program causes an object detection device or computer system to carry out the object detection method described in Embodiment 1.

The above devices and methods of the present invention can be implemented by hardware, or by hardware combined with software. The present invention relates to a computer-readable program which, when executed by a logic component, enables that logic component to realise the devices or constituent components described above, or to carry out the methods or steps described above. The present invention also relates to storage media for storing the above program, such as hard disks, magnetic disks, optical disks, DVDs and flash memories.

The object detection method described in connection with the object detection device of the embodiments of the present invention may be embodied directly in hardware, in a software module executed by a processor, or in a combination of the two. For example, one or more of the functional blocks shown in Figs. 9-11, or one or more combinations thereof, may correspond either to software modules of a computer program flow or to hardware modules. These software modules may correspond respectively to the steps shown in Figs. 1, 5 and 7. The hardware modules may be realised, for example, by implementing the software modules in a field programmable gate array (FPGA).

A software module may reside in RAM memory, flash memory, ROM memory, EPROM memory, EEPROM memory, registers, a hard disk, a removable disk, a CD-ROM, or any other form of storage medium known in the art. A storage medium may be coupled to a processor so that the processor can read information from, and write information to, the storage medium; or the storage medium may be an integral part of the processor. The processor and the storage medium may reside in an ASIC. The software module may be stored in a memory of a mobile terminal, or in a memory card insertable into a mobile terminal. For example, if a device (such as a mobile terminal) uses a MEGA-SIM card of larger capacity or a large-capacity flash memory device, the software module may be stored in that MEGA-SIM card or large-capacity flash memory device.

One or more of the functional blocks described with respect to Figs. 9-11, or one or more combinations thereof, may be implemented as a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field programmable gate array (FPGA) or other programmable logic device, discrete gate or transistor logic, discrete hardware components, or any suitable combination thereof for performing the functions described herein. They may also be implemented as a combination of computing devices, for example a combination of a DSP and a microprocessor, multiple microprocessors, one or more microprocessors in communication with a DSP, or any other such configuration.

The present invention has been described above in connection with specific embodiments, but it should be understood by those skilled in the art that these descriptions are exemplary and do not limit the scope of the present invention. Those skilled in the art can make various variations and modifications to the present invention in accordance with its principles, and such variations and modifications also fall within the scope of the present invention.

Claims (19)

  1. An object detection method, wherein the method comprises:
    for the first key frame of a video, detecting targets on the first key frame based on a deep neural network, obtaining all detection targets on the first key frame, and assigning an identifier to each detection target;
    for each normal frame following a key frame, determining, according to the object detection result of the previous frame, a candidate region on the current frame corresponding to each detection target, relocating each detection target according to the candidate region, and obtaining the tracking targets on the current frame;
    for the other key frames of the video, detecting targets on the other key frames based on the deep neural network, obtaining all detection targets on the other key frames, and integrating the detection targets on the other key frames according to the relocation result of the previous frame.
  2. The method according to claim 1, wherein determining the candidate region on the current frame corresponding to each detection target comprises:
    expanding the bounding box of the detection target to obtain the candidate region of the detection target.
  3. The method according to claim 1, wherein relocating each detection target according to the candidate region comprises:
    traversing the candidate region with a predetermined step size to obtain multiple candidate targets corresponding to the detection target;
    calculating the similarity between each candidate target and the detection target;
    determining, according to the similarities, the candidate target matching the detection target, and taking the matched candidate target as the tracking target of the detection target on the current frame.
  4. The method according to claim 3, wherein determining, according to the similarities, the candidate target matching the detection target comprises:
    if one candidate target has a similarity with the detection target greater than a first threshold, taking that candidate target as the tracking target, and resetting the tracking count corresponding to the detection target.
  5. The method according to claim 3, wherein determining, according to the similarities, the candidate target matching the detection target comprises:
    if multiple candidate targets have a similarity with the detection target greater than a first threshold, selecting the candidate target with the largest similarity from the multiple candidate targets as the tracking target, and resetting the tracking count corresponding to the detection target.
  6. The method according to claim 3, wherein determining, according to the similarities, the candidate target matching the detection target comprises:
    if the similarity of every candidate target with the detection target is not greater than a first threshold, judging whether the tracking count is 0;
    if the tracking count is not 0, decrementing the tracking count by 1 and retaining the detection target on the current frame;
    if the tracking count is 0, not retaining the detection target on the current frame.
  7. The method according to claim 1, wherein the method further comprises:
    down-sampling the target region or candidate target region on the key frame and/or the normal frame.
  8. The method according to claim 1, wherein integrating the detection targets on the other key frames comprises:
    matching each detection target on the other key frame with each candidate target in the preceding frame;
    if the overlap region of the detection target and the candidate target is greater than a second threshold, and the matching score of the detection target and the candidate target is greater than a first threshold, assigning the identifier of the matched candidate target to the detection target;
    if the overlap region of the detection target and the candidate target is not greater than the second threshold, or the matching score of the detection target and the candidate target is not greater than the first threshold, assigning a new identifier to the detection target.
  9. The method according to claim 8, wherein,
    if all detection targets on the other key frame have been assigned identifiers and there remain candidate targets in the preceding frame that have not been matched, the unmatched candidate targets are not retained on the other key frame.
  10. An object detection device, wherein the device comprises:
    a first detection unit configured to, for the first key frame of a video, detect targets on the first key frame based on a deep neural network, obtain all detection targets on the first key frame, and assign an identifier to each detection target;
    a relocation unit configured to, for each normal frame following a key frame, determine, according to the object detection result of the previous frame, a candidate region on the current frame corresponding to each detection target, relocate each detection target according to the candidate region, and obtain the tracking targets on the current frame;
    a second detection unit configured to, for the other key frames of the video, detect targets on the other key frames based on the deep neural network, obtain all detection targets on the other key frames, and integrate the detection targets on the other key frames according to the relocation result of the previous frame.
  11. The device according to claim 10, wherein the relocation unit comprises:
    an expansion unit configured to expand the bounding box of the detection target to obtain the candidate region of the detection target.
  12. The device according to claim 10, wherein the relocation unit comprises:
    a traversal unit configured to traverse the candidate region with a predetermined step size to obtain multiple candidate targets corresponding to the detection target;
    a computation unit configured to calculate the similarity between each candidate target and the detection target;
    a determination unit configured to determine, according to the similarities, the candidate target matching the detection target, and take the matched candidate target as the tracking target of the detection target on the current frame.
  13. The device according to claim 12, wherein
    the determination unit, when one candidate target has a similarity with the detection target greater than a first threshold, takes that candidate target as the tracking target and resets the tracking count corresponding to the detection target.
  14. The device according to claim 12, wherein
    the determination unit, when multiple candidate targets have a similarity with the detection target greater than a first threshold, selects the candidate target with the largest similarity from the multiple candidate targets as the tracking target and resets the tracking count corresponding to the detection target.
  15. The device according to claim 12, wherein
    the determination unit, when the similarity of every candidate target with the detection target is not greater than a first threshold, judges whether the tracking count is 0; if the tracking count is not 0, decrements the tracking count by 1 and retains the detection target on the current frame; and if the tracking count is 0, does not retain the detection target on the current frame.
  16. The device according to claim 10, wherein the device further comprises:
    a down-sampling unit configured to down-sample the target region or candidate target region on the key frame and/or the normal frame.
  17. The device according to claim 10, wherein the second detection unit comprises:
    a matching unit configured to match each detection target on the other key frame with each candidate target in the preceding frame;
    a processing unit configured to assign the identifier of the matched candidate target to the detection target when the overlap region of the detection target and the candidate target is greater than a second threshold and the matching score of the detection target and the candidate target is greater than a first threshold, and to assign a new identifier to the detection target when the overlap region of the detection target and the candidate target is not greater than the second threshold or the matching score of the detection target and the candidate target is not greater than the first threshold.
  18. The device according to claim 17, wherein
    the processing unit, when all detection targets on the other key frame have been assigned identifiers and there remain candidate targets in the preceding frame that have not been matched, does not retain the unmatched candidate targets on the other key frame.
  19. A computer system, wherein the computer system comprises the device according to any one of claims 10-18.
CN201680087590.3A 2016-09-30 2016-09-30 Object detection method, device and computer system CN109416728A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
PCT/CN2016/101237 WO2018058595A1 (en) 2016-09-30 2016-09-30 Target detection method and device, and computer system

Publications (1)

Publication Number Publication Date
CN109416728A true CN109416728A (en) 2019-03-01

Family

ID=61763242

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201680087590.3A CN109416728A (en) 2016-09-30 2016-09-30 Object detection method, device and computer system

Country Status (2)

Country Link
CN (1) CN109416728A (en)
WO (1) WO2018058595A1 (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109544603A (en) * 2018-11-28 2019-03-29 上饶师范学院 Method for tracking target based on depth migration study

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8027513B2 (en) * 2007-03-23 2011-09-27 Technion Research And Development Foundation Ltd. Bitmap tracker for visual tracking under very general conditions
US20120207356A1 (en) * 2011-02-10 2012-08-16 Murphy William A Targeted content acquisition using image analysis
CN104166861B (en) * 2014-08-11 2017-09-29 成都六活科技有限责任公司 A kind of pedestrian detection method
CN105447458B (en) * 2015-11-17 2018-02-27 深圳市商汤科技有限公司 A kind of large-scale crowd video analytic system and method

Also Published As

Publication number Publication date
WO2018058595A1 (en) 2018-04-05

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination