CN108256506A - Method and device for object detection in video, and computer storage medium - Google Patents
- Publication number: CN108256506A
- Application number: CN201810151829.XA
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption made by Google and is not a legal conclusion)
Classifications
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
- G06V20/46—Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
Abstract
The invention discloses a method for object detection in video. The method includes: determining several key frames based on a target video, and performing object detection on each key frame to obtain a detection result for each key frame; determining, according to the detection results of the key frames, a detection result for the intermediate frame between every two adjacent key frames; correcting the detection result of each intermediate frame to obtain a corrected detection result for each intermediate frame; and determining a detection result for the target video based on the detection results of the key frames and the corrected detection results of the intermediate frames. The invention further discloses a corresponding object detection device and a computer storage medium.
Description
Technical field
The present invention relates to object detection in the field of computer vision, and in particular to a method and device for object detection in video, and a computer storage medium.
Background technology
Object detection in video is a major problem in computer vision and a basic technology of intelligent video analysis. It has important applications in many areas, such as security monitoring, autonomous driving, and advanced video retrieval.

Object detection in video builds on still-image object detection, but the introduction of temporal information makes the problem more complex to model. Existing video object detection methods do not yet meet practical requirements in the trade-off between detection speed and accuracy: running an object detector on every frame of a video consumes a large amount of time and is inefficient, while detecting only sparsely causes a substantial drop in detection performance.
Summary of the invention
In view of this, the present invention aims to provide a method and device for object detection in video, and a computer storage medium, capable of real-time object detection in video while maintaining high accuracy.

To achieve the above objectives, the technical solution of the invention is realized as follows:
In a first aspect, an embodiment of the present invention provides a method for object detection in video. The method includes:

determining several key frames based on a target video, and performing object detection on each key frame to obtain a detection result for each key frame;

determining, according to the detection results of the key frames, a detection result for the intermediate frame between every two adjacent key frames;

correcting the detection result of each intermediate frame to obtain a corrected detection result for each intermediate frame;

determining a detection result for the target video based on the detection results of the key frames and the corrected detection results of the intermediate frames.
In the above scheme, optionally, after determining the detection result of the target video, the method further includes:

linking the detection boxes of the same class in every two adjacent frames according to their degree of spatial overlap, obtaining object chains, where an object chain spans multiple frames and consists of detection boxes of the same class;

reclassifying the detection boxes on each object chain, and obtaining a classification confidence for each detection box.
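The optional linking step above (connecting same-class detection boxes across adjacent frames by spatial overlap into object chains) can be sketched as follows. The IoU overlap measure, the 0.5 threshold, and the greedy matching strategy are illustrative assumptions, not specified in the text:

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union > 0 else 0.0

def link_chains(frames, iou_thr=0.5):
    """frames: per-frame detection lists [(box, cls), ...] -> list of chains.
    Each chain is a list of (frame_index, box, cls) entries."""
    chains = []          # all chains ever started
    open_chains = []     # chains that matched in the previous frame
    for t, dets in enumerate(frames):
        next_open = []
        unmatched = list(dets)
        for chain in open_chains:
            _, last_box, last_cls = chain[-1]
            # greedily pick the best same-class overlap in the current frame
            best = max(
                (d for d in unmatched if d[1] == last_cls),
                key=lambda d: iou(last_box, d[0]),
                default=None,
            )
            if best is not None and iou(last_box, best[0]) >= iou_thr:
                chain.append((t, best[0], best[1]))
                unmatched.remove(best)
                next_open.append(chain)
        for box, cls in unmatched:
            chain = [(t, box, cls)]   # an unmatched detection starts a chain
            chains.append(chain)
            next_open.append(chain)
        open_chains = next_open
    return chains
```

A chain survives as long as some same-class box in the next frame overlaps its last box sufficiently; an unmatched detection starts a new chain.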
In the above scheme, optionally, determining several key frames based on the target video and performing object detection on each key frame to obtain the detection result of each key frame includes:

selecting multiple initial key frames at a preset time interval, and performing object detection on each initial key frame to obtain the spatial position and classification confidence of the detection boxes in each initial key frame;

for the detection boxes in every two adjacent initial key frames, matching them based on spatial position and classification confidence; in response to the matching degree of spatial position and classification confidence being below a preset threshold, selecting a secondary key frame from the frames between the two adjacent initial key frames, and performing object detection on each secondary key frame to obtain the spatial position and classification confidence of the detection boxes in each secondary key frame;

where the key frames determined for the target video include only the initial key frames, or include both the initial key frames and the secondary key frames.
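The two-stage key-frame selection above can be sketched as follows. This is a minimal sketch in which `detect` stands for the still-image detector and `match_degree` for the spatial-position/confidence matching score; both are placeholder assumptions, and the midpoint rule follows the worked example given later in the description:

```python
def select_key_frames(num_frames, interval, detect, match_degree, thr):
    """Return sorted key-frame indices plus their detection results:
    initial key frames every `interval` frames, and a midpoint secondary
    key frame wherever adjacent initial results match poorly."""
    initial = list(range(0, num_frames, interval))
    results = {i: detect(i) for i in initial}       # detector on initial keys
    keys = list(initial)
    for a, b in zip(initial, initial[1:]):
        if match_degree(results[a], results[b]) < thr:
            mid = (a + b) // 2                      # floor of the average
            keys.append(mid)
            results[mid] = detect(mid)              # detector on secondary key
    return sorted(keys), results
```

With 121 frames (0-indexed here) and an interval of 24, a poor match between frames 0 and 24 inserts frame 12 as a secondary key frame, mirroring the 1-indexed example in the description.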
In the above scheme, optionally, determining the detection result of the intermediate frame between every two adjacent key frames according to the detection results of the key frames includes:

for every two adjacent key frames, taking the left frame, the intermediate frame, and the frames between the left frame and the intermediate frame, computing a first motion history image (MHI, Motion History Image); extracting features from the first motion history image with a first neural network; predicting a first offset of the detection boxes from the left frame to the intermediate frame; and adding the first offset to the detection boxes of the left frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of each propagated box being the same as that of the corresponding box in the left frame;

for every two adjacent key frames, taking the right frame, the intermediate frame, and the frames between the right frame and the intermediate frame, computing a second motion history image; extracting features from the second motion history image with the first neural network; predicting a second offset of the detection boxes from the right frame to the intermediate frame; and adding the second offset to the detection boxes of the right frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of each propagated box being the same as that of the corresponding box in the right frame;

merging the results propagated from the left frame to the intermediate frame and the results propagated from the right frame to the intermediate frame as the detection result of the intermediate frame.
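The propagation step relies on a motion history image computed over the frames between a key frame and the intermediate frame. Below is a minimal sketch of an MHI (frame differencing with linear decay, a common formulation; the change threshold `delta` is an assumption, and the patent's network-predicted offsets are represented by a simple box shift):

```python
import numpy as np

def motion_history_image(frames, tau=None, delta=30):
    """Compute a simple MHI from a list of grayscale frames (H, W uint8).
    Pixels that changed recently get high values that decay over time."""
    tau = tau if tau is not None else len(frames)
    mhi = np.zeros(frames[0].shape, dtype=np.float32)
    for t in range(1, len(frames)):
        moving = np.abs(frames[t].astype(np.int16)
                        - frames[t - 1].astype(np.int16)) > delta
        mhi = np.where(moving, float(tau), np.maximum(mhi - 1.0, 0.0))
    return mhi / tau  # normalised to [0, 1]

def propagate_box(box, offset):
    """Shift a key-frame box (x1, y1, x2, y2) by a predicted (dx, dy)
    offset; the propagated box keeps the key frame's class confidence."""
    dx, dy = offset
    return (box[0] + dx, box[1] + dy, box[2] + dx, box[3] + dy)
```

In the method itself, the offset is not a hand-set shift but the output of the first neural network applied to MHI features.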
In the above scheme, optionally, correcting the detection result of each intermediate frame to obtain the corrected detection result of each intermediate frame includes:

rescaling the image of the intermediate frame and its detection result to a target scale, the target scale being larger than the current scale;

extracting features from the image with a second neural network, predicting an offset from each input box to its corresponding position in the image, and adding the offset to the input box to obtain the corrected spatial position at the target scale;

where the input boxes are the detection boxes of the intermediate frame.
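The correction step above can be sketched as follows. The `refine_net` callable is a placeholder assumption for the second neural network, and the per-corner offset format is illustrative:

```python
def correct_boxes(image, boxes, current_scale, target_scale, refine_net):
    """Rescale boxes to a larger target scale and refine each with a
    network-predicted offset, returning corrected boxes at that scale."""
    assert target_scale > current_scale, "correction runs at a larger scale"
    s = target_scale / current_scale
    scaled_boxes = [tuple(c * s for c in b) for b in boxes]
    corrected = []
    for box in scaled_boxes:
        # refine_net sees the (rescaled) image and box, predicts an offset
        dx1, dy1, dx2, dy2 = refine_net(image, box)
        corrected.append((box[0] + dx1, box[1] + dy1,
                          box[2] + dx2, box[3] + dy2))
    return corrected
```

Only spatial positions are corrected; classification confidences are left untouched, as the description states later.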
In the above scheme, optionally, determining the detection result of the target video based on the detection results of the key frames and the corrected detection results of the intermediate frames includes:

determining, based on the detection results of the key frames and the corrected detection results of the intermediate frames, the detection results of all other frames of the target video (those that are neither key frames nor intermediate frames) using a linear interpolation algorithm.
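The linear interpolation over the remaining frames can be sketched as follows for a pair of matched boxes at two already-solved frames (the matching of boxes across the two frames is assumed to be given):

```python
def interpolate_box(box_a, box_b, t_a, t_b, t):
    """Linearly interpolate a matched box pair between frames t_a and t_b
    to get the box at an in-between frame t."""
    w = (t - t_a) / (t_b - t_a)
    return tuple((1 - w) * a + w * b for a, b in zip(box_a, box_b))
```

This is far cheaper than running a detector or even the propagation network on these frames, which is the point of the step.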
In the above scheme, optionally, reclassifying the detection boxes on each object chain and obtaining the classification confidence of each detection box includes:

selecting several detection boxes on each object chain at equal intervals, cropping out the image patches corresponding to those detection boxes, scaling the patches to the same size, and extracting features from and classifying each same-size patch with a third neural network, obtaining the classification confidence of each detection box on each object chain.
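The reclassification input can be prepared as sketched below: boxes sampled at equal intervals along a chain, cropped, and resized to a common size. Nearest-neighbour resizing is used purely for illustration; the third neural network itself is out of scope here:

```python
import numpy as np

def sample_and_crop(chain, images, num_samples, out_size):
    """Pick boxes at equal intervals along an object chain and crop
    patches resized to out_size x out_size (nearest-neighbour)."""
    step = max(1, len(chain) // num_samples)
    patches = []
    for t, box, _cls in chain[::step][:num_samples]:
        x1, y1, x2, y2 = (int(round(c)) for c in box)
        patch = images[t][y1:y2, x1:x2]
        # nearest-neighbour resample via index grids
        ys = np.linspace(0, patch.shape[0] - 1, out_size).astype(int)
        xs = np.linspace(0, patch.shape[1] - 1, out_size).astype(int)
        patches.append(patch[np.ix_(ys, xs)])
    return patches
```

The same-size patches would then be batched through the third network to produce one confidence per sampled box.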
In a second aspect, an embodiment of the present invention provides a device for object detection in video. The device includes:

a first determining module, for determining several key frames based on a target video;

a key frame detection module, for performing object detection on each key frame to obtain the detection result of each key frame;

a second determining module, for determining, according to the detection results of the key frames, the detection result of the intermediate frame between every two adjacent key frames;

a correcting module, for correcting the detection result of each intermediate frame to obtain the corrected detection result of each intermediate frame;

a third determining module, for determining the detection result of the target video based on the detection results of the key frames and the corrected detection results of the intermediate frames.
In the above scheme, optionally, the device further includes:

a reclassification module, for, after the third determining module determines the detection result of the target video, linking the detection boxes of the same class in every two adjacent frames according to their degree of spatial overlap to obtain object chains, where an object chain spans multiple frames and consists of detection boxes of the same class; and reclassifying the detection boxes on each object chain to obtain the classification confidence of each detection box.
In the above scheme, optionally, the first determining module is further configured to:

select multiple initial key frames at a preset time interval, and perform object detection on each initial key frame to obtain the spatial position and classification confidence of the detection boxes in each initial key frame;

for the detection boxes in every two adjacent initial key frames, match them based on spatial position and classification confidence;

in response to the matching degree of spatial position and classification confidence being below a preset threshold, select a secondary key frame from the frames between the two adjacent initial key frames, and perform object detection on each secondary key frame to obtain the spatial position and classification confidence of the detection boxes in each secondary key frame;

where the key frames determined for the target video include only the initial key frames, or include both the initial key frames and the secondary key frames.
In the above scheme, optionally, the second determining module is configured to:

for every two adjacent key frames, take the left frame, the intermediate frame, and the frames between the left frame and the intermediate frame, and compute a first motion history image (MHI); extract features from the first motion history image with a first neural network; predict a first offset of the detection boxes from the left frame to the intermediate frame; and add the first offset to the detection boxes of the left frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of each propagated box being the same as that of the corresponding box in the left frame;

for every two adjacent key frames, take the right frame, the intermediate frame, and the frames between the right frame and the intermediate frame, and compute a second motion history image; extract features from the second motion history image with the first neural network; predict a second offset of the detection boxes from the right frame to the intermediate frame; and add the second offset to the detection boxes of the right frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of each propagated box being the same as that of the corresponding box in the right frame;

merge the results propagated from the left frame to the intermediate frame and the results propagated from the right frame to the intermediate frame as the detection result of the intermediate frame.
In the above scheme, optionally, the correcting module is further configured to:

rescale the image of the intermediate frame and its detection result to a target scale, the target scale being larger than the current scale;

extract features from the image with a second neural network, predict an offset from each input box to its corresponding position in the image, and add the offset to the input box to obtain the corrected spatial position at the target scale;

where the input boxes are the detection boxes of the intermediate frame.
In a third aspect, an embodiment of the present invention provides a computer storage medium storing a computer program, the computer program being used to perform the above-described method for object detection in video.
With the method, device, and computer storage medium for object detection in video proposed by the embodiments of the present invention, several key frames are determined based on a target video and object detection is performed on each key frame to obtain its detection result; the detection result of the intermediate frame between every two adjacent key frames is determined according to the key-frame results; the intermediate-frame results are corrected to obtain corrected detection results; and the detection result of the target video is determined based on the key-frame results and the corrected intermediate-frame results. In this way, only the key frames are processed by a detector, while the intermediate frames and the other frames (those that are neither key frames nor intermediate frames) are not, which both guarantees the accuracy of the key-frame detection results and saves computation cost and time. Predicting the intermediate-frame results from the key-frame results and then correcting them improves the accuracy of the predicted intermediate-frame results, and estimating the results of the remaining frames from the key-frame and intermediate-frame results further saves computation cost and time. The technical solution of the present invention thus achieves a good balance between computation cost and detection performance, and enables real-time object detection in video while maintaining high accuracy.
Description of the drawings
Fig. 1 is a flow diagram of a method for object detection in video provided by an embodiment of the present invention;
Fig. 2 is a schematic diagram of time-scale grid analysis provided by an embodiment of the present invention;
Fig. 3 is an example of the detection framework provided by an embodiment of the present invention;
Fig. 4 is a schematic diagram of the structure of a device for object detection in video provided by an embodiment of the present invention.
Specific embodiment
The technical solution of the present invention is further elaborated below with reference to the drawings and specific embodiments.
An embodiment of the present invention provides a method for object detection in video. As shown in Fig. 1, the method mainly includes:

Step 101: determine several key frames based on a target video, and perform object detection on each key frame to obtain the detection result of each key frame.
Here, the target video may be a real-time video or a historical video. The target video is collected by an image acquisition device, such as a camera or video camera.
The key frames determined for the target video include only initial key frames, or include both initial key frames and secondary key frames.
As an optional embodiment, determining several key frames based on the target video includes:

selecting multiple initial key frames at a preset time interval, and performing object detection on each initial key frame to obtain the spatial position and classification confidence of the detection boxes in each initial key frame;

in response to the matching degree of spatial position and classification confidence being below a preset threshold, selecting a secondary key frame from the frames between the two adjacent initial key frames, and performing object detection on each secondary key frame to obtain the spatial position and classification confidence of the detection boxes in each secondary key frame;

where the key frames determined for the target video include only the initial key frames, or include both the initial key frames and the secondary key frames.
That is, if the matching degree between the detection boxes of two adjacent initial key frames is below the preset threshold, a secondary key frame is selected from the frames between those two initial key frames; if the matching degree is greater than or equal to the preset threshold, no secondary key frame is selected between them.
Here, the value of the preset time interval can be set or adjusted according to the required detection accuracy and/or detection speed.

Here, the detection result obtained by performing object detection on each key frame includes the spatial position and classification confidence of the detection boxes.

In general, object detection on key frames is performed with a still-image object detector.

Here, after a secondary key frame is determined, whether it and its adjacent initial key frames also need a matching-degree check can likewise be set or adjusted according to the required detection accuracy and/or detection speed.
For example, suppose a target video has 121 frames in total. If a series of initial key frames is chosen at an interval of 24 frames, the initial key frames are the 1st, 25th, 49th, 73rd, 97th, and 121st frames. A still-image object detector performs object detection on these initial key frames. The matching degrees between the detection results of the 1st and 25th frames, the 25th and 49th frames, the 49th and 73rd frames, the 73rd and 97th frames, and the 97th and 121st frames are then computed. Suppose only the matching degree between the results of the 1st and 25th frames and the matching degree between the results of the 73rd and 97th frames are below the threshold. Then the 13th frame is determined as a secondary key frame between the 1st and 25th frames, the 85th frame is determined as a secondary key frame between the 73rd and 97th frames, and the still-image object detector performs object detection on this series of secondary key frames.
In practical applications, if the average of the frame numbers of two adjacent initial key frames is not an integer, the frame whose number is the integer adjacent to and below the average is determined as the secondary key frame. For example, if two initial key frames are the 1st and 24th frames, the average of 1 and 24 is 12.5; since 12 < 12.5 < 13, the 12th frame is determined as the secondary key frame between the 1st and 24th frames.
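The floor-of-average rule in this example amounts to integer division:

```python
def secondary_key_frame(left, right):
    """Frame number of the secondary key frame between two initial key
    frames: the integer adjacent to and below the (possibly fractional)
    average of the two frame numbers."""
    return (left + right) // 2
```

This reproduces the examples in the text: frames 1 and 24 give frame 12, and frames 73 and 97 give frame 85.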
In a specific embodiment, determining several key frames based on the target video and performing object detection on each key frame to obtain the detection result of each key frame includes:

selecting multiple initial key frames at a preset time interval, and performing object detection on the initial key frames with a still-image object detector to obtain the spatial position and classification confidence of the detection boxes of each initial key frame;

for the detection boxes in every two adjacent initial key frames, matching them based on spatial position and classification confidence; in response to the matching degree being below a preset threshold, selecting a secondary key frame from the frames between the two adjacent initial key frames, and performing object detection on each secondary key frame to obtain the spatial position and classification confidence of its detection boxes.

That is, the initial key frames and the secondary key frames together serve as the key frames of the entire video, and the object detector is applied on the key frames to obtain their detection results.
Step 102: determine, according to the detection results of the key frames, the detection result of the intermediate frame between every two adjacent key frames.
As an optional embodiment, determining the detection result of the intermediate frame between every two adjacent key frames according to the detection results of the key frames includes:

for every two adjacent key frames, taking the left frame, the intermediate frame, and the frames between the left frame and the intermediate frame, computing a first motion history image (MHI, Motion History Image); extracting features from the first motion history image with a first neural network; predicting a first offset of the detection boxes from the left frame to the intermediate frame; and adding the first offset to the detection boxes of the left frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of each propagated box being the same as that of the corresponding box in the left frame;

for every two adjacent key frames, taking the right frame, the intermediate frame, and the frames between the right frame and the intermediate frame, computing a second motion history image; extracting features from the second motion history image with the first neural network; predicting a second offset of the detection boxes from the right frame to the intermediate frame; and adding the second offset to the detection boxes of the right frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of each propagated box being the same as that of the corresponding box in the right frame;

merging the results propagated from the left frame to the intermediate frame and the results propagated from the right frame to the intermediate frame as the detection result of the intermediate frame.
The first neural network is a neural network trained specifically on a first training set. Given the left and right key frames with their detection results and the image of the intermediate frame as input, the first neural network can output the detection result of the intermediate frame. Here, the left frame and the right frame are taken from the two adjacent key frames respectively, and the intermediate frame is a frame between the two adjacent key frames.
In this way, when obtaining the detection result of an intermediate frame, no still-image detector needs to be run on the intermediate frame; its detection result can be predicted solely from the key-frame detection results obtained in step 101.
Continuing the example of the 121-frame target video, suppose the determined key frames include the initial key frames (the 1st, 25th, 49th, 73rd, 97th, and 121st frames) and the secondary key frames (the 13th and 85th frames). The pairs of adjacent key frames are then (1, 13), (13, 25), (25, 49), (49, 73), (73, 85), (85, 97), and (97, 121), and the intermediate frames between adjacent key frames are the 7th, 19th, 37th, 61st, 79th, 91st, and 109th frames. Taking adjacent key frames the 1st and 7th frames as a simpler illustration, the intermediate frame between them is the 4th frame. The result propagated from the left frame (the 1st frame) to the intermediate 4th frame is recorded as the first detection result of the 4th frame, and the result propagated from the right frame (the 7th frame) to the 4th frame as the second detection result of the 4th frame; therefore, the predicted detection result of the intermediate 4th frame includes both the first detection result and the second detection result.

Both the first and the second detection result include the spatial position and classification confidence of the detection boxes.
According to the left frame (the 1st frame), the intermediate frame (the 4th frame), and the frames between them (the 2nd and 3rd frames), the MHI is computed; features are extracted from the MHI with the first neural network; the offset of the detection boxes from the 1st frame to the 4th frame is predicted; and the offset is added to the detection boxes of the 1st frame to obtain the spatial positions of the boxes propagated to the 4th frame. The classification confidence of each box of the 4th frame is the same as that of the corresponding box of the 1st frame.

Similarly, according to the right frame (the 7th frame), the intermediate frame (the 4th frame), and the frames between them (the 5th and 6th frames), the MHI is computed; features are extracted from the MHI with the first neural network; the offset of the detection boxes from the 7th frame to the 4th frame is predicted; and the offset is added to the detection boxes of the 7th frame to obtain the spatial positions of the boxes propagated to the 4th frame. The classification confidence of each box of the 4th frame is the same as that of the corresponding box of the 7th frame.
Take the 13th frame, the intermediate frame between adjacent key frames the 1st and 25th frames, as an example. Suppose the result propagated from the 1st frame to the 13th frame is detection boxes A with classification confidences A', and the result propagated from the 25th frame to the 13th frame is detection boxes B with classification confidences B'. If boxes A comprise three boxes a1, a2, a3 and confidences A' comprise a1', a2', a3' corresponding to a1, a2, a3 respectively, while boxes B comprise two boxes b1, b2 and confidences B' comprise b1', b2' corresponding to b1, b2 respectively, then the detection result of the intermediate 13th frame comprises the five boxes a1, a2, a3, b1, b2 with corresponding classification confidences a1', a2', a3', b1', b2'.
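The merging in this worked example is a plain union of the two propagated sets, each box keeping its own confidence:

```python
def merge_propagated(left_dets, right_dets):
    """Merge forward- and backward-propagated detections for an
    intermediate frame by taking their union; each (box, confidence)
    pair is kept as-is."""
    return list(left_dets) + list(right_dets)
```

Three boxes from the left frame and two from the right frame thus yield five detections on the intermediate frame, matching the example above.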
Step 103: correct the detection result of each intermediate frame to obtain the corrected detection result of each intermediate frame.
In this way, by correcting the detection results of the intermediate frames, the predicted (rather than detector-produced) intermediate-frame results become more accurate, while computation cost is still saved.
As an optional embodiment, correcting the detection result of each intermediate frame to obtain the corrected detection result of each intermediate frame includes:

rescaling the image of the intermediate frame and its detection result to a target scale, the target scale being larger than the current scale;

extracting features from the image with a second neural network, predicting an offset from each input box to its corresponding position in the image, and adding the offset to the input box to obtain the corrected spatial position at the target scale;

where the input boxes are the detection boxes of the intermediate frame.
Specifically, if the detection result of the intermediate frame is obtained with the first neural network at a first scale, the correction of the intermediate-frame result with the second neural network is performed at a second scale, where the first scale is smaller than the second scale. Here, a scale can be understood as the resolution of the image. That is, along the scale dimension, the spatial positions of the boxes are corrected step by step from low resolution to high resolution.
Still taking adjacent key frames the 1st and 7th frames, when the second neural network corrects the detection result of the intermediate 4th frame, the input of the second neural network is the image and the detection result of the 4th frame, and its output is the corrected detection result of the 4th frame.

Here, the detection result of the 4th frame includes the spatial position and classification confidence of the detection boxes; when correcting it with the second neural network, only the spatial positions of the boxes need to be corrected.
The second neural network is a neural network specially trained with a second training set. By feeding the second neural network the detection result and the image of an intermediate frame, the second neural network can output the revised detection result of that intermediate frame.
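Purely as an illustrative sketch of the scale-correction step described above (not part of the claimed embodiments): assuming the second network has already predicted per-box offsets, the correction amounts to rescaling each box to the target scale and adding the offset. The helper name `refine_boxes` and the dummy offsets are hypothetical.

```python
import numpy as np

def refine_boxes(boxes, offsets, scale_factor):
    """Rescale detection boxes from the current scale to the target scale,
    then add the per-box offsets predicted by the (hypothetical) second
    network. Boxes are (N, 4) arrays of [x1, y1, x2, y2]."""
    upscaled = boxes * scale_factor   # move coordinates to the higher-resolution scale
    return upscaled + offsets         # corrected spatial positions at the target scale

# One box detected at the low scale, with a dummy predicted correction.
boxes = np.array([[10.0, 20.0, 50.0, 80.0]])
offsets = np.array([[1.0, -2.0, 3.0, 0.0]])   # stand-in for network output
refined = refine_boxes(boxes, offsets, scale_factor=2.0)
print(refined)  # [[ 21.  38. 103. 160.]]
```

Only the spatial positions change here; the classification confidences of the boxes are carried over unchanged, as the description above specifies.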
After steps 102 and 103 have been carried out for several levels, a detection result can be obtained for every frame. How many levels are performed can be set or adjusted according to the requirements on detection accuracy and/or detection speed.
Since no detector is used when determining the detection results of the intermediate frames, this is faster than running the detector directly on every frame, and time cost can be saved.
Suppose no secondary key frame is inserted between the adjacent key frames, the 1st frame and the 25th frame. Then the 13th frame between the 1st and 25th frames is taken as the first-level intermediate frame; the 7th frame between the 1st and 13th frames, and the 19th frame between the 13th and 25th frames, are taken as second-level intermediate frames. Considering the balance between time and accuracy, after the detection results of the 13th, 7th, and 19th intermediate frames are obtained, the above method is not applied again to the remaining frames from the 1st to the 25th (i.e., the frames other than the 1st, 7th, 13th, 19th, and 25th frames); instead, a linear interpolation algorithm is used to quickly obtain the detection results of those other frames.
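As a minimal sketch of the linear interpolation mentioned above (the function name and box values are illustrative, not from the source): given a matched box at two anchor frames, the box at an in-between frame is estimated by weighting the two anchors by temporal distance.

```python
import numpy as np

def interpolate_boxes(box_a, box_b, t_a, t_b, t):
    """Linearly interpolate a matched pair of [x1, y1, x2, y2] boxes
    between two anchor frames t_a and t_b to estimate the box at an
    in-between frame t."""
    w = (t - t_a) / (t_b - t_a)   # 0 at the left anchor, 1 at the right anchor
    return (1.0 - w) * np.asarray(box_a) + w * np.asarray(box_b)

# Box known at frame 1 and frame 7; estimate the box at frame 4.
box_1 = [10.0, 10.0, 50.0, 50.0]
box_7 = [22.0, 16.0, 62.0, 56.0]
box_4 = interpolate_boxes(box_1, box_7, t_a=1, t_b=7, t=4)
print(box_4)  # [16. 13. 56. 53.]
```

The same weighting applies to every frame between the two anchors, which is why interpolation is so much cheaper than running the propagation network on each frame.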
Step 104: determining the detection result of the target video based on the detection results of the key frames and the revised detection results of the intermediate frames.
As an alternative embodiment, determining the detection result of the target video based on the detection results of the key frames and the revised detection results of the intermediate frames includes:
based on the detection results of the key frames and the revised detection results of the intermediate frames, using a linear interpolation algorithm to determine the detection results of the frames in the target video other than the key frames and the intermediate frames.
In this way, while the accuracy of the detection results of the important frames, i.e., the key frames and intermediate frames, is guaranteed, the detection results of the remaining frames in the target video can also be estimated; this strikes a balance between time and accuracy, and real-time object detection in video can be realized while maintaining high accuracy.
Further, after step 104, the method may also include:
Step 105 (not shown in Fig. 1): concatenating the detection boxes of the same category in every two adjacent frames according to their degree of spatial overlap, obtaining object chains, where an object chain consists of detection boxes of the same category spanning multiple frames; and reclassifying the detection boxes on each object chain to obtain the classification confidence of each detection box.
That is, the same-category detection boxes of every two adjacent frames are strung together according to their degree of spatial overlap, finally forming chains of same-category detection boxes spanning multiple frames; the detection boxes on each chain are then reclassified, and the classification confidence of each detection box on each chain is determined.
Here, a category refers to the class of an object, such as humans, animals, or vehicles. The categories can be set according to a common standard or according to user demand.
Here, the degree of spatial overlap refers to connecting the two detection boxes in two adjacent frames that are closest to each other.
For example, the first frame contains 4 boxes, denoted box 1, box 2, box 3, and box 4; the second frame also contains 4 boxes, denoted box 1', box 2', box 3', and box 4'. If box 1 and box 1' are closest, they are strung together to form the first chain; if box 2 and box 2' are closest, they are strung together to form the second chain; if box 3 and box 3' are closest, they are strung together to form the third chain; and if box 4 and box 4' are closest, they are strung together to form the fourth chain. In practice, the object categories corresponding to the first, second, third, and fourth chains may be identical or different. For example, the object on all four chains may be a person; or, the object on the first chain may be a person, the object on the second chain a dog, the object on the third chain a tree, and the object on the fourth chain a car.
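The chaining just described can be sketched as a greedy nearest-neighbor matching between the boxes of two adjacent frames. This is only an illustrative reading of "connect the two closest boxes" (the helper names are hypothetical, and a real implementation would also restrict matches to the same category):

```python
import numpy as np

def center_dist(a, b):
    """Distance between the centers of two [x1, y1, x2, y2] boxes."""
    ca = np.array([(a[0] + a[2]) / 2, (a[1] + a[3]) / 2])
    cb = np.array([(b[0] + b[2]) / 2, (b[1] + b[3]) / 2])
    return float(np.linalg.norm(ca - cb))

def link_frames(boxes_t, boxes_t1):
    """Greedily link each box in frame t to its nearest still-unused box
    in frame t+1, extending one chain per matched pair."""
    links, used = {}, set()
    for i, a in enumerate(boxes_t):
        dists = [(center_dist(a, b), j) for j, b in enumerate(boxes_t1)
                 if j not in used]
        if dists:
            _, j = min(dists)
            links[i] = j
            used.add(j)
    return links

frame_t  = [[0, 0, 10, 10], [100, 100, 110, 110]]
frame_t1 = [[102, 101, 112, 111], [1, 0, 11, 10]]
print(link_frames(frame_t, frame_t1))  # {0: 1, 1: 0}
```

Repeating this for every adjacent frame pair yields chains spanning multiple frames, one per tracked object.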
As an alternative embodiment, reclassifying the detection boxes on each object chain and obtaining the classification confidence of each detection box includes:
selecting several detection boxes on each object chain at equal intervals, cropping the images corresponding to the selected detection boxes and scaling them to the same size, and using a third neural network to extract features from each of the same-size images and classify them, obtaining the classification confidence of each detection box on each object chain.
Here, the value of the interval can be set or adjusted according to the length of the chain; it may or may not be equal to the constant time interval used when selecting the initial key frames.
Here, selecting several boxes can be understood as selecting one box every fixed number of frames.
Suppose the length of a chain is 30 frames; then, selecting one box every 6 frames, the selected frames corresponding to boxes on this chain may be the 1st, 7th, 13th, 19th, and 25th of the 30 frames.
In fact, on a given chain each detection box corresponds to one distinct frame; a single frame never contributes two or more boxes to the same chain.
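The equal-interval sampling above reduces to simple strided indexing along the chain; a one-line sketch (the function name is illustrative) reproduces the 30-frame example:

```python
def sample_chain(chain, interval):
    """Pick every `interval`-th detection box along an object chain; the
    sampled crops would then be resized to one size and fed to the
    (hypothetical) third classification network."""
    return chain[::interval]

# A chain spanning 30 frames, sampled every 6 frames.
chain = list(range(1, 31))     # frame indices 1..30 along the chain
print(sample_chain(chain, 6))  # [1, 7, 13, 19, 25]
```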
In practical applications, if the true picture contains one person, one dog, and one car, but the detection yields 2 persons, 2 dogs, and 1 car, with 2 overlapping boxes detected at the true person's position, then the box with the higher classification confidence is retained.
The reclassification here reaffirms the object in each box and the classification confidence of each box, but does not re-determine the spatial positions of the boxes.
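The "keep the higher-confidence box" rule above can be sketched as a greedy overlap suppression among same-category boxes. The IoU criterion and threshold below are illustrative assumptions; the source only specifies that among overlapping boxes the higher-confidence one survives:

```python
def iou(a, b):
    """Intersection-over-union of two [x1, y1, x2, y2] boxes."""
    ix = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    iy = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = ix * iy
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def keep_best(boxes, scores, thresh=0.5):
    """Among boxes overlapping above `thresh`, keep only the one with the
    higher classification confidence; returns kept indices."""
    order = sorted(range(len(boxes)), key=lambda i: -scores[i])
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= thresh for k in kept):
            kept.append(i)
    return sorted(kept)

# Two overlapping boxes at the same true person: the 0.9 one is retained.
boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [50, 50, 60, 60]]
scores = [0.6, 0.9, 0.8]
print(keep_best(boxes, scores))  # [1, 2]
```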
For each category, consider the 1st to the 25th frame: if every frame contains a cat, a dog, a car, and a person, then one chain for the cat, one chain for the dog, one chain for the person, and one chain for the car may be obtained.
For example, suppose steps 101 to 104 determine that the object in a box spanning the 1st to the 25th frame is a cat with confidence 0.5; after the reclassification of step 105, the object in that box may be determined to be a cat with confidence 0.8, or a dog with confidence 0.7.
The third neural network is a neural network specially trained with a third training set. By feeding the third neural network each chain of same-category detection boxes, the third neural network can output the classification confidence of each detection box on each chain.
In this way, by reclassifying the classification confidences in the detection results of the frames, the accuracy of the classification confidences can be improved, thereby further improving the accuracy of the detection result of each frame.
For a target video, if a node is understood as a frame, then the video has multiple input nodes and multiple output nodes. Taking a target video with 600 frames as an example, if 50 key frames are chosen in total, the video has 50 input nodes and 600 output nodes.
Compared with existing methods, which are usually optimized along only one dimension, either time or scale, without comprehensive modeling, the present application proposes a new video object detection framework that performs joint modeling and analysis in the two dimensions of time and scale; specifically, a grid-like progressive analysis is carried out in the two dimensions of time and scale.
Fig. 2 is a schematic diagram of the scale-time grid analysis. As shown in Fig. 2, the video object detection framework proposed by the present invention models object detection as a directed acyclic graph in a two-dimensional scale-time space. The horizontal axis is the time dimension, with time increasing from left to right; the vertical axis is the scale dimension, with image resolution increasing from top to bottom. Each node in Fig. 2 is the detection result at some time point under some scale, and each directed edge is an operation. The process of object detection starts from the sparse nodes at the top and, through a series of paths, reaches the dense nodes at the bottom. Along the time dimension, a motion history image (MHI) is used as input to propagate detection results to other frames; along the scale dimension, the spatial positions of the detection boxes are corrected level by level from low resolution to high resolution. Through this grid-like propagation and correction path, the detection result of each frame is finally obtained at high resolution.
Fig. 3 is an example of the detection framework, where T denotes the time propagation module and S denotes the spatial position correction module. In this figure, for the frame at time t (assumed to be a key frame, corresponding to the 1st frame), the detector yields 4 boxes, and the object in each box is a person; for the frame at time t+4x (assumed to be a key frame, corresponding to the 25th frame), the detector yields 4 boxes, where the object in 3 boxes is a person and the object in 1 box is a car; for the frame at time t+2x (assumed to be an intermediate frame, corresponding to the 13th frame), no detector is used; instead, through scale-time propagation and correction, 8 boxes are obtained, including 4 boxes propagated from the 1st frame to the 13th frame and 4 boxes propagated from the 25th frame to the 13th frame.
The object detection method in video proposed by the embodiment of the present invention provides a new video object detection framework in which a grid-like progressive analysis is performed in the two dimensions of time and scale. Under the proposed framework, an efficient time-dimension propagation module is designed: specifically, along the time dimension, a motion history image (MHI) is used as input to propagate detection results to other frames; along the scale dimension, the spatial positions of the detection boxes are corrected step by step from low resolution to high resolution. Through this grid-like propagation and correction path, the detection result of each frame is finally obtained at high resolution. The technical solution of the present invention achieves a good balance between computational cost and detection performance, and enables real-time object detection in video while maintaining relatively high accuracy.
With the technical solution of the present invention, surveillance video can be analyzed in real time to detect objects of interest; the video stream of a vehicle-mounted camera can also be analyzed in real time to detect objects such as pedestrians and vehicles in the road ahead for vision-based assisted driving.
Corresponding to the above object detection method in video, this embodiment provides an object detection device in video. As shown in Fig. 4, the device includes:
a first determining module 10, configured to determine several key frames based on a target video;
a key frame detection module 20, configured to perform object detection on each key frame to obtain the detection result of each key frame;
a second determining module 30, configured to determine, according to the detection results of the key frames, the detection result of the intermediate frame between every two adjacent key frames;
a correcting module 40, configured to revise the detection result of each intermediate frame to obtain the revised detection result of each intermediate frame;
a third determining module 50, configured to determine the detection result of the target video based on the detection results of the key frames and the revised detection results of the intermediate frames.
Further, the device also includes:
a reclassification module 60, configured to, after the third determining module 50 determines the detection result of the target video, concatenate the detection boxes of the same category in every two adjacent frames according to their degree of spatial overlap to obtain object chains, where an object chain consists of detection boxes of the same category spanning multiple frames, and to reclassify the detection boxes on each object chain to obtain the classification confidence of each detection box.
As an embodiment, the first determining module 10 is further configured to:
select multiple initial key frames at a preset time interval, and perform object detection on each initial key frame to obtain the spatial positions and classification confidences of the detection boxes in each initial key frame;
match the detection boxes in every two adjacent initial key frames based on their spatial positions and classification confidences;
in response to the matching degree of the spatial positions and classification confidences being lower than a preset threshold, select a secondary key frame from the frames between the two adjacent initial key frames, and perform object detection on each secondary key frame to obtain the spatial positions and classification confidences of the detection boxes in each secondary key frame;
wherein the key frames determined for the target video include only the initial key frames, or include both the initial key frames and the secondary key frames.
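The adaptive key-frame selection above can be sketched as follows. The `match_degree` scorer, the midpoint placement of the secondary key frame, and all numbers are illustrative assumptions; the source only specifies that a secondary key frame is inserted when matching falls below a threshold:

```python
def choose_keyframes(num_frames, interval, match_degree, thresh=0.5):
    """Pick initial key frames every `interval` frames; whenever the
    box-matching degree between two adjacent initial key frames falls
    below `thresh`, insert a secondary key frame between them.
    `match_degree(a, b)` is a hypothetical scorer on frame indices."""
    initial = list(range(0, num_frames, interval))
    keys = list(initial)
    for a, b in zip(initial, initial[1:]):
        if match_degree(a, b) < thresh:
            keys.append((a + b) // 2)   # secondary key frame (midpoint assumed)
    return sorted(keys)

# Pretend matching is poor only across the 24->48 gap (fast motion there).
degree = lambda a, b: 0.2 if a == 24 else 0.9
print(choose_keyframes(72, 24, degree))  # [0, 24, 36, 48]
```

This way the detector runs more densely exactly where propagation would be least reliable.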
As an embodiment, the second determining module 30 is configured to:
for every two adjacent key frames, take the left frame, the intermediate frame, and the frames between the left frame and the intermediate frame, compute a first motion history image (MHI), extract features from the first motion history image using a first neural network, predict a first offset of the detection boxes from the left frame to the intermediate frame, and add the first offset to the detection boxes of the left frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of a propagated detection box being the same as that of the corresponding detection box of the left frame;
for every two adjacent key frames, take the right frame, the intermediate frame, and the frames between the right frame and the intermediate frame, compute a second motion history image, extract features from the second motion history image using the first neural network, predict a second offset of the detection boxes from the right frame to the intermediate frame, and add the second offset to the detection boxes of the right frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of a propagated detection box being the same as that of the corresponding detection box of the right frame;
merge the results propagated from the left frame to the intermediate frame and the results propagated from the right frame to the intermediate frame as the detection result of the intermediate frame.
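To make the MHI input concrete, here is a simplified sketch of a motion history image: recently moved pixels carry the full value and older motion fades step by step. The change test, `tau`, and `decay` values are illustrative simplifications of the standard MHI formulation, not the patent's exact computation:

```python
import numpy as np

def motion_history_image(frames, tau=1.0, decay=0.25):
    """Build a motion history image (MHI) from grayscale frames: pixels
    that changed in the most recent step get the full value `tau`, and
    older motion fades by `decay` per step."""
    mhi = np.zeros_like(frames[0], dtype=float)
    for prev, cur in zip(frames, frames[1:]):
        moved = np.abs(cur.astype(float) - prev.astype(float)) > 0
        mhi = np.where(moved, tau, np.maximum(mhi - decay, 0.0))
    return mhi

# A 1x4 "video": the bright pixel shifts right one position per frame.
f0 = np.array([[9, 0, 0, 0]])
f1 = np.array([[0, 9, 0, 0]])
f2 = np.array([[0, 0, 9, 0]])
print(motion_history_image([f0, f1, f2]))  # [[0.75 1.   1.   0.  ]]
```

The fading trail encodes both where motion occurred and how recently, which is why a single MHI is a compact input for predicting box offsets over several frames.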
As an embodiment, the correcting module 40 is further configured to:
perform a scale transformation on the image and the detection result of the intermediate frame according to a target scale, the target scale being larger than the current scale;
extract features from the image using a second neural network, predict an offset from the input box to the corresponding object position in the image, and add the offset to the input box to obtain the corrected spatial position at the target scale;
wherein the input box is a detection box of the intermediate frame.
As an embodiment, the third determining module 50 is further configured to:
based on the detection results of the key frames and the revised detection results of the intermediate frames, use a linear interpolation algorithm to determine the detection results of the frames in the target video other than the key frames and the intermediate frames.
As an embodiment, the reclassification module 60 is further configured to:
select several detection boxes on each object chain at equal intervals, crop the images corresponding to the selected detection boxes and scale them to the same size, and use a third neural network to extract features from each of the same-size images and classify them, obtaining the classification confidence of each detection box on each object chain.
Those skilled in the art will appreciate that the functions implemented by the processing modules of the object detection device in video shown in Fig. 4 can be understood with reference to the related description of the foregoing object detection method in video. Those skilled in the art will also appreciate that the function of each processing unit in the device shown in Fig. 4 can be realized by a program running on a processor, or by a specific logic circuit.
In practical applications, the above key frame detection module 20 can be realized by a picture-based object detector. The specific structures of the first determining module 10, the second determining module 30, the correcting module 40, the third determining module 50, and the reclassification module 60 may all correspond to a processor. The specific structure of the processor can be a central processing unit (CPU, Central Processing Unit), a microcontroller unit (MCU, Micro Controller Unit), a digital signal processor (DSP, Digital Signal Processing), a programmable logic controller (PLC, Programmable Logic Controller), or another electronic component or set of electronic components with a processing function. The processor includes executable code stored in a storage medium; the processor can be connected to the storage medium through a communication interface such as a bus, and, when performing the corresponding function of a specific unit, reads and runs the executable code from the storage medium. The part of the storage medium used to store the executable code is preferably a non-transitory storage medium.
The first determining module 10, the second determining module 30, the correcting module 40, the third determining module 50, and the reclassification module 60 can be integrated into the same processor or correspond to different processors respectively; when integrated into the same processor, the processor handles the functions corresponding to the first determining module 10, the second determining module 30, the correcting module 40, the third determining module 50, and the reclassification module 60 by time division.
The object detection device in video proposed by the present invention performs a grid-like progressive analysis in the two dimensions of time and scale: specifically, along the time dimension, a motion history image (MHI) is used as input to propagate detection results to other frames; along the scale dimension, the spatial positions of the detection boxes are corrected step by step from low resolution to high resolution. Through this grid-like propagation and correction path, the detection result of each frame is finally obtained at high resolution. In this way, a good balance between computational cost and detection performance can be achieved, and real-time object detection in video can be realized while maintaining relatively high accuracy.
The embodiment of the present invention also describes a computer storage medium in which computer-executable instructions are stored, the computer-executable instructions being used to perform the object detection method in video described in the foregoing embodiments. That is, after the computer-executable instructions are executed by a processor, the object detection method in video provided by any of the foregoing technical solutions can be realized.
Those skilled in the art will appreciate that the function of each program in the computer storage medium of this embodiment can be understood with reference to the related description of the object detection method in video described in the foregoing embodiments.
It should be noted that the technical solution of the present invention has strong versatility. In addition to the above object detection task, by replacing particular modules such as the second determining module and the correcting module, tasks such as object tracking in video and object instance segmentation can be accomplished.
Taking object instance segmentation as an example: under the framework of the method provided by the present invention, starting from the segmentation results of the sparse key frames, masks are propagated along the time axis and corrected step by step in spatial position; in this case, the object detector is replaced with a segmenter.
Taking object tracking as an example: under the framework of the method provided by the present invention, starting from the detection results of the sparse key frames, tracking is propagated along the time axis and corrected step by step in spatial position; in this case, the object detector is replaced with a tracker.
The object detection method and device in video and the computer storage medium described in the above embodiments can be applied to scenarios such as intelligent video analysis and unmanned-vehicle automatic driving.
An application scenario in the field of unmanned driving is given below. In practical applications, an intelligent automobile uses the above object detection method and device in video and the computer storage medium, with a detection process that goes from sparse to dense along the time dimension and from low resolution to high resolution along the scale dimension, to determine the detection result of each frame in the target video, realizing real-time object detection in video while maintaining high accuracy; the video stream of the vehicle-mounted camera is analyzed in real time to detect objects such as pedestrians and vehicles in the road ahead for vision-based assisted driving.
An application scenario in intelligent video analysis is given below. In practical applications, a robot uses the above object detection method and device in video and the computer storage medium, with a detection process that goes from sparse to dense along the time dimension and from low resolution to high resolution along the scale dimension, to analyze surveillance video in real time, quickly and accurately determining the detection result of each frame in the target video and detecting objects of interest.
In the several embodiments provided in this application, it should be understood that the disclosed device and method may be implemented in other ways. The device embodiments described above are merely illustrative; for example, the division of the units is only a division by logical function, and there may be other ways of division in actual implementation, e.g., multiple units or components may be combined or integrated into another system, or some features may be ignored or not performed. In addition, the coupling, direct coupling, or communication connection between the components shown or discussed may be an indirect coupling or communication connection through some interfaces, devices, or units, and may be electrical, mechanical, or in other forms.
The units described above as separate components may or may not be physically separate, and components shown as units may or may not be physical units; they may be located in one place or distributed over multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.
In addition, the functional units in the embodiments of the present invention may all be integrated into one processing unit, or each unit may serve individually as one unit, or two or more units may be integrated into one unit; the integrated unit may be implemented in the form of hardware, or in the form of hardware plus software functional units.
Those of ordinary skill in the art will appreciate that all or part of the steps of the above method embodiments may be completed by hardware related to program instructions. The aforementioned program may be stored in a computer-readable storage medium; when the program is executed, the steps of the above method embodiments are performed. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a read-only memory (ROM, Read-Only Memory), a random access memory (RAM, Random Access Memory), a magnetic disk, or an optical disc.
Alternatively, when the above integrated unit of the present invention is implemented in the form of a software function module and sold or used as an independent product, it may also be stored in a computer-readable storage medium. Based on such an understanding, the technical solution of the embodiments of the present invention, in essence, or the part contributing to the prior art, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or part of the method described in each embodiment of the present invention. The aforementioned storage medium includes various media that can store program code, such as a removable storage device, a ROM, a RAM, a magnetic disk, or an optical disc.
The above is only a specific embodiment of the present invention, but the protection scope of the present invention is not limited thereto. Any change or replacement that can be readily conceived by those familiar with the art within the technical scope disclosed by the present invention shall be covered within the protection scope of the present invention. Therefore, the protection scope of the present invention shall be based on the protection scope of the claims.
Claims (10)
1. An object detection method in video, characterized in that the method includes:
determining several key frames based on a target video, and performing object detection on each key frame to obtain the detection result of each key frame;
determining, according to the detection results of the key frames, the detection result of the intermediate frame between every two adjacent key frames;
revising the detection result of each intermediate frame to obtain the revised detection result of each intermediate frame;
determining the detection result of the target video based on the detection results of the key frames and the revised detection results of the intermediate frames.
2. The method according to claim 1, characterized in that, after the determining the detection result of the target video, the method further includes:
concatenating the detection boxes of the same category in every two adjacent frames according to their degree of spatial overlap to obtain object chains, where an object chain consists of detection boxes of the same category spanning multiple frames;
reclassifying the detection boxes on each object chain, and obtaining the classification confidence of each detection box.
3. The method according to claim 1 or 2, characterized in that the determining several key frames based on a target video and performing object detection on each key frame to obtain the detection result of each key frame includes:
selecting multiple initial key frames at a preset time interval, and performing object detection on each initial key frame to obtain the spatial positions and classification confidences of the detection boxes in each initial key frame;
matching the detection boxes in every two adjacent initial key frames based on their spatial positions and classification confidences;
in response to the matching degree of the spatial positions and classification confidences being lower than a preset threshold, selecting a secondary key frame from the frames between the two adjacent initial key frames, and performing object detection on each secondary key frame to obtain the spatial positions and classification confidences of the detection boxes in each secondary key frame;
wherein the key frames determined for the target video include only the initial key frames, or include both the initial key frames and the secondary key frames.
4. The method according to claim 1 or 2, characterized in that the determining, according to the detection results of the key frames, the detection result of the intermediate frame between every two adjacent key frames includes:
for every two adjacent key frames, taking the left frame, the intermediate frame, and the frames between the left frame and the intermediate frame, computing a first motion history image, extracting features from the first motion history image using a first neural network, predicting a first offset of the detection boxes from the left frame to the intermediate frame, and adding the first offset to the detection boxes of the left frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of a propagated detection box being the same as that of the corresponding detection box of the left frame;
for every two adjacent key frames, taking the right frame, the intermediate frame, and the frames between the right frame and the intermediate frame, computing a second motion history image, extracting features from the second motion history image using the first neural network, predicting a second offset of the detection boxes from the right frame to the intermediate frame, and adding the second offset to the detection boxes of the right frame to obtain the spatial positions of the detection boxes propagated to the intermediate frame, the classification confidence of a propagated detection box being the same as that of the corresponding detection box of the right frame;
merging the results propagated from the left frame to the intermediate frame and the results propagated from the right frame to the intermediate frame as the detection result of the intermediate frame.
5. The method according to claim 1 or 2, wherein correcting the detection result of each intermediate frame to obtain the corrected detection result of each intermediate frame comprises:
scaling the image and the detection result of the intermediate frame to a target scale, the target scale being larger than the current scale;
extracting features from the image using a second neural network, predicting an offset of the input box relative to the corresponding object position in the image, and adding the offset to the input box to obtain the corrected spatial position at the target scale;
wherein the input box is a detection box of the intermediate frame.
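A minimal sketch of the scale-based correction in claim 5, with the second neural network stubbed out: `refine_offset` is a hypothetical stand-in for the offset it predicts at the target scale, and all names are illustrative rather than taken from the patent.

```python
def correct_box(box, scale_factor, refine_offset):
    """Upscale a box to the target scale, then apply a predicted correction offset.

    box: [x1, y1, x2, y2] at the current scale.
    scale_factor: target_scale / current_scale; claim 5 requires it to be > 1.
    refine_offset: (dx, dy) the second network would predict at the target scale.
    """
    assert scale_factor > 1, "the target scale must exceed the current scale"
    scaled = [c * scale_factor for c in box]
    dx, dy = refine_offset
    return [scaled[0] + dx, scaled[1] + dy, scaled[2] + dx, scaled[3] + dy]
```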
6. The method according to claim 1 or 2, wherein determining the detection result of the target video based on the detection results of the key frames and the corrected detection results of the intermediate frames comprises:
based on the detection results of the key frames and the corrected detection results of the intermediate frames, determining, by a linear interpolation algorithm, the detection results of the frames in the target video other than the key frames and the intermediate frames.
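The linear interpolation in claim 6 amounts to blending the two nearest frames with known detections. A minimal sketch (illustrative only; the patent does not fix the exact formula):

```python
def interpolate_box(box_a, box_b, t):
    """Linearly interpolate two boxes [x1, y1, x2, y2].

    t in [0, 1] is the frame's relative position between the two frames with
    known detections, e.g. t = (i - i_a) / (i_b - i_a) for frame index i.
    """
    return [a + t * (b - a) for a, b in zip(box_a, box_b)]
```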
7. The method according to claim 2, wherein re-classifying the detection boxes on each object chain and obtaining the classification confidence of each detection box comprises:
selecting several detection boxes on each object chain at equal intervals, cropping the images corresponding to the selected detection boxes, and scaling the cropped images to the same size; extracting features from each same-sized image using a third neural network and classifying them, to obtain the classification confidence of each detection box on each object chain.
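The equal-interval selection in claim 7 can be sketched as index subsampling along the chain. A hypothetical helper (the patent does not specify how the interval is computed):

```python
def sample_boxes(chain, k):
    """Pick k detection boxes at (approximately) equal intervals along an object chain.

    If the chain has k or fewer boxes, all of them are returned.
    """
    if k >= len(chain):
        return list(chain)
    step = (len(chain) - 1) / (k - 1)  # assumes k >= 2 when subsampling
    return [chain[round(i * step)] for i in range(k)]
```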
8. An apparatus for detecting objects in a video, the apparatus comprising:
a first determining module, configured to determine several key frames based on a target video;
a key-frame detection module, configured to perform object detection on each key frame to obtain a detection result of each key frame;
a second determining module, configured to determine, according to the detection result of each key frame, a detection result of each intermediate frame between every two adjacent key frames;
a correction module, configured to correct the detection result of each intermediate frame to obtain a corrected detection result of each intermediate frame;
a third determining module, configured to determine the detection result of the target video based on the detection results of the key frames and the corrected detection results of the intermediate frames.
9. The apparatus according to claim 8, further comprising:
a re-classification module, configured to, after the third determining module determines the detection result of the target video, link detection boxes of the same class in every two adjacent frames into object chains according to their degree of spatial overlap, each object chain consisting of detection boxes of the same class spanning multiple frames; and to re-classify the detection boxes on each object chain and obtain the classification confidence of each detection box.
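The chain-linking step in claim 9 can be read as greedily connecting same-class boxes in adjacent frames whose spatial overlap (here measured as intersection-over-union, one common choice of overlap degree) exceeds a threshold. A minimal sketch under those assumptions — the overlap measure, threshold, and greedy matching are all illustrative, not specified by the patent:

```python
def iou(a, b):
    """Intersection-over-union of two boxes [x1, y1, x2, y2]."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def link_chains(frames, iou_thresh=0.5):
    """frames: per-frame lists of (box, cls). Greedily links each detection to
    at most one chain whose last box is in the previous frame, has the same
    class, and overlaps above the threshold; otherwise starts a new chain."""
    chains, open_chains = [], []
    for dets in frames:
        next_open = []
        remaining = list(open_chains)
        for box, cls in dets:
            match = next((ch for ch in remaining
                          if ch[-1][1] == cls and iou(ch[-1][0], box) > iou_thresh),
                         None)
            if match is not None:
                remaining.remove(match)  # each chain extends by one box per frame
            else:
                match = []
                chains.append(match)
            match.append((box, cls))
            next_open.append(match)
        open_chains = next_open
    return chains
```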
10. A computer storage medium storing computer-executable instructions for performing the method for detecting objects in a video according to any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810151829.XA CN108256506B (en) | 2018-02-14 | 2018-02-14 | Method and device for detecting object in video and computer storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810151829.XA CN108256506B (en) | 2018-02-14 | 2018-02-14 | Method and device for detecting object in video and computer storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108256506A true CN108256506A (en) | 2018-07-06 |
CN108256506B CN108256506B (en) | 2020-11-24 |
Family
ID=62744333
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810151829.XA Active CN108256506B (en) | 2018-02-14 | 2018-02-14 | Method and device for detecting object in video and computer storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108256506B (en) |
Cited By (22)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109059780A (en) * | 2018-09-11 | 2018-12-21 | 百度在线网络技术(北京)有限公司 | Detect method, apparatus, equipment and the storage medium of obstacle height |
CN109063593A (en) * | 2018-07-13 | 2018-12-21 | 北京智芯原动科技有限公司 | A kind of face tracking method and device |
CN109118519A (en) * | 2018-07-26 | 2019-01-01 | 北京纵目安驰智能科技有限公司 | Target Re-ID method, system, terminal and the storage medium of Case-based Reasoning segmentation |
CN109308463A (en) * | 2018-09-12 | 2019-02-05 | 北京奇艺世纪科技有限公司 | A kind of video object recognition methods, device and equipment |
CN109344789A (en) * | 2018-10-16 | 2019-02-15 | 北京旷视科技有限公司 | Face tracking method and device |
CN109711296A (en) * | 2018-12-14 | 2019-05-03 | 百度在线网络技术(北京)有限公司 | Object classification method and its device, computer program product, readable storage medium storing program for executing |
CN110070050A (en) * | 2019-04-24 | 2019-07-30 | 厦门美图之家科技有限公司 | Object detection method and system |
CN110189378A (en) * | 2019-05-23 | 2019-08-30 | 北京奇艺世纪科技有限公司 | A kind of method for processing video frequency, device and electronic equipment |
CN110427816A (en) * | 2019-06-25 | 2019-11-08 | 平安科技(深圳)有限公司 | Object detecting method, device, computer equipment and storage medium |
CN110705412A (en) * | 2019-09-24 | 2020-01-17 | 北京工商大学 | Video target detection method based on motion history image |
CN111178245A (en) * | 2019-12-27 | 2020-05-19 | 深圳佑驾创新科技有限公司 | Lane line detection method, lane line detection device, computer device, and storage medium |
CN111860373A (en) * | 2020-07-24 | 2020-10-30 | 浙江商汤科技开发有限公司 | Target detection method and device, electronic equipment and storage medium |
CN112061139A (en) * | 2020-09-03 | 2020-12-11 | 三一专用汽车有限责任公司 | Automatic driving control method, automatic driving device and computer storage medium |
CN112528932A (en) * | 2020-12-22 | 2021-03-19 | 北京百度网讯科技有限公司 | Method and device for optimizing position information, road side equipment and cloud control platform |
US10984588B2 (en) | 2018-09-07 | 2021-04-20 | Baidu Online Network Technology (Beijing) Co., Ltd | Obstacle distribution simulation method and device based on multiple models, and storage medium |
US11113546B2 (en) | 2018-09-04 | 2021-09-07 | Baidu Online Network Technology (Beijing) Co., Ltd. | Lane line processing method and device |
US11126875B2 (en) | 2018-09-13 | 2021-09-21 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and device of multi-focal sensing of an obstacle and non-volatile computer-readable storage medium |
US11205289B2 (en) | 2018-09-07 | 2021-12-21 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method, device and terminal for data augmentation |
US11307302B2 (en) | 2018-09-07 | 2022-04-19 | Baidu Online Network Technology (Beijing) Co., Ltd | Method and device for estimating an absolute velocity of an obstacle, and non-volatile computer-readable storage medium |
WO2023138444A1 (en) * | 2022-01-22 | 2023-07-27 | 北京眼神智能科技有限公司 | Pedestrian action continuous detection and recognition method and apparatus, storage medium, and computer device |
US11718318B2 (en) | 2019-02-22 | 2023-08-08 | Apollo Intelligent Driving (Beijing) Technology Co., Ltd. | Method and apparatus for planning speed of autonomous vehicle, and storage medium |
US11780463B2 (en) | 2019-02-19 | 2023-10-10 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method, apparatus and server for real-time learning of travelling strategy of driverless vehicle |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101499085A (en) * | 2008-12-16 | 2009-08-05 | 北京大学 | Method and apparatus for fast extracting key frame |
US20120154684A1 (en) * | 2010-12-17 | 2012-06-21 | Jiebo Luo | Method for producing a blended video sequence |
CN103400386A (en) * | 2013-07-30 | 2013-11-20 | 清华大学深圳研究生院 | Interactive image processing method used for video |
CN103413322A (en) * | 2013-07-16 | 2013-11-27 | 南京师范大学 | Keyframe extraction method of sequence video |
CN105931269A (en) * | 2016-04-22 | 2016-09-07 | 海信集团有限公司 | Tracking method for target in video and tracking device thereof |
CN106447608A (en) * | 2016-08-25 | 2017-02-22 | 中国科学院长春光学精密机械与物理研究所 | Video image splicing method and device |
- 2018-02-14: Application CN201810151829.XA filed; granted as patent CN108256506B (active)
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101499085A (en) * | 2008-12-16 | 2009-08-05 | 北京大学 | Method and apparatus for fast extracting key frame |
US20120154684A1 (en) * | 2010-12-17 | 2012-06-21 | Jiebo Luo | Method for producing a blended video sequence |
CN103413322A (en) * | 2013-07-16 | 2013-11-27 | 南京师范大学 | Keyframe extraction method of sequence video |
CN103400386A (en) * | 2013-07-30 | 2013-11-20 | 清华大学深圳研究生院 | Interactive image processing method used for video |
CN105931269A (en) * | 2016-04-22 | 2016-09-07 | 海信集团有限公司 | Tracking method for target in video and tracking device thereof |
CN106447608A (en) * | 2016-08-25 | 2017-02-22 | 中国科学院长春光学精密机械与物理研究所 | Video image splicing method and device |
Non-Patent Citations (1)
Title |
---|
Xu Haoran (徐浩然): "Object-based spatio-temporal compressed video summarization technology", Digital Technology & Application (《数字技术与应用》) * |
Cited By (34)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109063593A (en) * | 2018-07-13 | 2018-12-21 | 北京智芯原动科技有限公司 | A kind of face tracking method and device |
CN109118519A (en) * | 2018-07-26 | 2019-01-01 | 北京纵目安驰智能科技有限公司 | Target Re-ID method, system, terminal and the storage medium of Case-based Reasoning segmentation |
US11113546B2 (en) | 2018-09-04 | 2021-09-07 | Baidu Online Network Technology (Beijing) Co., Ltd. | Lane line processing method and device |
US11307302B2 (en) | 2018-09-07 | 2022-04-19 | Baidu Online Network Technology (Beijing) Co., Ltd | Method and device for estimating an absolute velocity of an obstacle, and non-volatile computer-readable storage medium |
US11205289B2 (en) | 2018-09-07 | 2021-12-21 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method, device and terminal for data augmentation |
US10984588B2 (en) | 2018-09-07 | 2021-04-20 | Baidu Online Network Technology (Beijing) Co., Ltd | Obstacle distribution simulation method and device based on multiple models, and storage medium |
CN110375659A (en) * | 2018-09-11 | 2019-10-25 | 百度在线网络技术(北京)有限公司 | Detect method, apparatus, equipment and the storage medium of obstacle height |
CN109059780A (en) * | 2018-09-11 | 2018-12-21 | 百度在线网络技术(北京)有限公司 | Detect method, apparatus, equipment and the storage medium of obstacle height |
CN109059780B (en) * | 2018-09-11 | 2019-10-15 | 百度在线网络技术(北京)有限公司 | Detect method, apparatus, equipment and the storage medium of obstacle height |
US11519715B2 (en) | 2018-09-11 | 2022-12-06 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method, device, apparatus and storage medium for detecting a height of an obstacle |
CN110375659B (en) * | 2018-09-11 | 2021-07-27 | 百度在线网络技术(北京)有限公司 | Method, device, equipment and storage medium for detecting height of obstacle |
US11047673B2 (en) | 2018-09-11 | 2021-06-29 | Baidu Online Network Technology (Beijing) Co., Ltd | Method, device, apparatus and storage medium for detecting a height of an obstacle |
CN109308463A (en) * | 2018-09-12 | 2019-02-05 | 北京奇艺世纪科技有限公司 | A kind of video object recognition methods, device and equipment |
US11126875B2 (en) | 2018-09-13 | 2021-09-21 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method and device of multi-focal sensing of an obstacle and non-volatile computer-readable storage medium |
CN109344789A (en) * | 2018-10-16 | 2019-02-15 | 北京旷视科技有限公司 | Face tracking method and device |
CN109711296B (en) * | 2018-12-14 | 2022-01-25 | 百度在线网络技术(北京)有限公司 | Object classification method in automatic driving, device thereof and readable storage medium |
CN109711296A (en) * | 2018-12-14 | 2019-05-03 | 百度在线网络技术(北京)有限公司 | Object classification method and its device, computer program product, readable storage medium storing program for executing |
US11780463B2 (en) | 2019-02-19 | 2023-10-10 | Baidu Online Network Technology (Beijing) Co., Ltd. | Method, apparatus and server for real-time learning of travelling strategy of driverless vehicle |
US11718318B2 (en) | 2019-02-22 | 2023-08-08 | Apollo Intelligent Driving (Beijing) Technology Co., Ltd. | Method and apparatus for planning speed of autonomous vehicle, and storage medium |
CN110070050A (en) * | 2019-04-24 | 2019-07-30 | 厦门美图之家科技有限公司 | Object detection method and system |
CN110070050B (en) * | 2019-04-24 | 2021-08-20 | 厦门美图之家科技有限公司 | Target detection method and system |
CN110189378A (en) * | 2019-05-23 | 2019-08-30 | 北京奇艺世纪科技有限公司 | A kind of method for processing video frequency, device and electronic equipment |
CN110427816A (en) * | 2019-06-25 | 2019-11-08 | 平安科技(深圳)有限公司 | Object detecting method, device, computer equipment and storage medium |
CN110427816B (en) * | 2019-06-25 | 2023-09-08 | 平安科技(深圳)有限公司 | Object detection method, device, computer equipment and storage medium |
WO2020258499A1 (en) * | 2019-06-25 | 2020-12-30 | 平安科技(深圳)有限公司 | Object detection method and apparatus, and computer device and storage medium |
CN110705412A (en) * | 2019-09-24 | 2020-01-17 | 北京工商大学 | Video target detection method based on motion history image |
CN111178245A (en) * | 2019-12-27 | 2020-05-19 | 深圳佑驾创新科技有限公司 | Lane line detection method, lane line detection device, computer device, and storage medium |
CN111178245B (en) * | 2019-12-27 | 2023-12-22 | 佑驾创新(北京)技术有限公司 | Lane line detection method, lane line detection device, computer equipment and storage medium |
CN111860373B (en) * | 2020-07-24 | 2022-05-20 | 浙江商汤科技开发有限公司 | Target detection method and device, electronic equipment and storage medium |
CN111860373A (en) * | 2020-07-24 | 2020-10-30 | 浙江商汤科技开发有限公司 | Target detection method and device, electronic equipment and storage medium |
CN112061139A (en) * | 2020-09-03 | 2020-12-11 | 三一专用汽车有限责任公司 | Automatic driving control method, automatic driving device and computer storage medium |
CN112528932A (en) * | 2020-12-22 | 2021-03-19 | 北京百度网讯科技有限公司 | Method and device for optimizing position information, road side equipment and cloud control platform |
CN112528932B (en) * | 2020-12-22 | 2023-12-08 | 阿波罗智联(北京)科技有限公司 | Method and device for optimizing position information, road side equipment and cloud control platform |
WO2023138444A1 (en) * | 2022-01-22 | 2023-07-27 | 北京眼神智能科技有限公司 | Pedestrian action continuous detection and recognition method and apparatus, storage medium, and computer device |
Also Published As
Publication number | Publication date |
---|---|
CN108256506B (en) | 2020-11-24 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108256506A (en) | Object detecting method and device, computer storage media in a kind of video | |
CN110364008B (en) | Road condition determining method and device, computer equipment and storage medium | |
Moers et al. | The exiD dataset: A real-world trajectory dataset of highly interactive highway scenarios in Germany | |
CN104134349B (en) | A kind of public transport road conditions disposal system based on traffic multisource data fusion and method | |
CN110415277B (en) | Multi-target tracking method, system and device based on optical flow and Kalman filtering | |
Anand et al. | Data fusion-based traffic density estimation and prediction | |
Luo et al. | Queue length estimation for signalized intersections using license plate recognition data | |
CN111027430B (en) | Traffic scene complexity calculation method for intelligent evaluation of unmanned vehicles | |
CN103366602A (en) | Method of determining parking lot occupancy from digital camera images | |
Hussein et al. | Automated pedestrian safety analysis at a signalized intersection in New York City: Automated data extraction for safety diagnosis and behavioral study | |
Piccoli et al. | Fussi-net: Fusion of spatio-temporal skeletons for intention prediction network | |
Tageldin et al. | Comparison of time-proximity and evasive action conflict measures: Case studies from five cities | |
CN110268457A (en) | For determining medium, controller that the actuated method of collection, computer program product, the computer capacity of at least two vehicles read and including the vehicle of the controller | |
Filatov et al. | Any motion detector: Learning class-agnostic scene dynamics from a sequence of lidar point clouds | |
Dhouioui et al. | Design and implementation of a radar and camera-based obstacle classification system using machine-learning techniques | |
CN112149471B (en) | Loop detection method and device based on semantic point cloud | |
CN114372503A (en) | Cluster vehicle motion trail prediction method | |
CN113867367B (en) | Processing method and device for test scene and computer program product | |
CN115841080A (en) | Multi-view dynamic space-time semantic embedded open pit truck transport time prediction method | |
CN110021161A (en) | A kind of prediction technique and system of traffic direction | |
CN113361528B (en) | Multi-scale target detection method and system | |
Realpe et al. | Towards fault tolerant perception for autonomous vehicles: Local fusion | |
Shin et al. | Image-based learning to measure the stopped delay in an approach of a signalized intersection | |
Katariya et al. | A pov-based highway vehicle trajectory dataset and prediction architecture | |
Hussein et al. | Analysis of road user behavior and safety during New York City’s summer streets program |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |