CN108491816A - Method and apparatus for target tracking in video - Google Patents
- Publication number
- CN108491816A CN108491816A CN201810276460.5A CN201810276460A CN108491816A CN 108491816 A CN108491816 A CN 108491816A CN 201810276460 A CN201810276460 A CN 201810276460A CN 108491816 A CN108491816 A CN 108491816A
- Authority
- CN
- China
- Prior art keywords
- target
- candidate
- tracked
- video
- region
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V20/00—Scenes; Scene-specific elements
- G06V20/40—Scenes; Scene-specific elements in video content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/25—Determination of region of interest [ROI] or a volume of interest [VOI]
Abstract
Embodiments of the present application disclose a method and apparatus for target tracking in video. The method comprises: cropping a candidate region from the current frame of a video based on the position of the target to be tracked in a historical frame of the video; inputting the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, where the feature map includes candidate target region information indicating the positions of candidate targets within the feature map; determining, based on the candidate target region information, candidate target regions in one-to-one correspondence with the candidate targets from the feature map; and taking, among the determined candidate target regions, the one with the highest similarity to the target to be tracked as the target to be tracked in the current frame. This embodiment can determine the target to be tracked in the current frame from multiple candidate target regions based on the features of the target itself, which benefits the accuracy of target tracking.
Description
Technical field
The present application relates to the field of image processing, in particular to the field of computer vision, and more particularly to a method and apparatus for target tracking in video.
Background
Target tracking refers to establishing the positional relationship of a tracked object across a continuous video sequence, so as to obtain the object's complete motion trajectory. For example, given the coordinate position of a target in the first frame of an image sequence, target tracking computes the exact position of that target in the next frame.
During motion, the target may exhibit changes in its image appearance, such as changes in pose or shape, changes in scale, occlusion by the background, or changes in illumination. Research on target tracking algorithms, and their concrete applications, revolves around handling these variations.
In the prior art, a variety of target tracking algorithms exist, for example, particle filter methods, optical flow algorithms based on feature points, and tracking algorithms based on correlation filtering.
Summary of the invention
Embodiments of the present application propose a method and apparatus for target tracking in video.
In a first aspect, an embodiment of the present application provides a method for target tracking in video, comprising: cropping a candidate region from the current frame of a video based on the position of the target to be tracked in a historical frame of the video; inputting the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, where the feature map includes candidate target region information indicating the positions of candidate targets within the feature map; determining, based on the candidate target region information, candidate target regions in one-to-one correspondence with the candidate targets from the feature map; and taking, among the determined candidate target regions, the one with the highest similarity to the target to be tracked as the target to be tracked in the current frame.
In some embodiments, before inputting the cropped candidate region into the pre-trained fully convolutional network, the method further comprises a step of training the fully convolutional network, which comprises: establishing an initial fully convolutional network; obtaining a training sample set, where the training sample set includes multiple training sample pairs, and each training sample pair includes two frames of the same video file together with annotation information marking the region occupied by the target object in the two frames; and inputting the training sample set into the initial fully convolutional network and training it based on a preset loss function, to obtain the trained fully convolutional network.
In some embodiments, taking, among the determined candidate target regions, the one with the highest similarity to the target to be tracked as the target to be tracked in the current frame comprises: inputting each candidate target region cropped from the feature map into a preset pooling layer to obtain a candidate feature map corresponding to each candidate target region; for each candidate feature map, computing the similarity between that candidate feature map and the previously obtained feature map of the target to be tracked; and taking the candidate target region corresponding to the candidate feature map with the highest similarity to the previously obtained feature map of the target to be tracked as the target to be tracked in the current frame.
In some embodiments, after taking the candidate target region corresponding to the candidate feature map with the highest similarity to the previously obtained feature map of the target to be tracked as the target to be tracked in the current frame, the method further comprises: taking the candidate feature map with the highest similarity to the previously obtained feature map of the target to be tracked as the new feature map of the target to be tracked.
In some embodiments, the method further comprises: detecting the target to be tracked in the current frame of the video at preset time intervals; and updating the feature map of the target to be tracked based on the detected target.
In some embodiments, the historical frame and the current frame of the video are two adjacent frames of the video.
In a second aspect, an embodiment of the present application further provides an apparatus for target tracking in video, comprising: a cropping unit, configured to crop a candidate region from the current frame of a video based on the position of the target to be tracked in a historical frame of the video; a feature acquisition unit, configured to input the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, where the feature map includes candidate target region information indicating the positions of candidate targets within the feature map; a candidate target region determination unit, configured to determine, based on the candidate target region information, candidate target regions in one-to-one correspondence with the candidate targets from the feature map; and a target tracking unit, configured to take, among the determined candidate target regions, the one with the highest similarity to the target to be tracked as the target to be tracked in the current frame.
In some embodiments, the apparatus further includes a training unit, configured, before the feature acquisition unit inputs the cropped candidate region into the pre-trained fully convolutional network, to: establish an initial fully convolutional network; obtain a training sample set, where the training sample set includes multiple training sample pairs, and each training sample pair includes two frames of the same video file together with annotation information marking the region occupied by the target object in the two frames; and input the training sample set into the initial fully convolutional network and train it based on a preset loss function, to obtain the trained fully convolutional network.
In some embodiments, the target tracking unit is further configured to: input each candidate target region cropped from the feature map into a preset pooling layer to obtain a candidate feature map corresponding to each candidate target region; for each candidate feature map, compute the similarity between that candidate feature map and the previously obtained feature map of the target to be tracked; and take the candidate target region corresponding to the candidate feature map with the highest similarity to the previously obtained feature map of the target to be tracked as the target to be tracked in the current frame.
In some embodiments, the apparatus further includes a determination unit, configured, after the target tracking unit takes the candidate target region corresponding to the candidate feature map with the highest similarity to the previously obtained feature map of the target to be tracked as the target to be tracked in the current frame, to take that candidate feature map as the new feature map of the target to be tracked.
In some embodiments, the apparatus further includes: a detection unit, configured to detect the target to be tracked in the current frame of the video at preset time intervals; and an updating unit, configured to update the feature map of the target to be tracked based on the detected target.
In some embodiments, the historical frame and the current frame of the video are two adjacent frames of the video.
In a third aspect, an embodiment of the present application further provides a device, comprising: one or more processors; and a storage apparatus for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement any method of the first aspect.
In a fourth aspect, an embodiment of the present application further provides a computer-readable storage medium storing a computer program which, when executed by a processor, implements any method of the first aspect.
The method and apparatus for target tracking in video provided by the embodiments of the present application crop a candidate region from the current frame of a video based on the position of the target to be tracked in a historical frame of the video, input the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, then determine from the feature map, based on the candidate target region information, the candidate target regions in one-to-one correspondence with the candidate targets, and finally take, among the determined candidate target regions, the one with the highest similarity to the target to be tracked as the target to be tracked in the current frame. The target to be tracked in the current frame can thus be determined from multiple candidate target regions based on the features of the target itself, which benefits the accuracy of target tracking.
Description of the drawings
Other features, objects, and advantages of the present application will become more apparent by reading the following detailed description of non-limiting embodiments with reference to the accompanying drawings:
Fig. 1 is an exemplary system architecture diagram to which the present application may be applied;
Fig. 2 is a flowchart of one embodiment of the method for target tracking in video according to the present application;
Fig. 3A is a schematic diagram of the position of the target to be tracked in a historical frame of a video;
Fig. 3B is a schematic diagram of the candidate region cropped from the current frame of the video;
Fig. 3C is a schematic diagram of the candidate target regions within the candidate region;
Figs. 4A–4D are schematic diagrams of an application scenario of the method for target tracking in video according to the present application;
Fig. 5 is a flowchart of another embodiment of the method for target tracking in video according to the present application;
Fig. 6 is a schematic flowchart of the training method for the fully convolutional network used in the methods for target tracking in video of the embodiments of the present application;
Fig. 7 is a structural diagram of one embodiment of the apparatus for target tracking in video according to the present application;
Fig. 8 is a structural schematic diagram of a computer system suitable for implementing a server of the embodiments of the present application.
Detailed description
The present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the related invention, and are not a limitation of that invention. It should also be noted that, for convenience of description, only the parts related to the invention are shown in the drawings.
It should be noted that, in the absence of conflict, the embodiments of the present application and the features in the embodiments may be combined with each other. The present application is described in detail below with reference to the drawings and in conjunction with the embodiments.
Fig. 1 shows an exemplary system architecture 100 to which embodiments of the method for target tracking in video, or of the apparatus for target tracking in video, of the present application may be applied.
As shown in Fig. 1, the system architecture 100 may include terminal devices 101, 102, 103, a network 104, and a server 105. The network 104 serves as a medium providing communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired or wireless communication links, or fiber optic cables.
The terminal devices 101, 102, 103 may be hardware or software. When they are hardware, they may be various electronic devices having a display screen and supporting video playback, including but not limited to smartphones, tablet computers, e-book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop portable computers, desktop computers, and the like. When the terminal devices 101, 102, 103 are software, they may be installed in the electronic devices listed above. They may be implemented as multiple pieces of software or software modules (for example, to provide distributed services), or as a single piece of software or software module. No specific limitation is made here.
The server 105 may be a server providing various services, for example a background processing server providing support for the videos played on the terminal devices 101, 102, 103. The background processing server may analyze and process received data such as target tracking requests, and feed the processing result (for example, video data in which the region of the target to be tracked is marked in each video frame) back to the terminal devices.
It should be noted that the method for target tracking in video provided by the embodiments of the present application may be executed by the server 105, or alternatively by the terminal devices 101, 102, 103. Correspondingly, the apparatus for target tracking in video may be set in the server 105, or alternatively in the terminal devices 101, 102, 103.
It should be understood that the numbers of terminal devices 101, 102, 103, networks 104, and servers 105 in Fig. 1 are merely illustrative. Depending on implementation needs, there may be any number of terminal devices, networks, and servers.
With continued reference to Fig. 2, a flow 200 of one embodiment of the method for target tracking in video according to the present application is shown. The method for target tracking in video includes the following steps:
Step 201: based on the position of the target to be tracked in a historical frame of the video, crop a candidate region from the current frame of the video.
In this embodiment, the executing body of the method for target tracking in video (for example, the server shown in Fig. 1) may perform the target tracking operation in a video in response to receiving a target tracking request sent by a user.
Here, the target tracking request may, for example, include feature information of the target to be tracked. The feature information may be any information that can characterize the features of the target to be tracked. For example, in some application scenarios, the user wishes to track, in a video file obtained by filming a certain basketball match, the first player to score in the match. Then, in this application scenario, "the first player to score" may serve as the feature information of the target to be tracked.
Then, the executing body of the method of this embodiment may perform target detection in the video frames, in the playing order of the video file, according to a certain target detection algorithm. If the target is detected in a certain frame of the video file, that video frame may be used as the historical frame in this step, and the candidate region in which the target to be tracked is likely to appear in subsequently played video frames may be determined based on the position of the target to be tracked in that historical frame.
In some application scenarios, if the target to be tracked lies in the region (x1, y1, x2, y2) in the historical frame of the video, then in the current frame the range of a candidate region may be determined according to the extent of this rectangular region (x1, y1, x2, y2). Here, (x1, y1) and (x2, y2) may be the coordinate values, in a preset planar rectangular coordinate system, of the upper-left and lower-right corners of the rectangular region occupied by the target to be tracked in the historical frame. It can be understood that, in the current frame, the candidate region may or may not contain the rectangular region (x1, y1, x2, y2).
As shown in Fig. 3A, a historical frame 300A is illustrated. From the historical frame 300A, the rectangular region occupied by the target to be tracked 310 has been determined, and the coordinate values of the upper-left and lower-right corners of that rectangular region in the preset rectangular coordinate system (Oxy) are (x1, y1) and (x2, y2), respectively. At this point, a range may be cropped from the current frame as the candidate region.
As shown in Fig. 3B, the range (x3, y3, x4, y4) occupied by the candidate region 320 in the current frame of the video is schematically shown. It is easy to see that the candidate region (x3, y3, x4, y4) cropped in Fig. 3B contains the rectangular region (x1, y1, x2, y2).
The specific position and extent of the candidate region may be set according to prior knowledge and the specific application scenario. For example, they may be set according to the movement speed and movement range of the target to be tracked.
In some application scenarios, the target to be tracked may be a certain motor vehicle. In these scenarios, the position and size of the candidate region may be set according to the speed range of the motor vehicle (for example, 0–100 km/h) and the region occupied in the video frames by the road on which the vehicle travels.
It can be understood that, in the absence of prior knowledge, in order to avoid the target to be tracked not appearing in the candidate region selected from the current frame, in some application scenarios the whole region of the current frame may also be used as the candidate region.
Returning to Fig. 2, the method for target tracking in video of this embodiment further includes:
Step 202: input the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, where the feature map includes candidate target region information indicating the positions of candidate targets within the feature map.
In this step, the cropped candidate region (for example, the candidate region 320 shown in Fig. 3B) may be input into the pre-trained fully convolutional network to obtain a feature map.
A fully convolutional network (FCN) can receive an input image of arbitrary size. It upsamples the feature map of the last convolutional layer through deconvolution layers, restoring it to the same size as the input image, so that a prediction can be produced for each pixel while preserving the spatial information of the original input image. Finally, pixel-wise classification is performed on the feature map, which has the same size as the input, and the loss is computed with a pixel-by-pixel softmax classification, which is equivalent to treating each pixel as a training sample.
Precisely because the FCN has the above characteristics, no matter the size of the candidate region input to the FCN, the feature information of each pixel in the candidate region can be obtained through the FCN.
The feature map obtained by the FCN may, for example, include classification information and regression information for the pixels in the candidate region. Here, the classification information may indicate the probability that a pixel belongs to the target to be tracked, while the regression information may indicate the probable region in the feature map belonging to the target to be tracked. Specifically, the FCN may output feature maps of multiple channels; these feature maps may include, for each pixel in the candidate region, the probability that the pixel belongs to the target to be tracked and, for pixels that do belong to it, information on the relative position of the pixel with respect to the target to which it belongs.
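One way such multi-channel output could be decoded is sketched below: a single-channel score map gives each pixel's probability of belonging to the target, and a 4-channel regression map gives each pixel's distances to the edges of the box it belongs to. This channel layout is an assumption for illustration; the patent does not fix an exact output format.

```python
import numpy as np

def decode_fcn_output(score_map, box_map, threshold=0.5):
    """Turn per-pixel FCN outputs into candidate boxes.

    score_map: (H, W) probability that each pixel belongs to the target.
    box_map:   (4, H, W) per-pixel distances (left, top, right, bottom)
               from the pixel to the edges of the box it belongs to.
    Returns a list of (x1, y1, x2, y2, score) for pixels above threshold.
    """
    ys, xs = np.where(score_map > threshold)
    boxes = []
    for y, x in zip(ys, xs):
        l, t, r, b = box_map[:, y, x]
        boxes.append((x - l, y - t, x + r, y + b, float(score_map[y, x])))
    return boxes
```

Each surviving pixel thus proposes one bounding box; the next step of the flow groups these per-pixel boxes into candidate target regions.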
Step 203: based on the candidate target region information, determine from the feature map the candidate target regions in one-to-one correspondence with the candidate targets.
Through step 202, the classification information and regression information of each pixel in the candidate region can be obtained. It can be understood that each pixel in the candidate region whose probability of belonging to the target to be tracked exceeds a preset threshold has corresponding information on its relative position with respect to the target to which it belongs. In some application scenarios, this relative position information may be expressed as a bounding box surrounding the pixel. Here, the bounding box may, for example, be a rectangular region of the candidate region containing the pixel. Then, by clustering the bounding boxes determined from all pixels in the candidate region whose probability of belonging to the target to be tracked exceeds the preset threshold, the candidate target regions in one-to-one correspondence with the candidate targets can be determined. Fig. 3C schematically shows four candidate target regions 310a–310d obtained by clustering the bounding boxes in the candidate region 320.
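The clustering of per-pixel bounding boxes can be sketched as follows: boxes that overlap strongly (by intersection-over-union) are greedily grouped, and each group is averaged into one candidate target region. The greedy strategy and the 0.5 IoU threshold are illustrative assumptions; the patent does not specify a particular clustering algorithm.

```python
def iou(a, b):
    """Intersection-over-union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    union = area(a) + area(b) - inter
    return inter / union if union else 0.0

def cluster_boxes(boxes, iou_thr=0.5):
    """Greedily group per-pixel boxes whose IoU with a cluster's mean box
    exceeds iou_thr, and average each group into one candidate region."""
    clusters = []  # each cluster: [sum_x1, sum_y1, sum_x2, sum_y2, count]
    for box in boxes:
        for c in clusters:
            mean = tuple(v / c[4] for v in c[:4])
            if iou(box, mean) > iou_thr:
                for i in range(4):
                    c[i] += box[i]
                c[4] += 1
                break
        else:
            clusters.append([box[0], box[1], box[2], box[3], 1])
    return [tuple(v / c[4] for v in c[:4]) for c in clusters]
```

Two nearly coincident boxes collapse into one region, while a distant box forms its own cluster, yielding one region per candidate target, as in Fig. 3C.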
Step 204: take, among the determined candidate target regions, the one with the highest similarity to the target to be tracked as the target to be tracked in the current frame.
By comparing the similarity between each candidate target region and the target to be tracked, the candidate target region closest to the target to be tracked can be determined from the candidate target regions. That candidate target region can be considered the region in the current frame that is most likely to be the target to be tracked.
In some optional implementations, each candidate target region in the candidate region has been determined through step 203 above, and each pixel in a candidate target region has a probability of belonging to the target to be tracked. Therefore, in this step, the probability that a candidate target region is the target to be tracked may be determined based on the probabilities that the pixels in that region belong to the target, and this probability may be used as the similarity between the candidate target region and the target to be tracked. Then, among all candidate target regions, the one with the highest probability may be taken as the target to be tracked in the current frame.
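A minimal sketch of this optional implementation: score each candidate region by the mean of its pixels' target probabilities and take the argmax. Using the mean as the region-level score is an assumption for illustration; other aggregations of the per-pixel probabilities would fit the description equally well.

```python
import numpy as np

def pick_target(score_map, regions):
    """Score each candidate region (x1, y1, x2, y2) by the mean per-pixel
    target probability inside it, and return the highest-scoring region."""
    best, best_score = None, -1.0
    for (x1, y1, x2, y2) in regions:
        score = float(score_map[y1:y2, x1:x2].mean())
        if score > best_score:
            best, best_score = (x1, y1, x2, y2), score
    return best, best_score
```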
The method for target tracking in video of this embodiment crops a candidate region from the current frame of the video based on the position of the target to be tracked in a historical frame, inputs the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, then determines from the feature map, based on the candidate target region information, the candidate target regions in one-to-one correspondence with the candidate targets, and finally takes, among the determined candidate target regions, the one with the highest similarity to the target to be tracked as the target to be tracked in the current frame. It can thus determine the target to be tracked in the current frame from multiple candidate target regions based on the features of the target itself, which benefits the accuracy of target tracking.
Figs. 4A–4D are schematic diagrams of an application scenario of the method for target tracking in video of this embodiment.
In this scenario, suppose Chinese athlete A, Japanese athlete B, and South Korean athlete C participate together in a certain long-distance race, and it is desired to track Chinese athlete A.
Suppose that in some historical frame, Chinese athlete A is at the position shown in Fig. 4B.
Then, based on the position of Chinese athlete A in the historical frame shown in Fig. 4B, a candidate region 410 is determined from the current frame shown in Fig. 4C.
Next, the candidate region 410 is cropped and input into the pre-trained fully convolutional network, yielding three candidate target regions 410a, 410b, and 410c, as shown in Fig. 4D. By computing the similarity between the candidate target regions 410a–410c and the target to be tracked (for example, the features of Chinese athlete A extracted from the historical frame of Fig. 4B), it can be determined that candidate target region 410b among 410a–410c is Chinese athlete A.
Fig. 5 shows a schematic flowchart 500 of another embodiment of the method for target tracking in video of the present application.
The method of this embodiment includes:
Step 501: based on the position of the target to be tracked in a historical frame of the video, crop a candidate region from the current frame of the video.
Step 502: input the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, where the feature map includes candidate target region information indicating the positions of candidate targets within the feature map.
Step 503: based on the candidate target region information, determine from the feature map the candidate target regions in one-to-one correspondence with the candidate targets.
Steps 501–503 above may be executed in a manner similar to steps 201–203 of the embodiment shown in Fig. 2, and will not be described again here.
Through steps 501–503 above, the candidate target regions in one-to-one correspondence with the candidate targets can be determined from the feature map; for example, the candidate target regions may take the form shown by 310a–310d in Fig. 3C.
Step 504: input each candidate target region cropped from the feature map into a preset pooling layer to obtain a candidate feature map corresponding to each candidate target region.
Here, each candidate target region cropped from the feature map can serve as an ROI (region of interest). By applying a pooling operation to these ROIs, candidate feature maps of identical size, each corresponding to one candidate target region, can be obtained. The pooling operation of this step may, for example, be max pooling, mean pooling, stochastic pooling, or the like.
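The ROI pooling of step 504 can be sketched as follows, using the max-pooling variant: each ROI is divided into a fixed grid of bins and the maximum of each bin is taken, so ROIs of different shapes all yield same-size candidate feature maps. The 2-D single-channel feature map and the fixed 4x4 output size are illustrative assumptions.

```python
import numpy as np

def roi_max_pool(feature_map, roi, out_size=(4, 4)):
    """Max-pool one ROI (x1, y1, x2, y2) of a 2-D feature map to out_size."""
    x1, y1, x2, y2 = roi
    patch = feature_map[y1:y2, x1:x2]
    oh, ow = out_size
    h, w = patch.shape
    # Split the ROI into an oh x ow grid of bins and take the max of each bin.
    ys = np.linspace(0, h, oh + 1).astype(int)
    xs = np.linspace(0, w, ow + 1).astype(int)
    out = np.empty((oh, ow), dtype=patch.dtype)
    for i in range(oh):
        for j in range(ow):
            out[i, j] = patch[ys[i]:ys[i + 1], xs[j]:xs[j + 1]].max()
    return out
```

Because every ROI maps to the same output size, the candidate feature maps can be compared directly against the target's feature map in the next step.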
Step 505: for each candidate feature map, compute the similarity between that candidate feature map and the previously obtained feature map of the target to be tracked.
Here, the feature map of the target to be tracked determined from the historical frame may serve as the previously obtained feature map of the target to be tracked. Then, the previously obtained feature map of the target to be tracked has the same size as each candidate feature map obtained by executing steps 501–504 on the current frame.
In this step, any existing or future way of computing the similarity between a candidate feature map and the previously obtained feature map of the target to be tracked may be used, including but not limited to Euclidean distance, cosine similarity, and the like.
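As one of the similarity measures mentioned, cosine similarity between same-size feature maps can be computed by flattening them to vectors; this is a generic sketch, not the patent's prescribed measure.

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two same-size feature maps, flattened."""
    a, b = np.ravel(a).astype(float), np.ravel(b).astype(float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def best_match(template, candidates):
    """Index of the candidate feature map most similar to the template."""
    sims = [cosine_similarity(template, c) for c in candidates]
    return int(np.argmax(sims)), sims
```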
Step 506, by the highest candidate feature figure of similarity between the clarification of objective figure to be tracked obtained in advance
Corresponding candidate target region is as the target to be tracked in present frame.
It can be understood that the higher the similarity between a candidate feature map and the pre-obtained feature map of the target to be tracked, the more likely it is that the region corresponding to that candidate feature map is the target to be tracked in the current frame. By comparing the similarity of each candidate feature map against the pre-obtained feature map of the target to be tracked, the region in the current frame most likely to be the target to be tracked can be determined.
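The selection in step 506 amounts to an arg-max over similarity scores, as in the following sketch; the cosine measure and the function name are illustrative assumptions:

```python
import numpy as np

def pick_best_candidate(candidate_maps, template_map):
    """Index of the candidate feature map most similar (cosine) to the template."""
    t = template_map.ravel()
    t = t / np.linalg.norm(t)
    scores = [c.ravel() @ t / np.linalg.norm(c) for c in candidate_maps]
    return int(np.argmax(scores))
```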
In the method for target tracking in video of this embodiment, the candidate feature map of each candidate target region is obtained by ROI pooling, so that all resulting candidate feature maps have the same size. This reduces the amount of computation in the similarity calculation and further improves the accuracy of determining the target to be tracked from among the candidates.
In some optional implementations of this embodiment, after the candidate feature map with the highest similarity to the pre-obtained feature map of the target to be tracked has been determined, that highest-similarity candidate feature map may be taken as the new feature map of the target to be tracked. In this way, when tracking the target in subsequent frames of the video, the new feature map serves as the reference for similarity computation. In application scenarios where the appearance of the target to be tracked gradually changes over time, updating the feature map of the target to be tracked in this manner can further improve tracking accuracy.
In some optional implementations of this embodiment, the feature map of the target to be tracked may also be updated by re-detecting the target to be tracked at regular time intervals.
Specifically, continuing to refer to Fig. 5, in step 507, the target to be tracked may be detected in the current frame of the video at preset time intervals.
Here, the preset time interval can be set according to the application scenario and the motion characteristics of the target to be tracked in the video. The interval may be a fixed value or a variable one. For example, in some application scenarios, detection of the target to be tracked may be performed once every 100 frames. The detection of the target to be tracked may be implemented with any existing object detection algorithm, which is not described in detail here.
Then, in step 508, the feature map of the target to be tracked is updated based on the detected target to be tracked.
In this way, updating the feature map of the target to be tracked avoids the error accumulation in that feature map that repeated similarity operations would otherwise cause, thereby improving tracking accuracy.
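Steps 507 and 508 can be sketched together as the following control loop, in which a hypothetical `detect` callable stands in for the full object detector and `track_one` for the similarity-based tracking step; the names and the interval value are illustrative assumptions, not elements of the disclosure:

```python
def track_with_refresh(frames, detect, track_one, interval=100):
    """Track frame by frame, re-running full detection every `interval` frames
    to refresh the template and curb accumulated similarity-matching drift."""
    template = None
    results = []
    for i, frame in enumerate(frames):
        if template is None or i % interval == 0:
            template = detect(frame)               # full detection refreshes the template
        else:
            template = track_one(frame, template)  # similarity-based tracking step
        results.append(template)
    return results
```

With a fixed interval the refresh cost is amortized over many cheap tracking steps; a variable interval could instead trigger re-detection when similarity scores drop.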
In some optional implementations, the fully convolutional network used in the above embodiments of the present application may be trained in the manner shown in Fig. 6.
Specifically, in step 601, an initial fully convolutional network is established.
Here, an initial fully convolutional network with multiple convolutional layers can be built, and the parameters of this initial network assigned initial values.
Step 602: obtain a training sample set. The training sample set includes multiple training sample pairs; each training sample pair includes two frames of images from the same video file and annotation information for marking the region occupied by the target object in the two frames.
The annotation information may be any information that distinguishes the target object from non-target objects in a video frame. For example, each pixel belonging to the target object in a video frame may be labeled "1", and each pixel of other, non-target objects in the video frame labeled "0".
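The labeling scheme in this example can be sketched as follows; the `(y0, x0, y1, x1)` axis-aligned box layout and the function name are assumptions made for illustration:

```python
import numpy as np

def make_label_mask(height, width, box):
    """Binary annotation mask: 1 for pixels inside the target box, 0 elsewhere.

    box is (y0, x0, y1, x1); an assumed axis-aligned layout for illustration.
    """
    mask = np.zeros((height, width), dtype=np.uint8)
    y0, x0, y1, x1 = box
    mask[y0:y1, x0:x1] = 1
    return mask
```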
Step 603: input the training sample set into the initial fully convolutional network and train it based on a preset loss function, obtaining the trained fully convolutional network.
After the training sample set is input into the initial fully convolutional network, the network can output a feature map. The feature map can carry the probability that each pixel belongs to the target object (that is, classification information) and, for a pixel belonging to the target to be tracked, information on that pixel's position relative to the target it belongs to (that is, regression information).
By feeding the classification information, the regression information, and the annotation information into the preset loss function, a loss value can be derived. Back-propagating the loss value through the fully convolutional network allows each parameter of the network to be adjusted.
Thus, by cyclically feeding the training sample set into the fully convolutional network, computing loss values, and back-propagating them, the parameters of the network are adjusted continuously until a training-completion condition is reached (for example, the loss value falls below a preset loss threshold).
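The loop just described can be sketched generically as below. The plain gradient step stands in for back-propagation through the fully convolutional network, and the callable interface is an assumption for illustration, not the disclosed training procedure:

```python
import numpy as np

def train_until(loss_and_grad, params, lr=0.1, loss_threshold=1e-3, max_steps=1000):
    """Feed samples, compute a loss, adjust every parameter via a gradient step,
    and stop once the loss falls below the preset threshold."""
    for step in range(max_steps):
        loss, grad = loss_and_grad(params)
        if loss < loss_threshold:
            return params, loss, step
        params = params - lr * grad  # stand-in for back-propagation update
    return params, loss, max_steps
```

In practice `loss_and_grad` would combine the classification and regression terms against the annotation information; here any callable returning `(loss, gradient)` can be plugged in.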
In some optional implementations of the method for target tracking in video of the embodiments of the present application, the historical frame of the video and the current frame of the video can be two adjacent frames. In this way, the target can be tracked in the video frame by frame, which benefits the continuity of tracking.
With further reference to Fig. 7, as an implementation of the methods shown in the figures above, the present application provides an embodiment of an apparatus for target tracking in video. This apparatus embodiment corresponds to the method embodiment shown in Fig. 2, and the apparatus can be applied to various electronic devices.
As shown in Fig. 7, the apparatus for target tracking in video of this embodiment may include an interception unit 701, a feature acquiring unit 702, a candidate target region determination unit 703, and a target tracking unit 704.
The interception unit 701 is configured to crop a candidate region from the current frame of the video based on the position of the target to be tracked in a historical frame of the video.
The feature acquiring unit 702 is configured to input the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, where the feature map includes candidate target region information indicating the positions of candidate targets in the feature map.
The candidate target region determination unit 703 is configured to determine, from the feature map and based on the candidate target region information, candidate target regions in one-to-one correspondence with the candidate targets.
The target tracking unit 704 is configured to take, among the determined candidate target regions, the candidate target region with the highest similarity to the target to be tracked as the target to be tracked in the current frame.
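For illustration only, the four units could be composed as in the following sketch, where each callable is a hypothetical stand-in for the corresponding unit in Fig. 7; none of the names below appear in the disclosure:

```python
class VideoTargetTracker:
    """Minimal sketch of the apparatus: four callables composed into one tracker."""

    def __init__(self, crop_candidate, fcn_features, split_candidates, pick_best):
        self.crop_candidate = crop_candidate      # interception unit 701
        self.fcn_features = fcn_features          # feature acquiring unit 702
        self.split_candidates = split_candidates  # candidate region unit 703
        self.pick_best = pick_best                # target tracking unit 704

    def track(self, current_frame, last_position):
        region = self.crop_candidate(current_frame, last_position)
        feature_map, region_info = self.fcn_features(region)
        candidates = self.split_candidates(feature_map, region_info)
        return self.pick_best(candidates)
```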
In some optional implementations, the apparatus for target tracking in video of this embodiment may further include a training unit (not shown).
In these optional implementations, the training unit is configured to, before the feature acquiring unit inputs the cropped candidate region into the pre-trained fully convolutional network: establish an initial fully convolutional network; obtain a training sample set, where the training sample set includes multiple training sample pairs, each including two frames of images from the same video file and annotation information for marking the region occupied by the target object in the two frames; and input the training sample set into the initial fully convolutional network and train it based on a preset loss function to obtain the trained fully convolutional network.
In some optional implementations, the target tracking unit 704 may be further configured to: input each candidate target region cropped from the feature map into a preset pooling layer to obtain a candidate feature map corresponding to each candidate target region; for each candidate feature map, compute the similarity between that candidate feature map and a pre-obtained feature map of the target to be tracked; and take the candidate target region corresponding to the candidate feature map with the highest similarity to the pre-obtained feature map of the target to be tracked as the target to be tracked in the current frame.
In some optional implementations, the apparatus for target tracking in video of this embodiment may further include a determination unit (not shown).
In these optional implementations, the determination unit is configured to, after the target tracking unit takes the candidate target region corresponding to the candidate feature map with the highest similarity to the pre-obtained feature map of the target to be tracked as the target to be tracked in the current frame, take that highest-similarity candidate feature map as the feature map of the target to be tracked.
In some optional implementations, the apparatus for target tracking in video of this embodiment may further include a detection unit (not shown) and an updating unit (not shown).
In these optional implementations, the detection unit is configured to detect the target to be tracked in the current frame of the video at preset time intervals.
The updating unit is configured to update the feature map of the target to be tracked based on the detected target to be tracked.
In some optional implementations, the historical frame of the video and the current frame of the video are two adjacent frames of the video.
Referring now to Fig. 8, it shows a structural schematic diagram of a computer system 800 suitable for implementing a terminal device/server of the embodiments of the present application. The terminal device/server shown in Fig. 8 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of the present application.
As shown in Fig. 8, the computer system 800 includes a central processing unit (CPU) 801, which can execute various appropriate actions and processes according to a program stored in a read-only memory (ROM) 802 or a program loaded from a storage portion 808 into a random access memory (RAM) 803. The RAM 803 also stores various programs and data required for the operation of the system 800. The CPU 801, the ROM 802, and the RAM 803 are connected to one another through a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.
The following components are connected to the I/O interface 805: an input portion 806 including a keyboard, a mouse, and the like; an output portion 807 including a cathode ray tube (CRT), a liquid crystal display (LCD), and the like, as well as a speaker; a storage portion 808 including a hard disk and the like; and a communications portion 809 including a network interface card such as a LAN card, a modem, and the like. The communications portion 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as needed. A removable medium 811, such as a magnetic disk, an optical disc, a magneto-optical disc, or a semiconductor memory, is mounted on the drive 810 as needed, so that a computer program read from it may be installed into the storage portion 808 as needed.
In particular, according to the embodiments of the present disclosure, the process described above with reference to the flowchart may be implemented as a computer software program. For example, an embodiment of the present disclosure includes a computer program product, which comprises a computer program carried on a computer-readable medium, the computer program containing program code for executing the method shown in the flowchart. In such an embodiment, the computer program may be downloaded and installed from a network through the communications portion 809, and/or installed from the removable medium 811. When the computer program is executed by the central processing unit (CPU) 801, the above-mentioned functions defined in the method of the present application are performed. It should be noted that the computer-readable medium described in the present application may be a computer-readable signal medium, a computer-readable storage medium, or any combination of the two. A computer-readable storage medium may be, for example, but is not limited to, an electric, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the above. More specific examples of the computer-readable storage medium may include, but are not limited to: an electrical connection with one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the above. In the present application, a computer-readable storage medium may be any tangible medium that contains or stores a program, where the program can be used by or in connection with an instruction execution system, apparatus, or device. In the present application, a computer-readable signal medium may include a data signal propagated in a baseband or as part of a carrier wave, in which computer-readable program code is carried. Such a propagated data signal may take many forms, including but not limited to an electromagnetic signal, an optical signal, or any suitable combination of the above. A computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium; such a medium can send, propagate, or transmit a program for use by or in connection with an instruction execution system, apparatus, or device. Program code contained on a computer-readable medium may be transmitted over any suitable medium, including but not limited to wireless, wire, optical cable, RF, and the like, or any suitable combination of the above.
Computer program code for executing the operations of the present application may be written in one or more programming languages or a combination thereof, including object-oriented programming languages such as Java, Smalltalk, and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server. In cases involving a remote computer, the remote computer may be connected to the user's computer through any kind of network, including a local area network (LAN) or a wide area network (WAN), or it may be connected to an external computer (for example, through the Internet using an Internet service provider).
The flowcharts and block diagrams in the accompanying drawings illustrate the possible architectures, functions, and operations of the systems, methods, and computer program products according to various embodiments of the present application. In this regard, each box in a flowchart or block diagram may represent a module, program segment, or portion of code, which contains one or more executable instructions for implementing the specified logical function. It should also be noted that, in some alternative implementations, the functions noted in the boxes may occur in an order different from that shown in the drawings. For example, two boxes shown in succession may in fact be executed substantially in parallel, or sometimes in the reverse order, depending on the functions involved. It should also be noted that each box in the block diagrams and/or flowcharts, and combinations of boxes therein, can be implemented by a dedicated hardware-based system that performs the specified functions or operations, or by a combination of dedicated hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software or by hardware. The described units may also be disposed in a processor, which may, for example, be described as: a processor comprising an interception unit, a feature acquiring unit, a candidate target region determination unit, and a target tracking unit. The names of these units do not, in some cases, limit the units themselves; for example, the interception unit may also be described as "a unit that crops a candidate region from the current frame of a video based on the position of the target to be tracked in a historical frame of the video".
As another aspect, the present application also provides a computer-readable medium, which may be included in the apparatus described in the above embodiments, or may exist separately without being assembled into the apparatus. The computer-readable medium carries one or more programs which, when executed by the apparatus, cause the apparatus to: crop a candidate region from the current frame of a video based on the position of the target to be tracked in a historical frame of the video; input the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, where the feature map includes candidate target region information indicating the positions of candidate targets in the feature map; determine, from the feature map and based on the candidate target region information, candidate target regions in one-to-one correspondence with the candidate targets; and take, among the determined candidate target regions, the candidate target region with the highest similarity to the target to be tracked as the target to be tracked in the current frame.
The above description is only a preferred embodiment of the present application and an explanation of the applied technical principles. Those skilled in the art should understand that the scope of the invention involved in the present application is not limited to technical solutions formed by the specific combination of the above technical features; it should also cover, without departing from the above inventive concept, other technical solutions formed by any combination of the above technical features or their equivalents, for example, technical solutions in which the above features are replaced with (but not limited to) technical features of similar function disclosed in the present application.
Claims (14)
1. A method for target tracking in video, comprising:
cropping a candidate region from a current frame of a video based on the position of a target to be tracked in a historical frame of the video;
inputting the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, wherein the feature map includes candidate target region information indicating the positions of candidate targets in the feature map;
determining, from the feature map and based on the candidate target region information, candidate target regions in one-to-one correspondence with the candidate targets; and
taking, among the determined candidate target regions, the candidate target region with the highest similarity to the target to be tracked as the target to be tracked in the current frame.
2. The method according to claim 1, wherein, before the step of inputting the cropped candidate region into the pre-trained fully convolutional network, the method further comprises a step of training the fully convolutional network, the training comprising:
establishing an initial fully convolutional network;
obtaining a training sample set, the training sample set including multiple training sample pairs, each training sample pair including two frames of images from the same video file and annotation information for marking the region occupied by a target object in the two frames; and
inputting the training sample set into the initial fully convolutional network and training the initial fully convolutional network based on a preset loss function to obtain the trained fully convolutional network.
3. The method according to claim 1, wherein taking, among the determined candidate target regions, the candidate target region with the highest similarity to the target to be tracked as the target to be tracked in the current frame comprises:
inputting each candidate target region cropped from the feature map into a preset pooling layer to obtain a candidate feature map corresponding to each candidate target region;
for each candidate feature map, computing the similarity between the candidate feature map and a pre-obtained feature map of the target to be tracked; and
taking the candidate target region corresponding to the candidate feature map with the highest similarity to the pre-obtained feature map of the target to be tracked as the target to be tracked in the current frame.
4. The method according to claim 3, wherein, after taking the candidate target region corresponding to the candidate feature map with the highest similarity to the pre-obtained feature map of the target to be tracked as the target to be tracked in the current frame, the method further comprises:
taking the candidate feature map with the highest similarity to the pre-obtained feature map of the target to be tracked as the feature map of the target to be tracked.
5. The method according to claim 4, wherein the method further comprises:
detecting the target to be tracked in the current frame of the video at preset time intervals; and
updating the feature map of the target to be tracked based on the detected target to be tracked.
6. The method according to any one of claims 1-5, wherein the historical frame of the video and the current frame of the video are two adjacent frames of the video.
7. An apparatus for target tracking in video, comprising:
an interception unit, configured to crop a candidate region from a current frame of a video based on the position of a target to be tracked in a historical frame of the video;
a feature acquiring unit, configured to input the cropped candidate region into a pre-trained fully convolutional network to obtain a feature map, wherein the feature map includes candidate target region information indicating the positions of candidate targets in the feature map;
a candidate target region determination unit, configured to determine, from the feature map and based on the candidate target region information, candidate target regions in one-to-one correspondence with the candidate targets; and
a target tracking unit, configured to take, among the determined candidate target regions, the candidate target region with the highest similarity to the target to be tracked as the target to be tracked in the current frame.
8. The apparatus according to claim 7, wherein the apparatus further comprises a training unit, the training unit configured to, before the feature acquiring unit inputs the cropped candidate region into the pre-trained fully convolutional network:
establish an initial fully convolutional network;
obtain a training sample set, the training sample set including multiple training sample pairs, each training sample pair including two frames of images from the same video file and annotation information for marking the region occupied by a target object in the two frames; and
input the training sample set into the initial fully convolutional network and train the initial fully convolutional network based on a preset loss function to obtain the trained fully convolutional network.
9. The apparatus according to claim 7, wherein the target tracking unit is further configured to:
input each candidate target region cropped from the feature map into a preset pooling layer to obtain a candidate feature map corresponding to each candidate target region;
for each candidate feature map, compute the similarity between the candidate feature map and a pre-obtained feature map of the target to be tracked; and
take the candidate target region corresponding to the candidate feature map with the highest similarity to the pre-obtained feature map of the target to be tracked as the target to be tracked in the current frame.
10. The apparatus according to claim 9, wherein the apparatus further comprises a determination unit;
the determination unit is configured to, after the target tracking unit takes the candidate target region corresponding to the candidate feature map with the highest similarity to the pre-obtained feature map of the target to be tracked as the target to be tracked in the current frame, take the candidate feature map with the highest similarity to the pre-obtained feature map of the target to be tracked as the feature map of the target to be tracked.
11. The apparatus according to claim 10, wherein the apparatus further comprises:
a detection unit, configured to detect the target to be tracked in the current frame of the video at preset time intervals; and
an updating unit, configured to update the feature map of the target to be tracked based on the detected target to be tracked.
12. The apparatus according to any one of claims 7-11, wherein the historical frame of the video and the current frame of the video are two adjacent frames of the video.
13. A device, comprising:
one or more processors; and
a storage apparatus for storing one or more programs,
wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the method according to any one of claims 1-6.
14. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the method according to any one of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810276460.5A CN108491816A (en) | 2018-03-30 | 2018-03-30 | The method and apparatus for carrying out target following in video |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108491816A true CN108491816A (en) | 2018-09-04 |
Family
ID=63317744
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810276460.5A Pending CN108491816A (en) | 2018-03-30 | 2018-03-30 | The method and apparatus for carrying out target following in video |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108491816A (en) |
Cited By (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109492579A (en) * | 2018-11-08 | 2019-03-19 | 广东工业大学 | A kind of video object detection method and system based on ST-SIN |
CN110084835A (en) * | 2019-06-06 | 2019-08-02 | 北京字节跳动网络技术有限公司 | Method and apparatus for handling video |
CN110211158A (en) * | 2019-06-04 | 2019-09-06 | 海信集团有限公司 | Candidate region determines method, apparatus and storage medium |
CN110472728A (en) * | 2019-07-30 | 2019-11-19 | 腾讯科技(深圳)有限公司 | Target information determines method, target information determining device, medium and electronic equipment |
CN110490902A (en) * | 2019-08-02 | 2019-11-22 | 西安天和防务技术股份有限公司 | Method for tracking target, device, computer equipment applied to smart city |
CN110930434A (en) * | 2019-11-21 | 2020-03-27 | 腾讯科技(深圳)有限公司 | Target object tracking method and device, storage medium and computer equipment |
CN110955259A (en) * | 2019-11-28 | 2020-04-03 | 上海歌尔泰克机器人有限公司 | Unmanned aerial vehicle, tracking method thereof and computer-readable storage medium |
WO2020093724A1 (en) * | 2018-11-06 | 2020-05-14 | 北京字节跳动网络技术有限公司 | Method and device for generating information |
CN111275741A (en) * | 2020-01-19 | 2020-06-12 | 北京迈格威科技有限公司 | Target tracking method and device, computer equipment and storage medium |
CN111368101A (en) * | 2020-03-05 | 2020-07-03 | 腾讯科技(深圳)有限公司 | Multimedia resource information display method, device, equipment and storage medium |
CN111402294A (en) * | 2020-03-10 | 2020-07-10 | 腾讯科技(深圳)有限公司 | Target tracking method, target tracking device, computer-readable storage medium and computer equipment |
CN111428535A (en) * | 2019-01-09 | 2020-07-17 | 佳能株式会社 | Image processing apparatus and method, and image processing system |
CN111524165A (en) * | 2020-04-22 | 2020-08-11 | 北京百度网讯科技有限公司 | Target tracking method and device |
CN111539991A (en) * | 2020-04-28 | 2020-08-14 | 北京市商汤科技开发有限公司 | Target tracking method and device and storage medium |
CN112241670A (en) * | 2019-07-18 | 2021-01-19 | 杭州海康威视数字技术股份有限公司 | Image processing method and device |
CN112347817A (en) * | 2019-08-08 | 2021-02-09 | 初速度(苏州)科技有限公司 | Video target detection and tracking method and device |
Citations (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101770648A (en) * | 2009-01-06 | 2010-07-07 | 北京中星微电子有限公司 | Video monitoring based loitering system and method thereof |
CN101867798A (en) * | 2010-05-18 | 2010-10-20 | 武汉大学 | Mean shift moving object tracking method based on compressed domain analysis |
CN102881022A (en) * | 2012-07-20 | 2013-01-16 | 西安电子科技大学 | Concealed-target tracking method based on on-line learning |
CN104484889A (en) * | 2014-12-15 | 2015-04-01 | 三峡大学 | Target tracking method and device |
CN104794733A (en) * | 2014-01-20 | 2015-07-22 | 株式会社理光 | Object tracking method and device |
CN105335986A (en) * | 2015-09-10 | 2016-02-17 | 西安电子科技大学 | Characteristic matching and MeanShift algorithm-based target tracking method |
CN105730336A (en) * | 2014-12-10 | 2016-07-06 | 比亚迪股份有限公司 | Reverse driving assistant and vehicle |
CN106097391A (en) * | 2016-06-13 | 2016-11-09 | 浙江工商大学 | A kind of multi-object tracking method identifying auxiliary based on deep neural network |
WO2017015947A1 (en) * | 2015-07-30 | 2017-02-02 | Xiaogang Wang | A system and a method for object tracking |
CN106650630A (en) * | 2016-11-11 | 2017-05-10 | 纳恩博(北京)科技有限公司 | Target tracking method and electronic equipment |
CN106709936A (en) * | 2016-12-14 | 2017-05-24 | 北京工业大学 | Single target tracking method based on convolution neural network |
CN106909885A (en) * | 2017-01-19 | 2017-06-30 | 博康智能信息技术有限公司上海分公司 | A kind of method for tracking target and device based on target candidate |
CN107330920A (en) * | 2017-06-28 | 2017-11-07 | 华中科技大学 | A kind of monitor video multi-target tracking method based on deep learning |
CN107423707A (en) * | 2017-07-25 | 2017-12-01 | 深圳帕罗人工智能科技有限公司 | A kind of face Emotion identification method based under complex environment |
2018-03-30: Application CN201810276460.5A filed (CN); published as CN108491816A, status Pending
Non-Patent Citations (1)
Title |
---|
Shaoqing Ren et al.: "Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks", IEEE Transactions on Pattern Analysis and Machine Intelligence * |
Cited By (28)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2020093724A1 (en) * | 2018-11-06 | 2020-05-14 | 北京字节跳动网络技术有限公司 | Method and device for generating information |
CN109492579A (en) * | 2018-11-08 | 2019-03-19 | 广东工业大学 | ST-SIN-based video object detection method and system |
CN109492579B (en) * | 2018-11-08 | 2022-05-10 | 广东工业大学 | ST-SIN-based video object detection method and system |
CN111428535A (en) * | 2019-01-09 | 2020-07-17 | 佳能株式会社 | Image processing apparatus and method, and image processing system |
CN110211158B (en) * | 2019-06-04 | 2023-03-28 | 海信集团有限公司 | Candidate area determination method, device and storage medium |
CN110211158A (en) * | 2019-06-04 | 2019-09-06 | 海信集团有限公司 | Candidate region determination method, device and storage medium |
CN110084835A (en) * | 2019-06-06 | 2019-08-02 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing video |
CN110084835B (en) * | 2019-06-06 | 2020-08-21 | 北京字节跳动网络技术有限公司 | Method and apparatus for processing video |
CN112241670B (en) * | 2019-07-18 | 2024-03-01 | 杭州海康威视数字技术股份有限公司 | Image processing method and device |
CN112241670A (en) * | 2019-07-18 | 2021-01-19 | 杭州海康威视数字技术股份有限公司 | Image processing method and device |
CN110472728A (en) * | 2019-07-30 | 2019-11-19 | 腾讯科技(深圳)有限公司 | Target information determination method and device, medium and electronic device |
CN110490902A (en) * | 2019-08-02 | 2019-11-22 | 西安天和防务技术股份有限公司 | Target tracking method, device and computer equipment for smart cities |
CN112347817A (en) * | 2019-08-08 | 2021-02-09 | 初速度(苏州)科技有限公司 | Video target detection and tracking method and device |
CN112347817B (en) * | 2019-08-08 | 2022-05-17 | 魔门塔(苏州)科技有限公司 | Video target detection and tracking method and device |
CN110930434A (en) * | 2019-11-21 | 2020-03-27 | 腾讯科技(深圳)有限公司 | Target object tracking method and device, storage medium and computer equipment |
CN110930434B (en) * | 2019-11-21 | 2023-05-12 | 腾讯科技(深圳)有限公司 | Target object tracking method, device, storage medium and computer equipment |
CN110955259B (en) * | 2019-11-28 | 2023-08-29 | 上海歌尔泰克机器人有限公司 | Unmanned aerial vehicle, tracking method thereof and computer readable storage medium |
CN110955259A (en) * | 2019-11-28 | 2020-04-03 | 上海歌尔泰克机器人有限公司 | Unmanned aerial vehicle, tracking method thereof and computer-readable storage medium |
CN111275741B (en) * | 2020-01-19 | 2023-09-08 | 北京迈格威科技有限公司 | Target tracking method, device, computer equipment and storage medium |
CN111275741A (en) * | 2020-01-19 | 2020-06-12 | 北京迈格威科技有限公司 | Target tracking method and device, computer equipment and storage medium |
CN111368101A (en) * | 2020-03-05 | 2020-07-03 | 腾讯科技(深圳)有限公司 | Multimedia resource information display method, device, equipment and storage medium |
CN111368101B (en) * | 2020-03-05 | 2021-06-18 | 腾讯科技(深圳)有限公司 | Multimedia resource information display method, device, equipment and storage medium |
CN111402294A (en) * | 2020-03-10 | 2020-07-10 | 腾讯科技(深圳)有限公司 | Target tracking method, target tracking device, computer-readable storage medium and computer equipment |
CN111402294B (en) * | 2020-03-10 | 2022-10-18 | 腾讯科技(深圳)有限公司 | Target tracking method, target tracking device, computer-readable storage medium and computer equipment |
CN111524165B (en) * | 2020-04-22 | 2023-08-25 | 北京百度网讯科技有限公司 | Target tracking method and device |
CN111524165A (en) * | 2020-04-22 | 2020-08-11 | 北京百度网讯科技有限公司 | Target tracking method and device |
CN111539991A (en) * | 2020-04-28 | 2020-08-14 | 北京市商汤科技开发有限公司 | Target tracking method and device and storage medium |
CN111539991B (en) * | 2020-04-28 | 2023-10-20 | 北京市商汤科技开发有限公司 | Target tracking method and device and storage medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108491816A (en) | Method and apparatus for target tracking in video | |
CN108846440B (en) | Image processing method and device, computer readable medium and electronic equipment | |
CN108898086A (en) | Video image processing method and device, computer-readable medium and electronic device | |
CN110399848A (en) | Video cover generation method, device and electronic equipment | |
CN108830235A (en) | Method and apparatus for generating information | |
CN110381368A (en) | Video cover generation method, device and electronic equipment | |
CN112101305B (en) | Multi-path image processing method and device and electronic equipment | |
CN111091166B (en) | Image processing model training method, image processing device, and storage medium | |
CN110033423B (en) | Method and apparatus for processing image | |
CN110059623B (en) | Method and apparatus for generating information | |
CN109377508A (en) | Image processing method and device | |
CN112073748A (en) | Panoramic video processing method and device and storage medium | |
CN110035236A (en) | Image processing method, device and electronic equipment | |
CN109118456A (en) | Image processing method and device | |
CN108446658A (en) | Method and apparatus for recognizing facial images | |
CN109300139A (en) | Method for detecting lane lines and device | |
CN108595211A (en) | Method and apparatus for outputting data | |
CN110288037A (en) | Image processing method, device and electronic equipment | |
CN110287350A (en) | Image search method, device and electronic equipment | |
CN111310595B (en) | Method and device for generating information | |
CN109446379A (en) | Method and apparatus for handling information | |
CN104541304A (en) | Target object angle determination using multiple cameras | |
CN111598923B (en) | Target tracking method and device, computer equipment and storage medium | |
CN108595011A (en) | Information displaying method, device, storage medium and electronic equipment | |
CN110378936B (en) | Optical flow calculation method and device and electronic equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 2018-09-04 |