CN109558505A - Visual search method, apparatus, computer equipment and storage medium - Google Patents

Visual search method, apparatus, computer equipment and storage medium

Info

Publication number
CN109558505A
CN109558505A (application CN201811392516.XA)
Authority
CN
China
Prior art keywords
subject
frame image
visual search
tracking
image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811392516.XA
Other languages
Chinese (zh)
Inventor
张柳清
李国洪
邱鑫
高树会
张亚洲
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Baidu Online Network Technology Beijing Co Ltd
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN201811392516.XA priority Critical patent/CN109558505A/en
Publication of CN109558505A publication Critical patent/CN109558505A/en
Priority to KR1020207035613A priority patent/KR102440198B1/en
Priority to EP19886874.7A priority patent/EP3885934A4/en
Priority to JP2020571638A priority patent/JP7204786B2/en
Priority to PCT/CN2019/094248 priority patent/WO2020103462A1/en
Priority to US17/041,411 priority patent/US11348254B2/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/732Query formulation
    • G06F16/7328Query by example, e.g. a complete video frame or video sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/223Analysis of motion using block-matching
    • G06T7/238Analysis of motion using block-matching using non-full search, e.g. three-step search
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53Querying
    • G06F16/532Query formulation, e.g. graphical querying
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/53Querying
    • G06F16/538Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/73Querying
    • G06F16/738Presentation of query results
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/783Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G06F16/7837Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content
    • G06F16/784Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content using objects detected or recognised in the video content the detected or recognised objects being people
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/70Information retrieval; Database structures therefor; File system structures therefor of video data
    • G06F16/78Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/787Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using geographical or spatial information, e.g. location
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/223Analysis of motion using block-matching
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/20Analysis of motion
    • G06T7/269Analysis of motion using gradient-based methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00Arrangements for image or video recognition or understanding
    • G06V10/40Extraction of image or video features
    • G06V10/56Extraction of image or video features relating to colour
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/41Higher-level, semantic clustering, classification or understanding of video scenes, e.g. detection, labelling or Markovian modelling of sport events or news items
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V20/00Scenes; Scene-specific elements
    • G06V20/40Scenes; Scene-specific elements in video content
    • G06V20/46Extracting features or characteristics from the video content, e.g. video fingerprints, representative shots or key frames
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10016Video; Image sequence
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30241Trajectory
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06VIMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V2201/00Indexing scheme relating to image or video recognition or understanding
    • G06V2201/07Target detection

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Library & Information Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computational Linguistics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
  • Closed-Circuit Television Systems (AREA)

Abstract

This application proposes a visual search method, apparatus, computer equipment and storage medium. The method includes: receiving an i-th frame image, where i is a positive integer; extracting the position and category of a subject in the i-th frame image and generating a detection box corresponding to the subject; and tracking the subject in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image, and adjusting the detection box according to the tracking result. In this way, subjects in a video stream can be tracked, the continuity of visual search is improved, and the prior-art problem that visual search cannot recognize and track subjects in a real-time video stream is solved.

Description

Visual search method, apparatus, computer equipment and storage medium
Technical Field
This application relates to the technical field of visual search, and in particular to a visual search method, apparatus, computer equipment and storage medium.
Background
Visual search is a technology that takes visual content such as images and video as the search input, recognizes and retrieves the input visual content using visual recognition techniques, and returns search results in various forms such as images and text. With the continuous development of visual recognition technology, more and more users use visual search on mobile terminals to obtain information about surrounding objects.
However, current visual search products are imperfect: they cannot recognize and track subjects in a real-time video stream.
Summary
This application is intended to solve at least one of the technical problems in the related art.
To this end, this application proposes a visual search method, apparatus, computer equipment and storage medium, to solve the prior-art problem that visual search cannot recognize and track subjects in a real-time video stream.
To achieve the above objective, an embodiment of the first aspect of this application proposes a visual search method, comprising:
receiving an i-th frame image, where i is a positive integer;
extracting the position and category of a subject in the i-th frame image, and generating a detection box corresponding to the subject; and
tracking the subject in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image, and adjusting the detection box according to the tracking result.
In the visual search method of the embodiments of this application, an i-th frame image is received; the position and category of a subject in the i-th frame image are extracted and a detection box corresponding to the subject is generated; the subject is tracked in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image; and the detection box is adjusted according to the tracking result. By tracking the subject in subsequent frames according to its position in the i-th frame image and adjusting the detection box according to the tracking result, tracking of subjects in a video stream is realized and the continuity of visual search is improved.
To achieve the above objective, an embodiment of the second aspect of this application proposes a visual search apparatus, comprising:
a receiving module, configured to receive an i-th frame image, where i is a positive integer;
an extraction module, configured to extract the position and category of a subject in the i-th frame image, and to generate a detection box corresponding to the subject; and
a tracking module, configured to track the subject in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image, and to adjust the detection box according to the tracking result.
In the visual search apparatus of the embodiments of this application, an i-th frame image is received; the position and category of a subject in the i-th frame image are extracted and a detection box corresponding to the subject is generated; the subject is tracked in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image; and the detection box is adjusted according to the tracking result. By tracking the subject in subsequent frames according to its position in the i-th frame image and adjusting the detection box according to the tracking result, tracking of subjects in a video stream is realized and the continuity of visual search is improved.
To achieve the above objective, an embodiment of the third aspect of this application proposes a computer device, comprising a processor and a memory, wherein the processor runs a program corresponding to executable program code stored in the memory by reading the executable program code, so as to implement the visual search method described in the embodiments of the first aspect.
To achieve the above objective, an embodiment of the fourth aspect of this application proposes a non-transitory computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the visual search method described in the embodiments of the first aspect.
To achieve the above objective, an embodiment of the fifth aspect of this application proposes a computer program product, wherein when instructions in the computer program product are executed by a processor, the visual search method described in the embodiments of the first aspect is implemented.
Additional aspects and advantages of this application will be set forth in part in the following description, and in part will become apparent from the following description or be learned by practice of this application.
Brief Description of the Drawings
The above and/or additional aspects and advantages of this application will become apparent and readily understood from the following description of the embodiments with reference to the accompanying drawings, in which:
Fig. 1 is a schematic flowchart of a visual search method provided by an embodiment of this application;
Fig. 2 is a schematic flowchart of another visual search method provided by an embodiment of this application;
Fig. 3 is a schematic flowchart of yet another visual search method provided by an embodiment of this application;
Fig. 4 is a schematic flowchart of still another visual search method provided by an embodiment of this application;
Fig. 5 is a schematic diagram of the implementation process of a visual search method according to an embodiment of this application;
Fig. 6 is a single-frame image sequence diagram of visual search;
Fig. 7 is a schematic structural diagram of a visual search apparatus provided by an embodiment of this application;
Fig. 8 is a schematic structural diagram of another visual search apparatus provided by an embodiment of this application;
Fig. 9 is a schematic structural diagram of yet another visual search apparatus provided by an embodiment of this application;
Fig. 10 is a schematic structural diagram of still another visual search apparatus provided by an embodiment of this application; and
Fig. 11 is a schematic structural diagram of a computer device provided by an embodiment of this application.
Detailed Description
Embodiments of this application are described in detail below, and examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals throughout denote the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they are intended to explain this application and should not be construed as limiting this application.
The visual search method, apparatus, computer equipment and storage medium of the embodiments of this application are described below with reference to the accompanying drawings.
Current visual search products have the following shortcomings:
(1) The operating process is cumbersome. When performing a visual search with a mobile terminal, the user needs to point the camera at the target subject, take a photo, save the image to the terminal's album, select the image from the album, and then upload it over the network to a visual search server for the search.
(2) Visual search is time-consuming. The image used for visual search has to be transmitted over the network to the visual search server; only after the server has detected and recognized the subject in the image are the subject's position and the recognition result returned to the mobile terminal.
(3) Only a single subject in an image can be recognized.
(4) Subjects in a real-time video stream cannot be recognized, nor can the recognition results be maintained across the subsequent video stream.
To solve at least one of the above problems of visual search products, this application proposes a visual search method. Fig. 1 is a schematic flowchart of a visual search method provided by an embodiment of this application. The method can be applied to mobile terminals such as mobile phones, tablet computers and laptop computers.
As shown in Fig. 1, the visual search method may comprise the following steps:
Step 101: receive an i-th frame image, where i is a positive integer.
Here, the i-th frame image is a frame image in a real-time video stream.
When the user wants to obtain information about surrounding objects, the user can do so through the visual search function of a mobile terminal. The mobile terminal starts the camera to capture a video stream of the surrounding objects, and the i-th frame image is received from the video stream, where i is a positive integer.
When the user wants to obtain information about multiple objects, the user can capture a video stream containing the multiple objects. During capture, the user only needs to start the camera and point it at the target objects; there is no need to press the shutter button manually or to select and upload an image from the album, which simplifies the visual search operating process.
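Purely as an illustrative sketch (the patent does not prescribe any particular API; the OpenCV capture loop below is an assumption), step 101 amounts to pulling successive frames from the live camera stream without any manual shutter press or album upload:

    import cv2  # assumed dependency for camera capture

    def frame_stream(camera_index=0):
        """Yield (i, frame) pairs from the live video stream, with i counted from 1."""
        cap = cv2.VideoCapture(camera_index)
        i = 0
        while cap.isOpened():
            ok, frame = cap.read()
            if not ok:
                break
            i += 1
            yield i, frame   # the i-th frame image, i a positive integer
        cap.release()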
Step 102: extract the position and category of a subject in the i-th frame image, and generate a detection box corresponding to the subject.
In this embodiment, after the i-th frame image is received, it can be detected and recognized: the position and category of the subject in the i-th frame image are extracted, and a detection box corresponding to the subject is generated.
In a possible implementation of the embodiments of this application, the mobile terminal can detect the received i-th frame image with an object detection model based on deep learning. After the relevant parameters of the object detection model have been configured, the received i-th frame image is input into the object detection model, which detects the subjects contained in the i-th frame image and outputs the positions of the subjects in the i-th frame image.
When recognizing the i-th frame image, the mobile terminal can select a suitable recognition algorithm according to the subjects contained in the i-th frame image: when the i-th frame image contains a QR code, a QR code recognition algorithm can be invoked; when the i-th frame image contains objects such as plants or animals, an object classification recognition algorithm can be invoked.
As a possible implementation, the mobile terminal can recognize the subjects contained in the i-th frame image with a subject classification model based on deep learning. After the relevant parameters of the subject classification model have been configured, the received i-th frame image is input into the subject classification model, which classifies and recognizes the subjects contained in the i-th frame image and outputs the categories of the subjects in the i-th frame image, where a category contains the recognition result of a subject.
Because the subject in the i-th frame image is detected and recognized by the mobile terminal itself, data exchange between the mobile terminal and a server is avoided and waiting time is reduced, thereby reducing the time consumed.
After the subject in the i-th frame image has been detected to obtain its position and recognized to obtain its category, the detection box corresponding to the subject can be generated from the subject's position and category, where the detection box carries the recognition result of the subject.
In a possible implementation of the embodiments of this application, there are multiple subjects and multiple detection boxes. The i-th frame image in the video stream captured by the mobile terminal may contain multiple subjects. Using the deep-learning-based object detection model and subject classification model, the multiple subjects in the i-th frame image can be detected and recognized simultaneously, and for each subject a detection box is generated from that subject's position and category. In this way, multiple subjects in an image are recognized at the same time, visual search efficiency is improved, and the prior-art problem of only being able to recognize a single subject is solved.
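As a hedged sketch only (the detector and classifier objects, their interfaces, and the DetectionBox structure are illustrative assumptions; the patent does not name concrete models), step 102 can be pictured as:

    from dataclasses import dataclass

    @dataclass
    class DetectionBox:
        subject_id: int      # unique subject identification code, reused for tracking
        bbox: tuple          # (x, y, w, h) position of the subject in the frame
        category: str        # recognition result carried by the detection box

    def extract_subjects(frame, detector, classifier):
        """Detect every subject in the frame, classify it, and build its detection box."""
        boxes = []
        for subject_id, (x, y, w, h) in enumerate(detector.detect(frame)):
            crop = frame[y:y + h, x:x + w]          # image region of this subject
            category = classifier.classify(crop)    # e.g. plant, animal, QR code
            boxes.append(DetectionBox(subject_id, (x, y, w, h), category))
        return boxes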
Step 103: track the subject in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image, and adjust the detection box according to the tracking result.
A video stream contains multiple frame images; when the i-th frame image is not the last frame image in the video stream, there is at least one subsequent frame image after it. Therefore, in this embodiment, the subject can also be tracked in the subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image, and the detection box can be adjusted according to the tracking result.
For example, the position of the subject can be tracked in the subsequent frame images of the i-th frame image with a relevant target tracking algorithm, based on the position of the subject in the i-th frame image. When the subject is tracked in a subsequent frame image, the detection box can be adjusted according to the tracked position of the subject, i.e. the tracking result.
As an example, a tracking algorithm based on target detection can be used: target detection is performed on the received subsequent frame image, the detected subject position is compared with the position of the subject in the i-th frame image, and when the two are inconsistent, the detection box is adjusted according to the position of the subject in the subsequent frame image.
In a possible implementation of the embodiments of this application, when there are multiple subjects in the i-th frame, a unique identifier can be used as the subject identification code to distinguish the different subjects; when subject tracking is performed, each subject is then tracked according to its subject identification code and the corresponding detection box is adjusted.
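The following toy sketch (tracker objects keyed by subject identification code, with an OpenCV-style update interface, are an assumption) shows how the detection boxes could be moved when the tracked position differs from the stored one:

    def adjust_boxes(boxes, trackers, frame):
        """Update each subject's detection box from its tracker's result, keyed by subject id."""
        for box in boxes:
            ok, new_bbox = trackers[box.subject_id].update(frame)   # tracking result
            if ok:
                new_bbox = tuple(int(v) for v in new_bbox)
                if new_bbox != box.bbox:                            # positions inconsistent
                    box.bbox = new_bbox                             # adjust the detection box
        return boxes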
In the visual search method of this embodiment, an i-th frame image is received; the position and category of a subject in the i-th frame image are extracted and a detection box corresponding to the subject is generated; the subject is tracked in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image; and the detection box is adjusted according to the tracking result. By tracking the subject in subsequent frames according to its position in the i-th frame image and adjusting the detection box according to the tracking result, tracking of subjects in a video stream is realized and the continuity of visual search is improved.
A video stream contains multiple frame images, and the subjects contained in each frame image may differ. In order to still recognize and track the subjects when they change in the video stream, this application proposes another visual search method. Fig. 2 is a schematic flowchart of another visual search method provided by an embodiment of this application.
As shown in Fig. 2, on the basis of the embodiment shown in Fig. 1, the visual search method may further comprise the following steps:
Step 201: receive an (i+M)-th frame image, where M is a positive integer.
While the mobile terminal performs subject recognition and tracking on the video stream, it continuously obtains image frames from the video stream.
Step 202: judge whether the subjects in the (i+M)-th frame image have changed relative to the subjects in the i-th frame image.
Step 203: if they have changed, regenerate the detection boxes according to the subjects detected in the (i+M)-th frame image, and restart tracking.
In this embodiment, the mobile terminal receives the i-th frame image and detects and recognizes the subjects in it; during detection and recognition, the mobile terminal continuously obtains the subsequent frame images of the i-th frame image. For a received (i+M)-th frame image, the mobile terminal performs subject detection and recognition on the (i+M)-th frame image, compares the subjects recognized in the (i+M)-th frame with the subjects in the i-th frame image, and judges whether the subjects in the (i+M)-th frame image have changed relative to the subjects in the i-th frame image.
When it is determined that the subjects in the (i+M)-th frame image have changed relative to the subjects in the i-th frame image, the detection boxes are regenerated according to the subjects detected in the (i+M)-th frame image, and tracking is restarted.
Specifically, when at least one of the subjects in the (i+M)-th frame image differs from the subjects in the i-th frame, the detection box corresponding to each subject is regenerated according to the position of that subject detected in the (i+M)-th frame image and the category obtained by recognizing that subject, and the subject is tracked in the subsequent frame images of the (i+M)-th frame image.
In the visual search method of this embodiment, it is judged whether the subjects in a received (i+M)-th frame image have changed relative to the subjects in the i-th frame image; when they have changed, the detection boxes are regenerated according to the subjects detected in the (i+M)-th frame image and tracking is restarted. In this way, when a new subject appears in the video stream, the new subject is recognized and tracked, which improves the user experience.
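A toy illustration of the step-202 check follows; comparing the sets of recognized categories is only one possible criterion and is an assumption, since the patent simply requires that at least one subject differ:

    def subjects_changed(boxes_i, boxes_i_plus_m):
        """Return True when the subjects detected in frame i+M differ from those in frame i."""
        return ({b.category for b in boxes_i}
                != {b.category for b in boxes_i_plus_m})

    # if subjects_changed(...): regenerate the detection boxes from frame i+M and restart tracking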
In order to clearly describe the specific implementation of subject tracking in the foregoing embodiments, this application proposes yet another visual search method. Fig. 3 is a schematic flowchart of yet another visual search method provided by an embodiment of this application.
As shown in Fig. 3, on the basis of the embodiment shown in Fig. 1, step 103 may comprise the following steps:
Step 301: obtain an (i+n)-th frame image after the i-th frame image, where n is a positive integer.
Step 302: track the subject according to the position of the subject in the (i+n)-th frame image.
In this embodiment, after the mobile terminal receives the i-th frame image, it also obtains image frames after the i-th frame image while the i-th frame image is being detected and recognized. The mobile terminal performs subject detection and recognition on the received (i+n)-th frame image to obtain the position and category of the subject in the (i+n)-th frame image, and then tracks the subject in the (i+n)-th frame image according to the subject's position.
Because the mobile terminal continuously obtains the subsequent frame images of the i-th frame image while detecting and recognizing the i-th frame image, and tracking the subject in subsequent frame images requires tracking initialization based on the position of the subject detected in the i-th frame image, there may be cases where, when the mobile terminal receives the (i+n-1)-th frame image, the position of the subject in the i-th frame image has not yet been detected; in that case the subject cannot be tracked in the (i+1)-th through (i+n-1)-th frame images.
In a possible implementation of the embodiments of this application, the image frames between the (i+1)-th frame image and the (i+n-1)-th frame image can be obtained and used as reference image frames, and the tracking of the subject can be verified according to the reference image frames. For example, the change of the subject's position in the (i+n)-th frame image relative to its position in the (i+n-1)-th frame is compared with the change of the subject's position in the (i+n-1)-th frame image relative to its position in the (i+n-2)-th frame; if the former change stays within an allowable error range of the latter, the subject tracking is verified as accurate. This improves the accuracy of subject tracking.
In the visual search method of this embodiment, an (i+n)-th frame image after the i-th frame image is obtained, and the subject is tracked in the (i+n)-th frame image according to the subject's position, which improves the continuity of visual search.
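As a hedged sketch (the tolerance value and the displacement-based criterion are assumptions introduced for illustration), the verification against the reference image frames could compare successive position changes as follows:

    def verify_tracking(ref_centres, new_centre, tolerance=0.5):
        """ref_centres: subject centres (x, y) from reference frames i+1 .. i+n-1, in order.
        Check that the change from frame i+n-1 to frame i+n stays close to the change
        observed between frames i+n-2 and i+n-1."""
        (x0, y0), (x1, y1) = ref_centres[-2], ref_centres[-1]
        ref_step = ((x1 - x0) ** 2 + (y1 - y0) ** 2) ** 0.5
        step = ((new_centre[0] - x1) ** 2 + (new_centre[1] - y1) ** 2) ** 0.5
        return abs(step - ref_step) <= tolerance * max(ref_step, 1.0)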
In order to clearly describe the specific implementation of subject tracking in the foregoing embodiments, this application proposes still another visual search method. Fig. 4 is a schematic flowchart of still another visual search method provided by an embodiment of this application.
As shown in Fig. 4, on the basis of the embodiment shown in Fig. 1, step 103 may comprise the following steps:
Step 401: obtain the illumination intensity of the subsequent frame image.
In this embodiment, after the subsequent frame image of the i-th frame image is obtained, the illumination intensity of the subsequent image frame can be obtained.
Image brightness is essentially the brightness of each pixel in the image, and the brightness of each pixel is essentially the magnitude of its RGB value: when the RGB value is 0 the pixel is black and its brightness is lowest, and when the RGB value is 255 the pixel is white and its brightness is highest. Therefore, in this embodiment, for a received subsequent frame image, the pixel values of the image can be taken as the brightness of the image.
Step 402: when the difference in brightness between two consecutive frame images is greater than or equal to a first preset threshold, invoke the KCF tracking algorithm to track the subject according to the position of the subject in the i-th frame image.
Step 403: when the difference in brightness between two consecutive frame images is less than the first preset threshold, invoke the optical flow tracking algorithm to track the subject according to the position of the subject in the i-th frame image.
Here, the first preset threshold can be set in advance.
In this embodiment, each time a frame image is received, the brightness of that image can be obtained and recorded, and compared with the brightness of the previous frame image to obtain the brightness difference between the two frames. When the brightness difference between two consecutive frame images is greater than or equal to the first preset threshold, the Kernelized Correlation Filters (KCF) tracking algorithm is invoked to track the subject according to the position of the subject in the i-th frame image.
The KCF tracking algorithm collects positive and negative samples with a circulant matrix of the region around the target, trains a target detector using ridge regression, and exploits the property that a circulant matrix can be diagonalized in Fourier space to convert matrix operations into element-wise products, which greatly reduces the amount of computation, increases the computation speed, and allows the algorithm to meet real-time requirements.
When the brightness difference between two consecutive frame images is less than the first preset threshold, the optical flow tracking algorithm is invoked to track the subject according to the position of the subject in the i-th frame image.
The principle of the optical flow tracking algorithm is as follows: a continuous sequence of video frames is processed; for each video sequence, a certain object detection method is used to detect the foreground targets that may appear; if a foreground target appears in a certain frame, its representative key feature points are found (they may be generated randomly, or corner points may be used as feature points); then, for any two adjacent video frames, the optimal positions in the current frame of the key feature points that appeared in the previous frame are found, thereby obtaining the position coordinates of the foreground target in the current frame; iterating in this way realizes tracking of the target. The optical flow tracking algorithm is suitable for target tracking when the illumination intensity changes little between frames.
In the visual search method of this embodiment, the brightness of the subsequent frame image is obtained; when the brightness difference between two consecutive frame images is greater than or equal to the first preset threshold, the KCF tracking algorithm is invoked to track the subject according to the position of the subject in the i-th frame image; and when the brightness difference between two consecutive frame images is less than the first preset threshold, the optical flow tracking algorithm is invoked to track the subject according to the position of the subject in the i-th frame image. This improves the accuracy and precision of subject tracking and improves the subject tracking effect.
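To make the selection logic of steps 401-403 concrete, here is a hedged sketch; the threshold value, the use of the mean grey level as brightness, and the OpenCV calls (TrackerKCF_create, goodFeaturesToTrack, calcOpticalFlowPyrLK, available in builds that ship these trackers) are assumptions for illustration only:

    import cv2
    import numpy as np

    FIRST_PRESET_THRESHOLD = 20.0        # assumed value; set in advance

    def track_subject(prev_frame, frame, bbox):
        """bbox: (x, y, w, h) position of the subject in the previous frame."""
        prev_gray = cv2.cvtColor(prev_frame, cv2.COLOR_BGR2GRAY)
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        # brightness taken as the mean pixel value (0 = black, 255 = white)
        if abs(float(gray.mean()) - float(prev_gray.mean())) >= FIRST_PRESET_THRESHOLD:
            tracker = cv2.TrackerKCF_create()        # kernelized correlation filter
            tracker.init(prev_frame, bbox)
            ok, new_bbox = tracker.update(frame)
            return tuple(int(v) for v in new_bbox) if ok else bbox
        # small illumination change: Lucas-Kanade optical flow on key feature points
        x, y, w, h = bbox
        pts = cv2.goodFeaturesToTrack(prev_gray[y:y + h, x:x + w],
                                      maxCorners=50, qualityLevel=0.01, minDistance=5)
        if pts is None:
            return bbox                              # no trackable corner points found
        pts = pts.astype(np.float32) + np.float32([x, y])
        new_pts, status, _ = cv2.calcOpticalFlowPyrLK(prev_gray, gray, pts, None)
        good = status.ravel() == 1
        if not good.any():
            return bbox
        dx, dy = (new_pts[good] - pts[good]).reshape(-1, 2).mean(axis=0)
        return (int(x + dx), int(y + dy), w, h)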
Fig. 5 is a schematic diagram of the implementation process of a visual search method according to an embodiment of this application. Fig. 6 is a single-frame image sequence diagram of visual search.
As shown in Fig. 5, subject detection is first performed on frame 1 to obtain the position of the subject. During subject detection, tracking initialization has not yet been performed, so the images from frame 2 to frame n-1 are not used for subject tracking; this part of the images can be used for tracking verification. After subject detection on frame 1 is completed, the obtained subject position is stored in memory under the subject identification code, i.e. the subject information is updated, and tracking initialization is performed according to the position of the subject. When frame n is received, tracking initialization has finished, so subject tracking is performed on frame n and on the subsequent images, until subject detection is performed again (subject detection on frame m in Fig. 5), after which tracking initialization is performed again according to the new detection result. After the tracking processing is completed, the updated position of the subject is obtained, and the position of the subject stored in memory is updated according to the subject identification code. The mobile terminal frames the subject according to its position and recognizes the subject, for example by object classification recognition, text recognition or QR code recognition. After recognition is completed, the recognition result is stored in memory under the subject identification code. Each time the subject information in memory (including the subject position and the recognition result) is updated, the mobile terminal frames the video stream according to the updated subject information and renders the view in the interface, displaying the position of the subject and its recognition result on the corresponding subject by means of the detection box, thereby achieving the purpose of visual search.
As shown in Fig. 6, for picture 1, a suitable detection method is selected according to the detection configuration information to perform subject detection and obtain the position of the subject, and the selected subject position in picture 1 is fed back to the interface layer in the form of a detection box. A suitable recognition method is selected according to the recognition configuration information, subject recognition is performed on the subject framed by detection in picture 1, and the recognition result is fed back to the interface layer through the overall scheduler, i.e. the recognition result corresponding to the subject is displayed in picture 1. For picture 2, a suitable tracking method is selected according to the detection configuration information, subject tracking is performed on picture 2 according to the position of the subject in picture 1 using the determined tracking method, the tracking result is returned to the interface layer through the overall scheduler, and the tracking result and recognition result are displayed in picture 2.
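A hedged sketch of the per-subject memory update described for Fig. 5 follows (the dictionary layout and function names are illustrative assumptions): positions and recognition results are stored under the subject identification code, and the interface layer renders detection boxes from this store after each detection, tracking or recognition pass:

    subject_info = {}    # subject identification code -> {"bbox": (x, y, w, h), "category": str}

    def update_subject_info(subject_id, bbox=None, category=None):
        """Update the in-memory subject information after detection, tracking or recognition."""
        entry = subject_info.setdefault(subject_id, {"bbox": None, "category": None})
        if bbox is not None:
            entry["bbox"] = bbox              # position update from detection or tracking
        if category is not None:
            entry["category"] = category      # recognition result update
        return entry

    # after each update, the view layer can redraw the detection box and label for subject_id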
In order to implement the foregoing embodiments, this application also proposes a visual search apparatus.
Fig. 7 is a schematic structural diagram of a visual search apparatus provided by an embodiment of this application.
As shown in Fig. 7, the visual search apparatus 50 comprises a receiving module 510, an extraction module 520 and a tracking module 530.
The receiving module 510 is configured to receive an i-th frame image, where i is a positive integer.
The extraction module 520 is configured to extract the position and category of a subject in the i-th frame image, and to generate a detection box corresponding to the subject.
In a possible implementation of the embodiments of this application, there are multiple subjects and multiple detection boxes.
The tracking module 530 is configured to track the subject in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image, and to adjust the detection box according to the tracking result.
In a possible implementation of the embodiments of this application, as shown in Fig. 8, on the basis of the embodiment shown in Fig. 7, the visual search apparatus 50 further comprises:
a judging module 540, configured to judge whether the subjects in the (i+M)-th frame image have changed relative to the subjects in the i-th frame image,
where M is a positive integer.
In this embodiment, when the receiving module 510 receives the (i+M)-th frame image, the extraction module 520 extracts the positions and categories of the subjects in the (i+M)-th frame image. The judging module 540 judges whether the subjects in the (i+M)-th frame image have changed relative to the subjects in the i-th frame image; when it determines that they have changed, the extraction module 520 regenerates the detection boxes according to the subjects detected in the (i+M)-th frame image, and the tracking module 530 restarts tracking.
By judging whether the subjects in the received (i+M)-th frame image have changed relative to the subjects in the i-th frame image, regenerating the detection boxes according to the subjects detected in the (i+M)-th frame image when they have changed, and restarting tracking, a new subject appearing in the video stream is recognized and tracked, which improves the user experience.
In a possible implementation of the embodiments of this application, as shown in Fig. 9, on the basis of the embodiment shown in Fig. 7, the tracking module 530 comprises:
an obtaining unit 531, configured to obtain the brightness of the subsequent frame image; and
a tracking unit 532, configured to invoke the KCF tracking algorithm to track the subject according to the position of the subject in the i-th frame image when the difference in brightness between two consecutive frame images is greater than or equal to a first preset threshold.
The tracking unit 532 is further configured to invoke the optical flow tracking algorithm to track the subject according to the position of the subject in the i-th frame image when the difference in brightness between two consecutive frame images is less than the first preset threshold.
By obtaining the brightness of the subsequent frame image, invoking the KCF tracking algorithm to track the subject according to the position of the subject in the i-th frame image when the brightness difference between two consecutive frame images is greater than or equal to the first preset threshold, and invoking the optical flow tracking algorithm to track the subject according to the position of the subject in the i-th frame image when the brightness difference between two consecutive frame images is less than the first preset threshold, the accuracy and precision of subject tracking are improved and the subject tracking effect is improved.
In a possible implementation of the embodiments of this application, as shown in Fig. 10, on the basis of the embodiment shown in Fig. 7, the tracking module 530 comprises:
an image obtaining unit 533, configured to obtain an (i+n)-th frame image after the i-th frame image, where n is a positive integer; and
a subject tracking unit 534, configured to track the subject according to the position of the subject in the (i+n)-th frame image.
Further, in a possible implementation of the embodiments of this application, the image obtaining unit 533 is further configured to obtain the image frames between the (i+1)-th frame image and the (i+n-1)-th frame image and use them as reference image frames, and the subject tracking unit 534 is further configured to verify the tracking of the subject according to the reference image frames. This improves the accuracy of subject tracking.
It should be noted that the foregoing explanation of the visual search method embodiments also applies to the visual search apparatus of this embodiment; the implementation principles are similar and will not be repeated here.
In the visual search apparatus of the embodiments of this application, an i-th frame image is received; the position and category of a subject in the i-th frame image are extracted and a detection box corresponding to the subject is generated; the subject is tracked in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image; and the detection box is adjusted according to the tracking result. By tracking the subject in subsequent frames according to its position in the i-th frame image and adjusting the detection box according to the tracking result, tracking of subjects in a video stream is realized and the continuity of visual search is improved.
In order to implement the foregoing embodiments, this application also proposes a computer device, comprising a processor and a memory, wherein the processor runs a program corresponding to executable program code stored in the memory by reading the executable program code, so as to implement the visual search method of the foregoing embodiments.
Fig. 11 is a schematic structural diagram of a computer device provided by an embodiment of this application, showing a block diagram of an exemplary computer device 90 suitable for implementing the embodiments of this application. The computer device 90 shown in Fig. 11 is only an example and should not impose any limitation on the functions or scope of use of the embodiments of this application.
As shown in Fig. 11, the computer device 90 takes the form of a general-purpose computing device. The components of the computer device 90 may include, but are not limited to: one or more processors or processing units 906, a system memory 910, and a bus 908 connecting the different system components (including the system memory 910 and the processing units 906).
The bus 908 represents one or more of several classes of bus structures, including a memory bus or memory controller, a peripheral bus, an accelerated graphics port, a processor, or a local bus using any of a variety of bus structures. By way of example, these architectures include, but are not limited to, the Industry Standard Architecture (ISA) bus, the Micro Channel Architecture (MAC) bus, the Enhanced ISA bus, the Video Electronics Standards Association (VESA) local bus, and the Peripheral Component Interconnection (PCI) bus.
The computer device 90 typically comprises a variety of computer-system-readable media. These media may be any available media that can be accessed by the computer device 90, including volatile and non-volatile media and removable and non-removable media.
The system memory 910 may include computer-system-readable media in the form of volatile memory, such as a random access memory (RAM) 911 and/or a cache memory 912. The computer device 90 may further include other removable/non-removable, volatile/non-volatile computer system storage media. By way of example only, the storage system 913 may be used to read and write non-removable, non-volatile magnetic media (not shown in Fig. 11, commonly referred to as a "hard disk drive"). Although not shown in Fig. 11, a magnetic disk drive for reading and writing removable non-volatile magnetic disks (e.g. "floppy disks") and an optical disk drive for reading and writing removable non-volatile optical disks (e.g. a compact disc read-only memory (CD-ROM), a digital versatile disc read-only memory (DVD-ROM) or other optical media) may be provided. In these cases, each drive may be connected to the bus 908 through one or more data media interfaces. The system memory 910 may include at least one program product having a set of (for example, at least one) program modules configured to perform the functions of the embodiments of this application.
A computer-readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, carrying computer-readable program code. Such a propagated data signal may take a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination thereof. The computer-readable signal medium may also be any computer-readable medium other than a computer-readable storage medium that can send, propagate or transmit a program for use by, or in connection with, an instruction execution system, apparatus or device.
Program code contained on a computer-readable medium may be transmitted using any suitable medium, including, but not limited to, wireless, wireline, optical cable, RF, etc., or any suitable combination thereof.
Computer program code for carrying out the operations of this application may be written in one or more programming languages or combinations thereof, including object-oriented programming languages such as Java, Smalltalk and C++, as well as conventional procedural programming languages such as the "C" language or similar programming languages. The program code may execute entirely on the user's computer, partly on the user's computer, as a stand-alone software package, partly on the user's computer and partly on a remote computer, or entirely on a remote computer or server.
A program/utility 914 having a set of (at least one) program modules 9140 may be stored, for example, in the system memory 910. Such program modules 9140 include, but are not limited to, an operating system, one or more application programs, other program modules and program data; each of these examples, or some combination thereof, may include an implementation of a network environment. The program modules 9140 generally perform the functions and/or methods of the embodiments described in this application.
The computer device 90 may also communicate with one or more external devices 10 (such as a keyboard, a pointing device, a display 100, etc.), with one or more devices that enable a user to interact with the terminal device 90, and/or with any device (such as a network card, a modem, etc.) that enables the computer device 90 to communicate with one or more other computing devices. Such communication may take place through input/output (I/O) interfaces 902. Moreover, the computer device 90 may communicate with one or more networks (such as a local area network (LAN), a wide area network (WAN) and/or a public network, e.g. the Internet) through a network adapter 900. As shown in Fig. 11, the network adapter 900 communicates with the other modules of the computer device 90 through the bus 908. It should be understood that, although not shown in Fig. 11, other hardware and/or software modules may be used in conjunction with the computer device 90, including, but not limited to: microcode, device drivers, redundant processing units, external disk drive arrays, RAID systems, tape drives, data backup storage systems, etc.
The processing unit 906 executes various functional applications and data processing by running programs stored in the system memory 910, for example implementing the visual search method mentioned in the foregoing embodiments.
In order to implement the foregoing embodiments, this application also proposes a non-transitory computer-readable storage medium on which a computer program is stored; when the program is executed by a processor, the visual search method of the foregoing embodiments is implemented.
In order to implement the foregoing embodiments, this application also proposes a computer program product; when instructions in the computer program product are executed by a processor, the visual search method of the foregoing embodiments is implemented.
In the description of this specification, reference term " one embodiment ", " some embodiments ", " example ", " specifically show The description of example " or " some examples " etc. means specific features, structure, material or spy described in conjunction with this embodiment or example Point is contained at least one embodiment or example of the application.In the present specification, schematic expression of the above terms are not It must be directed to identical embodiment or example.Moreover, particular features, structures, materials, or characteristics described can be in office It can be combined in any suitable manner in one or more embodiment or examples.In addition, without conflicting with each other, the skill of this field Art personnel can tie the feature of different embodiments or examples described in this specification and different embodiments or examples It closes and combines.
In addition, term " first ", " second " are used for descriptive purposes only and cannot be understood as indicating or suggesting relative importance Or implicitly indicate the quantity of indicated technical characteristic.Define " first " as a result, the feature of " second " can be expressed or Implicitly include at least one this feature.In the description of the present application, the meaning of " plurality " is at least two, such as two, three It is a etc., unless otherwise specifically defined.
Any process described otherwise above or method description are construed as in flow chart or herein, and expression includes It is one or more for realizing custom logic function or process the step of executable instruction code module, segment or portion Point, and the range of the preferred embodiment of the application includes other realization, wherein can not press shown or discussed suitable Sequence, including according to related function by it is basic simultaneously in the way of or in the opposite order, Lai Zhihang function, this should be by the application Embodiment person of ordinary skill in the field understood.
Expression or logic and/or step described otherwise above herein in flow charts, for example, being considered use In the order list for the executable instruction for realizing logic function, may be embodied in any computer-readable medium, for Instruction execution system, device or equipment (such as computer based system, including the system of processor or other can be held from instruction The instruction fetch of row system, device or equipment and the system executed instruction) it uses, or combine these instruction execution systems, device or set It is standby and use.For the purpose of this specification, " computer-readable medium ", which can be, any may include, stores, communicates, propagates or pass Defeated program is for instruction execution system, device or equipment or the dress used in conjunction with these instruction execution systems, device or equipment It sets.The more specific example (non-exhaustive list) of computer-readable medium include the following: there is the electricity of one or more wirings Interconnecting piece (electronic device), portable computer diskette box (magnetic device), random access memory (RAM), read-only memory (ROM), erasable edit read-only storage (EPROM or flash memory) and fiber device.In addition, computer-readable Medium can even is that the paper that can print described program on it or other suitable media because can for example by paper or Other media carry out optical scanner, are then edited, interpret or are handled when necessary with other suitable methods with electronics Mode obtains described program, is then stored in computer storage.
It should be appreciated that each section of the application can be realized with hardware, software, firmware or their combination.Above-mentioned In embodiment, software that multiple steps or method can be executed in memory and by suitable instruction execution system with storage Or firmware is realized.Such as, if realized with hardware in another embodiment, following skill well known in the art can be used Any one of art or their combination are realized: have for data-signal is realized the logic gates of logic function from Logic circuit is dissipated, the specific integrated circuit with suitable combinational logic gate circuit, programmable gate array (PGA), scene can compile Journey gate array (FPGA) etc..
Those of ordinary skill in the art can understand that all or part of the steps carried by the methods of the above embodiments may be completed by instructing relevant hardware through a program; the program may be stored in a computer-readable storage medium, and the program, when executed, includes one of or a combination of the steps of the method embodiments.
In addition, the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module. The above integrated module may be implemented in the form of hardware, or in the form of a software functional module. If the integrated module is implemented in the form of a software functional module and sold or used as an independent product, it may also be stored in a computer-readable storage medium.
The storage medium mentioned above may be a read-only memory, a magnetic disk, an optical disc, or the like. Although the embodiments of the present application have been shown and described above, it can be understood that the above embodiments are exemplary and shall not be construed as limiting the present application, and those of ordinary skill in the art may make changes, modifications, substitutions, and variations to the above embodiments within the scope of the present application.

Claims (10)

1. A visual search method, comprising:
receiving an i-th frame image, wherein i is a positive integer;
extracting a position and a category of a subject in the i-th frame image, and generating a detection box corresponding to the subject; and
tracking the subject in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image, and adjusting the detection box according to a tracking result.
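For orientation only, the following is a minimal Python/OpenCV sketch of the flow recited in claim 1. The detector detect_subject() is a hypothetical placeholder and OpenCV's KCF tracker is just one possible backend; neither is mandated by the claim, and the KCF factory name varies across OpenCV builds (it may live under cv2.legacy and requires the contrib modules).

```python
import cv2

def detect_subject(frame):
    # Hypothetical detector: returns an integer (x, y, w, h) box and a class
    # label for the main subject in the frame; the claim does not prescribe
    # any particular detection model.
    raise NotImplementedError

def visual_search(frames):
    # frames: iterator of BGR images, starting with the i-th frame image.
    first = next(frames)
    box, category = detect_subject(first)      # position and category of the subject
    tracker = cv2.TrackerKCF_create()          # one possible tracking backend
    tracker.init(first, box)                   # detection box for the i-th frame
    for frame in frames:                       # subsequent frame images
        ok, box = tracker.update(frame)        # track the subject
        if ok:
            yield box, category                # detection box adjusted per frame
```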
2. The visual search method according to claim 1, further comprising:
receiving an (i+M)-th frame image, wherein M is a positive integer;
determining whether the subject in the (i+M)-th frame image has changed relative to the subject in the i-th frame image; and
if the subject has changed, regenerating the detection box according to the subject detected in the (i+M)-th frame image, and performing tracking again.
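Claim 2 leaves open how a change of the subject is judged. One plausible reading, sketched below under that assumption, re-detects the subject in the (i+M)-th frame and treats a different class label or a low box overlap as a change; the IoU threshold of 0.3 is illustrative, and detect_subject() is the same hypothetical detector used above.

```python
def iou(a, b):
    # Intersection-over-union of two (x, y, w, h) boxes.
    iw = max(0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union else 0.0

def refresh_if_changed(frame, tracked_box, category, iou_thresh=0.3):
    # Re-detect in the (i+M)-th frame; if the class changed or the new box no
    # longer overlaps the tracked box, regenerate the detection box and restart
    # tracking from this frame.
    new_box, new_category = detect_subject(frame)
    if new_category != category or iou(tracked_box, new_box) < iou_thresh:
        tracker = cv2.TrackerKCF_create()
        tracker.init(frame, new_box)
        return tracker, new_box, new_category
    return None, tracked_box, category
```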
3. The visual search method according to claim 1, wherein tracking the subject in the subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image comprises:
obtaining an (i+n)-th frame image after the i-th frame image, wherein n is a positive integer; and
tracking the subject according to the position of the subject in the (i+n)-th frame image.
4. The visual search method according to claim 3, further comprising:
obtaining image frames between an (i+1)-th frame image and an (i+n-1)-th frame image as reference image frames; and
verifying the tracking of the subject according to the reference image frames.
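Claims 3-4 do not state how the reference frames verify the tracking. As one possible criterion, the sketch below runs the tracker over the (i+1)-th to (i+n-1)-th frames and only accepts the track if the tracker keeps updating and the box centre moves smoothly; the max_jump threshold (in pixels) is an assumption.

```python
def verify_tracking(tracker, reference_frames, last_box, max_jump=40):
    # Run the tracker over the reference image frames and report whether the
    # track stays consistent enough to trust the (i+n)-th frame result.
    prev_cx = last_box[0] + last_box[2] / 2.0
    prev_cy = last_box[1] + last_box[3] / 2.0
    for frame in reference_frames:
        ok, box = tracker.update(frame)
        if not ok:
            return False                        # tracker lost the subject
        cx = box[0] + box[2] / 2.0
        cy = box[1] + box[3] / 2.0
        if abs(cx - prev_cx) > max_jump or abs(cy - prev_cy) > max_jump:
            return False                        # implausible jump between frames
        prev_cx, prev_cy = cx, cy
    return True
```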
5. The visual search method according to claim 1, wherein there are a plurality of subjects and a plurality of detection boxes.
6. The visual search method according to claim 1, wherein tracking the subject in the subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image comprises:
obtaining a brightness of the subsequent frame images;
when a difference in brightness between two consecutive frame images is greater than or equal to a first preset threshold, invoking a KCF tracking algorithm to track the subject according to the position of the subject in the i-th frame image; and
when the difference in brightness between two consecutive frame images is less than the first preset threshold, invoking an optical flow tracking algorithm to track the subject according to the position of the subject in the i-th frame image.
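Claim 6 is concrete enough to sketch directly: compare the brightness of consecutive frames against a preset threshold and dispatch to a KCF tracker when the change is large, otherwise to optical-flow tracking. The threshold value, the use of mean gray level as the brightness measure, and Lucas-Kanade as the specific optical-flow variant are all assumptions, not part of the claim.

```python
import cv2
import numpy as np

BRIGHTNESS_THRESHOLD = 30.0   # the "first preset threshold"; the value is illustrative

def mean_brightness(frame):
    # Mean gray-level intensity used here as the frame's brightness.
    return float(np.mean(cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)))

def choose_tracking_algorithm(prev_frame, curr_frame):
    # Large brightness changes break the brightness-constancy assumption of
    # optical flow, so the KCF correlation-filter tracker is selected instead;
    # otherwise the cheaper optical-flow tracking is used, as in claim 6.
    diff = abs(mean_brightness(curr_frame) - mean_brightness(prev_frame))
    return "kcf" if diff >= BRIGHTNESS_THRESHOLD else "optical_flow"

def optical_flow_step(prev_gray, curr_gray, prev_pts):
    # Pyramidal Lucas-Kanade optical flow; prev_pts is a float32 array of shape
    # (N, 1, 2) sampled inside the subject's detection box in the previous frame.
    next_pts, status, _err = cv2.calcOpticalFlowPyrLK(prev_gray, curr_gray, prev_pts, None)
    return next_pts[status.flatten() == 1]
```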
7. A visual search apparatus, comprising:
a receiving module configured to receive an i-th frame image, wherein i is a positive integer;
an extraction module configured to extract a position and a category of a subject in the i-th frame image, and to generate a detection box corresponding to the subject; and
a tracking module configured to track the subject in subsequent frame images of the i-th frame image according to the position of the subject in the i-th frame image, and to adjust the detection box according to a tracking result.
8. The visual search apparatus according to claim 7, wherein the tracking module comprises:
an acquiring unit configured to obtain a brightness of the subsequent frame images; and
a tracking unit configured to, when a difference in brightness between two consecutive frame images is greater than or equal to a first preset threshold, invoke a KCF tracking algorithm to track the subject according to the position of the subject in the i-th frame image,
wherein the tracking unit is further configured to, when the difference in brightness between two consecutive frame images is less than the first preset threshold, invoke an optical flow tracking algorithm to track the subject according to the position of the subject in the i-th frame image.
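Purely as a structural illustration of claims 7-8, the sketch below arranges the receiving, extraction, and tracking modules (with the acquiring-unit / tracking-unit split) around the same stand-in functions used in the earlier sketches; the actual tracking dispatch is left as a placeholder.

```python
class VisualSearchDevice:
    # Receiving, extraction and tracking modules of claim 7; the tracking module
    # follows the acquiring-unit / tracking-unit split of claim 8.
    def __init__(self, brightness_threshold=30.0):
        self.threshold = brightness_threshold   # the "first preset threshold"
        self.prev_frame = None
        self.box = None
        self.category = None

    def receive(self, frame):
        # Receiving module: accept a frame and route it to extraction or tracking.
        if self.box is None:
            # Extraction module: subject position, category and detection box.
            self.box, self.category = detect_subject(frame)
        else:
            # Acquiring unit: brightness difference between consecutive frames.
            diff = abs(mean_brightness(frame) - mean_brightness(self.prev_frame))
            # Tracking unit: KCF for large brightness changes, optical flow otherwise.
            algorithm = "kcf" if diff >= self.threshold else "optical_flow"
            self.box = self._track(frame, algorithm)
        self.prev_frame = frame
        return self.box, self.category

    def _track(self, frame, algorithm):
        # Placeholder dispatch; a full build would keep per-algorithm state here.
        return self.box
```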
9. A computer device, comprising a processor and a memory,
wherein the processor runs a program corresponding to executable program code stored in the memory by reading the executable program code, so as to implement the visual search method according to any one of claims 1-6.
10. A non-transitory computer-readable storage medium having a computer program stored thereon, wherein the program, when executed by a processor, implements the visual search method according to any one of claims 1-6.
CN201811392516.XA 2018-11-21 2018-11-21 Visual search method, apparatus, computer equipment and storage medium Pending CN109558505A (en)

Priority Applications (6)

Application Number Priority Date Filing Date Title
CN201811392516.XA CN109558505A (en) 2018-11-21 2018-11-21 Visual search method, apparatus, computer equipment and storage medium
KR1020207035613A KR102440198B1 (en) 2018-11-21 2019-07-01 VIDEO SEARCH METHOD AND APPARATUS, COMPUTER DEVICE, AND STORAGE MEDIUM
EP19886874.7A EP3885934A4 (en) 2018-11-21 2019-07-01 Video search method and apparatus, computer device, and storage medium
JP2020571638A JP7204786B2 (en) 2018-11-21 2019-07-01 Visual search method, device, computer equipment and storage medium
PCT/CN2019/094248 WO2020103462A1 (en) 2018-11-21 2019-07-01 Video search method and apparatus, computer device, and storage medium
US17/041,411 US11348254B2 (en) 2018-11-21 2019-07-01 Visual search method, computer device, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811392516.XA CN109558505A (en) 2018-11-21 2018-11-21 Visual search method, apparatus, computer equipment and storage medium

Publications (1)

Publication Number Publication Date
CN109558505A true CN109558505A (en) 2019-04-02

Family

ID=65867026

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811392516.XA Pending CN109558505A (en) 2018-11-21 2018-11-21 Visual search method, apparatus, computer equipment and storage medium

Country Status (6)

Country Link
US (1) US11348254B2 (en)
EP (1) EP3885934A4 (en)
JP (1) JP7204786B2 (en)
KR (1) KR102440198B1 (en)
CN (1) CN109558505A (en)
WO (1) WO2020103462A1 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110062272A (en) * 2019-04-30 2019-07-26 腾讯科技(深圳)有限公司 A kind of video data handling procedure and relevant apparatus
CN111008305A (en) * 2019-11-29 2020-04-14 百度在线网络技术(北京)有限公司 Visual search method and device and electronic equipment
WO2020103462A1 (en) * 2018-11-21 2020-05-28 百度在线网络技术(北京)有限公司 Video search method and apparatus, computer device, and storage medium

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104574445A (en) * 2015-01-23 2015-04-29 北京航空航天大学 Target tracking method and device
CN106683110A (en) * 2015-11-09 2017-05-17 展讯通信(天津)有限公司 User terminal and object tracking method and device thereof
CN108053427A (en) * 2017-10-31 2018-05-18 深圳大学 A kind of modified multi-object tracking method, system and device based on KCF and Kalman
CN108154159A (en) * 2017-12-25 2018-06-12 北京航空航天大学 A kind of method for tracking target with automatic recovery ability based on Multistage Detector
CN108230353A (en) * 2017-03-03 2018-06-29 北京市商汤科技开发有限公司 Method for tracking target, system and electronic equipment
CN108665476A (en) * 2017-03-31 2018-10-16 华为数字技术(苏州)有限公司 A kind of pedestrian tracting method and electronic equipment

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH05205052A (en) * 1992-01-23 1993-08-13 Matsushita Electric Ind Co Ltd Automatic tracking device
US20040125877A1 (en) * 2000-07-17 2004-07-01 Shin-Fu Chang Method and system for indexing and content-based adaptive streaming of digital video content
KR101599871B1 (en) * 2009-02-11 2016-03-04 삼성전자주식회사 Photographing apparatus and photographing method
JP5258651B2 (en) * 2009-03-25 2013-08-07 株式会社東芝 Object detection apparatus, object detection method, and program
JP2010231254A (en) * 2009-03-25 2010-10-14 Fujifilm Corp Image analyzing device, method of analyzing image, and program
JP5208893B2 (en) * 2009-09-14 2013-06-12 セコム株式会社 Moving object tracking device
US8744125B2 (en) * 2011-12-28 2014-06-03 Pelco, Inc. Clustering-based object classification
JP2016207140A (en) * 2015-04-28 2016-12-08 Kddi株式会社 Video analysis device, video analysis method, and program
CN107563256A (en) * 2016-06-30 2018-01-09 北京旷视科技有限公司 Aid in driving information production method and device, DAS (Driver Assistant System)
WO2018048353A1 (en) 2016-09-09 2018-03-15 Nanyang Technological University Simultaneous localization and mapping methods and apparatus
EP3312762B1 (en) * 2016-10-18 2023-03-01 Axis AB Method and system for tracking an object in a defined area
US10628961B2 (en) * 2017-10-13 2020-04-21 Qualcomm Incorporated Object tracking for neural network systems
CN108764338B (en) * 2018-05-28 2021-05-04 上海应用技术大学 Pedestrian tracking method applied to video analysis
CN108810616B (en) * 2018-05-31 2019-06-14 广州虎牙信息科技有限公司 Object localization method, image display method, device, equipment and storage medium
CN108830246B (en) * 2018-06-25 2022-02-15 中南大学 Multi-dimensional motion feature visual extraction method for pedestrians in traffic environment
US10726264B2 (en) * 2018-06-25 2020-07-28 Microsoft Technology Licensing, Llc Object-based localization
CN109558505A (en) * 2018-11-21 2019-04-02 百度在线网络技术(北京)有限公司 Visual search method, apparatus, computer equipment and storage medium

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN104574445A (en) * 2015-01-23 2015-04-29 北京航空航天大学 Target tracking method and device
CN106683110A (en) * 2015-11-09 2017-05-17 展讯通信(天津)有限公司 User terminal and object tracking method and device thereof
CN108230353A (en) * 2017-03-03 2018-06-29 北京市商汤科技开发有限公司 Method for tracking target, system and electronic equipment
CN108665476A (en) * 2017-03-31 2018-10-16 华为数字技术(苏州)有限公司 A kind of pedestrian tracting method and electronic equipment
CN108053427A (en) * 2017-10-31 2018-05-18 深圳大学 A kind of modified multi-object tracking method, system and device based on KCF and Kalman
CN108154159A (en) * 2017-12-25 2018-06-12 北京航空航天大学 A kind of method for tracking target with automatic recovery ability based on Multistage Detector

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
朱明敏 et al. (ZHU, Mingmin et al.): "Long-term visual target tracking method based on correlation filters", 《计算机应用》 (Journal of Computer Applications) *

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020103462A1 (en) * 2018-11-21 2020-05-28 百度在线网络技术(北京)有限公司 Video search method and apparatus, computer device, and storage medium
CN110062272A (en) * 2019-04-30 2019-07-26 腾讯科技(深圳)有限公司 A kind of video data handling procedure and relevant apparatus
WO2020220968A1 (en) * 2019-04-30 2020-11-05 腾讯科技(深圳)有限公司 Video data processing method and related device
CN110062272B (en) * 2019-04-30 2021-09-28 腾讯科技(深圳)有限公司 Video data processing method and related device
US11900614B2 (en) 2019-04-30 2024-02-13 Tencent Technology (Shenzhen) Company Limited Video data processing method and related apparatus
CN111008305A (en) * 2019-11-29 2020-04-14 百度在线网络技术(北京)有限公司 Visual search method and device and electronic equipment
US11704813B2 (en) 2019-11-29 2023-07-18 Baidu Online Network Technology (Beijing) Co., Ltd. Visual search method, visual search device and electrical device

Also Published As

Publication number Publication date
KR20210008075A (en) 2021-01-20
EP3885934A1 (en) 2021-09-29
KR102440198B1 (en) 2022-09-02
EP3885934A4 (en) 2022-08-24
JP7204786B2 (en) 2023-01-16
US20210012511A1 (en) 2021-01-14
US11348254B2 (en) 2022-05-31
WO2020103462A1 (en) 2020-05-28
JP2021528767A (en) 2021-10-21

Similar Documents

Publication Publication Date Title
Huh et al. Fighting fake news: Image splice detection via learned self-consistency
CN107545241B (en) Neural network model training and living body detection method, device and storage medium
US9740967B2 (en) Method and apparatus of determining air quality
US10970334B2 (en) Navigating video scenes using cognitive insights
CN109284729B (en) Method, device and medium for acquiring face recognition model training data based on video
US8879894B2 (en) Pixel analysis and frame alignment for background frames
US20230376527A1 (en) Generating congruous metadata for multimedia
CN109214238A (en) Multi-object tracking method, device, equipment and storage medium
CN109948542A (en) Gesture identification method, device, electronic equipment and storage medium
Wang et al. A benchmark for clothes variation in person re‐identification
US11087137B2 (en) Methods and systems for identification and augmentation of video content
CN109558505A (en) Visual search method, apparatus, computer equipment and storage medium
CN109948450A (en) A kind of user behavior detection method, device and storage medium based on image
CN105608209A (en) Video labeling method and video labeling device
CN109902658A (en) Pedestrian's characteristic recognition method, device, computer equipment and storage medium
CN113255516A (en) Living body detection method and device and electronic equipment
CN112686122B (en) Human body and shadow detection method and device, electronic equipment and storage medium
Shuai et al. Large scale real-world multi-person tracking
CN110874554A (en) Action recognition method, terminal device, server, system and storage medium
CN113743160A (en) Method, apparatus and storage medium for biopsy
CN111753618A (en) Image recognition method and device, computer equipment and computer readable storage medium
CN111818364B (en) Video fusion method, system, device and medium
CN111819567A (en) Method and apparatus for matching images using semantic features
JP2018045517A (en) Application device, application method, and application program
CN106650252A (en) Data processing method and device

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination