CN106682092A - Target retrieval method and terminal - Google Patents
Target retrieval method and terminal
- Publication number
- CN106682092A (application number CN201611075914.XA)
- Authority
- CN
- China
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/51—Indexing; Data structures therefor; Storage structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/50—Information retrieval; Database structures therefor; File system structures therefor of still image data
- G06F16/58—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
- G06F16/583—Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Library & Information Science (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Artificial Intelligence (AREA)
- Bioinformatics & Cheminformatics (AREA)
- Bioinformatics & Computational Biology (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Evolutionary Biology (AREA)
- Evolutionary Computation (AREA)
- Software Systems (AREA)
- Image Analysis (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a target retrieval method and a terminal. The method comprises: performing target detection on P images to be processed to obtain M candidate frames, where P and M are integers greater than 1; performing target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1; performing feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1; encoding the K features using a locality-sensitive hashing algorithm to obtain K codes; calculating Hamming distances according to the K codes and the P images to be processed to obtain P Hamming distance values; and retaining the image to be processed that corresponds to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold. The method can improve the speed and accuracy of target retrieval.
Description
Technical field
The present invention relates to the technical field of video surveillance, and in particular to a target retrieval method and a terminal.
Background art
Target image retrieval based on computer vision is one of the most challenging research topics in the field of computer vision. Its main idea is to use a computer to simulate human vision in order to describe the features of an image, and to find target images of interest among massive numbers of images according to the described features. Image retrieval is widely used in fields such as web image search, medical image mining, security monitoring and filtering of objectionable images, and it is a focus and a difficulty of interdisciplinary research spanning machine learning, pattern recognition, computer vision and image processing. Because of the complexity and structure of the problem, research on computer-vision-based target image retrieval faces numerous challenges, and the accuracy of image retrieval still needs to be improved.
With the rapid development of computer technology and the spread of various digital devices, the amount of multimedia information in modern society is growing rapidly, and people encounter more and more multimedia information with rich content. In order to extract valuable content from massive collections of information conveniently, quickly and accurately, various image retrieval technologies have gradually become a focus of current research.
In the prior art, traditional image retrieval is mostly text-based or content-based. Text-based retrieval only supports retrieval by keywords that describe an image: images are first annotated manually and are then retrieved by keyword lookup. Although this method is simple and convenient and its search results are relatively accurate, the annotation workload is large and a "semantic gap" exists, so the annotations often fail to reflect the content of an image accurately. Since the early 1990s, researchers have successively proposed content-based image retrieval (CBIR) techniques. CBIR is essentially a fuzzy query technique: it works directly from the visual features of an image, using visual features such as color, texture and shape to retrieve images by their visual content. The shortcomings of CBIR are also obvious: mainly, the feature dimension is high, retrieval is slow, and retrieval precision is not high.
Summary of the invention
Embodiments of the present invention provide a target retrieval method and a terminal, so as to improve the speed and precision of image retrieval.
A first aspect of the embodiments of the present invention provides a target retrieval method, including:
performing target detection on P images to be processed to obtain M candidate frames, where P and M are integers greater than 1;
performing target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1;
performing feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
encoding the K features using a locality-sensitive hashing algorithm to obtain K codes;
calculating Hamming distances according to the K codes and the P images to be processed to obtain P Hamming distance values;
retaining the image to be processed that corresponds to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, performing target detection on the P images to be processed to obtain the M candidate frames includes:
determining a positive sample set and a negative sample set;
training on the positive sample set and the negative sample set to obtain a trained model;
performing localization in the P images to be processed using the trained model to obtain X positioning frames, where X is a positive integer less than P;
removing false positioning frames from the X positioning frames using a non-maximum suppression algorithm to obtain K candidate frames, where K is a positive integer less than X.
Optionally, performing target detection on the M candidate frames includes:
performing target detection on the M candidate frames using a fast neural network algorithm.
Optionally, performing target detection on the M candidate frames using the fast neural network algorithm includes:
performing target detection on the M candidate frames using a fast neural network algorithm based on roi-pooling.
Optionally, calculating Hamming distances according to the K codes and the P images to be processed to obtain the P Hamming distance values includes:
selecting, from the K codes, Q codes corresponding to an image to be processed i, where Q is an integer greater than or equal to 1 and the image to be processed i is one of the P images to be processed;
determining the Hamming distance value of the image to be processed i according to the Q codes and the image to be processed i.
A second aspect of the embodiments of the present invention provides a terminal, including:
a first detection unit, configured to perform target detection on P images to be processed to obtain M candidate frames, where P and M are integers greater than 1;
a second detection unit, configured to perform target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1;
an extraction unit, configured to perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
a coding unit, configured to encode the K features using a locality-sensitive hashing algorithm to obtain K codes;
a computing unit, configured to calculate Hamming distances according to the K codes and the P images to be processed to obtain P Hamming distance values;
a determining unit, configured to retain the image to be processed that corresponds to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, the first detection unit includes:
a first determining module, configured to determine a positive sample set and a negative sample set;
a generation module, configured to train on the positive sample set and the negative sample set to obtain a trained model;
a locating module, configured to perform localization in the P images to be processed using the trained model to obtain X positioning frames, where X is a positive integer less than P;
a processing module, configured to remove false positioning frames from the X positioning frames using a non-maximum suppression algorithm to obtain K candidate frames, where K is a positive integer less than X.
Optionally, the second detection unit is specifically configured to:
perform target detection on the M candidate frames using a fast neural network algorithm.
Optionally, the second detection unit is specifically configured to:
perform target detection on the M candidate frames using a fast neural network algorithm based on roi-pooling.
Optionally, the computing unit includes:
a selection module, configured to select, from the K codes, Q codes corresponding to an image to be processed i, where Q is an integer greater than or equal to 1 and the image to be processed i is one of the P images to be processed;
a second determining module, configured to determine the Hamming distance value of the image to be processed i according to the Q codes and the image to be processed i.
Implementing the embodiments of the present invention has the following beneficial effects:
According to the embodiments of the present invention, target detection is performed on P images to be processed to obtain M candidate frames, where P and M are integers greater than 1; target detection is performed on the M candidate frames to obtain N targets, where N is an integer greater than 1; feature extraction is performed on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1; the K features are encoded using a locality-sensitive hashing algorithm to obtain K codes; Hamming distances are calculated according to the K codes and the P images to be processed to obtain P Hamming distance values; and the image to be processed that corresponds to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold is retained. In this way, the speed and precision of target retrieval can be improved.
Brief description of the drawings
In order to describe the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and those of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an embodiment of a target retrieval method provided by an embodiment of the present invention;
Fig. 2a is a schematic structural diagram of a first embodiment of a terminal provided by an embodiment of the present invention;
Fig. 2b is a schematic structural diagram of the first detection unit of the terminal described in Fig. 2a, provided by an embodiment of the present invention;
Fig. 2c is a schematic structural diagram of the computing unit of the terminal described in Fig. 2a, provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a second embodiment of a terminal provided by an embodiment of the present invention.
Detailed description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
The terms "first", "second", "third", "fourth" and the like in the specification, claims and drawings of the present invention are used to distinguish different objects, not to describe a particular order. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product or device that contains a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product or device.
Reference herein to an "embodiment" means that a particular feature, structure or characteristic described in connection with the embodiment can be included in at least one embodiment of the present invention. The appearance of this phrase in various places in the specification does not necessarily refer to the same embodiment, nor to an independent or alternative embodiment that is mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein can be combined with other embodiments.
The terminal described in the embodiments of the present invention may include a smartphone (such as an Android phone, an iOS phone or a Windows Phone handset), a tablet computer, a palmtop computer, a notebook computer, a mobile internet device (MID, Mobile Internet Devices), a wearable device or the like. These are only examples and are not exhaustive; the terminal includes but is not limited to the above.
It should be noted that deep learning is a machine learning method, proposed in recent years, that uses deep neural networks with multiple hidden layers to complete learning tasks. In essence, it builds a neural network model with multiple hidden layers and uses a large amount of training data to learn more useful features, thereby improving the accuracy of the model's predictions or classifications. This scheme focuses on feature extraction based on deep learning and on the key technologies of key target detection, image retrieval and encoding of the related features using the extracted deep features, and designs an image retrieval system based on deep learning. The scheme has the advantages of high retrieval precision, large data-processing capacity and fast retrieval speed, and is suitable for deployment in complex environments.
It should be noted that, on the basis of extensive research, the inventors give practical solutions for real-world applications, such as target image retrieval under illumination changes and target image retrieval based on visual consistency, and propose corresponding solutions to the problems of feature extraction, retrieval models and visual saliency in the retrieval process. The embodiments of the present invention focus on the key technology of image feature extraction in deep-learning-based image retrieval and systematically study techniques such as the effective extraction of image features. The content covered mainly includes key target detection in images, target feature extraction, and practical and effective methods for improving retrieval speed.
Referring to Fig. 1, which is a schematic flowchart of a first embodiment of a target retrieval method provided by an embodiment of the present invention. The target retrieval method described in this embodiment includes the following steps:
101. Perform target detection on P images to be processed to obtain M candidate frames, where P and M are integers greater than 1.
An image to be processed may or may not contain a target. When an image to be processed contains a target, at least one candidate frame is obtained after target detection; when an image to be processed does not contain a target, no candidate frame may be obtained from it. All of the P images to be processed may contain targets, or only some of them may. A target may be a person, a vehicle, or an object specified by the user. Target detection is performed on the P images to be processed using a target detection algorithm; assume that M candidate frames are obtained, where P and M are integers greater than 1.
Optionally, in step 101 above, performing target detection on the P images to be processed to obtain the M candidate frames may include the following steps:
11) determining a positive sample set and a negative sample set;
12) training on the positive sample set and the negative sample set to obtain a trained model;
13) performing localization in the P images to be processed using the trained model to obtain X positioning frames, where X is a positive integer less than P;
14) removing false positioning frames from the X positioning frames using a non-maximum suppression algorithm to obtain K candidate frames, where K is a positive integer less than X.
The positive sample set in step 11 may consist of the targets that the user wants to retrieve, for example people, vehicles or dogs, and contains multiple positive samples. The negative sample set consists of scenery other than the targets the user wants to retrieve, and contains multiple negative samples. The more samples the positive and negative sample sets contain, the more accurate the trained model is, but a larger number of positive and negative samples also increases the computational cost of training. Training a classifier on the positive sample set and the negative sample set yields a trained model. The classifier may be a neural network classifier, a support vector machine (Support Vector Machine, SVM) classifier, a genetic algorithm classifier, or the like. In step 13, the trained model may be used to perform localization in the P images to be processed, thereby obtaining X positioning frames. Because the X positioning frames may still contain false positioning frames, in step 14 a non-maximum suppression algorithm is applied to the X positioning frames to remove the false ones, yielding K candidate frames of higher precision.
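The false-positioning-frame removal in step 14 can be illustrated with a minimal non-maximum suppression sketch. The box format `(x1, y1, x2, y2)`, the function names, and the default `iou_threshold` of 0.5 are assumptions made for illustration; the text does not specify them.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(boxes: Sequence[Box], scores: Sequence[float],
                        iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept, highest score first.

    A box is dropped when it overlaps an already kept, higher-scoring box
    by more than iou_threshold (hypothetical default, not from the text).
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept
```

With X positioning frames as input, the returned indices select the surviving candidate frames; a lower `iou_threshold` removes more overlapping frames.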
102. Perform target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1.
Target detection may be performed on the M candidate frames to obtain N targets. M is not necessarily equal to N, because the M candidate frames may still contain false positioning frames, or more than one target may be obtained from a single candidate frame among the M candidate frames.
Optionally, target detection on the M candidate frames may be implemented as follows:
performing target detection on the M candidate frames using a fast neural network algorithm.
Compared with a general neural network algorithm, the fast neural network algorithm offers better processing speed.
Further optionally, performing target detection on the M candidate frames using the fast neural network algorithm includes:
performing target detection on the M candidate frames using a fast neural network algorithm based on roi-pooling.
103. Perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1.
Optionally, multi-target detection is based on a fast neural network algorithm (Fast Neural Network Algorithm, Fast RCNN): the embodiment of the present invention is based on Fast RCNN, whose network structure may take alexnet as its starting point and be adjusted on that basis, that is, the size of the convolution kernels and the number of feature maps are suitably reduced to improve detection speed. Further, the pooling layer after the last convolutional layer in the network architecture is replaced with roi-pooling so that different candidate frames can be used as input, which further improves running speed. Finally, the trained multi-target detection model is saved.
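To make the role of roi-pooling concrete, the following is a minimal single-channel sketch: it max-pools an arbitrary rectangular region of a feature map into a fixed-size grid, which is what lets candidate frames of different sizes feed the same later layers. The plain nested-list feature map, the integer region coordinates with exclusive upper bounds, and the function names are assumptions for illustration; framework implementations differ in detail.

```python
from typing import List, Sequence, Tuple


def ceil_div(a: int, b: int) -> int:
    """Integer ceiling division."""
    return -(-a // b)


def roi_max_pool(feature_map: Sequence[Sequence[float]],
                 roi: Tuple[int, int, int, int],
                 out_h: int, out_w: int) -> List[List[float]]:
    """Max-pool the region roi = (x1, y1, x2, y2) into an out_h x out_w grid.

    x2 and y2 are exclusive, and the region must be at least one cell wide
    and one cell high (assumptions of this sketch).
    """
    x1, y1, x2, y2 = roi
    roi_h, roi_w = y2 - y1, x2 - x1
    pooled: List[List[float]] = []
    for i in range(out_h):
        # Row range of bin i; every bin covers at least one row.
        r0 = y1 + (i * roi_h) // out_h
        r1 = max(r0 + 1, y1 + ceil_div((i + 1) * roi_h, out_h))
        row: List[float] = []
        for j in range(out_w):
            c0 = x1 + (j * roi_w) // out_w
            c1 = max(c0 + 1, x1 + ceil_div((j + 1) * roi_w, out_w))
            row.append(max(feature_map[r][c]
                           for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled
```

Whatever the size of the region, the output always has `out_h` rows and `out_w` columns, so one forward pass over the image can serve many candidate frames.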
When the pre-trained multi-target recognition model is used to perform feature extraction on the N targets, K features are obtained, where K is an integer greater than 1.
104. Encode the K features using a locality-sensitive hashing algorithm to obtain K codes.
The K features are encoded using the locality-sensitive hashing algorithm to obtain K codes, with each feature corresponding to one code.
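The text does not say which locality-sensitive hash family is used. As one possible illustration, the sketch below uses random-hyperplane hashing (the sign of random projections), a common family that produces binary codes suitable for Hamming distance. The function names, the fixed seed and the bit count are hypothetical.

```python
import random
from typing import List, Sequence


def make_hyperplanes(dim: int, n_bits: int, seed: int = 0) -> List[List[float]]:
    """Draw n_bits random hyperplanes in a dim-dimensional feature space."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_encode(feature: Sequence[float],
               hyperplanes: Sequence[Sequence[float]]) -> int:
    """Encode one feature vector as an integer whose bits are projection signs."""
    code = 0
    for plane in hyperplanes:
        dot = sum(f * w for f, w in zip(feature, plane))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code
```

Each of the K features would be passed through `lsh_encode` with the same hyperplanes, giving one code per feature. Features pointing in similar directions tend to agree on most bits, so their codes have a small Hamming distance.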
105. Calculate Hamming distances according to the K codes and the P images to be processed to obtain P Hamming distance values.
Optionally, in step 105 above, calculating Hamming distances according to the K codes and the P images to be processed to obtain the P Hamming distance values may include the following steps:
51) selecting, from the K codes, Q codes corresponding to an image to be processed i, where Q is an integer greater than or equal to 1 and the image to be processed i is one of the P images to be processed;
52) determining the Hamming distance value of the image to be processed i according to the Q codes and the image to be processed i.
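Steps 51 and 52 do not state how the Q codes of an image are combined into a single Hamming distance value. The sketch below assumes that each image is compared against one query code and that the image's value is the smallest distance among its Q codes; both the query code and the minimum rule are assumptions, and the function names are hypothetical.

```python
from typing import Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two binary codes differ."""
    return bin(a ^ b).count("1")


def image_hamming_value(query_code: int, image_codes: Sequence[int]) -> int:
    """Hamming distance value of one image, given its Q codes (Q >= 1).

    Assumption of this sketch: the image is represented by its closest code.
    """
    if not image_codes:
        raise ValueError("an image needs at least one code (Q >= 1)")
    return min(hamming_distance(query_code, code) for code in image_codes)
```

Applying `image_hamming_value` to each of the P images gives the P Hamming distance values used in the next step.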
106. Retain the image to be processed that corresponds to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, the images to be processed that correspond to Hamming distance values, among the P Hamming distance values, that do not meet the preset threshold may be deleted or marked.
The preset threshold may be set by system default or by the user; for example, the preset threshold may be an empirical value. The preset threshold may also be a range of values.
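The retention rule of step 106 can be sketched as a simple filter. Here "meets the preset threshold" is taken to mean a distance no larger than a single threshold value, or a distance inside a `(low, high)` range when the threshold is a range; this reading, the dictionary layout and the function name are assumptions.

```python
from typing import Dict, List, Tuple, Union

Threshold = Union[int, Tuple[int, int]]


def retain_images(distances: Dict[str, int], threshold: Threshold) -> List[str]:
    """Return the ids of images whose Hamming distance value meets the threshold.

    threshold is either a maximum distance or an inclusive (low, high) range.
    """
    if isinstance(threshold, tuple):
        low, high = threshold
    else:
        low, high = 0, threshold
    return [image_id for image_id, d in distances.items() if low <= d <= high]
```

Images not returned by `retain_images` are the ones that could be deleted or marked, as the optional step describes.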
The following illustration uses a traffic application only as an example, specifically as follows:
Step 101 can be used to obtain candidate frames:
Candidate frames for multiple targets are generated using a target detection algorithm. Taking people and vehicles as an example, images containing a vehicle or a person are used as positive samples and images of other natural environments as negative samples for training, generating trained models for vehicles and for people. Each trained model is used for coarse localization in the input training images, and a non-maximum suppression algorithm removes false positioning frames that overlap heavily or have low confidence. After this operation, roughly 500 candidate frames of vehicles and people are generated, which serve as the target candidate frames for deep learning. If the overlap between a candidate frame and a manually annotated rectangle in the image is greater than 0.5, the candidate frame is labeled with the object class (such as person or vehicle); otherwise it is treated as the background class.
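The labeling rule above can be sketched as follows. The text only says "overlap"; this sketch assumes it means intersection over union, and the box format `(x1, y1, x2, y2)`, the annotation layout and the function names are hypothetical.

```python
from typing import Sequence, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)


def overlap_ratio(a: Box, b: Box) -> float:
    """Overlap of two boxes, computed here as intersection over union."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_candidate(candidate: Box,
                    annotations: Sequence[Tuple[Box, str]],
                    threshold: float = 0.5) -> str:
    """Label a candidate frame with the class of its best-overlapping annotation.

    Returns "background" when no annotation overlaps by more than threshold.
    """
    best_label, best_overlap = "background", threshold
    for box, label in annotations:
        overlap = overlap_ratio(candidate, box)
        if overlap > best_overlap:
            best_label, best_overlap = label, overlap
    return best_label
```

The 0.5 default mirrors the value stated in the text; candidate frames labeled this way form the training data for the detection network.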
Step 102 can be used to perform target detection on the candidate frames from step 101:
The embodiment of the present invention may perform target detection on the candidate frames from step 101 based on Fast RCNN, obtaining several targets.
Step 103 can be used to perform feature extraction on the targets from step 102, obtaining several features:
To extract multi-target features accurately, the multiple targets to be retrieved (such as vehicle models of various manufacturers, categories and model years, as well as pedestrians, motorcycles, tricycles and so on) are classified in advance. Based on the caffe framework, the network structure is modified on the basis of googlenet, reducing the number of convolutional layers and the number of feature maps to improve running speed. The model file obtained after training is used as the weight file for extracting the features of the detected target objects. Here, on the basis of many experiments, the features of the last pooling layer are used as the final extracted target features. Finally, the trained multi-target classification model is saved. Feature extraction is performed on the targets based on this multi-target classification model, obtaining several features.
Step 104 can be used to encode the features obtained in step 103, obtaining several codes:
The coding scheme in the embodiment of the present invention is based on a locality-sensitive hashing algorithm. The features extracted in step 103 are taken as the original data. If two data points are adjacent in the original data space, then after the same mapping or projection transformation (referred to here as a hash function) the probability that the two points are still adjacent in the new data space is high, while the probability that non-adjacent data points are mapped to the same bucket is low. After all data in the original data set have been hash-mapped, a hash table is obtained: the original data are distributed among the buckets of the hash table, each bucket holds some of the original data, and data in the same bucket are very likely to be adjacent. In this way, the hash function divides the original data set into multiple subsets.
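The partition of the data set into buckets can be illustrated with a small hash table. The sketch assumes that each item already has an integer binary code and that the leading bits of the code serve as the bucket key; the key choice, parameter names and function name are assumptions, not details from the text.

```python
from collections import defaultdict
from typing import Dict, List


def build_hash_table(codes: Dict[str, int], n_bits: int,
                     prefix_bits: int) -> Dict[int, List[str]]:
    """Group item ids into buckets keyed by the first prefix_bits of each code."""
    table: Dict[int, List[str]] = defaultdict(list)
    for item_id, code in codes.items():
        bucket = code >> (n_bits - prefix_bits)
        table[bucket].append(item_id)
    return dict(table)
```

A query then only needs to be compared with the items in its own bucket, which is where the speed gain of hashing comes from; using fewer prefix bits gives larger buckets and fewer missed neighbors.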
Step 105 can be used to calculate the Hamming distance between the codes and the images to be processed, and step 106 is used to select, according to the Hamming distance, the images to be processed that contain the target.
Specifically, assume that a large number of images have undergone the above operations and been saved in a specified database, referred to here as the warehousing operation. A query image is input (an image to be processed that contains a trained target object), and a frame is drawn around the target object to be retrieved, indicating that objects of the same type as the one in the frame are to be retrieved from the library. The multi-target classification model is first used to extract features from the target in the frame, and the extracted features are encoded with the locality-sensitive hash function. The data in the library are then fetched, the distance between the query data and the library data is calculated, and the target objects corresponding to data within the set threshold range are returned as the final retrieval result.
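The warehousing and query flow described above can be sketched as a small in-memory library. The class name, its methods and the use of a single maximum distance as the threshold are hypothetical; a real deployment would store the codes in the specified database rather than in a dictionary.

```python
from typing import Dict, List, Tuple


class CodeLibrary:
    """In-memory stand-in for the database of warehoused target codes."""

    def __init__(self) -> None:
        self._codes: Dict[str, int] = {}

    def add(self, object_id: str, code: int) -> None:
        """Warehousing operation: store the binary code of one target object."""
        self._codes[object_id] = code

    def search(self, query_code: int, max_distance: int) -> List[Tuple[str, int]]:
        """Return (object_id, distance) pairs within max_distance, nearest first."""
        hits = []
        for object_id, code in self._codes.items():
            distance = bin(query_code ^ code).count("1")  # Hamming distance
            if distance <= max_distance:
                hits.append((object_id, distance))
        return sorted(hits, key=lambda hit: hit[1])
```

The query code would come from running the framed target through the classification model and the hash function; `search` then returns the stored objects whose codes fall within the chosen threshold.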
According to the embodiments of the present invention, target detection is performed on P images to be processed to obtain M candidate frames, where P and M are integers greater than 1; target detection is performed on the M candidate frames to obtain N targets, where N is an integer greater than 1; feature extraction is performed on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1; the K features are encoded using a locality-sensitive hashing algorithm to obtain K codes; Hamming distances are calculated according to the K codes and the P images to be processed to obtain P Hamming distance values; and the image to be processed that corresponds to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold is retained. In this way, the speed and precision of image retrieval can be improved.
Consistent with the above, the following is a device for implementing the above target retrieval method, specifically as follows:
Referring to Fig. 2a, which is a schematic structural diagram of a first embodiment of a terminal provided by an embodiment of the present invention. The terminal described in this embodiment includes a first detection unit 201, a second detection unit 202, an extraction unit 203, a coding unit 204, a computing unit 205 and a determining unit 206, specifically as follows:
The first detection unit 201 is configured to perform target detection on P images to be processed to obtain M candidate frames, where P and M are integers greater than 1;
the second detection unit 202 is configured to perform target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1;
the extraction unit 203 is configured to perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
the coding unit 204 is configured to encode the K features using a locality-sensitive hashing algorithm to obtain K codes;
the computing unit 205 is configured to calculate Hamming distances according to the K codes and the P images to be processed to obtain P Hamming distance values;
the determining unit 206 is configured to retain the image to be processed that corresponds to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, as shown in Fig. 2b, which is a detailed structure of the first detection unit 201 of the terminal described in Fig. 2a, the first detection unit 201 includes: a first determining module 2011, a generation module 2012, a locating module 2013 and a processing module 2014, specifically as follows:
The first determining module 2011 is configured to determine a positive sample set and a negative sample set;
The generation module 2012 is configured to train on the positive sample set and the negative sample set to obtain a trained model;
The locating module 2013 is configured to use the trained model to perform positioning in the P images to be processed to obtain X positioning boxes, where X is a positive integer less than P;
The processing module 2014 is configured to perform pseudo-positioning-box processing on the X positioning boxes using a non-maximum suppression algorithm to obtain the K candidate boxes, where K is a positive integer less than X.
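The non-maximum suppression step of the processing module 2014 can be sketched as follows. This is a minimal greedy version under stated assumptions: boxes are `(x1, y1, x2, y2)` tuples, each positioning box carries a confidence score, and the IoU threshold of 0.5 is illustrative. None of these details are specified in the text.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed layout


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: List[Box], scores: List[float], iou_threshold: float = 0.5) -> List[int]:
    """Return indices of the boxes kept, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with a better box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


positioning_boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (100, 100, 140, 140)]
box_scores = [0.9, 0.8, 0.7]
kept = nms(positioning_boxes, box_scores)  # [0, 2]
```

The second box overlaps the first with an IoU of about 0.82, so it is treated as a pseudo positioning box and dropped, leaving fewer candidate boxes than positioning boxes.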
Optionally, the second detection unit 202 is specifically configured to:
perform target detection on the M candidate boxes using a fast neural network algorithm.
Further optionally, the second detection unit 202 is specifically configured to:
perform target detection on the M candidate boxes using a fast neural network algorithm based on the roi-pooling function.
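The text names the roi-pooling function without describing it. As a hedged illustration, the sketch below shows region-of-interest max pooling in the style commonly used by Fast R-CNN-type detectors: a candidate box of arbitrary size on a feature map is divided into a fixed grid and each cell is reduced to its maximum. The function name `roi_max_pool`, the box layout and the bin-boundary rule are assumptions, not the patent's definition.

```python
from typing import List, Tuple


def roi_max_pool(
    feature_map: List[List[float]],
    roi: Tuple[int, int, int, int],
    out_h: int,
    out_w: int,
) -> List[List[float]]:
    """Max-pool the region roi = (x1, y1, x2, y2), end-exclusive, to out_h x out_w."""
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    pooled: List[List[float]] = []
    for i in range(out_h):
        y_start = y1 + (i * h) // out_h            # floor of the bin start
        y_end = y1 + -(-(i + 1) * h // out_h)      # ceil of the bin end
        row: List[float] = []
        for j in range(out_w):
            x_start = x1 + (j * w) // out_w
            x_end = x1 + -(-(j + 1) * w // out_w)
            row.append(
                max(
                    feature_map[y][x]
                    for y in range(y_start, y_end)
                    for x in range(x_start, x_end)
                )
            )
        pooled.append(row)
    return pooled


# Toy 4x4 feature map with values 0..15 in row-major order.
fmap = [[float(r * 4 + c) for c in range(4)] for r in range(4)]
full = roi_max_pool(fmap, (0, 0, 4, 4), 2, 2)     # [[5.0, 7.0], [13.0, 15.0]]
partial = roi_max_pool(fmap, (1, 1, 4, 4), 2, 2)  # [[10.0, 11.0], [14.0, 15.0]]
```

Whatever the size of the candidate box, the output has the same fixed shape, which lets a single network head process all M candidate boxes.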
Optionally, as shown in Fig. 2c, which is a detailed structure of the computing unit 205 of the terminal described in Fig. 2a, the computing unit 205 includes: a choosing module 2051 and a second determining module 2052, specifically as follows:
The choosing module 2051 is configured to choose, from the K codes, Q codes corresponding to an image to be processed i, where Q is an integer greater than or equal to 1 and the image to be processed i is one of the P images to be processed;
The second determining module 2052 is configured to determine the Hamming distance value of the image to be processed i according to the Q codes and the image to be processed i.
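The text does not say how the Q codes of an image are reduced to a single Hamming distance value. One possible reading, sketched below purely as an assumption, is that each code is compared with the code of a query target and the smallest distance is taken as the value for image i. The names `query_code`, `image_distance` and the 8-bit toy codes are hypothetical.

```python
from typing import Dict, List


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def image_distance(query_code: int, codes_of_image: List[int]) -> int:
    """Assumed rule: an image is as close as its best-matching target code."""
    return min(hamming_distance(query_code, c) for c in codes_of_image)


# Toy data (assumed): image 0 contains Q = 2 targets, image 1 contains Q = 1.
codes_by_image: Dict[str, List[int]] = {
    "img_0": [0b10110010, 0b00001111],
    "img_1": [0b01001101],
}
query_code = 0b10110000
distances = {
    name: image_distance(query_code, codes)
    for name, codes in codes_by_image.items()
}  # {"img_0": 1, "img_1": 7}
```

Taking the minimum means an image matches if any one of its detected targets matches the query; an average or a maximum would be equally valid choices under a different reading.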
With the terminal described in the embodiment of the present invention, target detection can be performed on P images to be processed to obtain M candidate boxes, where P and M are both integers greater than 1; target detection is performed on the M candidate boxes to obtain N targets, where N is an integer greater than 1; feature extraction is performed on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1; the K features are encoded using a locality-sensitive hashing algorithm to obtain K codes; Hamming distances are calculated according to the K codes and the P images to be processed to obtain P Hamming distance values; and the images to be processed that correspond to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold are retained. In this way, the speed and precision of image retrieval can be improved.
Consistent with the above, refer to Fig. 3, which is a schematic structural diagram of a second embodiment of a terminal provided by an embodiment of the present invention. The terminal described in this embodiment includes: at least one input device 1000; at least one output device 2000; at least one processor 3000, such as a CPU; and a memory 4000. The input device 1000, the output device 2000, the processor 3000 and the memory 4000 are connected by a bus 5000.
The input device 1000 may specifically be a touch panel, a physical button or a mouse.
The output device 2000 may specifically be a display screen.
The memory 4000 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory. The memory 4000 is used to store a set of program code, and the input device 1000, the output device 2000 and the processor 3000 are used to call the program code stored in the memory 4000 to perform the following operations:
The processor 3000 is configured to:
perform target detection on P images to be processed to obtain M candidate boxes, where P and M are both integers greater than 1;
perform target detection on the M candidate boxes to obtain N targets, where N is an integer greater than 1;
perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
encode the K features using a locality-sensitive hashing algorithm to obtain K codes;
calculate Hamming distances according to the K codes and the P images to be processed to obtain P Hamming distance values;
retain the images to be processed that correspond to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, the processor 3000 performing target detection on the P images to be processed to obtain the M candidate boxes includes:
determining a positive sample set and a negative sample set;
training on the positive sample set and the negative sample set to obtain a trained model;
using the trained model to perform positioning in the P images to be processed to obtain X positioning boxes, where X is a positive integer less than P;
performing pseudo-positioning-box processing on the X positioning boxes using a non-maximum suppression algorithm to obtain the K candidate boxes, where K is a positive integer less than X.
Optionally, the processor 3000 performing target detection on the M candidate boxes includes:
performing target detection on the M candidate boxes using a fast neural network algorithm.
Optionally, the processor 3000 performing target detection on the M candidate boxes using a fast neural network algorithm includes:
performing target detection on the M candidate boxes using a fast neural network algorithm based on the roi-pooling function.
Optionally, the processor 3000 calculating Hamming distances according to the K codes and the P images to be processed to obtain the P Hamming distance values includes:
choosing, from the K codes, Q codes corresponding to an image to be processed i, where Q is an integer greater than or equal to 1 and the image to be processed i is one of the P images to be processed;
determining the Hamming distance value of the image to be processed i according to the Q codes and the image to be processed i.
An embodiment of the present invention also provides a computer storage medium. The computer storage medium may store a program which, when executed, performs some or all of the steps of any target retrieval method described in the above method embodiments.
Although the present invention is described herein in conjunction with various embodiments, in the course of practicing the claimed invention those skilled in the art can, by studying the drawings, the disclosure and the appended claims, understand and implement other variations of the disclosed embodiments. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfil several functions recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not mean that these measures cannot be combined to good effect.
Those skilled in the art will understand that embodiments of the present invention may be provided as a method, an apparatus (device) or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Moreover, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, optical storage, etc.) that contain computer-usable program code. The computer program may be stored in or distributed on a suitable medium, provided together with other hardware or as part of the hardware, and may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
The present invention is described with reference to flowcharts and/or block diagrams of the method, apparatus (device) and computer program product of the embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory capable of directing a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, the instruction means implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps is performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although the present invention is described with reference to specific features and embodiments, it is evident that various modifications and combinations can be made to it without departing from the spirit and scope of the present invention. Accordingly, the specification and drawings are merely an exemplary illustration of the invention as defined by the appended claims, and are deemed to cover any and all modifications, variations, combinations or equivalents within the scope of the invention. Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include them.
Claims (10)
1. A target retrieval method, characterized by comprising:
performing target detection on P images to be processed to obtain M candidate boxes, where P and M are both integers greater than 1;
performing target detection on the M candidate boxes to obtain N targets, where N is an integer greater than 1;
performing feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
encoding the K features using a locality-sensitive hashing algorithm to obtain K codes;
calculating Hamming distances according to the K codes and the P images to be processed to obtain P Hamming distance values;
retaining the images to be processed that correspond to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
2. The method according to claim 1, characterized in that performing target detection on the P images to be processed to obtain the M candidate boxes comprises:
determining a positive sample set and a negative sample set;
training on the positive sample set and the negative sample set to obtain a trained model;
using the trained model to perform positioning in the P images to be processed to obtain X positioning boxes, where X is a positive integer less than P;
performing pseudo-positioning-box processing on the X positioning boxes using a non-maximum suppression algorithm to obtain the K candidate boxes, where K is a positive integer less than X.
3. The method according to any one of claims 1 or 2, characterized in that performing target detection on the M candidate boxes comprises:
performing target detection on the M candidate boxes using a fast neural network algorithm.
4. The method according to claim 3, characterized in that performing target detection on the M candidate boxes using a fast neural network algorithm comprises:
performing target detection on the M candidate boxes using a fast neural network algorithm based on the roi-pooling function.
5. The method according to any one of claims 1 or 2, characterized in that calculating Hamming distances according to the K codes and the P images to be processed to obtain the P Hamming distance values comprises:
choosing, from the K codes, Q codes corresponding to an image to be processed i, where Q is an integer greater than or equal to 1 and the image to be processed i is one of the P images to be processed;
determining the Hamming distance value of the image to be processed i according to the Q codes and the image to be processed i.
6. A terminal, characterized by comprising:
a first detection unit, configured to perform target detection on P images to be processed to obtain M candidate boxes, where P and M are both integers greater than 1;
a second detection unit, configured to perform target detection on the M candidate boxes to obtain N targets, where N is an integer greater than 1;
an extraction unit, configured to perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
a coding unit, configured to encode the K features using a locality-sensitive hashing algorithm to obtain K codes;
a computing unit, configured to calculate Hamming distances according to the K codes and the P images to be processed to obtain P Hamming distance values;
a determining unit, configured to retain the images to be processed that correspond to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
7. The terminal according to claim 6, characterized in that the first detection unit comprises:
a first determining module, configured to determine a positive sample set and a negative sample set;
a generation module, configured to train on the positive sample set and the negative sample set to obtain a trained model;
a locating module, configured to use the trained model to perform positioning in the P images to be processed to obtain X positioning boxes, where X is a positive integer less than P;
a processing module, configured to perform pseudo-positioning-box processing on the X positioning boxes using a non-maximum suppression algorithm to obtain the K candidate boxes, where K is a positive integer less than X.
8. The terminal according to any one of claims 6 or 7, characterized in that the second detection unit is specifically configured to:
perform target detection on the M candidate boxes using a fast neural network algorithm.
9. The terminal according to claim 8, characterized in that the second detection unit is specifically configured to:
perform target detection on the M candidate boxes using a fast neural network algorithm based on the roi-pooling function.
10. The terminal according to any one of claims 6 or 7, characterized in that the computing unit comprises:
a choosing module, configured to choose, from the K codes, Q codes corresponding to an image to be processed i, where Q is an integer greater than or equal to 1 and the image to be processed i is one of the P images to be processed;
a second determining module, configured to determine the Hamming distance value of the image to be processed i according to the Q codes and the image to be processed i.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611075914.XA CN106682092A (en) | 2016-11-29 | 2016-11-29 | Target retrieval method and terminal |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201611075914.XA CN106682092A (en) | 2016-11-29 | 2016-11-29 | Target retrieval method and terminal |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106682092A true CN106682092A (en) | 2017-05-17 |
Family
ID=58867010
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201611075914.XA Pending CN106682092A (en) | 2016-11-29 | 2016-11-29 | Target retrieval method and terminal |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106682092A (en) |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107481188A (en) * | 2017-06-23 | 2017-12-15 | 珠海经济特区远宏科技有限公司 | A kind of image super-resolution reconstructing method |
CN109960742A (en) * | 2019-02-18 | 2019-07-02 | 苏州科达科技股份有限公司 | The searching method and device of local message |
CN110019896A (en) * | 2017-07-28 | 2019-07-16 | 杭州海康威视数字技术股份有限公司 | A kind of image search method, device and electronic equipment |
CN110297931A (en) * | 2019-04-23 | 2019-10-01 | 西北大学 | A kind of image search method |
CN111160110A (en) * | 2019-12-06 | 2020-05-15 | 北京工业大学 | Method and device for identifying anchor based on face features and voice print features |
CN111178146A (en) * | 2019-12-06 | 2020-05-19 | 北京工业大学 | Method and device for identifying anchor based on face features |
CN113679327A (en) * | 2021-10-26 | 2021-11-23 | 青岛美迪康数字工程有限公司 | Endoscopic image acquisition method and device |
CN115052160A (en) * | 2022-04-22 | 2022-09-13 | 江西中烟工业有限责任公司 | Image coding method and device based on cloud data automatic downloading and electronic equipment |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020081026A1 (en) * | 2000-11-07 | 2002-06-27 | Rieko Izume | Image retrieving apparatus |
CN102509305A (en) * | 2011-09-26 | 2012-06-20 | 浙江工业大学 | Animal behavior detection device based on omnidirectional vision |
CN102693311A (en) * | 2012-05-28 | 2012-09-26 | 中国人民解放军信息工程大学 | Target retrieval method based on group of randomized visual vocabularies and context semantic information |
- 2016-11-29: CN201611075914.XA (CN106682092A), status: Pending
Patent Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20020081026A1 (en) * | 2000-11-07 | 2002-06-27 | Rieko Izume | Image retrieving apparatus |
CN102509305A (en) * | 2011-09-26 | 2012-06-20 | 浙江工业大学 | Animal behavior detection device based on omnidirectional vision |
CN102693311A (en) * | 2012-05-28 | 2012-09-26 | 中国人民解放军信息工程大学 | Target retrieval method based on group of randomized visual vocabularies and context semantic information |
Non-Patent Citations (2)
Title |
---|
ROSS GIRSHICK: "Fast R-CNN", 《COMPUTER SCIENCE》 * |
袁勇: "图像检索:基于内容的图像检索技术", 《袁勇的博客》 * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107481188A (en) * | 2017-06-23 | 2017-12-15 | 珠海经济特区远宏科技有限公司 | A kind of image super-resolution reconstructing method |
CN110019896A (en) * | 2017-07-28 | 2019-07-16 | 杭州海康威视数字技术股份有限公司 | A kind of image search method, device and electronic equipment |
CN110019896B (en) * | 2017-07-28 | 2021-08-13 | 杭州海康威视数字技术股份有限公司 | Image retrieval method and device and electronic equipment |
US11586664B2 (en) | 2017-07-28 | 2023-02-21 | Hangzhou Hikvision Digital Technology Co., Ltd. | Image retrieval method and apparatus, and electronic device |
CN109960742A (en) * | 2019-02-18 | 2019-07-02 | 苏州科达科技股份有限公司 | The searching method and device of local message |
CN110297931A (en) * | 2019-04-23 | 2019-10-01 | 西北大学 | A kind of image search method |
CN110297931B (en) * | 2019-04-23 | 2021-12-03 | 西北大学 | Image retrieval method |
CN111160110A (en) * | 2019-12-06 | 2020-05-15 | 北京工业大学 | Method and device for identifying anchor based on face features and voice print features |
CN111178146A (en) * | 2019-12-06 | 2020-05-19 | 北京工业大学 | Method and device for identifying anchor based on face features |
CN113679327A (en) * | 2021-10-26 | 2021-11-23 | 青岛美迪康数字工程有限公司 | Endoscopic image acquisition method and device |
CN115052160A (en) * | 2022-04-22 | 2022-09-13 | 江西中烟工业有限责任公司 | Image coding method and device based on cloud data automatic downloading and electronic equipment |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106682092A (en) | Target retrieval method and terminal | |
CN107885764B (en) | Rapid Hash vehicle retrieval method based on multitask deep learning | |
Arietta et al. | City forensics: Using visual elements to predict non-visual city attributes | |
CN109740415A (en) | Vehicle attribute recognition methods and Related product | |
WO2018210047A1 (en) | Data processing method, data processing apparatus, electronic device and storage medium | |
CN111414888A (en) | Low-resolution face recognition method, system, device and storage medium | |
CN114155284A (en) | Pedestrian tracking method, device, equipment and medium based on multi-target pedestrian scene | |
CN109189970A (en) | Picture similarity comparison method and device | |
Lu et al. | An improved target detection method based on multiscale features fusion | |
CN112215188B (en) | Traffic police gesture recognition method, device, equipment and storage medium | |
CN111753826B (en) | Vehicle and license plate association method, device and electronic system | |
CN109784140A (en) | Driver attributes' recognition methods and Related product | |
Hu et al. | Generalized image recognition algorithm for sign inventory | |
CN117078942B (en) | Context-aware refereed image segmentation method, system, device and storage medium | |
CN113704276A (en) | Map updating method and device, electronic equipment and computer readable storage medium | |
Wan et al. | Improved vision-based method for detection of unauthorized intrusion by construction sites workers | |
Santos et al. | RECOGNIZING AND EXPLORING AZULEJOS ON HISTORIC BUILDINGS’FACADES BY COMBINING COMPUTER VISION AND GEOLOCATION IN MOBILE AUGMENTED REALITY APPLICATIONS | |
Wang et al. | YOLOv5-light: efficient convolutional neural networks for flame detection | |
CN106886783A (en) | A kind of image search method and system based on provincial characteristics | |
CN116977692A (en) | Data processing method, device and computer readable storage medium | |
Wang et al. | A lightweight CNN model based on GhostNet | |
CN114724128A (en) | License plate recognition method, device, equipment and medium | |
CN114707017A (en) | Visual question answering method and device, electronic equipment and storage medium | |
Zhang et al. | Object detection of VisDrone by stronger feature extraction FasterRCNN | |
Kim et al. | Accurate abandoned and removed object classification using hierarchical finite state machine |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20170517 |