CN106682092A - Target retrieval method and terminal - Google Patents

Publication number
CN106682092A
Authority
CN
China
Prior art keywords
pending image
target
hamming distance
candidate frame
obtains
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201611075914.XA
Other languages
Chinese (zh)
Inventor
刘凯
刘青松
吴伟华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
SHENZHEN HARZONE TECHNOLOGY Co Ltd
Original Assignee
SHENZHEN HARZONE TECHNOLOGY Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by SHENZHEN HARZONE TECHNOLOGY Co Ltd
Priority to CN201611075914.XA
Publication of CN106682092A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/51 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/50 Information retrieval; Database structures therefor; File system structures therefor of still image data
    • G06F16/58 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually
    • G06F16/583 Retrieval characterised by using metadata, e.g. metadata not derived from the content or metadata generated manually using metadata automatically derived from the content
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting

Abstract

The invention provides a target retrieval method and a terminal. The method comprises the following steps: performing target detection on P images to be processed to obtain M candidate frames, where P and M are integers greater than 1; performing target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1; performing feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1; encoding the K features using a locality-sensitive hashing algorithm to obtain K codes; calculating Hamming distances according to the K codes and the P images to be processed to obtain P Hamming distance values; and retaining the image to be processed that corresponds to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold. The method can improve the speed and accuracy of target retrieval.

Description

Target retrieval method and terminal
Technical field
The present invention relates to the technical field of video surveillance, and in particular to a target retrieval method and a terminal.
Background
Target image retrieval based on computer vision is one of the most challenging research topics in the field of computer vision. Its main content is to use a computer to simulate human vision in order to describe the features of an image, and then to find target images of interest among massive numbers of images according to the described features. Image retrieval is widely used in fields such as web image search, medical image mining, security monitoring, and filtering of objectionable images, and it is a focus and a difficulty of interdisciplinary research spanning machine learning, pattern recognition, computer vision, and image processing. Owing to its complexity and structure, research on computer-vision-based target image retrieval faces numerous challenges, and the accuracy of image retrieval still needs improvement.
With the rapid development of computer technology and the spread of digital devices, the amount of multimedia information in modern society is growing rapidly, and people encounter more and more multimedia information with rich content. In order to extract valuable content from massive collections of information conveniently, quickly, and accurately, various image retrieval technologies are gradually becoming a focus of current research.
In the prior art, most traditional image retrieval is text-based or content-based. Text-based retrieval only provides retrieval by keywords that describe an image: images are first annotated manually and then retrieved by keyword lookup. Although this method is simple and convenient and its search results are relatively accurate, manual annotation is labor-intensive and there is a "semantic gap", so the annotations often cannot accurately reflect the content of an image. Since the early 1990s, researchers have successively proposed content-based image retrieval techniques. Content-based retrieval is essentially a fuzzy query technique: it starts directly from the visual features of an image, such as color, texture, and shape, and retrieves images by their visual content features. Its shortcomings are also evident: the feature dimension is high, retrieval is slow, and retrieval accuracy is low.
Summary of the invention
Embodiments of the present invention provide a target retrieval method and a terminal, in order to improve the speed and accuracy of image retrieval.
A first aspect of the embodiments of the present invention provides a target retrieval method, including:
performing target detection on P pending images to obtain M candidate frames, where P and M are integers greater than 1;
performing target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1;
performing feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
encoding the K features using a locality-sensitive hashing algorithm to obtain K codes;
calculating Hamming distances according to the K codes and the P pending images to obtain P Hamming distance values;
retaining the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, performing target detection on the P pending images to obtain the M candidate frames includes:
determining a positive sample set and a negative sample set;
training on the positive sample set and the negative sample set to obtain a training model;
locating targets in the P pending images using the training model to obtain X positioning frames, where X is a positive integer less than P;
processing false positioning frames among the X positioning frames using a non-maximum suppression algorithm to obtain K candidate frames, where K is a positive integer less than X.
Optionally, performing target detection on the M candidate frames includes:
performing target detection on the M candidate frames using a fast neural network algorithm.
Optionally, performing target detection on the M candidate frames using the fast neural network algorithm includes:
performing target detection on the M candidate frames using a fast neural network algorithm based on a roi-pooling function.
Optionally, calculating Hamming distances according to the K codes and the P pending images to obtain the P Hamming distance values includes:
selecting, from the K codes, Q codes corresponding to a pending image i, where Q is an integer greater than or equal to 1 and the pending image i is one of the P pending images;
determining the Hamming distance value of the pending image i according to the Q codes and the pending image i.
A second aspect of the embodiments of the present invention provides a terminal, including:
a first detection unit, configured to perform target detection on P pending images to obtain M candidate frames, where P and M are integers greater than 1;
a second detection unit, configured to perform target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1;
an extraction unit, configured to perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
a coding unit, configured to encode the K features using a locality-sensitive hashing algorithm to obtain K codes;
a computing unit, configured to calculate Hamming distances according to the K codes and the P pending images to obtain P Hamming distance values;
a determining unit, configured to retain the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, the first detection unit includes:
a first determining module, configured to determine a positive sample set and a negative sample set;
a generation module, configured to train on the positive sample set and the negative sample set to obtain a training model;
a locating module, configured to locate targets in the P pending images using the training model to obtain X positioning frames, where X is a positive integer less than P;
a processing module, configured to process false positioning frames among the X positioning frames using a non-maximum suppression algorithm to obtain K candidate frames, where K is a positive integer less than X.
Optionally, the second detection unit is specifically configured to:
perform target detection on the M candidate frames using a fast neural network algorithm.
Optionally, the second detection unit is specifically configured to:
perform target detection on the M candidate frames using a fast neural network algorithm based on a roi-pooling function.
Optionally, the computing unit includes:
a selection module, configured to select, from the K codes, Q codes corresponding to a pending image i, where Q is an integer greater than or equal to 1 and the pending image i is one of the P pending images;
a second determining module, configured to determine the Hamming distance value of the pending image i according to the Q codes and the pending image i.
Implementing the embodiments of the present invention has the following beneficial effects:
According to the embodiments of the present invention, target detection is performed on P pending images to obtain M candidate frames, where P and M are integers greater than 1; target detection is performed on the M candidate frames to obtain N targets, where N is an integer greater than 1; feature extraction is performed on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1; the K features are encoded using a locality-sensitive hashing algorithm to obtain K codes; Hamming distances are calculated according to the K codes and the P pending images to obtain P Hamming distance values; and the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold is retained. Thus, the speed and accuracy of target retrieval can be improved.
Brief description of the drawings
In order to illustrate the technical solutions of the embodiments of the present invention more clearly, the accompanying drawings needed for describing the embodiments are briefly introduced below. Obviously, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of an embodiment of a target retrieval method provided by an embodiment of the present invention;
Fig. 2a is a schematic structural diagram of a first embodiment of a terminal provided by an embodiment of the present invention;
Fig. 2b is a schematic structural diagram of the first detection unit of the terminal described in Fig. 2a, provided by an embodiment of the present invention;
Fig. 2c is a schematic structural diagram of the computing unit of the terminal described in Fig. 2a, provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a second embodiment of a terminal provided by an embodiment of the present invention.
Detailed description of the embodiments
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings in the embodiments of the present invention. Obviously, the described embodiments are some, rather than all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
The terms "first", "second", "third", "fourth", and the like in the specification, claims, and drawings of the present invention are used to distinguish different objects, not to describe a particular order. Furthermore, the terms "include" and "have" and any variants thereof are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that contains a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units not listed, or optionally also includes other steps or units inherent to such a process, method, product, or device.
Reference herein to an "embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present invention. The appearance of the phrase in various places in the specification does not necessarily refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. A person skilled in the art understands, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.
The terminal described in the embodiments of the present invention may include a smartphone (such as an Android phone, an iOS phone, or a Windows Phone phone), a tablet computer, a palmtop computer, a notebook computer, a mobile internet device (MID, Mobile Internet Devices), a wearable device, or the like. These are only examples and are not exhaustive; the terminal includes but is not limited to the above.
It should be noted that deep learning is a machine learning method, proposed in recent years, that completes learning tasks using deep neural networks with multiple hidden layers. In essence, it builds a neural network model with multiple hidden layers and uses a large amount of training data to learn more useful features, thereby improving the accuracy of the model's predictions or classifications. This solution focuses on feature extraction based on deep learning, on key target detection and image retrieval using the extracted deep features, and on the key technique of encoding the relevant features, and designs an image retrieval system based on deep learning. The solution offers high retrieval accuracy, a large data processing capacity, and fast retrieval, and is suitable for deployment in complex environments.
It should be noted that, in the embodiments of the present invention, the inventors, on the basis of extensive research, provide practical solutions for real application scenarios, such as target image retrieval under illumination changes and target image retrieval based on visual consistency, and propose corresponding solutions to the problems of feature extraction, the retrieval model, and visual saliency in the retrieval process. The embodiments of the present invention center on the key technique of image feature extraction in deep-learning-based image retrieval and systematically study techniques such as the effective extraction of image features. The content covered mainly includes key target detection in images, target feature extraction, and practical and effective methods for improving retrieval speed.
Refer to Fig. 1, which is a schematic flowchart of a first embodiment of a target retrieval method provided by an embodiment of the present invention. The target retrieval method described in this embodiment includes the following steps:
101. Perform target detection on P pending images to obtain M candidate frames, where P and M are integers greater than 1.
A pending image may or may not contain a target. When a pending image contains a target, at least one candidate frame is obtained after target detection; when a pending image does not contain a target, target detection may yield no candidate frame for it. All of the P pending images may contain targets, or only some of them may. A target may be a person, a vehicle, or an object specified by the user. Target detection is performed on the P pending images using a target detection algorithm; suppose M candidate frames are obtained, where P and M are integers greater than 1.
Optionally, in step 101, performing target detection on the P pending images to obtain the M candidate frames may include the following steps:
11) determining a positive sample set and a negative sample set;
12) training on the positive sample set and the negative sample set to obtain a training model;
13) locating targets in the P pending images using the training model to obtain X positioning frames, where X is a positive integer less than P;
14) processing false positioning frames among the X positioning frames using a non-maximum suppression algorithm to obtain K candidate frames, where K is a positive integer less than X.
In step 11, the positive sample set consists of the targets the user wants to retrieve, for example people, vehicles, or dogs, and contains multiple positive samples. The negative sample set consists of scenery other than the targets the user wants to retrieve, and contains multiple negative samples. The more samples the positive and negative sample sets contain, the more accurate the trained model, but the higher the computational cost of training. Training a classifier on the positive and negative sample sets yields a training model. The classifier may be a neural network classifier, a support vector machine (Support Vector Machine, SVM) classifier, a genetic algorithm classifier, or the like. In step 13, the training model is used to locate targets in the P pending images, yielding X positioning frames. Because the X positioning frames may still contain false positioning frames, in step 14 a non-maximum suppression algorithm is applied to the X positioning frames to process the false positioning frames, yielding K candidate frames of higher precision.
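The non-maximum suppression in step 14 can be illustrated with a minimal sketch. The box layout (x1, y1, x2, y2), the function names, the 0.5 overlap threshold, and the optional minimum score are assumptions made for illustration only; the specification does not fix these details.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); layout is an assumption


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def non_max_suppression(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # assumed value, not taken from the specification
    min_score: float = 0.0,      # assumed confidence floor
) -> List[int]:
    """Return indices of the boxes kept as candidate frames."""
    # Visit boxes from highest to lowest confidence.
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    kept: List[int] = []
    for i in order:
        if scores[i] < min_score:
            continue  # drop low-confidence positioning frames
        # Keep a box only if it does not overlap too much with a better one.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in kept):
            kept.append(i)
    return kept
```

For example, with three positioning frames of which the first two overlap heavily, only the higher-scoring one of the pair and the separate third frame survive.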
102. Perform target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1.
After target detection is performed on the M candidate frames, N targets are obtained. M is not necessarily equal to N, because the M candidate frames may still contain false positioning frames, or target detection on one of the M candidate frames may yield more than one target.
Optionally, target detection on the M candidate frames may be implemented as follows:
performing target detection on the M candidate frames using a fast neural network algorithm.
Compared with a general neural network algorithm, the fast neural network algorithm offers better processing speed.
Further optionally, performing target detection on the M candidate frames using the fast neural network algorithm includes:
performing target detection on the M candidate frames using a fast neural network algorithm based on a roi-pooling function.
103. Perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1.
Optionally, multi-target detection is based on the fast neural network algorithm (Fast Neural Network Algorithm, Fast RCNN). In the embodiments of the present invention, the network structure is derived from AlexNet and adjusted: the size of the convolution kernels and the number of feature maps are suitably reduced to improve detection speed. Further, in the network architecture the pooling layer after the last convolutional layer is replaced with roi-pooling so that different candidate frames can be used as input, which further improves running speed. Finally, the trained multi-target detection model is saved.
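To make the role of roi-pooling concrete, the following is a minimal sketch of max pooling over a region of interest on a single-channel feature map. It only illustrates the general idea that candidate frames of different sizes are reduced to a fixed-size output; the function name, the list-of-lists feature map, and the binning rule are assumptions and are not taken from the specification or from any particular framework.

```python
from typing import List, Tuple


def roi_max_pool(
    feature_map: List[List[float]],
    roi: Tuple[int, int, int, int],  # (x1, y1, x2, y2) in feature-map cells, end-exclusive
    out_h: int,
    out_w: int,
) -> List[List[float]]:
    """Max-pool the region `roi` of a 2D feature map into an out_h x out_w grid."""
    x1, y1, x2, y2 = roi
    h, w = y2 - y1, x2 - x1
    if h <= 0 or w <= 0:
        raise ValueError("roi must have positive height and width")
    pooled = []
    for i in range(out_h):
        # Row bin i spans floor(i*h/out_h) .. ceil((i+1)*h/out_h) inside the roi.
        r0 = y1 + (i * h) // out_h
        r1 = y1 + -(-((i + 1) * h) // out_h)
        row = []
        for j in range(out_w):
            c0 = x1 + (j * w) // out_w
            c1 = x1 + -(-((j + 1) * w) // out_w)
            row.append(max(feature_map[r][c] for r in range(r0, r1) for c in range(c0, c1)))
        pooled.append(row)
    return pooled
```

Because the output grid has a fixed size regardless of the size of the region, candidate frames of different shapes can feed the same later layers, which is the property the paragraph above relies on.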
When feature extraction is performed on the N targets using the pre-trained multi-target recognition model, K features are obtained, where K is an integer greater than 1.
104. Encode the K features using a locality-sensitive hashing algorithm to obtain K codes.
The K features are encoded using the locality-sensitive hashing algorithm to obtain K codes, with each feature corresponding to one code.
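As an illustration of step 104, the sketch below encodes a feature vector into a binary code with random-hyperplane (sign of random projection) hashing. This particular hash family, the function names, the code length, and the seed are assumptions; the specification only states that a locality-sensitive hashing algorithm is used.

```python
import random
from typing import List, Sequence


def make_hyperplanes(dim: int, n_bits: int, seed: int = 0) -> List[List[float]]:
    """Draw n_bits random hyperplane normals of dimension dim (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]


def lsh_encode(feature: Sequence[float], hyperplanes: Sequence[Sequence[float]]) -> int:
    """Encode one feature vector as an integer whose bits are projection signs."""
    code = 0
    for plane in hyperplanes:
        dot = sum(f * p for f, p in zip(feature, plane))
        # One bit per hyperplane: 1 if the feature lies on its non-negative side.
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def encode_features(features: Sequence[Sequence[float]], n_bits: int = 16, seed: int = 0) -> List[int]:
    """Encode K features into K codes, one code per feature."""
    if not features:
        return []
    planes = make_hyperplanes(len(features[0]), n_bits, seed)
    return [lsh_encode(f, planes) for f in features]
```

Under this assumed scheme, features pointing in similar directions tend to agree on most bits, so their codes are close in Hamming distance.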
105. Calculate Hamming distances according to the K codes and the P pending images to obtain P Hamming distance values.
Optionally, in step 105, calculating Hamming distances according to the K codes and the P pending images to obtain the P Hamming distance values may include the following steps:
51) selecting, from the K codes, Q codes corresponding to a pending image i, where Q is an integer greater than or equal to 1 and the pending image i is one of the P pending images;
52) determining the Hamming distance value of the pending image i according to the Q codes and the pending image i.
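Steps 51 and 52 can be sketched as follows. The specification does not state what the Q codes are compared against or how they are combined into a single value per image; the sketch assumes, purely for illustration, that each code is compared with the code of a query target and that the smallest distance is taken as the Hamming distance value of pending image i. All names are hypothetical.

```python
from typing import Sequence


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions in which two integer codes differ."""
    return bin(a ^ b).count("1")


def image_hamming_value(query_code: int, image_codes: Sequence[int]) -> int:
    """Hamming distance value of one pending image, given its Q codes.

    Assumption: the value is the minimum distance between the query code
    and any of the Q codes extracted from the image.
    """
    if not image_codes:
        raise ValueError("an image needs at least one code (Q >= 1)")
    return min(hamming_distance(query_code, c) for c in image_codes)
```

With this choice, an image is considered close to the query if at least one of its detected targets is close; other aggregation rules (for example the mean) would be equally compatible with the text.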
106. Retain the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, the pending images corresponding to Hamming distance values, among the P Hamming distance values, that do not meet the preset threshold may be deleted or marked.
The preset threshold may be set by system default or by the user; for example, it may be an empirical value. The preset threshold may also be a range of values.
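A minimal sketch of the retention in step 106 follows, covering both a single threshold value and a value range. The comparison direction (a smaller distance means a better match, so values at or below the threshold are retained) and all names are assumptions, since the specification only says that the value must meet the preset threshold.

```python
from typing import Dict, List, Tuple, Union

Threshold = Union[int, Tuple[int, int]]  # single value or (low, high) range


def retain_images(distances: Dict[str, int], threshold: Threshold) -> List[str]:
    """Return ids of pending images whose Hamming distance value meets the threshold."""
    if isinstance(threshold, tuple):
        low, high = threshold      # threshold given as a value range
    else:
        low, high = 0, threshold   # assumed: keep distances up to the threshold
    return [image_id for image_id, d in distances.items() if low <= d <= high]
```

Images that are not returned could then be deleted or marked, as described above.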
The following takes a traffic application as an example:
Step 101 can be used to obtain candidate frames:
A target detection algorithm is used to generate candidate frames for multiple targets. Taking people and vehicles as an example, images containing a vehicle or a person serve as positive samples and images of other natural scenes serve as negative samples; training produces training models for vehicles and for people. Each generated training model is used for coarse localization in the input training images, and a non-maximum suppression algorithm removes false positioning frames that have high overlap and low confidence. After this operation, roughly 500 candidate frames for vehicles and people are generated and serve as the target candidate frames for deep learning. If the overlap between a candidate frame and a manually annotated rectangle in the image is greater than 0.5, the candidate frame is labeled with the object class (for example, person or vehicle); otherwise it is treated as background.
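The labeling rule at the end of the paragraph above can be sketched as follows. The overlap measure is assumed to be intersection over union, and the box layout, the function names, and the "background" label string are illustrative assumptions; only the 0.5 threshold comes from the text.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2); layout is an assumption


def overlap_ratio(a: Box, b: Box) -> float:
    """Overlap of two boxes, assumed here to be intersection over union."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def label_candidate(candidate: Box, annotations: List[Tuple[Box, str]], threshold: float = 0.5) -> str:
    """Assign the class of the best-overlapping annotation, or 'background'."""
    best_label, best_overlap = "background", threshold
    for box, label in annotations:
        overlap = overlap_ratio(candidate, box)
        if overlap > best_overlap:  # strictly greater than 0.5, as in the text
            best_label, best_overlap = label, overlap
    return best_label
```

A candidate frame that overlaps an annotated vehicle by more than 0.5 is labeled as that vehicle class, while a frame that overlaps nothing, or overlaps by exactly 0.5, falls into the background class.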
Step 102 can be used to perform target detection on the candidate frames from step 101:
The embodiments of the present invention may perform target detection on the candidate frames from step 101 based on Fast RCNN, obtaining several targets.
Step 103 can be used to perform feature extraction on the targets from step 102, obtaining several features:
To extract multi-target features accurately, the targets to be retrieved (for example vehicle models of various manufacturers, categories, and model years, as well as pedestrians, motorcycles, tricycles, and so on) are classified in advance. Based on the Caffe framework, the network structure is modified from GoogLeNet: the number of convolutional layers and the number of feature maps are reduced to improve running speed. The model file obtained after training serves as the weight file for extracting features of the detected target objects. Based on many experiments, the features of the last pooling layer are taken as the final extracted target features. Finally, the trained multi-target classification model is saved. Feature extraction is performed on the targets based on this multi-target classification model to obtain several features.
Step 104 can be used to encode the features obtained in step 103, obtaining several codes:
The coding scheme in the embodiments of the present invention is derived from the locality-sensitive hashing algorithm. The features extracted in step 103 serve as the original data. After two adjacent data points in the original space undergo the same mapping or projection transformation (referred to here as a hash function), the probability that they remain adjacent in the new data space is high, while the probability that non-adjacent data points are mapped to the same bucket is low. After all data in the original data set have been hash-mapped, a hash table is obtained: the original data are scattered across the buckets of the hash table, each bucket receives some of the original data, and data in the same bucket are very likely to be adjacent. In this way, the hash function divides the original data set into multiple subsets.
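The bucket structure described above can be sketched as a small hash table keyed by part of each code. Using the leading bits of an already computed code as the bucket key is an assumption made for illustration, as are the function names and the example identifiers; the specification does not say how bucket keys are derived.

```python
from collections import defaultdict
from typing import Dict, List


def build_hash_table(codes: Dict[str, int], n_bits: int, prefix_bits: int) -> Dict[int, List[str]]:
    """Scatter items into buckets keyed by the leading prefix_bits of their code."""
    table: Dict[int, List[str]] = defaultdict(list)
    shift = n_bits - prefix_bits
    for item_id, code in codes.items():
        table[code >> shift].append(item_id)  # items sharing a prefix share a bucket
    return dict(table)


def same_bucket(table: Dict[int, List[str]], query_code: int, n_bits: int, prefix_bits: int) -> List[str]:
    """Return the items that fall into the same bucket as the query code."""
    return table.get(query_code >> (n_bits - prefix_bits), [])
```

Only the items in the query's bucket then need a full distance comparison, which is how dividing the data set into subsets can reduce retrieval time.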
Step 105 can be used to calculate the Hamming distance between the codes and the pending images, and step 106 can be used to select, according to the Hamming distance, the pending images that contain the target.
Specifically, suppose that a large number of images have been processed by the above operations and saved in a specified database; this is referred to here as the warehousing operation. A query image is input (a pending image containing a trained target object), and a frame is drawn around the target object to be retrieved, indicating that objects of the same type as the one inside the frame are to be retrieved from the library. The multi-target classification model first extracts features from the target inside the frame, and the extracted features are encoded with the locality-sensitive hash function. The data in the library are then fetched, the distance between the query data and the library data is calculated, and the target objects corresponding to data within the set threshold range are returned as the final retrieval result.
According to the embodiments of the present invention, target detection is performed on P pending images to obtain M candidate frames, where P and M are integers greater than 1; target detection is performed on the M candidate frames to obtain N targets, where N is an integer greater than 1; feature extraction is performed on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1; the K features are encoded using a locality-sensitive hashing algorithm to obtain K codes; Hamming distances are calculated according to the K codes and the P pending images to obtain P Hamming distance values; and the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold is retained. Thus, the speed and accuracy of image retrieval can be improved.
Consistent with the above, the following is a device for implementing the above target retrieval method:
Refer to Fig. 2a, which is a schematic structural diagram of a first embodiment of a terminal provided by an embodiment of the present invention. The terminal described in this embodiment includes a first detection unit 201, a second detection unit 202, an extraction unit 203, a coding unit 204, a computing unit 205, and a determining unit 206, as follows:
the first detection unit 201 is configured to perform target detection on P pending images to obtain M candidate frames, where P and M are integers greater than 1;
the second detection unit 202 is configured to perform target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1;
the extraction unit 203 is configured to perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
the coding unit 204 is configured to encode the K features using a locality-sensitive hashing algorithm to obtain K codes;
the computing unit 205 is configured to calculate Hamming distances according to the K codes and the P pending images to obtain P Hamming distance values;
the determining unit 206 is configured to retain the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, as shown in Fig. 2b, which is a detailed structure of the first detection unit 201 of the terminal described in Fig. 2a, the first detection unit 201 includes a first determining module 2011, a generation module 2012, a locating module 2013, and a processing module 2014, as follows:
The first determining module 2011 is configured to determine a positive sample set and a negative sample set;
The generation module 2012 is configured to train on the positive sample set and the negative sample set to obtain a trained model;
The locating module 2013 is configured to perform positioning in the P pending images using the trained model to obtain X positioning frames, where X is a positive integer less than P;
The processing module 2014 is configured to remove false positioning frames from the X positioning frames using a non-maximum suppression algorithm to obtain the K candidate frames, where K is a positive integer less than X.
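The false-positioning-frame removal can be sketched with a standard greedy non-maximum suppression. This is a minimal illustration, assuming each positioning frame is an axis-aligned box `(x1, y1, x2, y2)` with a confidence score; the box format, the score list, and the 0.5 overlap threshold are assumptions not stated in the text.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def non_max_suppression(boxes, scores, iou_threshold=0.5):
    """Return indices of the boxes kept, highest score first.

    A box is dropped when it overlaps an already kept, higher-scoring
    box by more than iou_threshold.
    """
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep

# Example: the second box heavily overlaps the first and is suppressed.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (50, 50, 60, 60)]
scores = [0.9, 0.8, 0.7]
kept = non_max_suppression(boxes, scores)
```

The number of boxes kept is never larger than the number of input positioning frames, which matches the statement that fewer candidate frames remain than positioning frames.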
Optionally, the second detection unit 202 is specifically configured to:
perform target detection on the M candidate frames using a fast neural network algorithm.
Further optionally, the second detection unit 202 is specifically configured to:
perform target detection on the M candidate frames using a fast neural network algorithm based on a RoI pooling (roi-pooling) function.
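The text does not define the roi-pooling function. Assuming it refers to the region-of-interest max pooling commonly used in Fast R-CNN style detectors, the idea can be sketched as follows: a candidate region of arbitrary size on a feature map is divided into a fixed grid of bins, and the maximum of each bin is taken, so every candidate frame yields a feature of the same size. The function name, the half-open `(row0, col0, row1, col1)` region format, and the 2x2 output size are illustrative assumptions.

```python
def roi_max_pool(feature_map, roi, out_h=2, out_w=2):
    """Max-pool the region roi of a 2D feature map into an out_h x out_w grid.

    feature_map: list of rows (lists of numbers).
    roi: (row0, col0, row1, col1), half-open, in feature-map cells.
    """
    row0, col0, row1, col1 = roi
    h, w = row1 - row0, col1 - col0
    if h <= 0 or w <= 0:
        raise ValueError("roi must cover at least one cell")
    pooled = []
    for i in range(out_h):
        # Bin boundaries: floor for the start, ceil for the end,
        # so every bin contains at least one cell.
        r_start = row0 + (i * h) // out_h
        r_end = row0 + -((-(i + 1) * h) // out_h)
        row = []
        for j in range(out_w):
            c_start = col0 + (j * w) // out_w
            c_end = col0 + -((-(j + 1) * w) // out_w)
            row.append(max(feature_map[r][c]
                           for r in range(r_start, r_end)
                           for c in range(c_start, c_end)))
        pooled.append(row)
    return pooled

# Example: a 4x4 feature map pooled to 2x2 over two different regions.
fmap = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11], [12, 13, 14, 15]]
full = roi_max_pool(fmap, (0, 0, 4, 4))
part = roi_max_pool(fmap, (1, 1, 4, 4))
```

Both regions produce a 2x2 output even though their sizes differ, which is the property that lets candidate frames of different sizes share the same downstream layers.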
Optionally, as shown in Fig. 2c, which is a detailed structure of the computing unit 205 of the terminal described in Fig. 2a, the computing unit 205 includes a choosing module 2051 and a second determining module 2052, as follows:
The choosing module 2051 is configured to choose, from the K codes, Q codes corresponding to a pending image i, where Q is an integer greater than or equal to 1 and the pending image i is one of the P pending images;
The second determining module 2052 is configured to determine the Hamming distance value of the pending image i according to the Q codes and the pending image i.
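This passage does not state how the Q codes of pending image i are reduced to a single Hamming distance value. One possible reading, shown here purely as an assumption, is that each of the Q codes is compared with the code of a query target and the smallest distance is taken as the value for the image. The query code, the integer representation of codes, and both function names are hypothetical.

```python
def hamming_distance(a, b):
    """Number of differing bits between two equal-length binary codes stored as ints."""
    return bin(a ^ b).count("1")

def image_hamming_value(query_code, image_codes):
    """Hamming distance value of one image (assumed rule: best match among its Q codes)."""
    if not image_codes:
        raise ValueError("an image needs at least one code (Q >= 1)")
    return min(hamming_distance(query_code, c) for c in image_codes)

# Example: an image with Q = 3 codes compared against a 4-bit query code.
value = image_hamming_value(0b1111, [0b0000, 0b1110, 0b1000])
```

Other aggregation rules (for example a mean over the Q codes) would fit the wording equally well; the minimum is chosen only to keep the sketch short.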
With the terminal described in the embodiments of the present invention, target detection can be performed on P pending images to obtain M candidate frames, where P and M are both integers greater than 1; target detection is performed on the M candidate frames to obtain N targets, where N is an integer greater than 1; feature extraction is performed on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1; the K features are encoded using a locality-sensitive hashing algorithm to obtain K codes; Hamming distances are calculated according to the K codes and the P pending images to obtain P Hamming distance values; and the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold is retained. In this way, the speed and accuracy of image retrieval can be improved.
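The final retention step can be sketched as a simple filter over the P Hamming distance values. The text says only that a value must "meet" the preset threshold; the sketch assumes this means the distance is less than or equal to the threshold, and it additionally sorts the retained images by distance, which the text does not require. The function name and the file names are illustrative.

```python
def retain_images(distance_by_image, threshold):
    """Keep the images whose Hamming distance value is within the threshold.

    distance_by_image: dict mapping an image identifier to its Hamming distance value.
    Returns the retained identifiers, closest match first (ordering is an assumption).
    """
    kept = [(image, d) for image, d in distance_by_image.items() if d <= threshold]
    kept.sort(key=lambda pair: pair[1])
    return [image for image, _ in kept]

# Example: with threshold 5, only the two closest images are retained.
result = retain_images({"a.jpg": 3, "b.jpg": 12, "c.jpg": 1}, threshold=5)
```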
Consistent with the above, refer to Fig. 3, which is a schematic structural diagram of a second embodiment of a terminal provided by an embodiment of the present invention. The terminal described in this embodiment includes: at least one input device 1000; at least one output device 2000; at least one processor 3000, such as a CPU; and a memory 4000. The input device 1000, the output device 2000, the processor 3000, and the memory 4000 are connected via a bus 5000.
The input device 1000 may specifically be a touch panel, a physical button, or a mouse.
The output device 2000 may specifically be a display screen.
The memory 4000 may be a high-speed RAM memory or a non-volatile memory, such as a magnetic disk memory. The memory 4000 is used to store a set of program code, and the input device 1000, the output device 2000, and the processor 3000 are used to call the program code stored in the memory 4000 to perform the following operations:
The processor 3000 is configured to:
perform target detection on P pending images to obtain M candidate frames, where P and M are both integers greater than 1;
perform target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1;
perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
encode the K features using a locality-sensitive hashing algorithm to obtain K codes;
calculate Hamming distances according to the K codes and the P pending images to obtain P Hamming distance values;
retain the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
Optionally, the processor 3000 performing target detection on the P pending images to obtain the M candidate frames includes:
determining a positive sample set and a negative sample set;
training on the positive sample set and the negative sample set to obtain a trained model;
performing positioning in the P pending images using the trained model to obtain X positioning frames, where X is a positive integer less than P;
removing false positioning frames from the X positioning frames using a non-maximum suppression algorithm to obtain the K candidate frames, where K is a positive integer less than X.
Optionally, the processor 3000 performing target detection on the M candidate frames includes:
performing target detection on the M candidate frames using a fast neural network algorithm.
Optionally, the processor 3000 performing target detection on the M candidate frames using the fast neural network algorithm includes:
performing target detection on the M candidate frames using a fast neural network algorithm based on a RoI pooling (roi-pooling) function.
Optionally, the processor 3000 calculating Hamming distances according to the K codes and the P pending images to obtain the P Hamming distance values includes:
choosing, from the K codes, Q codes corresponding to a pending image i, where Q is an integer greater than or equal to 1 and the pending image i is one of the P pending images;
determining the Hamming distance value of the pending image i according to the Q codes and the pending image i.
An embodiment of the present invention further provides a computer storage medium, where the computer storage medium may store a program, and the program, when executed, performs some or all of the steps of any target retrieval method described in the above method embodiments.
Although the present invention has been described herein in connection with various embodiments, in practicing the claimed invention, those skilled in the art can, by studying the drawings, the disclosure, and the appended claims, understand and effect other variations of the disclosed embodiments. In the claims, the word "comprising" does not exclude other components or steps, and "a" or "an" does not exclude a plurality. A single processor or other unit may fulfil several functions recited in the claims. The mere fact that certain measures are recited in mutually different dependent claims does not indicate that a combination of these measures cannot be used to advantage.
Those skilled in the art will appreciate that embodiments of the present invention may be provided as a method, an apparatus (device), or a computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment, or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product implemented on one or more computer-usable storage media (including but not limited to magnetic disk storage, CD-ROM, and optical storage) that contain computer-usable program code. The computer program may be stored or distributed in a suitable medium, supplied together with or as part of other hardware, and may also be distributed in other forms, such as via the Internet or other wired or wireless telecommunication systems.
The present invention is described with reference to flowcharts and/or block diagrams of the method, apparatus (device), and computer program product according to embodiments of the present invention. It should be understood that each flow and/or block in the flowcharts and/or block diagrams, and combinations of flows and/or blocks in the flowcharts and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general-purpose computer, a special-purpose computer, an embedded processor, or another programmable data processing device to produce a machine, such that the instructions executed by the processor of the computer or other programmable data processing device produce means for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing device to operate in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means, and the instruction means implement the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
These computer program instructions may also be loaded onto a computer or other programmable data processing device, so that a series of operational steps are performed on the computer or other programmable device to produce a computer-implemented process, whereby the instructions executed on the computer or other programmable device provide steps for implementing the functions specified in one or more flows of the flowcharts and/or one or more blocks of the block diagrams.
Although the present invention has been described with reference to specific features and embodiments, it is evident that various modifications and combinations can be made without departing from the spirit and scope of the present invention. Accordingly, the specification and drawings are merely exemplary illustrations of the invention as defined by the appended claims, and are considered to cover any and all modifications, variations, combinations, or equivalents within the scope of the present invention. Obviously, those skilled in the art can make various changes and modifications to the present invention without departing from its spirit and scope. Thus, if these modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalent technologies, the present invention is also intended to include these changes and modifications.

Claims (10)

1. A target retrieval method, characterized by comprising:
performing target detection on P pending images to obtain M candidate frames, where P and M are both integers greater than 1;
performing target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1;
performing feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
encoding the K features using a locality-sensitive hashing algorithm to obtain K codes;
calculating Hamming distances according to the K codes and the P pending images to obtain P Hamming distance values;
retaining the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
2. The method according to claim 1, characterized in that performing target detection on the P pending images to obtain the M candidate frames comprises:
determining a positive sample set and a negative sample set;
training on the positive sample set and the negative sample set to obtain a trained model;
performing positioning in the P pending images using the trained model to obtain X positioning frames, where X is a positive integer less than P;
removing false positioning frames from the X positioning frames using a non-maximum suppression algorithm to obtain the K candidate frames, where K is a positive integer less than X.
3. The method according to claim 1 or 2, characterized in that performing target detection on the M candidate frames comprises:
performing target detection on the M candidate frames using a fast neural network algorithm.
4. The method according to claim 3, characterized in that performing target detection on the M candidate frames using the fast neural network algorithm comprises:
performing target detection on the M candidate frames using a fast neural network algorithm based on a RoI pooling (roi-pooling) function.
5. The method according to claim 1 or 2, characterized in that calculating Hamming distances according to the K codes and the P pending images to obtain the P Hamming distance values comprises:
choosing, from the K codes, Q codes corresponding to a pending image i, where Q is an integer greater than or equal to 1 and the pending image i is one of the P pending images;
determining the Hamming distance value of the pending image i according to the Q codes and the pending image i.
6. A terminal, characterized by comprising:
a first detection unit, configured to perform target detection on P pending images to obtain M candidate frames, where P and M are both integers greater than 1;
a second detection unit, configured to perform target detection on the M candidate frames to obtain N targets, where N is an integer greater than 1;
an extraction unit, configured to perform feature extraction on the N targets using a pre-trained multi-target recognition model to obtain K features, where K is an integer greater than 1;
a coding unit, configured to encode the K features using a locality-sensitive hashing algorithm to obtain K codes;
a computing unit, configured to calculate Hamming distances according to the K codes and the P pending images to obtain P Hamming distance values;
a determining unit, configured to retain the pending image corresponding to at least one target Hamming distance value, among the P Hamming distance values, that meets a preset threshold.
7. The terminal according to claim 6, characterized in that the first detection unit comprises:
a first determining module, configured to determine a positive sample set and a negative sample set;
a generation module, configured to train on the positive sample set and the negative sample set to obtain a trained model;
a locating module, configured to perform positioning in the P pending images using the trained model to obtain X positioning frames, where X is a positive integer less than P;
a processing module, configured to remove false positioning frames from the X positioning frames using a non-maximum suppression algorithm to obtain the K candidate frames, where K is a positive integer less than X.
8. The terminal according to claim 6 or 7, characterized in that the second detection unit is specifically configured to:
perform target detection on the M candidate frames using a fast neural network algorithm.
9. The terminal according to claim 8, characterized in that the second detection unit is specifically configured to:
perform target detection on the M candidate frames using a fast neural network algorithm based on a RoI pooling (roi-pooling) function.
10. The terminal according to claim 6 or 7, characterized in that the computing unit comprises:
a choosing module, configured to choose, from the K codes, Q codes corresponding to a pending image i, where Q is an integer greater than or equal to 1 and the pending image i is one of the P pending images;
a second determining module, configured to determine the Hamming distance value of the pending image i according to the Q codes and the pending image i.
CN201611075914.XA 2016-11-29 2016-11-29 Target retrieval method and terminal Pending CN106682092A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201611075914.XA CN106682092A (en) 2016-11-29 2016-11-29 Target retrieval method and terminal

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201611075914.XA CN106682092A (en) 2016-11-29 2016-11-29 Target retrieval method and terminal

Publications (1)

Publication Number Publication Date
CN106682092A true CN106682092A (en) 2017-05-17

Family

ID=58867010

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201611075914.XA Pending CN106682092A (en) 2016-11-29 2016-11-29 Target retrieval method and terminal

Country Status (1)

Country Link
CN (1) CN106682092A (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20020081026A1 (en) * 2000-11-07 2002-06-27 Rieko Izume Image retrieving apparatus
CN102509305A (en) * 2011-09-26 2012-06-20 浙江工业大学 Animal behavior detection device based on omnidirectional vision
CN102693311A (en) * 2012-05-28 2012-09-26 中国人民解放军信息工程大学 Target retrieval method based on group of randomized visual vocabularies and context semantic information

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
ROSS GIRSHICK: "Fast R-CNN", 《COMPUTER SCIENCE》 *
YUAN YONG (袁勇): "Image retrieval: content-based image retrieval technology" (图像检索:基于内容的图像检索技术), Yuan Yong's Blog (《袁勇的博客》) *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107481188A (en) * 2017-06-23 2017-12-15 珠海经济特区远宏科技有限公司 A kind of image super-resolution reconstructing method
CN110019896A (en) * 2017-07-28 2019-07-16 杭州海康威视数字技术股份有限公司 A kind of image search method, device and electronic equipment
CN110019896B (en) * 2017-07-28 2021-08-13 杭州海康威视数字技术股份有限公司 Image retrieval method and device and electronic equipment
US11586664B2 (en) 2017-07-28 2023-02-21 Hangzhou Hikvision Digital Technology Co., Ltd. Image retrieval method and apparatus, and electronic device
CN109960742A (en) * 2019-02-18 2019-07-02 苏州科达科技股份有限公司 The searching method and device of local message
CN110297931A (en) * 2019-04-23 2019-10-01 西北大学 A kind of image search method
CN110297931B (en) * 2019-04-23 2021-12-03 西北大学 Image retrieval method
CN111160110A (en) * 2019-12-06 2020-05-15 北京工业大学 Method and device for identifying anchor based on face features and voice print features
CN111178146A (en) * 2019-12-06 2020-05-19 北京工业大学 Method and device for identifying anchor based on face features
CN113679327A (en) * 2021-10-26 2021-11-23 青岛美迪康数字工程有限公司 Endoscopic image acquisition method and device
CN115052160A (en) * 2022-04-22 2022-09-13 江西中烟工业有限责任公司 Image coding method and device based on cloud data automatic downloading and electronic equipment

Similar Documents

Publication Publication Date Title
CN106682092A (en) Target retrieval method and terminal
CN107885764B (en) Rapid Hash vehicle retrieval method based on multitask deep learning
Arietta et al. City forensics: Using visual elements to predict non-visual city attributes
CN109740415A (en) Vehicle attribute recognition methods and Related product
CN107577990A (en) A kind of extensive face identification method for accelerating retrieval based on GPU
WO2018210047A1 (en) Data processing method, data processing apparatus, electronic device and storage medium
CN106650660A (en) Vehicle type recognition method and terminal
CN114155284A (en) Pedestrian tracking method, device, equipment and medium based on multi-target pedestrian scene
CN111581423B (en) Target retrieval method and device
CN111414888A (en) Low-resolution face recognition method, system, device and storage medium
CN109189970A (en) Picture similarity comparison method and device
CN109711427A (en) Object detection method and Related product
CN111753826B (en) Vehicle and license plate association method, device and electronic system
Lu et al. An improved target detection method based on multiscale features fusion
CN109784140A (en) Driver attributes' recognition methods and Related product
Hu et al. Generalized image recognition algorithm for sign inventory
CN113704276A (en) Map updating method and device, electronic equipment and computer readable storage medium
Santos et al. RECOGNIZING AND EXPLORING AZULEJOS ON HISTORIC BUILDINGS' FACADES BY COMBINING COMPUTER VISION AND GEOLOCATION IN MOBILE AUGMENTED REALITY APPLICATIONS
Li et al. Detection of partially occluded pedestrians by an enhanced cascade detector
Wang et al. YOLOv5-light: efficient convolutional neural networks for flame detection
Wan et al. Improved Vision-Based Method for Detection of Unauthorized Intrusion by Construction Sites Workers
CN114724128A (en) License plate recognition method, device, equipment and medium
Ojala et al. Motion detection and classification: ultra-fast road user detection
CN106886783A (en) A kind of image search method and system based on provincial characteristics
Kim et al. Accurate abandoned and removed object classification using hierarchical finite state machine

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20170517