CN105512685A - Object identification method and apparatus - Google Patents


Publication number
CN105512685A
Authority
CN
China
Prior art keywords
candidate frame
candidate
target object
image
cluster
Prior art date
Legal status
Granted
Application number
CN201510918292.1A
Other languages
Chinese (zh)
Other versions
CN105512685B (en)
Inventor
陈志军
李明浩
侯文迪
Current Assignee
Beijing Xiaomi Technology Co Ltd
Xiaomi Inc
Original Assignee
Xiaomi Inc
Priority date
Filing date
Publication date
Application filed by Xiaomi Inc filed Critical Xiaomi Inc
Priority to CN201510918292.1A priority Critical patent/CN105512685B/en
Publication of CN105512685A publication Critical patent/CN105512685A/en
Application granted granted Critical
Publication of CN105512685B publication Critical patent/CN105512685B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical


Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/23Clustering techniques


Abstract

The invention provides an object identification method and apparatus. The disclosed object identification method comprises the following steps: determining, based on the BING method, at least one first candidate box in an image to be identified, wherein the first candidate box identifies an image region that may contain a target object; comparing the first candidate box against a target object model, wherein the target object model is a model of the target object obtained by training a CNN on sample data; and if the target object model exists in the first candidate box, indicating the first candidate box. According to the invention, the number of candidate boxes that need comparison can be greatly reduced, and the object identification process can be accelerated by reducing the number of comparisons; moreover, the target object model obtained by training the CNN on sample data differs little from the target object and closely matches its shape, so the accuracy of object identification is preserved. Objects can therefore be identified quickly and accurately.

Description

Object identification method and device
Technical field
The present disclosure relates to image processing, and in particular to an object identification method and apparatus.
Background technology
At present, most object identification methods first learn from a large number of samples to obtain a learning result, i.e., an object model; they then traverse the picture under test with windows of different sizes and positions, comparing the content of each window against the object model in turn to determine whether the object is present in the window. However, for an N*N image, traversing all possible windows requires a number of comparisons on the order of the 4th power of N.
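To make the order of magnitude concrete, the following sketch (not from the patent) counts the distinct axis-aligned windows in an N*N image: a window is fixed by choosing two of the N+1 vertical edge positions and two of the N+1 horizontal edge positions, giving C(N+1, 2) squared windows, roughly N^4/4.

```python
# Counting all axis-aligned windows in an N*N pixel grid.

def window_count(n):
    """Number of distinct axis-aligned sub-rectangles of an n x n pixel grid."""
    edge_pairs = (n + 1) * n // 2  # ways to pick a left/right (or top/bottom) edge pair
    return edge_pairs * edge_pairs

# Even a modest 256 x 256 image yields over a billion windows, which is why
# exhaustive window-by-window comparison against the model is impractical.
print(window_count(256))  # 1082146816
```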
Summary of the invention
To overcome the problems existing in the related art, the present disclosure provides an object identification method and apparatus. The technical solution is as follows.
According to a first aspect of the embodiments of the present disclosure, an object identification method is provided, the method comprising:
determining, based on the BING method, at least one first candidate box in an image to be identified, the first candidate box identifying an image region that may contain a target object;
comparing the first candidate box against a target object model, the target object model being a model of the target object obtained by training a CNN on sample data; and
if the target object model exists in the first candidate box, indicating the first candidate box.
The technical solution provided by the embodiments of the present disclosure can have the following beneficial effects. The BING method yields at least one first candidate box that may contain the target object; relative to the prior art, this greatly reduces the number of candidate boxes that need comparison, and since the duration of a single comparison is roughly fixed, fewer comparisons accelerate the object identification process. The first candidate box is compared against a target object model obtained by training a CNN on sample data, and if the model exists in the box the box is indicated, completing the identification. Because the CNN-trained model differs little from the target object and closely matches its shape, identification accuracy is preserved. The disclosure can therefore identify objects quickly and accurately.
Optionally, determining at least one first candidate box in the image to be identified based on the BING method comprises: performing objectness estimation on the image to be identified with the BING method to obtain at least one first candidate box in the image.
Further, before the first candidate box is compared against the target object model, the method may also comprise: clustering the at least one first candidate box to determine second candidate boxes, the number of second candidate boxes being smaller than the number of first candidate boxes. Correspondingly, comparing the first candidate box against the target object model is then specifically comparing the second candidate box against the target object model; and if the target object model exists in the second candidate box, the second candidate box is indicated.
The technical solution provided by the embodiments of the present disclosure can have the following beneficial effects: clustering the at least one first candidate box to determine the second candidate boxes further reduces the number of candidate boxes that need comparison, so the object identification process consumes less time and the user experience is improved.
Further, before the at least one first candidate box is clustered to determine the second candidate boxes, the method may also comprise: selecting, from the at least one first candidate box, those candidate boxes whose confidence score exceeds a preset value, the confidence score characterizing the probability that a candidate box contains the target object. Correspondingly, the clustering comprises: clustering the candidate boxes whose confidence score exceeds the preset value according to their sizes, and determining the second candidate boxes.
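The selection step above can be sketched as follows; the dictionary layout and field names are illustrative assumptions, not from the patent.

```python
# Minimal sketch of the pre-clustering step: keep only the candidate boxes
# whose confidence score exceeds the preset value.

def filter_by_confidence(candidates, preset=0.5):
    """candidates: list of dicts with 'box' = (x1, y1, x2, y2) and 'score'."""
    return [c for c in candidates if c["score"] > preset]

candidates = [
    {"box": (10, 10, 50, 50), "score": 0.9},
    {"box": (12, 11, 52, 49), "score": 0.3},   # discarded: below the preset value
    {"box": (200, 80, 260, 140), "score": 0.7},
]
print(len(filter_by_confidence(candidates)))  # 2
```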
Specifically, clustering the candidate boxes whose confidence score exceeds the preset value according to their sizes and determining the second candidate boxes may comprise: for every two candidate boxes whose confidence score exceeds the preset value, obtaining the position coordinates of each box's upper-left and lower-right corners in the image to be identified; computing the overlapping area of the two boxes from those coordinates; if the overlapping area exceeds a predetermined threshold, judging that the two boxes belong to one class; and determining the second candidate boxes from the clustered candidate boxes.
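This grouping rule can be sketched as below. The box representation (upper-left and lower-right corner coordinates) follows the text; the greedy label-merging strategy is an assumption, since the patent does not fix one.

```python
def overlap_area(a, b):
    """Intersection area of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    w = min(a[2], b[2]) - max(a[0], b[0])
    h = min(a[3], b[3]) - max(a[1], b[1])
    return max(0, w) * max(0, h)

def cluster_by_overlap(boxes, threshold):
    """Assign two boxes to the same class when their overlapping area exceeds threshold."""
    labels = list(range(len(boxes)))          # start with one class per box
    for i in range(len(boxes)):
        for j in range(i + 1, len(boxes)):
            if overlap_area(boxes[i], boxes[j]) > threshold:
                old, new = labels[j], labels[i]
                labels = [new if lab == old else lab for lab in labels]
    return labels

boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (100, 100, 110, 110)]
print(cluster_by_overlap(boxes, threshold=50))  # [0, 0, 2]
```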
Further, determining the second candidate boxes from the clustered candidate boxes can be accomplished in several ways.
In one implementation, determining the second candidate boxes from the clustered candidate boxes may comprise: averaging the position coordinates, in the image to be identified, of the candidate boxes each class comprises, and taking the candidate box corresponding to the average coordinates of all candidate boxes in each class as the second candidate box. For example, in class 1 the upper-left and lower-right corner coordinates of each candidate box in the image to be identified are known; averaging the upper-left corner coordinates of all candidate boxes in class 1 gives their mean upper-left position, and likewise averaging the lower-right corner coordinates gives their mean lower-right position. The candidate box corresponding to these two mean positions is the second candidate box determined from all candidate boxes in class 1.
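A sketch of this implementation (box format assumed): the second candidate box of a class has as its corners the per-class averages of the member boxes' upper-left and lower-right corner coordinates.

```python
def average_box(cluster):
    """cluster: list of (x1, y1, x2, y2) boxes belonging to one class.
    Returns the box whose corners are the coordinate-wise means."""
    n = len(cluster)
    return tuple(sum(box[k] for box in cluster) / n for k in range(4))

class_1 = [(10, 10, 50, 50), (14, 12, 54, 48), (12, 14, 52, 52)]
print(average_box(class_1))  # (12.0, 12.0, 52.0, 50.0)
```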
In another implementation, determining the second candidate boxes from the clustered candidate boxes may comprise: taking, in each class, the candidate box with the largest confidence score as the second candidate box. Taking class 1 as an example again, the confidence scores of the candidate boxes that class 1 comprises may differ; the candidate box with the largest confidence score is taken as the second candidate box of class 1. Likewise, among the candidate boxes that class 2 comprises, the one with the largest confidence score is taken as the second candidate box of class 2, and so on for each class.
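A sketch of this second implementation (field names assumed): each class is represented by its member box with the largest confidence score.

```python
def best_box(cluster):
    """cluster: list of dicts with 'box' and 'score'; returns the top scorer."""
    return max(cluster, key=lambda c: c["score"])

class_1 = [
    {"box": (10, 10, 50, 50), "score": 0.6},
    {"box": (14, 12, 54, 48), "score": 0.9},
]
print(best_box(class_1)["box"])  # (14, 12, 54, 48)
```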
Further, after the first candidate box is indicated, the method may also comprise: sending an audio prompt or a visual prompt to the user, to prompt the user that the target object has been recognized.
The technical solution provided by the embodiments of the present disclosure can have the following beneficial effects: with multiple prompt modes, the user can receive the recognition result in various ways, which adds interest and improves the user experience.
According to a second aspect of the embodiments of the present disclosure, an object identification apparatus is provided, the apparatus comprising:
an acquisition module configured to determine, based on the BING method, at least one first candidate box in an image to be identified, the first candidate box identifying an image region that may contain a target object;
a comparing module configured to compare the first candidate box against a target object model, the target object model being a model of the target object obtained by training a CNN on sample data; and
an indicating module configured to indicate the first candidate box if the target object model exists in the first candidate box.
The technical solution provided by the embodiments of the present disclosure can have the following beneficial effects. The BING method yields at least one first candidate box that may contain the target object; relative to the prior art, this greatly reduces the number of candidate boxes that need comparison, and since the duration of a single comparison is roughly fixed, fewer comparisons accelerate the object identification process. Because the CNN-trained target object model differs little from the target object and closely matches its shape, identification accuracy is preserved, so the disclosure can identify objects quickly and accurately.
It should be understood that the foregoing general description and the following detailed description are exemplary and explanatory only and do not limit the disclosure.
Accompanying drawing explanation
To describe the technical solutions in the embodiments of the present disclosure or in the prior art more clearly, the accompanying drawings required in the description of the embodiments or the prior art are briefly introduced below. Apparently, the drawings described below show only some embodiments of the present disclosure, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a flowchart of an object identification method according to an exemplary embodiment;
Fig. 2 is a schematic diagram of an application scenario of the embodiment of Fig. 1;
Fig. 3 is a flowchart of an object identification method according to an exemplary embodiment;
Fig. 4 is a block diagram of an object identification apparatus according to an exemplary embodiment;
Fig. 5 is a block diagram of an object identification apparatus according to an exemplary embodiment;
Fig. 6 is a block diagram of an object identification apparatus according to an exemplary embodiment;
Fig. 7 is a block diagram of an object identification apparatus according to an exemplary embodiment.
The above drawings illustrate specific embodiments of the present disclosure, which are described in more detail hereinafter. The drawings and the textual description are not intended to limit the scope of the disclosed concept in any way, but to explain the concept of the present disclosure to a person skilled in the art by reference to specific embodiments.
Embodiment
Exemplary embodiments are described in detail here, examples of which are shown in the accompanying drawings. When the following description refers to the drawings, unless otherwise indicated, the same numeral in different drawings denotes the same or a similar element. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present disclosure; on the contrary, they are merely examples of apparatuses and methods consistent with some aspects of the present disclosure as detailed in the appended claims.
The terms "first", "second" and the like in the specification and claims of the present disclosure are used to distinguish similar objects and do not necessarily describe a specific order or precedence. It should be understood that data so labeled may be interchanged where appropriate, so that the embodiments described herein can be implemented in orders other than those illustrated or described here. In addition, the terms "comprise" and "have" and any variants thereof are intended to cover non-exclusive inclusion; for example, a process, method, system, product or device comprising a series of steps or modules is not necessarily limited to the steps or modules expressly listed, but may comprise other steps or modules not expressly listed or inherent to such a process, method, product or device.
First, several terms involved in the embodiments of the present disclosure are explained.
Binarized Normed Gradients (BING): a method for selecting candidate boxes quickly. It screens candidate boxes very fast and, based on the closed-boundary character of objects, can make a pre-judgment for nearly all objects. Therefore, when a particular object is to be identified, most windows can first be rejected by this method when candidate boxes are acquired, leaving only a small number of candidate boxes.
Convolutional Neural Network (CNN): a special deep-layer neural network model. The particularity of a CNN lies in two aspects: on the one hand, the connections between its neurons are not fully connected; on the other hand, the connection weights between some neurons in the same layer are shared (i.e., identical). The non-fully-connected structure and weight sharing make the CNN more similar to a biological neural network, reduce the complexity of the network model, and cut the number of weights.
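A back-of-envelope illustration of the weight reduction (the layer sizes here are chosen for illustration and are not from the patent): connecting a 32 x 32 input to a 28 x 28 feature map with full connections needs one weight per input-output pair, while a shared 5 x 5 convolution kernel needs only 25 weights plus a bias, regardless of the image size.

```python
def fully_connected_weights(in_side, out_side):
    """Weights to fully connect an in_side^2 input to an out_side^2 output."""
    return (in_side ** 2) * (out_side ** 2)

def shared_conv_weights(kernel_side):
    """Weights for one shared convolution kernel: kernel entries plus one bias."""
    return kernel_side ** 2 + 1

print(fully_connected_weights(32, 28))  # 802816
print(shared_conv_weights(5))           # 26
```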
Cluster: the process of dividing a collection of physical or abstract objects into multiple classes composed of similar objects is called clustering. A cluster generated by clustering is a set of data objects that are similar to one another within the same cluster and dissimilar to objects in other clusters. Clustering differs from classification in that the classes into which the data are to be divided are unknown in advance.
Fig. 1 is a flowchart of an object identification method according to an exemplary embodiment. This embodiment provides an object identification method applied in an object identification apparatus. The apparatus may be wirelessly connected to a user terminal to obtain the image to be identified, where the user terminal includes devices comprising an imaging component, such as a smartphone, a digital camera or a surveillance camera, and the apparatus may be integrated in a server or in the user terminal. As shown in Fig. 1, the method comprises the following steps.
In step 101, at least one first candidate box is determined in the image to be identified based on the BING method, the first candidate box identifying an image region that may contain the target object.
In step 102, the first candidate box is compared against a target object model, the target object model being a model of the target object obtained by training a CNN on sample data.
In step 103, if the target object model exists in the first candidate box, the first candidate box is indicated.
Fig. 2 is a schematic diagram of an application scenario of the embodiment of Fig. 1. Referring to Fig. 2, a smartphone 100 sends the image to be identified to a server 200 over a wireless network. Correspondingly, the server 200 receives the image to be identified; performs objectness estimation on it with the BING method to obtain at least one first candidate box in the image; compares the first candidate box against the target object model; and, if the target object model exists in the first candidate box, indicates the first candidate box. The first candidate box, i.e., the image region in the image to be identified that may contain the target object, is the candidate box that needs comparison in step 102.
The target object model is prestored in the server 200. The target object model may be obtained by the server 200 itself by training a CNN on a large amount of sample data, or obtained by another device through such training and then sent to the server 200. In addition, the server 200 may update the stored target object model, i.e., continue learning from sample data so that the target object model approaches the target object more closely.
Receiving the image to be identified is only one implementation by which the server 200 obtains the image to be identified; for example, the image to be identified may also be prestored in the server 200, or obtained in other ways.
The server 200 may be a single server, a server cluster composed of several servers, or a cloud computing service center.
In summary, in the object identification method provided by this embodiment, the BING method yields at least one first candidate box that may contain the target object; relative to the prior art, this greatly reduces the number of candidate boxes that need comparison, and since the duration of a single comparison is roughly fixed, fewer comparisons accelerate the object identification process. The first candidate box is compared against a target object model obtained by training a CNN on sample data, and if the model exists in the box the box is indicated, completing the identification. Because the CNN-trained model differs little from the target object and closely matches its shape, identification accuracy is preserved, so the disclosure can identify objects quickly and accurately.
Fig. 3 is a flowchart of an object identification method according to an exemplary embodiment. As shown in Fig. 3, the method may comprise the following steps.
In step 301, at least one first candidate box is determined in the image to be identified based on the BING method, the first candidate box identifying an image region that may contain the target object.
In step 302, the at least one first candidate box is clustered to determine second candidate boxes, the number of second candidate boxes being smaller than the number of first candidate boxes.
In step 303, the second candidate box is compared against the target object model.
In step 304, if the target object model exists in the second candidate box, the second candidate box is indicated.
In this embodiment, step 301 is identical to step 101 and is not repeated here.
In this embodiment, clustering the at least one first candidate box to determine the second candidate boxes further reduces the number of candidate boxes that need comparison; relative to the embodiment shown in Fig. 1, the object identification process consumes less time and the user experience is improved.
Optionally, before step 302, the method may also comprise: selecting, from the at least one first candidate box, the candidate boxes whose confidence score exceeds a preset value, the confidence score characterizing the probability that a candidate box contains the target object. Correspondingly, step 302 may comprise: clustering the candidate boxes whose confidence score exceeds the preset value according to their sizes, and determining the second candidate boxes. It should be noted that a person skilled in the art may understand the confidence score in terms of the weights in the CNN.
Clustering the candidate boxes whose confidence score exceeds the preset value according to their sizes and determining the second candidate boxes may comprise: for every two candidate boxes whose confidence score exceeds the preset value, obtaining the position coordinates of each box's upper-left and lower-right corners in the image to be identified; computing the overlapping area of the two boxes from those coordinates; if the overlapping area exceeds a predetermined threshold, judging that the two boxes belong to one class; and determining the second candidate boxes from the clustered candidate boxes. This merely exemplifies one clustering method for the candidate boxes whose confidence score exceeds the preset value, and the disclosure is not limited thereto; for example, non-maximum suppression (NMS) may also be used to cluster those candidate boxes.
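Since the passage names non-maximum suppression as an alternative, here is a standard NMS sketch; the IoU threshold and the (box, score) layout are assumptions, not details from the patent.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    w = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    h = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = w * h
    area = lambda r: (r[2] - r[0]) * (r[3] - r[1])
    return inter / (area(a) + area(b) - inter)

def nms(boxes, scores, iou_thresh=0.5):
    """Keep the highest-scoring box, drop boxes overlapping it heavily, repeat."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep = []
    while order:
        best = order.pop(0)
        keep.append(best)
        order = [i for i in order if iou(boxes[best], boxes[i]) <= iou_thresh]
    return keep

boxes = [(10, 10, 50, 50), (12, 12, 52, 52), (200, 80, 260, 140)]
scores = [0.9, 0.8, 0.7]
print(nms(boxes, scores))  # [0, 2]
```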
Optionally, determining the second candidate boxes from the clustered candidate boxes may be specifically: averaging the position coordinates, in the image to be identified, of the candidate boxes each class comprises, and taking the candidate box corresponding to the average coordinates of all candidate boxes in each class as the second candidate box; or taking, in each class, the candidate box with the largest confidence score as the second candidate box.
On the basis of the above embodiments, after the first candidate box is indicated, the object identification method may also comprise: sending an audio prompt or a visual prompt to the user, to prompt the user that the target object has been recognized.
In this embodiment, with multiple prompt modes, the user can receive the recognition result in various ways, which adds interest and improves the user experience.
The following are apparatus embodiments of the present disclosure, which may be used to perform the method embodiments of the present disclosure. For details not disclosed in the apparatus embodiments, refer to the method embodiments of the present disclosure.
Fig. 4 is a block diagram of an object identification apparatus according to an exemplary embodiment. Referring to Fig. 4, the apparatus comprises an acquisition module 11, a comparing module 12 and an indicating module 13.
The acquisition module 11 is configured to determine, based on the BING method, at least one first candidate box in an image to be identified, the first candidate box identifying an image region that may contain the target object.
The comparing module 12 is configured to compare the first candidate box against a target object model, the target object model being a model of the target object obtained by training a CNN on sample data.
The indicating module 13 is configured to indicate the first candidate box if the target object model exists in the first candidate box.
In summary, in the object identification apparatus provided by this embodiment, the BING method yields at least one first candidate box that may contain the target object; relative to the prior art, this greatly reduces the number of candidate boxes that need comparison, and since the duration of a single comparison is roughly fixed, fewer comparisons accelerate the object identification process. Because the CNN-trained target object model differs little from the target object and closely matches its shape, identification accuracy is preserved, so the disclosure can identify objects quickly and accurately.
In the above embodiment, the acquisition module 11 may be configured to perform objectness estimation on the image to be identified with the BING method to obtain at least one first candidate box in the image.
Fig. 5 is a block diagram of an object identification apparatus according to an exemplary embodiment. Referring to Fig. 5, the apparatus adds a cluster module 14 on the basis of the block diagram shown in Fig. 4.
The cluster module 14 is configured to cluster the at least one first candidate box and determine second candidate boxes, the number of second candidate boxes being smaller than the number of first candidate boxes.
The comparing module 12 is configured to compare the second candidate box against the target object model.
The indicating module 13 is configured to indicate the second candidate box if the target object model exists in the second candidate box.
In this embodiment, clustering the at least one first candidate box to determine the second candidate boxes further reduces the number of candidate boxes that need comparison; relative to the embodiment shown in Fig. 4, the object identification process consumes less time and the user experience is improved.
Fig. 6 is a kind of object detector block diagram according to an exemplary embodiment.With reference to Fig. 6, the structure of this device, on the basis of block diagram shown in Fig. 5, also comprises and chooses module 15.
Wherein, this chooses module 15, is configured to, in the first candidate frame, choose the candidate frame that confidence score value is greater than preset value.This confidence score value is for characterizing in candidate frame the probability comprising target object.
This cluster module 14, is configured to the size of the candidate frame being greater than preset value according to confidence score value, and candidate frame confidence score value being greater than to preset value carries out cluster, determines the second candidate frame.
Alternatively, cluster module 14 can comprise: coordinate obtains submodule 141, is configured to be greater than every two candidate frames in the candidate frame of preset value to confidence score value, obtains two candidate frames upper left corner and lower right corner position coordinates in described image to be identified separately; Areal calculation submodule 142, is configured to according to two candidate frames upper left corner and lower right corner position coordinates in image to be identified separately, obtains the overlapping area of two candidate frames; Cluster submodule 143, if the overlapping area being configured to two candidate frames is greater than predetermined threshold value, then judges that two candidate frames are as a class; Candidate frame determination submodule 144, is configured to the candidate frame after according to cluster, determines the second candidate frame.
The candidate frame determination submodule 144 may be configured to average the position coordinates, in the image to be identified, of the candidate frames contained in each class, and to determine the candidate frame corresponding to the average coordinates of all candidate frames of each class as a second candidate frame. Alternatively, the candidate frame determination submodule 144 may be configured to determine, among the candidate frames contained in each class, the candidate frame with the greatest confidence score as the second candidate frame, and so on; the disclosure is not limited in this respect.
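The two determination strategies of submodule 144 can be sketched as follows; the function names are hypothetical and the code is only an illustrative reading of the description:

```python
# Two illustrative ways to pick the second candidate frame from one
# class of clustered boxes, mirroring the two options described above.

def average_frame(boxes):
    """Coordinate-wise average of all (x1, y1, x2, y2) boxes in a class."""
    n = len(boxes)
    return tuple(sum(b[k] for b in boxes) / n for k in range(4))

def best_scored_frame(boxes, scores):
    """Box in the class whose confidence score is greatest."""
    return boxes[max(range(len(boxes)), key=lambda i: scores[i])]
```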
In the present embodiment, selecting from the first candidate frames those whose confidence score is greater than the preset value further reduces the number of candidate frames that need to be compared, and thus further reduces the time the object identification process consumes.
Further, the object identification apparatus may also comprise a prompting module (not shown). The prompting module is configured to send an audio prompt or a visual prompt to the user, to indicate to the user that the target object has been recognized.
In this embodiment, the various prompting modes allow the user to receive the recognition result in various ways, increasing user interest and improving the user experience.
Fig. 7 is a block diagram of an object identification apparatus according to an exemplary embodiment. Referring to Fig. 7, the apparatus 800 may comprise one or more of the following components: a processing component 802, a memory 804, a power component 806, a multimedia component 808, an audio component 810, an input/output (I/O) interface 812, a sensor component 814, and a communication component 816.
The processing component 802 typically controls the overall operation of the apparatus 800, such as operations associated with display, data communication, camera operation, and recording. The processing component 802 may comprise one or more processors 820 to execute instructions so as to complete all or part of the steps of the method described above. In addition, the processing component 802 may comprise one or more modules that facilitate interaction between the processing component 802 and other components; for example, the processing component 802 may comprise a multimedia module to facilitate interaction between the multimedia component 808 and the processing component 802.
The memory 804 is configured to store various types of data to support operation of the apparatus 800. Examples of such data include instructions of any application or method operated on the apparatus 800, contact data, phonebook data, messages, pictures, videos, and the like. The memory 804 may be implemented by any type of volatile or non-volatile storage device, or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk, or optical disc.
The power component 806 provides power to the various components of the apparatus 800. The power component 806 may comprise a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the apparatus 800.
The multimedia component 808 comprises a screen providing an output interface between the apparatus 800 and the user. In some embodiments, the screen may comprise a liquid crystal display (LCD) and a touch panel (TP). If the screen comprises a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel comprises one or more touch sensors to sense touches, slides, and gestures on the touch panel. The touch sensors may sense not only the boundary of a touch or slide action, but also the duration and pressure associated with the touch or slide action. In some embodiments, the multimedia component 808 comprises a front camera and/or a rear camera. When the apparatus 800 is in an operating mode, such as a photographing mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each of the front and rear cameras may be a fixed optical lens system or may have focus and optical zoom capability.
The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 comprises a microphone (MIC) configured to receive external audio signals when the apparatus 800 is in an operating mode, such as a call mode, a recording mode, or a speech recognition mode. The received audio signal may be further stored in the memory 804 or transmitted via the communication component 816. In some embodiments, the audio component 810 also comprises a speaker for outputting audio signals.
The I/O interface 812 provides an interface between the processing component 802 and peripheral interface modules, such as a keyboard, a click wheel, or buttons. These buttons may include, but are not limited to, a home button, a volume button, a start button, and a lock button.
The sensor component 814 comprises one or more sensors for providing status assessments of various aspects of the apparatus 800. For example, the sensor component 814 may detect the open/closed state of the apparatus 800 and the relative positioning of components, such as the display and keypad of the apparatus 800; the sensor component 814 may also detect a change in position of the apparatus 800 or of a component of the apparatus 800, the presence or absence of user contact with the apparatus 800, the orientation or acceleration/deceleration of the apparatus 800, and a change in temperature of the apparatus 800. The sensor component 814 may comprise a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor component 814 may also comprise a light sensor, such as a complementary metal oxide semiconductor (CMOS) or charge-coupled device (CCD) imaging sensor, for use in imaging applications. In some embodiments, the sensor component 814 may also comprise an accelerometer, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.
The communication component 816 is configured to facilitate wired or wireless communication between the apparatus 800 and other devices. The apparatus 800 may access a wireless network based on a communication standard, such as Wireless Fidelity (WiFi), 2G, or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 also comprises a near-field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.
In an exemplary embodiment, the apparatus 800 may be implemented by one or more application-specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field-programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors, or other electronic components, for performing the method described above.
In an exemplary embodiment, a non-transitory computer-readable storage medium comprising instructions is also provided, such as the memory 804 comprising instructions executable by the processor 820 of the apparatus 800 to perform the method described above. For example, the non-transitory computer-readable storage medium may be a ROM, a random access memory (RAM), a compact disc read-only memory (CD-ROM), a magnetic tape, a floppy disk, an optical data storage device, or the like.
A non-transitory computer-readable storage medium is provided, wherein the instructions in the storage medium, when executed by a processor of an object identification apparatus, enable the apparatus to perform an object identification method, the method comprising: determining at least one first candidate frame in an image to be identified based on the BING method, the first candidate frame identifying an image region that may contain the target object to be detected; comparing the first candidate frame with a target object model, the target object model being a model of the target object obtained by training on sample data with a CNN; and, if the target object model is present in the first candidate frame, indicating the first candidate frame.
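The method the medium encodes can be summarized in a hedged sketch. Here `bing_proposals` and `cnn_score` are hypothetical stand-ins for the BING objectness stage and the CNN-trained target object model, neither of which the disclosure specifies in code:

```python
# Hedged sketch of the claimed pipeline; the callables passed in are
# illustrative placeholders, not the patented implementations.

def identify(image, bing_proposals, cnn_score, score_threshold):
    """Return the candidate frames in which the target object is found.

    bing_proposals(image)    -> list of (x1, y1, x2, y2) first candidate frames
    cnn_score(image, frame)  -> probability that the frame contains the target
    """
    indicated = []
    for frame in bing_proposals(image):              # step 1: candidate frames
        if cnn_score(image, frame) >= score_threshold:  # step 2: compare with model
            indicated.append(frame)                  # step 3: indicate the frame
    return indicated
```

In a real system the proposal stage would run a trained BING objectness detector and the scorer would be the CNN model trained on sample data, as the claims recite.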
Those skilled in the art will readily arrive at other embodiments of the present disclosure after considering the specification and practicing the invention disclosed herein. This application is intended to cover any variations, uses, or adaptations of the present disclosure that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the disclosure being indicated by the following claims.
It should be understood that the present disclosure is not limited to the precise structures described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the present disclosure is limited only by the appended claims.

Claims (15)

1. An object identification method, characterized in that the method comprises:
determining at least one first candidate frame in an image to be identified based on the binarized normed gradients (BING) method, the first candidate frame identifying an image region that may contain a target object to be detected;
comparing the first candidate frame with a target object model, the target object model being a model of the target object obtained by training on sample data with a convolutional neural network (CNN); and
if the target object model is present in the first candidate frame, indicating the first candidate frame.
2. The method according to claim 1, characterized in that determining at least one first candidate frame in the image to be identified based on the BING method comprises:
performing objectness estimation on the image to be identified using the BING method to obtain the at least one first candidate frame in the image to be identified.
3. The method according to claim 1 or 2, characterized in that, before comparing the first candidate frame with the target object model, the method further comprises:
clustering the at least one first candidate frame to determine second candidate frames, the number of the second candidate frames being less than the number of the first candidate frames;
correspondingly, comparing the first candidate frame with the target object model is specifically: comparing the second candidate frame with the target object model; and
if the target object model is present in the first candidate frame, indicating the first candidate frame is specifically: if the target object model is present in the second candidate frame, indicating the second candidate frame.
4. The method according to claim 3, characterized in that, before clustering the at least one first candidate frame to determine the second candidate frames, the method further comprises:
selecting, from the at least one first candidate frame, the candidate frames whose confidence score is greater than a preset value, the confidence score characterizing the probability that a candidate frame contains the target object;
correspondingly, clustering the at least one first candidate frame to determine the second candidate frames comprises: clustering the candidate frames whose confidence score is greater than the preset value according to their sizes, so as to determine the second candidate frames.
5. The method according to claim 4, characterized in that clustering the candidate frames whose confidence score is greater than the preset value according to their sizes, so as to determine the second candidate frames, comprises:
for every two candidate frames among the candidate frames whose confidence score is greater than the preset value, obtaining the position coordinates of the upper-left and lower-right corners of the two candidate frames in the image to be identified;
obtaining the overlap area of the two candidate frames according to the position coordinates of their upper-left and lower-right corners in the image to be identified;
if the overlap area of the two candidate frames is greater than a predetermined threshold, determining that the two candidate frames belong to one class; and
determining the second candidate frames according to the clustered candidate frames.
6. The method according to claim 5, characterized in that determining the second candidate frames according to the clustered candidate frames comprises:
averaging the position coordinates, in the image to be identified, of the candidate frames contained in each class, and determining the candidate frame corresponding to the average coordinates of all candidate frames of each class as a second candidate frame;
or, determining, among the candidate frames contained in each class, the candidate frame with the greatest confidence score as a second candidate frame.
7. The method according to claim 1 or 2, characterized in that, after indicating the first candidate frame, the method further comprises:
sending an audio prompt or a visual prompt to a user, to indicate to the user that the target object has been recognized.
8. An object identification apparatus, characterized in that the apparatus comprises:
an acquisition module, configured to determine at least one first candidate frame in an image to be identified based on the binarized normed gradients (BING) method, the first candidate frame identifying an image region that may contain a target object to be detected;
a comparison module, configured to compare the first candidate frame with a target object model, the target object model being a model of the target object obtained by training on sample data with a convolutional neural network (CNN); and
an indication module, configured to indicate the first candidate frame if the target object model is present in the first candidate frame.
9. The apparatus according to claim 8, characterized in that the acquisition module is configured to perform objectness estimation on the image to be identified using the BING method to obtain the at least one first candidate frame in the image to be identified.
10. The apparatus according to claim 8 or 9, characterized in that the apparatus further comprises:
a clustering module, configured to cluster the at least one first candidate frame to determine second candidate frames, the number of the second candidate frames being less than the number of the first candidate frames;
correspondingly, the comparison module is configured to compare the second candidate frame with the target object model; and
the indication module is configured to indicate the second candidate frame if the target object model is present in the second candidate frame.
11. The apparatus according to claim 10, characterized in that the apparatus further comprises:
a selecting module, configured to select, from the first candidate frames, the candidate frames whose confidence score is greater than a preset value, the confidence score characterizing the probability that a candidate frame contains the target object;
correspondingly, the clustering module is configured to cluster the candidate frames whose confidence score is greater than the preset value according to their sizes, so as to determine the second candidate frames.
12. The apparatus according to claim 11, characterized in that the clustering module comprises:
a coordinate acquisition submodule, configured to obtain, for every two candidate frames among the candidate frames whose confidence score is greater than the preset value, the position coordinates of the upper-left and lower-right corners of the two candidate frames in the image to be identified;
an area calculation submodule, configured to obtain the overlap area of the two candidate frames according to the position coordinates of their upper-left and lower-right corners in the image to be identified;
a clustering submodule, configured to determine that the two candidate frames belong to one class if their overlap area is greater than a predetermined threshold; and
a candidate frame determination submodule, configured to determine the second candidate frames according to the clustered candidate frames.
13. The apparatus according to claim 8 or 9, characterized in that the candidate frame determination submodule is configured to average the position coordinates, in the image to be identified, of the candidate frames contained in each class, and to determine the candidate frame corresponding to the average coordinates of all candidate frames of each class as a second candidate frame; or to determine, among the candidate frames contained in each class, the candidate frame with the greatest confidence score as a second candidate frame.
14. The apparatus according to claim 8 or 9, characterized in that the apparatus further comprises:
a prompting module, configured to send an audio prompt or a visual prompt to a user, to indicate to the user that the target object has been recognized.
15. An object identification apparatus, characterized in that the apparatus comprises: a processor and a memory for storing processor-executable instructions;
wherein the processor is configured to perform the method according to any one of claims 1 to 7.
CN201510918292.1A 2015-12-10 2015-12-10 Object identification method and device Active CN105512685B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510918292.1A CN105512685B (en) 2015-12-10 2015-12-10 Object identification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510918292.1A CN105512685B (en) 2015-12-10 2015-12-10 Object identification method and device

Publications (2)

Publication Number Publication Date
CN105512685A true CN105512685A (en) 2016-04-20
CN105512685B CN105512685B (en) 2019-12-03

Family

ID=55720651

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510918292.1A Active CN105512685B (en) 2015-12-10 2015-12-10 Object identification method and device

Country Status (1)

Country Link
CN (1) CN105512685B (en)

Cited By (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529527A (en) * 2016-09-23 2017-03-22 北京市商汤科技开发有限公司 Object detection method and device, data processing deice, and electronic equipment
CN106557778A (en) * 2016-06-17 2017-04-05 北京市商汤科技开发有限公司 Generic object detection method and device, data processing equipment and terminal device
CN106886795A (en) * 2017-02-17 2017-06-23 北京维弦科技有限责任公司 Object identification method based on the obvious object in image
CN106934425A (en) * 2017-03-23 2017-07-07 南京汇川图像视觉技术有限公司 A kind of industrial products counting method based on deep learning
CN107247957A (en) * 2016-12-16 2017-10-13 广州中国科学院先进技术研究所 A kind of intelligent agricultural product sorting technique and system based on deep learning and cloud computing
CN107844794A (en) * 2016-09-21 2018-03-27 北京旷视科技有限公司 Image-recognizing method and device
CN108052860A (en) * 2017-11-06 2018-05-18 珠海格力电器股份有限公司 Article retrieval method and device
CN108416250A (en) * 2017-02-10 2018-08-17 浙江宇视科技有限公司 Demographic method and device
CN108604369A (en) * 2016-07-27 2018-09-28 华为技术有限公司 A kind of method, apparatus, equipment and the convolutional neural networks of removal picture noise
CN109299748A (en) * 2018-10-26 2019-02-01 苏州浪潮智能软件有限公司 A kind of object detection recognition methods based on local and cloud service
CN109559576A (en) * 2018-11-16 2019-04-02 中南大学 A kind of children companion robot and its early teaching system self-learning method
CN109583389A (en) * 2018-12-03 2019-04-05 易视腾科技股份有限公司 Draw this recognition methods and device
CN109815943A (en) * 2019-03-18 2019-05-28 北京石油化工学院 A kind of harmful influence storage stacking picture sample generation method and system
CN110069989A (en) * 2019-03-15 2019-07-30 上海拍拍贷金融信息服务有限公司 Face image processing process and device, computer readable storage medium
CN111680733A (en) * 2020-06-01 2020-09-18 北京建工资源循环利用投资有限公司 Component detection method, sample library establishment method, device, platform, system and medium
CN112287945A (en) * 2019-11-14 2021-01-29 京东安联财产保险有限公司 Screen fragmentation determination method and device, computer equipment and computer readable storage medium
CN112580409A (en) * 2019-09-30 2021-03-30 Oppo广东移动通信有限公司 Target object selection method and related product

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103390151A (en) * 2012-05-08 2013-11-13 展讯通信(上海)有限公司 Face detection method and device
CN104573669A (en) * 2015-01-27 2015-04-29 中国科学院自动化研究所 Image object detection method
CN104881662A (en) * 2015-06-26 2015-09-02 北京畅景立达软件技术有限公司 Single-image pedestrian detection method
US20150286868A1 (en) * 2014-04-08 2015-10-08 The Boeing Company Context-aware object detection in aerial photographs/videos using travel path metadata
CN105046196A (en) * 2015-06-11 2015-11-11 西安电子科技大学 Front vehicle information structured output method base on concatenated convolutional neural networks

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103390151A (en) * 2012-05-08 2013-11-13 展讯通信(上海)有限公司 Face detection method and device
US20150286868A1 (en) * 2014-04-08 2015-10-08 The Boeing Company Context-aware object detection in aerial photographs/videos using travel path metadata
CN104573669A (en) * 2015-01-27 2015-04-29 中国科学院自动化研究所 Image object detection method
CN105046196A (en) * 2015-06-11 2015-11-11 西安电子科技大学 Front vehicle information structured output method base on concatenated convolutional neural networks
CN104881662A (en) * 2015-06-26 2015-09-02 北京畅景立达软件技术有限公司 Single-image pedestrian detection method

Cited By (25)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106557778A (en) * 2016-06-17 2017-04-05 北京市商汤科技开发有限公司 Generic object detection method and device, data processing equipment and terminal device
CN106557778B (en) * 2016-06-17 2020-02-07 北京市商汤科技开发有限公司 General object detection method and device, data processing device and terminal equipment
CN108604369A (en) * 2016-07-27 2018-09-28 华为技术有限公司 A kind of method, apparatus, equipment and the convolutional neural networks of removal picture noise
CN108604369B (en) * 2016-07-27 2020-10-27 华为技术有限公司 Method, device and equipment for removing image noise and convolutional neural network
CN107844794A (en) * 2016-09-21 2018-03-27 北京旷视科技有限公司 Image-recognizing method and device
CN106529527A (en) * 2016-09-23 2017-03-22 北京市商汤科技开发有限公司 Object detection method and device, data processing deice, and electronic equipment
CN107247957A (en) * 2016-12-16 2017-10-13 广州中国科学院先进技术研究所 A kind of intelligent agricultural product sorting technique and system based on deep learning and cloud computing
CN108416250A (en) * 2017-02-10 2018-08-17 浙江宇视科技有限公司 Demographic method and device
CN108416250B (en) * 2017-02-10 2021-06-22 浙江宇视科技有限公司 People counting method and device
CN106886795A (en) * 2017-02-17 2017-06-23 北京维弦科技有限责任公司 Object identification method based on the obvious object in image
CN106934425B (en) * 2017-03-23 2019-12-03 南京汇川图像视觉技术有限公司 A kind of industrial products counting method based on deep learning
CN106934425A (en) * 2017-03-23 2017-07-07 南京汇川图像视觉技术有限公司 A kind of industrial products counting method based on deep learning
CN108052860A (en) * 2017-11-06 2018-05-18 珠海格力电器股份有限公司 Article retrieval method and device
CN109299748A (en) * 2018-10-26 2019-02-01 苏州浪潮智能软件有限公司 A kind of object detection recognition methods based on local and cloud service
CN109559576B (en) * 2018-11-16 2020-07-28 中南大学 Child accompanying learning robot and early education system self-learning method thereof
CN109559576A (en) * 2018-11-16 2019-04-02 中南大学 A kind of children companion robot and its early teaching system self-learning method
CN109583389A (en) * 2018-12-03 2019-04-05 易视腾科技股份有限公司 Draw this recognition methods and device
CN109583389B (en) * 2018-12-03 2023-06-27 易视腾科技股份有限公司 Drawing recognition method and device
CN110069989A (en) * 2019-03-15 2019-07-30 上海拍拍贷金融信息服务有限公司 Face image processing process and device, computer readable storage medium
CN109815943A (en) * 2019-03-18 2019-05-28 北京石油化工学院 A kind of harmful influence storage stacking picture sample generation method and system
CN109815943B (en) * 2019-03-18 2021-02-09 北京石油化工学院 Hazardous chemical storage stacking picture sample generation method and system
CN112580409A (en) * 2019-09-30 2021-03-30 Oppo广东移动通信有限公司 Target object selection method and related product
CN112580409B (en) * 2019-09-30 2024-06-07 Oppo广东移动通信有限公司 Target object selection method and related product
CN112287945A (en) * 2019-11-14 2021-01-29 京东安联财产保险有限公司 Screen fragmentation determination method and device, computer equipment and computer readable storage medium
CN111680733A (en) * 2020-06-01 2020-09-18 北京建工资源循环利用投资有限公司 Component detection method, sample library establishment method, device, platform, system and medium

Also Published As

Publication number Publication date
CN105512685B (en) 2019-12-03

Similar Documents

Publication Publication Date Title
CN105512685A (en) Object identification method and apparatus
CN109446994B (en) Gesture key point detection method and device, electronic equipment and storage medium
CN104105169B (en) From method and the device of the WLAN (wireless local area network) that is dynamically connected
CN104899610A (en) Picture classification method and device
CN104753766A (en) Expression sending method and device
CN105488511A (en) Image identification method and device
CN105205479A (en) Human face value evaluation method, device and terminal device
CN105260732A (en) Image processing method and device
CN105069786A (en) Straight line detection method and straight line detection device
CN105426867A (en) Face identification verification method and apparatus
CN105335754A (en) Character recognition method and device
CN104281432A (en) Method and device for regulating sound effect
CN105354560A (en) Fingerprint identification method and device
CN105335713A (en) Fingerprint identification method and device
CN105046231A (en) Face detection method and device
CN105631406A (en) Method and device for recognizing and processing image
CN105184313A (en) Classification model construction method and device
CN105139033A (en) Classifier construction method and device and image processing method and device
CN104537380A (en) Clustering method and device
CN106598269A (en) Input method switching method and apparatus
CN104239879A (en) Character segmentation method and device
CN105139378A (en) Card boundary detection method and apparatus
CN109726709A (en) Icon-based programming method and apparatus based on convolutional neural networks
CN105159496A (en) Touch event response method and mobile terminal
CN105528078A (en) Method and device controlling electronic equipment

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant