CN108875503A

CN108875503A - Face detection method, device, system, storage medium and capture machine

Info

Publication number: CN108875503A
Application number: CN201711100139.3A
Authority: CN
Inventors: 梁喆; 周舒畅; 朱雨
Original assignee: Beijing Megvii Technology Co Ltd; Beijing Maigewei Technology Co Ltd
Current assignee: Beijing Megvii Technology Co Ltd; Beijing Maigewei Technology Co Ltd
Priority date: 2017-11-09
Filing date: 2017-11-09
Publication date: 2018-11-23

Abstract

The present invention provides a face detection method, device, system, storage medium and capture machine. The method includes: obtaining a video frame and performing face location detection on it to obtain a face location detection result for the frame; determining whether the currently obtained video frame includes the face of a person included in a previously obtained video frame, and if so returning to the step of obtaining a video frame, otherwise outputting at least one previously obtained video frame; and performing face attribute detection on the output video frame based on the face location detection result of the at least one output video frame. According to embodiments of the present invention, attribute detection for a face is performed on the video frames preceding an interruption only once that face is interrupted in the video, so the time-consuming attribute detection runs far less often overall, which reduces total processing time and improves the efficiency of processing video frames per unit time.

Description

Face detection method, device, system, storage medium and capture machine

Technical field

The present invention relates to the technical field of face detection, and more specifically to a face detection method, device, system, storage medium and capture machine.

Background art

Existing face detection methods for capture machines first run a detection neural network and then, after the face has been cropped out of the image, run one or several attribute neural networks to obtain the face attributes. Because such a method runs the attribute neural networks immediately after the detection neural network, several neural networks must be cascaded, which is time-consuming and keeps the image frame rate that can be processed per unit time low.

Summary of the invention

To solve the above problem, the present invention proposes a face detection scheme that can be used for face detection in a capture machine as well as in other scenarios. The proposed scheme can also be extended to the detection of any target object simply by replacing the face with another target object. The scheme is briefly described below; more details are given in the detailed description with reference to the accompanying drawings.

According to one aspect of the present invention, a face detection method is provided. The method includes: obtaining a video frame and performing face location detection on the obtained video frame to obtain a face location detection result of that frame; determining, based on the face location detection result, whether the currently obtained video frame includes the face of a person included in a previously obtained video frame, and if so returning to the step of obtaining a video frame, otherwise outputting at least one previously obtained video frame; and performing face attribute detection on the output video frame based on the face location detection result of the at least one output video frame.

In one embodiment of the present invention, the face location detection result includes a face identification number, the face identification number of the same person being the same, and determining whether the currently obtained video frame includes the face of a person included in a previously obtained video frame includes: determining whether the face identification number in the currently obtained video frame has changed compared with the face identification number in the previously obtained video frame; if so, determining that the currently obtained video frame does not include the face of the person included in the previously obtained video frame; if not, determining that the currently obtained video frame includes the face of the person included in the previously obtained video frame.

In one embodiment of the present invention, outputting the at least one previously obtained video frame includes: outputting, from the previously obtained video frames, the video frame with the best face location detection result for use in the face attribute detection.

In one embodiment of the present invention, performing face location detection on the obtained video frame includes detecting at least one of the following: face position coordinates, face frame size, face pose, and face image blurriness.

In one embodiment of the present invention, the video frame with the best face location detection result is the video frame whose face image blurriness is below a predetermined threshold and whose face pose is optimal.

In one embodiment of the present invention, the face location detection is performed by a face location detection neural network, and the face attribute detection is performed by a face attribute detection neural network.

According to another aspect of the present invention, a face detection device is provided. The device includes: a face location detection module for obtaining a video frame and performing face location detection on the obtained video frame to obtain a face location detection result of that frame; a judgment module for determining, based on the face location detection result, whether the currently obtained video frame includes the face of a person included in a previously obtained video frame and, if not, outputting at least one previously obtained video frame; and a face attribute detection module for performing face attribute detection on the output video frame based on the face location detection result of the at least one video frame output by the judgment module.

In one embodiment of the present invention, the face location detection result includes a face identification number, the face identification number of the same person being the same, and the judgment module is further configured to: determine whether the face identification number in the currently obtained video frame has changed compared with the face identification number in the previously obtained video frame; if so, determine that the currently obtained video frame does not include the face of the person included in the previously obtained video frame; if not, determine that the currently obtained video frame includes the face of the person included in the previously obtained video frame.

In one embodiment of the present invention, the operation by which the judgment module outputs the at least one previously obtained video frame includes: outputting, from the previously obtained video frames, the video frame with the best face location detection result for use in the face attribute detection.

In one embodiment of the present invention, the face location detection performed by the face location detection module on the obtained video frame includes detecting at least one of the following: face position coordinates, face frame size, face pose, and face image blurriness.

In one embodiment of the present invention, the face location detection module performs the face location detection using a face location detection neural network, and the face attribute detection module performs the face attribute detection using a face attribute detection neural network.

According to another aspect of the present invention, a face detection system is provided. The system includes a storage device and a processor, the storage device storing a computer program to be run by the processor, and the computer program, when run by the processor, performing the face detection method of any of the above embodiments.

According to a further aspect of the present invention, a storage medium is provided on which a computer program is stored, the computer program performing the face detection method of any of the above embodiments when run.

According to a further aspect of the present invention, a capture machine is provided, which includes an image acquisition device and the face detection device of any of the above embodiments.

The face detection method, device, system, storage medium and capture machine according to embodiments of the present invention do not perform face attribute detection immediately after performing face location detection on an obtained video frame. Instead, attribute detection for a face is performed on the video frames preceding the interruption only when that face is interrupted in the video, so the time-consuming attribute detection runs far less often overall, which reduces total processing time and improves the efficiency of processing video frames per unit time.

Brief description of the drawings

The above and other objects, features and advantages of the present invention will become more apparent from the more detailed description of embodiments of the invention given with reference to the accompanying drawings. The drawings provide a further understanding of the embodiments, form a part of the specification, and serve together with the embodiments to explain the invention; they do not limit the invention. In the drawings, identical reference numerals generally denote identical components or steps.

Fig. 1 shows a schematic block diagram of an example electronic device for implementing the face detection method, device, system and storage medium according to an embodiment of the present invention;

Fig. 2 shows a schematic flowchart of the face detection method according to an embodiment of the present invention;

Fig. 3 shows a schematic diagram of the process of the face detection method according to an embodiment of the present invention;

Fig. 4 shows a schematic block diagram of the face detection device according to an embodiment of the present invention; and

Fig. 5 shows a schematic block diagram of the face detection system according to an embodiment of the present invention.

Detailed description of the embodiments

To make the objects, technical solutions and advantages of the present invention more apparent, example embodiments of the invention are described in detail below with reference to the accompanying drawings. Obviously, the described embodiments are only some, not all, of the embodiments of the invention, and it should be understood that the invention is not limited by the example embodiments described herein. All other embodiments obtained by those skilled in the art on the basis of the embodiments described herein, without creative effort, shall fall within the scope of protection of the invention.

First, an example electronic device 100 for implementing the face detection method, device, system and storage medium of embodiments of the present invention is described with reference to Fig. 1.

As shown in Fig. 1, the electronic device 100 includes one or more processors 102, one or more storage devices 104, an input device 106, an output device 108 and an image acquisition device 110, which are interconnected through a bus system 112 and/or another form of connection mechanism (not shown). Note that the components and structure of the electronic device 100 shown in Fig. 1 are exemplary rather than limiting; the electronic device may have other components and structures as needed.

The processor 102 may be a central processing unit (CPU) or another form of processing unit with data processing capability and/or instruction execution capability, and may control the other components of the electronic device 100 to perform desired functions.

The storage device 104 may include one or more computer program products, which may include various forms of computer-readable storage media, such as volatile memory and/or non-volatile memory. The volatile memory may include, for example, random access memory (RAM) and/or cache memory. The non-volatile memory may include, for example, read-only memory (ROM), a hard disk, flash memory and the like. One or more computer program instructions may be stored on the computer-readable storage medium, and the processor 102 may run those instructions to implement the client functionality (implemented by the processor) in the embodiments of the invention described below and/or other desired functions. Various application programs and various data, such as data used and/or generated by the application programs, may also be stored on the computer-readable storage medium.

The input device 106 may be a device used by a user to input instructions, and may include one or more of a keyboard, a mouse, a microphone, a touch screen and the like.

The output device 108 may output various information (such as images or sound) to the outside (for example, to a user), and may include one or more of a display, a loudspeaker and the like.

The image acquisition device 110 may capture images desired by the user (such as photos or video) and store the captured images in the storage device 104 for use by other components. The image acquisition device 110 may be a camera.

Illustratively, the example electronic device for implementing the face detection method and device according to embodiments of the present invention may be implemented as a capture machine, a smartphone, a tablet computer or the like.

The face detection method according to embodiments of the present invention is an improvement motivated by the fact that a face attribute detection neural network contains many convolutional layers and a single run takes a long time. The goal of the improvement is to run face location detection n times before running face attribute detection once. The value of n must be chosen carefully: if n is too large, the time spent running face location detection may exceed the time the face remains in the camera's view, so that face attribute detection is missed; if n is too small, face attribute detection still runs too often and the total time increases. For this reason, the face detection method described below provides a mechanism that alternates face location detection and face attribute detection in a reasonable way.
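
As a purely illustrative calculation that is not taken from the specification, the trade-off in choosing n can be seen by amortising one attribute detection run over n location detection runs. The function name and the millisecond figures below are hypothetical.

```python
def average_ms_per_frame(location_ms: float, attribute_ms: float, n: int) -> float:
    """Average cost per frame when attribute detection runs once per n location runs."""
    if n < 1:
        raise ValueError("n must be at least 1")
    return location_ms + attribute_ms / n


# Hypothetical timings: 10 ms per location run, 40 ms per attribute run.
for n in (1, 4, 8):
    print(n, average_ms_per_frame(10.0, 40.0, n))
```

A larger n lowers the average cost per frame, but, as noted above, it also increases the risk that the face leaves the view before attribute detection has been run.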

The face detection method 200 according to an embodiment of the present invention is described below with reference to Fig. 2. As shown in Fig. 2, the face detection method 200 may include the following steps:

In step S210, a video frame is obtained and face location detection is performed on the obtained video frame to obtain a face location detection result of that frame.

In one embodiment, consecutive video frames may be obtained frame by frame, and face location detection may be performed on each frame. In one example, performing face location detection on the obtained video frame may include detecting at least one of the following: face position coordinates, face frame size, face pose, and face image blurriness. In other examples, it may also include detecting any other suitable item.
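
By way of illustration only, the items listed above could be carried in a small record per detected face. This is a minimal sketch; the class and field names, the representation of the pose as three angles, and the convention that a lower blur value means a sharper image are all assumptions rather than part of the specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class FaceLocation:
    """One detected face in one video frame (hypothetical field names)."""
    x: float                          # face position coordinate, horizontal
    y: float                          # face position coordinate, vertical
    w: float                          # face frame width
    h: float                          # face frame height
    pose: Tuple[float, float, float]  # assumed (roll, pitch, yaw) in degrees
    blur: float                       # face image blurriness; lower is assumed sharper


@dataclass
class FrameResult:
    """Face location detection result for one video frame."""
    frame_index: int
    faces: List[FaceLocation]


result = FrameResult(0, [FaceLocation(120.0, 80.0, 64.0, 64.0, (2.0, -3.0, 5.0), 0.12)])
```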

In embodiments of the present invention, face attribute detection is not necessarily performed immediately after face location detection. Instead, it is determined whether a predetermined condition is met, and face attribute detection is performed only when it is. Because a face attribute detection neural network usually has many convolutional layers and takes a long time to run, running it only when the predetermined condition is met reduces the overall number of runs and improves the efficiency of processing video frames per unit time. The predetermined condition is discussed in detail in the following steps.

In step S220, it is determined, based on the face location detection result, whether the currently obtained video frame includes the face of a person included in a previously obtained video frame. If so, the method returns to the step of obtaining a video frame (that is, returns to step S210); otherwise, at least one previously obtained video frame is output.

In embodiments of the present invention, for each currently obtained frame, after face location detection has been performed on it, it may be determined based on the frame's face location detection result whether the frame includes the face of a person included in a previously obtained video frame (for example, the preceding frame).

For example, for the first frame obtained, since no video frame precedes it, the second frame may be obtained directly after face location detection has been performed on the first. For the second frame, after face location detection it may be determined whether the second frame includes the face of a person included in the first frame. Assuming the first frame includes face A, it may be determined whether the second frame also includes face A (here, face A denotes the face of the same person, not necessarily at the same angle or in the same pose, as long as it is the same person's face). If the second frame includes face A, the third frame may then be obtained. For the third frame, after face location detection it may be determined whether the third frame includes the face of the person included in the second frame (or in the first and/or second frame, since both have been determined to include the same person's face), that is, whether the third frame includes face A. If the third frame includes face A, the fourth frame is obtained and the above operations are repeated. If the third frame does not include face A, face A can be determined to have been interrupted in the video, and the video frames preceding the interruption (that is, the first frame and/or the second frame) may be output for the face attribute detection described later. Similarly, if the second frame already does not include face A, face A is interrupted at the second frame, and the video frame preceding the interruption (that is, the first frame) is output for face attribute detection. Video frames that have been output no longer serve as previously obtained video frames in the above determination, whereas video frames that have not been output may serve as previously obtained video frames, that is, as the objects of comparison once a new video frame is subsequently obtained.
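
By way of illustration only, the single-face example above can be sketched as a generator that buffers frame indices while the same face persists and releases them when that face is interrupted. The function name, the use of a string label to stand for "the same person's face", and the decision to treat the end of the stream as an interruption are assumptions, not part of the specification.

```python
from typing import Iterable, Iterator, List, Optional, Tuple


def segments_on_interruption(
    frames: Iterable[Tuple[int, Optional[str]]]
) -> Iterator[Tuple[Optional[str], List[int]]]:
    """Yield (face, buffered frame indices) each time the face is interrupted.

    Each input item is (frame index, face label or None if no face was found).
    """
    current: Optional[str] = None
    buffered: List[int] = []
    for index, face in frames:
        if face is not None and face == current:
            buffered.append(index)   # same face: keep obtaining frames
            continue
        if buffered:                 # face interrupted: output the earlier frames
            yield current, buffered
        current = face
        buffered = [index] if face is not None else []
    # Assumption: the end of the stream is treated as an interruption.
    if buffered:
        yield current, buffered


segments = list(segments_on_interruption([(1, "A"), (2, "A"), (3, None), (4, "B")]))
```

Frames that have been yielded are dropped from the buffer, which mirrors the statement that output frames no longer serve as previously obtained frames.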

Those skilled in the art will understand that outputting a previously obtained video frame, as described herein, may also mean outputting the face location detection result of the previously obtained video frame (or, equivalently, outputting the previously obtained video frame together with its face location detection result) so that face attribute detection can be performed based on that result. For brevity, this is referred to herein simply as outputting the previously obtained video frame.

The above example describes the case in which the video frames include the face of a single person. The method of the present invention is also applicable when the video frames include the faces of more than one person, as in the following example:

Assume the first frame includes face A and face B. It may then be determined whether the second frame includes face A and/or face B. If the second frame includes face A and/or face B (for example, only face A), the third frame may be obtained. For the third frame, after face location detection it may be determined whether the third frame includes the face of a person included in the preceding frames, that is, whether it includes face A and/or face B. If the third frame includes face A and/or face B (for example, only face B), the fourth frame may be obtained. For the fourth frame, after face location detection it may likewise be determined whether it includes face A and/or face B; if so, the fifth frame may be obtained and the above operations repeated. Conversely, if the fourth frame includes neither face A nor face B (for example, it includes face C), the previously obtained frames (that is, at least one of the first, second and third frames) may be output for face attribute detection. Similarly, if the third frame (or the second frame) includes neither face A nor face B, at least one previously obtained frame may be output for face attribute detection.
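
By way of illustration only, the multi-face example above can be sketched with set intersection: frames keep being buffered as long as the current frame shares at least one face with the buffered frames. The function name is hypothetical, and adding any newly seen face to the set of known faces is an assumption, since the example in the text does not cover a frame that mixes a known face with a new one.

```python
from typing import Iterable, Iterator, List, Set, Tuple


def flush_when_no_known_face(
    frames: Iterable[Tuple[int, Set[str]]]
) -> Iterator[List[int]]:
    """Yield the buffered frame indices once no previously seen face remains."""
    known: Set[str] = set()       # faces seen in the buffered frames
    buffered: List[int] = []
    for index, faces in frames:
        if buffered and not (faces & known):
            yield buffered        # neither A nor B present: output earlier frames
            known, buffered = set(), []
        if faces:
            known |= faces        # assumption: new faces join the known set
            buffered.append(index)


example = [(1, {"A", "B"}), (2, {"A"}), (3, {"B"}), (4, {"C"})]
flushed = list(flush_when_no_known_face(example))
```

In this run the fourth frame, which contains only face C, triggers the output of the first three frames and then starts a new buffer.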

Illustratively, when outputting previously obtained video frames, only the part of those frames best suited to face attribute detection may be output, further reducing the computation required for face attribute detection.

For example, in the first example above, if the third frame does not include face A, the video frame with the best face location detection result among the previously obtained video frames (that is, the first frame and/or the second frame) may be output for face attribute detection. Illustratively, the video frame with the best face location detection result may be the one whose face image blurriness is below a predetermined threshold and whose face pose is optimal. Illustratively, an optimal face pose may mean that the face's three angle values in three-dimensional space (roll, pitch, yaw) are within a reasonable range (for example, less than or equal to 10 degrees) and that their sum of squares is minimal.
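
By way of illustration only, the selection rule described above (blurriness below a threshold, each pose angle within about 10 degrees, minimal sum of squared angles) can be sketched as follows. The function name, the tuple layout and the default blur threshold of 0.3 are hypothetical; only the 10-degree figure comes from the text.

```python
from typing import Iterable, Optional, Tuple

# Hypothetical candidate layout: (frame index, blur, (roll, pitch, yaw) in degrees).
Candidate = Tuple[int, float, Tuple[float, float, float]]


def pick_best_frame(
    candidates: Iterable[Candidate],
    blur_threshold: float = 0.3,   # hypothetical predetermined threshold
    max_angle: float = 10.0,       # "less than or equal to 10 degrees"
) -> Optional[int]:
    """Return the index of the sharp frame with the most frontal pose, or None."""
    best_index: Optional[int] = None
    best_score = float("inf")
    for index, blur, pose in candidates:
        if blur >= blur_threshold:
            continue               # too blurry
        if any(abs(angle) > max_angle for angle in pose):
            continue               # pose outside the reasonable range
        score = sum(angle * angle for angle in pose)
        if score < best_score:
            best_index, best_score = index, score
    return best_index


best = pick_best_frame([
    (1, 0.1, (8.0, 8.0, 8.0)),     # sharp, but far from frontal
    (2, 0.5, (0.0, 0.0, 0.0)),     # frontal, but too blurry
    (3, 0.2, (1.0, 2.0, 3.0)),     # sharp and nearly frontal
    (4, 0.2, (0.0, 12.0, 0.0)),    # pitch outside the range
])
```

Returning None when no frame qualifies is a design choice of this sketch; the text does not say what happens in that case.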

As another example, in the second example above, if the fourth frame includes neither face A nor face B, the first frame among the previously obtained video frames (that is, the first, second and third frames) may be output for face attribute detection, because the first frame includes both face A and face B. Alternatively, the frame best suited to face attribute detection may be selected from these three frames and output. Further, a small image (for example, 128*128) of the best quality may be extracted from these three frames and output for face attribute detection. The 128*128 image contains the face; it may be the face frame image obtained during face location detection, or a face image obtained by suitably enlarging or shrinking the face frame.
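
By way of illustration only, enlarging or shrinking the face frame before cropping can be sketched as scaling the box about its centre and clipping it to the image. The function name and the convention that (x, y) is the top-left corner of the face frame are assumptions; resizing the cropped region to 128*128 would be left to whatever image library is in use.

```python
from typing import Tuple


def scale_face_box(
    x: float, y: float, w: float, h: float,
    scale: float, image_w: int, image_h: int,
) -> Tuple[int, int, int, int]:
    """Scale a face frame about its centre and clip it to the image bounds.

    Returns (left, top, right, bottom) in pixels. (x, y) is assumed to be the
    top-left corner of the face frame.
    """
    cx, cy = x + w / 2.0, y + h / 2.0
    new_w, new_h = w * scale, h * scale
    left = max(0, int(round(cx - new_w / 2.0)))
    top = max(0, int(round(cy - new_h / 2.0)))
    right = min(image_w, int(round(cx + new_w / 2.0)))
    bottom = min(image_h, int(round(cy + new_h / 2.0)))
    return left, top, right, bottom


crop = scale_face_box(100, 100, 100, 100, 1.5, 640, 480)
```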

The above has described, by way of example, how the face detection method according to embodiments of the present invention, after performing face location detection on an obtained video frame, determines based on the face location detection result whether the currently obtained video frame includes the face of a person included in a previously obtained video frame and, if so, returns to the step of obtaining a video frame, otherwise outputs at least one previously obtained video frame. The following describes, by way of example, how to determine whether the currently obtained video frame includes the face of a person included in a previously obtained video frame.

In one embodiment, face matching may be used to determine whether the currently obtained video frame includes the face of a person included in a previously obtained video frame. In another embodiment, the determination may be made from the position information of the face frames detected in preceding and following frames. For example, it may be determined whether the distance between the position of the face frame detected in the currently obtained video frame and the position of the face frame detected in the preceding frame or preceding few frames is less than a predetermined threshold. If so, it can be determined that the currently obtained video frame includes the face of a person included in the previously obtained video frame; otherwise, it can be determined that it does not. Since typically more than 25 frames are processed per second and faces do not move very fast, the face frame positions in adjacent frames do not differ much and should be clearly distinguishable from those of other faces. A distance threshold therefore makes it possible to judge quickly whether a face in the currently obtained video frame is the same person's face as a face in the preceding frame or preceding few frames.
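
By way of illustration only, the distance-threshold check described above can be sketched as follows. Measuring the distance between the centres of the two face frames and the default threshold of 40 pixels are assumptions; the text only requires that the distance between the face frame positions be less than a predetermined threshold.

```python
import math
from typing import Tuple

Box = Tuple[float, float, float, float]  # assumed layout: (x, y, w, h), top-left origin


def same_face_by_distance(previous: Box, current: Box, max_distance: float = 40.0) -> bool:
    """Treat two face frames as the same person if their centres are close enough."""
    px, py = previous[0] + previous[2] / 2.0, previous[1] + previous[3] / 2.0
    cx, cy = current[0] + current[2] / 2.0, current[1] + current[3] / 2.0
    return math.hypot(cx - px, cy - py) < max_distance


near = same_face_by_distance((100, 100, 50, 50), (106, 108, 50, 50))
far = same_face_by_distance((100, 100, 50, 50), (400, 100, 50, 50))
```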

In yet another embodiment, the face location detection result obtained by performing face location detection on a video frame may include a face identification number (track id), the face identification number of the same person being the same. In other words, the face identification number is a number assigned to distinguish the faces of different people. If a person's face is continuously present in consecutive video frames, the same face identification number is obtained in the face location detection result of every frame in that run; that is, the track id has the same value across those frames and does not change. If the track id changes at some frame, the person either does not appear in that frame or is occluded and not visible.

On this basis, determining whether the currently obtained video frame includes the face of a person included in a previously obtained video frame may include: determining whether the face identification number in the currently obtained video frame has changed compared with the face identification number in the previously obtained video frame. If so, it is determined that the currently obtained video frame does not include the face of the person included in the previously obtained video frame; if not, it is determined that it does. Consistent with the earlier description, if the face identification number in the currently obtained video frame has changed compared with that in the previously obtained video frame, the face has been interrupted at the currently obtained video frame, and at least one of the video frames preceding the interruption (the previously obtained video frames) may then be output for face attribute detection, as described above.

The subsequent steps of the face detection method 200 according to an embodiment of the present invention are now described with continued reference to Fig. 2.

In step S230, face attribute detection is performed on the output video frame based on the face location detection result of the at least one output video frame.

In one embodiment, face attribute detection on a video frame may include detecting at least one of the following: the age, gender, nationality and so on of the person to whom the face belongs.

In one embodiment, the face attribute detection of step S230 may be performed with a face attribute detection neural network. Similarly, the face location detection of step S210 may be performed with a face location detection neural network. In another example, face location detection and face attribute detection may also be performed with a multi-channel neural network.

Owing to the operation of step S220, the number of face attribute detections is greatly reduced: the face attribute detection neural network may be run only once after the face location detection neural network has been run several times, which greatly improves the efficiency of processing video frames per unit time. In addition, owing to the operation of step S220, the image passed to the face attribute detection neural network is the one of best quality, so the confidence of each attribute also reaches its highest value, which ensures the quality of the face detection as a whole.

Fig. 3 shows a schematic diagram of the process of the face detection method according to an embodiment of the present invention. As shown in Fig. 3, the input video frame passes through multiple convolutional layers conv1, conv2, ..., conv7 to produce the face location detection result, which includes the face coordinates (x, y), the width w and height h of the face frame, the face identification number track id, the face pose pos, and the face image blurriness blur. The seven convolutional layers shown in Fig. 3 are only an example; another number of convolutional layers may be used. After the determination of step S220 described above has been made on the basis of these results, video frames continue to be obtained and face location detection continues to be performed while the condition is not met (for example, the track id has not changed); once the condition is met (for example, the track id changes), face attribute detection is performed, for example for attribute 1, attribute 2, ..., attribute N.
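
By way of illustration only, the gating shown in Fig. 3 can be sketched with two stand-in functions in place of the neural networks, so that the number of runs of each can be counted. All names below are hypothetical, the location result is reduced to a (track id, blur) pair, and choosing the buffered frame with the lowest blur is a simplification of the "best frame" criteria described earlier.

```python
from typing import Callable, Dict, Iterable, List, Optional, Tuple

# Hypothetical location result: (track id, or None if no face; blur score).
Location = Tuple[Optional[int], float]


def run_gated_pipeline(
    frames: Iterable[int],
    detect_location: Callable[[int], Location],
    detect_attributes: Callable[[int, Location], Dict],
) -> List[Dict]:
    """Run location detection on every frame, attribute detection only on interruption."""
    results: List[Dict] = []
    previous_id: Optional[int] = None
    buffered: List[Tuple[int, Location]] = []
    for frame in frames:
        location = detect_location(frame)         # runs for every frame
        track_id = location[0]
        if buffered and track_id != previous_id:  # condition met: track id changed
            # Simplification: "best" here means lowest blur only.
            best_frame, best_location = min(buffered, key=lambda item: item[1][1])
            results.append(detect_attributes(best_frame, best_location))
            buffered = []
        if track_id is not None:
            buffered.append((frame, location))
        previous_id = track_id
    return results


# Stand-in "networks" that replay scripted results and count their runs.
calls = {"location": 0, "attributes": 0}
script = {0: (1, 0.4), 1: (1, 0.1), 2: (1, 0.3), 3: (None, 1.0), 4: (2, 0.2)}


def fake_location(frame: int) -> Location:
    calls["location"] += 1
    return script[frame]


def fake_attributes(frame: int, location: Location) -> Dict:
    calls["attributes"] += 1
    return {"frame": frame, "track_id": location[0]}


output = run_gated_pipeline(range(5), fake_location, fake_attributes)
```

In this scripted run the location stand-in is called for all five frames, while the attribute stand-in is called once, on the sharpest of the three frames that share track id 1.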

Based on the above description, the face detection method according to embodiments of the present invention does not perform face attribute detection immediately after performing face location detection on an obtained video frame. Instead, attribute detection for a face is performed on the video frames preceding the interruption only when that face is interrupted in the video, so the time-consuming attribute detection runs far less often overall, which reduces total processing time and improves the efficiency of processing video frames per unit time.

The face detection method according to embodiments of the present invention has been described above by way of example. Illustratively, the face detection method according to embodiments of the present invention may be implemented in a device, apparatus or system having a memory and a processor.

In addition, the face detection method according to an embodiment of the present invention has a high processing speed and can be conveniently deployed on a capture machine, or on mobile devices such as smartphones, tablet computers, and personal computers. Alternatively, the face detection method according to an embodiment of the present invention can be deployed on the server side (or in the cloud). Alternatively, the face detection method according to an embodiment of the present invention can be deployed in a distributed manner across the server side (or cloud) and personal terminals.

A face detection device provided by another aspect of the present invention is described below with reference to Fig. 4. Fig. 4 shows a schematic block diagram of a face detection device 400 according to an embodiment of the present invention.

As shown in Fig. 4, the face detection device 400 according to an embodiment of the present invention includes a face location detection module 410, a judgment module 420, and a face attribute detection module 430. These modules can respectively perform the steps/functions of the face detection method described above in conjunction with Fig. 2. Only the main functions of each module of the face detection device 400 are described below; details already described above are omitted.

The face location detection module 410 is used to obtain a video frame and perform face location detection on the acquired video frame to obtain a face location detection result of the acquired video frame. The judgment module 420 is used to determine, based on the face location detection result, whether the currently obtained video frame includes the face of a person included in the previously obtained video frames, and if not, to output at least one of the previously obtained video frames. The face attribute detection module 430 is used to perform face attribute detection on the output video frame based on the face location detection result of the at least one video frame output by the judgment module. The face location detection module 410, the judgment module 420, and the face attribute detection module 430 can be implemented by the processor 102 in the electronic device shown in Fig. 1 running program instructions stored in the storage device 104.

In one embodiment, the face location detection module 410 can obtain consecutive video frames frame by frame and perform face location detection on each frame. In one example, the operation of the face location detection module 410 performing face location detection on the acquired video frame may include detecting at least one of the following: face location coordinates, face frame size, face pose, and facial image blur degree. In other examples, the operation may also include detecting any other suitable item.

In an embodiment of the present invention, face attribute detection is not performed immediately after the face location detection module 410 performs face location detection. Instead, the judgment module 420 determines whether a predetermined condition is met, and face attribute detection is performed when the predetermined condition is met. The judgment module 420 may be a module independent of the face location detection module 410, or a module included in the face location detection module 410. Because a face attribute detection neural network usually has more convolutional layers, running it takes longer. Running the face attribute detection neural network only when the predetermined condition is met therefore reduces its overall number of runs and improves the efficiency of processing video frames per unit time.

In an embodiment of the present invention, for each currently acquired frame, after the face location detection module 410 performs face location detection on it, the judgment module 420 can determine, based on the face location detection result of the frame, whether the frame includes the face of a person included in the previously obtained video frames. The exemplary operation of the judgment module 420 can be understood in conjunction with the example described with reference to Fig. 2; for brevity, details are not repeated here.

Illustratively, when outputting the previously obtained video frames, the judgment module 420 may output only the part of those frames that is best suited for face attribute detection, to further reduce the amount of computation for face attribute detection. For example, the judgment module 420 can output, from among the previously obtained video frames, the video frame with the best face location detection result for face attribute detection. Illustratively, the video frame with the best face location detection result may be the video frame whose facial image blur degree is below a predetermined threshold and whose face pose is optimal. As another example, the judgment module 420 can extract a small-size image (for example 128*128) of the best quality from the previously obtained video frames and output it for face attribute detection.
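The selection of the best video frame can be sketched as below. This is an illustrative reading of the criterion, not a definitive one: the names `Candidate` and `select_best_frame` are hypothetical, and "optimal face pose" is assumed to mean the smallest deviation from a frontal pose, expressed as a single number. The text does not specify how pose is scored.

```python
from typing import List, NamedTuple, Optional


class Candidate(NamedTuple):
    frame_index: int
    blur: float            # facial image blur degree, lower is sharper
    pose_deviation: float  # assumed: deviation from a frontal pose, lower is better


def select_best_frame(
    candidates: List[Candidate], blur_threshold: float
) -> Optional[Candidate]:
    """Pick the frame with the best pose among frames below the blur threshold."""
    sharp = [c for c in candidates if c.blur < blur_threshold]
    if not sharp:
        return None  # no frame is sharp enough
    return min(sharp, key=lambda c: c.pose_deviation)
```

Returning `None` when no frame passes the blur threshold is a design choice of this sketch; a caller could equally fall back to the least blurred frame.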

In one embodiment, the judgment module 420 can determine, by a face matching method, whether the currently obtained video frame includes the face of a person included in the previously obtained video frames. In another embodiment, the judgment module 420 can make this determination according to the location information of the face frames detected in the preceding and following frames.
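The text does not say how the face frame positions of consecutive frames are compared. One common choice, assumed here purely for illustration, is the intersection over union (IoU) of the two boxes. The function names, the (x, y, w, h) layout with (x, y) as the top-left corner, and the 0.5 threshold are all hypothetical.

```python
from typing import Tuple

Box = Tuple[float, float, float, float]  # (x, y, w, h), (x, y) assumed top-left


def iou(box_a: Box, box_b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def same_face(prev_box: Box, cur_box: Box, min_iou: float = 0.5) -> bool:
    """Treat two boxes in consecutive frames as the same face if they overlap enough."""
    return iou(prev_box, cur_box) >= min_iou
```

A box that moves only slightly between frames keeps a high IoU and is treated as the same person, while a box that appears elsewhere in the frame is treated as a new face.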

In yet another embodiment, the face location detection result obtained by the face location detection module 410 performing face location detection on a video frame may include a face identification number (track id), and the face identification number of the same person is the same.

Based on this, the judgment module 420 determining whether the currently obtained video frame includes the face of a person included in the previously obtained video frames may include: determining whether the face identification number in the currently obtained video frame has changed compared with the face identification number in the previously obtained video frames. If so, it is determined that the currently obtained video frame does not include the face of a person included in the previously obtained video frames; if not, it is determined that the currently obtained video frame includes the face of a person included in the previously obtained video frames.

In one embodiment, the face attribute detection performed on a video frame by the face attribute detection module 430 may include detecting at least one of the following: the age, gender, nationality, etc. of the person corresponding to the face.

In one embodiment, the face attribute detection module 430 can use a face attribute detection neural network to implement the above face attribute detection. Similarly, the face location detection module 410 can use a face location detection neural network to implement the above face location detection.

Based on the operation of the judgment module 420, the number of times the face attribute detection module 430 performs face attribute detection is greatly reduced: the face attribute detection neural network may be run only once after the face location detection neural network has been run several times, which greatly improves the efficiency of processing video frames per unit time. In addition, based on the operation of the judgment module 420, the image output to the face attribute detection module 430 is the one of best quality, so that the confidence of each attribute also reaches its highest value, which ensures the quality of the face detection as a whole.

Based on the above description, the face detection device according to an embodiment of the present invention does not perform face attribute detection directly after performing face location detection on each acquired video frame. Instead, only when the same face is interrupted in the video frames does it perform attribute detection of that face on the video frames before the interruption. The overall number of runs of the time-consuming attribute detection is thereby greatly reduced, which reduces the time consumed and improves the efficiency of processing video frames per unit time.

Fig. 5 shows a schematic block diagram of a face detection system 500 according to an embodiment of the present invention. The face detection system 500 includes a storage device 510 and a processor 520.

The storage device 510 stores program code for implementing the corresponding steps of the face detection method according to an embodiment of the present invention. The processor 520 is used to run the program code stored in the storage device 510 to execute the corresponding steps of the face detection method according to an embodiment of the present invention, and to implement the corresponding modules of the face detection device according to an embodiment of the present invention.

In one embodiment, the program code, when run by the processor 520, causes the face detection system 500 to perform the following steps: obtaining a video frame, and performing face location detection on the acquired video frame to obtain a face location detection result of the acquired video frame; determining, based on the face location detection result, whether the currently obtained video frame includes the face of a person included in the previously obtained video frames, and if so returning to the step of obtaining a video frame, otherwise outputting at least one of the previously obtained video frames; and performing face attribute detection on the output video frame based on the face location detection result of the at least one output video frame.

In one embodiment, the face location detection result includes a face identification number, and the face identification number of the same person is the same. The step, performed by the face detection system 500 when the program code is run by the processor 520, of determining whether the currently obtained video frame includes the face of a person included in the previously obtained video frames includes: determining whether the face identification number in the currently obtained video frame has changed compared with the face identification number in the previously obtained video frames; if so, determining that the currently obtained video frame does not include the face of a person included in the previously obtained video frames; otherwise, determining that the currently obtained video frame includes the face of a person included in the previously obtained video frames.

In one embodiment, the step, performed by the face detection system 500 when the program code is run by the processor 520, of outputting at least one of the previously obtained video frames includes: outputting, from among the previously obtained video frames, the video frame with the best face location detection result for the face attribute detection.

In one embodiment, the step, performed by the face detection system 500 when the program code is run by the processor 520, of performing face location detection on the acquired video frame includes detecting at least one of the following: face location coordinates, face frame size, face pose, and facial image blur degree.

In one embodiment, the video frame with the best face location detection result is the video frame whose facial image blur degree is below a predetermined threshold and whose face pose is optimal.

In one embodiment, the face location detection performed by the face detection system 500 when the program code is run by the processor 520 is implemented by a face location detection neural network, and the face attribute detection performed by the face detection system 500 when the program code is run by the processor 520 is implemented by a face attribute detection neural network.

In addition, according to an embodiment of the present invention, a storage medium is also provided. Program instructions are stored on the storage medium, and when run by a computer or processor, the program instructions are used to execute the corresponding steps of the face detection method of the embodiment of the present invention and to implement the corresponding modules of the face detection device according to an embodiment of the present invention. The storage medium may include, for example, a memory card of a smartphone, a storage unit of a tablet computer, a hard disk of a personal computer, read-only memory (ROM), erasable programmable read-only memory (EPROM), portable compact disc read-only memory (CD-ROM), USB memory, or any combination of the above storage media. The computer-readable storage medium may be any combination of one or more computer-readable storage media.

In one embodiment, the computer program instructions, when run by a computer, can implement each functional module of the face detection device according to an embodiment of the present invention and/or can execute the face detection method according to an embodiment of the present invention.

In one embodiment, the computer program instructions, when run by a computer or processor, cause the computer or processor to perform the following steps: obtaining a video frame, and performing face location detection on the acquired video frame to obtain a face location detection result of the acquired video frame; determining, based on the face location detection result, whether the currently obtained video frame includes the face of a person included in the previously obtained video frames, and if so returning to the step of obtaining a video frame, otherwise outputting at least one of the previously obtained video frames; and performing face attribute detection on the output video frame based on the face location detection result of the at least one output video frame.

In one embodiment, the face location detection result includes a face identification number, and the face identification number of the same person is the same. The step, performed by the computer or processor when the computer program instructions are run, of determining whether the currently obtained video frame includes the face of a person included in the previously obtained video frames includes: determining whether the face identification number in the currently obtained video frame has changed compared with the face identification number in the previously obtained video frames; if so, determining that the currently obtained video frame does not include the face of a person included in the previously obtained video frames; otherwise, determining that the currently obtained video frame includes the face of a person included in the previously obtained video frames.

In one embodiment, the step, performed by the computer or processor when the computer program instructions are run, of outputting at least one of the previously obtained video frames includes: outputting, from among the previously obtained video frames, the video frame with the best face location detection result for the face attribute detection.

In one embodiment, the step, performed by the computer or processor when the computer program instructions are run, of performing face location detection on the acquired video frame includes detecting at least one of the following: face location coordinates, face frame size, face pose, and facial image blur degree.

In one embodiment, the face location detection performed by the computer or processor when the computer program instructions are run is implemented by a face location detection neural network, and the face attribute detection performed by the computer or processor when the computer program instructions are run is implemented by a face attribute detection neural network.

In addition, according to an embodiment of the present invention, a capture machine is also provided, which may include an image acquisition device and a face detection device. The image acquisition device can be used to acquire snapshot images, and the face detection device can perform face location detection and face attribute detection on the images obtained by the image acquisition device. The face detection device can be implemented as the face detection device 400 described above in conjunction with Fig. 4, or as the face detection system 500 described above in conjunction with Fig. 5. Those of ordinary skill in the art can understand, in conjunction with the preceding description of Fig. 4 or Fig. 5, the operation of the face detection device included in the capture machine according to an embodiment of the present invention and the operation of the capture machine itself; for brevity, details are not repeated here.

Although example embodiments have been described here with reference to the accompanying drawings, it should be understood that the above example embodiments are merely illustrative and are not intended to limit the scope of the present invention thereto. Those of ordinary skill in the art can make various changes and modifications therein without departing from the scope and spirit of the present invention. All such changes and modifications are intended to be included within the scope of the present invention as claimed in the appended claims.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in conjunction with the embodiments disclosed herein can be implemented in electronic hardware, or in a combination of computer software and electronic hardware. Whether these functions are performed in hardware or software depends on the specific application and design constraints of the technical solution. Skilled professionals may use different methods to implement the described functions for each specific application, but such implementations should not be considered beyond the scope of the present invention.

In the several embodiments provided in this application, it should be understood that the disclosed devices and methods may be implemented in other ways. For example, the device embodiments described above are merely illustrative: the division into units is only a division by logical function, and other divisions are possible in actual implementation. For example, multiple units or components may be combined or integrated into another device, or some features may be omitted or not performed.

Numerous specific details are set forth in the description provided here. It is to be understood, however, that embodiments of the invention may be practiced without these specific details. In some instances, well-known methods, structures, and techniques have not been shown in detail so as not to obscure the understanding of this description.

Similarly, it should be understood that, in order to streamline the present invention and aid understanding of one or more of the various inventive aspects, in the description of exemplary embodiments of the present invention the features of the invention are sometimes grouped together into a single embodiment, figure, or description thereof. However, this manner of disclosure should not be interpreted as reflecting an intention that the claimed invention requires more features than are expressly recited in each claim. Rather, as the corresponding claims reflect, the inventive point lies in that the corresponding technical problem can be solved with fewer than all the features of a single disclosed embodiment. The claims following the detailed description are therefore expressly incorporated into this detailed description, with each claim standing on its own as a separate embodiment of the invention.

Those skilled in the art will understand that, except where features are mutually exclusive, all features disclosed in this specification (including the accompanying claims, abstract, and drawings) and all processes or units of any method or device so disclosed may be combined in any combination. Unless expressly stated otherwise, each feature disclosed in this specification (including the accompanying claims, abstract, and drawings) may be replaced by an alternative feature serving the same, equivalent, or similar purpose.

Furthermore, those skilled in the art will understand that although some embodiments described herein include certain features that are included in other embodiments and not others, combinations of features of different embodiments are meant to be within the scope of the present invention and to form different embodiments. For example, in the claims, any one of the claimed embodiments can be used in any combination.

The various component embodiments of the present invention can be implemented in hardware, in software modules running on one or more processors, or in a combination thereof. Those skilled in the art will understand that a microprocessor or a digital signal processor (DSP) can be used in practice to implement some or all of the functions of some modules according to an embodiment of the present invention. The present invention can also be implemented as a program for a device (for example, a computer program or a computer program product) for performing part or all of the method described herein. Such a program implementing the present invention may be stored on a computer-readable medium, or may take the form of one or more signals. Such signals may be downloaded from an Internet website, provided on a carrier signal, or provided in any other form.

It should be noted that the above embodiments illustrate rather than limit the invention, and that those skilled in the art can design alternative embodiments without departing from the scope of the appended claims. In the claims, any reference signs placed between parentheses shall not be construed as limiting the claim. The word "comprising" does not exclude the presence of elements or steps not listed in a claim. The word "a" or "an" preceding an element does not exclude the presence of a plurality of such elements. The invention can be implemented by means of hardware comprising several distinct elements and by means of a suitably programmed computer. In a unit claim enumerating several devices, several of these devices may be embodied by one and the same item of hardware. The use of the words first, second, and third does not indicate any order; these words may be interpreted as names.

The above is merely a description of specific embodiments of the present invention, and the scope of protection of the present invention is not limited thereto. Any change or replacement that a person skilled in the art can readily conceive within the technical scope disclosed by the present invention shall be covered by the scope of protection of the present invention. The scope of protection of the present invention shall be subject to the scope of protection of the claims.

Claims

1. A face detection method, characterized in that the face detection method comprises:

obtaining a video frame, and performing face location detection on the acquired video frame to obtain a face location detection result of the acquired video frame;

determining, based on the face location detection result, whether the currently obtained video frame includes the face of a person included in the previously obtained video frames, and if so returning to the step of obtaining a video frame, otherwise outputting at least one of the previously obtained video frames; and

performing face attribute detection on the output video frame based on the face location detection result of the at least one output video frame.

2. The face detection method according to claim 1, characterized in that the face location detection result includes a face identification number, the face identification number of the same person being the same, and the determining whether the currently obtained video frame includes the face of a person included in the previously obtained video frames comprises:

determining whether the face identification number in the currently obtained video frame has changed compared with the face identification number in the previously obtained video frames; if so, determining that the currently obtained video frame does not include the face of a person included in the previously obtained video frames; otherwise, determining that the currently obtained video frame includes the face of a person included in the previously obtained video frames.

3. The face detection method according to claim 1 or 2, characterized in that the outputting at least one of the previously obtained video frames comprises:

outputting, from among the previously obtained video frames, the video frame with the best face location detection result for the face attribute detection.

4. The face detection method according to claim 3, characterized in that the performing face location detection on the acquired video frame includes detecting at least one of the following: face location coordinates, face frame size, face pose, and facial image blur degree.

5. The face detection method according to claim 4, characterized in that the video frame with the best face location detection result is the video frame whose facial image blur degree is below a predetermined threshold and whose face pose is optimal.

6. The face detection method according to claim 1, characterized in that the face location detection is implemented by a face location detection neural network, and the face attribute detection is implemented by a face attribute detection neural network.

7. A face detection device, characterized in that the face detection device comprises:

a face location detection module, configured to obtain a video frame and perform face location detection on the acquired video frame to obtain a face location detection result of the acquired video frame;

a judgment module, configured to determine, based on the face location detection result, whether the currently obtained video frame includes the face of a person included in the previously obtained video frames, and if not, to output at least one of the previously obtained video frames; and

a face attribute detection module, configured to perform face attribute detection on the output video frame based on the face location detection result of the at least one video frame output by the judgment module.

8. The face detection device according to claim 7, characterized in that the face location detection result includes a face identification number, the face identification number of the same person being the same, and the judgment module is further configured to:

9. The face detection device according to claim 7 or 8, characterized in that the operation of the judgment module outputting at least one of the previously obtained video frames comprises:

10. The face detection device according to claim 9, characterized in that the face location detection module performing face location detection on the acquired video frame includes detecting at least one of the following: face location coordinates, face frame size, face pose, and facial image blur degree.

11. The face detection device according to claim 10, characterized in that the video frame with the best face location detection result is the video frame whose facial image blur degree is below a predetermined threshold and whose face pose is optimal.

12. The face detection device according to claim 7, characterized in that the face location detection module uses a face location detection neural network to implement the face location detection, and the face attribute detection module uses a face attribute detection neural network to implement the face attribute detection.

13. A face detection system, characterized in that the face detection system includes a storage device and a processor, the storage device storing a computer program to be run by the processor, and the computer program, when run by the processor, executing the face detection method according to any one of claims 1-6.

14. A storage medium, characterized in that a computer program is stored on the storage medium, and the computer program, when run, executes the face detection method according to any one of claims 1-6.

15. A capture machine, characterized in that the capture machine includes an image acquisition device and the face detection device according to any one of claims 7-12.