WO2018068533A1 - Human face detection method and device

Info

Publication number: WO2018068533A1
Application number: PCT/CN2017/090296
Authority: WO
Inventors: 韦国恒; 蒋文; 杨龙
Original assignee: 深圳云天励飞技术有限公司
Priority date: 2016-10-10
Filing date: 2017-06-27
Publication date: 2018-04-19
Also published as: CN106529408A; CN106529408B

Abstract

A human face detection method and device. The method comprises: inputting an image and dividing the image into regions to be scanned (101); configuring dual threads for each of the regions to be scanned, and controlling a first thread and a second thread to process a first flow and a second flow in parallel (102), i.e. the second thread processes the second flow of the previous round while the first thread processes the first flow; configuring and fixing the classifier parameters for the first M cycles of the first flow; configuring multiple engines that can scan a plurality of regions to be scanned in parallel; and configuring a control module for tracking the working state of each single engine, recording the state of each region to be scanned, and coordinating the work of each single engine. With this method, human face detection can be implemented from the new perspective of FPGA hardware, processing time is reduced, and the speed and performance of human face detection are improved.

Description

Method and device for face detection

The present application claims priority to Chinese Patent Application No. 201610884062.2, entitled "Method and Apparatus for Face Detection", filed on October 10, 2016, the entire contents of which are incorporated herein by reference.

Technical field

The present invention relates to the field of face detection, and in particular, to a method and apparatus for face detection.

Background art

At present, face detection technology is widely used in new-generation human-machine interfaces, secure access, video surveillance, and content-based retrieval. Existing face detection implementations include software-only implementations and Field-Programmable Gate Array (FPGA) acceleration schemes. The software-only implementation is very slow: a 640*480 image usually takes about one to two seconds. Existing FPGA acceleration schemes generally run at no more than 10 fps.

A real-time FPGA face capture system has to use a very high-end FPGA chip. A common flow for an FPGA-accelerated face detection algorithm is to first scan an image area and compare it with the classifiers. If the area passes every classifier, the face information is output; otherwise, the image is re-scanned. However, in the existing cascaded face recognition algorithm, the comparison of each classifier stage can start only after the previous stage has completed. For an image area containing a face, all the classifiers must be compared before the final score value (which indicates how close the detected face is to a real face) can be obtained. In addition, the classifiers have no fixed pattern for reading the image area, so parallel processing is difficult. And because a large number of image areas need to be scanned, processing a picture is slow. As a result, face detection is slow and inefficient.

Summary of the invention

The embodiments of the invention disclose a method and a device for face detection, which can implement face detection from the new perspective of a field-programmable gate array (FPGA), can solve the technical problem that existing face detection is slow and inefficient, and have the advantages of high speed and high performance.

A first aspect of the embodiments of the present invention discloses a method for detecting a face, including:

Inputting a picture, and dividing the picture into areas to be scanned; configuring dual threads for each of the areas to be scanned, and controlling the first thread and the second thread to process the first process and the second process in parallel;

The first process includes steps a to e, and specifically includes:

a. calculating a read address of the classifier in the area to be scanned;

b. reading the value of a pixel in the area to be scanned according to the read address of the classifier;

c. comparing the values of the read pixels;

d. selecting the next classifier according to the comparison result of the previous step;

e. repeating from step a, the number of repetitions being determined by the height of the tree structure of the classifier;

The second process includes steps f to k, and specifically includes:

f. reading a comparison score from the lookup table according to the comparison result of the first process;

g. accumulating the comparison scores read in the previous step to obtain a score of the area to be scanned;

h. comparing the score of the area to be scanned with a preset threshold;

i. if the comparison result of step h is negative, indicating that the area to be scanned contains no face, scanning the next area to be scanned;

j. if the comparison result of step h is positive, jumping back to repeat from step a;

k. after the repetitions are completed, if the final comparison result is positive, outputting the face information, wherein the face information includes coordinates, size, and a score value.
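As a non-limiting illustration, steps a to k for a single area to be scanned may be sketched in Python as follows. The names `Node`, `Stage`, and `scan_area`, the heap-ordered binary tree, and the rule of comparing two pixel values are assumptions made for this sketch and are not specified in the original disclosure.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Node:
    """Hypothetical classifier node: offsets of two pixels inside the area."""
    dy1: int
    dx1: int
    dy2: int
    dx2: int


@dataclass
class Stage:
    """One cycle of the cascade: a binary tree plus its lookup table."""
    nodes: List[Node]    # internal nodes of a complete binary tree, heap order
    height: int          # number of comparisons per cycle (step e)
    lut: List[float]     # one score per leaf (step f)
    threshold: float     # preset threshold (step h)


def scan_area(image: List[List[int]], top: int, left: int, size: int,
              stages: List[Stage]) -> Optional[Tuple[int, int, int, float]]:
    """Return (top, left, size, score) if every cycle passes, else None."""
    score = 0.0
    for stage in stages:
        # First process, steps a to e.
        idx = 0
        for _ in range(stage.height):
            node = stage.nodes[idx]
            # a. compute the read addresses of the classifier
            y1, x1 = top + node.dy1, left + node.dx1
            y2, x2 = top + node.dy2, left + node.dx2
            # b. read the pixel values
            p1, p2 = image[y1][x1], image[y2][x2]
            # c. compare them (assumed rule: first pixel brighter than second)
            bit = 1 if p1 > p2 else 0
            # d. select the next classifier (child node in heap order)
            idx = 2 * idx + 1 + bit
        leaf = idx - (2 ** stage.height - 1)
        # Second process, steps f to k.
        score += stage.lut[leaf]          # f, g: look up and accumulate
        if score < stage.threshold:       # h, i: no face, give up on this area
            return None
        # j: positive result, continue with the next cycle
    return (top, left, size, score)       # k: coordinates, size, score
```

In this sketch an area is rejected as soon as one cycle fails the threshold test, so only areas that pass every cycle produce face information.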

In an optional implementation of the method provided by the first aspect, controlling the first thread and the second thread to process the first process and the second process in parallel includes: while the first thread processes the first process, the second thread processes the second process of the previous round.

In an optional implementation of the method provided by the first aspect, inputting the picture includes:

compressing the entire picture, and buffering the whole compressed picture into the on-chip random access memory (RAM) of the field-programmable gate array (FPGA) hardware.

In an optional implementation of the method provided by the first aspect, before controlling the first thread and the second thread to process the first process and the second process in parallel, the method further includes: configuring and fixing the parameters of the classifier for the first M cycles of the first process, wherein M is a positive integer greater than zero.

In an optional implementation of the method provided by the first aspect, the method further includes: configuring multiple engines, configuring dual threads inside each single engine of the multiple engines, and controlling the first thread and the second thread to process the first process and the second process in parallel, the multiple engines scanning multiple areas to be scanned in parallel; and configuring a control module, the control module being configured to track the working state of each of the single engines, record the status of each of the areas to be scanned, and allocate the work of each of the single engines.

A second aspect of the embodiments of the present invention discloses a device for face detection, comprising: an input unit, configured to input a picture and divide the picture into areas to be scanned; and a processing unit, configured to configure dual threads for each of the areas to be scanned, and control the first thread and the second thread to process the first process and the second process in parallel.

The first process includes steps a to e, and specifically includes:

a. calculating a read address of the classifier in the area to be scanned;

c. comparing the values of the read pixels;

h. comparing the score of the area to be scanned with a preset threshold;

i. if the comparison result of step h is negative, indicating that the area to be scanned contains no face, scanning the next area to be scanned;

j. if the comparison result of step h is positive, jumping back to repeat from step a;

In an optional implementation of the device provided by the second aspect, controlling the first thread and the second thread to process the first process and the second process in parallel includes: while the first thread processes the first process, the second thread processes the second process of the previous round.

In an optional implementation of the device provided by the second aspect, inputting the picture to be scanned includes:

compressing the entire picture to be scanned, and buffering the whole compressed picture into the on-chip random access memory (RAM) of the field-programmable gate array (FPGA) hardware.

In an optional implementation of the device provided by the second aspect, the device further includes:

a parameter fixing unit, configured to configure and fix the parameters of the classifier for the first M cycles of the first process, wherein M is a positive integer greater than zero;

a multi-engine configuration unit, configured to configure multiple engines, each single engine of the multiple engines being internally configured with dual threads and controlling the first thread and the second thread to process the first process and the second process in parallel, the multiple engines being configured to scan a plurality of areas to be scanned in parallel; and a control module configuration unit, configured to configure a control module, the control module being configured to track the working state of each of the single engines, record the state of each area to be scanned, and allocate the work of each of the single engines.

In the embodiments of the present invention, the entire picture to be scanned is compressed, and the whole compressed picture is buffered into the on-chip random access memory (RAM) of the field-programmable gate array (FPGA) hardware; dual threads are configured for each area to be scanned, and the first thread and the second thread are controlled to process the first process and the second process in parallel, that is, while the first thread processes the first process, the second thread processes the second process of the previous round; in addition, the parameters of the classifier for the first M cycles of the first process are configured and fixed, M being a positive integer greater than 0; multiple engines are configured to scan multiple areas to be scanned in parallel; and a control module is configured to track the working state of each engine, record the status of each area to be scanned, and allocate the work of each engine. It can be seen that the technical solution provided by the embodiments of the present invention can reduce the time taken by face detection and improve its speed and performance.

Brief description of the drawings

In order to illustrate the technical solutions in the embodiments of the present invention more clearly, the drawings used in the embodiments are briefly described below. Obviously, the drawings described below show only some embodiments of the present invention, and those skilled in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flow chart of a method for detecting a face according to an embodiment of the present invention.

FIG. 2 is a schematic diagram of an implementation scenario of a multi-engine disclosed in an embodiment of the present invention.

FIG. 3 is a schematic flowchart diagram of another method for detecting a face according to an embodiment of the present invention.

FIG. 4 is a schematic structural diagram of an apparatus for detecting a face according to an embodiment of the present invention.

FIG. 5 is a schematic diagram of a hardware architecture of a terminal device of an apparatus for performing the face detection according to an embodiment of the present invention.

Detailed description

Before discussing the exemplary embodiments in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although the flowcharts describe the operations as a sequential process, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. A process may be terminated when its operations are completed, but may also have additional steps not included in the figures. A process may correspond to a method, a function, a procedure, a subroutine, a subprogram, and the like.

A "computer device", also referred to as a "computer" in this context, means an intelligent electronic device that can perform predetermined processing, such as numerical and/or logical calculations, by running a predetermined program or instructions. It may include a processor and a memory, in which case the processor executes instructions pre-stored in the memory to perform the predetermined processing; or the predetermined processing may be performed by hardware such as an ASIC, an FPGA, or a DSP; or by a combination of the two. Computer devices include, but are not limited to, servers, personal computers, notebook computers, tablets, smart phones, and the like.

The methods discussed below, some of which are illustrated by flowcharts, can be implemented in hardware, software, firmware, middleware, microcode, hardware description languages, or any combination thereof. When implemented in software, firmware, middleware or microcode, the program code or code segments to carry out the necessary tasks can be stored in a machine or computer readable medium, such as a storage medium. The processor(s) can perform the necessary tasks.

The specific structural and functional details disclosed are merely representative and are for the purpose of describing exemplary embodiments of the invention. The present invention may, however, be embodied in many alternative forms and should not be construed as being limited only to the embodiments set forth herein.

It should be understood that although the terms "first," "second," etc. may be used herein to describe the various elements, these elements should not be limited by these terms. These terms are used only to distinguish one unit from another. For example, a first unit could be termed a second unit, and similarly a second unit could be termed a first unit, without departing from the scope of the exemplary embodiments. The term "and/or" used herein includes any and all combinations of one or more of the associated listed items.

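The terminology used herein is for the purpose of describing particular embodiments only and is not intended to limit the exemplary embodiments. As used herein, the singular forms "a" and "an" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It is also to be understood that the terms "comprising" and/or "including", as used herein, specify the presence of the stated features, integers, steps, operations, units, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, units, components, and/or combinations thereof.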

It should also be noted that, in some alternative implementations, the functions/acts noted may occur in a different order than that illustrated in the drawings. For example, two figures shown in succession may in fact be executed substantially concurrently or sometimes in the reverse order, depending on the function/acts involved.

The embodiments of the invention disclose a method and a device for face detection, which can implement face detection from the new perspective of a Field-Programmable Gate Array (FPGA) and improve the speed of face detection. The details are described below.

Please refer to FIG. 1. FIG. 1 is a schematic flowchart diagram of a method for detecting a face according to an embodiment of the present invention. As shown in FIG. 1, the method for detecting a face may include the following steps:

S101. Input a picture, and divide the picture into areas to be scanned.

Inputting the picture in step S101 includes: first, compressing the entire picture, the compressed size being determined by the picture resolution and the available memory space; then, buffering the whole compressed picture into the on-chip random access memory (RAM) of the Field-Programmable Gate Array (FPGA), so that each classifier can read pixels quickly and frequently.
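As a non-limiting illustration of step S101, the following Python sketch chooses a compression factor from the picture resolution and a memory budget, compresses the picture, and lists the areas to be scanned. The function names, the block-averaging method, the one-byte-per-pixel assumption, and the square sliding-window layout are assumptions made for this sketch; the original disclosure does not specify how the picture is compressed or divided.

```python
from typing import List, Tuple


def choose_scale(width: int, height: int, ram_bytes: int,
                 bytes_per_pixel: int = 1) -> int:
    """Smallest integer downscale factor whose result fits in ram_bytes."""
    scale = 1
    while (width // scale) * (height // scale) * bytes_per_pixel > ram_bytes:
        scale += 1
    return scale


def downscale(image: List[List[int]], scale: int) -> List[List[int]]:
    """Compress a grayscale image by averaging scale x scale blocks."""
    out_h, out_w = len(image) // scale, len(image[0]) // scale
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            block = [image[y * scale + j][x * scale + i]
                     for j in range(scale) for i in range(scale)]
            row.append(sum(block) // len(block))
        out.append(row)
    return out


def scan_windows(width: int, height: int, size: int,
                 step: int) -> List[Tuple[int, int]]:
    """Top-left corners (top, left) of the square areas to be scanned."""
    return [(top, left)
            for top in range(0, height - size + 1, step)
            for left in range(0, width - size + 1, step)]
```

For example, with a hypothetical budget of 100,000 bytes, `choose_scale(640, 480, 100_000)` selects a factor of 2, so that a 640*480 picture is held as 320*240 pixels.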

S102. Configure a dual thread for each area to be scanned, and control the first thread and the second thread to process the first process and the second process in parallel.

The first process in the foregoing step S102 includes steps a to e, and specifically includes:

a. calculating the read address of the classifier in the area to be scanned;

b. reading the value of a pixel in the area to be scanned according to the read address of the classifier;

c. comparing the values of the read pixels;

The second process in the foregoing step S102 includes steps f to k, and specifically includes:

f. reading a comparison score from the lookup table according to the comparison result of the first process;

g. accumulating the comparison scores read in the previous step to obtain the score of the area to be scanned;

h. comparing the score of the area to be scanned with a preset threshold;

j. if the comparison result of step h is positive, jumping back to repeat from step a;

k. after the repetitions are completed, if the final comparison result is positive, outputting the face information, wherein the face information includes coordinates, size, and a score value.

In the above step S102, dual threads are configured, and the first thread and the second thread are controlled to process the first process and the second process in parallel: the first thread processes the first process, and the second thread processes the second process. For example, after the image of the first area to be scanned is input, the first thread first executes the steps of the first process; when the first process ends, the comparison result is output directly to the second process of the second thread, and the first thread moves straight on to the first process of the next area to be scanned instead of waiting for the second process to end. Thus, for an image area containing a face, it is not necessary to wait for all the classifiers to complete their comparisons before the final score value is obtained. The first thread and the second thread are two independent threads, so the first process and the second process can be processed in parallel, which reduces the processing time. In general, the first process accounts for about 60% of the overall execution time and the second process for about 40%. By configuring dual threads and controlling the first thread and the second thread to process the first process and the second process in parallel, more than 30% of the time can be saved.
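As a non-limiting illustration of the timing argument above, the following Python sketch compares sequential processing with dual-thread processing under an idealized model. The function names and the assumption that every area takes the same time in each process are hypothetical and are not part of the original disclosure; the 6:4 ratio used in the usage note follows the 60%/40% split stated above.

```python
def sequential_time(n_areas: int, t_first: float, t_second: float) -> float:
    """One thread runs the first and second process of each area in turn."""
    return n_areas * (t_first + t_second)


def pipelined_time(n_areas: int, t_first: float, t_second: float) -> float:
    """Thread 1 runs the first process of area i while thread 2 runs the
    second process of area i-1."""
    first_done = 0.0    # time at which thread 1 finishes the current area
    second_done = 0.0   # time at which thread 2 finishes the current area
    for _ in range(n_areas):
        first_done += t_first  # thread 1 never waits for thread 2
        # thread 2 starts once it is free and the comparison result exists
        second_done = max(second_done, first_done) + t_second
    return second_done


def saving(n_areas: int, t_first: float, t_second: float) -> float:
    """Fraction of the sequential time saved by the two-thread pipeline."""
    seq = sequential_time(n_areas, t_first, t_second)
    return 1.0 - pipelined_time(n_areas, t_first, t_second) / seq
```

For 1000 areas with the first process taking 6 time units and the second 4, this model gives 6004 units instead of 10000, a saving of about 40%, which is compatible with the "more than 30%" figure stated above. Early exits and unequal area times are ignored in the model.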

Optionally, the classifier is a scan window of the engine (i.e., the computing unit), and an algorithm determines which pixel values it takes for comparison.

Optionally, in step j of the above step S102, step a is repeated if the comparison result of step h is positive. A positive result in step h indicates that the current cycle has detected a face; however, since step S102 is a multi-cycle operation, multiple cycles are required before the face information can finally be confirmed and output, that is, step k: after the repetitions are completed, if the final comparison result is positive, the face information is output. This ensures the accuracy of face detection. The number of times the flow jumps back to step a is determined by the algorithm.

Optionally, the score value in step k of step S102 indicates how close the detected face is to a real face: the higher the score value, the closer the face in the scanned area is to a real face.

Optionally, before controlling the first thread and the second thread to process the first process and the second process in parallel, step S102 further includes: configuring and fixing the parameters of the classifier for the first M cycles of the first process, where M is a positive integer greater than zero. Most scanned areas exit after only a few cycles. To exploit this characteristic, the classifier parameters for the first few cycles of the first process are fixed (hard-wired) in each engine. In those first few cycles, the engine can operate without reading the classifier parameters, thus avoiding unnecessary RAM reads and speeding up the face detection algorithm.

It can be understood that M is a positive integer greater than 0. Optionally, in a specific implementation of an embodiment of the present invention, M is preferably set to 8, that is, the classifier parameters for the first 8 cycles of the first process are configured and fixed. Of course, in practical applications, M can also be set to other values.
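As a non-limiting illustration of why fixing the first M cycles avoids RAM reads, the following Python sketch counts parameter reads with and without fixed parameters. The class `ClassifierParams`, the assumption of one parameter read per cycle, and the exit-cycle figures in the usage note are hypothetical and are not taken from the original disclosure.

```python
from typing import List, Sequence


class ClassifierParams:
    """Classifier parameters with the first fixed_cycles entries held locally."""

    def __init__(self, ram_params: Sequence[int], fixed_cycles: int) -> None:
        self.ram = ram_params                          # all cycles, in RAM
        self.fixed = tuple(ram_params[:fixed_cycles])  # copy fixed in the engine
        self.ram_reads = 0                             # RAM accesses so far

    def get(self, cycle: int) -> int:
        """Return the parameter for a cycle, reading RAM only when needed."""
        if cycle < len(self.fixed):
            return self.fixed[cycle]   # served by the engine, no RAM access
        self.ram_reads += 1
        return self.ram[cycle]


def count_ram_reads(exit_cycles: List[int], fixed_cycles: int,
                    total_cycles: int = 50) -> int:
    """RAM reads needed when area i exits after exit_cycles[i] cycles."""
    params = ClassifierParams(list(range(total_cycles)), fixed_cycles)
    for cycles in exit_cycles:
        for cycle in range(cycles):
            params.get(cycle)
    return params.ram_reads
```

With hypothetical exit cycles `[2, 3, 1, 5, 40, 2, 8, 12]`, the sketch needs 73 RAM reads when nothing is fixed and 36 when M is 8, because every area that exits within the first 8 cycles never touches RAM.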

In the technical solution provided by the embodiment of the present invention, the entire picture is compressed and the whole compressed picture is buffered into the on-chip random access memory (RAM) of the Field-Programmable Gate Array (FPGA), so that each classifier can read pixels quickly and frequently. The picture is then divided into areas to be scanned, dual threads are configured for each area to be scanned, and the first thread and the second thread are controlled to process the first process and the second process in parallel, wherein the first process includes steps a to e and the second process includes steps f to k. Since the first thread and the second thread are two independent threads, the second thread can process the second process of the previous round while the first thread processes the first process, which reduces the processing time. In addition, the classifier parameters for the first few cycles of the first process are configured and fixed, which avoids unnecessary RAM reads. Therefore, the technical solution provided by the embodiment of the present invention implements face detection in FPGA hardware, improves the speed of face detection while adding very few resources, has the advantages of high efficiency and good performance, and can also run on low-end FPGAs.

Please refer to FIG. 3. FIG. 3 is a schematic flowchart of another method for detecting a face according to an embodiment of the present invention. The method may be implemented in the network architecture shown in FIG. 2. As shown in FIG. 2, the embodiment of the present invention discloses a multi-engine implementation scenario, which specifically includes a control module 21, a multi-engine module 22, and a memory module 23, wherein the control module 21 manages the multi-engine module 22 and the memory module 23. The multi-engine module 22 includes eight engines, engine 0 to engine 7; of course, in actual applications, other numbers of engines may be configured. The memory module 23 includes a classifier parameter independent memory, a classifier parameter shared memory, a pixel value memory, a lookup table parameter memory, a threshold memory, and a face information memory; the different memories can be stored and multiplexed. As shown in FIG. 3, the method for detecting a face may include:

S301. Configure multiple engines, configure dual threads inside each single engine of the multiple engines, and control the first thread and the second thread to process the first process and the second process in parallel, the multiple engines scanning multiple areas to be scanned in parallel.

The first process in the above step S301 includes steps a to e, and the second process includes steps f to k. The specific explanations of steps a to e and steps f to k are the same as those given above for step S102.

The engine in the above step S301 may be a computing unit of hardware.

It can be understood that, with the multiple engines configured in the above step S301, the picture can be processed by multiple computing units at the same time. Each area that needs to be scanned is scanned by a single engine, and configuring multiple engines allows multiple areas to be scanned in parallel. For example, if 8 engines are implemented, 8 areas can be scanned simultaneously, so that even without the optimization inside each single engine, the entire algorithm can be accelerated by a factor of 8. Of course, the number of engines is determined by the system resources and the algorithm design.
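As a non-limiting illustration of the multi-engine structure, the following Python sketch scans several areas with a fixed number of workers. Software threads stand in for the hardware engines only to show the structure: in CPython they do not reproduce the hardware speed-up. The names `NUM_ENGINES` and `scan_all` and the caller-supplied `scan_one` function are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, List, TypeVar

A = TypeVar("A")  # description of one area to be scanned
R = TypeVar("R")  # result of scanning one area

NUM_ENGINES = 8   # matches the eight-engine example; other counts are possible


def scan_all(areas: Iterable[A], scan_one: Callable[[A], R],
             num_engines: int = NUM_ENGINES) -> List[R]:
    """Scan every area using num_engines workers; results keep input order."""
    with ThreadPoolExecutor(max_workers=num_engines) as pool:
        return list(pool.map(scan_one, areas))
```

Here each worker takes one area at a time, which mirrors the statement that each area is scanned by a single engine while several engines work side by side.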

S302. Configure a control module, the control module being configured to track the working status of each of the single engines, record the status of each area to be scanned, and allocate the work of each of the single engines.

In the above step S302, the control module is configured to track the working status of each of the single engines, record the status of each area to be scanned, and allocate the work of each engine. Because different areas exit after different amounts of time when scanned, the control module tracks the working status of each engine, records the status of each area that needs to be scanned, and coordinates the work of the engines. This avoids the unreasonable situation in which some engines are overloaded while others sit idle, so that every engine can operate efficiently.
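As a non-limiting illustration of the coordination described above, the following Python sketch compares a fixed assignment of areas to engines with a control module that hands the next pending area to whichever engine becomes free first. The function names, the round-robin baseline, and the cost figures in the usage note are assumptions made for this sketch; the original disclosure does not specify the scheduling policy.

```python
import heapq
from typing import List


def static_makespan(costs: List[int], num_engines: int) -> int:
    """Areas are pre-assigned round-robin; an engine that finishes early idles."""
    loads = [0] * num_engines
    for i, cost in enumerate(costs):
        loads[i % num_engines] += cost
    return max(loads)


def dynamic_makespan(costs: List[int], num_engines: int) -> int:
    """The control module gives each pending area to the first free engine."""
    free_at = [0] * num_engines       # time at which each engine becomes free
    heapq.heapify(free_at)
    for cost in costs:
        start = heapq.heappop(free_at)         # earliest available engine
        heapq.heappush(free_at, start + cost)  # busy until the area exits
    return max(free_at)
```

With two engines and hypothetical scan times `[10, 1, 10, 1, 10, 1, 10, 1]`, the fixed assignment finishes at time 40 because one engine receives every slow area, whereas the dynamic assignment finishes at time 22.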

In the technical solution provided by the embodiment of the present invention, multiple engines are configured, and dual threads are configured inside each single engine to process the first process and the second process in parallel, so that the multiple engines can process different areas of the image simultaneously. In addition, a control module is configured to track the working state of each engine, record the status of each area to be scanned, and allocate the work of each engine, ensuring that every engine operates efficiently. Therefore, the technical solution provided by the embodiment of the present invention can improve the performance of image processing and the speed of face detection.

Please refer to FIG. 4. FIG. 4 is a schematic structural diagram of an apparatus for detecting a face according to an embodiment of the present invention. An apparatus for detecting a face according to an embodiment of the present invention includes:

The input unit 401 is configured to input a picture to be scanned.

Optionally, inputting the picture to be scanned in the input unit 401 includes: first, compressing the entire picture, the compressed size being determined by the picture resolution and the available memory space; then, buffering the whole compressed picture into the on-chip random access memory (RAM) of the Field-Programmable Gate Array (FPGA), so that each classifier can read pixels quickly and frequently.

The processing unit 402 is configured to configure a dual thread for each of the areas to be scanned, and control the first thread and the second thread to process the first process and the second process in parallel.

Optionally, the first process in the processing unit 402 includes the operating steps a to e, and specifically includes:

a. calculating the read address of the classifier in the area to be scanned;

c. comparing the values of the read pixels;

e. Repeat step a, the number of repetitions is determined by the height of the classifier's tree structure.

Optionally, the second process in the processing unit 402 includes the operating steps f to k, and specifically includes:

h. comparing the score in the area to be scanned with a preset threshold;

i. If the comparison result of the step h is negative, it indicates that the area to be scanned has no face, and the next area to be scanned is scanned;

In the processing unit 402, dual threads are configured, and the first thread and the second thread are controlled to process the first process and the second process in parallel: the first thread processes the first process, and the second thread processes the second process. For example, after the image of the first area to be scanned is input, the first thread first executes the steps of the first process; when the first process ends, the comparison result is output directly to the second process of the second thread, and the first thread moves straight on to the first process of the next area to be scanned instead of waiting for the second process to end. Thus, for an image area containing a face, it is not necessary to wait for all the classifiers to complete their comparisons before the final score value is obtained. The first thread and the second thread are two independent threads, so the first process and the second process can be processed in parallel, which reduces the processing time. In general, the first process accounts for about 60% of the overall execution time and the second process for about 40%. By configuring dual threads and controlling the first thread and the second thread to process the first process and the second process in parallel, more than 30% of the time can be saved.

Optionally, the foregoing apparatus may further include:

The parameter fixing unit 403 is configured to configure and fix the parameters of the classifier for the first M cycles of the first process, where M is a positive integer greater than 0.

It can be understood that most scanned areas exit after only a few cycles. Therefore, in the first few cycles, the engine can operate without reading the parameters of the classifier, thereby avoiding unnecessary RAM reads and speeding up the face detection algorithm.

The multi-engine configuration unit 404 is configured to configure multiple engines, each single engine being internally configured with dual threads and controlling the first thread and the second thread to process the first process and the second process in parallel, so that the multiple engines can scan a plurality of areas to be scanned in parallel.

Optionally, each single engine of the multiple engines in the multi-engine configuration unit 404 is internally configured with dual threads, and the first thread and the second thread are controlled to process the first process and the second process in parallel, where the first process includes steps a to e and the second process includes steps f to k; the specific explanation is the same as for the processing unit 402 described above.

The control module configuration unit 405 is configured to configure a control module, configured to track the working status of each single engine, and record the status of each area to be scanned, and assign the work of each of the single engines.

Optionally, the control module in the control module configuration unit 405 is configured to track the working status of each engine, record the status of each area to be scanned, and allocate the work of each engine. Because different areas exit after different amounts of time when scanned, the control module tracks the working state of each engine, records the state of each area that needs to be scanned, and coordinates the work of the engines. This avoids the unreasonable situation in which some engines are overloaded while others sit idle, so that every engine can operate efficiently.

The modules or sub-modules in all embodiments of the present invention may be implemented by a general-purpose integrated circuit, such as a CPU, or by an ASIC (Application Specific Integrated Circuit).

Please refer to FIG. 5. FIG. 5 is a schematic diagram of the hardware architecture of a terminal device of the device for performing the face detection disclosed in the embodiment of the present invention.

The terminal device 1 of the present invention may include a terminal device having a face detection function, such as a computer, a smart phone, a scanner, a camera, an attendance machine, and the like.

As shown in FIG. 5, the terminal device 1 in the embodiment of the present invention includes at least one processor 2, such as a CPU, at least one memory 4, and at least one communication bus 6.

The communication bus 6 is used to implement connection communication between components such as the processor 2 and the memory 4.

The memory 4 may be a high speed RAM memory or a non-volatile memory such as at least one disk memory.

The processor 2 can execute the operating system of the terminal device 1 and various installed application programs, executable program instructions, and the like, for example, the units described above, including the input unit 401, the processing unit 402, the parameter fixing unit 403, the multi-engine configuration unit 404, the control module configuration unit 405, and the like.

The memory 4 stores executable program instructions, and the processor 2 can call the executable program instructions stored in the memory 4 via the communication bus 6 to perform the related functions. For example, the units described in FIG. 4 (for example, the input unit 401, the processing unit 402, the parameter fixing unit 403, the multi-engine configuration unit 404, and the control module configuration unit 405) are program instructions stored in the memory 4 and executed by the processor 2 to implement the functions of the respective units, thereby implementing face detection from a new field-programmable gate array (FPGA) hardware perspective.

In one embodiment of the invention, the memory 4 stores a plurality of instructions that are executed by the processor 2 to implement a method of face detection. Specifically, the processor 2 inputs a picture and divides the picture into areas to be scanned; and, for each area to be scanned, configures dual threads and controls the first thread and the second thread to process the first process and the second process in parallel;

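The first process includes steps a to e, and specifically includes:

a. calculating a read address of the classifier in the area to be scanned;

b. reading a value of a pixel in the area to be scanned according to the read address of the classifier;

c. performing an analogy on the values of the read pixels;

d. selecting the next classifier according to the analogy result of the previous step;

e. repeating from step a, where the number of repetitions is determined by the height of the tree structure of the classifier;

The second process includes steps f to k, and specifically includes:

f. reading the analogy score from the lookup table according to the analogy result of the first process;

g. accumulating the analogy scores read in the previous step to obtain a score of the area to be scanned;

h. comparing the score of the area to be scanned with a preset threshold;

i. if the comparison result of step h is negative, indicating that the area to be scanned contains no face, scanning the next area to be scanned;

j. if the comparison result of step h is positive, jumping back to step a and repeating;

k. after the repetitions are completed, if the final comparison result is positive, outputting the face information, where the face information includes coordinates, size, and score values.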
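
By way of illustration only, steps a to k may be sketched as follows for a single area to be scanned. The sketch makes several assumptions that are not stated in the embodiments: the analogy of step c is modeled as a comparison of two pixel values, the tree of each classifier is stored in heap order, each classifier carries its own threshold, and the names TreeClassifier, first_process, second_process, and scan_area are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass
class TreeClassifier:
    height: int                     # height of the tree structure
    offsets: List[Tuple[int, int]]  # assumed pixel-pair read addresses, one per internal node
    lut: List[float]                # lookup table: one analogy score per leaf
    threshold: float                # assumed preset threshold for the accumulated score


def first_process(area: List[int], clf: TreeClassifier) -> int:
    """Steps a to e: walk the tree and return the leaf index (analogy result)."""
    node = 0
    for _ in range(clf.height):                # e. one repetition per tree level
        addr_a, addr_b = clf.offsets[node]     # a. read addresses of this node
        pix_a, pix_b = area[addr_a], area[addr_b]  # b. read the pixel values
        bit = 1 if pix_a > pix_b else 0        # c. analogy (assumed comparison)
        node = 2 * node + 1 + bit              # d. select the next node
    return node - (2 ** clf.height - 1)        # convert heap index to leaf index


def second_process(leaf: int, clf: TreeClassifier, score: float) -> Tuple[float, bool]:
    """Steps f to h: look up the score, accumulate it, compare with the threshold."""
    score += clf.lut[leaf]                     # f, g
    return score, score >= clf.threshold       # h


def scan_area(area: List[int], x: int, y: int, size: int,
              cascade: List[TreeClassifier]) -> Optional[Dict[str, float]]:
    score = 0.0
    for clf in cascade:                        # j. positive result: back to step a
        leaf = first_process(area, clf)
        score, passed = second_process(leaf, clf, score)
        if not passed:
            return None                        # i. no face, move to the next area
    return {"x": x, "y": y, "size": size, "score": score}  # k. face information
```

In this sequential form the second process of one repetition finishes before the next first process starts; in the dual-thread arrangement of the embodiments the two are overlapped instead.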

It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions, but those skilled in the art should understand that the present invention is not limited by the described order of actions, because some steps may be performed in other orders or concurrently in accordance with the present application. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.

In the above embodiments, the description of each embodiment has its own emphasis; for parts that are not described in detail in one embodiment, reference may be made to the related descriptions of the other embodiments.

The steps in the method of the embodiment of the present invention may be reordered, merged, and deleted according to actual needs.

The units in the user terminal in the embodiment of the present invention may be combined, divided, and deleted according to actual needs.

Those of ordinary skill in the art can understand that all or part of the processes of the foregoing embodiments can be implemented by a computer program instructing related hardware. The program can be stored in a computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. The storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).

The method and device for face detection disclosed in the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and embodiments of the present invention, and the description of the above embodiments is only intended to help understand the method of the present invention and its core idea. At the same time, for a person of ordinary skill in the art, there will be changes in the specific embodiments and the scope of application according to the idea of the present invention. In summary, the content of this specification should not be construed as limiting the present invention.

Claims

A method for face detection, comprising:

inputting a picture, and dividing the picture into areas to be scanned;

configuring dual threads for each of the areas to be scanned, and controlling a first thread and a second thread to process a first process and a second process in parallel;

The first process includes steps a to e, and specifically includes:

a. calculating a read address of the classifier in the area to be scanned;

b. reading a value of a pixel in the area to be scanned according to the read address of the classifier;

c. performing an analogy on the values of the read pixels;

d. selecting the next classifier according to the analogy result of the previous step;

e. repeating from step a, where the number of repetitions is determined by the height of the tree structure of the classifier;

The second process includes the operation steps f to k, and specifically includes:

f. reading the analogy score from the lookup table according to the analogy result of the first process;

g. accumulating the analogy scores read in the previous step to obtain a score of the area to be scanned;

h. comparing the score in the area to be scanned with a preset threshold;

i. If the comparison result of the step h is negative, indicating that the area to be scanned has no face, scanning the next area to be scanned;

j. if the comparison result of step h is positive, jumping back to step a and repeating;

k. after the repetitions are completed, if the final comparison result is positive, outputting the face information, wherein the face information includes coordinates, size, and score values.
The method according to claim 1, wherein the controlling the first thread and the second thread to process the first process and the second process in parallel comprises: while the first thread processes the first process, the second thread simultaneously processes the second process of the previous round.
The method according to claim 1, wherein the inputting a picture comprises:

compressing the entire picture;

buffering the whole compressed picture into the on-chip random access memory (RAM) of the field-programmable gate array (FPGA) hardware.
The method according to claim 1, wherein the method further comprises: before the controlling the first thread and the second thread to process the first process and the second process in parallel, configuring and fixing the parameters of the classifier for the first M cycles of the first process, wherein M is a positive integer greater than zero.
The method according to claim 1, further comprising:

configuring multiple engines, wherein each single engine of the multiple engines is internally configured with dual threads and controls the first thread and the second thread to process the first process and the second process in parallel, and the multiple engines scan a plurality of areas to be scanned in parallel;

configuring a control module, wherein the control module is configured to track the working state of each of the single engines, record the state of each of the areas to be scanned, and allocate the work of each of the single engines.
A device for detecting a face, comprising:

an input unit, configured to input a picture and divide the picture into areas to be scanned;

a processing unit, configured to configure a dual thread for each of the areas to be scanned, and control the first thread and the second thread to process the first process and the second process in parallel;

The first process includes steps a to e, and specifically includes:

a. calculating a read address of the classifier in the area to be scanned;

b. reading a value of a pixel in the area to be scanned according to the read address of the classifier;

c. performing an analogy on the values of the read pixels;

d. selecting the next classifier according to the analogy result of the previous step;

e. repeating from step a, where the number of repetitions is determined by the height of the tree structure of the classifier;

The second process includes the operation steps f to k, and specifically includes:

f. reading the analogy score from the lookup table according to the analogy result of the first process;

g. accumulating the analogy scores read in the previous step to obtain a score of the area to be scanned;

h. comparing the score in the area to be scanned with a preset threshold;

i. If the comparison result of the step h is negative, indicating that the area to be scanned has no face, scanning the next area to be scanned;

j. if the comparison result of step h is positive, jumping back to step a and repeating;

k. after the repetitions are completed, if the final comparison result is positive, outputting the face information, wherein the face information includes coordinates, size, and score values.
The device according to claim 6, wherein the controlling the first thread and the second thread to process the first process and the second process in parallel comprises: while the first thread processes the first process, the second thread simultaneously processes the second process of the previous round.
The device according to claim 6, wherein the inputting a picture comprises:

compressing the entire picture;

buffering the whole compressed picture into the on-chip random access memory (RAM) of the field-programmable gate array (FPGA) hardware.
The device according to claim 6, wherein the device further comprises:

a parameter fixing unit configured to configure and fix parameters of the classifier of the first M cycles of the first process, wherein the M is a positive integer greater than zero.
The device according to claim 6, wherein the device further comprises:

a multi-engine configuration unit, configured to configure multiple engines, wherein each single engine is internally configured with dual threads and controls the first thread and the second thread to process the first process and the second process in parallel, and the multiple engines are configured to scan a plurality of areas to be scanned in parallel;

a control module configuration unit, configured to configure a control module, wherein the control module is configured to track the working state of each of the single engines, record the state of each of the areas to be scanned, and allocate the work of each of the single engines.
A terminal device comprising a memory and a processor, the memory storing executable program instructions, wherein the processor calls the executable program instructions stored in the memory to perform the method of face detection according to any one of claims 1 to 5.