WO2018068533A1 - Human face detection method and device - Google Patents

Human face detection method and device

Info

Publication number
WO2018068533A1
WO2018068533A1 PCT/CN2017/090296 CN2017090296W WO2018068533A1 WO 2018068533 A1 WO2018068533 A1 WO 2018068533A1 CN 2017090296 W CN2017090296 W CN 2017090296W WO 2018068533 A1 WO2018068533 A1 WO 2018068533A1
Authority
WO
WIPO (PCT)
Prior art keywords
scanned
thread
area
classifier
engine
Prior art date
Application number
PCT/CN2017/090296
Other languages
French (fr)
Chinese (zh)
Inventor
韦国恒
蒋文
杨龙
Original Assignee
深圳云天励飞技术有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 深圳云天励飞技术有限公司

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F9/00 Arrangements for program control, e.g. control units
    • G06F9/06 Arrangements for program control, e.g. control units using stored programs, i.e. using an internal store of processing equipment to receive or retain programs
    • G06F9/30 Arrangements for executing machine instructions, e.g. instruction decode
    • G06F9/38 Concurrent instruction execution, e.g. pipeline, look ahead
    • G06F9/3885 Concurrent instruction execution, e.g. pipeline, look ahead using a plurality of independent parallel functional units
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/94 Hardware or software architectures specially adapted for image or video understanding
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/161 Detection; Localisation; Normalisation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V40/00 Recognition of biometric, human-related or animal-related patterns in image or video data
    • G06V40/10 Human or animal bodies, e.g. vehicle occupants or pedestrians; Body parts, e.g. hands
    • G06V40/16 Human faces, e.g. facial parts, sketches or expressions
    • G06V40/172 Classification, e.g. identification

Definitions

  • The present invention relates to the field of face detection, and in particular to a method and apparatus for face detection.
  • Face detection technology is widely used in new-generation human-machine interfaces, secure access, video surveillance, and content-based retrieval.
  • Existing face detection implementations include software-only implementations and Field-Programmable Gate Array (FPGA) acceleration schemes.
  • Software implementations are slow: a 640*480 image usually takes about one to two seconds.
  • Existing FPGA acceleration schemes generally run at no more than 10 fps.
  • A real-time FPGA face capture system requires a very high-end FPGA chip.
  • A common FPGA-accelerated face detection flow first scans an image area and then compares it using the classifiers. If all comparisons pass, the face information is output; otherwise, scanning starts again.
  • In existing cascaded face recognition algorithms, however, each classifier's comparison must wait for the previous step to complete. For an image area that contains a face, the final score value (which indicates how close the face is to a real face) is available only after all classifiers have completed their comparisons.
  • In addition, the classifiers read the image area in no fixed pattern, so parallel processing is difficult. Because a large number of image areas must be scanned, processing one picture is slow, which makes face detection slow and inefficient.
  • The embodiments of the invention disclose a method and a device for face detection that implement face detection from a new field-programmable gate array (FPGA) perspective and can solve the technical problem that existing face detection is slow and inefficient.
  • A first aspect of the embodiments of the present invention discloses a method for detecting a face, including:
  • the first process includes operation steps a to e, and specifically includes:
  • repeating step a, where the number of repetitions is determined by the height of the tree structure of the classifier;
  • the second process includes operation steps f to k, and specifically includes:
  • if the comparison result of step h is negative, the area to be scanned contains no face, and the next area to be scanned is scanned;
  • if the comparison result of step h is positive, jumping back to repeat step a;
  • the face information is output, wherein the face information includes coordinates, size, and score value.
  • Controlling the first thread and the second thread to process the first process and the second process in parallel includes: while the first thread processes the first process, the second thread processes the previous second process.
  • Inputting the picture comprises:
  • the entire picture is compressed; the compressed picture is buffered in full into the on-chip random access memory (RAM) of the field-programmable gate array (FPGA).
  • Before controlling the first thread and the second thread to process the first process and the second process in parallel, the method further includes: configuring and fixing the classifier parameters of the first M cycles of the first process, where M is a positive integer greater than zero.
  • The method further includes: configuring multiple engines; configuring dual threads inside each single engine of the multiple engines and controlling the first thread and the second thread to process the first process and the second process in parallel, with the multiple engines scanning multiple areas to be scanned in parallel; and configuring a control module that tracks the working state of each single engine, records the state of each area to be scanned, and assigns work to each single engine.
  • A second aspect of the embodiments of the present invention discloses a device for detecting a face, comprising: an input unit, configured to input a picture and divide the picture into areas to be scanned; and a processing unit, configured to configure dual threads for each area to be scanned and to control the first thread and the second thread to process the first process and the second process in parallel.
  • the first process includes operation steps a to e, and specifically includes:
  • repeating step a, where the number of repetitions is determined by the height of the tree structure of the classifier;
  • the second process includes operation steps f to k, and specifically includes:
  • if the comparison result of step h is negative, the area to be scanned contains no face, and the next area to be scanned is scanned;
  • if the comparison result of step h is positive, jumping back to repeat step a;
  • the face information is output, wherein the face information includes coordinates, size, and score value.
  • Controlling the first thread and the second thread to process the first process and the second process in parallel includes: while the first thread processes the first process, the second thread processes the previous second process.
  • Inputting the picture to be scanned includes:
  • the entire picture to be scanned is compressed; the compressed picture is buffered in full into the on-chip random access memory (RAM) of the field-programmable gate array (FPGA).
  • the apparatus further includes:
  • a parameter fixing unit, configured to configure and fix the classifier parameters of the first M cycles of the first process, where M is a positive integer greater than zero.
  • the apparatus further includes:
  • a multi-engine configuration unit, configured to configure multiple engines, each single engine being internally configured with dual threads, with the first thread and the second thread controlled to process the first process and the second process in parallel;
  • the multiple engines are configured to scan a plurality of areas to be scanned in parallel;
  • a control module configuration unit, configured to configure a control module, the control module being configured to track the working status of each single engine, record the state of each area to be scanned, and assign work to each single engine.
  • FIG. 1 is a schematic flow chart of a method for detecting a face according to an embodiment of the present invention.
  • FIG. 2 is a schematic diagram of an implementation scenario of a multi-engine disclosed in an embodiment of the present invention.
  • FIG. 3 is a schematic flowchart diagram of another method for detecting a face according to an embodiment of the present invention.
  • FIG. 4 is a schematic structural diagram of an apparatus for detecting a face according to an embodiment of the present invention.
  • FIG. 5 is a schematic diagram of a hardware architecture of a terminal device of an apparatus for performing the face detection according to an embodiment of the present invention.
  • A computer device, also referred to as a "computer" in this context, is an intelligent electronic device that performs predetermined processing, such as numerical and/or logical calculations, by running a predetermined program or instructions. It may include a processor and a memory, with the processor executing instructions pre-stored in the memory to carry out the predetermined processing; alternatively, the processing may be carried out by hardware such as an ASIC, an FPGA, or a DSP, or by a combination of the two.
  • Computer devices include, but are not limited to, servers, personal computers, notebook computers, tablets, smart phones, and the like.
  • The embodiments of the invention disclose a method and a device for face detection that implement face detection from a new Field-Programmable Gate Array (FPGA) perspective and improve the speed of face detection.
  • FIG. 1 is a schematic flowchart diagram of a method for detecting a face according to an embodiment of the present invention. As shown in FIG. 1, the method for detecting a face may include the following steps:
  • S101 Input a picture, and divide the picture into areas to be scanned.
  • Inputting the picture in step S101 includes: first, compressing the entire picture, where the compressed size is determined by the picture resolution and the memory space; then, buffering the entire compressed picture into the on-chip random access memory (RAM) of the Field-Programmable Gate Array (FPGA), so that each classifier can read pixels quickly and frequently.
  • the first process in the foregoing step S102 includes the operation steps a to e, and specifically includes:
  • repeating step a, where the number of repetitions is determined by the height of the tree structure of the classifier;
  • the second process in the foregoing step S102 includes the operation steps f to k, and specifically includes:
  • the comparison score is read from the lookup table;
  • if the comparison result of step h is negative, the area to be scanned contains no face, and the next area to be scanned is scanned;
  • if the comparison result of step h is positive, jumping back to repeat step a;
  • the face information is output, wherein the face information includes coordinates, size, and score value.
  • Dual threads are configured, and the first thread and the second thread are controlled to process the first process and the second process in parallel: the first thread processes the first process, and the second thread processes the second process.
  • The first thread executes the operation steps of the first process. After the first process ends, the comparison result is output directly to the second process of the second thread, and the first thread moves straight on to the first process of the next area to be scanned instead of waiting for the second process to end.
  • The first thread and the second thread are two independent threads, so the first process and the second process can be processed in parallel, which improves timeliness.
  • The first process accounts for about 60% of the overall execution time, and the second process for about 40%.
  • The classifier acts as the scan window of the engine (i.e., the calculation unit); the algorithm computes which pixel values are taken for comparison.
  • Step a is repeated because a positive comparison result in step h indicates that the current cycle has detected a face.
  • Step S102 is a multi-cycle operation: multiple repetitions are required before the face information is finally confirmed and output. That is, in step k, after all repetitions are completed, the face information is output if the final comparison result is positive. This ensures the accuracy of face detection; the number of jumps back to step a is determined by the algorithm.
  • The score value in step k of step S102 indicates how close the detected face is to a real face: the higher the score value, the closer the face in the scanned area is to a real face.
  • Before controlling the first thread and the second thread to process the first process and the second process in parallel, step S102 further includes: configuring and fixing the classifier parameters of the first M cycles of the first process, where M is a positive integer greater than zero. Because most scanned areas exit after only a few cycles, the classifier parameters for the first few cycles of the first process are fixed ("solidified") in each engine. In those first few cycles, the engine can operate without reading classifier parameters, which avoids unnecessary RAM reads and speeds up the face detection algorithm.
  • M is a positive integer greater than 0. Optionally, in a specific embodiment of the present invention, M is preferably set to 8, that is, the classifier parameters of the first 8 cycles of the first process are configured and fixed; M can also be set to other values.
  • The technical solution provided by the embodiment of the present invention compresses the entire picture and buffers all of the compressed data into the Field-Programmable Gate Array (FPGA) on-chip random access memory (RAM), so that each classifier can read pixels quickly and frequently. It then divides the picture into areas to be scanned, configures dual threads for each area to be scanned, and controls the first thread and the second thread to process the first process and the second process in parallel, where the first process includes steps a to e and the second process includes steps f to k. Because the first thread and the second thread are two independent threads, the second thread processes the previous second process while the first thread processes the first process, which improves timeliness; in addition, the classifier parameters of the first M cycles of the first process are configured and fixed.
  • The technical solution provided by the embodiment of the present invention implements face detection in FPGA hardware and improves the speed of face detection while adding very few resources. It is efficient, performs well, and can also run on low-end FPGAs.
  • FIG. 3 is a schematic flowchart diagram of another method for detecting a face according to an embodiment of the present invention.
  • the method may be implemented in a network architecture as shown in FIG. 2.
  • The embodiment of the present invention discloses a multi-engine implementation scenario, which specifically includes a control module 21, a multi-engine module 22, and a memory module 23, wherein the control module 21 manages the multi-engine module 22 and the memory module 23.
  • The multi-engine module 22 includes eight engines, engine 0 to engine 7; in actual applications, other numbers of engines may be configured. The memory module 23 includes a classifier-parameter independent memory, a classifier-parameter shared memory, a pixel value memory, a lookup table parameter memory, a threshold memory, and a face information memory; the different memories can be stored and multiplexed.
  • The embodiment of the present invention discloses a flow chart of another method for detecting a face, and the method for detecting a face may include:
  • the first process in the above step S301 includes steps a to e, and the second process includes steps f to k.
  • Steps a to e and steps f to k are the same as in the method described above.
  • The engine in the above step S301 may be a hardware computing unit.
  • With the multiple engines configured in step S301, the picture can be processed by multiple computing units at the same time. Each area that needs to be scanned is scanned by a single engine, and the multiple engines scan multiple areas in parallel. For example, with 8 engines, 8 areas can be scanned simultaneously, so the whole algorithm can be accelerated by up to 8 times even before counting the optimizations inside each single engine. The number of engines is determined by system resources and algorithm design.
  • A control module is configured to track the working status of each single engine, record the status of each area to be scanned, and allocate work to each single engine.
  • Because different areas exit after different amounts of time, the control module tracks the working status of each engine, records the status of each area that needs to be scanned, and coordinates the work of the engines. This avoids the unreasonable situation in which some engines are overloaded while others sit idle, so that every engine operates efficiently.
  • The technical solution provided by the embodiment of the present invention configures multiple engines and, inside each single engine, dual threads that process the first process and the second process in parallel, so that the engines can process different areas of the image simultaneously. A control module further tracks the working state of each engine, records the status of each area to be scanned, and assigns work to each engine so that every engine operates efficiently. The technical solution therefore improves image processing performance and the speed of face detection.
  • FIG. 4 is a schematic structural diagram of an apparatus for detecting a face according to an embodiment of the present invention.
  • An apparatus for detecting a face according to an embodiment of the present invention includes:
  • the input unit 401 is configured to input a picture to be scanned.
  • Inputting the picture to be scanned in the input unit 401 includes: first, compressing the entire picture, where the compressed size is determined by the picture resolution and the memory space; then, buffering the entire compressed picture into the on-chip random access memory (RAM) of the Field-Programmable Gate Array (FPGA).
  • the processing unit 402 is configured to configure a dual thread for each of the areas to be scanned, and control the first thread and the second thread to process the first process and the second process in parallel.
  • the first process in the processing unit 402 includes the operating steps a to e, and specifically includes:
  • repeating step a, where the number of repetitions is determined by the height of the classifier's tree structure.
  • the second process in the processing unit 402 includes the operating steps f to k, and specifically includes:
  • the comparison score is read from the lookup table;
  • if the comparison result of step h is negative, the area to be scanned contains no face, and the next area to be scanned is scanned;
  • if the comparison result of step h is positive, jumping back to repeat step a;
  • the face information is output, wherein the face information includes coordinates, size, and score value.
  • The first thread executes the operation steps of the first process. After the first process ends, the comparison result is output directly to the second process of the second thread, and the first thread moves straight on to the first process of the next area to be scanned instead of waiting for the second process to end.
  • The first thread and the second thread are two independent threads, so the first process and the second process can be processed in parallel, which improves timeliness.
  • The first process accounts for about 60% of the overall execution time, and the second process for about 40%.
  • the foregoing apparatus may further include:
  • The parameter fixing unit 403 is configured to configure and fix the classifier parameters of the first M cycles of the first process, where M is a positive integer greater than 0.
  • During these first cycles, the engine can operate without reading classifier parameters, which avoids unnecessary RAM reads and speeds up the face detection algorithm.
  • A multi-engine configuration unit 404 is configured to configure multiple engines; each single engine is internally configured with dual threads, and the first thread and the second thread are controlled to process the first process and the second process in parallel, so that the multiple engines can scan a plurality of areas to be scanned in parallel.
  • Each single engine configured by the multi-engine configuration unit 404 has dual threads internally, with the first thread and the second thread controlled to process the first process and the second process in parallel, where the first process includes steps a to e and the second process includes steps f to k; the details are the same as for the processing unit 402 described above.
  • The control module configuration unit 405 is configured to configure a control module that tracks the working status of each single engine, records the status of each area to be scanned, and assigns work to each single engine.
  • Because different areas exit after different amounts of time, the control module configured by the control module configuration unit 405 tracks the working state of each engine, records the state of each area that needs to be scanned, and coordinates the work of the engines. This avoids the unreasonable situation in which some engines are overloaded while others sit idle, so that every engine operates efficiently.
  • modules or sub-modules in all embodiments of the present invention may be implemented by a general-purpose integrated circuit, such as a CPU, or by an ASIC (Application Specific Integrated Circuit).
  • FIG. 5 is a schematic diagram of the hardware architecture of a terminal device that performs the face detection disclosed in the embodiment of the present invention.
  • The terminal device 1 of the present invention may be any terminal device having a face detection function, such as a computer, a smart phone, a scanner, a camera, an attendance machine, and the like.
  • The terminal device 1 in the embodiment of the present invention includes at least one processor 2, such as a CPU, at least one memory 4, and at least one communication bus 6.
  • The communication bus 6 is used to implement connection and communication between components such as the processor 2 and the memory 4.
  • The memory 4 may be a high-speed RAM memory or a non-volatile memory such as at least one disk memory.
  • The processor 2 can execute the operating system of the terminal device 1 and various installed application programs, executable program instructions, and the like.
  • The units described above include the input unit 401, the processing unit 402, and the parameter fixing unit 403.
  • The memory 4 stores executable program instructions, and the processor 2 can call the executable program instructions stored in the memory 4 via the communication bus 6 to perform the related functions.
  • The units described in FIG. 4 (for example, the input unit 401, the processing unit 402, the parameter fixing unit 403, the multi-engine configuration unit 404, and the control module configuration unit 405) are executable program instructions that are executed by the processor 2 to implement the functions of the units, thereby implementing face detection from a new field-programmable gate array (FPGA) perspective.
  • The memory 4 stores a plurality of instructions that are executed by the processor 2 to implement a method of face detection. Specifically, the processor 2 inputs a picture and divides the picture into areas to be scanned; for each area to be scanned, it configures dual threads and controls the first thread and the second thread to process the first process and the second process in parallel;
  • the first process includes operation steps a to e, and specifically includes:
  • repeating step a, where the number of repetitions is determined by the height of the tree structure of the classifier;
  • the second process includes operation steps f to k, and specifically includes:
  • if the comparison result of step h is negative, the area to be scanned contains no face, and the next area to be scanned is scanned;
  • if the comparison result of step h is positive, jumping back to repeat step a;
  • the face information is output, wherein the face information includes coordinates, size, and score value.
  • Controlling the first thread and the second thread to process the first process and the second process in parallel includes: while the first thread processes the first process, the second thread processes the previous second process.
  • Inputting the picture comprises:
  • the entire picture is compressed; the compressed picture is buffered in full into the on-chip random access memory (RAM) of the field-programmable gate array (FPGA).
  • Before controlling the first thread and the second thread to process the first process and the second process in parallel, the method further includes: configuring and fixing the classifier parameters of the first M cycles of the first process, where M is a positive integer greater than zero.
  • The method further includes: configuring multiple engines; configuring dual threads inside each single engine of the multiple engines and controlling the first thread and the second thread to process the first process and the second process in parallel, with the multiple engines scanning multiple areas to be scanned in parallel; and configuring a control module that tracks the working state of each single engine, records the state of each area to be scanned, and assigns work to each single engine.
  • the units in the user terminal in the embodiment of the present invention may be combined, divided, and deleted according to actual needs.
  • the storage medium may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
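
The hand-off between the two threads described above can be illustrated with a small software sketch. This is not the FPGA implementation from the patent: `run_pipeline`, the single-slot queue, and the callables `first_process` and `second_process` are hypothetical names chosen for illustration, and Python threads only model the hand-off, not true hardware parallelism. The point is that the first thread starts the next region as soon as it has passed its result on, without waiting for the second process to end.

```python
import queue
import threading


def run_pipeline(regions, first_process, second_process):
    """Overlap first_process(region i+1) with second_process(region i)."""
    handoff = queue.Queue(maxsize=1)  # single-slot buffer between the threads
    results = []

    def first_thread():
        for region in regions:
            # Steps a to e for this region, then hand the result over at once.
            handoff.put((region, first_process(region)))
        handoff.put(None)  # sentinel: no more regions

    def second_thread():
        while True:
            item = handoff.get()
            if item is None:
                break
            region, comparison = item
            # Steps f onward: score lookup, accumulation, threshold test.
            results.append((region, second_process(comparison)))

    threads = [threading.Thread(target=first_thread),
               threading.Thread(target=second_thread)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Under the roughly 60%/40% time split stated above, an ideally overlapped pipeline is limited by the slower stage, so the per-region time would approach 0.6 of the sequential time (about 1.67 times the throughput); real gains depend on the hand-off cost.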

Abstract

A human face detection method and device. The method comprises: inputting an image and dividing the image into regions to be scanned (101); configuring dual threads for each of the regions to be scanned, and controlling first and second threads to process a first flow and a second flow in parallel (102), that is, the second thread processes the previous second flow while the first thread processes the first flow; configuring and fixing the parameters of a classifier for the first M cycles of the first flow; configuring multiple engines, which can scan a plurality of regions to be scanned in parallel; and configuring a control module for tracking the working state of each single engine, recording the state of each region to be scanned, and coordinating the work of each single engine. With this method, human face detection can be realized from a new FPGA hardware perspective, time is saved, and the speed and performance of human face detection are improved.
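
The coordinating role of the control module can be sketched as a small scheduling simulation. The patent does not specify a scheduling policy; the greedy rule below (give the next pending region to whichever engine becomes idle first), the function name `dispatch`, and the per-region `cost` values are all assumptions made for illustration.

```python
import heapq


def dispatch(region_costs, num_engines=8):
    """Assign each region to the engine that becomes idle first.

    region_costs[i] is a hypothetical number of cycles region i needs
    before it exits. Returns (assignment, makespan).
    """
    # Heap of (time_engine_becomes_idle, engine_id); all engines start idle.
    idle_at = [(0, engine) for engine in range(num_engines)]
    heapq.heapify(idle_at)
    assignment = {}  # region index -> engine id
    for region, cost in enumerate(region_costs):
        t, engine = heapq.heappop(idle_at)
        assignment[region] = engine
        heapq.heappush(idle_at, (t + cost, engine))
    makespan = max(t for t, _ in idle_at)
    return assignment, makespan
```

For example, with two engines and costs `[5, 1, 1, 1]`, the long region occupies engine 0 while engine 1 handles the three short ones, so no engine sits idle while work is pending.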

Description

Method and device for face detection
This application claims priority to Chinese Patent Application No. 201610884062.2, entitled "Method and Apparatus for Face Detection", filed with the Chinese Patent Office on October 10, 2016, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of face detection, and in particular to a method and apparatus for face detection.
Background
Face detection technology is currently widely used in new-generation human-machine interfaces, secure access, video surveillance, and content-based retrieval. Existing face detection implementations include software-only implementations and Field-Programmable Gate Array (FPGA) acceleration schemes. Software implementations are slow: a 640*480 image usually takes about one to two seconds. Existing FPGA acceleration schemes generally run at no more than 10 fps, and a real-time FPGA face capture system requires a very high-end FPGA chip. A common FPGA-accelerated face detection flow first scans an image area and then compares it using the classifiers; if all comparisons pass, the face information is output, and otherwise scanning starts again. In existing cascaded face recognition algorithms, however, each classifier's comparison must wait for the previous step to complete, and for an image area that contains a face, the final score value (which indicates how close the face is to a real face) is available only after all classifiers have completed their comparisons. In addition, the classifiers read the image area in no fixed pattern, so parallel processing is difficult. Because a large number of image areas must be scanned, processing one picture is slow, which makes face detection slow and inefficient.
Summary of the Invention
Embodiments of the present invention disclose a method and an apparatus for face detection that implement face detection with a new field-programmable gate array (FPGA) approach. They address the technical problem that existing face detection is slow and inefficient, and offer high speed and high performance.
A first aspect of the embodiments of the present invention discloses a method for face detection, including:
inputting a picture and dividing the picture into regions to be scanned; and, for each region to be scanned, configuring dual threads and controlling a first thread and a second thread to process a first process and a second process in parallel;
wherein the first process includes operation steps a to e, specifically:
a. calculating the read address of a classifier in the region to be scanned;
b. reading pixel values in the region to be scanned according to the read address of the classifier;
c. comparing the pixel values that were read;
d. selecting the next classifier according to the comparison result of the previous step;
e. returning to step a and repeating, the number of repetitions being determined by the height of the tree structure of the classifier;
wherein the second process includes operation steps f to k, specifically:
f. reading the score of the comparison from a lookup table according to the comparison result of the first process;
g. accumulating the comparison scores read in the previous step to obtain a score for the region to be scanned;
h. comparing the score of the region to be scanned with a preset threshold;
i. if the comparison result of step h is negative, which indicates that the region to be scanned contains no face, scanning the next region to be scanned;
j. if the comparison result of step h is positive, jumping back to step a and repeating;
k. after the repetitions are completed, if the final comparison result is positive, outputting face information, wherein the face information includes coordinates, size, and a score.
In an optional implementation of the method provided by the first aspect, controlling the first thread and the second thread to process the first process and the second process in parallel includes: while the first thread processes the first process, the second thread simultaneously processes the second process of the previous iteration.
In an optional implementation of the method provided by the first aspect, inputting the picture includes:
compressing the entire picture; and caching the entire compressed picture in the on-chip random access memory (RAM) of a field-programmable gate array (FPGA).
In an optional implementation of the method provided by the first aspect, before controlling the first thread and the second thread to process the first process and the second process in parallel, the method further includes: configuring and fixing the classifier parameters for the first M loops of the first process, where M is a positive integer greater than 0.
In an optional implementation, the method provided by the first aspect further includes: configuring multiple engines, configuring dual threads inside each single engine of the multiple engines, and controlling the first thread and the second thread to process the first process and the second process in parallel, the multiple engines scanning multiple regions to be scanned in parallel; and configuring a control module, the control module being used to track the working state of each single engine, record the state of each region to be scanned, and allocate work to each single engine.
A second aspect of the embodiments of the present invention discloses an apparatus for face detection, including: an input unit, configured to input a picture and divide the picture into regions to be scanned; and a processing unit, configured to, for each region to be scanned, configure dual threads and control a first thread and a second thread to process a first process and a second process in parallel.
wherein the first process includes operation steps a to e, specifically:
a. calculating the read address of a classifier in the region to be scanned;
b. reading pixel values in the region to be scanned according to the read address of the classifier;
c. comparing the pixel values that were read;
d. selecting the next classifier according to the comparison result of the previous step;
e. returning to step a and repeating, the number of repetitions being determined by the height of the tree structure of the classifier;
wherein the second process includes operation steps f to k, specifically:
f. reading the score of the comparison from a lookup table according to the comparison result of the first process;
g. accumulating the comparison scores read in the previous step to obtain a score for the region to be scanned;
h. comparing the score of the region to be scanned with a preset threshold;
i. if the comparison result of step h is negative, which indicates that the region to be scanned contains no face, scanning the next region to be scanned;
j. if the comparison result of step h is positive, jumping back to step a and repeating;
k. after the repetitions are completed, if the final comparison result is positive, outputting face information, wherein the face information includes coordinates, size, and a score.
In an optional implementation of the apparatus provided by the second aspect, controlling the first thread and the second thread to process the first process and the second process in parallel includes: while the first thread processes the first process, the second thread simultaneously processes the second process of the previous iteration.
In an optional implementation of the apparatus provided by the second aspect, inputting the picture to be scanned includes:
compressing the entire picture to be scanned; and caching the entire compressed picture in the on-chip random access memory (RAM) of a field-programmable gate array (FPGA).
In an optional implementation of the apparatus provided by the second aspect, the apparatus further includes:
a parameter fixing unit, configured to configure and fix the classifier parameters for the first M loops of the first process, where M is a positive integer greater than 0.
In an optional implementation of the apparatus provided by the second aspect, the apparatus further includes:
a multi-engine configuration unit, configured to configure multiple engines, wherein dual threads are configured inside each single engine of the multiple engines, the first thread and the second thread are controlled to process the first process and the second process in parallel, and the multiple engines are used to scan multiple regions to be scanned in parallel; and a control module configuration unit, configured to configure a control module, the control module being used to track the working state of each single engine, record the state of each region to be scanned, and allocate work to each single engine.
In the embodiments of the present invention, the entire picture to be scanned is compressed, and the entire compressed picture is cached in the on-chip random access memory (RAM) of a field-programmable gate array (FPGA). For each region to be scanned, dual threads are configured, and the first thread and the second thread are controlled to process the first process and the second process in parallel; that is, while the first thread processes the first process, the second thread simultaneously processes the second process of the previous iteration. In addition, the classifier parameters for the first M loops of the first process are configured and fixed, where M is a positive integer greater than 0. Multiple engines are configured to scan multiple regions to be scanned in parallel, and a control module is configured to track the working state of each engine, record the state of each region to be scanned, and allocate work to each engine. The technical solutions provided by the embodiments of the present invention can therefore reduce the time consumed by face detection and improve its speed and performance.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for the embodiments are briefly introduced below. The drawings described below evidently show only some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a method for face detection according to an embodiment of the present invention.
FIG. 2 is a schematic diagram of a multi-engine implementation scenario according to an embodiment of the present invention.
FIG. 3 is a schematic flowchart of another method for face detection according to an embodiment of the present invention.
FIG. 4 is a schematic structural diagram of an apparatus for face detection according to an embodiment of the present invention.
FIG. 5 is a schematic diagram of the hardware architecture of a terminal device of the apparatus that performs the face detection, according to an embodiment of the present invention.
Detailed Description
Before the exemplary embodiments are discussed in more detail, it should be noted that some exemplary embodiments are described as processes or methods depicted as flowcharts. Although a flowchart describes the operations as sequential processing, many of the operations can be performed in parallel, concurrently, or simultaneously. In addition, the order of the operations can be rearranged. The processing may be terminated when its operations are completed, but it may also have additional steps not included in the drawings. The processing may correspond to a method, a function, a procedure, a subroutine, a subprogram, and so on.
The term "computer device" (also called "computer") as used in this context refers to an intelligent electronic device that can carry out predetermined processing, such as numerical and/or logical computation, by running predetermined programs or instructions. It may include a processor and a memory, with the processor executing instructions pre-stored in the memory to carry out the predetermined processing; or the predetermined processing may be carried out by hardware such as an ASIC, FPGA, or DSP; or by a combination of the two. Computer devices include, but are not limited to, servers, personal computers, notebook computers, tablet computers, and smartphones.
The methods discussed below (some of which are illustrated by flowcharts) may be implemented in hardware, software, firmware, middleware, microcode, a hardware description language, or any combination thereof. When implemented in software, firmware, middleware, or microcode, the program code or code segments that perform the necessary tasks may be stored in a machine-readable or computer-readable medium (such as a storage medium). One or more processors may perform the necessary tasks.
The specific structural and functional details disclosed here are merely representative and serve the purpose of describing exemplary embodiments of the present invention. The present invention may, however, be embodied in many alternative forms and should not be construed as limited only to the embodiments set forth here.
It should be understood that although the terms "first", "second", and so on may be used here to describe various units, these units should not be limited by these terms. The terms are used only to distinguish one unit from another. For example, without departing from the scope of the exemplary embodiments, a first unit could be termed a second unit, and similarly a second unit could be termed a first unit. The term "and/or" as used here includes any and all combinations of one or more of the associated listed items.
The terminology used here is only for describing specific embodiments and is not intended to limit the exemplary embodiments. Unless the context clearly indicates otherwise, the singular forms "a" and "an" as used here are intended to include the plural as well. It should also be understood that the terms "comprise" and/or "include" as used here specify the presence of the stated features, integers, steps, operations, units, and/or components, and do not exclude the presence or addition of one or more other features, integers, steps, operations, units, components, and/or combinations thereof.
It should also be noted that in some alternative implementations, the functions or actions mentioned may occur in an order different from that indicated in the drawings. For example, depending on the functions or actions involved, two figures shown in succession may in fact be executed substantially simultaneously, or may sometimes be executed in the reverse order.
Embodiments of the present invention disclose a method and an apparatus for face detection that implement face detection with a new field-programmable gate array (FPGA) approach and improve the speed of face detection. Details are given below.
Referring to FIG. 1, FIG. 1 is a schematic flowchart of a method for face detection according to an embodiment of the present invention. As shown in FIG. 1, the method for face detection may include the following steps:
S101: Input a picture and divide the picture into regions to be scanned.
Inputting the picture in step S101 includes: first, compressing the entire picture, where the compressed size is determined by the picture resolution and the available memory space; then, caching the entire compressed picture in the on-chip random access memory (RAM) of the FPGA, so that each classifier can read pixels quickly and frequently.
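As an illustration only, the following Python sketch shows one way the compression size in step S101 could be derived from the picture resolution and the memory space, and how the compressed picture could then be divided into regions to be scanned. The disclosure does not specify these details: the function names `choose_downscale` and `scan_regions`, the 128 KiB RAM budget, the 8-bit grayscale format, and the window size and stride are all assumptions made for the example.

```python
import math


def choose_downscale(width, height, ram_bytes, bytes_per_pixel=1):
    """Smallest integer factor k for which the picture, downscaled by k in
    each dimension, fits in ram_bytes (a hypothetical on-chip RAM budget)."""
    k = 1
    while math.ceil(width / k) * math.ceil(height / k) * bytes_per_pixel > ram_bytes:
        k += 1
    return k


def scan_regions(width, height, window, stride):
    """Yield (row, col, size) for every square region to be scanned."""
    for r in range(0, height - window + 1, stride):
        for c in range(0, width - window + 1, stride):
            yield (r, c, window)


# Assumed example: a 640x480 picture and a 128 KiB on-chip buffer.
k = choose_downscale(640, 480, ram_bytes=128 * 1024)
small_w, small_h = math.ceil(640 / k), math.ceil(480 / k)
regions = list(scan_regions(small_w, small_h, window=24, stride=4))
```

Under these assumed numbers the picture is halved in each dimension before caching, and every region is then addressed purely by its coordinates inside the cached copy.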
S102: For each region to be scanned, configure dual threads and control a first thread and a second thread to process a first process and a second process in parallel.
The first process in step S102 includes operation steps a to e, specifically:
a. calculating the read address of a classifier in the region to be scanned;
b. reading pixel values in the region to be scanned according to the read address of the classifier;
c. comparing the pixel values that were read;
d. selecting the next classifier according to the comparison result of the previous step;
e. returning to step a and repeating, the number of repetitions being determined by the height of the tree structure of the classifier;
The second process in step S102 includes operation steps f to k, specifically:
f. reading the score of the comparison from a lookup table according to the comparison result of the first process;
g. accumulating the comparison scores read in the previous step to obtain a score for the region to be scanned;
h. comparing the score of the region to be scanned with a preset threshold;
i. if the comparison result of step h is negative, which indicates that the region to be scanned contains no face, scanning the next region to be scanned;
j. if the comparison result of step h is positive, jumping back to step a and repeating;
k. after the repetitions are completed, if the final comparison result is positive, outputting face information, wherein the face information includes coordinates, size, and a score.
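Steps a to k can be sketched in Python as follows. This is a minimal sketch under stated assumptions, not the implementation of the embodiment: it assumes that each classifier is a binary tree whose nodes compare two pixels, that the tree nodes are stored in heap order, and that a "negative" result in step h means the accumulated score does not exceed the threshold. The names `run_tree` and `detect_region` and all data values are hypothetical.

```python
def run_tree(img, r, c, tree, depth):
    """First process (steps a-e): walk one tree of pixel comparisons."""
    node = 0
    for _ in range(depth):                    # step e: one pass per tree level
        (dr1, dc1), (dr2, dc2) = tree[node]   # step a: offsets give read addresses
        p1 = img[r + dr1][c + dc1]            # step b: read the two pixel values
        p2 = img[r + dr2][c + dc2]
        bit = 1 if p1 <= p2 else 0            # step c: compare them
        node = 2 * node + 1 + bit             # step d: pick the next classifier
    return node - (2 ** depth - 1)            # index of the leaf that was reached


def detect_region(img, r, c, size, stages, depth):
    """Second process (steps f-k) applied to the results of run_tree."""
    score = 0.0
    for tree, lut, threshold in stages:
        leaf = run_tree(img, r, c, tree, depth)
        score += lut[leaf]                    # steps f and g: look up, accumulate
        if score <= threshold:                # steps h and i: negative, no face
            return None
        # step j: positive, so continue with the next tree
    return {"row": r, "col": c, "size": size, "score": score}  # step k


# Made-up data: a 4x4 picture and two stages that share one depth-2 tree.
img = [[10, 20, 30, 40],
       [50, 60, 70, 80],
       [90, 100, 110, 120],
       [130, 140, 150, 160]]
tree = [((0, 0), (0, 1)), ((1, 0), (1, 1)), ((2, 2), (2, 1))]
stages = [(tree, [-1.0, -0.5, 0.75, 1.0], 0.0),
          (tree, [0.0, 0.0, 0.5, 0.0], 1.0)]
face = detect_region(img, 0, 0, 4, stages, depth=2)
```

The early return in the threshold check is what lets most regions leave the loop after only a few iterations.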
In step S102, configuring dual threads and controlling the first thread and the second thread to process the first process and the second process in parallel means that while the first thread processes the first process, the second thread simultaneously processes the second process of the previous iteration. For example, after the picture of the first region to be scanned is input, the first thread executes the operation steps of the first process. When the first process ends, its comparison result is passed directly to the second process of the second thread, and the first thread moves straight on to the first process of the next region to be scanned instead of waiting for the second process to finish. In this way, for an image region that contains a face, the final score can be obtained without waiting for all classifiers to complete their comparisons. The first thread and the second thread are two independent threads that can process the first process and the second process in parallel, which improves time efficiency. In general, if the first process accounts for about 60% of the total execution time and the second process for about 40%, setting up two threads and controlling the first thread and the second thread to process the first process and the second process in parallel can save more than 30% of the time.
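The time saving quoted above can be checked with a simple model. The Python sketch below treats the two threads as an ideal two-stage pipeline with constant stage times and independent regions; these are simplifying assumptions, and the function names and the region count of 1000 are hypothetical.

```python
def serial_time(n_regions, t1, t2):
    """One thread runs the first and then the second process for each region."""
    return n_regions * (t1 + t2)


def pipelined_time(n_regions, t1, t2):
    """Two threads: the pipeline fills with t1, then completes one region
    every max(t1, t2), and finally drains with t2."""
    return t1 + (n_regions - 1) * max(t1, t2) + t2


# Time units are percent of one region's serial time: 60 for the first
# process and 40 for the second, as in the paragraph above.
n = 1000
s = serial_time(n, 60, 40)
p = pipelined_time(n, 60, 40)
saving = 1 - p / s
```

In this idealized model the saving approaches 40% as the number of regions grows, which is consistent with the stated figure of more than 30% once real overheads are allowed for.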
Optionally, the classifier is the scan window of an engine (that is, a computing unit); according to the algorithm, it calculates which pixel values to fetch for comparison.
Optionally, in step j of step S102, if the comparison result of step h is positive, the flow jumps back to step a. A positive result in step h indicates that the current loop detected a face, but because step S102 is a multi-loop operation, the face information can be confirmed and output only after multiple loops, that is, in step k: after the repetitions are completed, if the final comparison result is positive, the face information is output. This ensures the accuracy of face detection. The number of times the flow jumps back to step a is determined by the algorithm.
Optionally, the score in step k of step S102 indicates how close the detected face is to a real face: the higher the score, the closer the scanned face is to a real face.
Optionally, before controlling the first thread and the second thread to process the first process and the second process in parallel, step S102 further includes: configuring and fixing the classifier parameters for the first M loops of the first process, where M is a positive integer greater than 0. Most scanned regions exit after only a few loops, so the classifier parameters for the first few loops of the first process are hardwired into each engine. During those loops the engine can operate without reading the classifier parameters, which avoids unnecessary RAM reads and speeds up the face detection algorithm.
M may be any positive integer greater than 0. In a specific implementation of one embodiment of the present invention, M is preferably set to 8, that is, the classifier parameters for the first 8 loops of the first process are configured and fixed. In practical applications, M may also be set to other values.
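The effect of fixing the parameters for the first M loops can be illustrated by counting parameter reads. The Python sketch below is a software analogy only: the class `Engine`, the helper `count_reads`, and the exit counts are hypothetical, and a list stands in for the classifier parameter RAM.

```python
FIXED_LOOPS = 8  # M; the embodiment prefers 8


class Engine:
    """Keeps the parameters for the first m loops locally and counts every
    access that has to go to the (simulated) parameter RAM instead."""

    def __init__(self, all_params, m=FIXED_LOOPS):
        self.local = list(all_params[:m])   # stands in for hardwired values
        self.shared = all_params            # stands in for parameter RAM
        self.ram_reads = 0

    def params(self, loop):
        if loop < len(self.local):
            return self.local[loop]         # no RAM access needed
        self.ram_reads += 1
        return self.shared[loop]


def count_reads(exit_loops, n_params, m):
    """Total RAM reads when region i runs exit_loops[i] loops before exiting."""
    engine = Engine(list(range(n_params)), m)
    for n in exit_loops:
        for loop in range(n):
            engine.params(loop)
    return engine.ram_reads


# Assumed workload: most regions exit early, one region runs 20 loops.
exit_loops = [2, 3, 1, 20, 5]
reads_fixed = count_reads(exit_loops, n_params=32, m=FIXED_LOOPS)
reads_unfixed = count_reads(exit_loops, n_params=32, m=0)
```

With this assumed workload only the long-running region touches the parameter RAM at all, which is the behavior the paragraph above describes.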
In the technical solution provided by this embodiment of the present invention, the entire picture is compressed and the entire compressed picture is cached in the on-chip random access memory (RAM) of the FPGA, so that each classifier can read pixels quickly and frequently. The picture is then divided into regions to be scanned, and for each region to be scanned, dual threads are configured and the first thread and the second thread are controlled to process the first process and the second process in parallel, where the first process includes steps a to e and the second process includes steps f to k. Because the first thread and the second thread are two independent threads, the second thread processes the second process of the previous iteration while the first thread processes the first process, which improves time efficiency. In addition, setting and fixing the classifier parameters for the first few loops of the first process avoids unnecessary RAM reads. The technical solution provided by this embodiment can therefore implement face detection in FPGA hardware and increase the speed of face detection while adding very few resources. It offers high efficiency and good performance, and it can also run on low-end FPGAs.
Referring to FIG. 3, FIG. 3 is a schematic flowchart of another method for face detection according to an embodiment of the present invention. The method may be implemented in the architecture shown in FIG. 2. As shown in FIG. 2, an embodiment of the present invention discloses a multi-engine implementation scenario that includes a control module 21, a multi-engine module 22, and a memory module 23, where the control module 21 manages the multi-engine module 22 and the memory module 23. The multi-engine module 22 includes eight engines, engine 0 to engine 7; in practical applications, a different number of engines may be configured. The memory module 23 includes a classifier parameter independent memory, a classifier parameter shared memory, a pixel value memory, a lookup table parameter memory, a threshold memory, and a face information memory; different memories may share storage. As shown in FIG. 3, the method for face detection may include:
S301: Configure multiple engines, configure dual threads inside each single engine of the multiple engines, and control the first thread and the second thread to process the first process and the second process in parallel, with the multiple engines scanning multiple regions to be scanned in parallel.
In step S301, the first process includes steps a to e and the second process includes steps f to k; steps a to e and steps f to k are as explained for FIG. 1.
The engine in step S301 may be a hardware computing unit.
Configuring multiple engines in step S301 allows the picture to be processed by multiple computing units at the same time. Each region to be scanned is scanned by a separate engine, so configuring multiple engines makes it possible to scan multiple regions in parallel. For example, if 8 engines are implemented, 8 regions can be scanned simultaneously, and even without counting the optimizations inside each single engine, the overall algorithm can be sped up by up to 8 times. The number of engines is determined by the system resources and the algorithm design.
S302: Configure a control module, the control module being used to track the working state of each single engine, record the state of each region to be scanned, and allocate work to each single engine.
In step S302, the control module tracks the working state of each single engine, records the state of each region to be scanned, and allocates work to each engine. Different regions exit the scan at different times, so the control module tracks the working state of each engine, records the state of each region that needs to be scanned, and coordinates the allocation of work among the engines. This avoids the undesirable situation in which some engines are overloaded while others sit idle, so that every engine operates efficiently.
In the technical solution provided by this embodiment of the present invention, multiple engines are configured, and dual threads are configured inside each single engine to process the first process and the second process in parallel, so that the engines can process different regions of the picture at the same time. A control module is also configured to track the working state of each engine, record the state of each region to be scanned, and allocate work to each engine, which ensures that every engine operates efficiently. The technical solution provided by this embodiment can therefore improve picture processing performance and increase the speed of face detection.
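The benefit of having the control module allocate work according to engine state can be illustrated with a small scheduling model. The Python sketch below compares handing each region to whichever engine becomes idle first against a fixed round-robin pre-assignment. The policy, the function names, and the per-region costs are assumptions for illustration; the disclosure does not specify the control module's allocation algorithm.

```python
import heapq


def dynamic_makespan(costs, n_engines):
    """Control-module style: give the next region to the engine that becomes
    idle first. Returns the time at which the last engine finishes."""
    free_at = [0] * n_engines
    heapq.heapify(free_at)
    for cost in costs:
        t = heapq.heappop(free_at)          # earliest idle engine
        heapq.heappush(free_at, t + cost)   # it is busy until t + cost
    return max(free_at)


def static_makespan(costs, n_engines):
    """Baseline: region i is pre-assigned to engine i % n_engines."""
    busy = [0] * n_engines
    for i, cost in enumerate(costs):
        busy[i % n_engines] += cost
    return max(busy)


# Assumed per-region scan times: most regions exit early (cost 1), a few
# run through many loops (cost 9).
costs = [9, 1, 1, 1, 9, 1, 1, 1]
dynamic = dynamic_makespan(costs, n_engines=2)
static = static_makespan(costs, n_engines=2)
```

In this made-up workload the fixed assignment sends both slow regions to the same engine, while the state-aware assignment keeps both engines busy until the end.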
Referring to FIG. 4, FIG. 4 is a schematic structural diagram of an apparatus for face detection according to an embodiment of the present invention. The apparatus for face detection disclosed in this embodiment includes:
an input unit 401, configured to input a picture to be scanned.
Optionally, inputting the picture to be scanned in the input unit 401 includes: first, compressing the entire picture, where the compressed size is determined by the picture resolution and the available memory space; then, caching the entire compressed picture in the on-chip random access memory (RAM) of the FPGA, so that each classifier can read pixels quickly and frequently.
a processing unit 402, configured to, for each region to be scanned, configure dual threads and control a first thread and a second thread to process a first process and a second process in parallel.
可选的,上述处理单元402中第一流程包括操作步骤a至e,具体包括:Optionally, the first process in the processing unit 402 includes the operating steps a to e, and specifically includes:
a 计算分类器在该待扫描区域的读取地址;a calculating the read address of the classifier in the area to be scanned;
b. 根据所述分类器的读取地址,在该待扫描区域中读取像素点的值;b. reading a value of a pixel in the area to be scanned according to the read address of the classifier;
c. 对读取到的所述像素点的值进行类比;c. analogy of the values of the read pixel points;
d. 根据上一步的类比结果,选择下一个分类器;d. According to the analogy result of the previous step, select the next classifier;
e. 重复步骤a,重复次数由分类器的树结构的高度决定。e. Repeat step a, the number of repetitions is determined by the height of the classifier's tree structure.
可选的,上述处理单元402中第二流程包括操作步骤f至k,具体包括:Optionally, the second process in the processing unit 402 includes the operating steps f to k, and specifically includes:
f. 根据第一流程的类比结果,在查找表中读取类比的分值;f. According to the analogy result of the first process, the analog score is read in the lookup table;
g. 将上一步读取的类比的分值进行累加,得到该待扫描区域的分值;g. accumulating the analog scores read in the previous step to obtain the score of the area to be scanned;
h. 将该待扫描区域中的分值和预设的阈值进行比较;h. comparing the score in the area to be scanned with a preset threshold;
i. 如果步骤h的比较结果为负,则表明所述待扫描区域没有人脸,则扫描下一个所述待扫描区域;i. If the comparison result of the step h is negative, it indicates that the area to be scanned has no face, and the next area to be scanned is scanned;
j. 如果步骤h的比较结果为正,则跳回重复步骤a;j. If the comparison result of step h is positive, jump back to repeating step a;
k. 重复的次数完成后,如果最后的比较为正,则输出人脸信息,其中,人脸信息包括坐标、大小和得分值。k. After the number of repetitions is completed, if the final comparison is positive, the face information is output, wherein the face information includes coordinates, size, and score value.
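Steps a to k can be sketched, for a single area and without any threading, roughly as follows. This is an illustrative reading and not the implementation of the invention: the specification does not define the classifier layout, so the `Tree` structure, the pixel-pair test `p1 <= p2`, the heap-ordered node numbering, and the per-tree cumulative threshold are all assumptions.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Tree:
    """Hypothetical classifier: a complete binary tree of pixel-pair tests."""
    depth: int                                  # height of the tree
    nodes: List[Tuple[int, int, int, int]]      # (dy1, dx1, dy2, dx2) per internal node
    leaf_scores: List[int]                      # lookup table, 2**depth entries
    threshold: int                              # threshold on the accumulated score


def first_process(image, top, left, tree):
    """Steps a to e: walk one tree and return the index of the leaf reached."""
    node = 0
    for _ in range(tree.depth):                 # step e: repeat `depth` times
        dy1, dx1, dy2, dx2 = tree.nodes[node]   # step a: read addresses
        p1 = image[top + dy1][left + dx1]       # step b: read pixel values
        p2 = image[top + dy2][left + dx2]
        bit = 1 if p1 <= p2 else 0              # step c: compare
        node = 2 * node + 1 + bit               # step d: choose the next node
    return node - (2 ** tree.depth - 1)


def scan_area(image, top, left, size, trees):
    """Steps f to k: accumulate scores and reject as soon as a threshold fails."""
    score = 0
    for tree in trees:
        leaf = first_process(image, top, left, tree)
        score += tree.leaf_scores[leaf]         # steps f and g
        if score < tree.threshold:              # steps h and i: no face here
            return None
        # step j: positive result, continue with the next tree
    return {"x": left, "y": top, "size": size, "score": score}  # step k
```

In this sketch an area is rejected at the first failed threshold, which matches the observation elsewhere in the description that most areas exit after only a few cycles.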
In the processing unit 402, configuring dual threads and controlling the first thread and the second thread to process the first process and the second process in parallel means that, while the first thread processes the first process, the second thread simultaneously processes the second process of the previous pass. For example, after the picture of the first area to be scanned is input, the first thread executes the operation steps of the first process; when the first process ends, the comparison result is output directly to the second process on the second thread, and the first thread moves straight on to the first process of the next area to be scanned instead of waiting for the second process to finish. In this way, for an image area that contains a face, the final score value can be obtained without first waiting for all classifiers to complete their comparisons. The first thread and the second thread are two independent threads that can process the first process and the second process in parallel, which improves throughput. In general, if the first process accounts for about 60% of the overall execution time and the second process for about 40%, then setting up two threads and controlling the first thread and the second thread to process the first process and the second process in parallel can save more than 30% of the time.
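The 60%/40% figure can be checked with a simple timing model. The sketch below is an idealized calculation, not a measurement: it assumes every pass costs the same fixed time for each process and ignores early exits and handoff overhead; the function names are hypothetical.

```python
def sequential_time(n_passes, t_first, t_second):
    """Total time when each pass runs the first process, then the second."""
    return n_passes * (t_first + t_second)


def pipelined_time(n_passes, t_first, t_second):
    """Total time when the second process of pass i overlaps the first of pass i+1.

    The pipeline fills with one first process, then advances at the pace of
    the slower stage, and drains with one second process.
    """
    if n_passes == 0:
        return 0
    return t_first + (n_passes - 1) * max(t_first, t_second) + t_second


def saving(n_passes, t_first, t_second):
    """Fraction of the sequential time saved by overlapping the two processes."""
    seq = sequential_time(n_passes, t_first, t_second)
    return (seq - pipelined_time(n_passes, t_first, t_second)) / seq
```

With `t_first = 60` and `t_second = 40`, ten passes take 640 time units instead of 1000, a saving of 36%; as the number of passes grows the saving approaches 40%, which is consistent with the "more than 30%" stated above.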
Optionally, the above apparatus may further include:
A parameter fixing unit 403, configured to configure and fix the classifier parameters for the first M cycles of the first process, where M is a positive integer greater than 0.
It can be understood that most scanned areas exit after only a few cycles. Therefore, in the first few cycles the engine can operate without reading the classifier parameters, which avoids unnecessary RAM reads and speeds up the face detection algorithm.
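The effect of fixing the parameters of the first M cycles can be illustrated with a small software analogy. In hardware the fixed parameters would presumably be held in logic or registers; the class `ClassifierParams`, its fields, and the read counter below are hypothetical names used only to show which accesses reach RAM.

```python
class ClassifierParams:
    """Serve parameters for cycle 0..M-1 from a fixed copy, the rest from RAM."""

    def __init__(self, all_params, m):
        if m <= 0:
            raise ValueError("M must be a positive integer")
        self.fixed = tuple(all_params[:m])   # configured once, never re-read
        self.ram = list(all_params)          # stands in for parameter RAM
        self.ram_reads = 0                   # counts simulated RAM accesses

    def get(self, cycle):
        """Return the parameters for a given cycle, counting RAM reads."""
        if cycle < len(self.fixed):
            return self.fixed[cycle]
        self.ram_reads += 1
        return self.ram[cycle]
```

An area that exits within the first M cycles causes no parameter RAM reads at all in this model; only the minority of areas that survive longer pay for RAM access.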
A multi-engine configuration unit 404, configured to configure multiple engines, where dual threads are configured inside each single engine of the multiple engines, the first thread and the second thread are controlled to process the first process and the second process in parallel, and the multiple engines can scan multiple areas to be scanned in parallel.
Optionally, in the multi-engine configuration unit 404, dual threads are configured inside each single engine of the multiple engines, and the first thread and the second thread are controlled to process the first process and the second process in parallel, where the first process includes steps a to e and the second process includes steps f to k; the detailed explanation is the same as for the processing unit 402 above.
A control module configuration unit 405, configured to configure a control module, where the control module is used to track the working state of each single engine, record the state of each area to be scanned, and allocate the work of each single engine.
Optionally, the control module in the control module configuration unit 405 is used to track the working state of each engine, record the state of each area to be scanned, and allocate the work of each engine. Because different areas exit at different times when they are scanned, the control module tracks the working state of each engine, records the state of each area that needs to be scanned, and coordinates the allocation of work to each engine. This avoids the unbalanced situation in which some engines are overloaded while others sit idle, so that every engine operates efficiently.
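One way to picture the control module's role is a scheduler that hands the next pending area to whichever engine becomes idle first. The sketch below is an assumption about the allocation policy, which the specification does not spell out; `dynamic_schedule`, `static_schedule`, and the per-area cost list are hypothetical.

```python
import heapq


def dynamic_schedule(area_costs, num_engines):
    """Give each pending area to the engine that becomes idle first.

    Returns (finish_time, assignment), where assignment maps an engine id
    to the list of area indices it scanned.
    """
    engines = [(0, e) for e in range(num_engines)]   # (time engine is free, id)
    heapq.heapify(engines)
    assignment = {e: [] for e in range(num_engines)}
    for area, cost in enumerate(area_costs):
        free_at, engine = heapq.heappop(engines)
        assignment[engine].append(area)
        heapq.heappush(engines, (free_at + cost, engine))
    return max(t for t, _ in engines), assignment


def static_schedule(area_costs, num_engines):
    """Finish time when areas are split into fixed contiguous blocks up front."""
    block = -(-len(area_costs) // num_engines)       # ceiling division
    return max(sum(area_costs[i:i + block])
               for i in range(0, len(area_costs), block))
```

With one slow area (for example one that contains a face and runs every classifier) followed by seven areas that exit quickly, two engines finish at time 9 under dynamic allocation but at time 12 under a fixed half-and-half split, which illustrates the overloaded-versus-idle imbalance that the control module is meant to avoid.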
The modules or sub-modules in all embodiments of the present invention may be implemented by a general-purpose integrated circuit, such as a CPU, or by an application-specific integrated circuit (ASIC).
Referring to FIG. 5, a schematic diagram of the hardware architecture of a terminal device that runs the face detection apparatus disclosed in an embodiment of the present invention is shown.
The terminal device 1 of the present invention may include a terminal device with a face detection function, such as a computer, a smartphone, a scanner, a camera, or an attendance machine.
As shown in FIG. 5, the terminal device 1 in the embodiment of the present invention includes: at least one processor 2, such as a CPU, at least one memory 4, and at least one communication bus 6.
The communication bus 6 is used to implement connection and communication between components such as the processor 2 and the memory 4.
The memory 4 may be a high-speed RAM, or a non-volatile memory, such as at least one disk memory.
The processor 2 may execute the operating system of the terminal device 1 and the various installed applications, executable program instructions, and the like, for example the units described above, including the input unit 401, the processing unit 402, the parameter fixing unit 403, the multi-engine configuration unit 404, and the control module configuration unit 405.
The memory 4 stores executable program instructions, and the processor 2 can call the executable program instructions stored in the memory 4 through the communication bus 6 to perform the related functions. For example, the units described in FIG. 4 (such as the input unit 401, the processing unit 402, the parameter fixing unit 403, the multi-engine configuration unit 404, and the control module configuration unit 405) are executable program instructions stored in the memory 4 and executed by the processor 2, so as to implement the functions of those units and thereby realize face detection from a new field-programmable gate array (FPGA) perspective.
In one embodiment of the present invention, the memory 4 stores a plurality of instructions, and the plurality of instructions are executed by the processor 2 to implement a face detection method. Specifically, the processor 2 inputs a picture and divides the picture into areas to be scanned; for each area to be scanned, it configures dual threads and controls a first thread and a second thread to process a first process and a second process in parallel;
wherein the first process includes operation steps a to e, specifically:
a. calculating the read address of the classifier in the area to be scanned;
b. reading the values of pixels in the area to be scanned according to the read address of the classifier;
c. comparing the values of the pixels that were read;
d. selecting the next classifier according to the comparison result of the previous step;
e. repeating from step a, where the number of repetitions is determined by the height of the classifier's tree structure;
wherein the second process includes operation steps f to k, specifically:
f. reading the score of the comparison from a lookup table according to the comparison result of the first process;
g. accumulating the comparison scores read in the previous step to obtain the score of the area to be scanned;
h. comparing the score of the area to be scanned with a preset threshold;
i. if the comparison result of step h is negative, which indicates that the area to be scanned contains no face, scanning the next area to be scanned;
j. if the comparison result of step h is positive, jumping back and repeating from step a;
k. after the repetitions are completed, if the final comparison result is positive, outputting face information, where the face information includes coordinates, size, and a score value.
In an optional solution, in the method provided in the first aspect, controlling the first thread and the second thread to process the first process and the second process in parallel includes: while the first thread processes the first process, the second thread simultaneously processes the second process of the previous pass.
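The handoff between the two threads can be sketched as a two-stage producer and consumer pipeline. This only illustrates the structure: `run_pipeline` and its callback parameters are hypothetical, the per-tree loop-back of step j is omitted, and ordinary Python threads do not execute CPU-bound work truly in parallel, whereas the stages of an FPGA engine would.

```python
import queue
import threading


def run_pipeline(areas, first_process, second_process):
    """Run first_process on thread 1 and second_process on thread 2.

    Thread 1 hands each comparison result to thread 2 and immediately moves
    on to the next area instead of waiting for the second process to finish.
    """
    handoff = queue.Queue(maxsize=1)   # at most one result waiting
    results = []

    def stage_one():
        for area in areas:
            handoff.put((area, first_process(area)))
        handoff.put(None)              # sentinel: no more areas

    def stage_two():
        while True:
            item = handoff.get()
            if item is None:
                break
            area, comparison = item
            results.append(second_process(area, comparison))

    threads = [threading.Thread(target=stage_one),
               threading.Thread(target=stage_two)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because a single consumer drains a FIFO queue, the results come back in the same order as the input areas, so face coordinates can still be matched to the areas that produced them.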
In an optional solution, in the method provided in the first aspect, inputting the picture includes:
compressing the entire picture; and caching the entire compressed picture in the on-chip random access memory (RAM) of a field-programmable gate array (FPGA).
In an optional solution, in the method provided in the first aspect, before controlling the first thread and the second thread to process the first process and the second process in parallel, the method further includes: configuring and fixing the classifier parameters for the first M cycles of the first process, where M is a positive integer greater than 0.
In an optional solution, the method provided in the first aspect further includes: configuring multiple engines, configuring dual threads inside each single engine of the multiple engines, and controlling the first thread and the second thread to process the first process and the second process in parallel, where the multiple engines scan multiple areas to be scanned in parallel; and configuring a control module, where the control module is used to track the working state of each single engine, record the state of each area to be scanned, and allocate the work of each single engine.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, those skilled in the art should understand that the present invention is not limited by the described order of actions, because according to the present application some steps may be performed in another order or simultaneously. Furthermore, those skilled in the art should also understand that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present application.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
The steps in the methods of the embodiments of the present invention may be reordered, combined, and deleted according to actual needs.
The units in the user terminal of the embodiments of the present invention may be combined, divided, and deleted according to actual needs.
A person of ordinary skill in the art can understand that all or part of the processes in the methods of the above embodiments can be completed by a computer program instructing the related hardware. The program may be stored in a computer-readable storage medium and, when executed, may include the processes of the embodiments of the methods described above. The storage medium may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
The face detection method and apparatus disclosed in the embodiments of the present invention have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present invention, and the description of the above embodiments is intended only to help in understanding the method of the present invention and its core idea. At the same time, a person of ordinary skill in the art may, based on the idea of the present invention, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present invention.

Claims (11)

  1. A face detection method, comprising:
    inputting a picture and dividing the picture into areas to be scanned;
    for each of the areas to be scanned, configuring dual threads, and controlling a first thread and a second thread to process a first process and a second process in parallel;
    wherein the first process includes operation steps a to e, specifically:
    a. calculating the read address of a classifier in the area to be scanned;
    b. reading the values of pixels in the area to be scanned according to the read address of the classifier;
    c. comparing the values of the pixels that were read;
    d. selecting the next classifier according to the comparison result of the previous step;
    e. repeating from step a, where the number of repetitions is determined by the height of the classifier's tree structure;
    wherein the second process includes operation steps f to k, specifically:
    f. reading the score of the comparison from a lookup table according to the comparison result of the first process;
    g. accumulating the comparison scores read in the previous step to obtain the score of the area to be scanned;
    h. comparing the score of the area to be scanned with a preset threshold;
    i. if the comparison result of step h is negative, which indicates that the area to be scanned contains no face, scanning the next area to be scanned;
    j. if the comparison result of step h is positive, jumping back and repeating from step a;
    k. after the repetitions are completed, if the final comparison result is positive, outputting face information, where the face information includes coordinates, size, and a score value.
  2. The method according to claim 1, wherein controlling the first thread and the second thread to process the first process and the second process in parallel comprises: while the first thread processes the first process, the second thread simultaneously processes the second process of the previous pass.
  3. The method according to claim 1, wherein inputting the picture comprises:
    compressing the entire picture;
    caching the entire compressed picture in the on-chip random access memory (RAM) of a field-programmable gate array (FPGA).
  4. The method according to claim 1, wherein, before controlling the first thread and the second thread to process the first process and the second process in parallel, the method further comprises: configuring and fixing the classifier parameters for the first M cycles of the first process, where M is a positive integer greater than 0.
  5. The method according to claim 1, further comprising:
    configuring multiple engines, configuring dual threads inside each single engine of the multiple engines, and controlling the first thread and the second thread to process the first process and the second process in parallel, where the multiple engines are used to scan multiple areas to be scanned in parallel;
    configuring a control module, where the control module is used to track the working state of each single engine, record the state of each area to be scanned, and allocate the work of each single engine.
  6. A face detection apparatus, comprising:
    an input unit, configured to input a picture and divide the picture into areas to be scanned;
    a processing unit, configured to configure dual threads for each of the areas to be scanned, and to control a first thread and a second thread to process a first process and a second process in parallel;
    wherein the first process includes operation steps a to e, specifically:
    a. calculating the read address of a classifier in the area to be scanned;
    b. reading the values of pixels in the area to be scanned according to the read address of the classifier;
    c. comparing the values of the pixels that were read;
    d. selecting the next classifier according to the comparison result of the previous step;
    e. repeating from step a, where the number of repetitions is determined by the height of the classifier's tree structure;
    wherein the second process includes operation steps f to k, specifically:
    f. reading the score of the comparison from a lookup table according to the comparison result of the first process;
    g. accumulating the comparison scores read in the previous step to obtain the score of the area to be scanned;
    h. comparing the score of the area to be scanned with a preset threshold;
    i. if the comparison result of step h is negative, which indicates that the area to be scanned contains no face, scanning the next area to be scanned;
    j. if the comparison result of step h is positive, jumping back and repeating from step a;
    k. after the repetitions are completed, if the final comparison result is positive, outputting face information, where the face information includes coordinates, size, and a score value.
  7. The apparatus according to claim 6, wherein controlling the first thread and the second thread to process the first process and the second process in parallel comprises: while the first thread processes the first process, the second thread simultaneously processes the second process of the previous pass.
  8. The apparatus according to claim 6, wherein inputting the picture comprises:
    compressing the entire picture;
    caching the entire compressed picture in the on-chip random access memory (RAM) of a field-programmable gate array (FPGA).
  9. The apparatus according to claim 6, wherein the apparatus further comprises:
    a parameter fixing unit, configured to configure and fix the classifier parameters for the first M cycles of the first process, where M is a positive integer greater than 0.
  10. The apparatus according to claim 6, wherein the apparatus further comprises:
    a multi-engine configuration unit, configured to configure multiple engines, where dual threads are configured inside each single engine of the multiple engines, the first thread and the second thread are controlled to process the first process and the second process in parallel, and the multiple engines are used to scan multiple areas to be scanned in parallel;
    a control module configuration unit, configured to configure a control module, where the control module is used to track the working state of each single engine, record the state of each area to be scanned, and allocate the work of each single engine.
  11. A terminal device, comprising a memory and a processor, the memory storing executable program instructions, wherein the processor calls the executable program instructions stored in the memory to perform the face detection method according to any one of claims 1 to 5.
PCT/CN2017/090296 2016-10-10 2017-06-27 Human face detection method and device WO2018068533A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610884062.2A CN106529408B (en) 2016-10-10 2016-10-10 A kind of method and device of Face datection
CN201610884062.2 2016-10-10

Publications (1)

Publication Number Publication Date
WO2018068533A1 true WO2018068533A1 (en) 2018-04-19

Family

ID=58331513

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/090296 WO2018068533A1 (en) 2016-10-10 2017-06-27 Human face detection method and device

Country Status (2)

Country Link
CN (1) CN106529408B (en)
WO (1) WO2018068533A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109636397A (en) * 2018-11-13 2019-04-16 平安科技(深圳)有限公司 Transit trip control method, device, computer equipment and storage medium

Families Citing this family (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106529408B (en) * 2016-10-10 2018-04-13 深圳云天励飞技术有限公司 A kind of method and device of Face datection
CN107688785A (en) * 2017-08-28 2018-02-13 西安电子科技大学 The development approach of the parallel real-time face detection of dual-thread based on ARM platforms
CN107622191B (en) * 2017-09-08 2020-03-10 Oppo广东移动通信有限公司 Unlocking control method and related product
CN108021895A (en) * 2017-12-07 2018-05-11 深圳云天励飞技术有限公司 Demographic method, equipment, readable storage medium storing program for executing and electronic equipment
CN108052891A (en) * 2017-12-08 2018-05-18 触景无限科技(北京)有限公司 Facial contour parallel calculating method and device
CN109800705A (en) * 2019-01-17 2019-05-24 深圳英飞拓科技股份有限公司 Accelerate the method and device of Face datection rate
CN112153343B (en) * 2020-09-25 2022-09-02 北京百度网讯科技有限公司 Elevator safety monitoring method and device, monitoring camera and storage medium
CN112949614B (en) * 2021-04-29 2021-09-10 成都市威虎科技有限公司 Face detection method and device for automatically allocating candidate areas and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101369315A (en) * 2007-08-17 2009-02-18 上海银晨智能识别科技有限公司 Human face detection method
CN102495725A (en) * 2011-11-15 2012-06-13 复旦大学 Image/video feature extraction parallel algorithm based on multi-core system structure
CN104408720A (en) * 2014-11-25 2015-03-11 深圳市哈工大交通电子技术有限公司 Image processing method and device
CN105701446A (en) * 2014-12-11 2016-06-22 想象技术有限公司 Preforming object detection
CN106529408A (en) * 2016-10-10 2017-03-22 深圳云天励飞技术有限公司 Human face detection method and device

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101038686B (en) * 2007-01-10 2010-05-19 北京航空航天大学 Method for recognizing machine-readable travel certificate
CN103745240A (en) * 2013-12-20 2014-04-23 许雪梅 Method and system for retrieving human face on the basis of Haar classifier and ORB characteristics
US20150227780A1 (en) * 2014-02-13 2015-08-13 FacialNetwork, Inc. Method and apparatus for determining identity and programing based on image features

Also Published As

Publication number Publication date
CN106529408B (en) 2018-04-13
CN106529408A (en) 2017-03-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17859864

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17859864

Country of ref document: EP

Kind code of ref document: A1